How To View Number of PyPI Package Downloads by Python Version

Building Python packages that support both Python 2 and 3 is time-consuming, and at some point you are going to ask, “Can I just build this thing for Python 3?” Ideally, the answer to that question is “Yes!” But why not make it a data-driven decision? You may be in a position to ship only a Python 3 version and require that your users run Python 3.x+. If you are, then great, do that. If not, you’ll probably want to see how many users have downloaded your package with Python 2 before you make your decision.

It took me a few minutes to figure out how to get this data, so here is what I found.

What I learned

  1. PyPI does not give you this data.
  2. Google BigQuery does, here: https://bigquery.cloud.google.com/dataset/the-psf:pypi
  3. You need to sign up for Google Cloud Platform, create a project, and enable the BigQuery API for that project before you can run BigQuery queries.
  4. The BigQuery free allowance only lets you run the query below a couple of times before you hit the limit, so use it wisely.
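
Since the free allowance is the constraint here, it can help to ask BigQuery how much data a query would scan before actually running it. The following is a minimal sketch, assuming the `google-cloud-bigquery` Python client is installed and credentials are configured. The helper names are hypothetical, and the 1 TiB monthly free-tier figure is an assumption, so check the quota on your own account.

```python
# Assumed monthly free query allowance (1 TiB); verify this for your account.
FREE_TIER_BYTES = 1024 ** 4


def estimate_bytes(sql):
    """Dry-run the query and return the number of bytes it would scan."""
    # Imported lazily so the arithmetic helper works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default project and credentials
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)  # a dry run does not execute the query
    return job.total_bytes_processed


def runs_within_allowance(bytes_scanned, free_bytes=FREE_TIER_BYTES):
    """Return how many full runs of a query fit inside the assumed allowance."""
    if bytes_scanned <= 0:
        raise ValueError("bytes_scanned must be positive")
    return free_bytes // bytes_scanned
```

Passing the result of `estimate_bytes(...)` to `runs_within_allowance(...)` gives a rough idea of how many times the query can be run before the limit is reached.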

How to run the query

  1. Go to: https://bigquery.cloud.google.com/dataset/the-psf:pypi
  2. Sign In
  3. Accept Terms

  4. Create Project
  5. Accept More Terms
  6. Click “Create Project”
  7. Enter Project Name, Click Create
  8. View Notification for Creating Project
  9. Refresh Page
  10. Click on Project
  11. Click Hamburger Icon, hover over APIs & Services, Click Dashboard
  12. Click “View All” link to the right
  13. Search for ‘big’
  14. Click on BigQuery API
  15. Click “Enable” button
  16. Go to: https://bigquery.cloud.google.com/dataset/the-psf:pypi
    If you see “Unable to find dataset the-psf:pypi”, you probably haven’t enabled the BigQuery API. See above for how to enable it.
  17. Click “Compose Query”
  18. Copy and Paste this query into New Query
SELECT
  REGEXP_EXTRACT(details.python, r"[0-9]+\.[0-9]+") AS python_version,
  COUNT(*) AS downloads
FROM `the-psf.pypi.downloads*`
WHERE file.project="iotedgedev"
GROUP BY python_version
ORDER BY downloads DESC

  19. Change ‘iotedgedev’ to the name of your PyPI package
  20. Click “Show Options”
  21. Uncheck “Use Legacy SQL”
  22. Click “Run Query”
  23. View Results
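
The same query can also be built from a script so that the package name does not have to be edited by hand. This is only a sketch: `build_query` and `run_query` are hypothetical helper names, the optional date filter assumes the daily tables behind `downloads*` are suffixed with `YYYYMMDD`, and `run_query` assumes the `google-cloud-bigquery` client is installed with credentials configured.

```python
import re
from datetime import date

# Conservative check on the package name before it is placed in the SQL text.
_NAME_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")


def build_query(package, start=None, end=None):
    """Return the per-Python-version download query for one package."""
    if not _NAME_RE.match(package):
        raise ValueError(f"unexpected package name: {package!r}")
    where = [f'file.project = "{package}"']
    if start and end:
        # Assumption: tables are named downloadsYYYYMMDD, so the suffix is a date.
        where.append(
            f'_TABLE_SUFFIX BETWEEN "{start:%Y%m%d}" AND "{end:%Y%m%d}"'
        )
    return (
        "SELECT\n"
        '  REGEXP_EXTRACT(details.python, r"[0-9]+\\.[0-9]+") AS python_version,\n'
        "  COUNT(*) AS downloads\n"
        "FROM `the-psf.pypi.downloads*`\n"
        f"WHERE {' AND '.join(where)}\n"
        "GROUP BY python_version\n"
        "ORDER BY downloads DESC"
    )


def run_query(sql):
    """Run the query and return (python_version, downloads) tuples."""
    from google.cloud import bigquery  # imported lazily; needs credentials

    client = bigquery.Client()
    return [(row.python_version, row.downloads) for row in client.query(sql).result()]


if __name__ == "__main__":
    # Prints the SQL only; nothing is sent to BigQuery here.
    print(build_query("iotedgedev", date(2018, 1, 1), date(2018, 1, 31)))
```

Restricting the date range is optional, but scanning fewer daily tables should reduce how much of the free allowance each run consumes.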

From https://langui.sh/2016/12/09/data-driven-decisions/
null = downloads from PyPI using clients that do not support sending the statistics we’re querying against. This can be an older version of pip or alternate clients. You also see 341 downloads from 1.17, which is…who knows! When making maintenance decisions you should factor these unknowns as you feel appropriate.
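
Once the results are in hand, the decision comes down to what share of downloads still comes from Python 2. A small sketch of that arithmetic follows. `summarize` is a hypothetical helper, and the sample rows are made-up numbers for illustration, not real results. Null versions are counted separately as "unknown" so you can weigh them as you see fit.

```python
from collections import Counter


def summarize(rows):
    """Group (python_version, downloads) rows by major version and return shares."""
    totals = Counter()
    for version, downloads in rows:
        # A null version (None) means the client did not report one.
        major = version.split(".")[0] if version else "unknown"
        totals[major] += downloads
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {major: count / grand_total for major, count in totals.items()}


# Made-up example rows in the shape the query returns.
sample = [("2.7", 600), ("3.6", 300), (None, 100)]
for major, share in sorted(summarize(sample).items()):
    print(f"Python {major}: {share:.0%}")
```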

The following sites were helpful

Hope this helps you out.

Jon