How To View Number of PyPI Package Downloads by Python Version
Building Python packages that support both Python 2 and 3 is time-consuming, and at some point you are going to ask, “Can I just build this thing for Python 3?” Ideally, the answer is “Yes!” But why not make it a data-driven decision? You may be in a position to ship only a Python 3 version and require your users to run Python 3.x+. If you are, great, do that. If not, you’ll probably want to see how many users have downloaded your package with Python 2 before you decide.
It took me a few minutes to figure out how to get this data, so here is what I found.
What I learned
- PyPI does not give you this data.
- Google BigQuery does, here: https://bigquery.cloud.google.com/dataset/the-psf:pypi
- To run BigQuery queries, you need to sign up for Google Cloud Platform, create a project, and enable the BigQuery API for that project.
- The BigQuery free allowance only lets you run the query below a couple of times before you hit the limit, so use it wisely.
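Since the free allowance is limited, it may help to narrow the query before you run it. Here is a minimal Python sketch that builds the query used later in this post for any package name and restricts it to recent days. The `build_query` helper and its `days` and `today` parameters are my own names for illustration. The date filter assumes the tables behind `the-psf.pypi.downloads*` are daily tables suffixed with YYYYMMDD, so that BigQuery's `_TABLE_SUFFIX` pseudo-column can select a range of them. Check that against the dataset before relying on it.

```python
import re
from datetime import date, timedelta

# Simplified check for PyPI project names: letters, digits, '.', '_' and '-'.
_NAME_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")


def build_query(project, days=30, today=None):
    """Return a standard SQL query counting downloads per Python version.

    `days` limits the scan to the most recent daily tables. This assumes the
    tables matched by `the-psf.pypi.downloads*` are suffixed with YYYYMMDD.
    """
    if not _NAME_RE.match(project):
        raise ValueError(f"unexpected project name: {project!r}")
    end = today or date.today()
    start = end - timedelta(days=days)
    return (
        'SELECT REGEXP_EXTRACT(details.python, r"[0-9]+\\.[0-9]+") AS python_version,\n'
        "       COUNT(*) AS downloads\n"
        "FROM `the-psf.pypi.downloads*`\n"
        f'WHERE file.project = "{project}"\n'
        f'  AND _TABLE_SUFFIX BETWEEN "{start:%Y%m%d}" AND "{end:%Y%m%d}"\n'
        "GROUP BY python_version\n"
        "ORDER BY downloads DESC"
    )


if __name__ == "__main__":
    # Print a query you can paste into the BigQuery editor.
    print(build_query("iotedgedev", days=30))
```

A smaller `days` value should mean fewer tables scanned, which is the point when you only get a few runs. The name check is there so a stray quote in the package name cannot break the query text.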
How to run the query
Go to: https://bigquery.cloud.google.com/dataset/the-psf:pypi
Sign In
Accept Terms

Create Project

Accept More Terms

Click “Create Project”

Enter Project Name, Click Create

View Notification for Creating Project

Refresh Page
Click on Project

Click Hamburger Icon, hover over APIs & Services, Click Dashboard

Click “View All” link to the right

Search for ‘big’

Click on BigQuery API

Click “Enable” button

Go to: https://bigquery.cloud.google.com/dataset/the-psf:pypi
If you see “Unable to find dataset the-psf:pypi”, that probably means you haven’t enabled the BigQuery API. See above for how to enable it.

Click “Compose Query”

Copy and Paste this query into New Query
SELECT REGEXP_EXTRACT(details.python, r"[0-9]+\.[0-9]+") AS python_version, COUNT(*) AS downloads
FROM `the-psf.pypi.downloads*`
WHERE file.project="iotedgedev"
GROUP BY python_version
ORDER BY downloads DESC
Change ‘iotedgedev’ to the name of your PyPI Package
Click “Show Options”

Uncheck “Use Legacy SQL”

Click “Run Query”

View Results

From https://langui.sh/2016/12/09/data-driven-decisions/: “null” means downloads from PyPI using clients that do not support sending the statistics we’re querying against. This can be an older version of pip or alternate clients. You also see 341 downloads from 1.17, which is…who knows! When making maintenance decisions you should factor in these unknowns as you feel appropriate.
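Once you have the results, the decision comes down to what share of downloads is still on Python 2. Here is a rough Python sketch of that arithmetic. The `share_by_major` helper is my own name, and the rows in the example are made-up numbers for illustration, not real download counts. Rows with a null version, or with a version that is neither 2.x nor 3.x (like that 1.17), are lumped together as "unknown".

```python
from collections import defaultdict


def share_by_major(rows):
    """Group (python_version, downloads) rows into Python 2, Python 3, unknown.

    Returns a dict mapping each label to its percentage of all downloads.
    """
    totals = defaultdict(int)
    for version, downloads in rows:
        major = version.split(".")[0] if version else None
        # Anything that is not clearly 2.x or 3.x is counted as unknown.
        label = f"Python {major}" if major in ("2", "3") else "unknown"
        totals[label] += downloads
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: 100.0 * count / grand_total for label, count in totals.items()}


# Hypothetical numbers for illustration only, not real download counts.
example_rows = [("3.6", 500), ("2.7", 300), (None, 150), ("1.17", 50)]
for label, pct in sorted(share_by_major(example_rows).items()):
    print(f"{label}: {pct:.1f}%")
```

How you treat the unknown share is up to you. Counting it as possible Python 2 traffic is the cautious reading.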
The following sites were helpful
- https://github.com/tswast/code-snippets/blob/master/2018/python-community-insights/Python Community Insights.ipynb
- https://kirankoduru.github.io/python/pypi-stats.html
- https://stackoverflow.com/questions/38102317/why-pypi-doesnt-show-download-stats-anymore
- https://cloud.google.com/blog/big-data/2017/05/try-google-bigquery-today-now-with-10gb-of-free-storage
- https://cloud.google.com/billing/docs/how-to/modify-project#change_the_billing_account_for_a_project
- https://packaging.python.org/guides/analyzing-pypi-package-downloads/
- https://langui.sh/2016/12/09/data-driven-decisions/
Hope this helps you out.
Jon