PEP 594 Will Break 1 in 30 PyPI Packages
PEP 594 outlines the plan to deprecate and remove packages from the Python standard library. If accepted in its current form, PEP 594 will break 3.8% of all Python 3 packages on PyPI.
As of May 2019 there are 3,604 Python 3 packages on PyPI (out of 94,680) that import packages deprecated by PEP 594.
total packages: 181225 valid packages: 178642 python3 packages: 94680 scanned packages: 92779 broken packages: 3604
The majority of these packages import the parser
or imp
package. If those
two packages were removed from PEP 594, the number of broken packages drops to
828.
Is 3,600 broken packages enough to block adoption of the PEP? I don't know. I just wanted to make sure that the impact the PEP was known before a final decision.
Reproduce the Results
Before you begin, you'll need Python >= 3.6 and a recent version of Go to run the code. All commands should be run inside a cloned checkout of the dead-batteries repository. First, build the binary.
git clone https://github.com/kyleconroy/dead-batteries.git cd dead-batteries go build
Next, you'll download the metadata for every package on PyPI, which will take up about ~500MB of space.
./dead-battery mirror
simple.html
: contains a list of all known packages on PyPI
meta/*.json
: contains package metadata for each packagepython3-packages.json
: contains a list of packages that support Python3
With a complete mirror, you're now ready to scan packages for imports of deprecated packages. Open up a new shell to install and run the Python service.
python3 -m venv venv venv/bin/pip install flask venv/bin/pip install gunicorn venv/bin/pip install gunicorn[gevent] venv/bin/gunicorn -k gevent -w 10 -b 127.0.0.1:4000 imports:app
The service exposes a HTTP/JSON interface to the ast
package. It parses a
given file and returns any deprecated imports as well as parsing errors.
// INPUT { "path": "/path/to/python/file" } // OUTPUT { "imports": { "imp": 2 }, "errors": { "syntax-error": 2 } }
With that running, you can now start the scan. On my laptop, the scan took
about an hour to complete. The output is continually saved to results.json
.
./dead-battery scan
Once the search process is complete, generate the package statistics.
./dead-battery stats
Two new files have been created: import.csv
and packages.json
. The CSV file
contains the total number of imports for each deprecated standard library
package. The JSON file contains a list of every package that imports one of
these deprecated packages, along with a link to the package on PyPI.
A quick jump into the Python interpreter gives us the total number of packages affected by PEP 594.
>>> import json >>> len(json.load(open('packages.json'))) 3604
Methodology
Since PEP 594 only affects Python 3, I needed to filter out packages that don't
support Python 3. For each package, I first looked for any classifiers with the
prefix Programming Language :: Python :: 3
. Next, I checked the
python_version
of the latest release. It's a bit messy, but the code to do so
can be found here.
{ "info": { "classifiers": [ "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3" ], "version": "2.22.0" }, "releases": { "2.22.0": [ { "packagetype": "bdist_wheel", "python_version": "py2.py3", "url": "https://files.pythonhosted.org/.../requests-2.22.0-py2.py3-none-any.whl" } ] } }
If you know a better way to check for Python 3 compatibility, please reach out