When I need a Python library, I use pip to fetch it from PyPI, and if I create a project and want to share it, I just need a setup.py file in place to make it easily installable. So I was wondering what the use case for egg or wheel packages is.
The Python Packaging User Guide has to say the following on this topic:
Wheel and Egg are both packaging formats that aim to support the use case of needing an install artifact that doesn’t require building or compilation, which can be costly in testing and production workflows.
These formats can be used to distribute packages that contain binary extension modules. These would otherwise require compilation during installation.
If no compilation is involved, a source distribution is in principle sufficient, but the user guide still recommends creating a wheel for performance reasons:
Minimally, you should create a Source Distribution:
python setup.py sdist
A “source distribution” is unbuilt (i.e, it’s not a Built Distribution), and requires a build step when installed by pip. Even if the distribution is pure python (i.e. contains no extensions), it still involves a build step to build out the installation metadata from setup.py.
[...]
You should also create a wheel for your project. A wheel is a built package that can be installed without needing to go through the “build” process. Installing wheels is substantially faster for the end user than installing from a source distribution.
In short, packages are a convenience thing - mostly for the user.
Wheel packages unify the process of distributing and installing projects that contain pure python, platform dependent code, or compiled extensions. The user does not need to worry if the package is written in Python or in C - it just works.
Egg packages are an older standard; you should ignore them nowadays. Use pip install . instead of ./setup.py install to avoid creating them. (addendum: they are also .zips in disguise, from which Python reads package data, which is not exactly the most performant solution)
Wheel packages, on the other hand, are the new standard. They allow for creation of portable binary packages for Windows, macOS, and Linux (yes, Linux!). Nowadays, you can just do pip install PyQt5 (as an example) and it will just work, no C++ compiler and Qt libraries required on the system. Everything is pre-compiled and included in the wheel. Non-binary packages also benefit, because it’s safer not to run setup.py (all the metadata is in the wheel). (addendum: those are also .zips, but they are unpacked when installed)
I am trying to create a binary Python package that can be installed without compilation. The package consists of only one extension module written in C using the Python API. The extension module relies on the stable Python ABI by using Py_LIMITED_API with 0x03060000 (3.6). To my knowledge, this means the extension module can work with all CPython versions that are not older than 3.6. I managed to create the sdist package, and I explored the dumb, egg and wheel formats. I managed to create the binary packages, but none of them is perfect for my use case.
The problem is that the extension module depends on libssl.so, which is its only "external" dependency. Because the Python package itself is very stable, it doesn't require frequent releases. Therefore I wouldn't like to include libssl.so and take on the burden of releasing new versions because of OpenSSL security updates (not to mention educating users to update their Python package regularly). I think this makes the wheel format unsuitable, because linux_*.whl packages cannot be uploaded to PyPI, while a manylinux2014_*.whl package has to include libssl.so (and its dependencies). Dumb packages are not suitable for PyPI-based distribution, and the egg format is not supported by pip, so they are also not suitable.
Because of the stable ABI and how widespread libssl.so is, I think it should be possible to release a single binary package for most Linux distributions and multiple Python versions (similarly to the manylinux tags with wheel). Of course the package would require libssl to be installed on the machine, but that is something I can accept for the sake of better security. And that's where I am stuck.
My question is how can I create a binary python package, which
contains a python extension module,
depends on the system-wide installed libssl.so,
and can be uploaded to PyPI and installed via pip?
I tried to explore other possibilities, but I couldn't find anything else, so if you have any tips for other formats to look after, I would appreciate that also.
I'm trying to create and distribute (with pip) a Python package that has Python code, and C++ code compiled to a .pyd file with Pybind11 (using Visual Studio 2019). I also want to include .pyi stub files, for VScode and other editors. I can't find much documentation on doing this correctly.
I'd like to be able to just install the package via pip as normal, and write from mymodule.mysubmodule import myfunc etc like a normal Python package, including autocompletes, type annotations, VScode intellisense etc using the .pyi files I'd write.
My C++ code is in multiple cpp and header files. It uses a few standard libraries, and a few external libraries (such as boost). It defines a single module, and 2 submodules. I want to be able to distribute this on Windows and Linux, and for x86 and x64. I am currently targeting Python 3.9, and the c++17 standard.
How should I structure and distribute this package? Do I include the c++ source files, and create a setup.py similar to the Pybind11 example? If so, how do I include the external libraries? And how do I structure the .pyi stub files? Does this mean whoever tries to install my package would need a c++ compiler as well?
Or, should I compile my c++ to a .pyd/.so file for each platform and architecture? If so, is there a way to specify which one gets installed through pip? And again, how would I structure the .pyi stubs?
Generating .pyi stubs
The pybind11 issue mentions a couple of tools (1, 2) to generate stubs for binary modules. There could be more, but I'm not aware of others. Unfortunately both are far from being perfect, so you probably still need to check and adjust the generated stubs manually.
Distribution of .pyi stubs
After correcting the stubs, you just include those .pyi files in your distribution (e.g. in the wheel or as sources) along with a py.typed marker file or, alternatively, distribute them separately as a standalone package (e.g. mypackage-stubs).
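As a rough illustration of the inline layout described above, here is a small sketch that checks a package directory for the py.typed marker and lists its stub files. The function name check_typed_package and the mymodule/mysubmodule layout are made up for the example; the layout is built in a temporary directory only so the snippet runs on its own.

```python
import tempfile
from pathlib import Path


def check_typed_package(package_dir):
    """Return the sorted .pyi files in a package, or raise if py.typed is missing."""
    package_dir = Path(package_dir)
    if not (package_dir / "py.typed").is_file():
        raise FileNotFoundError(f"{package_dir.name}: missing py.typed marker")
    # Stubs may sit next to compiled submodules at any depth.
    return sorted(
        p.relative_to(package_dir).as_posix() for p in package_dir.rglob("*.pyi")
    )


# Hypothetical layout: one stub for the package, one for a compiled submodule.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "mymodule"
    pkg.mkdir()
    (pkg / "py.typed").touch()
    (pkg / "__init__.pyi").write_text("from . import mysubmodule as mysubmodule\n")
    (pkg / "mysubmodule.pyi").write_text("def myfunc(x: int) -> int: ...\n")
    stubs = check_typed_package(pkg)
    print(stubs)
```

For the files to end up in the built distribution, they also have to be declared as package data in whatever build configuration you use; that part is not shown here.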
Building wheels
Wheels let users of your library install it in binary form, i.e. without compilation. Portable wheels are typically built with older compilers in order to be compatible with a greater number of systems/platforms, so you might run into trouble with a C++17 library. (C++11 is old enough and should cause no problems with wheels.)
Building wheels for various platforms is tedious; pybind11's python_example uses the cibuildwheel package to do that, and I would recommend this route if you are already using CI.
If wheels are missing for the target platform, pip will attempt to install from source. This requires a compiler and the 3rd party libraries you are using to be already installed.
Maybe conda?
If your setup is complex and requires a number of 3rd party libraries, it might be worth writing a conda recipe and using conda-forge to generate binary versions of the package. Conda has an advantage over pip here, since it can manage non-Python dependencies as well.
I have been using Pyodide in the browser for small web apps. Some Python packages have no pure Python wheel, so I built them locally and uploaded them to a CDN. Such a wheel can then be installed using micropip in Pyodide.
So my question is: is there a way to build the wheel in the browser for Pyodide? What would be the difficulties in implementing this?
I am curious to know.
Thanks
To the extent that a wheel of a pure Python package is just some Python files plus metadata, packaged as a ZIP file, you can certainly create such an archive in Pyodide.
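To make the "ZIP file with metadata" point concrete, here is a minimal sketch that assembles a py3-none-any wheel in memory using only zipfile, hashlib and base64, which avoids subprocess entirely. The function name build_pure_wheel and the hello_sketch package are invented for the example, the metadata is the bare minimum, and whether micropip accepts the result was not verified here.

```python
import base64
import hashlib
import io
import zipfile


def build_pure_wheel(name, version, modules):
    """Build a py3-none-any wheel in memory.

    modules maps archive paths (e.g. "hello.py") to source text.
    Assumes name is already normalized (underscores, no dashes).
    """
    dist_info = f"{name}-{version}.dist-info"
    files = {path: src.encode("utf-8") for path, src in modules.items()}
    files[f"{dist_info}/METADATA"] = (
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
    ).encode("utf-8")
    files[f"{dist_info}/WHEEL"] = (
        "Wheel-Version: 1.0\n"
        "Generator: handwritten-sketch\n"
        "Root-Is-Purelib: true\n"
        "Tag: py3-none-any\n"
    ).encode("utf-8")

    # RECORD lists each file with an unpadded urlsafe-base64 sha256 and its size;
    # the RECORD entry itself carries no hash.
    record_lines = []
    for path, data in files.items():
        digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
        record_lines.append(f"{path},sha256={digest.decode('ascii')},{len(data)}")
    record_lines.append(f"{dist_info}/RECORD,,")
    files[f"{dist_info}/RECORD"] = ("\n".join(record_lines) + "\n").encode("utf-8")

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, data in files.items():
            archive.writestr(path, data)
    return f"{name}-{version}-py3-none-any.whl", buffer.getvalue()


filename, wheel_bytes = build_pure_wheel(
    "hello_sketch", "0.1.0", {"hello_sketch.py": "def greet():\n    return 'hi'\n"}
)
print(filename, len(wheel_bytes), "bytes")
```

This sidesteps setuptools completely, so it only covers packages whose files can be listed by hand; anything that needs a real build step is out of scope for this approach.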
In practice though, wheels are most commonly created with setuptools or the wheel package, which poses the following challenges for Pyodide:
setuptools, at least, uses the subprocess module, which is not supported in the WebAssembly VM, so it would need to be patched to avoid making subprocess calls
install requirements in setup.py (and build requirements in pyproject.toml) imply that one is able to download and install dependencies with pip, which poses its own challenges, namely,
the use of subprocess in pip / setuptools
fetching packages with standard Python modules also doesn't work (no sockets in WASM) unless one rewrites some of them with Web APIs (pyodide#140)
For those reasons, micropip was written as a very rudimentary alternative to pip in the context of Pyodide.
It's also a good idea to create wheels for pure Python packages even for Python on classical architectures, as they don't require arbitrary code execution to install, which makes wheels safer, more reliable and faster to install.
I'm accustomed to pre-downloading packages using Pip, then copying them over to a target machine for deployment. With the newly introduced Python Wheels, I'm forced to "pip ... --no-use-wheel", as some of the downloaded packages are platform specific (I'm developing on OSX and deploying to Debian) and will not install on the target machine. Is there a way to download Wheels for target platforms (or platform independent)?
The pip download command now has the --platform argument, which you can use to specify the desired platform:
pip download --platform=manylinux1_x86_64 --only-binary=:all: lxml
the --platform=manylinux1_x86_64 option indicates that you want wheels for this specific platform. manylinux1_x86_64 means roughly "compatible with most distributions and with an intel CPU architecture". This answer links to some PEPs that describe which platforms exist and what OS/CPU they are compatible with.
the --only-binary=:all: forces the use of binary distribution packages (ie. wheels, as opposed to sdist "source distribution packages") for ":all:" the things that will be installed in this command. Instead of :all:, one can pass a comma-separated list of specific distribution packages; see pip install --help for more info.
Note: I use the term "distribution package" to avoid confusion with the other kind of "package" (the ones one can import in a python script).
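If you pre-download packages from a deployment script rather than by hand, the same command can be assembled in Python. This is only a sketch: the helper name pip_download_args and the wheelhouse directory are assumptions, and the call that actually runs pip is left commented out because it needs network access.

```python
import sys


def pip_download_args(requirement, platform, dest="wheelhouse"):
    """Build the argv for downloading wheels built for another platform."""
    return [
        sys.executable, "-m", "pip", "download",
        "--platform", platform,
        "--only-binary=:all:",  # pip requires this (or --no-deps) with --platform
        "--dest", dest,
        requirement,
    ]


args = pip_download_args("lxml", "manylinux1_x86_64")
print(" ".join(args[1:]))

# To actually fetch the files (needs network access):
# import subprocess
# subprocess.run(args, check=True)
```

The downloaded directory can then be copied to the target machine and installed from there.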
The easiest way to achieve that is IMO to use a custom script.
You can access the whole of the PyPI index via the simple interface; if the package of interest offers one or more wheels, they will be listed at the same address + /<package-name>.
For example: if you were to install setuptools all wheels would be listed at: https://pypi.python.org/simple/setuptools/
In your script, remember to implement the recommended tag priority as specified by PEP-425. Essentially that boils down to downloading the most specific (as opposed to the most general) version of the package, as this normally translates into performance advantages, with for example C extensions replacing pure Python implementations of some algorithm.
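The selection step of such a script could look roughly like the sketch below. It assumes you have already scraped the file names from the simple index page and that you supply the list of supported tags yourself, ordered from most to least specific; the file names, the tag list and both function names are hypothetical.

```python
def parse_wheel_tags(filename):
    """Expand a wheel filename into its set of (python, abi, platform) tags."""
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are always the tags; an optional
    # build tag sits before them and is ignored here.
    python_tags, abi_tags, platform_tags = stem.split("-")[-3:]
    # PEP 425 allows compressed tag sets such as "py2.py3".
    return {
        (py, abi, plat)
        for py in python_tags.split(".")
        for abi in abi_tags.split(".")
        for plat in platform_tags.split(".")
    }


def pick_best_wheel(filenames, supported_tags):
    """Return the wheel matching the earliest entry in supported_tags, or None."""
    best_name, best_rank = None, len(supported_tags)
    for name in filenames:
        if not name.endswith(".whl"):
            continue  # skip sdists and other files
        tags = parse_wheel_tags(name)
        for rank, tag in enumerate(supported_tags):
            if rank >= best_rank:
                break  # cannot beat the current best
            if tag in tags:
                best_name, best_rank = name, rank
                break
    return best_name


# Hypothetical tag order for CPython 3.9 on 64-bit Linux, most specific first.
supported = [
    ("cp39", "cp39", "manylinux1_x86_64"),
    ("cp39", "abi3", "manylinux1_x86_64"),
    ("py3", "none", "any"),
]
candidates = [
    "example-1.0-py3-none-any.whl",
    "example-1.0-cp39-cp39-manylinux1_x86_64.whl",
    "example-1.0-cp38-cp38-manylinux1_x86_64.whl",
    "example-1.0.tar.gz",
]
best = pick_best_wheel(candidates, supported)
print(best)
```

For the interpreter you are currently running, the packaging library's packaging.tags.sys_tags() yields tags in priority order; for a different target machine, as in the question, the list has to be written out by hand as above.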
I am a bit confused. There seem to be two different kinds of Python packages, source distributions (setup.py sdist) and egg distributions (setup.py bdist_egg).
Both seem to be just archives with the same data, the python source files. One difference is that pip, the most recommended package manager, is not able to install eggs.
What is the difference between the two and what is 'the' way to distribute my packages?
(Note, I am not wanting to distribute my packages through PyPI, but I want to use a package manager that fetches my dependencies from PyPI)
setup.py sdist creates a source distribution: it contains setup.py, the source files of your module/script (.py files or .c/.cpp for binary modules), your data files, etc. The result is an archive that can then be used to recompile everything on any platform.
setup.py bdist (and bdist_*) creates a built distribution: it includes .pyc files, .so/.dll/.dylib for binary modules, .exe if using py2exe on Windows, your data files... but no setup.py. The result is an archive that is specific to a platform (for example linux-x86_64) and to a version of Python, and that can be installed simply by extracting it into the root of your filesystem (executables are in /usr/bin (or equivalent), data files in /usr/share, modules in /usr/lib/pythonX.X/site-packages/...). You can even build rpm archives that can be directly installed using your package manager.
2021 update: the tools to build and use eggs no longer exist in Python.
There are many more than two different kinds of Python (distribution) packages. This command lists many subcommands:
$ python setup.py --help-commands
Notice the various different bdist types.
An egg was a new package type, introduced by setuptools but later adopted by the standard library. It is meant to be installed monolithically onto sys.path. This differs from an sdist package, which is meant to have setup.py install run, copying each file into place and perhaps taking other actions as well (building extension modules, running additional arbitrary Python code included in the package).
eggs are largely obsolete at this point in time. EDIT: eggs are gone, they were used with the command "easy_install" that's been removed from Python.
The favored packaging format now is the "wheel" format, notably used by "pip install".
Whether you create an sdist or an egg (or wheel) is independent of whether you'll be able to declare what dependencies the package has (to be downloaded automatically from PyPI at installation time). All that's necessary for this dependency feature to work is for you to declare the dependencies using the extra APIs provided by distribute (the successor of setuptools) or distutils2 (the successor of distutils - otherwise known as packaging in the current development version of Python 3.x).
https://packaging.python.org/ is a good resource for further information about packaging. It covers some of the specifics of declaring dependencies (eg install_requires but not extras_require afaict).