The situation
You have 2 software products in development, a library which presents an API and a GUI tool that exposes the library for end users. Additionally, you expect a lot of technical staff at your place to use the library as a building block for all sorts of relevant custom code, tools and assets.
Both software products (library and GUI tool) are actively in development and influence each other. For both you want the most easy way of distribution and devenv setup, using pip:
pip install gui_tool
or
pip install library
Gui Tool (use case 1)
The installation of the GUI tool happens via pip and the dependencies are noted with a hard version number within the setup.py. Your library is one of those dependencies:
...
install_requires = ['library==1.2, package_x==0.3, package_y==0.6'],
...
The installation procedure consists of installing the tools and resolving the dependencies into a fresh virtualenv. Because every dependency version is hardwired, you control a stable and consistent installation. Nightly builds/bleeding edge dependencies can be controlled by devs through manually updating library to newer versions:
pip install --upgrade library # get the latest nightly build/hotfix release on your own
Custom code/tools (use case 2)
As mentioned, a lot of code and custom tools might be built using the API that your library provides. Everybody willing to use it should install/update it through pip easily with the oneliner from the top.
Problem
Fellow GUI tool developers should be able to pull in nightly builds/hotfix releases of the library dependency with pip. Other staff, using library as a building block somewhere, should always get the latest stable version using pip. You want to remain one unique release procedure for library providing stable versions, bleeding edge and hotfix releases through X.Y.Z versioning.
There are a few possible solutions to this, like:
maintaining a readme for the custom users that states the stable versions they should be installing with pip
or setting up some magic within the setup.py for the Gui Tool cloning the git repo and using it through python setup.py develop (Versioning can then be handled by devs via checkouts in the repo).
However, none of these seem particularly elegant, so I am interested in your solutions, ideas or best practices for stable/bleeding edge/nightly build dependency management for Python?
Managing a "bleeding edge" version in pip can be achieved by using the --pre flag of pip
From pip install --help:
--pre Include pre-release and development versions. By default, pip only finds stable versions.
You should add classifiers to your project, in particular set your stable release to Development Status :: 5 - Production/Stable and your "bleeding edge releases" to anything below 5.
This is how major python packages manage their alphas, e.g.: the django project with currently 1.9a in alpha and 1.8.5 in stable.
To upgrade to the latest version flagged as Development Status :: 3 - Alpha
:
pip install --pre --upgrade library
Users using the library as building blocks need not now about the alpha releases and will go with a regular pip install library which will install the latest stable flagged version.
I am not sure if you need to manage on library or several libraries?
It seems like you have a case of complex Python dependency tracking and deployment. There are few large Python projects which face the same issue with the need to track several different release channels, betas and bleeding edge. Most notable them is Plone which sports more than 300 MB of Python egg source code.
Plone used a buildout as the dependency manager instead of pip to tackle the complexity. Buildout offers mix and match configuration files for Python dependencies.
To quickly grasp how this happens see Plone core buildouts
Example of Plone 5.0 latest: http://dist.plone.org/release/5.0-latest/versions.cfg
Example of older Plone 4.x latest release: http://dist.plone.org/release/4.3rc1/versions.cfg
Github repository which drives the process https://github.com/plone/buildout.coredev - however the complexity here has grown so high that it is no longer very easy for an outsider to grasp what is going on. Release process described here http://buildoutcoredev.readthedocs.org/en/latest/release.html
Warning Buildout is arcane and rough on the edges like just surfaced volcano island of hot lava. Is it more elegant than generating pip requirements.txt with custom scripts? Yes if you have more than one library which needs to be managed.
Related
I'm trying to build a speech recognition application that works in a browser. Currently using pyodide with a web worker. I have my own package, built alongside pyaudio, that I use for the web worker.
Is there a way that I can include a brew instillation, specifically portaudio, inside of my python package so that when I build the package, portaudio is included in the wheel file? I need portaudio included for this to work in the browser.
Thank you!
I'm understanding two different questions here, so I'll try to answer them both.
Can I have a python build fetch a Homebrew project during buildtime
To my knowledge, the answer is no. The Python distribution system is separate from Homebrew, and they can't interact in this fashion.
Even if they could, this wouldn't necessarily be desirable:
What happens if the user isn't on macOS (or Linux)? Then the build would fail.
The prefix that Homebrew will install the package in isn't very deterministic. The user might be using a custom prefix, or they might be on Apple Silicon (which has a different default prefix to Intel).
Your python package might run into some difficulty locating the package.
What about if they don't have Homebrew installed? They might have another package manager like MacPorts or Fink, or maybe none at all.
Can I bundle portaudio into the build distribution?
Maybe? Even if you could, I almost certainly wouldn't recommend it.
Bundling dependencies increases the size of the distribution unnecessarily.
It would take a reasonable amount of effort to setup, assuming you can do it.
All these reasons are why for the majority of projects that have a similar setup, you will find that they recommend installing certain packages with their system package manager first, before building the Python source code.
This allows them to choose whatever package manager they have installed, and it should also be a quick and painless process.
Therefore, just change your installation instructions to the following:
# On macOS
brew install portaudio
pip install ...
I've seen 2 ways of installing OpenCV (there might be more ways which I don't know):
Installing from the source
Installing with pip: pip install opencv-python
My question is, why we need to install OpenCV from the source while we can simply install it using pip? Since people are using both of them, both must be useful. If so, there are any conditions for selecting one of them?
I will list out the differences between both
1.
Installation using pip
Installation is done at the default location where all the python packages resides.
Installation from Source
Installation location is provided by the developer.
2.
Installation using pip
In terms of performance, the packages installed might run slower because of the hidden conflicts between features.
Installation from Source
The developer can select the optimization flags during the compilation of packages which are responsible for the fast performance of library.
3.
Installation using pip
The developers can neither add nor remove features provided in the installation done by pip.
Installation from Source
The developer has all the rights to add or remove the features during the installation of library.
4.
Installation using pip
The package manager will do the work on behalf of developer. Package Manager is also responsible for taking care of library updation.
Installation from Source
The developers are responsible for feature selection and updation of library. They must be aware of new package updates, latest security patches etc, to keep themselves updated about the library.
Hope this helps you!
OpenCV is always under development, and the thing is some parts of the library is not going to published, due to compatibility and copyright issues, but if you use the source then you can have all the capabilities that you need. SURF & SIFT are examples of this problem.
The Situation
I’m trying to port an open-source library to Python 3. (SymPy, if anyone is wondering.)
So, I need to run 2to3 automatically when building for Python 3. To do that, I need to use distribute. Therefore, I need to port the current system, which (according to the doctest) is distutils.
The Problem
Unfortunately, I’m not sure what’s the difference between these modules—distutils, distribute, setuptools. The documentation is sketchy as best, as they all seem to be a fork of one another, intended to be compatible in most circumstances (but actually, not all)…and so on, and so forth.
The Question
Could someone explain the differences? What am I supposed to use? What is the most modern solution? (As an aside, I’d also appreciate some guide on porting to Distribute, but that’s a tad beyond the scope of the question…)
As of May 2022, most of the other answers to this question are several years out-of-date. When you come across advice on Python packaging issues, remember to look at the date of publication, and don't trust out-of-date information.
The Python Packaging User Guide is worth a read. Every page has a "last updated" date displayed, so you can check the recency of the manual, and it's quite comprehensive. The fact that it's hosted on a subdomain of python.org of the Python Software Foundation just adds credence to it. The Project Summaries page is especially relevant here.
Summary of tools:
Here's a summary of the Python packaging landscape:
Supported tools:
setuptools was developed to overcome Distutils' limitations, and is not included in the standard library. It introduced a command-line utility called easy_install. It also introduced the setuptools Python package that can be imported in your setup.py script, and the pkg_resources Python package that can be imported in your code to locate data files installed with a distribution. One of its gotchas is that it monkey-patches the distutils Python package. It should work well with pip. It sees regular releases.
Official docs | Pypi page | GitHub repo | setuptools section of Python Package User Guide
scikit-build is an improved build system generator that internally uses CMake to build compiled Python extensions. Because scikit-build isn't based on distutils, it doesn't really have any of its limitations. When ninja-build is present, scikit-build can compile large projects over three times faster than the alternatives. It should work well with pip.
Official docs | Pypi page | GitHub repo | scikit-build section of Python Package User Guide
distlib is a library that provides functionality that is used by higher level tools like pip.
Official Docs | Pypi page | Bitbucket repo | distlib section of Python Package User Guide
packaging is also a library that provides functionality used by higher level tools like pip and setuptools
Official Docs | Pypi page | GitHub repo | packaging section of Python Package User Guide
Deprecated/abandoned tools:
distutils is still included in the standard library of Python, but is considered deprecated as of Python 3.10. It is useful for simple Python distributions, but lacks features. It introduces the distutils Python package that can be imported in your setup.py script.
Official docs | distutils section of Python Package User Guide
distribute was a fork of setuptools. It shared the same namespace, so if you had Distribute installed, import setuptools would actually import the package distributed with Distribute. Distribute was merged back into Setuptools 0.7, so you don't need to use Distribute any more. In fact, the version on Pypi is just a compatibility layer that installs Setuptools.
distutils2 was an attempt to take the best of distutils, setuptools and distribute and become the standard tool included in Python's standard library. The idea was that distutils2 would be distributed for old Python versions, and that distutils2 would be renamed to packaging for Python 3.3, which would include it in its standard library. These plans did not go as intended, however, and currently, distutils2 is an abandoned project. The latest release was in March 2012, and its Pypi home page has finally been updated to reflect its death.
Others:
There are other tools, if you are interested, read Project Summaries in the Python Packaging User Guide. I won't list them all, to not repeat that page, and to keep the answer matching the question, which was only about distribute, distutils, setuptools and distutils2.
Recommendation:
If all of this is new to you, and you don't know where to start, I would recommend learning setuptools, along with pip and virtualenv, which all work very well together.
If you're looking into virtualenv, you might be interested in this question: What is the difference between venv, pyvenv, pyenv, virtualenv, virtualenvwrapper, etc?. (Yes, I know, I groan with you.)
I’m a distutils maintainer and distutils2/packaging contributor. I did a talk about Python packaging at ConFoo 2011 and these days I’m writing an extended version of it. It’s not published yet, so here are excerpts that should help define things.
Distutils is the standard tool used for packaging. It works rather well for simple needs, but is limited and not trivial to extend.
Setuptools is a project born from the desire to fill missing distutils functionality and explore new directions. In some subcommunities, it’s a de facto standard. It uses monkey-patching and magic that is frowned upon by Python core developers.
Distribute is a fork of Setuptools that was started by developers feeling that its development pace was too slow and that it was not possible to evolve it. Its development was considerably slowed when distutils2 was started by the same group. 2013-August update: distribute is merged back into setuptools and discontinued.
Distutils2 is a new distutils library, started as a fork of the distutils codebase, with good ideas taken from setup tools (of which some were thoroughly discussed in PEPs), and a basic installer inspired by pip. The actual name you use to import Distutils2 is packaging in the Python 3.3+ standard library, or distutils2 in 2.4+ and 3.1–3.2. (A backport will be available soon.) Distutils2 did not make the Python 3.3 release, and it was put on hold.
More info:
The fate of Distutils – Pycon Summit + Packaging Sprint detailed report
A Quick Diff between Distutils and Distutils2
I hope to finish my guide soon, it will contain more info about each library’s strong and weak points and a transition guide.
NOTE: Answer deprecated, Distribute now obsolete. This answer is no longer valid since the Python Packaging Authority was formed and has done a lot of work cleaning this up.
Yep, you got it. :-o I think at this time the preferred package is Distribute, which is a fork of setuptools, which are an extension of distutils (the original packaging system). Setuptools was not being maintained so is was forked and renamed, however when installed it uses the package name of setuptools! I think most Python developers now use Distribute, and I can say for sure that I do.
I realize that I have replied to your secondary question without addressing unquestioned assumptions in your original problem:
I'm trying to port an open-source library (SymPy, if anyone is wondering) to Python 3. To
do this, I need to run 2to3 automatically when building for Python 3.
You may, not need. Other strategies are described at http://docs.python.org/dev/howto/pyporting
To do that, I need to use distribute,
You may :) distutils supports build-time 2to3 conversion for code (not docstrings), in a different manner that distribute’s: http://docs.python.org/dev/howto/pyporting#during-installation
Updating this question in late 2014 where fortunately the Python packaging chaos has been greatly cleaned up by Continuum's "conda" package manager.
In particular, conda quickly enables the creation of conda "environments". You can configure your environments with different versions of Python. For example:
conda create -n py34 python=3.4 anaconda
conda create -n py26 python=2.6 anaconda
will create two ("py34" or "py26") Python environments with different versions of Python.
Afterwards you can invoke the environment with the specific version of Python with:
source activate <env name>
This feature seems especially useful in your case where you are having to deal with different version of Python.
Moreover, conda has the following features:
Python agnostic
Cross platform
No admin privileges required
Smart dependency management (by way of a SAT solver)
Nicely deals with C, Fortran and system level libraries that you may have to link against
That last point is especially important if you are in the scientific computing arena.
Does Python have a package/module management system, similar to how Ruby has rubygems where you can do gem install packagename?
On Installing Python Modules, I only see references to python setup.py install, but that requires you to find the package first.
Recent progress
March 2014: Good news! Python 3.4 ships with Pip. Pip has long been Python's de-facto standard package manager. You can install a package like this:
pip install httpie
Wahey! This is the best feature of any Python release. It makes the community's wealth of libraries accessible to everyone. Newbies are no longer excluded from using community libraries by the prohibitive difficulty of setup.
However, there remains a number of outstanding frustrations with the Python packaging experience. Cumulatively, they make Python very unwelcoming for newbies. Also, the long history of neglect (ie. not shipping with a package manager for 14 years from Python 2.0 to Python 3.3) did damage to the community. I describe both below.
Outstanding frustrations
It's important to understand that while experienced users are able to work around these frustrations, they are significant barriers to people new to Python. In fact, the difficulty and general user-unfriendliness is likely to deter many of them.
PyPI website is counter-helpful
Every language with a package manager has an official (or quasi-official) repository for the community to download and publish packages. Python has the Python Package Index, PyPI. https://pypi.python.org/pypi
Let's compare its pages with those of RubyGems and Npm (the Node package manager).
https://rubygems.org/gems/rails RubyGems page for the package rails
https://www.npmjs.org/package/express Npm page for the package express
https://pypi.python.org/pypi/simplejson/ PyPI page for the package simplejson
You'll see the RubyGems and Npm pages both begin with a one-line description of the package, then large friendly instructions how to install it.
Meanwhile, woe to any hapless Python user who naively browses to PyPI. On https://pypi.python.org/pypi/simplejson/ , they'll find no such helpful instructions. There is however, a large green 'Download' link. It's not unreasonable to follow it. Aha, they click! Their browser downloads a .tar.gz file. Many Windows users can't even open it, but if they persevere they may eventually extract it, then run setup.py and eventually with the help of Google setup.py install. Some will give up and reinvent the wheel..
Of course, all of this is wrong. The easiest way to install a package is with a Pip command. But PyPI didn't even mention Pip. Instead, it led them down an archaic and tedious path.
Error: Unable to find vcvarsall.bat
Numpy is one of Python's most popular libraries. Try to install it with Pip, you get this cryptic error message:
Error: Unable to find vcvarsall.bat
Trying to fix that is one of the most popular questions on Stack Overflow: "error: Unable to find vcvarsall.bat"
Few people succeed.
For comparison, in the same situation, Ruby prints this message, which explains what's going on and how to fix it:
Please update your PATH to include build tools or download the DevKit from http://rubyinstaller.org/downloads and follow the instructions at http://github.com/oneclick/rubyinstaller/wiki/Development-Kit
Publishing packages is hard
Ruby and Nodejs ship with full-featured package managers, Gem (since 2007) and Npm (since 2011), and have nurtured sharing communities centred around GitHub. Npm makes publishing packages as easy as installing them, it already has 64k packages. RubyGems lists 72k packages. The venerable Python package index lists only 41k.
History
Flying in the face of its "batteries included" motto, Python shipped without a package manager until 2014.
Until Pip, the de facto standard was a command easy_install. It was woefully inadequate. The was no command to uninstall packages.
Pip was a massive improvement. It had most the features of Ruby's Gem. Unfortunately, Pip was--until recently--ironically difficult to install. In fact, the problem remains a top Python question on Stack Overflow: "How do I install pip on Windows?"
And just to provide a contrast, there's also pip.
The Python Package Index (PyPI) seems to be standard:
To install a package:
pip install MyProject
To update a package
pip install --upgrade MyProject
To fix a version of a package pip install MyProject==1.0
You can install the package manager as follows:
curl -O http://python-distribute.org/distribute_setup.py
python distribute_setup.py
easy_install pip
References:
http://guide.python-distribute.org/
http://pypi.python.org/pypi/distribute
As a Ruby and Perl developer and learning-Python guy, I haven't found easy_install or pip to be the equivalent to RubyGems or CPAN.
I tend to keep my development systems running the latest versions of modules as the developers update them, and freeze my production systems at set versions. Both RubyGems and CPAN make it easy to find modules by listing what's available, then install and later update them individually or in bulk if desired.
easy_install and pip make it easy to install a module ONCE I located it via a browser search or learned about it by some other means, but they won't tell me what is available. I can explicitly name the module to be updated, but the apps won't tell me what has been updated nor will they update everything in bulk if I want.
So, the basic functionality is there in pip and easy_install but there are features missing that I'd like to see that would make them friendlier and easier to use and on par with CPAN and RubyGems.
There are at least two, easy_install and its successor pip.
As of at least late 2014, Continuum Analytics' Anaconda Python distribution with the conda package manager should be considered. It solves most of the serious issues people run into with Python in general (managing different Python versions, updating Python versions, package management, virtual environments, Windows/Mac compatibility) in one cohesive download.
It enables you to do pretty much everything you could want to with Python without having to change the system at all. My next preferred solution is pip + virtualenv, but you either have to install virtualenv into your system Python (and your system Python may not be the version you want), or build from source. Anaconda makes this whole process the click of a button, as well as adding a bunch of other features.
That'd be easy_install.
It's called setuptools. You run it with the "easy_install" command.
You can find the directory at http://pypi.python.org/
I don't see either MacPorts or Homebrew mentioned in other answers here, but since I do see them mentioned elsewhere on Stack Overflow for related questions, I'll add my own US$0.02 that many folks seem to consider MacPorts as not only a package manager for packages in general (as of today they list 16311 packages/ports, 2931 matching "python", albeit only for Macs), but also as a decent (maybe better) package manager for Python packages/modules:
Question
"...what is the method that Mac python developers use to manage their modules?"
Answers
"MacPorts is perfect for Python on the Mac."
"The best way is to use MacPorts."
"I prefer MacPorts..."
"With my MacPorts setup..."
"I use MacPorts to install ... third-party modules tracked by MacPorts"
SciPy
"Macs (unlike Linux) don’t come with a package manager, but there are a couple of popular package managers you can install.
Macports..."
I'm still debating on whether or not to use MacPorts myself, but at the moment I'm leaning in that direction.
On Windows install http://chocolatey.org/ then
choco install python
Open a new cmd-window with the updated PATH. Next, do
choco install pip
After that you can
pip install pyside
pip install ipython
...
Since no one has mentioned pipenv here, I would like to describe my views why everyone should use it for managing python packages.
As #ColonelPanic mentioned there are several issues with the Python Package Index and with pip and virtualenv also.
Pipenv solves most of the issues with pip and provides additional features also.
Pipenv features
Pipenv is intended to replace pip and virtualenv, which means pipenv will automatically create a separate virtual environment for every project thus avoiding conflicts between different python versions/package versions for different projects.
Enables truly deterministic builds, while easily specifying only what you want.
Generates and checks file hashes for locked dependencies.
Automatically install required Pythons, if pyenv is available.
Automatically finds your project home, recursively, by looking for a Pipfile.
Automatically generates a Pipfile, if one doesn’t exist.
Automatically creates a virtualenv in a standard location.
Automatically adds/removes packages to a Pipfile when they are un/installed.
Automatically loads .env files, if they exist.
If you have worked on python projects before, you would realize these features make managing packages way easier.
Other Commands
check checks for security vulnerabilities and asserts that PEP 508 requirements are being met by the current environment. (which I think is a great feature especially after this - Malicious packages on PyPi)
graph will show you a dependency graph, of your installed dependencies.
You can read more about it here - Pipenv.
Installation
You can find the installation documentation here
P.S.: If you liked working with the Python Package requests , you would be pleased to know that pipenv is by the same developer Kenneth Reitz
In 2019 poetry is the package and dependency manager you are looking for.
https://github.com/sdispater/poetry#why
It's modern, simple and reliable.
Poetry is what you're looking for. It takes care of dependency management, virtual environments, running.
The last time I had to worry about installing Python packages was two years ago working with Enthought, NumPy and MayaVi2. That experience gave me lingering nightmares related to quirky behavior installing & updating Python packages in non-standard locations (in $HOME/usr/local2.6/, for example).
Anyway, my work is taking me back to installing various Python packages. The CheeseShop Tutorial mentions DistUtils and EasyInstall in addition to Buildout! I am having a hard time finding one place that compares these (and other) PyPi installation tools, so I am hoping to tap into the StackOverflow community: What are the strengths & weaknesses of each installation tool?
First of all, regardless of installation tool you decide on, start using virtualenv --no-site-packages! That way, python packages are not installed globally and you can easily get back to where you were in old as well as new projects.
Now, your comparison is a little bit apples-to-pears as the tools you list are not mutually exclusive. However, I can wholly recommend Buildout. It will install python packages as well as other stuff and lets you automate installation and deployment of (complex) projects.
Also, I recommend looking into Fabric as a means to automate administrative tasks.
I've done quiet a bit of research on this topic(a couple of weeks worth) before settling down on using buildout for all of my projects.
DistUtils and EasyInstall in addition to Buildout!
The difficulty in creating one place to compare all of these tools is that they're all part of a same tool chain and are used together to create a predictable, reliable and flexible tool set.
For example, easy_install is used to install distutils packages from pypi(cheeseshop) to your system Python's site-packages directory. This drastically simplifies installation of packages to your system/global sys.path.
easy_install is very convenient for packages that are consistent for all projects. But, I find that I prefer to use system's easy_install to install packages that projects do not depend on. For example, github-cli I use with every project, because it allows me to interact with project's Github Issues from command line. I use this with projects, but it's for convenience and the project itself does not have dependancy on this package.
For managing project's dependancies, I use buildout. Buildout allows you to indicate specifically what version of packages your project depends on. I prefer buildout over pip-requirements.txt because buildout is declarative. With pip, you install the packages and at the end of the development you generate the requirements.txt file. With Buildout on the other hand, you modify the buildout.cfg before the package egg is added to your project. This forces me to be conscious of what packages I'm adding to the project.
Now, there is a matter of virtualenv. One of the most publicized features of virtualenv is obviously --no-site-packages option. I have not found that option to be particularly useful, because I use buildout. Buildout manages the sys.path and includes only the packages I ask tell it to include. It also, includes everything in system Python's site-packages but since I don't have anything there that I use in projects, I never have conflicts.
Also, I find that --no-site-packages only hinders my development process, because some packages I install using my sistem's packaging system. Usually, anything that has C libraries that need to be compiled, I install through the system's packaging system.
In the project's fabfile.py I include test function to test for presence of system packages that I install through system's package manager.
In summary, here is how I use these tools:
System's Package Manager(apt-get, yam, port, fink ...)
I use one of these to install python versions that I need on this system. I also use it to install packages like lxml which include c libraries.
easy_install
I use to install packages from pypi that I use on all projects, but projects are not dependant on these packages.
buildout
I use to manage dependancies of a project.
In my experience, this workflow has been very flexible, portable and easy to work with.
Distribute is a new fork of setuptools (easy_install), which should also be considered. Even Guido recommends it.
Buildout is orthogonal to the packaging --- you can use buildout with distribute.
Whenever I need to remind myself of the state of play, I look at these as a starting point:
The State of Python Packaging, a response to:
On packaging, linked from:
Tools of the Modern Python Hacker
I can't easily help you with finding the strength, but I can make it a bit harder, since it also depends on the platform you want to use.
For example if you need to install python packages on Gentoo (GNU/Liunx) based computers, you can easily use g-pypi to create ebuilds for all packages which use distutils (rather: a setup.py). That way they get completely integrated into your system and can be added, updated and removed like all your other tools. But it naturally only works for Gentoo-based systems.
Also you can use yolk to find out about all packages installed via easy_install on your system (not only on Gentoo).
When I write code, I simply use distutils (because it allows building portage ebuilds very easily) and sometimes basic setuptools features, or organize my programs so people can just download and run them from the program folder (ideally just unpack the source archive / clone the repository somewhere). This isn't the perfect solution, but until the core python team decides which way they want to move, I don't want to fix onto a path (anymore) which might disappear.