Differences between distribute, distutils, setuptools and distutils2? - python

The Situation
I’m trying to port an open-source library to Python 3. (SymPy, if anyone is wondering.)
So, I need to run 2to3 automatically when building for Python 3. To do that, I need to use distribute. Therefore, I need to port the current system, which (according to the doctest) is distutils.
The Problem
Unfortunately, I’m not sure what the difference is between these modules—distutils, distribute, setuptools. The documentation is sketchy at best, as they all seem to be forks of one another, intended to be compatible in most circumstances (but actually not all)…and so on, and so forth.
The Question
Could someone explain the differences? What am I supposed to use? What is the most modern solution? (As an aside, I’d also appreciate some guide on porting to Distribute, but that’s a tad beyond the scope of the question…)

As of May 2022, most of the other answers to this question are several years out-of-date. When you come across advice on Python packaging issues, remember to look at the date of publication, and don't trust out-of-date information.
The Python Packaging User Guide is worth a read. Every page has a "last updated" date displayed, so you can check the recency of its advice, and it's quite comprehensive. The fact that it's hosted on a python.org subdomain by the Python Software Foundation lends it extra credence. The Project Summaries page is especially relevant here.
Summary of tools:
Here's a summary of the Python packaging landscape:
Supported tools:
setuptools was developed to overcome Distutils' limitations, and is not included in the standard library. It introduced a command-line utility called easy_install. It also introduced the setuptools Python package that can be imported in your setup.py script, and the pkg_resources Python package that can be imported in your code to locate data files installed with a distribution. One of its gotchas is that it monkey-patches the distutils Python package. It should work well with pip. It sees regular releases.
Official docs | Pypi page | GitHub repo | setuptools section of Python Package User Guide
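As a minimal sketch (the project name and dependency below are made up for illustration), a setuptools-based setup.py looks like this:
from setuptools import setup, find_packages

setup(
    name='example-project',            # hypothetical project name
    version='1.0.0',
    packages=find_packages(),          # auto-discover packages in the tree
    install_requires=['requests'],     # runtime dependencies, resolved by pip
)
Running pip install . in the project directory then builds and installs it.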
scikit-build is an improved build system generator that internally uses CMake to build compiled Python extensions. Because scikit-build isn't based on distutils, it doesn't really have any of its limitations. When ninja-build is present, scikit-build can compile large projects over three times faster than the alternatives. It should work well with pip.
Official docs | Pypi page | GitHub repo | scikit-build section of Python Package User Guide
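As a hedged sketch of what that looks like, assuming a CMakeLists.txt at the project root that builds a hypothetical compiled extension, scikit-build's setup is a drop-in replacement for the setuptools one:
from skbuild import setup  # pip install scikit-build

setup(
    name='example-ext',        # hypothetical package with a compiled extension
    version='0.1.0',
    packages=['example_ext'],  # pure-Python part; CMake builds the rest
)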
distlib is a library that provides functionality that is used by higher level tools like pip.
Official Docs | Pypi page | Bitbucket repo | distlib section of Python Package User Guide
packaging is also a library that provides functionality used by higher level tools like pip and setuptools
Official Docs | Pypi page | GitHub repo | packaging section of Python Package User Guide
Deprecated/abandoned tools:
distutils is still included in the standard library of Python, but is considered deprecated as of Python 3.10. It is useful for simple Python distributions, but lacks features. It introduces the distutils Python package that can be imported in your setup.py script.
Official docs | distutils section of Python Package User Guide
distribute was a fork of setuptools. It shared the same namespace, so if you had Distribute installed, import setuptools would actually import the package distributed with Distribute. Distribute was merged back into Setuptools 0.7, so you don't need to use Distribute any more. In fact, the version on Pypi is just a compatibility layer that installs Setuptools.
distutils2 was an attempt to take the best of distutils, setuptools and distribute and become the standard tool included in Python's standard library. The idea was that distutils2 would be distributed for old Python versions, and that distutils2 would be renamed to packaging for Python 3.3, which would include it in its standard library. These plans did not go as intended, however, and currently, distutils2 is an abandoned project. The latest release was in March 2012, and its Pypi home page has finally been updated to reflect its death.
Others:
There are other tools; if you are interested, read Project Summaries in the Python Packaging User Guide. I won't list them all, so as not to repeat that page and to keep the answer focused on the question, which was only about distribute, distutils, setuptools and distutils2.
Recommendation:
If all of this is new to you, and you don't know where to start, I would recommend learning setuptools, along with pip and virtualenv, which all work very well together.
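A quick sketch of how the three fit together (paths and names here are arbitrary):
virtualenv .venv               # create an isolated environment
source .venv/bin/activate      # activate it (Linux/macOS)
pip install requests           # install a dependency from PyPI
pip install -e .               # editable install of a setuptools-based project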
If you're looking into virtualenv, you might be interested in this question: What is the difference between venv, pyvenv, pyenv, virtualenv, virtualenvwrapper, etc?. (Yes, I know, I groan with you.)

I’m a distutils maintainer and distutils2/packaging contributor. I did a talk about Python packaging at ConFoo 2011 and these days I’m writing an extended version of it. It’s not published yet, so here are excerpts that should help define things.
Distutils is the standard tool used for packaging. It works rather well for simple needs, but is limited and not trivial to extend.
Setuptools is a project born from the desire to fill missing distutils functionality and explore new directions. In some subcommunities, it’s a de facto standard. It uses monkey-patching and magic that is frowned upon by Python core developers.
Distribute is a fork of Setuptools that was started by developers feeling that its development pace was too slow and that it was not possible to evolve it. Its development was considerably slowed when distutils2 was started by the same group. 2013-August update: distribute is merged back into setuptools and discontinued.
Distutils2 is a new distutils library, started as a fork of the distutils codebase, with good ideas taken from setuptools (some of which were thoroughly discussed in PEPs), and a basic installer inspired by pip. The actual name you use to import Distutils2 is packaging in the Python 3.3+ standard library, or distutils2 in 2.4+ and 3.1–3.2. (A backport will be available soon.) Distutils2 did not make the Python 3.3 release, and it was put on hold.
More info:
The fate of Distutils – Pycon Summit + Packaging Sprint detailed report
A Quick Diff between Distutils and Distutils2
I hope to finish my guide soon; it will contain more info about each library’s strong and weak points and a transition guide.

NOTE: Answer deprecated, Distribute now obsolete. This answer is no longer valid since the Python Packaging Authority was formed and has done a lot of work cleaning this up.
Yep, you got it. :-o I think at this time the preferred package is Distribute, which is a fork of setuptools, which is an extension of distutils (the original packaging system). Setuptools was not being maintained, so it was forked and renamed; however, when installed it uses the package name setuptools! I think most Python developers now use Distribute, and I can say for sure that I do.

I realize that I have replied to your secondary question without addressing the unquestioned assumptions in your original problem:
I'm trying to port an open-source library (SymPy, if anyone is wondering) to Python 3. To do this, I need to run 2to3 automatically when building for Python 3.
You may not need to. Other strategies are described at http://docs.python.org/dev/howto/pyporting
To do that, I need to use distribute,
You may :) distutils supports build-time 2to3 conversion for code (not docstrings), in a different manner than Distribute’s: http://docs.python.org/dev/howto/pyporting#during-installation
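For reference, the distutils variant from that howto boils down to swapping in the build_py_2to3 command (which only exists under Python 3); a minimal sketch with a hypothetical project name:
from distutils.core import setup

try:
    # Python 3: run 2to3 over the sources at build time
    from distutils.command.build_py import build_py_2to3 as build_py
except ImportError:
    # Python 2: plain build, no conversion
    from distutils.command.build_py import build_py

setup(
    name='example',
    version='1.0',
    packages=['example'],
    cmdclass={'build_py': build_py},
)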

Updating this answer in late 2014, when fortunately the Python packaging chaos has been greatly cleaned up by Continuum's "conda" package manager.
In particular, conda quickly enables the creation of conda "environments". You can configure your environments with different versions of Python. For example:
conda create -n py34 python=3.4 anaconda
conda create -n py26 python=2.6 anaconda
will create two Python environments ("py34" and "py26") with different versions of Python.
Afterwards you can invoke the environment with the specific version of Python with:
source activate <env name>
This feature seems especially useful in your case, where you have to deal with different versions of Python.
Moreover, conda has the following features:
Python agnostic
Cross platform
No admin privileges required
Smart dependency management (by way of a SAT solver)
Nicely deals with C, Fortran and system level libraries that you may have to link against
That last point is especially important if you are in the scientific computing arena.

Related

What's a good pip/setuptools compliant version number for a fork of a package?

I am forking a python package, where I expect the package author to merge my changes in the near future. The package author doesn't release very often, so I expect to have my temporary fork as a dependency for some of my other packages. I need to create an appropriate version number for my fork that is pip/setuptools compliant.
Let's say the current version is 1.6.4, and I expect the author's next release to be 1.6.5. Would an appropriate version for the fork be 1.6.4.1 or 1.6.5.dev20140520? Both seem to be compliant with PEP 440, but I have also had experience with recent versions of pip not finding dev releases unless you specifically use the --pre flag. It seems that 1.6.4.1 would be a good choice, but I don't know how happy pip will be with an N.N.N.N format (e.g. will pip treat it as a pre-release?).
Is there some standard convention for this? Note, I don't want to change the name of the author's package, but I do need a temporary fork that my other packages can install with minimal issues.
It seems that there's no official convention for naming a fork of a Python package. As @larsman pointed out in the question comments, a long-standing convention for forking package-1.6.4 is package-1.6.4-forkname-0.1 -- and while this has been used by the Linux community (and others) for years, it has recently lost favor for Python packages. One of the main issues is that this convention does not follow the versioning conventions accepted by pip, and thus has seen less use in recent years for Python packages. If you do a search for "fork" on PyPI's package index (https://pypi.python.org/pypi?%3Aaction=search&term=fork&submit=search), you'll see that two common pip-compliant cases are emerging:
package-forkname-1.6.4
forkname-1.6.4, where forkname is a "clever" variant on packagename (e.g. PIL and pillow)
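For example, under the first convention, the fork's setup.py keeps the upstream version but changes the name (all names here are hypothetical):
from setuptools import setup

setup(
    name='package-forkname',   # upstream name plus a fork suffix
    version='1.6.4',           # tracks the upstream release the fork is based on
)
pip then treats it as an ordinary release of a distinct project, so your other packages can depend on package-forkname directly.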

Best practices for Python deployment -- multiple versions, standard install locations, packaging tools etc

There are many posts on different aspects of this question, but I haven't seen one that brings it all together.
First, a subjective statement: it seems like the simplicity we experience when working with the Python language is shot to pieces when we move outside the interpreter and start grappling with deployment issues. How best to have multiple versions of Python on the same machine? Where should packages be installed? Distutils vs. setuptools vs. pip, etc. It seems like the Zen of Python is being abused pretty badly when it comes to deployment. I'm feeling eerie echoes of the "DLL hell" experience on Windows.
Do the experts agree on some degree of best practice on these questions?
Do you run multiple versions of Python on the same machine? How do you remain confident that they can co-exist -- and the newer version doesn't break assumptions of other processes that rely on the earlier version (scripts provided by OS vendor, for example)? Is this safe? Does virtualenv suffice?
What are the best choices for locations for different components of the Python environment (including 3rd party packages) on the local file system? Is there a strict or rough correspondence between locations for many different versions of Unixy and Windows OS's that can be relied upon?
And the murkiest corner of the swamp -- what install tools do you use (setuptools, distutils, pip etc.) and do they play well with your choices re: file locations, Python virtual environments, Python path etc.
These sound like hard questions. I'm hopeful the experienced Pythonistas may have defined a canonical approach (or two) to these challenges. Any approach that "hangs together" as a system that can be used with confidence (feeling less like separate, unrelated tools) would be very helpful.
I've found that virtualenv is the only reliable way to configure and maintain multiple environments on the same machine. It even has a way of packaging up an environment and installing it on another machine.
For package management I always use pip, since it works so nicely with virtualenv. It also makes it easy to install and upgrade packages from a variety of sources, such as git repositories.
I agree this is quite a broad question, but I’ll try to address its many parts anyway.
About your subjective statement: I don’t see why the simplicity and elegance of Python would imply that packaging and deployment matters should suddenly become simple things. Some things related to packaging are simple, others are not, others could be. It would be best for users if we had one complete, robust and easy packaging system, but it hasn’t turned out that way. distutils was created and then its development paused, setuptools was created and added new solutions and new problems, distribute was forked from setuptools because of social problems, and finally distutils2 was created to make one official complete library. (More on Differences between distribute, distutils, setuptools and distutils2?) The situation is far from ideal for developers and users, but we are working on making it better.
How best to have multiple versions of Python on the same machine? Use your package manager if you’re on a modern OS, or use “make altinstall” if you compile from source on UNIX, or use the similar non-conflicting installation scheme if you compile from source on Windows. As a Debian user, I know that I can call the individual versions by using “pythonX.Y”, and that the default versions (“python” and “python3”) are decided by the Debian developers. A few OSes have started to break the assumption that python == python2, so there is a PEP in progress to bless or condemn that: http://www.python.org/dev/peps/pep-0394/ Windows seems to lack a way to use one Python version as the default, so there’s another PEP: http://www.python.org/dev/peps/pep-0397/
Where should packages be installed? Using distutils, I can install projects into my user site-packages directory (see PEP 370 or docs.python.org). What exactly is the question?
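For example, a PEP 370 per-user install with plain distutils needs no admin privileges and leaves the system site-packages untouched (the exact location varies by platform):
python setup.py install --user
# on Linux this lands under ~/.local/lib/pythonX.Y/site-packages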
Parallel installation of different versions of the same project is not supported. It would need a PEP to discuss the changes to the import system and the packaging tools. Before someone starts that discussion, using virtualenv or buildout works well enough.
I don’t understand the question about the location of “components of the Python environment”.
I mostly use system packages (i.e. using the Aptitude package manager on Debian). To try out projects, I clone their repository. If I need something that’s not available with Aptitude, I install it (or put a .pth file pointing to the repo) in my user site-packages directory. I don’t need a custom PYTHONPATH, but I have changed the location of my user site-packages with PYTHONUSERBASE. I don’t like the magic and the eggs concept in setuptools/distribute, so I don’t use them. I’ve started to use virtualenv and pip for one project though (they use setuptools under the covers, but I made a private install so my global Python does not have setuptools).
One resource for this area is the book Expert Python Programming by Tarek Ziade. I'm ambivalent about the quality of the book, but the topics covered are just what you're focusing on.

State of Python Packaging: Buildout, Distribute, Distutils, EasyInstall, etc

The last time I had to worry about installing Python packages was two years ago working with Enthought, NumPy and MayaVi2. That experience gave me lingering nightmares related to quirky behavior installing & updating Python packages in non-standard locations (in $HOME/usr/local2.6/, for example).
Anyway, my work is taking me back to installing various Python packages. The CheeseShop Tutorial mentions DistUtils and EasyInstall in addition to Buildout! I am having a hard time finding one place that compares these (and other) PyPi installation tools, so I am hoping to tap into the StackOverflow community: What are the strengths & weaknesses of each installation tool?
First of all, regardless of installation tool you decide on, start using virtualenv --no-site-packages! That way, python packages are not installed globally and you can easily get back to where you were in old as well as new projects.
Now, your comparison is a little bit apples-to-pears as the tools you list are not mutually exclusive. However, I can wholly recommend Buildout. It will install python packages as well as other stuff and lets you automate installation and deployment of (complex) projects.
Also, I recommend looking into Fabric as a means to automate administrative tasks.
I've done quite a bit of research on this topic (a couple of weeks' worth) before settling on using buildout for all of my projects.
DistUtils and EasyInstall in addition to Buildout!
The difficulty in creating one place to compare all of these tools is that they're all part of the same toolchain and are used together to create a predictable, reliable and flexible tool set.
For example, easy_install is used to install distutils packages from pypi (cheeseshop) to your system Python's site-packages directory. This drastically simplifies the installation of packages to your system/global sys.path.
easy_install is very convenient for packages that are consistent across all projects. But I find that I prefer to use the system's easy_install to install packages that projects do not depend on. For example, github-cli is something I use with every project, because it allows me to interact with a project's GitHub Issues from the command line. I use this with projects, but it's for convenience and the project itself does not have a dependency on this package.
For managing a project's dependencies, I use buildout. Buildout allows you to indicate specifically what versions of packages your project depends on. I prefer buildout over pip's requirements.txt because buildout is declarative. With pip, you install the packages, and at the end of development you generate the requirements.txt file. With buildout, on the other hand, you modify buildout.cfg before the package egg is added to your project. This forces me to be conscious of what packages I'm adding to the project.
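A minimal buildout.cfg along those lines might look like this (the project name and pin are hypothetical; zc.recipe.egg is the stock recipe for installing eggs):
[buildout]
develop = .
parts = app
versions = versions

[app]
recipe = zc.recipe.egg
eggs = myproject

[versions]
# declarative, exact pins for dependencies
lxml = 2.3.0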
Now, there is the matter of virtualenv. One of the most publicized features of virtualenv is obviously the --no-site-packages option. I have not found that option to be particularly useful, because I use buildout. Buildout manages the sys.path and includes only the packages I tell it to include. It also includes everything in the system Python's site-packages, but since I don't have anything there that I use in projects, I never have conflicts.
Also, I find that --no-site-packages only hinders my development process, because I install some packages using my system's packaging system. Usually, anything that has C libraries that need to be compiled, I install through the system's packaging system.
In the project's fabfile.py I include a test function to check for the presence of the packages I install through the system's package manager.
In summary, here is how I use these tools:
System's package manager (apt-get, yum, port, fink ...)
I use one of these to install the Python versions that I need on a system. I also use it to install packages like lxml, which include C libraries.
easy_install
I use it to install packages from pypi that I use on all projects, but that projects are not dependent on.
buildout
I use it to manage the dependencies of a project.
In my experience, this workflow has been very flexible, portable and easy to work with.
Distribute is a new fork of setuptools (easy_install), which should also be considered. Even Guido recommends it.
Buildout is orthogonal to packaging; you can use buildout with distribute.
Whenever I need to remind myself of the state of play, I look at these as a starting point:
The State of Python Packaging, a response to:
On packaging, linked from:
Tools of the Modern Python Hacker
I can't easily help you with finding the strengths, but I can make the comparison a bit harder, since the answer also depends on the platform you want to use.
For example, if you need to install Python packages on Gentoo (GNU/Linux) based computers, you can easily use g-pypi to create ebuilds for all packages which use distutils (rather: a setup.py). That way they get completely integrated into your system and can be added, updated and removed like all your other tools. But it naturally only works for Gentoo-based systems.
Also you can use yolk to find out about all packages installed via easy_install on your system (not only on Gentoo).
When I write code, I simply use distutils (because it allows building portage ebuilds very easily) and sometimes basic setuptools features, or I organize my programs so people can just download and run them from the program folder (ideally just unpack the source archive / clone the repository somewhere). This isn't the perfect solution, but until the core Python team decides which way they want to move, I don't want to commit (anymore) to a path which might disappear.

Questions about Setuptools and alternatives

I've seen a good bit of setuptools bashing on the internets lately. Most recently, I read James Bennett's On packaging post on why no one should be using setuptools. From my time in #python on Freenode, I know that there are a few souls there who absolutely detest it. I would count myself among them, but I do actually use it.
I've used setuptools for enough projects to be aware of its deficiencies, and I would prefer something better. I don't particularly like the egg format and how it's deployed. With all of setuptools' problems, I haven't found a better alternative.
My understanding of tools like pip is that they're meant as an easy_install replacement (not a setuptools replacement). In fact, pip uses some setuptools components, right?
Most of my packages make use of a setuptools-aware setup.py, which declares all of the dependencies. When they're ready, I'll build an sdist, bdist, and bdist_egg, and upload them to pypi.
If I wanted to switch to using pip, what kind of changes would I need to make to rid myself of easy_install dependencies? Where are the dependencies declared? I'm guessing that I would need to get away from using the egg format and provide just source distributions. If so, how do I generate the egg-info directories? Or do I even need to?
How would this change my usage of virtualenv? Doesn't virtualenv use easy_install to manage the environments?
How would this change my usage of the setuptools provided "develop" command? Should I not use that? What's the alternative?
I'm basically trying to get a picture of what my development workflow will look like.
Before anyone suggests it, I'm not looking for an OS-dependent solution. I'm mainly concerned with debian linux, but deb packages are not an option, for the reasons Ian Bicking outlines here.
pip uses Setuptools, and doesn't require any changes to packages. It actually installs packages with Setuptools, using:
python -c 'import setuptools; __file__="setup.py"; execfile(__file__)' \
    install \
    --single-version-externally-managed
Because it uses that option (--single-version-externally-managed) it doesn't ever install eggs as zip files, doesn't support multiple simultaneously installed versions of software, and the packages are installed flat (like python setup.py install works if you use only distutils). Egg metadata is still installed. pip also, like easy_install, downloads and installs all the requirements of a package.
In addition you can also use a requirements file to add other packages that should be installed in a batch, and to make version requirements more exact (without putting those exact requirements in your setup.py files). But if you don't make requirements files then you'd use it just like easy_install.
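A requirements file is just a flat list of specifiers, one per line; a hypothetical example:
# requirements.txt -- exact pins kept out of setup.py
Django==1.2.5
South==0.7.3
# an editable checkout, same syntax as pip install -e
-e svn+http://mysite/svn/Project/trunk#egg=Project
You install the whole batch with pip install -r requirements.txt.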
For your install_requires I don't recommend any changes, unless you have been trying to create very exact requirements there that are known to be good. I think there's a limit to how exact you can usefully be in setup.py files about versions, because you can't really know what the future compatibility of new libraries will be like, and I don't recommend you try to predict this. Requirement files are an alternate place to lay out conservative version requirements.
You can still use python setup.py develop, and in fact if you do pip install -e svn+http://mysite/svn/Project/trunk#egg=Project it will check that out (into src/project) and run setup.py develop on it. So that workflow isn't any different really.
If you run pip verbosely (like pip install -vv) you'll see a lot of the commands that are run, and you'll probably recognize most of them.
I'm writing this in April 2014. Be conscious of the date on anything written about Python packaging, distribution or installation. It looks like there's been some lessening of factiousness, improvement in implementations, PEP-standardizing and unifying of fronts in the last, say, three years.
For instance, the Python Packaging Authority is "a working group that maintains many of the relevant projects in Python packaging."
The python.org Python Packaging User Guide has Tool Recommendations and The Future of Python Packaging sections.
distribute was a branch of setuptools that was remerged in June 2013. The guide says, "Use setuptools to define projects and create Source Distributions."
As of PEP 453 and Python 3.4, the guide recommends, "Use pip to install Python packages from PyPI," and pip is included with Python 3.4 and installed in virtualenvs by pyvenv, which is also included. You might find the PEP 453 "rationale" section interesting.
There are also new and newish tools mentioned in the guide, including wheel and buildout.
I'm glad I read both of the following technical/semi-political histories.
By Martijn Faassen in 2009: A History of Python Packaging.
And by Armin Ronacher in June 2013 (the title is not serious): Python Packaging: Hate, hate, hate everywhere.
For starters, pip is really new. New, incomplete and largely untested in the real world.
It shows great promise, but until such time as it can do everything that easy_install/setuptools can do, it's not likely to catch on in a big way, certainly not in the corporation.
Easy_install/setuptools is big and complex, and that offends a lot of people. Unfortunately there's a really good reason for that complexity: it caters for a huge number of different use-cases. My own is supporting a large (>300) pool of desktop users, plus a similarly sized grid, with a frequently updated application. The notion that we could do this by allowing every user to install from source is ludicrous; eggs have proved themselves a reliable way to distribute my project.
My advice: learn to use setuptools, it's really a wonderful thing. Most of the people who hate it do not understand it, or simply do not have the use-case for such a full-featured distribution system.
:-)

How to use Python distutils?

I wrote a quick program in Python to add a GTK GUI to a CLI program. I was wondering how I can create an installer using distutils. Since it's just a GUI frontend for a command-line app, it only works on *nix anyway, so I'm not worried about it being cross-platform.
My main goal is to create a .deb package for Debian/Ubuntu users, but I don't understand make/configure files. I've primarily been a web developer up until now.
edit: Does anyone know of a project that uses distutils so I could see it in action and, you know, actually try building it?
Here are a few useful links
Ubuntu Python Packaging Guide
This guide is very helpful. I don't know how I missed it during my initial wave of googling. It even walks you through packaging up an existing Python application.
The Ubuntu MOTU Project
This is the official package-maintaining project at Ubuntu. Anyone can join, and there are lots of tutorials and info about creating packages of all types, including the above 'python packaging guide'.
"Python distutils to deb?" - Ars Technica Forum discussion
According to this conversation, you can't just use distutils. It doesn't follow the debian packaging format (or something like that). I guess that's why you need dh_make as seen in the Ubuntu Packaging guide
"A bdist_deb command for distutils
This one has some interesting discussion (it's also how I found the ubuntu guide) about concatenating a zip-file and a shell script to create some kind of universal executable (anything with python and bash that is). weird. Let me know if anyone finds more info on this practice because I've never heard of it.
Description of the deb format and how distutils fit in - python mailing list
See the distutils simple example. That's basically what it is like, except real install scripts usually contain a bit more information. I have not seen any that are fundamentally more complicated, though. In essence, you just give it a list of what needs to be installed. Sometimes you need to give it some mapping dicts since the source and installed trees might not be the same.
Here is a real-life (anonymized) example:
#!/usr/bin/python
from distutils.core import setup

setup(name='Initech Package 3',
      description="Services and libraries ABC, DEF",
      author="That Guy, Initech Ltd",
      author_email="that.guy@initech.com",
      version='1.0.5',
      package_dir={'Package3': 'site-packages/Package3'},
      packages=['Package3', 'Package3.Queries'],
      data_files=[('/etc/Package3',
                   ['etc/Package3/ExternalResources.conf'])])
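With a setup.py like that in place, the usual invocations are:
python setup.py sdist      # build a source tarball under dist/
python setup.py install    # install it (system-wide, or add --user)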
apt-get install python-stdeb
Python to Debian source package conversion utility
This package provides some tools to produce Debian packages from Python packages via a new distutils command, sdist_dsc. Automatic defaults are provided for the Debian package, but many aspects of the resulting package can be customized via a configuration file.
pypi-install will query the Python Package Index (PyPI) for a package, download it, create a .deb from it, and then install the .deb.
py2dsc will convert a distutils-built source tarball into a Debian source package.
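For example, starting from a source tarball built with python setup.py sdist (the tarball name here is hypothetical):
py2dsc dist/mypackage-1.0.tar.gz   # emits a Debian source package under deb_dist/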
Most Python programs will use distutils. Django is one; see http://code.djangoproject.com/svn/django/trunk/setup.py
You should also read the documentation, as it's very comprehensive and has some good examples.
I found the following tutorial to be very helpful. It's shorter than the distutils documentation and explains how to set up a typical project step by step.
distutils really isn't all that difficult once you get the hang of it. It's really just a matter of putting in some meta-information (program name, author, version, etc) and then selecting what files you want to include. For example, here's a sample distutils setup.py module from a decently complex python library:
Kamaelia setup.py
Note that this doesn't deal with any data files or whatnot, so YMMV.
On another note, I agree that the distutils documentation is probably some of Python's worst documentation. It is extremely thorough in some areas, but neglects some really important information in others.
