Managing many Python projects/virtualenvs - python

In my workplace, I have to manage many (currently tens, but probably hundreds eventually) Python web applications, potentially running various frameworks, libraries, etc (all at various versions). Virtualenv has been a lifesaver in managing that so far, but I'd still like to be able to manage things better, particularly when it comes to managing package upgrades.
I've thought of a few scenarios
Option 1:
Install all required modules for each project in each virtualenv using pip, upgrade each individually as necessary. This would require a significant time cost for each upgrade and would require additional documentation to keep track of things. Might be facilitated by some management scripting.
Option 2:
Install all libraries used by any application in a central repository, use symlinks to easily change versions once for all projects. Easy upgrades and central management, but forgoes some of the nicest benefits of using virtualenv in the first place.
Option 3:
Hybridize the above two somehow, centralizing the most common libraries and/or those likely to need upgrades and installing the rest locally to each virtualenv.
Does anyone else have a similar situation? What's the best way to handle this?

You might consider using zc.buildout. It's more annoying to set up than plain pip/virtualenv, but it gives you more opportunities for automation. If the disk space usage isn't an issue, I'd suggest you just keep using individual environments for each project so you can upgrade them one at a time.

We have a requirements.pip file at our project root that contains the packages for pip to install, so upgrading automatically is relatively easy. I'm not sure symlinking would solve the issue - it will make it harder to make upgrades to a subset of your projects. If diskspace isn't an issue, and you can write some simple scripts to list and upgrade packages, I'd stick with virtualenv as-is.

Related

Is there an easy solution to "No module named [...]" in Python 3? [duplicate]

I wish to place a python program on GitHub and have other people download and run it on their computers with assorted operating systems. I am relatively new to python but have used it enough to have noticed that getting the assorted versions of all the included modules to work together can be problematic. I just discovered the use of requirements.txt (generated with pipreqs and deployed with the command pip install -r /path/to/requirements.txt) but was very surprised to notice that requirements.txt does not actually state what version of python is being used so obviously it is not the complete solution on its own. So my question is: what set of specifications/files/something-else is needed to ensure that someone downloading my project will actually be able to run it with the fewest possible problems.
EDIT: My plan was to be guided by whichever answer got the most upvotes. But so far, after 4 answers and 127 views, not a single answer has even one upvote. If some of the answers are no good, it would be useful to see some comments as to why they are no good.
Have you considered setting up a setup.py file? It's a handy way of bundling all of your... well setup into a single location. So all your user has to do is A) clone your repo and B) run pip install . to run the setup.py
There's a great stack discussion about this.
As well as a handle example written by the requests guy.
This should cover most use cases. Now if you want to make it truly distributable then you'll want to look into setting it up in PyPi, the official distribution hub.
Beyond that if you're asking how to make a program "OS independent" there isn't a one size fits all. It depends on what you are doing with your code. Requires researching how your particular code interacts with those OS's etc.
There are many, many, many, many, many, many, many ways to do this. I'll skate over the principles behind each, and it's use case.
1. A python environment
There are many ways to do this. pipenv, conda, requirments.txt, etc etc.
With some of these, you can specify python versions. With others, just specify a range of python versions you know it works with - for example, if you're using python 3.7, it's unlikely not to support 3.6; there's only one or two minor changes. 3.8 should work as well.
Another similar method is setup.py. These are generally used to distribute libraries - like PyInstaller (another solution I'll mention below), or numpy, or wxPython, or PyQt5 etc - for import/command line use. The python packaging guide is quite useful, and there are loads of tutorials out there. (google python setup.py tutorial) You can also specify requirements in these files.
2. A container
Docker is the big one. If you haven't heard of it, I'll be surprised. A quick google of a summary comes up with this, which I'll quote part of:
So why does everyone love containers and Docker? James Bottomley, formerly Parallels' CTO of server virtualization and a leading Linux kernel developer, explained VM hypervisors, such as Hyper-V, KVM, and Xen, all are "based on emulating virtual hardware. That means they're fat in terms of system requirements."
Containers, however, use shared operating systems. This means they are much more efficient than hypervisors in system resource terms. Instead of virtualizing hardware, containers rest on top of a single Linux instance. This means you can "leave behind the useless 99.9 percent VM junk, leaving you with a small, neat capsule containing your application,"
That should summarise it for you. (Note you don't need a specific OS for containers.)
3. An executable file
There are 2 main tools that do this at the time of writing. PyInstaller, and cx_Freeze. Both are actively developed. Both are open source.
You take your script, and the tool compiles it to bytecode, finds the imports, copies those, and creates a portable python environment that runs your script on the target system without the end user needing python.
Personally, I prefer PyInstaller - I'm one of the developers. PyInstaller provides all of its functionality through a command line script, and supports most libraries that you can think of - and is extendable to support more. cx_Freeze requires a setup script.
Both tools support windows, Linux, macOS, and more. PyInstaller can create single file exes, or a one folder bundle, whereas cx_Freeze only supports one folder bundles. PyInstaller 3.6 supports python 2.7, and 3.5-3.7 - but 4.0 won't support python 2. cx_Freeze has dropped python 2 support as of the last major release (6.0 I think).
Anyway, enough about the tools features; you can look into those yourself. (See https://pyinstaller.org and https://cx-freeze.readthedocs.io for more info)
When using this distribution method, you usually provide source code on the GitHub repo, a couple of exes (one for each platform) ready for download, and instructions on how to build the code into an executable file.
The best tool I have used so far for this is Pipenv. Not only it unifies and simplifies the whole pip+virtualenv workflow for you, developer, but it also guarantees that the exact versions of all dependencies (including Python itself) are met when other people run your project with it.
The project website does a pretty good job at explaining how to use the tool, but, for completeness sake, I'll give a short explanation here.
Once you have Pipenv installed (for instance, by running pip install --user pipenv), you can go to the directory of your project and run pipenv --python 3.7, so Pipenv will create a new virtualenv for your project, create a Pipfile and a Pipfile.lock (more on them later). If you go ahead and run pipenv install -r requirements.txt it will install all your packages. Now you can do a pipenv shell to activate your new virtualenv, or a pipenv run your_main_file.py to simply run your project.
Now let's take a look at the contents of your Pipfile. It should be something resembling this:
[packages]
Django = "*"
djangorestframework = "*"
iso8601 = "*"
graypy = "*"
whitenoise = "*"
[requires]
python_version = "3.7"
This file has the human-readable specifications for the dependencies of your project (note that it specifies the Python version too). If your requirements.txt had pinned versions, your Pipfile could have them too, but you can safely wildcard them, because the exact versions are stored in the Pipfile.lock. Now you can run things like pipenv update to update your dependencies and don't forget to commit Pipfile and Pipfile.lock to your VCS.
Once people clone your project, all they have to do is run pipenv install and Pipenv will take care of the rest (it may even install the correct version of Python for them).
I hope this was useful. I'm not affiliated in any way with Pipenv, just wanted to share this awesome tool.
If your program is less about GUI, or has a web GUI, then you can share the code using Google Colaboratory.
https://colab.research.google.com/
Everyone can run it with the same environment. No need for installation.
In case converting all your python scripts into one executable can help you, then my answer below would help ...
I have been developing a large desktop application purely in python since 3 years. It is a GUI-based tool built on top of pyqt library (python-bindings of QT C++ framework).
I am currently using "py2exe" packaging library : is a distutils extension which allows to build standalone Windows executable programs (32-bit and 64-bit) from Python scripts; all you have to do is to:
install py2exe: 'pip install py2exe'
Create a setup.py script: It is used to specify the content of the final EXE (name, icon, author, data files, shared libraries, etc ..)
Execute: python setup.py py2exe
I am also using "Inno Setup" software to create installer: Creating shortcuts, setting environment variables, icons, etc ...
I'll give you a very brief summary of some of the existing available solutions when it comes to python packaging you may choose from (knowledge is power):
Follow the guidelines provided at Structuring Your Project, these conventions are widely accepted by python community and it's usually a good starting point when newcomers start coding in python. By following these guidelines pythonists watching your project/source at github or other similar places will know straightaway how to install it. Also, uploading your project to pypi as well as adding CI by following those rules will be painless.
Once your project is structured properly according to standard conventions, the next step might be using some of the available freezers, in case you'd like to ship to your end-users a package they can install without forcing them to have python installed on their machines. Be aware though these tools won't provide you any code protection... said otherwise, extracting the original python code from the final artifacts would be trivial in all cases
If you still want to ship your project to your users without forcing them to install any dev dependency and you do also care about code protection so you don't want to consider any of the existing freezers you might use tools such as nuitka, shedskin, cython or similar ones. Usually reversing code from the artifacts produced by these tools isn't trivial at all... Cracking protection on the other hand is a different matter and unless you don't provide a physical binary to your end-user you can't do much about it other than slowing them down :)
Also, in case you'd need to use external languages in your python project another classic link that comes to mind would be https://wiki.python.org/moin/IntegratingPythonWithOtherLanguages, adding the build systems of such tools to CI by following rules of 1 would be pretty easy.
That said, I'd suggest stick to bulletpoint 1 as I know that will be more than good enough to get you started, also that particular point should cover many of the existing use-cases for python "standard" projects.
While this is not intended to be a full guide by following those you'll be able to publish your python project to the masses in no time.
I think you can use docker with your python https://github.com/celery/celery/tree/master/docker
kindly follow the files and I think you can figure out the way to make your docker file for your python scripts!
Because it is missing from the other answers, I would like to add one completely different aspect:
Unit testing. Or testing in general.
Usually, it is good to have one known good configuration. Depending on what the dependencies of the program are, you might have to test different combinations of packages. You can do that in an automated fashion with e.g. tox or as part of a CI/CD pipeline.
There is no general rule of what combination of packages should be tested, but usually python2/3 compatability is a major issue. If you have strong dependencies on packages with major version differences, you might want to consider testing against these different versions.

Robust way to ensure other people can run my python program

I wish to place a python program on GitHub and have other people download and run it on their computers with assorted operating systems. I am relatively new to python but have used it enough to have noticed that getting the assorted versions of all the included modules to work together can be problematic. I just discovered the use of requirements.txt (generated with pipreqs and deployed with the command pip install -r /path/to/requirements.txt) but was very surprised to notice that requirements.txt does not actually state what version of python is being used so obviously it is not the complete solution on its own. So my question is: what set of specifications/files/something-else is needed to ensure that someone downloading my project will actually be able to run it with the fewest possible problems.
EDIT: My plan was to be guided by whichever answer got the most upvotes. But so far, after 4 answers and 127 views, not a single answer has even one upvote. If some of the answers are no good, it would be useful to see some comments as to why they are no good.
Have you considered setting up a setup.py file? It's a handy way of bundling all of your... well setup into a single location. So all your user has to do is A) clone your repo and B) run pip install . to run the setup.py
There's a great stack discussion about this.
As well as a handle example written by the requests guy.
This should cover most use cases. Now if you want to make it truly distributable then you'll want to look into setting it up in PyPi, the official distribution hub.
Beyond that if you're asking how to make a program "OS independent" there isn't a one size fits all. It depends on what you are doing with your code. Requires researching how your particular code interacts with those OS's etc.
There are many, many, many, many, many, many, many ways to do this. I'll skate over the principles behind each, and it's use case.
1. A python environment
There are many ways to do this. pipenv, conda, requirments.txt, etc etc.
With some of these, you can specify python versions. With others, just specify a range of python versions you know it works with - for example, if you're using python 3.7, it's unlikely not to support 3.6; there's only one or two minor changes. 3.8 should work as well.
Another similar method is setup.py. These are generally used to distribute libraries - like PyInstaller (another solution I'll mention below), or numpy, or wxPython, or PyQt5 etc - for import/command line use. The python packaging guide is quite useful, and there are loads of tutorials out there. (google python setup.py tutorial) You can also specify requirements in these files.
2. A container
Docker is the big one. If you haven't heard of it, I'll be surprised. A quick google of a summary comes up with this, which I'll quote part of:
So why does everyone love containers and Docker? James Bottomley, formerly Parallels' CTO of server virtualization and a leading Linux kernel developer, explained VM hypervisors, such as Hyper-V, KVM, and Xen, all are "based on emulating virtual hardware. That means they're fat in terms of system requirements."
Containers, however, use shared operating systems. This means they are much more efficient than hypervisors in system resource terms. Instead of virtualizing hardware, containers rest on top of a single Linux instance. This means you can "leave behind the useless 99.9 percent VM junk, leaving you with a small, neat capsule containing your application,"
That should summarise it for you. (Note you don't need a specific OS for containers.)
3. An executable file
There are 2 main tools that do this at the time of writing. PyInstaller, and cx_Freeze. Both are actively developed. Both are open source.
You take your script, and the tool compiles it to bytecode, finds the imports, copies those, and creates a portable python environment that runs your script on the target system without the end user needing python.
Personally, I prefer PyInstaller - I'm one of the developers. PyInstaller provides all of its functionality through a command line script, and supports most libraries that you can think of - and is extendable to support more. cx_Freeze requires a setup script.
Both tools support windows, Linux, macOS, and more. PyInstaller can create single file exes, or a one folder bundle, whereas cx_Freeze only supports one folder bundles. PyInstaller 3.6 supports python 2.7, and 3.5-3.7 - but 4.0 won't support python 2. cx_Freeze has dropped python 2 support as of the last major release (6.0 I think).
Anyway, enough about the tools features; you can look into those yourself. (See https://pyinstaller.org and https://cx-freeze.readthedocs.io for more info)
When using this distribution method, you usually provide source code on the GitHub repo, a couple of exes (one for each platform) ready for download, and instructions on how to build the code into an executable file.
The best tool I have used so far for this is Pipenv. Not only it unifies and simplifies the whole pip+virtualenv workflow for you, developer, but it also guarantees that the exact versions of all dependencies (including Python itself) are met when other people run your project with it.
The project website does a pretty good job at explaining how to use the tool, but, for completeness sake, I'll give a short explanation here.
Once you have Pipenv installed (for instance, by running pip install --user pipenv), you can go to the directory of your project and run pipenv --python 3.7, so Pipenv will create a new virtualenv for your project, create a Pipfile and a Pipfile.lock (more on them later). If you go ahead and run pipenv install -r requirements.txt it will install all your packages. Now you can do a pipenv shell to activate your new virtualenv, or a pipenv run your_main_file.py to simply run your project.
Now let's take a look at the contents of your Pipfile. It should be something resembling this:
[packages]
Django = "*"
djangorestframework = "*"
iso8601 = "*"
graypy = "*"
whitenoise = "*"
[requires]
python_version = "3.7"
This file has the human-readable specifications for the dependencies of your project (note that it specifies the Python version too). If your requirements.txt had pinned versions, your Pipfile could have them too, but you can safely wildcard them, because the exact versions are stored in the Pipfile.lock. Now you can run things like pipenv update to update your dependencies and don't forget to commit Pipfile and Pipfile.lock to your VCS.
Once people clone your project, all they have to do is run pipenv install and Pipenv will take care of the rest (it may even install the correct version of Python for them).
I hope this was useful. I'm not affiliated in any way with Pipenv, just wanted to share this awesome tool.
If your program is less about GUI, or has a web GUI, then you can share the code using Google Colaboratory.
https://colab.research.google.com/
Everyone can run it with the same environment. No need for installation.
In case converting all your python scripts into one executable can help you, then my answer below would help ...
I have been developing a large desktop application purely in python since 3 years. It is a GUI-based tool built on top of pyqt library (python-bindings of QT C++ framework).
I am currently using "py2exe" packaging library : is a distutils extension which allows to build standalone Windows executable programs (32-bit and 64-bit) from Python scripts; all you have to do is to:
install py2exe: 'pip install py2exe'
Create a setup.py script: It is used to specify the content of the final EXE (name, icon, author, data files, shared libraries, etc ..)
Execute: python setup.py py2exe
I am also using "Inno Setup" software to create installer: Creating shortcuts, setting environment variables, icons, etc ...
I'll give you a very brief summary of some of the existing available solutions when it comes to python packaging you may choose from (knowledge is power):
Follow the guidelines provided at Structuring Your Project, these conventions are widely accepted by python community and it's usually a good starting point when newcomers start coding in python. By following these guidelines pythonists watching your project/source at github or other similar places will know straightaway how to install it. Also, uploading your project to pypi as well as adding CI by following those rules will be painless.
Once your project is structured properly according to standard conventions, the next step might be using some of the available freezers, in case you'd like to ship to your end-users a package they can install without forcing them to have python installed on their machines. Be aware though these tools won't provide you any code protection... said otherwise, extracting the original python code from the final artifacts would be trivial in all cases
If you still want to ship your project to your users without forcing them to install any dev dependency and you do also care about code protection so you don't want to consider any of the existing freezers you might use tools such as nuitka, shedskin, cython or similar ones. Usually reversing code from the artifacts produced by these tools isn't trivial at all... Cracking protection on the other hand is a different matter and unless you don't provide a physical binary to your end-user you can't do much about it other than slowing them down :)
Also, in case you'd need to use external languages in your python project another classic link that comes to mind would be https://wiki.python.org/moin/IntegratingPythonWithOtherLanguages, adding the build systems of such tools to CI by following rules of 1 would be pretty easy.
That said, I'd suggest stick to bulletpoint 1 as I know that will be more than good enough to get you started, also that particular point should cover many of the existing use-cases for python "standard" projects.
While this is not intended to be a full guide by following those you'll be able to publish your python project to the masses in no time.
I think you can use docker with your python https://github.com/celery/celery/tree/master/docker
kindly follow the files and I think you can figure out the way to make your docker file for your python scripts!
Because it is missing from the other answers, I would like to add one completely different aspect:
Unit testing. Or testing in general.
Usually, it is good to have one known good configuration. Depending on what the dependencies of the program are, you might have to test different combinations of packages. You can do that in an automated fashion with e.g. tox or as part of a CI/CD pipeline.
There is no general rule of what combination of packages should be tested, but usually python2/3 compatability is a major issue. If you have strong dependencies on packages with major version differences, you might want to consider testing against these different versions.

Is it a good practice to upgrade all python packages in production to their latest versions?

I am running a fairly complex Django application, for about a year now. It has about 50 packages in requirements.txt
Whenever I need a new package, I install it with pip, and then manually add it in the requirements.txt file with a fixed version:
SomeNewModule==1.2.3
That means that most of my packages are now outdated after a year. I have updated couple of them manually when I specifically needed a new feature.
I am beginning to think that there might be security patches that I am missing out, but I am reluctant to update them all blindly, due to backward incompatibility.
Is there a standard best practice for this?
The common pattern for versioning python modules (and many other software) is major.minor.patch
where after the initial release, patch versions don't change the api, minor releases may change the api in a backward compatible way, and major releases usually aren't backward compatible
so if you have module==x.y.z a relatively safe requirement specification would be:
module>=x.y.z,<x.y+1.0
note that while this will usually be OK, it's based on common practices, and not guaranteed to work, and it's more likely to be stable with more "organized" modules
Do not update packages in production, you can brake things, if you have a package which has tables in database, and you update it you can brake your database. I used for example python social auth, I wanted to upgrade it to the last version, so for that, I need it to upgrade do version x, run migrations and after that got to last version and migrate.
Upgrade the packages in your development environment, test it. If you have a pre-production, do that there after testing in dev.
One of the most robust practices in the industry is to ensure that your code is well unit-tested, well property-tested, and well regression-tested. Once you've got good coverage and the tests are green, begin upgrading your dependencies. You might want to do this one at a time if you're feeling ambitious, or you might give it all a go and try your luck. If after upgrading your tests are still green and you can manually go through all workflows, chances are you're in the clear! Those tests you spent time on can then be leveraged going forward for any smaller, incremental upgrades.
It depends. Most of the time, updating the packages for minor version changes (eg: 2.45.1 -> 2.56.1) is unlikely to break your system. However, it is still advisable to run extensive regression testing. However, major version changes(eg. 2.45.1 to 3.13.0) should definitely be avoided as most of them have little backward compatibility. An example of this would be the selenium web driver which in version 3.0 cannot run without geckodriver compared to version 2.56. Regardless, extensive unit and regressive testing should be carried out on the old and new code to ensure there are no unexpected changes, especially at the corner cases.
Considering that you have mentioned that you have mentioned that you only use a single python pip, I can say that this will be an issue for future programs as you may need to use different versions of the same package for different projects.
An acceptable practice to avoid such issues would be to use a virtual environment for each project. Just create a makefile in your project root and use:
virtualenv --python=${PYTHON} env
env/bin/pip install --upgrade pip
env/bin/pip --upgrade setuptools wheel pip-tools
Then have env/bin/pip install the requirements from your requirements.txt file.
First, a little background on me. I'm a security test engineer and work heavily on Python, that being said most of my work is test code and the products I typically work on do not ship with Python running on the box so I cannot give you direct answers to questions about specific Python packages. I can give you some rules of thumb about what most companies do when faced with such questions in a general sense.
Figure out if you need to update:
Run a security scanner (like Nessus). If you have any glaring version problems with your application stack or hosts Nessus will give you some idea of things that should be fixed immediately
Check the release notes for all of the packages to see if anything actually needs updating. Pay extra close attention to any package that opens a socket. If there are no security fixes it probably isn't worth your time to update. In some cases even if there is a security problem in a package, it may be something you aren't using so pay attention to the problem description. If you have an application in production now, typically the goal is to change as little as possible to limit the amount of code update and testing that needs to be done.
If you've discovered that you need an update to start resolving dependencies:
Probably the easiest way to do this is to set up a test environment and start installing the updated versions that you need. Once more, try to make a few changes as possible while installing your code in the test environment. In Python, there is a reliance for C libraries for much of the security stuff, In your test environment be sure to have the same lower level system libs as well...
Create a list of changes that need to be made and start devising a test plan for the sections of code that you need to replace.
Once your development environment is set up, test test test. You'll probably want to rerun your security scanner and be assured to exercise all of the new code.
You are absolutely right. There will be backward incompatibility issue. Do not update packages to the latest blindly. Most likely, you will have package/module/class/variable/key undefined/notFound issues. Especially you have a complex system. Even if you use pip install --upgrade somepackage
This is lesson from my real experience.

Best practices for Python deployment -- multiple versions, standard install locations, packaging tools etc

Many posts on different aspects of this question but I haven't seen a post that brings it all together.
First a subjective statement: it seems like the simplicity we experience when working with the Python language is shot to pieces when we move outside the interpreter and start grappling with deployment issues. How best to have multiple versions of Python on the same machine? Where should packages be installed? Disutils vs. setuptools vs. pip etc. It seems like the Zen of Python is being abused pretty badly when it comes to deployment. I'm feeling eerie echoes of the "DLL hell" experience on Windows.
Do the experts agree on some degree of best practice on these questions?
Do you run multiple versions of Python on the same machine? How do you remain confident that they can co-exist -- and the newer version doesn't break assumptions of other processes that rely on the earlier version (scripts provided by OS vendor, for example)? Is this safe? Does virtualenv suffice?
What are the best choices for locations for different components of the Python environment (including 3rd party packages) on the local file system? Is there a strict or rough correspondence between locations for many different versions of Unixy and Windows OS's that can be relied upon?
And the murkiest corner of the swamp -- what install tools do you use (setuptools, distutils, pip etc.) and do they play well with your choices re: file locations, Python virtual environments, Python path etc.
These sound like hard questions. I'm hopeful the experienced Pythonistas may have defined a canonical approach (or two) to these challenges. Any approach that "hangs together" as a system that can be used with confidence (feeling less like separate, unrelated tools) would be very helpful.
I've found that virtualenv is the only reliable way to configure and maintain multiple environments on the same machine. It even has as a way of packaging up environment and installing it on another machine.
For package management I always use pip since it works so nicely with virtualenv. It also makes it easy to install and upgrade packages from a variety of sources such a git repositories.
I agree this is quite a broad question, but I’ll try to address its many parts anyway.
About your subjective statement: I don’t see why the simplicity and elegance of Python would imply that packaging and deployment matters suddenly should become simple things. Some things related to packaging are simple, other are not, other could be. It would be best for users if we had one complete, robust and easy packaging system, but it hasn’t turned that way. distutils was created and then its development paused, setuptools was created and added new solutions and new problems, distribute was forked from setuptools because of social problems, and finally distutils2 was created to make one official complete library. (More on Differences between distribute, distutils, setuptools and distutils2?) The situation is far from ideal for developers and users, but we are working on making it better.
How best to have multiple versions of Python on the same machine? Use your package manager if you’re on a modern OS, or use “make altinstall” if you compile from source on UNIX, or use the similar non-conflicting installation scheme if you compile from source on Windows. As a Debian user, I know that I can call the individual versions by using “pythonX.Y”, and that what the default versions (“python” and “python3”) are is decided by the Debian developers. A few OSes have started to break the assumption that python == python2, so there is a PEP in progress to bless or condemn that: http://www.python.org/dev/peps/pep-0394/ Windows seems to lack a way to use one Python version as default, so there’s another PEP: http://www.python.org/dev/peps/pep-0397/
Where should packages be installed? Using distutils, I can install projects into my user site-packages directory (see PEP 370 or docs.python.org). What exactly is the question?
Parallel installation of different versions of the same project is not supported. It would need a PEP to discuss the changes to the import system and the packaging tools. Before someone starts that discussion, using virtualenv or buildout works well enough.
I don’t understand the question about the location of “components of the Python environment”.
I mostly use system packages (i.e. using the Aptitude package manager on Debian). To try out projects, I clone their repository. If I need something that’s not available with Aptitude, I install (or put a .pth file to the repo) in my user site-packages directory. I don’t need a custom PYTHONPATH, but I have changed the location of my user site-packages with PYTHONUSERBASE. I don’t like the magic and the eggs concept in setuptools/distribute, so I don’t use them. I’ve started to use virtualenv and pip for one project though (they use setuptools under the cover, but I made a private install so my global Python does not have setuptools).
One resource for this area is the book Expert Python Programming by Tarek Ziade. I'm ambivalent about the quality of the book, but the topics covered are just what you're focusing on.

State of Python Packaging: Buildout, Distribute, Distutils, EasyInstall, etc

The last time I had to worry about installing Python packages was two years ago working with Enthought, NumPy and MayaVi2. That experience gave me lingering nightmares related to quirky behavior installing & updating Python packages in non-standard locations (in $HOME/usr/local2.6/, for example).
Anyway, my work is taking me back to installing various Python packages. The CheeseShop Tutorial mentions DistUtils and EasyInstall in addition to Buildout! I am having a hard time finding one place that compares these (and other) PyPi installation tools, so I am hoping to tap into the StackOverflow community: What are the strengths & weaknesses of each installation tool?
First of all, regardless of installation tool you decide on, start using virtualenv --no-site-packages! That way, python packages are not installed globally and you can easily get back to where you were in old as well as new projects.
Now, your comparison is a little bit apples-to-pears as the tools you list are not mutually exclusive. However, I can wholly recommend Buildout. It will install python packages as well as other stuff and lets you automate installation and deployment of (complex) projects.
Also, I recommend looking into Fabric as a means to automate administrative tasks.
I've done quiet a bit of research on this topic(a couple of weeks worth) before settling down on using buildout for all of my projects.
DistUtils and EasyInstall in addition to Buildout!
The difficulty in creating one place to compare all of these tools is that they're all part of a same tool chain and are used together to create a predictable, reliable and flexible tool set.
For example, easy_install is used to install distutils packages from pypi(cheeseshop) to your system Python's site-packages directory. This drastically simplifies installation of packages to your system/global sys.path.
easy_install is very convenient for packages that are consistent for all projects. But, I find that I prefer to use system's easy_install to install packages that projects do not depend on. For example, github-cli I use with every project, because it allows me to interact with project's Github Issues from command line. I use this with projects, but it's for convenience and the project itself does not have dependancy on this package.
For managing project's dependancies, I use buildout. Buildout allows you to indicate specifically what version of packages your project depends on. I prefer buildout over pip-requirements.txt because buildout is declarative. With pip, you install the packages and at the end of the development you generate the requirements.txt file. With Buildout on the other hand, you modify the buildout.cfg before the package egg is added to your project. This forces me to be conscious of what packages I'm adding to the project.
Now, there is a matter of virtualenv. One of the most publicized features of virtualenv is obviously --no-site-packages option. I have not found that option to be particularly useful, because I use buildout. Buildout manages the sys.path and includes only the packages I ask tell it to include. It also, includes everything in system Python's site-packages but since I don't have anything there that I use in projects, I never have conflicts.
Also, I find that --no-site-packages only hinders my development process, because some packages I install using my sistem's packaging system. Usually, anything that has C libraries that need to be compiled, I install through the system's packaging system.
In the project's fabfile.py I include test function to test for presence of system packages that I install through system's package manager.
In summary, here is how I use these tools:
System's Package Manager(apt-get, yam, port, fink ...)
I use one of these to install python versions that I need on this system. I also use it to install packages like lxml which include c libraries.
easy_install
I use to install packages from pypi that I use on all projects, but projects are not dependant on these packages.
buildout
I use to manage dependancies of a project.
In my experience, this workflow has been very flexible, portable and easy to work with.
Distribute is a new fork of setuptools (easy_install), which should also be considered. Even Guido recommends it.
Buildout is orthogonal to the packaging --- you can use buildout with distribute.
Whenever I need to remind myself of the state of play, I look at these as a starting point:
The State of Python Packaging, a response to:
On packaging, linked from:
Tools of the Modern Python Hacker
I can't easily help you with finding the strength, but I can make it a bit harder, since it also depends on the platform you want to use.
For example if you need to install python packages on Gentoo (GNU/Liunx) based computers, you can easily use g-pypi to create ebuilds for all packages which use distutils (rather: a setup.py). That way they get completely integrated into your system and can be added, updated and removed like all your other tools. But it naturally only works for Gentoo-based systems.
Also you can use yolk to find out about all packages installed via easy_install on your system (not only on Gentoo).
When I write code, I simply use distutils (because it allows building portage ebuilds very easily) and sometimes basic setuptools features, or organize my programs so people can just download and run them from the program folder (ideally just unpack the source archive / clone the repository somewhere). This isn't the perfect solution, but until the core python team decides which way they want to move, I don't want to fix onto a path (anymore) which might disappear.

Categories