I'm setting up a new system for a group of Python rookies to do a specific kind of scientific work using Python. It's got 2 different pythons on it (32 and 64 bit), and I want to install a set of common modules that users on the system will use.
(a) Some modules work out of the box for both pythons,
(b) some compile code and install differently depending on the python, and
(c) some don't work at all on certain pythons.
I've been told that virtualenv (+ wrapper) is good for this type of situation, but it's not clear to me how.
Can I use virtualenv to set up sandboxed modules across multiple user accounts without having to install each module for each user?
Can I use virtualenv to save me some time for case (a), i.e. install a module, but have all pythons see it?
I like the idea of isolating environments, and then having them just type "workon science32", "workon science64", depending on the issues with case (c).
Any advice is appreciated.
With virtualenv, you can allow each environment to use globally installed system packages simply by omitting the --no-site-packages option. This is the default behavior.
If you want each environment to install all of its own packages, then use --no-site-packages and you will get a bare python installation into which you install your own modules. This is useful when you do not want packages to conflict with system packages. I normally do this just to keep system upgrades from interfering with working code.
I would be careful about thinking of these as sandboxes, because they are only partially isolated. The paths to python binaries and libraries are modified to use the environment, but really that is all that is going on. Virtualenv does nothing to prevent running code from doing destructive things to the system. The best way to sandbox is to set Linux/Unix permissions properly and give users their own accounts.
EDIT For Version 1.7+
The default for 1.7 is to not include system packages, so if you want the behavior of using system packages, use the --system-site-packages option. Check the docs for more info.
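For example (the environment name here is just a placeholder borrowed from the question above):

virtualenv science64                         # 1.7+: isolated from system packages by default
virtualenv --system-site-packages science64  # 1.7+: opt back in to globally installed packages
virtualenv --no-site-packages science64      # pre-1.7: isolation had to be requested explicitly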
Related
I would like to install a python virtualenv with relative paths, so that I can move the virtualenv to another machine (that has the same operating system).
I googled for a solution; some suggest using the --portable option of the virtualenv command, but it is not available on my system. Maybe it is outdated?
Apart from changing the paths by hand, is there any other official way to create a portable virtualenv?
I am planning to use this on OSX and Linux (without mixing them of course).
The goal of virtualenv is to create isolated Python environments.
The basic problem being addressed is one of dependencies and versions, and indirectly permissions.
Creating portable installations is not the supported use case:
Created python virtual environments are usually not self-contained. A complete python packaging is usually made up of thousands of files, so it’s not efficient to install the entire python again into a new folder. Instead virtual environments are mere shells, that contain little within themselves, and borrow most from the system python (this is what you installed, when you installed python itself). This does mean that if you upgrade your system python your virtual environments might break, so watch out.
You could always try to run a virtual environment on another machine that has the same OS, version and packages. But be warned: having tried it myself in the past, it is very fragile and prone to weird errors.
There are other tools to do what you are looking for depending on your use case and target OS (e.g. single executable, MSI installer, Docker image, etc.). See this answer.
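If the goal is really just to reproduce an environment on a second machine, a common workaround is to recreate it from a requirements file rather than moving the directory itself; a rough sketch (paths and names are illustrative):

# on the source machine
source env/bin/activate
pip freeze > requirements.txt

# on the target machine (same OS and Python version)
virtualenv env
source env/bin/activate
pip install -r requirements.txt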
I just read it's possible to have conflicts between globally and locally installed Python packages and that it's better to install Python itself and packages only in local VENV environments to avoid those potential problems.
I know I installed Python globally and (by mistake) Jupyterlab globally as well, but when I check pip list globally I get a long list of packages I don't remember ever installing, such as:
anyio, argon2, async-generator, ..., ipython, Jinja2 (I have PyCharm, but isn't it supposed to only install locally when you create a new project?), numpy, pandas, etc...
many others, perhaps 50 other names.
Should I erase everything that's installed globally and only install Python itself and project relevant packages in VENV environments?
And if so, how?
I've been looking into something similar and thought I would add an answer here for future browsers.
As with many things, the answer is it depends... It would be good practice to only use virtual environments; among other things, this ensures that different projects can run on different versions of packages without conflict, and without you needing to update packages and potentially break older pieces of code.
On the other hand, if you maintain multiple applications using any given package, you will have to update each of these venvs individually (if you want to update), making it something of a headache, in which case you might decide to install the package globally to save yourself the pain.
As for the original question RE: deleting everything, again, no one but you can answer this. My advice would be to (if it's manageable), check each package and delete it if you don't see yourself using it often enough to justify it being global.
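As for the "how": a minimal sketch using the standard venv module (the package names are just the ones mentioned in the question):

python -m venv .venv                     # create a per-project environment
source .venv/bin/activate                # on Windows: .venv\Scripts\activate
pip install jupyterlab numpy pandas      # install only what this project needs
deactivate                               # leave the environment when done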
How to package Python itself into virtualenv? Is this even possible?
I'm trying to run python on a machine which it is not installed on, and I thought virtualenv made this possible. It activates, but can't run any Python.
When setting up the virtualenv (this can also be done if it is already set up), simply do:
python -m virtualenv -p python env
Python will be added to the virtualenv and become its default interpreter.
A specific version of Python can also be passed to -p; otherwise the first python found on the PATH is used.
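For example, to pin the environment to a particular interpreter (assuming python3.9 is installed and on your PATH; any installed version works):

python -m virtualenv -p python3.9 env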
virtualenv makes it convenient to use multiple python versions in different projects on the same machine, and isolate the pip install libraries installed by each project. It doesn’t install or manage the overall python environment. Python must be installed on the machine before you can install or configure the virtualenv tool itself or switch into a virtual environment.
Side note, consider using virtualenvwrapper — great helper for virtualenv.
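A rough sketch of the virtualenvwrapper workflow (the script path and environment name are placeholders; adjust them for your system):

pip install virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh   # location varies by install
mkvirtualenv -p python3 myproject            # create and activate the environment
workon myproject                             # switch to it later
deactivate                                   # leave it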
You haven't specified the Operating System you are using.
If you're using Windows, you don't use virtualenv for this. Instead you:
Download the Python embeddable package
Unpack it
Uncomment import site in the python37._pth file (only if you want to add additional packages)
Manually copy your additional packages (the ones you usually install with pip) to Lib\site-packages (you need to create that directory first, of course)
Such a Python installation is configured so that it can be moved and run from any location.
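For reference, the python37._pth file after the edit looks roughly like this (the exact default contents depend on the Python version):

python37.zip
.
import site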
You only have to ensure the Microsoft C Runtime is installed on the system (but it almost always already is). See the documentation note:
Note The embedded distribution does not include the Microsoft C Runtime and it is the responsibility of the application installer to provide this. The runtime may have already been installed on a user’s system previously or automatically via Windows Update, and can be detected by finding ucrtbase.dll in the system directory.
You might need to install Python in a location where you have the permissions to do so.
Will Anaconda Python config scripts clash with Homebrew's? Note that I do not use these config scripts in any of my workflows, I'm just wondering if any of these config scripts may get called "behind the scenes". Sample output below (with username replaced by '..'):
$ brew doctor
...
Having additional scripts in your path can confuse software installed via
Homebrew if the config script overrides a system or Homebrew provided
script of the same name. We found the following "config" scripts:
/Users/../anaconda/bin/curl-config
/Users/../anaconda/bin/freetype-config
/Users/../anaconda/bin/libdynd-config
/Users/../anaconda/bin/libpng-config
/Users/../anaconda/bin/libpng15-config
/Users/../anaconda/bin/llvm-config
/Users/../anaconda/bin/python-config
/Users/../anaconda/bin/python2-config
/Users/../anaconda/bin/python2.7-config
/Users/../anaconda/bin/xml2-config
/Users/../anaconda/bin/xslt-config
Clearly some of these clash with some Homebrew-installed packages.
$ ls /usr/local/bin/*-config
/usr/local/bin/Magick++-config /usr/local/bin/libpng-config
/usr/local/bin/Magick-config /usr/local/bin/libpng16-config
/usr/local/bin/MagickCore-config /usr/local/bin/pcre-config
/usr/local/bin/MagickWand-config /usr/local/bin/pkg-config
/usr/local/bin/Wand-config /usr/local/bin/python-config
/usr/local/bin/freetype-config /usr/local/bin/python2-config
/usr/local/bin/gdlib-config /usr/local/bin/python2.7-config
Such clashes are entirely possible. When you install software that depends on Python via Homebrew, you want it to see the Python packages and libraries installed by Homebrew, not those installed by Anaconda.
My solution to this is not putting
export PATH=$HOME/anaconda/bin:$PATH
into .bashrc. Normally, you'll just use the Python and pip installed via Homebrew and the packages installed by that pip. Sometimes, when you are developing a Python project for which it is convenient to use Anaconda's environment management mechanism (conda create -n my-env), you can temporarily run export PATH=$HOME/anaconda/bin:$PATH to turn it on. From what I gathered, one important benefit of Anaconda over regular Python is that conda create -n my-env anaconda will not duplicate package installations unnecessarily, as virtualenv my-env will when you have a large number of virtual environments. If you do not mind having some degree of duplication, you could avoid installing Anaconda altogether and just use virtualenv.
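To make that temporary switch convenient, you can keep a small shell function in .bashrc instead of the permanent export (this assumes the install path from above):

enable_anaconda() {
    # prepend Anaconda to PATH only for the current shell session
    export PATH="$HOME/anaconda/bin:$PATH"
}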
It's entirely possible you won't notice any problems. On the other hand, you may have some pretty frustrating ones. It all depends on what you use and how your $PATH is ordered. Homebrew will take whatever file has precedence in your $PATH; if another Homebrew package needs to use Homebrew-installed config files and it sees the Anaconda versions first, it doesn't know any better than to use the wrong ones. In a sense, that's what you told it to do.
My recommendation is to keep things simple and clean. Unless you have a particular reason to keep Anaconda on your $PATH, you should probably pop it out and alias anything you need. Alternatively you could just install the things you require (e.g., numpy) via Homebrew and eliminate Anaconda altogether. (Actually, that's really what I would do. Anaconda comes with way more stuff than I have any reason to be dumping onto my machine.)
I don't know what your $PATH looks like, but in my experience, keeping it short and systematic has a lot of advantages.
short version: how can I get rid of the multiple-versions-of-python nightmare ?
long version: over the years, I've used several versions of python, and what is worse, several extensions to python (e.g. pygame, pylab, wxPython...). Each time it was on a different setup, with different OSes, sometimes different architectures (like my old PowerPC mac).
Nowadays I'm using a mac (OSX 10.6 on x86-64) and it's a dependency nightmare each time I want to revive a script older than a few months. Python itself already comes in three different flavours in /usr/bin (2.5, 2.6, 3.1), but I had to install 2.4 from macports for pygame, something else (cannot remember what) forced me to install all three others from macports as well, so at the end of the day I'm the happy owner of seven (!) instances of python on my system.
But that's not the problem, the problem is, none of them has the right (i.e. same set of) libraries installed, some of them are 32bits, some 64bits, and now I'm pretty much lost.
For example, right now I'm trying to run a three-year-old script (not written by me) which used to use matplotlib/numpy to draw a real-time plot within a rectangle of a wxwidgets window. But I'm failing miserably: py26-wxpython from macports won't install, stock python has wxwidgets included but also has some conflict between 32-bit and 64-bit, and it doesn't have numpy... what a mess!
Obviously, I'm doing things the wrong way. How do you usually cope with all that chaos?
I solve this using virtualenv. I sympathise with wanting to avoid further layers of nightmare abstraction, but virtualenv is actually amazingly clean and simple to use. You literally do this (command line, Linux):
virtualenv my_env
This creates a new python binary and library location, and symlinks to your existing system libraries by default. Then, to switch paths to use the new environment, you do this:
source my_env/bin/activate
That's it. Now if you install modules (e.g. with easy_install), they get installed to the lib directory of the my_env directory. They don't interfere with existing libraries, you don't get weird conflicts, stuff doesn't stop working in your old environment. They're completely isolated.
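For example (the package name here is just an illustration), you can check that an install really lands inside the environment:

source my_env/bin/activate
pip install requests
python -c "import requests; print(requests.__file__)"
# prints a path under my_env/lib/..., not the system site-packages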
To exit the environment, just do
deactivate
If you decide you made a mistake with an installation, or you don't want that environment anymore, just delete the directory:
rm -rf my_env
And you're done. It's really that simple.
virtualenv is great. ;)
Some tips:
on Mac OS X, use only the python installation in /Library/Frameworks/Python.framework.
whenever you use numpy/scipy/matplotlib, install the Enthought Python Distribution
use virtualenv and virtualenvwrapper to keep those "system" installations pristine; ideally use one virtual environment per project, so each project's dependencies are fulfilled. And, yes, that means potentially a lot of code will be replicated in the various virtual envs.
That seems like a bigger mess indeed, but at least things work that way. Basically, if one of the projects works in a virtualenv, it will keep working no matter what upgrades you perform, since you never change the "system" installs.
Take a look at virtualenv.
What I usually do is try to (progressively) keep up with the Python versions as they come along (and once all of the external dependencies have correct versions available).
Most of the time the Python code itself can be transferred as-is, with only minor modifications needed.
My biggest Python project at work (15,000+ LOC) has now been on Python 2.6 for a few months (upgrading everything from Python 2.5 took most of a day due to installing / checking 10+ dependencies...)
In general I think this is the best strategy with most of the interdependent components in the free software stack (think of the dependencies in Linux software repositories): keep your versions (semi-)current, or at least progressing at the same pace.
install the python versions you need, preferably from source
when you write a script, include the full python version in the shebang (such as #!/usr/local/bin/python2.6)
I can't see what could go wrong.
If something does, it's probably macports' fault anyway, not yours (one of the reasons I don't use macports anymore).
I know I'm probably missing something and this will get downvoted, but please leave at least a little comment in that case, thanks :)
I use the MacPorts version for everything, but as you note a lot of the default versions are bizarrely old. For example vim omnicomplete in Snow Leopard has python25 as a dependency. A lot of python related ports have old dependencies, but you can usually flag the newer version at build time, for example port install vim +python26 instead of port install vim +python. Do a dry run before installing anything to see if you are pulling in, for example, the whole of python24 when it isn't necessary. Check portfiles often, because the naming convention adopted while DarwinPorts was getting off the ground left something to be desired. In practice I just leave everything in the default /opt... folders of MacPorts, including a copy of the entire framework with duplicates of PyObjC, etc., and just stick with one version at a time, retaining the option to return to the system default if things break unexpectedly. Which is all perhaps a bit too much work to avoid using virtualenv, which I've been meaning to get around to using.
I've had good luck using Buildout. You set up a list of which eggs and which versions you want. Buildout then downloads and installs private versions of each for you. It makes a private "python" binary with all the eggs already installed. A local "nosetests" makes things easy to debug. You can extend the build with your own functions.
On the down side, Buildout can be quite mysterious. Do "buildout -vvvv" for a while to see exactly what it's doing and why.
http://www.buildout.org/docs/tutorial.html
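A minimal buildout.cfg along these lines (the recipe and egg names are illustrative; see the tutorial linked above for the real details):

[buildout]
parts = python

[python]
recipe = zc.recipe.egg
interpreter = python
eggs =
    nose
    numpy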
At least under Linux, multiple pythons can co-exist fairly happily. I use Python 2.6 on a CentOS system that needs Python 2.4 to be the default for various system things. I simply compiled and installed python 2.6 into a separate directory tree (and added the appropriate bin directory to my path) which was fairly painless. It's then invoked by typing "python2.6".
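The build itself is the standard routine (the version and prefix here are just examples):

./configure --prefix=/opt/python2.6    # keep it out of the system tree
make
make install
export PATH=/opt/python2.6/bin:$PATH   # or simply invoke it as python2.6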
Once you have separate pythons up and running, installing libraries for a specific version is straightforward. If you invoke the setup.py script with the python you want, it will be installed in directories appropriate to that python, and scripts will be installed in the same directory as the python executable itself and will automagically use the correct python when invoked.
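For example, to install a library for one particular interpreter (the version and directory here are just examples):

cd some-library/            # unpacked source of the library you want
python2.6 setup.py install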
I also try to avoid using too many libraries. When I only need one or two functions from a library (e.g. scipy), I'll often see if I can just copy them into my own project.