We are using Python scripts to process satellite images using different toolboxes or libraries (like GDAL etc.). We now have run into the problem that some 3rd-party Python applications we use (where we have no influence) require different versions of these python packages (for example, the Sentinel2 SEN2COR atmospheric correction requires GDAL 1.x, where we also want to use GDAL 2.x to be able to modify JPEG2000 files).
What's the best way to get this thing set up? I would prefer a way where I can have multiple installations of the same python version (e.g. 2.7, but that does not REALLY matter), and install packages and versions separately for each of them. Like that I could make sure that the SEN2COR script runs in his own python installation, where I install the packages needed and never touch it again, and work in another python installation with my other scripts.
I think a thing like virtualenv would be perfect for that, but there is one important point: All our scripts are command line scripts and are started from a variety of sources, like MATLAB scripts, R scripts or even cron jobs sometimes. Is there a way in virtualenv where I can do something like /usr/bin/python-version-only-for-sen2cor process_data.py arguments and /location/of/other/python-version reload_table.py from the shell or from inside other programs? Whats the best way for our setup? I probably can just install python multiple times and always modify the environment vars to use a different python version while installing the packages, but I guess this is prone to errors. Any suggestions?
You could have a look at pythonbrew or pyenv.
I just used pythonbrew recently. I just run my program with a bash script (to avoid typing the whole path):
#!/bin/bash
~/.pythonbrew/pythons/Python-2.7.7/bin/python my_program.py
Related
I'm new to coding and I was wondering if someone could explain what exactly python interpreters/environments do and the how that relates to the python versions and packages that can be used in certain projects.
I was trying to code a twitter bot with python and I use VS code as my IDE. While trying to import tweepy into my python code, I noticed I kept getting an error that it couldn't import. After some googling I realized it was because I had to set the python interpreter of the python file to the one that had tweepy installed to. So I changed it and the error with tweepy was solved. But I noticed that I have a bunch of python interpreters in seemingly different locations (screenshot of the interpreter options I'm given). The interpreter that fixed the issue was the one in the pyenv path.
I had installed pyenv some months back because I wanted my terminal (I'm on macOS) to automatically launch python3 when I typed python into the terminal. However, I don't really know what it does beyond that.
So my questions are:
Why do I have multiple python interpreters? Is there a way to get rid of the ones and just keep the one from pyenv (like clean up the ones in /usr/local/bin/python3, /opt/homebrew/bin/python or /usr/bin/python3 since I'm not using them)? or should I not do that?
What exactly does pyenv do? Is it okay that my python libraries are getting installed to pyenv by default? Or should I change it so that it's getting installed to homebrew or one of the usr/bin paths?
Sorry, this is my first time asking a question here so I might not sound that cohesive.
The interpreter is the program which executes a python source file. This is a program like any other, and you can have various versions of it at once (indeed python is quite good at being self contained, and multiple versions will live alongside one another quite happily).
Macs ship with an outdated python 2, used for something in the system. If you remove or update it, things may break. So you install your own python 3 somewhere. If you want to load the new python when you type python in a shell, that shell's $PATH needs to point to the right python. You generally set this in something like ~/.profile.
Pyenv is a tool for managing multiple python interpreters. It's usually used on a per-project basis, to test code against multiple pythons. Further to complicate things, it's often used alongside a virtual environment tool like pipenv. Using it to avoid manually fiddling with $PATH is fine, but a slightly orthogonal use case. All pyenv does is to put little scripts with the same name as python executables somewhere in your $PATH where they supercede anything else. These scripts then figure out which python should get called in this case. So pyenv isn't installing anything, it's just working out which python is going to do it.
The solution to all your problems turns out to be quite simple:
python -m pip install abc
Get into the habit of installing stuff with the python you want to use already set up in the shell, and calling pip like that. That way whatever you use it will always be installed in the right env.
I’m trying to figure out how to install a second python environment alongside anaconda.
On windows I can just install python in a different folder stand reference the desired python environment using env variables. I’d like to do the same on Mac.
A virtual env won’t do the trick as it does not copy the standard library and other things. It needs to be a complete stand alone environment. I guess I could compile it, but is there an easier way?
Thank you very much for any input.
You can do that using pyenv.
It allows you to have several python versions, and even different distributions.
It works, mostly on user space. So, no additional requirements are needed (apart from compilation tools)
I am using PythonXY (2.7, 32-bit) and the official Python (2.7, 32-bit).
Normally it is recommended to install according to python version, example C:\python27. But since they are both python27, can I arbitrarily change the base name (example C:\pythonxy27)?
When using python extras like pylauncher, or when utilizing the setuptools user-site, will they automatically recognize my custom installation sites (they will easily differentiate C:\python27 and C:\python33), or will both installations compete for the python27 namespace. (specifically when installing 3rd party packages to user-site, which normally locates as such \APPDATA\Python\PythonVer)
As far as setuptools/distribute are concerned, the python installer will handle custom locations for you. As long as you don't move that directory, all should be fine.
As for Pylauncher:
Things are not quite so clean. Pylauncher has simple configuration/call-parameters (for shebang lines in particular), that can handle version/platform selection quite well (2.7 vs 3.3, and 32bit vs 64bit).
As for the scenario in question (two different deployments where both are based on 32bit Python 2.7), pylauncher will attempt to guess which installation you wanted. If it is picking the wrong installation, there is some debugging information you can review to tune pylauncher's selection.
If an environment variable PYLAUNCH_DEBUG is set (to any value), the
launcher will print diagnostic information
It does not seem like there is a portable way to configure this, and will have to be done per-system (once you have your installations configured, YOU CAN set an alias that will be recognized on the shebang line)
Virtualenv and friends
I have also found (after struggling with pylauncher focused solutions), virtualenv addresses many of the deployment isolation hurdles. At the time of posting, working with virtualenv was not nearly as intuitive (on Windows) as compared to a linux shell environment. But I have discovered support packages like virtualenvwrapper which handle a lot of the ugly batch file interfaces very nicely.
Final Notes
Originally, I was also handling python globally with admin accounts. Forcing myself to stay within my user home directory (C:/Users/username), utilizing python user-site configurations, and making optimum use of ipython: have all given me a much better interactive command-line experience.
How should the shebang for a Python script look like?
Some people support #!/usr/bin/env python because it can find the Python interpreter intelligently. Others support #!/usr/bin/python, because now in most GNU/Linux distributions python is the default program.
What are the benefits of the two variants?
The Debian Python Policy states:
The preferred specification for the Python interpreter is /usr/bin/python or /usr/bin/pythonX.Y. This ensures that a Debian installation of python is used and all dependencies on additional python modules are met.
Maintainers should not override the Debian Python interpreter using /usr/bin/env python or /usr/bin/env pythonX.Y. This is not advisable as it bypasses Debian's dependency checking and makes the package vulnerable to incomplete local installations of python.
Note that Debian/Ubuntu use the alternatives system to manage which version /usr/bin/python actually points to. This has been working very nicely across a lot of python versions at least for me (and I've been using python from 2.3 to 2.7 now), with excellent transitions across updates.
Note that I've never used pip. I want automatic security upgrades, so I install all my python needs via aptitude. Using the official Debian/Ubuntu packages keep my system much cleaner than me messing around with the python installation myself.
Let me emphasize one thing. The above recommendation refers to system installation of python applications. It makes perfectly sense to have these use the system managed version of python. If you are actually playing around with your own, customized installation of python that is not managed by the operating system, using the env variant probably is the correct way of saying "use the user-preferred python", instead of hard-coding either the system python installation (which would be /usr/bin/python) or any user-custom path.
Using env python will cause your programs to behave differently if you call them from e.g. a python virtualenv.
This can be desired (e.g. you are writing a script to work only in your virtualenv). And it can be problematic (you write a tool for you, and expect it to work the same even within a virtualenv - it may suddenly fail because it is missing packages then).
My humble opinion is that you should use the env-variant. It's a POSIX component thus found in pretty much every system, while directly specifying /usr/bin/python breaks in many occasions, i.e. virtualenv setups.
I use #!/usr/bin/env python as the default install location on OS-X is NOT /usr/bin. This also applies to users who like to customize their environment -- /usr/local/bin is another common place where you might find a python distribution.
That said, it really doesn't matter too much. You can always test the script with whatever python version you want: /usr/bin/strange/path/python myscript.py. Also, when you install a script via setuptools, the shebang seems to get replaced by the sys.executable which installed that script -- I don't know about pip, but I would assume it behaves similarly.
As you note, they probably both work on linux. However, if someone has installed a newer version of python for their own use, or some requirement makes people keep a particular version in /usr/bin, the env allows the caller to set up their environment so that a different version will be called through env.
Imagine someone trying to see if python 3 works with the scripts. They'll add the python3 interpreter first in their path, but want to keep the default on the system running on 2.x. With a hardcoded path that's not possible.
short version: how can I get rid of the multiple-versions-of-python nightmare ?
long version: over the years, I've used several versions of python, and what is worse, several extensions to python (e.g. pygame, pylab, wxPython...). Each time it was on a different setup, with different OSes, sometimes different architectures (like my old PowerPC mac).
Nowadays I'm using a mac (OSX 10.6 on x86-64) and it's a dependency nightmare each time I want to revive script older than a few months. Python itself already comes in three different flavours in /usr/bin (2.5, 2.6, 3.1), but I had to install 2.4 from macports for pygame, something else (cannot remember what) forced me to install all three others from macports as well, so at the end of the day I'm the happy owner of seven (!) instances of python on my system.
But that's not the problem, the problem is, none of them has the right (i.e. same set of) libraries installed, some of them are 32bits, some 64bits, and now I'm pretty much lost.
For example right now I'm trying to run a three-year-old script (not written by me) which used to use matplotlib/numpy to draw a real-time plot within a rectangle of a wxwidgets window. But I'm failing miserably: py26-wxpython from macports won't install, stock python has wxwidgets included but also has some conflict between 32 bits and 64 bits, and it doesn't have numpy... what a mess !
Obviously, I'm doing things the wrong way. How do you usally cope with all that chaos ?
I solve this using virtualenv. I sympathise with wanting to avoid further layers of nightmare abstraction, but virtualenv is actually amazingly clean and simple to use. You literally do this (command line, Linux):
virtualenv my_env
This creates a new python binary and library location, and symlinks to your existing system libraries by default. Then, to switch paths to use the new environment, you do this:
source my_env/bin/activate
That's it. Now if you install modules (e.g. with easy_install), they get installed to the lib directory of the my_env directory. They don't interfere with existing libraries, you don't get weird conflicts, stuff doesn't stop working in your old environment. They're completely isolated.
To exit the environment, just do
deactivate
If you decide you made a mistake with an installation, or you don't want that environment anymore, just delete the directory:
rm -rf my_env
And you're done. It's really that simple.
virtualenv is great. ;)
Some tips:
on Mac OS X, use only the python installation in /Library/Frameworks/Python.framework.
whenever you use numpy/scipy/matplotlib, install the enthought python distribution
use virtualenv and virtualenvwrapper to keep those "system" installations pristine; ideally use one virtual environment per project, so each project's dependencies are fulfilled. And, yes, that means potentially a lot of code will be replicated in the various virtual envs.
That seems like a bigger mess indeed, but at least things work that way. Basically, if one of the projects works in a virtualenv, it will keep working no matter what upgrades you perform, since you never change the "system" installs.
Take a look at virtualenv.
What I usually do is trying to (progressively) keep up with the Python versions as they come along (and once all of the external dependencies have correct versions available).
Most of the time the Python code itself can be transferred as-is with only minor needed modifications.
My biggest Python project # work (15.000+ LOC) is now on Python 2.6 a few months (upgrading everything from Python 2.5 did take most of a day due to installing / checking 10+ dependencies...)
In general I think this is the best strategy with most of the interdependent components in the free software stack (think the dependencies in the linux software repositories): keep your versions (semi)-current (or at least: progressing at the same pace).
install the python versions you need, better if from sources
when you write a script, include the full python version into it (such as #!/usr/local/bin/python2.6)
I can't see what could go wrong.
If something does, it's probably macports fault anyway, not yours (one of the reasons I don't use macports anymore).
I know I'm probably missing something and this will get downvoted, but please leave at least a little comment in that case, thanks :)
I use the MacPorts version for everything, but as you note a lot of the default versions are bizarrely old. For example vim omnicomplete in Snow Leopard has python25 as a dependency. A lot of python related ports have old dependencies but you can usually flag the newer version at build time, for example port install vim +python26 instead of port install vim +python. Do a dry run before installing anything to see if you are pulling, for example, the whole of python24 when it isn't necessary. Check portfiles often because the naming convention as Darwin ports was getting off the ground left something to be desired. In practice I just leave everything in the default /opt... folders of MacPorts, including a copy of the entire framework with duplicates of PyObjC, etc., and just stick with one version at a time, retaining the option to return to the system default if things break unexpectedly. Which is all perhaps a bit too much work to avoid using virtualenv, which I've been meaning to get around to using.
I've had good luck using Buildout. You set up a list of which eggs and which versions you want. Buildout then downloads and installs private versions of each for you. It makes a private "python" binary with all the eggs already installed. A local "nosetests" makes things easy to debug. You can extend the build with your own functions.
On the down side, Buildout can be quite mysterious. Do "buildout -vvvv" for a while to see exactly what it's doing and why.
http://www.buildout.org/docs/tutorial.html
At least under Linux, multiple pythons can co-exist fairly happily. I use Python 2.6 on a CentOS system that needs Python 2.4 to be the default for various system things. I simply compiled and installed python 2.6 into a separate directory tree (and added the appropriate bin directory to my path) which was fairly painless. It's then invoked by typing "python2.6".
Once you have separate pythons up and running, installing libraries for a specific version is straightforward. If you invoke the setup.py script with the python you want, it will be installed in directories appropriate to that python, and scripts will be installed in the same directory as the python executable itself and will automagically use the correct python when invoked.
I also try to avoid using too many libraries. When I only need one or two functions from a library (eg scipy), I'll often see if I can just copy them to my own project.