I just downloaded the Python sources, unpacked them to /usr/local/src/Python-3.5.1/, and ran ./configure and make there. Now, according to the documentation, I should run make install.
But I don't want to install it into any of the common system folders, create any links, change or add environment variables, or do anything at all outside this folder. In other words, I want it to be portable. How do I do that? Will /usr/local/src/Python-3.5.1/python get-pip.py install pip to /usr/local/src/Python-3.5.1/Lib/site-packages/? Will /usr/local/src/Python-3.5.1/python work properly?
make altinstall, as I understand it, still creates links, which is not desired. Is it correct that it also creates symbolic links but simply doesn't touch /usr/bin/python and the man pages?
Probably I should do ./configure --prefix=some/private/path and just make and make install there, but I still wonder whether it's possible to use Python without running make install at all.
If you don't want to copy the binaries you built into a shared location for system-wide use, you should not run make install at all. If the build was successful, it will have produced binaries you can run. You may need to set up an environment to make them use local run-time files instead of the system-wide ones, but this is a common enough requirement for developers that it will often be documented in a README or similar (though, as always when dealing with development sources, be prepared that it might not be as meticulously kept up to date as the end-user documentation of a released version).
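For example, a quick sanity check of the in-place build might look like this (a rough sketch, not a guaranteed recipe; how the uninstalled interpreter resolves its prefix, and whether get-pip.py would then land in the build tree's Lib/site-packages, depends on the exact version):

cd /usr/local/src/Python-3.5.1
./python --version                                              # run the freshly built interpreter in place
./python -c "import sys; print(sys.prefix); print(sys.path)"    # check which run-time files it picks up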
I am building Python 3.10 from source on Ubuntu 18.04, following instructions from several web links, primarily the Python website (https://devguide.python.org/setup) and RealPython (https://realpython.com/installing-python/#how-to-build-python-from-source-code). I extracted Python-3.10.0.tgz into /opt/Python3.10. I have three questions.
First, the Python website says to use ./configure --with-pydebug and RealPython says to use ./configure --enable-optimizations --with-ensurepip=install. Another source says to include --enable-shared and --enable-unicode=ucs4. Which of these is best? Should I use all of those flags?
Second, I currently have Python 3.6 and Python 3.8 installed. They are installed in several directories under /usr. Following the directions I have seen on the web I am building in /opt/Python3.10. I assume that make altinstall (the final build step) will take care of installing the build in the usual folders under /usr, but that's not clear. Should I use ./configure --prefix=directory although none of the web sources mention doing that?
Finally, how much does --enable-optimizations slow down the install process?
This is my first time building Python from source, and it will help to clear these things up. Thanks for any help.
Welcome to the world of Python build configuration! I'll go through the command line options to ./configure one by one.
--with-pydebug is for core Python developers, not developers (like you and me) who are just using Python. It builds the interpreter with extra debugging support and run-time checks, which slows down execution. You don't need it.
--enable-optimizations is good for performance in the long run, at the expense of lengthening the compile time, possibly by a factor of 3 (or more), depending on your system. However, it results in faster execution, so I would use it in your situation.
--with-ensurepip=install is good. You want the most up-to-date version of pip.
--enable-shared is maybe not a good idea in your case, so I'd recommend not using it here. Read Difference between static and shared libraries? to understand the difference. Basically, since you'll possibly be installing to a non-system path (/opt/local, see below) that almost certainly isn't on your system's search path for shared libraries, you'll very likely run into problems down the road. A static build has all the pieces in one place, so you can install and run it from wherever. This is at the expense of size - the python binary will be rather large - but is great for non-sys admins. Even if you end up installing to /usr/local, I would argue that static is better/easier than shared.
--enable-unicode=ucs4 is optional, and may not be compatible with your system. You don't need it. ./configure is smart enough to figure out what Unicode settings are best. This option is left over from build instructions that are quite a few versions out of date.
--prefix: I would suggest --prefix=/opt/local if that directory already exists and is in your $PATH, or if you know how to edit your $PATH in ~/.bashrc. Otherwise, use /usr/local or $HOME. /usr/local is the designated system-wide location for local software installs (i.e., stuff that doesn't come with Ubuntu) and is likely already on your $PATH. $HOME is always an option that doesn't require the use of sudo, which is great from a security perspective; you'll need to add /home/your_username/bin to your $PATH if it isn't already present.
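Putting the recommendations above together, a reasonable invocation might look like this (a sketch only; adjust the source directory and prefix to your setup):

cd /opt/Python3.10/Python-3.10.0            # or wherever the extracted sources actually live
./configure --prefix=/opt/local --enable-optimizations --with-ensurepip=install
make -j "$(nproc)"                          # parallel build; --enable-optimizations makes this slow
sudo make altinstall                        # altinstall avoids clobbering the system python3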
I'm a Java/Scala dev transitioning to Python for a work project. To dust off the cobwebs on the Python side of my brain, I wrote a webapp that acts as a front-end for Docker when doing local Docker work. I'm now working on packaging it up and, as such, am learning about setup.py and virtualenv. Coming from the JVM world, where dependencies aren't "installed" so much as downloaded to a repository and referenced when needed, the way pip handles things is a bit foreign. It seems like best practice for production Python work is to first create a virtual environment for your project, do your coding work, then package it up with setup.py.
My question is: what happens on the other end when someone needs to install what I've written? They too will have to create a virtual environment for the package, but they won't know how to set it up without inspecting the setup.py file to figure out what version of Python to use, etc. Is there a way for me to create a setup.py file that also creates the appropriate virtual environment as part of the install process? If not (or if that's considered bad practice, as one respondent to a related SO post stated), what is considered "best practice" in this situation?
You can think of virtualenv as providing isolation for everything you install with pip. It is a simple way to handle different versions of Python and its packages. For instance, suppose you have two projects that use the same packages, but in different versions. With virtualenv you can isolate those two projects and install different versions of the packages separately, without touching your working system.
Now, let's say you want to work on a project with a friend. In order to have the same packages installed, you have to share somehow which packages, and which versions, your project depends on. If you are delivering a reusable package (a library), then you need to distribute it, and this is where setup.py helps. You can learn more in the Quick Start.
However, if you are working on a web site, all you need is to record the library versions in a separate file. Best practice is to create separate requirements files for tests, development and production. To see the format of the file, run pip freeze. You will get a list of the packages installed on the system (or in the virtualenv) right now. Put it into a file and you can install the same set later on another machine, into a completely clean virtualenv, using pip install -r development.txt
One more thing: please don't pin packages to the exact versions that pip freeze shows; most of the time you want >= some minimum X.X version. The good news is that pip handles dependencies on its own, so you don't have to list transitive dependencies yourself; pip will sort them out.
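A minimal illustration (the file name and the packages/versions are only examples):

# development.txt, hand-maintained with loose minimum versions:
#     Flask>=0.10
#     requests>=2.0
pip freeze                         # prints the exact versions currently installed, for reference
pip install -r development.txt     # recreates that set of packages in a fresh virtualenv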
Speaking of deployment, you may want to check out tox, a tool for managing virtualenvs. It helps a lot with deployment.
Python's default package path points to the system environment, which requires administrator access to install into. Virtualenv can localise the installation to an isolated environment.
For deployment/distribution of a package, you can choose to:
Distribute the source code; the user then needs to run python setup.py install, or
Package your Python project and upload it to PyPI or a custom devpi server, so the user can simply run pip install <yourpackage>
However, as noted above, without virtualenv the user needs administrator access to install any Python package system-wide.
In addition, the PyPI package world contains a fair number of badly tested packages that don't work out of the box.
Note: virtualenv itself is actually something of a hack to achieve isolation.
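From the user's side, option 2 combined with virtualenv might look like this (the package name is hypothetical):

virtualenv myenv                 # isolated environment, no administrator access required
source myenv/bin/activate
pip install yourpackage          # fetched from PyPI (or a custom devpi index via --index-url)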
I'm perplexed about how best to use pip in the face of security concerns about malicious packages or install scripts. I'm not much of a security expert, so I may just be confused (bear with me), but it seems that there are 4, possibly overlapping, approaches:
(1) Use sudo pip for everything
This is how I do things now. I generally do not need virtualenvs and like the convenience of having all my packages work for all my tools. I also don't install a lot of experimental packages, sticking pretty much to the well-known and widely used ones (matplotlib, six, etc).
I gather this can be a risky approach though because the installation process has su privileges, and could potentially do anything; however it has the advantage of protecting the site-packages directory from subsequent mischief by anything (not just packages) running as non-su after an install.
This approach also can't be completely avoided, as some packages (pip itself) need it to bootstrap any Python installation.
(2) Create a pip user and give it ownership of site-packages
This would seem to have the advantage of restricting what pip can do: all it can do is install to site-packages. But I'm not sure about side effects, or if it would even work (when, for example, pip needs to put things in other locations). A more realistic variant of this is to set things up this way, and use pip as "pip-user" when it works, and as su when it doesn't.
(3) Give myself ownership of site-packages
I gather this is a very bad idea, but I'm not sure quite why. It would mean that any code I run would be able to tamper with site-packages; but it would also mean that malicious install scripts could only damage things I can damage myself anyway.
(4) "Use a virtualenv"
This suggestion comes up a lot, but I don't see how it helps. It seems no different from 3 to me since it creates a site-packages that I own.
Which, if any, of these approaches, or combination of approaches, is best for ensuring that pip does not end up exposing my system? My concern is mostly with my system as a whole, and only secondarily with my Python installation in site-packages (which I can always rebuild if need be).
Part of the problem I have is that I don't know how to weigh the risks. An example approach that seems to make sense to my limited understanding is simply to do (1) for the most part, and use a virtualenv (4) for any package that I worry might damage my site-packages. Anything I've installed will still be able to damage anything I have access to, but that seems unavoidable, and at least things I don't have access to will be safe (except during the installation process itself). But I have trouble evaluating whether the protection this affords is worth the risk it creates.
You probably want to look at using a virtualenv. To quote the docs:
Virtualenv is a tool to create isolated Python environments. The basic problem being addressed is one of dependencies and versions, and indirectly permissions.
Virtualenv will create a folder with an isolated copy of Python, an isolated pip and an isolated site-packages. You're thinking that this is the same as option 3 because you're taking the advice you linked at face value instead of reading further into it:
If you give yourself write privilege to the system site-packages, you're risking that any program that runs under you (not necessarily python program) can inject malicious code into the system site-packages and obtain root privilege.
The problem is not with having access to site-packages as such (you have to have privileges for some site-packages to be able to do anything). The problem is with having access to the system site-packages. A virtual environment's site-packages does not expose root privileges to malicious code the way the one your entire system uses does.
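In other words, something like the following keeps pip away from the system site-packages entirely (a minimal sketch; the package name is just an example):

virtualenv ~/envs/sandbox                # creates an isolated site-packages that you own
source ~/envs/sandbox/bin/activate
pip install matplotlib                   # no sudo; nothing outside ~/envs/sandbox is touched
deactivate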
However, I see nothing wrong with using sudo pip for well-known and familiar packages. At the end of the day, it's like installing any other program, even a non-Python one. If you go to its website and it looks honest and you trust it, there's no reason not to use sudo.
Moreover, pip is pretty safe: it uses HTTPS for PyPI, and if you pass --allow-external it will download packages from third-party hosts but still verify them against checksums kept on PyPI. For third-party packages with no checksum you have to explicitly pass --allow-unverified, which is the only option considered unsafe.
As a personal note, I can add that I use sudo pip most of the time, but as a web developer virtualenv is a day-to-day thing for me anyway, and I can recommend using it as well (especially if you see anything sketchy but still want to try it out).
So pip and virtualenv sound wonderful compared to setuptools. Being able to uninstall would be great. But my project is already using setuptools, so how do I migrate? The web sites I've been able to find so far are very vague and general. So here's an anthology of questions after reading the main web sites and trying stuff out:
First of all, are virtualenv and pip supposed to be in a usable state by now? If not, please disregard the rest as the ravings of a madman.
How should virtualenv be installed? I'm not quite ready to believe it's as convoluted as explained elsewhere.
Is there a set of tested instructions for how to install matplotlib in a virtual environment? For some reason it always wants to compile it here instead of just installing a package, and it always ends in failure (even after build-dep which took up 250 MB of disk space). After a whole bunch of warnings it prints src/mplutils.cpp:17: error: ‘vsprintf’ was not declared in this scope.
How does either tool interact with setup.py? pip is supposed to replace easy_install, but it's not clear whether it's a drop-in or more complicated relationship.
Is virtualenv only for development mode, or should the users also install it?
Will the resulting package be installed with the minimum requirements (like the current egg), or will it be installed with sources & binaries for all dependencies plus all the build tools, creating a gigabyte monster in the virtual environment?
Will the users have to modify their $PATH and $PYTHONPATH to run the resulting package if it's installed in a virtual environment?
Do I need to create a script from a text string for virtualenv like in the bad old days?
What is with the #egg=Package URL syntax? That's not part of the standard URL, so why isn't it a separate parameter?
Where is #rev included in the URL? At the end I suppose, but the documentation is not clear about this ("You can also include #rev in the URL").
What is supposed to be understood by using an existing requirements file as "as a sort of template for the new file"? This could mean any number of things.
Wow, that's quite a set of questions. Many of them would really deserve their own SO question with more details. I'll do my best:
First of all, are virtualenv and pip supposed to be in a usable state by now?
Yes, although they don't serve everyone's needs. Pip and virtualenv (along with everything else in Python package management) are far from perfect, but they are widely used and depended upon nonetheless.
How should virtualenv be installed? I'm not quite ready to believe it's as convoluted as explained elsewhere.
The answer you link is complex because it is trying to avoid making any changes at all to your global Python installation, installing everything under ~/.local instead. This has some advantages but is more complex to set up. It also installs virtualenvwrapper, a set of convenience bash scripts for working with virtualenv, which is nice but not necessary for using virtualenv.
If you are on Ubuntu, aptitude install python-setuptools followed by easy_install virtualenv should get you a working virtualenv installation without doing any damage to your global python environment (unless you also had the Ubuntu virtualenv package installed, which I don't recommend as it will likely be an old version).
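In other words, the short path on Ubuntu is roughly this (a sketch; it assumes you're happy to use easy_install for this one bootstrap step):

sudo aptitude install python-setuptools   # provides easy_install
sudo easy_install virtualenv              # installs virtualenv itself
virtualenv --no-site-packages myenv       # then create environments as a normal user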
Is there a set of tested instructions for how to install matplotlib in a virtual environment? For some reason it always wants to compile it here instead of just installing a package, and it always ends in failure (even after build-dep which took up 250 MB of disk space). After a whole bunch of warnings it prints src/mplutils.cpp:17: error: ‘vsprintf’ was not declared in this scope.
It "always wants to compile" because pip, by design, installs only from source, it doesn't install pre-compiled binaries. This is a controversial choice, and is probably the primary reason why pip has seen widest adoption among Python web developers, who use more pure-Python packages and commonly develop and deploy in POSIX environments where a working compilation chain is standard.
The reason for the design choice is that providing precompiled binaries has a combinatorial explosion problem with different platforms and build architectures (including python version, UCS-2 vs UCS-4 python builds, 32 vs 64-bit...). The way easy_install finds the right binary package on PyPI sort of works, most of the time, but doesn't account for all these factors and can break. So pip just avoids that issue altogether (replacing it with a requirement that you have a working compilation environment).
In many cases, packages that require C compilation also have a slower-moving release schedule and it's acceptable to simply install OS packages for them instead. This doesn't allow working with different versions of them in different virtualenvs, though.
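For example, on Debian/Ubuntu of that era you could lean on the distribution's builds for the heavy compiled dependencies (package names are the Python 2 ones and may differ on your release):

sudo apt-get install python-numpy python-matplotlib   # prebuilt packages, no local compilation
virtualenv tmp        # without --no-site-packages, the environment can still see those system packages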
I don't know what's causing your compilation error, it works for me (on Ubuntu 10.10) with this series of commands:
virtualenv --no-site-packages tmp
. tmp/bin/activate
pip install numpy
pip install -f http://downloads.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.0.1/matplotlib-1.0.1.tar.gz matplotlib
The "-f" link is necessary to get the most recent version, due to matplotlib's unusual download URLs on PyPI.
How does either tool interact with setup.py? pip is supposed to replace easy_install, but it's not clear whether it's a drop-in or more complicated relationship.
The setup.py file is a convention of distutils, the Python standard library's package management "solution." distutils alone is missing some key features, and setuptools is a widely-used third-party package that "embraces and extends" distutils to provide some additional features. setuptools also uses setup.py. easy_install is the installer bundled with setuptools. Setuptools development stalled for several years, and distribute was a fork of setuptools to fix some longstanding bugs. Eventually the fork was resolved with a merge of distribute back into setuptools, and setuptools development is now active again (with a new maintainer).
distutils2 was a mostly-rewritten new version of distutils that attempted to incorporate the best ideas from setuptools/distribute, and was supposed to become part of the Python standard library. Unfortunately this effort failed, so for the time being setuptools remains the de facto standard for Python packaging.
Pip replaces easy_install, but it does not replace setuptools; it requires setuptools and builds on top of it. Thus it also uses setup.py.
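Concretely, for a project that ships a setup.py, both tools drive the same file (the path is illustrative, and this assumes a reasonably recent pip):

cd /path/to/yourproject          # the directory containing setup.py
pip install .                    # pip runs setup.py via setuptools to build and install it
pip install -e .                 # or an "editable" install, akin to setup.py develop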
Is virtualenv only for development mode, or should the users also install it?
There's no single right answer to that; it can be used either way. In the end it's really your users' choice, and your software ideally should be installable inside or outside of a virtualenv, though you might choose to document and emphasize one approach or the other. It depends very much on who your users are and what environments they are likely to need to install your software into.
Will the resulting package be installed with the minimum requirements (like the current egg), or will it be installed with sources & binaries for all dependencies plus all the build tools, creating a gigabyte monster in the virtual environment?
If a package that requires compilation is installed via pip, it will need to be compiled from source. That also applies to any dependencies that require compilation.
This is unrelated to the question of whether you use a virtualenv. easy_install is available by default in a virtualenv and works just fine there. It can install pre-compiled binary eggs, just like it does outside of a virtualenv.
Will the users have to modify their $PATH and $PYTHONPATH to run the resulting package if it's installed in a virtual environment?
In order to use anything installed in a virtualenv, you need to use the python binary in the virtualenv's bin/ directory (or another script installed into the virtualenv that references this binary). The most common way to do this is to use the virtualenv's activate or activate.bat script to temporarily modify the shell PATH so the virtualenv's bin/ directory is first. Modifying PYTHONPATH is not generally useful or necessary with virtualenv.
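For example (the environment and package names are arbitrary):

virtualenv myenv
source myenv/bin/activate                 # temporarily puts myenv/bin first on PATH
pip install yourpackage                   # "yourpackage" is hypothetical
python -c "import yourpackage"            # python now resolves to myenv/bin/python
deactivate
myenv/bin/python -c "import yourpackage"  # also works without activating, using the env's binary directly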
Do I need to create a script from a text string for virtualenv like in the bad old days?
No.
What is with the #egg=Package URL syntax? That's not part of the standard URL, so why isn't it a separate parameter?
The "#egg=projectname-version" URL fragment hack was first introduced by setuptools and easy_install. Since easy_install scrapes links from the web to find candidate distributions to install for a given package name and version, this hack allowed package authors to add links on PyPI that easy_install could understand, even if they didn't use easy_install's standard naming conventions for their files.
Where is #rev included in the URL? At the end I suppose, but the documentation is not clear about this ("You can also include #rev in the URL").
A couple of sentences after that quoted fragment there is a link to "read the requirements file format to learn about other features." The #rev feature is fully documented and demonstrated there.
What is supposed to be understood by using an existing requirements file as "as a sort of template for the new file"? This could mean any number of things.
The very next sentence says "it will keep the packages listed in devel-req.txt in order and preserve comments." I'm not sure what would be a better concise description.
I can't answer all your questions, but hopefully the following helps.
Both virtualenv and pip are very usable. Many Python devs use them every day.
Since you have a working easy_install, the easiest way to install both is the following:
easy_install pip
easy_install virtualenv
Once you have virtualenv, just type virtualenv yourEnvName and you'll get your new python virtual environment in a directory named yourEnvName.
From there, it's as easy as source yourEnvName/bin/activate, and the virtual Python interpreter becomes the active one. I know nothing about matplotlib, but following its installation instructions should work out OK unless there are weird hard-coded path issues.
If you can install something via easy_install you can usually install it via pip. I haven't found anything that easy_install could do that pip couldn't.
I wouldn't count on users being able to install virtualenv (it depends on who your users are). Technically, a virtual Python interpreter can be treated as a real one for most purposes. Its main uses are not cluttering up the real interpreter's site-packages and coping with two libraries/apps that require different, incompatible versions of the same library.
If you or a user installs something in a virtualenv, it won't be available in other virtualenvs or in the system Python interpreter. You'll need to run source /path/to/yourvirtualenv/bin/activate to switch into the virtual environment where you installed the library.
What they mean by "as a sort of template for the new file" is that the pip freeze -r devel-req.txt > stable-req.txt command will create a new file stable-req.txt based on the existing file devel-req.txt. The only difference will be anything installed not already specified in the existing file will be in the new file.
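So, concretely (the file names are taken from the example in the pip documentation):

pip freeze -r devel-req.txt > stable-req.txt   # the new file keeps devel-req.txt's ordering and comments,
                                               # adding anything installed that wasn't already listed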
Short version: how can I get rid of the multiple-versions-of-Python nightmare?
Long version: over the years, I've used several versions of Python and, what is worse, several extensions to Python (e.g. pygame, pylab, wxPython...). Each time it was on a different setup, with different OSes, sometimes different architectures (like my old PowerPC Mac).
Nowadays I'm using a Mac (OS X 10.6 on x86-64) and it's a dependency nightmare each time I want to revive a script older than a few months. Python itself already comes in three different flavours in /usr/bin (2.5, 2.6, 3.1), but I had to install 2.4 from MacPorts for pygame, something else (I cannot remember what) forced me to install all three others from MacPorts as well, so at the end of the day I'm the happy owner of seven (!) instances of Python on my system.
But that's not the problem. The problem is that none of them has the right (i.e. the same) set of libraries installed, some of them are 32-bit, some 64-bit, and now I'm pretty much lost.
For example, right now I'm trying to run a three-year-old script (not written by me) which used matplotlib/numpy to draw a real-time plot within a rectangle of a wxWidgets window. But I'm failing miserably: py26-wxpython from MacPorts won't install, stock Python has wxWidgets included but also has some conflict between 32-bit and 64-bit, and it doesn't have numpy... what a mess!
Obviously, I'm doing things the wrong way. How do you usually cope with all that chaos?
I solve this using virtualenv. I sympathise with wanting to avoid further layers of nightmare abstraction, but virtualenv is actually amazingly clean and simple to use. You literally do this (command line, Linux):
virtualenv my_env
This creates a new python binary and library location, and symlinks to your existing system libraries by default. Then, to switch paths to use the new environment, you do this:
source my_env/bin/activate
That's it. Now if you install modules (e.g. with easy_install), they get installed to the lib directory of the my_env directory. They don't interfere with existing libraries, you don't get weird conflicts, stuff doesn't stop working in your old environment. They're completely isolated.
To exit the environment, just do
deactivate
If you decide you made a mistake with an installation, or you don't want that environment anymore, just delete the directory:
rm -rf my_env
And you're done. It's really that simple.
virtualenv is great. ;)
Some tips:
on Mac OS X, use only the python installation in /Library/Frameworks/Python.framework.
whenever you use numpy/scipy/matplotlib, install the Enthought Python Distribution
use virtualenv and virtualenvwrapper to keep those "system" installations pristine; ideally use one virtual environment per project, so each project's dependencies are fulfilled. And, yes, that means potentially a lot of code will be replicated in the various virtual envs.
That seems like a bigger mess indeed, but at least things work that way. Basically, if one of the projects works in a virtualenv, it will keep working no matter what upgrades you perform, since you never change the "system" installs.
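With virtualenvwrapper installed and sourced in your shell, that per-project workflow is roughly (the project name is arbitrary):

mkvirtualenv projectA            # one environment per project
workon projectA                  # switch back to it later
pip install numpy                # dependencies land only in this environment
deactivate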
Take a look at virtualenv.
What I usually do is try to (progressively) keep up with the Python versions as they come along (and once all of the external dependencies have correct versions available).
Most of the time the Python code itself can be carried over as-is, with only minor modifications needed.
My biggest Python project at work (15,000+ LOC) has been on Python 2.6 for a few months now (upgrading everything from Python 2.5 took most of a day, due to installing/checking 10+ dependencies...).
In general I think this is the best strategy with most of the interdependent components in the free-software stack (think of the dependencies in the Linux software repositories): keep your versions (semi-)current, or at least progressing at the same pace.
install the Python versions you need, preferably from source
when you write a script, include the full Python version in the shebang line (such as #!/usr/local/bin/python2.6)
I can't see what could go wrong.
If something does, it's probably MacPorts' fault anyway, not yours (one of the reasons I don't use MacPorts anymore).
I know I'm probably missing something and this will get downvoted, but please leave at least a little comment in that case, thanks :)
I use the MacPorts version for everything, but as you note a lot of the default versions are bizarrely old. For example, vim omnicomplete in Snow Leopard has python25 as a dependency. A lot of Python-related ports have old dependencies, but you can usually flag the newer version at build time, for example port install vim +python26 instead of port install vim +python. Do a dry run before installing anything to see if you are pulling in, for example, the whole of python24 when it isn't necessary. Check the portfiles often, because the naming conventions adopted while DarwinPorts was getting off the ground left something to be desired.
In practice I just leave everything in MacPorts' default /opt... folders, including a copy of the entire framework with duplicates of PyObjC, etc., and stick with one version at a time, retaining the option to return to the system default if things break unexpectedly. Which is all perhaps a bit too much work to avoid using virtualenv, which I've been meaning to get around to using.
I've had good luck using Buildout. You set up a list of which eggs and which versions you want. Buildout then downloads and installs private versions of each for you. It makes a private "python" binary with all the eggs already installed. A local "nosetests" makes things easy to debug. You can extend the build with your own functions.
On the down side, Buildout can be quite mysterious. Do "buildout -vvvv" for a while to see exactly what it's doing and why.
http://www.buildout.org/docs/tutorial.html
At least under Linux, multiple pythons can co-exist fairly happily. I use Python 2.6 on a CentOS system that needs Python 2.4 to be the default for various system things. I simply compiled and installed python 2.6 into a separate directory tree (and added the appropriate bin directory to my path) which was fairly painless. It's then invoked by typing "python2.6".
Once you have separate pythons up and running, installing libraries for a specific version is straightforward. If you invoke the setup.py script with the python you want, it will be installed in directories appropriate to that python, and scripts will be installed in the same directory as the python executable itself and will automagically use the correct python when invoked.
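For example, with a second interpreter installed under its own prefix (the path and library name are illustrative):

cd /path/to/somelibrary          # the directory containing the library's setup.py
python2.6 setup.py install       # installs into python2.6's own lib/python2.6/site-packages and bin/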
I also try to avoid using too many libraries. When I only need one or two functions from a library (eg scipy), I'll often see if I can just copy them to my own project.