Will Anaconda Python config scripts clash with Homebrew's?

Will Anaconda Python config scripts clash with Homebrew's? Note that I do not use these config scripts in any of my workflows, I'm just wondering if any of these config scripts may get called "behind the scenes". Sample output below (with username replaced by '..'):
$ brew doctor
...
Having additional scripts in your path can confuse software installed via
Homebrew if the config script overrides a system or Homebrew provided
script of the same name. We found the following "config" scripts:
/Users/../anaconda/bin/curl-config
/Users/../anaconda/bin/freetype-config
/Users/../anaconda/bin/libdynd-config
/Users/../anaconda/bin/libpng-config
/Users/../anaconda/bin/libpng15-config
/Users/../anaconda/bin/llvm-config
/Users/../anaconda/bin/python-config
/Users/../anaconda/bin/python2-config
/Users/../anaconda/bin/python2.7-config
/Users/../anaconda/bin/xml2-config
/Users/../anaconda/bin/xslt-config
Clearly some of these clash with some Homebrew-installed packages.
$ ls /usr/local/bin/*-config
/usr/local/bin/Magick++-config /usr/local/bin/libpng-config
/usr/local/bin/Magick-config /usr/local/bin/libpng16-config
/usr/local/bin/MagickCore-config /usr/local/bin/pcre-config
/usr/local/bin/MagickWand-config /usr/local/bin/pkg-config
/usr/local/bin/Wand-config /usr/local/bin/python-config
/usr/local/bin/freetype-config /usr/local/bin/python2-config
/usr/local/bin/gdlib-config /usr/local/bin/python2.7-config

Such clashes are entirely possible. When you install software that depends on Python via Homebrew, you want it to see the Python packages and libraries installed by Homebrew, not those installed by Anaconda.
My solution is not to put
export PATH=$HOME/anaconda/bin:$PATH
into .bashrc. Normally you'll just use the Python and pip installed via Homebrew and the packages installed by that pip. When you're developing a Python project where Anaconda's environment-management mechanism (conda create -n my-env) is convenient, you can temporarily run export PATH=$HOME/anaconda/bin:$PATH to turn it on. From what I gather, one important benefit of Anaconda over regular Python is that conda create -n my-env anaconda will not duplicate package installations unnecessarily, as virtualenv my-env will when you have a large number of virtual environments. If you don't mind some degree of duplication, you could avoid installing Anaconda altogether and just use virtualenv.
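If you want to make that opt-in step even easier, a small shell function works well. This is only a sketch, assuming Anaconda lives in $HOME/anaconda; the function name conda_on is made up:
# put this in .bashrc; it does nothing until you call it
conda_on() {
  export PATH="$HOME/anaconda/bin:$PATH"
}
# later, only in shells where you actually need Anaconda:
#   conda_on
#   conda create -n my-env
#   source activate my-env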

It's entirely possible you won't notice any problems. On the other hand, you may run into some pretty frustrating ones. It all depends on what you use and how your $PATH is ordered. Homebrew will take whatever file has precedence in your $PATH; if another Homebrew package needs a Homebrew-installed config file but sees the Anaconda version first, it doesn't know any better than to use the wrong one. In a sense, that's what you told it to do.
My recommendation is to keep things simple and clean. Unless you have a particular reason to keep Anaconda on your $PATH, you should probably pop it out and alias anything you need. Alternatively you could just install the things you require (e.g., numpy) via Homebrew and eliminate Anaconda altogether. (Actually, that's really what I would do. Anaconda comes with way more stuff than I have any reason to be dumping onto my machine.)
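For example, a couple of aliases can stand in for keeping Anaconda on $PATH at all. This is only a sketch; the alias names are made up and the install location is assumed to be $HOME/anaconda:
# in .bashrc: reach Anaconda's binaries without exposing the rest of its bin directory
alias apython="$HOME/anaconda/bin/python"
alias aconda="$HOME/anaconda/bin/conda"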
I don't know what your $PATH looks like, but in my experience, keeping it short and systematic has a lot of advantages.


question about virtual environments from conda to virtualenv

I have been working on a Python project inside an environment created on a Linux machine. I recently got a new PC and tried FreeBSD, so I decided to see if I could port the settings, since these environments are supposed to be platform independent.
Since there is no support for conda on FreeBSD, I decided to write a script to migrate the dependencies from conda to virtualenv. Although the script translates the .yml file into the .txt file that pip needs to install the dependencies, I can see that a lot of packages are still missing, especially those under the dependencies label in the .yml file.
Does it mean that these packages are not yet ported to FreeBSD, or is there a different way to add them to the .txt file instead of just their name?
Does it mean that these packages are not yet ported to FreeBSD, or is there a different way to add them to the .txt file instead of just their name?
It sounds like pip can't find a number of your dependencies, so yes.
Keep in mind that conda and pip are completely different build systems, despite being mostly compatible with each other and despite most packages available on one also being available on the other. This also means that conda list usually includes some packages you don't necessarily need to install via pip. So you may be better off starting from scratch with a new requirements.txt that lists only the packages you actually need, and letting pip find whatever else it needs (which, again, is likely different from what conda needs).
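A rough sketch of that start-from-scratch approach; the package names numpy and requests are placeholders for whatever your project actually imports:
# create a clean virtualenv and list only your top-level dependencies
python3 -m venv env
. env/bin/activate
printf '%s\n' numpy requests > requirements.txt
pip install -r requirements.txt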

How can I make `python3 -m venv` install the new version of pip in the environment? [duplicate]

Python 3.3 includes in its standard library the new package venv. What does it do, and how does it differ from all the other packages that match the regex (py)?(v|virtual|pip)?env?
This is my personal recommendation for beginners: start by learning virtualenv and pip, tools which work with both Python 2 and 3 and in a variety of situations, and pick up other tools once you start needing them.
Now on to answer the question: what is the difference between these similarly named things: venv, virtualenv, etc?
PyPI packages not in the standard library:
virtualenv is a very popular tool that creates isolated Python environments for Python libraries. If you're not familiar with this tool, I highly recommend learning it, as it is a very useful tool.
It works by installing a bunch of files in a directory (eg: env/), and then modifying the PATH environment variable to prefix it with a custom bin directory (eg: env/bin/). An exact copy of the python or python3 binary is placed in this directory, but Python is programmed to look for libraries relative to its path first, in the environment directory. It's not part of Python's standard library, but is officially blessed by the PyPA (Python Packaging Authority). Once activated, you can install packages in the virtual environment using pip.
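A quick sketch of the typical workflow; the directory and package names are just examples:
$ virtualenv env
$ source env/bin/activate
(env) $ pip install requests
(env) $ deactivate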
pyenv is used to isolate Python versions. For example, you may want to test your code against Python 2.7, 3.6, 3.7 and 3.8, so you'll need a way to switch between them. Once activated, it prefixes the PATH environment variable with ~/.pyenv/shims, where there are special files matching the Python commands (python, pip). These are not copies of the Python-shipped commands; they are special scripts that decide on the fly which version of Python to run based on the PYENV_VERSION environment variable, or the .python-version file, or the ~/.pyenv/version file. pyenv also makes the process of downloading and installing multiple Python versions easier, using the command pyenv install.
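A sketch of basic pyenv usage; the version number here is only an example:
$ pyenv install 3.8.6
$ pyenv local 3.8.6      # writes a .python-version file for this project
$ python --version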
pyenv-virtualenv is a plugin for pyenv by the same author as pyenv, to allow you to use pyenv and virtualenv at the same time conveniently. However, if you're using Python 3.3 or later, pyenv-virtualenv will try to run python -m venv if it is available, instead of virtualenv. You can use virtualenv and pyenv together without pyenv-virtualenv, if you don't want the convenience features.
virtualenvwrapper is a set of extensions to virtualenv (see docs). It gives you commands like mkvirtualenv, lssitepackages, and especially workon for switching between different virtualenv directories. This tool is especially useful if you want multiple virtualenv directories.
pyenv-virtualenvwrapper is a plugin for pyenv by the same author as pyenv, to conveniently integrate virtualenvwrapper into pyenv.
pipenv aims to combine Pipfile, pip and virtualenv into one command on the command-line. The virtualenv directory typically gets placed in ~/.local/share/virtualenvs/XXX, with XXX being a hash of the path of the project directory. This is different from virtualenv, where the directory is typically in the current working directory. pipenv is meant to be used when developing Python applications (as opposed to libraries). There are alternatives to pipenv, such as poetry, which I won't list here since this question is only about the packages that are similarly named.
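A sketch of day-to-day pipenv usage; the package and script names are placeholders:
$ pipenv install requests       # creates the virtualenv, Pipfile and Pipfile.lock
$ pipenv run python app.py      # run a command inside the environment
$ pipenv shell                  # or spawn a shell inside it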
Standard library:
pyvenv (not to be confused with pyenv in the previous section) is a script shipped with Python 3.3 to 3.7. It was removed from Python 3.8 as it had problems (not to mention the confusing name). Running python3 -m venv has exactly the same effect as pyvenv.
venv is a package shipped with Python 3, which you can run using python3 -m venv (although for some reason some distros separate it out into a separate distro package, such as python3-venv on Ubuntu/Debian). It serves the same purpose as virtualenv, but only has a subset of its features (see a comparison here). virtualenv continues to be more popular than venv, especially since the former supports both Python 2 and 3.
I would just avoid the use of virtualenv for Python 3.3+ and instead use the standard library venv. To create a new virtual environment you would type:
$ python3 -m venv <MYVENV>
virtualenv tries to copy the Python binary into the virtual environment's bin directory. However, it does not update the library file links embedded in that binary, so if you build Python from source into a non-system directory with relative path names, the Python binary breaks. Since this is how you make a distributable copy of Python, it is a big flaw. By the way, to inspect embedded library file links on OS X, use otool. For example, from within your virtual environment, type:
$ otool -L bin/python
python:
@executable_path/../Python (compatibility version 3.4.0, current version 3.4.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.0.0)
Consequently I would avoid virtualenvwrapper and pipenv. pyvenv is deprecated. pyenv seems to be used often where virtualenv is used but I would stay away from it also since I think venv also does what pyenv is built for.
venv creates virtual environments in the shell that are fresh and sandboxed, with user-installable libraries, and it's multi-python safe.
Fresh: because virtual environments only start with the standard libraries that ship with python, you have to install any other libraries all over again with pip install while the virtual environment is active.
Sandboxed: because none of these new library installs are visible outside the virtual environment, so you can delete the whole environment and start again without worrying about impacting your base python install.
User-installable libraries: because the virtual environment's target folder is created without sudo in some directory you already own, so you won't need sudo permissions to install libraries into it.
multi-python safe: because when virtual environments activate, the shell only sees the python version (3.4, 3.5 etc.) that was used to build that virtual environment.
pyenv is similar to venv in that it lets you manage multiple python environments. However, with pyenv you can't conveniently roll back library installs to some starting state, and you will likely need admin privileges at some point to update libraries. So I think it is also best to use venv.
In the last couple of years I have found many problems in build systems (emacs packages, python standalone application builders, installers...) that ultimately come down to issues with virtualenv. I think python will be a better platform when we eliminate this additional option and only use venv.
EDIT: Tweet of the BDFL,
I use venv (in the stdlib) and a bunch of shell aliases to quickly switch.
— Guido van Rossum (@gvanrossum) October 22, 2020
UPDATE 20200825:
Added the "Conclusion" paragraph below
I've gone down the pipenv rabbit hole (it's a deep and dark hole indeed...) and, since the last answer is over 2 years old, felt it was useful to update the discussion with the latest developments I've found on the Python virtual environment topic.
DISCLAIMER:
This answer is NOT about continuing the raging debate over the merits of pipenv versus venv as virtual-environment solutions - I make no endorsement of either. It's about PyPA endorsing conflicting standards and how future development of virtualenv promises to negate having to make an either/or choice between them at all. I focused on these two tools precisely because they are the ones anointed by PyPA.
venv
As the OP notes, venv is a tool for virtualizing environments. It is NOT a third-party solution but a native tool. PyPA endorses venv for creating VIRTUAL ENVIRONMENTS: "Changed in version 3.5: The use of venv is now recommended for creating virtual environments".
pipenv
pipenv - like venv - can be used to create virtual environments, but it additionally rolls in package-management and vulnerability-checking functionality. Instead of using requirements.txt, pipenv delivers package management via Pipfile. As PyPA endorses pipenv for PACKAGE MANAGEMENT, that would seem to imply Pipfile is to supplant requirements.txt.
HOWEVER: pipenv uses virtualenv as its tool for creating virtual environments, NOT venv, which is endorsed by PyPA as the go-to tool for creating virtual environments.
Conflicting Standards:
So if settling on a virtual-environment solution wasn't difficult enough, we now have PyPA endorsing two different tools which use different virtual-environment solutions. The raging GitHub debate on venv vs virtualenv which highlights this conflict can be found here.
Conflict Resolution:
The GitHub debate referenced in the above link has steered virtualenv development in the direction of accommodating venv in future releases:
prefer built-in venv: if the target python has venv we'll create the
environment using that (and then perform subsequent operations on that
to facilitate other guarantees we offer)
Conclusion:
So it looks like there will be some future convergence between the two rival virtual-environment solutions, but as of now pipenv - which uses virtualenv - varies materially from venv.
Given the problems pipenv solves and the fact that PyPA has given its blessing, it appears to have a bright future. And if virtualenv delivers on its proposed development objectives, choosing a virtual-environment solution should no longer be a case of either pipenv OR venv.
Update 20200825:
An oft-repeated criticism of Pipenv I saw while producing this analysis was that it was not actively maintained. Indeed, what's the point of using a solution whose future seems questionable due to a lack of continuous development? After a dry spell of about 18 months, Pipenv is once again being actively developed. Indeed, large and material updates have since been released.
Let's start with the problems these tools want to solve:
My system package manager doesn't have the Python versions I want, or I want to install multiple Python versions side by side: Python 3.9.0 and Python 3.9.1, Python 3.5.3, etc.
Then use pyenv.
I want to install and run multiple applications with different, conflicting dependencies.
Then use virtualenv or venv. These are almost completely interchangeable, the difference being that virtualenv supports older python versions and has a few more minor unique features, while venv is in the standard library.
I'm developing an /application/ and need to manage my dependencies, and manage the dependency resolution of the dependencies of my project.
Then use pipenv or poetry.
I'm developing a /library/ or a /package/ and want to specify the dependencies that my library users need to install
Then use setuptools.
I used virtualenv, but I don't like virtualenv folders being scattered around various project folders. I want a centralised management of the environments and some simple project management
Then use virtualenvwrapper. Variant: pyenv-virtualenvwrapper if you also use pyenv.
Not recommended
pyvenv. This is deprecated, use venv or virtualenv instead. Not to be confused with pipenv or pyenv.
Jan 2020 Update
@Flimm has explained all the differences very well. Generally, we want to know the difference between all these tools because we want to decide what's best for us. So the next question would be: which one to use? I suggest you choose one of the two official ways to manage virtual environments:
Python Packaging now recommends Pipenv
Python.org now recommends venv
pyenv - manages different python versions,
all others - create virtual environments (which have an isolated python version and installed "requirements"),
pipenv - wants to combine it all: in addition to the above, it installs "requirements" (into the active virtual environment, or creates its own if none is active).
So maybe you will be happy with pipenv only.
But I use: pyenv + pyenv-virtualenvwrapper + pipenv (pipenv for installing requirements only).
In Debian:
apt install libffi-dev
Install pyenv based on https://www.tecmint.com/pyenv-install-and-manage-multiple-python-versions-in-linux/, but...
...instead of pyenv-virtualenv, install pyenv-virtualenvwrapper (which can be a standalone library or a pyenv plugin; here the 2nd option):
$ pyenv install 3.9.0
$ git clone https://github.com/pyenv/pyenv-virtualenvwrapper.git $(pyenv root)/plugins/pyenv-virtualenvwrapper
# inside ~/.bashrc add:
# export VIRTUALENVWRAPPER_PYTHON="/usr/bin/python3"
$ source ~/.bashrc
$ pyenv virtualenvwrapper
Then create virtual environments for your projects (workingdir must exist):
pyenv local 3.9.0 # to prevent 'interpreter not found' in mkvirtualenv
python -m pip install --upgrade pip setuptools wheel
mkvirtualenv <venvname> -p python3.9 -a <workingdir>
and switch between projects:
workon <venvname>
python -m pip install --upgrade pip setuptools wheel pipenv
Inside a project I have the file requirements.txt, without pinning the versions inside (unless some version limitation is necessary).
You have 2 possible tools to install them into the current virtual environment: pip-tools or pipenv. Let's say you will use pipenv:
pipenv install -r requirements.txt
this will create the Pipfile and Pipfile.lock files; the fixed versions are in the 2nd one. If you want to reinstall exactly the same versions somewhere else, then (Pipfile.lock must be present):
pipenv install
Remember that Pipfile.lock is tied to a specific Python version and needs to be recreated if you use a different one.
As you see, I write requirements.txt. This has some problems: you must remove a removed package from the Pipfile too. So writing the Pipfile directly is probably better.
So you can see I use pipenv very poorly. Maybe if you use it fully, it can replace everything?
EDIT 2021.01: I have changed my stack to: pyenv + pyenv-virtualenvwrapper + poetry, i.e. I use no apt or pip installation of virtualenv or virtualenvwrapper, and instead I install pyenv's plugin pyenv-virtualenvwrapper. This is the easier way.
Poetry is great for me:
poetry add <package> # install single package
poetry remove <package>
poetry install # if you remove poetry.lock poetry will re-calculate versions
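A short sketch of starting a fresh project with poetry, in case that helps; the project name myproject and the dependency are made up:
poetry new myproject          # scaffolds pyproject.toml and a package skeleton
cd myproject
poetry add requests           # example dependency
poetry run python -c "import requests; print(requests.__version__)"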
As a Python newcomer this question frustrated me endlessly and confused me for months. Which virtual environment and package manager(s) should I invest in learning when I know that I will be using it for years to come?
The best article answering this vexing question is https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/ by Jake Vanderplas. Although a few years old, it provides practical answers and the history of Python package and virtual-environment managers from the trenches, as the state of the art was developing.
It was particularly frustrating for me in the data science and "big data cloud computing" communities, because conda is widely used as a virtual environment manager and full-function package manager for Python, JavaScript, SQL, Java, HTML5, and Jupyter Notebooks.
So why use pip at all, when conda does everything that pip and venv variants do?
The answer is: because you MUST use pip when a conda package is simply not available. Many times a required package is only available in pip format and there is no easy solution but to use pip. You can learn to use conda build, but if you are not the package maintainer, then you must convince the package owner to generate a conda package for each new release (or do it yourself).
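In practice the fallback is straightforward; a sketch, where my-env and some-pip-only-pkg are placeholder names:
conda activate my-env            # or "source activate my-env" on older conda
conda install numpy              # prefer conda packages when they exist
pip install some-pip-only-pkg    # fall back to pip only for what conda doesn't have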
These pip-based packages differ along many important and practical dimensions:
stability
maturity
complexity
active support (versus dying or dead)
levels of adoption near the Python ecosystem "core" versus "on the fringes" (i.e., integrated into the Python.org distro)
easy to figure out and use (for beginners)
I will answer your question for two packages along the dimensions of package maturity and stability.
venv and virtualenv are the most mature and have the most stability and community support. From the online documentation you can see that virtualenv is at version 20.x as of today. From the virtualenv docs:
virtualenv is a tool to create isolated Python environments. Since
Python 3.3, a subset of it has been integrated into the standard
library under the venv module. The venv module does not offer all
features of this library, to name just a few more prominent:
is slower (by not having the app-data seed method),
is not as extendable,
cannot create virtual environments for arbitrarily installed python versions (and automatically discover these),
is not upgrade-able via pip,
does not have as rich programmatic API (describe virtual environments without creating them).
virtualenvwrapper is a set of scripts to help people use virtualenv (it is a "wrapper") that is not well maintained; its last update was in 2019.
My recommendation is to avoid ALL pip virtual environments whenever possible. Use conda instead. Conda provides a unified approach. It is maintained by teams of professional open-source developers and has a reputable company providing funding and a commercially supported version. The teams that maintain pip, venv, virtualenv, pipenv, and the many other pip variants have limited resources by comparison. The plurality of pip virtual environments is frustrating for beginners. The complexity, fragmentation, fringe and unsupported packages, and wildly inconsistent support of the pip-based virtual environment tools drove me to use conda. For data science work, my recommendation is to use a pip-based virtual environment manager only as a last resort, when conda packages do not exist.
The differences between the venv variants still scare me because my time to learn new packages is limited. pipenv, venv, pyvenv, pyenv, virtualenv, virtualenvwrapper, poetry, and others have dozens of differences and complexities that take days to understand. I hate going down a path and finding that support for a package goes belly-up when a maintainer resigns (or gets too busy to maintain it). I just need to get my job done.
In the spirit of being helpful, here are a few links to help you dive in over your head, but not get lost in Dante's Inferno (re: pip).
A Guide to Python’s Virtual Environments
Choosing "core" Python packages to invest in for your career (long-term), versus getting a job done short term) is important. However, it is a business analysis question. Are you trying to simply get a task done, or a professional software engineer who builds scalable performant systems that require the least amount of maintenance effort over time? IMHO, conda will take you to the latter place more easily than dealing with pip-plurality problems. conda is still missing 1-step pip-package migration tools that make this a moot question. If we could simply convert pip packages into conda packages then pypi.org and conda-forge could be merged. Pip is necessary because conda packages are not (yet) universal. Many Python programmers are either too lazy to create conda packages, or they only program in Python and don't need conda's language-agnostic / multi-lingual support.
conda has been a god-send for me, because it supports cloud software engineering and data science's need for multilingual support of JavaScript, SQL, and Jupyter Notebook extensions, and conda plays well within Docker and other cloud-native environments. I encourage you to learn and master conda, which will enable you to side-step many complex questions that pip-based tools may never answer.
Keep it simple! I need one package that does 90% of what I need, plus guidance and workarounds for the remaining 10% of edge cases.
Check out the articles linked herein to learn more about pip-based virtual environments.
I hope this is helpful to the original poster and gives pip and conda aficionados some things to think about.
I want to add docker to this list, as well as conda, which several answers have already mentioned.
conda is heavier than the virtual environments the title mentions. It also gives isolation for some system-level tools, such as ffmpeg or gpu drivers.
docker is even better: it gives you a whole new OS to play with. With a good Dockerfile and docker build / docker run scripts, you have good documentation of how your environment is built, and it is easy to reproduce and to migrate to other environments (staging, production, cloud). It helps you in the long run.
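A minimal sketch of that workflow, assuming you already have a Dockerfile in the project directory; the image name my-app and the entry point main.py are placeholders:
docker build -t my-app .
docker run --rm -it my-app python main.py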
Another thing: PyCharm provides several options for selecting your virtual environment. It helps newcomers avoid worrying about this. I recommend using it before you know what a virtual environment is.

conda environment to AWS Lambda

I would like to set up a Python function I've written on AWS Lambda, a function that depends on a bunch of Python libraries I have already collected in a conda environment.
To set this up on Lambda, I'm supposed to zip this environment up, but the Lambda docs only give instructions for how to do this using pip/VirtualEnv. Does anyone have experience with this?
You should use the serverless framework in combination with the serverless-python-requirements plugin. You just need a requirements.txt and the plugin automatically packages your code and the dependencies in a zip file, uploads everything to S3, and deploys your function. Bonus: since it can do this dockerized, it can also help you with packages that need binary dependencies.
Have a look here (https://serverless.com/blog/serverless-python-packaging/) for a how-to.
From experience I strongly recommend you look into that. Every bit of manual labour for deployment and such is something that keeps you from developing your logic.
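For orientation, a sketch of how that setup is usually wired up; the exact commands may differ by version, so treat this as an assumption rather than a recipe:
npm install -g serverless
serverless plugin install -n serverless-python-requirements
# with serverless.yml and requirements.txt in place:
serverless deploy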
Edit 2017-12-17:
Your comment makes sense @eelco-hoogendoorn.
However, in my mind a conda environment is just an encapsulated place where a bunch of python packages live. So, if you put all these dependencies (from your conda env) into a requirements.txt (and use serverless + the plugin), that would solve your problem, no?
IMHO it would essentially be the same as zipping up all the packages you installed in your env into your deployment package. That being said, here is a snippet which does essentially this:
conda env export --name Name_of_your_Conda_env | yq -r '.dependencies[] | .. | select(type == "string")' | sed -E "s/(^[^=]*)(=+)([0-9.]+)(=.*|$)/\1==\3/" > requirements.txt
Unfortunately conda env export only exports the environment in yaml format. The --json flag doesn't work right now, but is supposed to be fixed in the next release. That is why I had to use yq instead of jq. You can install yq using pip install yq. It is just a wrapper around jq to allow it to also work with yaml files.
KEEP IN MIND
Lambda deployment code can only be 50MB in size. Your environment shouldn't be too big.
I have not tried deploying a lambda with serverless + serverless-python-requirements and a requirements.txt created like that, and I don't know if it will work.
The main reason I use conda is the option not to compile different binary packages myself (like numpy, matplotlib, pyqt, etc.), or to compile them less frequently. When you do need to compile something yourself for a specific version of python (like uwsgi), you should compile the binaries with the same gcc version that the python in your conda environment was compiled with - most probably it is not the same gcc that your OS is using, since conda now uses the latest versions of gcc, which should be installed with conda install gxx_linux-64.
This leads us to two situations:
All your dependencies are pure python and you can simply save a list of them using pip freeze and bundle them as is stated for virtualenv (see the sketch below).
You have some binary extensions. In that case, the binaries from your conda environment will not work with the python used by AWS Lambda. Unfortunately, you will need to visit the page describing the execution environment (AMI: amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2), set up the environment, build the binaries for the specific version of the built-in python in a separate directory (as well as the pure-python packages), and then bundle them into a zip archive.
This is a general answer to your question, but the main idea is that you can not reuse your binary packages, only a list of them.
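For the pure-python case, the bundling step is roughly this; paths and file names are placeholders:
pip install -r requirements.txt -t build/        # install deps into a flat directory
cp lambda_function.py build/
(cd build && zip -r ../lambda_package.zip .)     # this zip is what you upload to Lambda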
I can't think of a good reason why zipping up your conda environment wouldn't work.
I think you can go into your anaconda2/envs/ or anaconda3/envs/ directory and copy/zip the env directory you want to upload. Conda is just a souped-up version of virtualenv, plus a different and somewhat optional package manager. The big reason I think it's OK is that conda environments encapsulate all their dependencies within their particular .../anaconda[2|3]/envs/$VIRTUAL_ENV_DIR/ directories by default.
Using a normal virtualenv gives you a bit more freedom, in sort of the same way that cavemen had more freedom than modern people. Personally I prefer cars. With virtualenv you basically get a semi-empty $PYTHONPATH that you can fill with whatever you want, rather than the more robust, pre-populated environment that Conda spits out. The following is a good table for reference: https://conda.io/docs/commands.html#conda-vs-pip-vs-virtualenv-commands
Conda turns the command ~$ /path/to/$VIRTUAL_ENV_ROOT_DIR/bin/activate into ~$ source activate $VIRTUAL_ENV_NAME
Say you want to make a virtualenv the old-fashioned way. You'd choose a directory (let's call it $VIRTUAL_ENV_ROOT_DIR) and a name (which we'll call $VIRTUAL_ENV_NAME). At this point you would type:
~$ cd $VIRTUAL_ENV_ROOT_DIR && virtualenv $VIRTUAL_ENV_NAME
python then creates a copy of its own interpreter (plus pip and setuptools, I think) and places an executable called activate in this clone's bin/ directory. The $VIRTUAL_ENV_ROOT_DIR/bin/activate script works by changing your current $PATH environment variable, which determines which python interpreter gets called when you type ~$ python into the shell; the interpreter in turn looks for its modules relative to its own location, so it sees the environment's directories when it is told to import something. This is the primary reason you'll see #!/usr/bin/env python in people's code instead of /usr/bin/python.
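In effect, activate boils down to something like this sketch (simplified; the real script also defines a deactivate function and tweaks the prompt):
export VIRTUAL_ENV="$VIRTUAL_ENV_ROOT_DIR/$VIRTUAL_ENV_NAME"
export PATH="$VIRTUAL_ENV/bin:$PATH"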
In https://github.com/dazza-codes/aws-lambda-layer-packing, the pip wheels seem to be working for many packages (pure-pip installs). It is difficult to bundle a lot of packages into a compact AWS Lambda layer, since pip wheels do not use shared libraries and tend to get bloated a bit, but they work. Based on some discussions in github, the conda vs. pip challenges are not trivial:
https://github.com/pypa/packaging-problems/issues/25
try https://github.com/conda-incubator/conda-press
AFAICT, the AWS SAM uses https://github.com/aws/aws-lambda-builders and it appears to be pip based, but it also has a conda package at https://anaconda.org/conda-forge/aws_lambda_builders

Replicate virtualenv without downloading all the packages again on the same machine

I have a couple of projects that require similar dependencies, and I don't want pip going out and downloading the dependencies from the web every time. For instance, I am using the norel-django package, which would conflict with my standard django (rdbms version) if I installed it system-wide.
Is there a way for me to "reuse" the downloaded dependencies using pip? Do I need to download the source tar.bz2 files and make a folder structure similar to that of a pip archive or something? Any assistance would be appreciated.
Thanks
Add the following to $HOME/.pip/pip.conf:
[global]
download_cache = ~/.pip/cache
This tells pip to cache downloads in ~/.pip/cache so it won't need to go out and download them again next time.
It looks like virtualenv has a virtualenv-clone command, or perhaps virtualenvwrapper does?
Regardless, it looks to be a little more involved than just copying and pasting virtual environment directories:
https://github.com/edwardgeorge/virtualenv-clone
Additionally, it appears virtualenv has a flag that will facilitate moving your virtualenv.
http://www.virtualenv.org/en/latest/#making-environments-relocatable
$ virtualenv --relocatable ENV
From the virtualenv docs:
This will make some of the files created by setuptools or distribute
use relative paths, and will change all the scripts to use
activate_this.py instead of using the location of the Python
interpreter to select the environment.
Note: you must run this after you’ve installed any packages into the
environment. If you make an environment relocatable, then install a
new package, you must run virtualenv --relocatable again.
Also, this does not make your packages cross-platform. You can move
the directory around, but it can only be used on other similar
computers. Some known environmental differences that can cause
incompatibilities: a different version of Python, when one platform
uses UCS2 for its internal unicode representation and another uses
UCS4 (a compile-time option), obvious platform changes like Windows
vs. Linux, or Intel vs. ARM, and if you have libraries that bind to C
libraries on the system, if those C libraries are located somewhere
different (either different versions, or a different filesystem
layout).
If you use this flag to create an environment, currently, the
--system-site-packages option will be implied.

Migrating to pip+virtualenv from setuptools

So pip and virtualenv sound wonderful compared to setuptools. Being able to uninstall would be great. But my project is already using setuptools, so how do I migrate? The web sites I've been able to find so far are very vague and general. So here's an anthology of questions after reading the main web sites and trying stuff out:
First of all, are virtualenv and pip supposed to be in a usable state by now? If not, please disregard the rest as the ravings of a madman.
How should virtualenv be installed? I'm not quite ready to believe it's as convoluted as explained elsewhere.
Is there a set of tested instructions for how to install matplotlib in a virtual environment? For some reason it always wants to compile it here instead of just installing a package, and it always ends in failure (even after build-dep which took up 250 MB of disk space). After a whole bunch of warnings it prints src/mplutils.cpp:17: error: ‘vsprintf’ was not declared in this scope.
How does either tool interact with setup.py? pip is supposed to replace easy_install, but it's not clear whether it's a drop-in or more complicated relationship.
Is virtualenv only for development mode, or should the users also install it?
Will the resulting package be installed with the minimum requirements (like the current egg), or will it be installed with sources & binaries for all dependencies plus all the build tools, creating a gigabyte monster in the virtual environment?
Will the users have to modify their $PATH and $PYTHONPATH to run the resulting package if it's installed in a virtual environment?
Do I need to create a script from a text string for virtualenv like in the bad old days?
What is with the #egg=Package URL syntax? That's not part of the standard URL, so why isn't it a separate parameter?
Where is #rev included in the URL? At the end I suppose, but the documentation is not clear about this ("You can also include #rev in the URL").
What is supposed to be understood by using an existing requirements file as "as a sort of template for the new file"? This could mean any number of things.
Wow, that's quite a set of questions. Many of them would really deserve their own SO question with more details. I'll do my best:
First of all, are virtualenv and pip
supposed to be in a usable state by
now?
Yes, although they don't serve everyone's needs. Pip and virtualenv (along with everything else in Python package management) are far from perfect, but they are widely used and depended upon nonetheless.
How should virtualenv be installed?
I'm not quite ready to believe it's as
convoluted as explained elsewhere.
The answer you link to is complex because it is trying to avoid making any changes at all to your global Python installation and install everything in ~/.local instead. This has some advantages, but it is more complex to set up. It's also installing virtualenvwrapper, which is a set of convenience bash scripts for working with virtualenv, but it is not necessary for using virtualenv.
If you are on Ubuntu, aptitude install python-setuptools followed by easy_install virtualenv should get you a working virtualenv installation without doing any damage to your global python environment (unless you also had the Ubuntu virtualenv package installed, which I don't recommend as it will likely be an old version).
Is there a set of tested instructions
for how to install matplotlib in a
virtual environment? For some reason
it always wants to compile it here
instead of just installing a package,
and it always ends in failure (even
after build-dep which took up 250 MB
of disk space). After a whole bunch of
warnings it prints
src/mplutils.cpp:17: error: ‘vsprintf’
was not declared in this scope.
It "always wants to compile" because pip, by design, installs only from source, it doesn't install pre-compiled binaries. This is a controversial choice, and is probably the primary reason why pip has seen widest adoption among Python web developers, who use more pure-Python packages and commonly develop and deploy in POSIX environments where a working compilation chain is standard.
The reason for the design choice is that providing precompiled binaries has a combinatorial explosion problem with different platforms and build architectures (including python version, UCS-2 vs UCS-4 python builds, 32 vs 64-bit...). The way easy_install finds the right binary package on PyPI sort of works, most of the time, but doesn't account for all these factors and can break. So pip just avoids that issue altogether (replacing it with a requirement that you have a working compilation environment).
In many cases, packages that require C compilation also have a slower-moving release schedule and it's acceptable to simply install OS packages for them instead. This doesn't allow working with different versions of them in different virtualenvs, though.
I don't know what's causing your compilation error, it works for me (on Ubuntu 10.10) with this series of commands:
virtualenv --no-site-packages tmp
. tmp/bin/activate
pip install numpy
pip install -f http://downloads.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.0.1/matplotlib-1.0.1.tar.gz matplotlib
The "-f" link is necessary to get the most recent version, due to matplotlib's unusual download URLs on PyPI.
How does either tool interact with
setup.py? pip is supposed to replace
easy_install, but it's not clear
whether it's a drop-in or more
complicated relationship.
The setup.py file is a convention of distutils, the Python standard library's package management "solution." distutils alone is missing some key features, and setuptools is a widely-used third-party package that "embraces and extends" distutils to provide some additional features. setuptools also uses setup.py. easy_install is the installer bundled with setuptools. Setuptools development stalled for several years, and distribute was a fork of setuptools to fix some longstanding bugs. Eventually the fork was resolved with a merge of distribute back into setuptools, and setuptools development is now active again (with a new maintainer).
distutils2 was a mostly-rewritten new version of distutils that attempted to incorporate the best ideas from setuptools/distribute, and was supposed to become part of the Python standard library. Unfortunately this effort failed, so for the time being setuptools remains the de facto standard for Python packaging.
Pip replaces easy_install, but it does not replace setuptools; it requires setuptools and builds on top of it. Thus it also uses setup.py.
Is virtualenv only for development
mode, or should the users also install
it?
There's no single right answer to that; it can be used either way. In the end it's really your user's choice, and your software ideally should be able to be installed inside or out of a virtualenv; though you might choose to document and emphasize one approach or the other. It depends very much on who your users are and what environments they are likely to need to install your software into.
Will the resulting package be
installed with the minimum
requirements (like the current egg),
or will it be installed with sources &
binaries for all dependencies plus all
the build tools, creating a gigabyte
monster in the virtual environment?
If a package that requires compilation is installed via pip, it will need to be compiled from source. That also applies to any dependencies that require compilation.
This is unrelated to the question of whether you use a virtualenv. easy_install is available by default in a virtualenv and works just fine there. It can install pre-compiled binary eggs, just like it does outside of a virtualenv.
Will the users have to modify their
$PATH and $PYTHONPATH to run the
resulting package if it's installed in
a virtual environment?
In order to use anything installed in a virtualenv, you need to use the python binary in the virtualenv's bin/ directory (or another script installed into the virtualenv that references this binary). The most common way to do this is to use the virtualenv's activate or activate.bat script to temporarily modify the shell PATH so the virtualenv's bin/ directory is first. Modifying PYTHONPATH is not generally useful or necessary with virtualenv.
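A sketch of both options; env and myscript.py are placeholder names:
$ ./env/bin/python myscript.py     # call the virtualenv's interpreter directly
$ source env/bin/activate          # or put env/bin first on PATH for this shell
(env) $ python myscript.py
(env) $ deactivate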
Do I need to create a script from a
text string for virtualenv like in the
bad old days?
No.
What is with the #egg=Package URL
syntax? That's not part of the
standard URL, so why isn't it a
separate parameter?
The "#egg=projectname-version" URL fragment hack was first introduced by setuptools and easy_install. Since easy_install scrapes links from the web to find candidate distributions to install for a given package name and version, this hack allowed package authors to add links on PyPI that easy_install could understand, even if they didn't use easy_install's standard naming conventions for their files.
Where is #rev included in the URL? At
the end I suppose, but the
documentation is not clear about this
("You can also include #rev in the
URL").
A couple sentences after that quoted fragment there is a link to "read the requirements file format to learn about other features." The #rev feature is fully documented and demonstrated there.
What is supposed to be understood by
using an existing requirements file as
"as a sort of template for the new
file"? This could mean any number of
things.
The very next sentence says "it will keep the packages listed in devel-req.txt in order and preserve comments." I'm not sure what would be a better concise description.
I can't answer all your questions, but hopefully the following helps.
Both virtualenv and pip are very usable. Many Python devs use these everyday.
Since you have a working easy_install, the easiest way to install both is the following:
easy_install pip
easy_install virtualenv
Once you have virtualenv, just type virtualenv yourEnvName and you'll get your new python virtual environment in a directory named yourEnvName.
From there, it's as easy as source yourEnvName/bin/activate and the virtual python interpreter will be the active one. I know nothing about matplotlib, but following the installation instructions should work out OK unless there are weird hard-coded path issues.
If you can install something via easy_install you can usually install it via pip. I haven't found anything that easy_install could do that pip couldn't.
I wouldn't count on users being able to install virtualenv (it depends on who your users are). Technically, a virtual python interpreter can be treated as a real one for most cases. Its main use is not cluttering up the real interpreter's site-packages, and it helps if you have two libraries/apps that require different and incompatible versions of the same library.
If you or a user install something in a virtualenv, it won't be available in other virtualenvs or the system Python interpreter. You'll need to use the source /path/to/yourvirtualenv/bin/activate command to switch to the virtual environment you installed the library in.
What they mean by "as a sort of template for the new file" is that the pip freeze -r devel-req.txt > stable-req.txt command will create a new file, stable-req.txt, based on the existing file devel-req.txt. The only difference will be that anything installed but not already specified in the existing file will appear in the new file.
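That is, the whole round trip is just one command, using the file names from the answer's own example:
$ pip freeze -r devel-req.txt > stable-req.txt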
