What are the Python equivalents to Ruby's bundler / Perl's carton?

I know about virtualenv and pip. But these are a bit different from bundler/carton.
For instance:
pip writes the absolute path to shebang or activate script
pip doesn't have the exec sub command (bundle exec bar)
virtualenv copies the Python interpreter to a local directory
Does every Python developer use virtualenv/pip? Are there other package management tools for Python?

From what I've read about bundler, pip without virtualenv should work just fine for you. You can think of it as something between the regular gem command and bundler. Common things that you can do with pip:
Installing packages (gem install)
pip install mypackage
Dependencies and bulk-install (gemfile)
Probably the easiest way is to use pip's requirements.txt files. Basically it's just a plain list of required packages with possible version constraints. It might look something like:
nose==1.1.2
django<1.3
PIL
Later, when you want to install those dependencies, you would do:
$ pip install -r requirements.txt
A simple way to see all your current packages in requirements-file syntax is to do:
$ pip freeze
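To capture that output into a requirements file (a common follow-up, not shown above):
$ pip freeze > requirements.txt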
You can read more about it in the pip documentation on requirements files.
Execution (bundle exec)
All Python packages that ship executables usually make them directly available after install (unless you have a custom setup or it's a special package). For example:
$ pip install gunicorn
$ gunicorn -h
Packaging gems for install from cache (bundle package)
There are pip bundle and pip zip/unzip, but I'm not sure if many people use them (and they have been removed from more recent pip versions).
P.S. If you do care about environment isolation, you can also use virtualenv together with pip (they are close friends and work perfectly together). By default pip installs packages system-wide, which might require admin rights.

You can use pipenv, which has a similar interface to bundler.
$ pip install pipenv
Pipenv creates a virtualenv automatically and installs dependencies from Pipfile or Pipfile.lock.
$ pipenv --three # Create virtualenv with Python3
$ pipenv install # Install dependencies from Pipfile
$ pipenv install requests # Install `requests` and update Pipfile
$ pipenv lock # Generate `Pipfile.lock`
$ pipenv shell # Run shell with virtualenv activated
You can run a command within the virtualenv's scope, like bundle exec:
$ pipenv run python3 -c "print('hello!')"
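For reference, dependencies end up in a Pipfile; a minimal one looks roughly like this (the source block is the default one pipenv writes, requests is just an example package):
$ cat Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
requests = "*"

[dev-packages]

[requires]
python_version = "3.9"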

There is a clone called pbundler.
The version that is currently on PyPI simply reads the requirements.txt file you already have, but it is much out of date. It's also not totally equivalent: it insists on making a virtualenv. Bundler, I notice, only installs the packages that are missing, and gives you the option of entering your sudo password to install into your system dirs or of restarting, which doesn't seem to be a feature of pbundler.
However, the version on git is an almost complete rewrite to be much closer to Bundler's behaviour, including having a "Cheesefile", and it no longer supports requirements.txt. This is unfortunate, since requirements.txt is the de facto standard in Python land, and there's even official BDFL-stamped work to standardize it. When that comes into force, you can be sure that something like pbundler will become the de facto standard. Alas, nothing is quite stable yet that I know of (but I would love to be proven wrong).

I wrote one — https://github.com/Deepwalker/pundler .
On PyPI it's pundle; the name was already taken.
It uses requirements(_\w+)?.txt files as your desired dependencies and creates frozen(_\w+)?.txt files with frozen versions.
About the (_\w+)? part: these are envs. You can create requirements_test.txt and then set PUNDLEENV=test to use those dependencies alongside the ones from requirements.txt.
And about virtualenv: you don't need one; that is what pundle takes from bundler in the first place.

Python Poetry is the closest to Ruby's bundler as of 2020 (and has been since 2018). It's already more than two years old, still very active, and has great documentation. One might complain about the curl-pipe-python style being the recommended way of installing, but there are alternatives, e.g. Homebrew on macOS.
Primary website: https://python-poetry.org/
Github: https://github.com/python-poetry/poetry
Documentation: https://python-poetry.org/docs/
It uses virtualenvs behind the scenes (in contrast to bundler), but it provides and uses a lock file, takes care of sub-dependencies, adheres to specified version constraints and allows automatically updating outdated packages. There's even autocompletion for your favorite shell.
With its use of a pyproject.toml file, it also goes a bit further than bundler (closer to a gemspec; it's also comparable to JavaScript's and TypeScript's npm and yarn).
Poetrify (a complementing project) helps converting projects from requirements.txt to pyproject.toml for Poetry.
The lock file can be exported to requirements.txt with poetry export -f requirements.txt > requirements.txt, if you need that for other tooling (or in the unlikely case you want to go back).
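A typical Poetry workflow, sketched with an illustrative project and package name:
$ poetry new myproject            # scaffolds a project with a pyproject.toml
$ cd myproject
$ poetry add requests             # adds the dependency and updates poetry.lock
$ poetry install                  # installs everything from the lock file
$ poetry run python -c "print('hello')"   # runs inside the managed virtualenv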

I'd say Shovel is worth a look. It was developed specifically to be a Pythonic version of Rake. There's not a ton of commit activity on the project, but it seems stable and useful.

You can use pipx to install and run Python applications in isolated environments automatically.
You can use pipenv to create and manage a virtualenv for your projects automatically.
Both wrap pip with virtual environment tools, and they aim at different use cases.
Both are among the most starred projects listed in the PyPA repositories on GitHub.
FYI: Debian bullseye/testing currently lacks pipx, but the package from sid should work fine. (2021-06-19)
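A quick sketch of the pipx workflow (black is just an example application):
$ python3 -m pip install --user pipx
$ python3 -m pipx ensurepath      # make sure pipx-installed apps end up on PATH
$ pipx install black              # installs black into its own isolated venv
$ black --version                 # the entry point is now available globally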

No, not all developers use virtualenv and/or pip, but many developers use or prefer these tools.
And now, for package development tools and different environments, which is your real question: there are other tools, like Buildout (http://www.buildout.org/en/latest/), for the same purpose of isolating your Python build system and environment for every project you manage. I used it for some time, but not anymore.
Independent environments per project in Python are a little different from the same situation in Ruby. In my case I use pyenv (https://github.com/yyuu/pyenv), which is something like rbenv but for Python: different versions of Python and virtualenvs per project, and within these isolated environments I can use pip or easy_install (if needed).

Related

Shouldn't virtualenv be used on Ansible target nodes?

In most cases I think that Ansible engineers install pip packages 1) without using a virtualenv and 2) under root.
If we do this manually we would see a warning:
WARNING: Running pip as root will break packages and permissions. You should install packages reliably by using venv: https://pip.pypa.io/warnings/venv
Typically when our Ansible automation becomes more advanced, we need additional pip packages to make Ansible modules work. More often than not this also requires additional OS packages to be installed. For example, the python-ldap pip package on Ubuntu 18.04 requires:
build-essential
python3-dev
python3-wheel
libsasl2-dev
libldap2-dev
libssl-dev
The fact that Ansible is made to work on target nodes by installing additional pip packages as root, while this is clearly not the recommended way to use Python and pip, makes me wonder whether there is a better way to do this.
Should we not use virtualenv and another account than root for installing pip for Ansible?
There are probably multiple aspects to this. The one that came to my mind first, is this:
Using the "global" python outside of any venv, will probably work for the vast majority of users very well, while using a venv can lead to all kinds of unexpected behavior, if you are not familiar with the concept.
For example, if a venv was used by default, people will install python packages and then wonder why python claims they are not available when they try to import them in python on the host.
On the other hand, it is probably relatively easy to use a venv, if you want to do that. In any playbook, you can create the venv and then just update the ansible_python_interpreter variable:
- pip:
    name: pip
    virtualenv: /path/to/venv
- set_fact:
    ansible_python_interpreter: /path/to/venv/bin/python
- setup:
Disclaimer: I just tried that very quickly and saw that it worked, so there might be glitches in some situations.
Obviously, it is not very neat to add something like this to every play, but it can be made a lot nicer, e.g. by creating the venv on the first setup and then using the interpreter_python key in ansible.cfg.
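For example, once the venv exists, the interpreter can be pinned once in ansible.cfg instead of in every play (the path is illustrative; interpreter_python is a real Ansible setting):
[defaults]
interpreter_python = /path/to/venv/bin/python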
tl;dr:
Using global python is probably the way it works the best for most users, while "power-users" still have ways to achieve what they want with some additional actions.

How can I make `python3 -m venv` install the new version of pip in the environment? [duplicate]

Python 3.3 includes in its standard library the new package venv. What does it do, and how does it differ from all the other packages that match the regex (py)?(v|virtual|pip)?env?
This is my personal recommendation for beginners: start by learning virtualenv and pip, tools which work with both Python 2 and 3 and in a variety of situations, and pick up other tools once you start needing them.
Now on to answer the question: what is the difference between these similarly named things: venv, virtualenv, etc?
PyPI packages not in the standard library:
virtualenv is a very popular tool that creates isolated Python environments for Python libraries. If you're not familiar with this tool, I highly recommend learning it, as it is a very useful tool.
It works by installing a bunch of files in a directory (eg: env/), and then modifying the PATH environment variable to prefix it with a custom bin directory (eg: env/bin/). An exact copy of the python or python3 binary is placed in this directory, but Python is programmed to look for libraries relative to its path first, in the environment directory. It's not part of Python's standard library, but is officially blessed by the PyPA (Python Packaging Authority). Once activated, you can install packages in the virtual environment using pip.
pyenv is used to isolate Python versions. For example, you may want to test your code against Python 2.7, 3.6, 3.7 and 3.8, so you'll need a way to switch between them. Once activated, it prefixes the PATH environment variable with ~/.pyenv/shims, where there are special files matching the Python commands (python, pip). These are not copies of the Python-shipped commands; they are special scripts that decide on the fly which version of Python to run based on the PYENV_VERSION environment variable, or the .python-version file, or the ~/.pyenv/version file. pyenv also makes the process of downloading and installing multiple Python versions easier, using the command pyenv install.
pyenv-virtualenv is a plugin for pyenv by the same author as pyenv, to allow you to use pyenv and virtualenv at the same time conveniently. However, if you're using Python 3.3 or later, pyenv-virtualenv will try to run python -m venv if it is available, instead of virtualenv. You can use virtualenv and pyenv together without pyenv-virtualenv, if you don't want the convenience features.
virtualenvwrapper is a set of extensions to virtualenv (see docs). It gives you commands like mkvirtualenv, lssitepackages, and especially workon for switching between different virtualenv directories. This tool is especially useful if you want multiple virtualenv directories.
pyenv-virtualenvwrapper is a plugin for pyenv by the same author as pyenv, to conveniently integrate virtualenvwrapper into pyenv.
pipenv aims to combine Pipfile, pip and virtualenv into one command on the command-line. The virtualenv directory typically gets placed in ~/.local/share/virtualenvs/XXX, with XXX being a hash of the path of the project directory. This is different from virtualenv, where the directory is typically in the current working directory. pipenv is meant to be used when developing Python applications (as opposed to libraries). There are alternatives to pipenv, such as poetry, which I won't list here since this question is only about the packages that are similarly named.
Standard library:
pyvenv (not to be confused with pyenv in the previous section) is a script shipped with Python 3.3 to 3.7. It was removed from Python 3.8 as it had problems (not to mention the confusing name). Running python3 -m venv has exactly the same effect as pyvenv.
venv is a package shipped with Python 3, which you can run using python3 -m venv (although for some reason some distros separate it out into a separate distro package, such as python3-venv on Ubuntu/Debian). It serves the same purpose as virtualenv, but only has a subset of its features (the virtualenv documentation has a comparison). virtualenv continues to be more popular than venv, especially since the former supports both Python 2 and 3.
I would just avoid the use of virtualenv after Python 3.3+ and instead use the standard library's venv. To create a new virtual environment you would type:
$ python3 -m venv <MYVENV>
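Activation then works the same way as with virtualenv (the prompt prefix is what venv shows by default):
$ source <MYVENV>/bin/activate
(<MYVENV>) $ pip install requests
(<MYVENV>) $ deactivate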
virtualenv tries to copy the Python binary into the virtual environment's bin directory. However, it does not update library file links embedded in that binary, so if you build Python from source into a non-system directory with relative path names, the Python binary breaks. Since this is how you make a distributable copy of Python, it is a big flaw. BTW, to inspect embedded library file links on OS X, use otool. For example, from within your virtual environment, type:
$ otool -L bin/python
python:
@executable_path/../Python (compatibility version 3.4.0, current version 3.4.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.0.0)
Consequently I would avoid virtualenvwrapper and pipenv. pyvenv is deprecated. pyenv seems to be used often where virtualenv is used but I would stay away from it also since I think venv also does what pyenv is built for.
venv creates virtual environments in the shell that are fresh and sandboxed, with user-installable libraries, and it's multi-python safe.
Fresh: because virtual environments only start with the standard libraries that ship with python, you have to install any other libraries all over again with pip install while the virtual environment is active.
Sandboxed: because none of these new library installs are visible outside the virtual environment, so you can delete the whole environment and start again without worrying about impacting your base python install.
User-installable libraries: because the virtual environment's target folder is created without sudo in some directory you already own, so you won't need sudo permissions to install libraries into it.
Multi-python safe: because when virtual environments activate, the shell only sees the python version (3.4, 3.5 etc.) that was used to build that virtual environment.
pyenv is similar to venv in that it lets you manage multiple python environments. However with pyenv you can't conveniently rollback library installs to some start state and you will likely need admin privileges at some point to update libraries. So I think it is also best to use venv.
In the last couple of years I have found many problems in build systems (emacs packages, python standalone application builders, installers...) that ultimately come down to issues with virtualenv. I think python will be a better platform when we eliminate this additional option and only use venv.
EDIT: Tweet of the BDFL,
I use venv (in the stdlib) and a bunch of shell aliases to quickly switch.
— Guido van Rossum (@gvanrossum) October 22, 2020
UPDATE 20200825:
Added below "Conclusion" paragraph
I went down the pipenv rabbit hole (it's a deep and dark hole indeed...) and, since the last answer is over two years old, felt it was useful to update the discussion with the latest developments I've found on the Python virtual environments topic.
DISCLAIMER:
This answer is NOT about continuing the raging debate about the merits of pipenv versus venv as environment solutions; I make no endorsement of either. It's about PyPA endorsing conflicting standards and how future development of virtualenv promises to negate making an either/or choice between them at all. I focused on these two tools precisely because they are the anointed ones by PyPA.
venv
As the OP notes, venv is a tool for virtualizing environments. It is NOT a third-party solution, but a native tool. PyPA endorses venv for creating VIRTUAL ENVIRONMENTS: "Changed in version 3.5: The use of venv is now recommended for creating virtual environments".
pipenv
pipenv, like venv, can be used to create virtual environments, but it additionally rolls in package management and vulnerability checking functionality. Instead of using requirements.txt, pipenv delivers package management via Pipfile. As PyPA endorses pipenv for PACKAGE MANAGEMENT, that would seem to imply Pipfile is to supplant requirements.txt.
HOWEVER: pipenv uses virtualenv as its tool for creating virtual environments, NOT venv, which is endorsed by PyPA as the go-to tool for creating virtual environments.
Conflicting Standards:
So if settling on a virtual environment solution wasn't difficult enough, we now have PyPA endorsing two different tools which use different virtual environment solutions. The raging GitHub debate on venv vs virtualenv which highlights this conflict can be found here.
Conflict Resolution:
The GitHub debate referenced in the link above has steered virtualenv development in the direction of accommodating venv in future releases:
"prefer built-in venv: if the target python has venv we'll create the environment using that (and then perform subsequent operations on that to facilitate other guarantees we offer)"
Conclusion:
So it looks like there will be some future convergence between the two rival virtual environment solutions, but as of now pipenv, which uses virtualenv, differs materially from venv.
Given the problems pipenv solves and the fact that PyPA has given its blessing, it appears to have a bright future. And if virtualenv delivers on its proposed development objectives, choosing a virtual environment solution should no longer be a case of either pipenv OR venv.
Update 20200825:
An oft-repeated criticism of Pipenv I saw when producing this analysis was that it was not actively maintained. Indeed, what's the point of using a solution whose future could be seen as questionable due to a lack of continuous development? After a dry spell of about 18 months, Pipenv is once again being actively developed. Indeed, large and material updates have since been released.
Let's start with the problems these tools want to solve:
My system package manager doesn't have the Python versions I want, or I want to install multiple Python versions side by side: Python 3.9.0 and Python 3.9.1, Python 3.5.3, etc.
Then use pyenv.
I want to install and run multiple applications with different, conflicting dependencies.
Then use virtualenv or venv. These are almost completely interchangeable, the difference being that virtualenv supports older python versions and has a few more minor unique features, while venv is in the standard library.
I'm developing an application and need to manage my dependencies, as well as the dependency resolution of my project.
Then use pipenv or poetry.
I'm developing a library or a package and want to specify the dependencies that my library's users need to install.
Then use setuptools.
I use virtualenv, but I don't like virtualenv folders being scattered around various project folders. I want centralised management of the environments and some simple project management.
Then use virtualenvwrapper. Variant: pyenv-virtualenvwrapper if you also use pyenv.
Not recommended
pyvenv. This is deprecated, use venv or virtualenv instead. Not to be confused with pipenv or pyenv.
Jan 2020 Update
@Flimm has explained all the differences very well. Generally, we want to know the difference between all these tools because we want to decide what's best for us. So, the next question would be: which one to use? I suggest you choose one of the two official ways to manage virtual environments:
Python Packaging now recommends Pipenv
Python.org now recommends venv
pyenv - manages different Python versions,
all others - create a virtual environment (which has an isolated Python version and installed "requirements"),
pipenv - wants to combine it all: in addition to the previous, it installs "requirements" (into the active virtual environment, or creates its own if none is active).
So maybe you will be happy with pipenv only.
But I use: pyenv + pyenv-virtualenvwrapper + pipenv (pipenv for installing requirements only).
In Debian:
apt install libffi-dev
Install pyenv based on https://www.tecmint.com/pyenv-install-and-manage-multiple-python-versions-in-linux/, but instead of pyenv-virtualenv install pyenv-virtualenvwrapper (which can be a standalone library or a pyenv plugin; here the 2nd option):
$ pyenv install 3.9.0
$ git clone https://github.com/pyenv/pyenv-virtualenvwrapper.git $(pyenv root)/plugins/pyenv-virtualenvwrapper
# inside ~/.bashrc add:
# export VIRTUALENVWRAPPER_PYTHON="/usr/bin/python3"
$ source ~/.bashrc
$ pyenv virtualenvwrapper
Then create virtual environments for your projects (workingdir must exist):
pyenv local 3.9.0 # to prevent 'interpreter not found' in mkvirtualenv
python -m pip install --upgrade pip setuptools wheel
mkvirtualenv <venvname> -p python3.9 -a <workingdir>
and switch between projects:
workon <venvname>
python -m pip install --upgrade pip setuptools wheel pipenv
Inside a project I have the file requirements.txt, without pinning the versions inside (unless some version limitation is necessary).
You have 2 possible tools to install them into the current virtual environment: pip-tools or pipenv. Let's say you will use pipenv:
pipenv install -r requirements.txt
This will create the Pipfile and Pipfile.lock files; the pinned versions are in the second one. If you want to reinstall exactly the same versions somewhere else (Pipfile.lock must be present):
pipenv install
Remember that Pipfile.lock is tied to a specific Python version and needs to be recreated if you use a different one.
As you can see, I write requirements.txt. This has some problems: you must remove a removed package from the Pipfile too, so writing the Pipfile directly is probably better.
So you can see I use pipenv rather minimally. Maybe if you use it well, it can replace everything?
EDIT 2021.01: I have changed my stack to pyenv + pyenv-virtualenvwrapper + poetry. I.e., I use no apt or pip installation of virtualenv or virtualenvwrapper; instead I install pyenv's plugin pyenv-virtualenvwrapper. This is the easier way.
Poetry is great for me:
poetry add <package> # install single package
poetry remove <package>
poetry install # if you remove poetry.lock poetry will re-calculate versions
As a Python newcomer this question frustrated me endlessly and confused me for months. Which virtual environment and package manager(s) should I invest in learning when I know that I will be using it for years to come?
The best article answering this vexing question is https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/ by Jake Vanderplas. Although a few years old, it provides practical answers and the history of Python package and virtual environment managers from the trenches as this state of the art was developing.
It was particularly frustrating for me in the data science and "big data cloud computing" communities, because conda is widely used as a virtual environment manager and full-function package manager for Python as well as JavaScript, SQL, Java, HTML5, and Jupyter Notebooks.
So why use pip at all, when conda does everything that pip and venv variants do?
The answer is, "because you MUST use pip if a conda package is simply not available." Many times a required package is only available in pip format and there is no easy solution but to use pip. You can learn to use conda build but if you are not the package maintainer, then you must convince the package owner to generate a conda package for each new release (or do it yourself.)
These pip-based packages differ along many important and practical dimensions:
stability
maturity
complexity
active support (versus dying or dead)
levels of adoption near the Python ecosystem "core" versus "on the fringes" (i.e., integrated into the Python.org distro)
easy to figure out and use (for beginners)
I will answer your question for two packages from the dimension of package maturity and stability.
venv and virtualenv are the most mature and have the best stability and community support. From the online documentation you can see that virtualenv is at version 20.x as of today. From the virtualenv docs:
virtualenv is a tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard library under the venv module. The venv module does not offer all features of this library; to name just a few of the more prominent ones, venv:
is slower (by not having the app-data seed method),
is not as extendable,
cannot create virtual environments for arbitrarily installed python versions (and automatically discover these),
is not upgrade-able via pip,
does not have as rich programmatic API (describe virtual environments without creating them).
virtualenvwrapper is a set of scripts to help people use virtualenv (it is a "wrapper") that is not well maintained; its last update was in 2019.
My recommendation is to avoid ALL pip virtual environments whenever possible and use conda instead. Conda provides a unified approach. It is maintained by teams of professional open source developers and has a reputable company providing funding and a commercially supported version. The teams that maintain pip, venv, virtualenv, pipenv, and many other pip variants have limited resources by comparison. The pip virtual environment plurality is frustrating for beginners. The pip-based virtual environment tools' complexity, fragmentation, fringe and unsupported packages, and wildly inconsistent support drove me to use conda. For data science work, my recommendation is to use a pip-based virtual environment manager as a last resort when conda packages do not exist.
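For comparison, the basic conda workflow for an isolated environment looks like this (environment and package names are illustrative):
$ conda create -n myenv python=3.9
$ conda activate myenv
(myenv) $ conda install numpy pandas
(myenv) $ conda deactivate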
The differences between the venv variants still scare me, because my time to learn new packages is limited. pipenv, venv, pyvenv, pyenv, virtualenv, virtualenvwrapper, poetry, and others have dozens of differences and complexities that take days to understand. I hate going down a path and finding that support for a package goes belly-up when a maintainer resigns (or gets too busy to maintain it). I just need to get my job done.
In the spirit of being helpful, here are a few links to help you dive in over your head, but not get lost in Dante's Inferno (re: pip).
A Guide to Python’s Virtual Environments
Choosing "core" Python packages to invest in for your career (long-term), versus getting a job done short term) is important. However, it is a business analysis question. Are you trying to simply get a task done, or a professional software engineer who builds scalable performant systems that require the least amount of maintenance effort over time? IMHO, conda will take you to the latter place more easily than dealing with pip-plurality problems. conda is still missing 1-step pip-package migration tools that make this a moot question. If we could simply convert pip packages into conda packages then pypi.org and conda-forge could be merged. Pip is necessary because conda packages are not (yet) universal. Many Python programmers are either too lazy to create conda packages, or they only program in Python and don't need conda's language-agnostic / multi-lingual support.
conda has been a god-send for me, because it supports cloud software engineering and data science's need for multilingual support of JavaScript, SQL, and Jupyter Notebook extensions, and conda plays well within Docker and other cloud-native environments. I encourage you to learn and master conda, which will enable you to side-step many complex questions that pip-based tools may never answer.
Keep it simple! I need one package that does 90% of what I need and guidance and workarounds for the 10% remaining edge cases.
Check out the articles linked herein to learn more about pip-based virtual environments.
I hope this is helpful to the original poster and gives pip and conda aficionados some things to think about.
I want to add docker to this list, as well as conda, which several answers already mentioned.
conda is heavier than the virtual environments the title mentions. It also gives isolation for some system-level tools, such as ffmpeg or GPU drivers.
docker is even better: it gives you a whole new OS to play with. With a good Dockerfile and docker build / docker run scripts, you have good documentation of how your environment is built, and it is easy to reproduce and migrate to other environments (staging, production, cloud). It helps you in the long run.
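A minimal sketch of such a Dockerfile, assuming a requirements.txt and a main.py entry point (both names are illustrative):
$ cat Dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "main.py"]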
Another thing: PyCharm provides several options to select your virtual environment. It helps newcomers avoid worrying about this; I recommend using it before you know what a virtual environment is.

conda environment to AWS Lambda

I would like to set up a Python function I've written on AWS Lambda, a function that depends on a bunch of Python libraries I have already collected in a conda environment.
To set this up on Lambda, I'm supposed to zip this environment up, but the Lambda docs only give instructions for how to do this using pip/VirtualEnv. Does anyone have experience with this?
You should use the Serverless Framework in combination with the serverless-python-requirements plugin. You just need a requirements.txt and the plugin automatically packages your code and the dependencies in a zip file, uploads everything to S3 and deploys your function. Bonus: since it can do this dockerized, it is also able to help you with packages that need binary dependencies.
Have a look here (https://serverless.com/blog/serverless-python-packaging/) for a how-to.
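For orientation, a minimal serverless.yml wiring up that plugin might look roughly like this (service name, runtime and handler are illustrative, not taken from the linked how-to):
service: my-service
provider:
  name: aws
  runtime: python3.8
functions:
  main:
    handler: handler.main
plugins:
  - serverless-python-requirements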
From experience I strongly recommend you look into that. Every bit of manual labour for deployment and such is something that keeps you from developing your logic.
Edit 2017-12-17:
Your comment makes sense, @eelco-hoogendoorn.
However, in my mind a conda environment is just an encapsulated place where a bunch of Python packages live. So, if you put all these dependencies (from your conda env) into a requirements.txt (and use serverless + the plugin), that would solve your problem, no?
IMHO it would essentially be the same as zipping all the packages you installed in your env into your deployment package. That being said, here is a snippet, which does essentially this:
conda env export --name Name_of_your_Conda_env | yq -r '.dependencies[] | .. | select(type == "string")' | sed -E "s/(^[^=]*)(=+)([0-9.]+)(=.*|$)/\1==\3/" > requirements.txt
Unfortunately conda env export only exports the environment in yaml format. The --json flag doesn't work right now, but is supposed to be fixed in the next release. That is why I had to use yq instead of jq. You can install yq using pip install yq. It is just a wrapper around jq to allow it to also work with yaml files.
KEEP IN MIND
Lambda deployment packages can only be 50MB in size, so your environment shouldn't be too big.
I have not tried deploying a Lambda with serverless + serverless-python-requirements and a requirements.txt created like that, so I don't know if it will work.
The main reason why I use conda is the option not to compile different binary packages myself (like numpy, matplotlib, pyqt, etc.), or to compile them less frequently. When you do need to compile something yourself for a specific version of Python (like uwsgi), you should compile the binaries with the same gcc version that the Python within your conda environment is compiled with. Most probably it is not the same gcc that your OS is using, since conda now uses the latest versions of gcc, which should be installed with conda install gxx_linux-64.
This leads us to two situations:
All your dependencies are in pure Python, and you can actually save a list of them using pip freeze and bundle them as stated for virtualenv.
You have some binary extensions. In that case, the binaries from your conda environment will not work with the Python used by AWS Lambda. Unfortunately, you will need to visit the page describing the execution environment (AMI: amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2), set up the environment, build the binaries for the specific version of built-in Python in a separate directory (as well as pure Python packages), and then bundle them into a zip archive.
This is a general answer to your question, but the main idea is that you can not reuse your binary packages, only a list of them.
I can't think of a good reason why zipping up your conda environment wouldn't work.
I think you can go into your anaconda2/envs/ or anaconda3/envs/ directory and copy/zip the env directory you want to upload. Conda is just a souped-up version of a virtualenv, plus a different and somewhat optional package manager. The big reason I think it's ok is that conda environments encapsulate all their dependencies within their particular .../anaconda[2|3]/envs/$VIRTUAL_ENV_DIR/ directories by default.
Using the normal virtualenv expression gives you a bit more freedom, in sort of the same way that cavemen had more freedom than modern people. Personally I prefer cars. With virtualenv you basically get a semi-empty $PYTHON_PATH variable that you can fill with whatever you want, rather than the more robust, pre-populated env that Conda spits out. The following is a good table for reference: https://conda.io/docs/commands.html#conda-vs-pip-vs-virtualenv-commands
Conda turns the command ~$ /path/to/$VIRTUAL_ENV_ROOT_DIR/bin/activate into ~$ source activate $VIRTUAL_ENV_NAME
Say you want to make a virtualenv the old-fashioned way. You'd choose a directory (let's call it $VIRTUAL_ENV_ROOT_DIR) and a name (which we'll call $VIRTUAL_ENV_NAME). At this point you would type:
~$ cd $VIRTUAL_ENV_ROOT_DIR && virtualenv $VIRTUAL_ENV_NAME
python then creates a copy of its own interpreter (plus pip and setuptools, I think) and places an executable called activate in this clone's bin/ directory. The $VIRTUAL_ENV_ROOT_DIR/bin/activate script works by changing your current $PATH environment variable, which determines which python interpreter gets called when you type ~$ python into the shell; the interpreter then looks for modules relative to its own location in the environment directory. This is the primary reason you'll see #!/usr/bin/env python in people's code instead of /usr/bin/python.
In https://github.com/dazza-codes/aws-lambda-layer-packing, the pip wheels seem to be working for many packages (pure-pip installs). It is difficult to bundle a lot of packages into a compact AWS Lambda layer, since pip wheels do not use shared libraries and tend to get bloated a bit, but they work. Based on some discussions in github, the conda vs. pip challenges are not trivial:
https://github.com/pypa/packaging-problems/issues/25
try https://github.com/conda-incubator/conda-press
AFAICT, the AWS SAM uses https://github.com/aws/aws-lambda-builders and it appears to be pip based, but it also has a conda package at https://anaconda.org/conda-forge/aws_lambda_builders

For real, too many installations of Python on OSX Mountain Lion

I have three different Python 2.7s at:
/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7
/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7
I use a number of packages that come from different sources. I am currently installing packages from port (MacPorts), easy_install, pip (installed by easy_install), and Mercurial. There are also some that I have to install from image or build from source. I have more control over those.
The problem is that easy_install and pip seem to be installing to one location (/Library/Frameworks/...) and MacPorts installs to another (/opt/local/Library/Frameworks/...).
What's my best action now? Delete /Library/Frameworks/.../python2.7 and move easy_install and pip to the MacPorts one at /opt/local/...? Link the two directories? Move the MacPorts installation to /Library/Frameworks/...?
How can I consolidate these Pythons? I have tried putting both site-packages locations in my path, but only certain packages are available only for one Python and not the other and others vice versa, and I need them all available at once.
It seems that you have control over the stuff you're building yourself. This is how I consolidate MacPorts with pip:
I like using MacPorts for all my stuff, so I just make sure that pip and easy_install build into MacPorts' installation of Python (the one in /opt/local/...).
You can tell where pip and easy_install will install things by using:
readlink `which pip`
(those are backticks)
If you want pip to install to the MacPorts directories, use MacPorts to install pip:
sudo port install py-pip
Then, be sure that which pip points to something like:
askewchan@rock:~$ which pip
/opt/local/bin/pip
askewchan@rock:~$ readlink `which pip`
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin/pip-2.7
From the comments below (thanks @Jonathan and @Ned), you can do the same with easy_install, but its port is called py-distribute:
sudo port install py-distribute
But as far as I know, you never need to use easy_install, because anything that can be easy_installed can be installed better with pip.
Note the port descriptions:
askewchan@rock:Tracking {master *}$ port search *easy*install*
py-pip @1.2.1 (python, www)
An easy_install replacement
askewchan@rock:Tracking {master *}$ port search py*distribute
py-distribute @0.6.35 (python, devel)
Replacement for setuptools
I suggest deciding on one and only one Python for your development work (personally, I use the distribution from Python.org).
You can't get rid of /Library/Frameworks - that's the default OSX one, and you could break things.
Of the two remaining Pythons, I'm assuming one is MacPorts and the other is Python.org -- you need to choose which one you want to be your development env and stick with that.
I would strongly recommend against using pip or easy_install from one Python to install modules for another. The reason is that there can be differences in the compile options. It can be hard enough as-is to get certain packages to compile on OSX properly -- if you start compiling against different binaries ( which might support different architectures ) you're just going to increase your headaches.
I personally chose the following path:
I use the Python.org package for all development.
On a terminal login, I run shell scripts to prioritize my Python choice
All of my projects have their own virtualenv , and I disable system packages
When starting to work on any project, I tend to have an environment setup script. I just type go_myproject.source; that cds me to the right directory and runs source /path/to/virtualenv/bin/activate to get me set up for that project (see the sketch below).
There's a tiny bit of overhead in getting things set up, but I have been in complete heaven ever since. Managing projects and not needing to worry about dependencies/upgrades for one thing killing something else is... blissful.
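Such a setup script can be tiny; a hedged sketch with illustrative paths:
# go_myproject.source - source this file to switch to the project
cd ~/projects/myproject
source /path/to/virtualenv/bin/activate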
While not a general solution, I install Mercurial and other Python-based applications using virtualenv. In particular, pip and easy_install will install to the respective virtual environment only and not clutter any system folder. The downside is, of course, that I will have duplicates of some packages; the advantage is that I have a clean, self-contained environment with a known version of Python (which for things such as Mercurial and other mission-critical applications is more important for me).
Another downside is that I need to link individual applications to my personal bin directory or add the bin directories of the virtual environments to my path. (Personally, I manage this with some simple scripts that do the symlinking for me.)
I suggest moving all Python installations to one place and creating symlinks.
After that, configure the Python environment to avoid problems with imports and "visibility" of the modules.
Try using these commands:
# easy_install
env PYTHONPATH=/custom/path easy_install --install-dir /custom/path package_name
#pip
pip install --install-option="--prefix=$PREFIX_PATH" package_name

Best practice for installing python modules from an arbitrary VCS repository

I'm newish to the python ecosystem, and have a question about module editing.
I use a bunch of third-party modules, distributed on PyPi. Coming from a C and Java background, I love the ease of easy_install <whatever>. This is a new, wonderful world, but the model breaks down when I want to edit the newly installed module for two reasons:
The egg files may be stored in a folder or archive somewhere crazy on the file system.
Using an egg seems to preclude using the version control system of the originating project, just as using a debian package precludes development from an originating VCS repository.
What is the best practice for installing modules from an arbitrary VCS repository? I want to be able to continue to import foomodule in other scripts. And if I modify the module's source code, will I need to perform any additional commands?
Pip lets you install packages given a URL to a Subversion, git, Mercurial or bzr repository.
pip install -e svn+http://path_to_some_svn/repo#egg=package_name
Example:
pip install -e hg+https://rwilcox@bitbucket.org/ianb/cmdutils#egg=cmdutils
This downloads the latest version of cmdutils (a random package I decided to pull).
I installed this into a virtualenv (using the -E parameter), and pip installed cmdutils into a src folder at the top level of my virtualenv folder.
pip install -E thisIsATest -e hg+https://rwilcox@bitbucket.org/ianb/cmdutils#egg=cmdutils
$ ls thisIsATest/src
cmdutils
Are you wanting to do development but have the developed version handled as an egg by the system (for instance to get entry points)? If so, then you should check out the source and use Development Mode by doing:
python setup.py develop
If the project happens to not be a setuptools based project, which is required for the above, a quick work-around is this command:
python -c "import setuptools; execfile('setup.py')" develop
Almost everything you ever wanted to know about setuptools (the basis of easy_install) is available from the setuptools docs. Also there are docs for easy_install.
Development mode adds the project to your import path in the same way that easy_install does. Any changes you make will be available to your apps the next time they import the module.
As others mentioned, you can also directly use version control URLs if you just want to get the latest version as it is now without the ability to edit, but that will only take a snapshot, and indeed creates a normal egg as part of the process. I know for sure it does Subversion and I thought it did others but I can't find the docs on that.
You can use the PYTHONPATH environment variable or symlink your code to somewhere in site-packages.
Packages installed by easy_install tend to come from snapshots of the developer's version control, generally made when the developer releases an official version. You're therefore going to have to choose between convenient automatic downloads via easy_install and up-to-the-minute code updates via version control. If you pick the latter, you can build and install most packages in the Python Package Index directly from a version control checkout by running python setup.py install.
If you don't like the default installation directory, you can install to a custom location instead, and export a PYTHONPATH environment variable whose value is the path of the installed package's parent folder.
