Create anaconda environment with all packages from other environments - python

Is it possible to create an anaconda environment with all of the packages from my other environments? It would be even better if it could dynamically stay up to date.

If the packages of interest were all pulled from pip, you could attempt a pip freeze and requirements install, as discussed here:
Pip freeze vs. pip list
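In practice that pip-only workflow is roughly the following (requirements.txt is just the conventional file name):
$ pip freeze > requirements.txt      # in the source environment
$ pip install -r requirements.txt    # in the target environment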
But I doubt that would work globally for every module. I remember, back in the day, trying to extend my base Python to include Bokeh, but the dependency headaches eventually drove me to install Anaconda outright.
Looks like there is a means to do this,
$ conda list -e > req.txt
then you can install the environment using
$ conda create -n new_environment --file req.txt
These examples are for a one-off merge of a single source environment into a single target environment. If you want the union of several environments, you'd need to merge the req.txt files yourself, probably keeping the highest version of each package, which means a bit of string parsing and scripting so that conflicting versions from the various environments don't all get funneled into one (see the rough sketch after the note below).
(I'm not able to test this directly at the moment)
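A rough, untested sketch of such a merge, assuming two files exported with conda list -e (env1_req.txt, env2_req.txt and merged_env are placeholder names) and GNU sort/awk. Lines in those files look like name=version=build, so sorting by name and then by version (highest first) and keeping the first hit per name picks the highest version:
$ cat env1_req.txt env2_req.txt \
    | grep -v '^#' \
    | sort -t '=' -k1,1 -k2,2rV \
    | awk -F '=' '!seen[$1]++' \
    > merged_req.txt
$ conda create -n merged_env --file merged_req.txt
Note that the build strings are kept as-is, which may over-constrain the solver; strip the third field if that becomes a problem.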

Stack 'em:
Create environments for base_env (base packages) and app_env (just your application packages), then:
conda activate base_env
conda activate --stack app_env

Related

Python Virtual Environments Confusion

I have been learning data science using Python for about a year now. I have become quite comfortable with the syntax and model creation. I have exclusively used Google Colab just due to how convenient it is, and I love the notebook style. However, one thing I do not understand is the environment stuff. Although I use Colab, I do have Python and Anaconda on my machine and have installed various packages using the exact following format: pip install (package name). When I open my terminal, the first line starts with (base), and when I check the Environments tab in Anaconda Navigator, it appears as though I installed all of these packages into a base environment named base (root)? Is that right? If so, what would my environment's name be then? What is a base environment compared to a venv?
The reason I am asking is because if I ever decide to use an IDE in the future, I would need to set my environment to be able to run packages, correct?
Just for fun I want to try using R and its reticulate package that allows python use in R. As stated in the answer to this question, I need to set my virtual environment before I can use python in R. Would my virtual environment be base (root)?
I'm a complete noob about all of this environment stuff. Again, I just opened my terminal and typed pip install (package name) for all packages I've installed. Thanks for any help in advance.
So from your description, it sounds like your default Python installation on your computer is through Anaconda. If that's the case, base is actually going to be the name of the conda virtual environment that you're using.
Virtual environments can be tricky, so I'll walk you through what I usually do here.
First, you can always check which Python installation you're currently using with the which command on Mac/Linux; on Windows the command will probably be where (this answer might be helpful: equivalent of 'which' in Windows).
(base) ➜ ~ which python
/Users/steven/miniconda3/bin/python
From the above, you can see that my default Python is through Miniconda, which is just a small version of Anaconda.
This means that when you use pip to install packages, those are getting installed into this base conda environment. And, by the way, you can use the which command with pip as well, just to double-check that you're using the version of pip that's in your current environment:
(base) ➜ ~ which pip
/Users/steven/miniconda3/bin/pip
If you want to see the list of packages currently installed, you can do pip freeze, or conda env export. Both pip and conda are package managers, and if you're using an Anaconda Python installation then you can (generally) use either to install packages into your virtual environment.
(Quick side note: "virtual environments" are a general concept that can be implemented in different ways. Both conda and virtualenv are ways to use virtual environments in Python. I'm also a data scientist, and I use conda for all of my virtual environments.)
If you want to create a new virtual environment using conda, it's very straightforward. First, you can create the environment and install some packages right away, like pandas and matplotlib. Then you can activate that environment, check your version of python, and then deactivate it.
(base) ➜ ~ conda create -n my-new-environment pandas matplotlib
(base) ➜ ~ which python
/Users/steven/miniconda3/bin/python
(base) ➜ ~ conda activate my-new-environment
(my-new-environment) ➜ ~ which python
/Users/steven/miniconda3/envs/my-new-environment/bin/python
(my-new-environment) ➜ ~ conda deactivate
(base) ➜ ~ which python
/Users/steven/miniconda3/bin/python
And, if you want to see which conda virtual environments you currently have available, you can run conda env list.
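The output looks roughly like this (the asterisk marks the currently active environment; your paths will differ):
(base) ➜ ~ conda env list
# conda environments:
#
base                  *  /Users/steven/miniconda3
my-new-environment       /Users/steven/miniconda3/envs/my-new-environment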
Here's the documentation for conda environments, which I reference all the time: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
I hope this is helpful!

Best practice to manage dependencies between conda and pip

I'm developing a Python library, which depends on multiple packages. I'm struggling to find the most straightforward way of managing all those dependencies, given the following constraints:
Some of those dependencies are only available as conda packages (technically the source is available but the build process is not something I want to get into)
Other dependencies are only available via pip
I need to install my own library in editable or developer mode
I regularly need to keep the dependencies up-to-date
My current setup for the initial install:
Create a new conda environment
Install the conda-only dependencies with conda install ...
Install my library with pip install -e .
At this point, some packages were installed and are now managed by conda, others by pip. When I want to update my environment, I need to:
Update the conda part of the environment with conda update --all
Update the pip part of the environment by hand
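"By hand" here means roughly (the package name is just an example):
pip list --outdated
pip install -U some_outdated_package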
My problem is that this is unstable : when I update all conda packages, it ensures the consistency of the packages it manages. However, I can't guarantee that the environment as a whole stays consistent, and I just realized that I was missing some updates because I forgot to check for updates in the pip part of the environment.
What's the best way to do this? I've thought of:
Using conda's pip interoperability feature : this seems to work, but I've had some dubious results, probably because of my use of extras_require
Since pip can see the conda packages, the initial install is consistent, which means I can simply reinstall everything when I want to update. This works but is not exactly elegant.
The recommendation in the official documentation for managing a Conda environment that also requires PyPI-sourced or pip-installed local packages is to define all dependencies (both Conda and Pip) in a YAML file. Something like:
env.yaml
name: my_env
channels:
  - defaults
dependencies:
  - python=3.8
  - numpy
  - pip
  - pip:
      - some_pypi_only_pkg
      - -e path/to/a/local/pkg
The workflow for updating in such an environment is to update the YAML file (which I would recommend to keep under version control) and then either create a new environment or use
conda env update -f env.yaml
Personally, I would tend to create new envs rather than mutate (update) an existing one, and use minimal constraints (i.e., >=version) in the YAML. When creating a new env, it should automatically pull the latest consistent packages. Plus, one can keep the previous instances of the env around in case a regression turns up during the development lifecycle.
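For example, to build a fresh environment from the same file while keeping the old one around (the version-suffixed name is just a placeholder):
conda env create -f env.yaml -n my_env_v2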

Python Conda Environment confusion (as example: problem with gym)

Trying to use the gym open-ai package (and some others), I ran into some problems whose structure I don't really understand.
As an example:
I tried to install gym in three different conda environments.
One way to do this is
pip install gym
Another is:
git clone https://github.com/openai/gym.git
cd gym
pip install -e .
A third would be:
pip3 install gym
In some environments I would use Python 2, in other environments maybe Python 3.7.
Even more possibilities for installation would be:
sudo pip install gym
(and even more permutations would be possible if we also took into account whether we activate an environment or don't activate any environment).
To me things get even more complicated because I installed conda under a non-administrator user account on Ubuntu, so that conda (or rather the user itself) could not install any files into the /usr directory.
I began to test some of these possibilities and cases, because the installation of some libraries (e.g. keras-rl) seemed to need access to common resources (the /usr directory), even when installed in a local conda environment. But if so: would the installations in different conda environments interact?
And what if one installed a package as a local user in a conda environment and afterwards installed it via pip or pip3 as administrator? Would the admin installation overwrite (or override, or interact with) the environment installation (or parts of it)?
While experimenting with the different possibilities (or rather: while trying to find an installation that did not produce any errors like "gym not found" or "attribute error ..."), errors like the following occurred:
Found existing installation: gym 0.15.4
Can't uninstall 'gym'. No files were found to uninstall.
after executing:
sudo pip3 install gym --force
So on this basis my questions specifically would be:
(1) Is there a best practice for establishing good conda environments (which don't tend to interact, especially if some packages need sudo privileges)?
And (2) if some environments do interact with general (sudo) resources, how can that be resolved in a way that distinct environments can be tested and established alongside each other?
Annotation:
There was a similar question some time ago:
conda environment pip is trying to install dependencies globally
but the advice not to use sudo seems difficult to follow if some packages require access to global resources.
So I would like to ask for a solution to these interactions a bit more specifically.
You should not use sudo to install anything in a conda environment. Most likely the pip command used did not come from the actual (activated?) environment; instead the system-wide pip was used, which is why sudo was needed to install into a system-owned prefix.
You can check whether you are using the desired pip by invoking "which pip". The path should point to your environment. If it does not, you should install pip inside your conda env.
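For example (my_env and the path are only illustrative):
(my_env) $ which pip
/home/you/miniconda3/envs/my_env/bin/pip
(my_env) $ conda install pip    # only needed if which pip pointed at the system-wide pip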
I had the same problem before. I activated the conda environment and installed with pip3 locally, since conda does not have support for it. Warning: possible to wreck some packages.
The conda environment should ALWAYS be activated before installing anything, or else it ends up as a global installation.
Install a new conda environment without using sudo. If it asks for sudo, you need to remove the whole thing and clean up a bit. It's very easy to forget, but NEVER use sudo!
You can try installing a newer version of Python 3.x (Python 2 is becoming history very soon anyway, they said). pip = Python 2, pip3 = Python 3. And to answer one of your other questions, whether installing globally will mess things up: not outside conda.
Google PyCharm and conda: you can use it to set up three different types of environments with Python, and it's actually a darn good editor for Python coding. The rest is more Linux-related, when we talk about cleaning up PATHs etc.
I have nothing better to add! Hope you get it right.

If I install a package in only one virtual environment, do I need to reinstall it in other virtual environments?

For example, if I install TensorFlow in one virtual environment, do I need to reinstall it again when I make a new project in a different virtual environment? This seems very bothersome, and I usually only need one version of a package.
Also, I want to install TensorFlow using Anaconda but the only way is using a virtual environment: https://docs.anaconda.com/anaconda/user-guide/tasks/tensorflow/ Any ideas on how I can install it system wide?
Yes, you want packages per virtual environment. It's fairly easy to use with tools like pipenv.
The reason you want packages per virtual environment is version management per project. If you have 10 projects locally and only use system-wide packages, they all need to use the same version. You can get away with it, but it's something you want to avoid.
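With pipenv, for instance, the per-project setup is roughly (run inside the project directory):
pipenv install tensorflow    # creates or reuses a virtualenv dedicated to this project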
If you use conda environments you can clone and share them.
conda create --name mynewcloneenv --clone myoldoriginalenv
For example, if I install TensorFlow in one virtual environment, do I need to reinstall it again when I make a new project in a different virtual environment? This seems very bothersome, and I usually only need one version of a package.
Yes
Also, I want to install TensorFlow using Anaconda but the only way is using a virtual environment: https://docs.anaconda.com/anaconda/user-guide/tasks/tensorflow/ Any ideas on how I can install it system wide?
Try to use venv for every project.
Based on your comments and your question, you can prepare a single pip command to install everything in one go. It's a space-delimited list:
pip install package1 package2 package3 package4

Is it a bad idea to use conda and pip install in the same environment?

Since conda install and pip install in many cases do essentially the same thing, what would be the best option? Is there a case when someone should stick to pip install only? Symmetrically, is there a case when one should stick to conda install only? Is there a way to shoot oneself in the foot by using both conda and pip install in a single environment?
If both approaches are essentially the same and don't contradict each other, there should be no reason to stick solely to one of them but not to the other.
Don't mix conda install and pip install within a conda environment. It's probably best to decide on conda or virtualenv+pip once and for all. And here is how you decide which one suits you best:
Conda installs various (not only python) conda-adopted packages within conda environment. It gets your environments right if you are into environments.
Pip installs python packages within Python environment (virtualenv is one of them). It gets your python packages installed right.
Safe way to use conda: don't rush for the latest stuff and stick to the available packages and you'll be fine.
Safe way to use pip+virtualenv: if you see a dependency issue or wish to remove and clean up after a package, don't. Just burn the house down: abandon your old environment and create a new one. One command line and 2-5 minutes later, things are going to be nice and tidy again.
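That one command line could look roughly like this (using the stdlib venv module; .venv and requirements.txt are just conventions):
rm -rf .venv && python -m venv .venv && . .venv/bin/activate && pip install -r requirements.txt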
Pip is the best tool for installing Python packages among the two of them. Since pip packages normally come out first and only later are adopted for conda (by conda staff or contributors). Chances are, after updating or installing the latest version of Python some of the packages would only be available through pip. And the latest freshest versions of packages would only be available in pip. And mixing pip and conda packages together can be a nightmare (at least if you want to utilize conda's advantages).
Conda is the best when it comes to managing dependencies and replicating environments. When uninstalling a package, conda can properly clean up after itself and has better control over conflicting dependency versions. Also, conda can export an environment config and, if the planets are right at the moment and the new machine is not too different, replicate that environment somewhere else. Conda also has greater control over the environment and can, for example, have a different version of Python installed inside it (virtualenv can only use a Python already available on the system). You can always create a conda package when you have no freedom in choosing what to use.
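That export/replicate workflow is roughly:
conda env export > environment.yml     # on the original machine
conda env create -f environment.yml    # on the new machine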
Some relevant facts:
Conda takes more space and time to setup
Conda might be better if you don't have admin rights on the system
Conda will help when you have no system Python
virtualenv+pip will free you up of knowing lots of details like that
Some outdated notions:
Conda used to be better for novice developers back in the day (2012ish). There is no usability gap anymore
Conda was linked to Continuum Analytics too much. Now Conda itself is open source, the packages - not so much.
Depends on the complexity of your environment really.
Using pip for a few simple packages should not generate any issues.
Using more pip installs raises the question "Why not use a pip venv then?".
If you're not doing anything major, you might be able to have a mix of pip and conda installs.
There is an extensive explanation why mixing them can be a bad idea here: Using Pip in a Conda Environment.
