Export Conda Environment with minimized requirements - python

The typical command to export an Anaconda environment to a YAML file is:
conda env export --name my_env > myenv.yml
However, one huge issue is the readability of this file, as it includes hard specifications for all of the libraries and all of their dependencies. Is there a way for Anaconda to export the smallest subset of packages that would subsume these dependencies, to make the YAML more readable? For example, if all you installed in a conda environment was pip and scipy, is there a way for Anaconda to realize that the file should just read:
name: my_env
channels:
- defaults
dependencies:
- scipy=1.3.1
- pip=19.2.3
That way, the Anaconda environment would still have the same specification, if not an improved one (if an upstream bug is fixed), and anyone who looks at the yml file would understand what is "required" to run the code: if they didn't want to, or couldn't, use the conda environment, they would know which packages they needed to install.

Options from the Conda CLI
This is sort of what the --from-history flag is for, but not exactly. Instead of including exact build info for each package, it will include only what are called explicit specifications, i.e., the specifications that a user has explicitly requested via the CLI (e.g., conda install scipy=1.3.1). Have a try:
conda env export --from-history --name my_env > myenv.yml
This will only include versions if the user originally included versions during installation. Hence, creating a new environment is very likely not going to use the exact same versions and builds. On the other hand, if the user originally included additional constraints beyond version and build they will also be included (e.g., a channel specification conda install conda-forge::numpy will lead to conda-forge::numpy).
Another option worth noting is the --no-builds flag, which will export every package in the YAML, but leave out the build specifiers. These flags work in a mutually exclusive manner.
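To make the difference between a full export and a --no-builds export concrete, here is a minimal Python sketch of what dropping the build specifier means for a single dependency line. It does not call conda; the build strings in the example are made up for illustration.

```python
def strip_build(spec: str) -> str:
    """Drop the build string from a 'name=version=build' spec, if present."""
    parts = spec.split("=")
    # A fully pinned export line has three '='-separated fields;
    # keeping the first two leaves 'name=version'.
    return "=".join(parts[:2])


# Hypothetical build strings, only for illustration.
full_export = ["scipy=1.3.1=py37h7c811a0_0", "pip=19.2.3=py37_0"]
no_builds = [strip_build(spec) for spec in full_export]
print(no_builds)
```

Note that this still lists every package; only --from-history reduces the list to what was explicitly requested.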
conda-minify
If this is not sufficient, then there is an external utility called conda-minify that offers some functionality to export an environment that is minimized based on a dependency tree rather than through the user's explicit specifications.
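As a rough illustration of what "minimized based on a dependency tree" can mean, the sketch below keeps only the packages that no other installed package depends on. This is not conda-minify's actual implementation, and the dependency graph is a simplified, hypothetical one.

```python
from typing import Dict, Set


def top_level_packages(depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return packages that are not a dependency of any other package."""
    required_by_others: Set[str] = set()
    for deps in depends_on.values():
        required_by_others |= deps
    return set(depends_on) - required_by_others


# Simplified, hypothetical dependency graph of an environment.
graph = {
    "scipy": {"numpy", "python"},
    "numpy": {"python"},
    "pip": {"python", "setuptools"},
    "setuptools": {"python"},
    "python": set(),
}
print(sorted(top_level_packages(graph)))
```

Installing just the packages this returns would pull the rest back in as dependencies, which is the idea behind a minimized export.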

Have a look at pipreqs. It creates a requirements.txt file based only on the imports that you explicitly make inside your project (and there is even a --no-pin option to leave out the version numbers). You can later use this file to populate a conda environment via conda install --file requirements.txt.
However, if you're aiming for an environment.yml file, you have to create it manually. But that's just copy and paste from the clean requirements.txt. You only have to separate conda installs from "pip-only" installs.
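The manual step of separating conda packages from pip-only ones could be sketched like this in Python. The function name and the pip-only package name are hypothetical; the output layout (a nested pip: list under dependencies) is the usual environment.yml form.

```python
def render_environment_yml(name, conda_pkgs, pip_pkgs):
    """Build environment.yml text from conda specs and pip-only specs."""
    lines = [f"name: {name}", "channels:", "  - defaults", "dependencies:"]
    lines += [f"  - {pkg}" for pkg in conda_pkgs]
    if pip_pkgs:
        # The nested pip section needs pip itself among the conda dependencies.
        has_pip = any(pkg.split("=")[0] == "pip" for pkg in conda_pkgs)
        if not has_pip:
            lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"      - {pkg}" for pkg in pip_pkgs]
    return "\n".join(lines) + "\n"


# 'some-pip-only-package' is a placeholder, not a real package name.
print(render_environment_yml(
    "my_env", ["scipy=1.3.1", "pip=19.2.3"], ["some-pip-only-package"]
))
```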

Related

Install VS Code Packages to work in Multiple Conda Environments

I have several Python projects using different environments. These environments are managed using Conda and this works well, allowing the same environment to be used in production and dev/test for each project.
Conda yml files are used to define each environment.
There are a number of packages that I would like to use during development, such as autopep8. These don't need to be in the production environment so are not included in the yml file.
How can I install autopep8 and others so that they will work across any Python environment that I load in VS Code? So far I have had to manually install these packages as I switch environments.
Default Packages
One way of managing this without violating environment isolation [1] would be to use Conda's default packages functionality. The idea would be to define default packages (such as autopep8) in a .condarc on only the development systems. The conda env create command will respect these and add them to every env you create, so you can still keep a single YAML that describes only the essentials for the production version.
Note that there are multiple options for where to store this .condarc, and Conda can load settings in a nested fashion. If all environments for your user are categorized as "development", then a sensible place to define the default packages would be ~/.condarc. There is additionally a --no-default-packages flag, which can be used to disable such default package installation when you don't need it.
[1] While there are ways to include packages from outside a Conda environment (e.g., through PYTHONPATH), this should be regarded as substandard and only be used as a last resort. Conda is designed with an assumption of full isolation of environments - violating that can lead to undefined behavior.
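For the default packages approach, the relevant .condarc key is create_default_packages. Below is a small Python sketch that renders such a fragment; the helper name is made up, and where you append the text (e.g., ~/.condarc) is up to you.

```python
def render_default_packages(packages):
    """Render a .condarc fragment listing default packages for new envs."""
    lines = ["create_default_packages:"]
    lines += [f"  - {pkg}" for pkg in packages]
    return "\n".join(lines) + "\n"


# Development-only tools that should be added to every new environment.
print(render_default_packages(["autopep8"]))
```

Keeping this only on development machines means the project YAML stays identical between dev and production.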

Can a new env in conda inherit specific packages from base environment

When I create conda environments for my projects where I use pytorch,
torch and torchvision packages take a long time to install due to slow connection in my region (takes hours sometimes).
Therefore to start working on my projects quickly I don't create a new env, I just use the packages in the base env. I know this will get hairy soon.
That's why I want to know if there is a way to make a newly created env inherit specific packages from the base env without re-installing.
ps: I understand that conda leverages hard links but I don't understand how to use this in this case. I appreciate your help.
Cloning
The simplest way to use only already installed packages in a new environment is to clone an existing environment (conda create --clone foo --name bar). Generally, I don't recommend cloning the base environment since it includes Conda and other infrastructure that is only needed in base.
At a workflow-level, it might be advantageous to consider creating some template environments which you can clone for different projects.
YAML Definitions
However, OP mentions only wanting specific packages. I would still create a new env for this, but start with an existing env using an exported YAML.
conda env export -n foo > bar.yaml
Edit bar.yaml to remove any packages that you don't want (again, if foo == base, remove conda), then create the new environment with
conda env create -f bar.yaml --name bar
This will ensure that exactly the packages from the previous environment are used.
Overall, if you use cloning and recreating from YAML files (which include build specifications), then Conda will minimize downloading as well as physical disk usage.
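If you'd rather not edit the exported YAML by hand, the removal step can be sketched in plain Python without a YAML library. This assumes the flat "- name=version=build" lines that conda env export writes; the versions and builds below are hypothetical.

```python
def drop_packages(yaml_text, unwanted):
    """Remove '- name=...' list items whose package name is in `unwanted`."""
    kept = []
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            # Everything before the first '=' is the package name.
            name = stripped[2:].split("=")[0].strip()
            if name in unwanted:
                continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical export of an env named foo.
exported = """name: foo
channels:
  - defaults
dependencies:
  - conda=4.8.3=py37_0
  - pytorch=1.5.0=py3.7_cpu_0
  - torchvision=0.6.0=py37_cpu
"""
print(drop_packages(exported, {"conda"}))
```

Since the remaining lines keep their build strings, creating the env from the result should reuse the packages already in the local cache.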

How to fix `ResolvePackageNotFound` error when creating Conda environment?

When I run the following command:
conda env create -f virtual_platform_mac.yml
I get this error
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
- pytables==3.4.2=np113py35_0
- h5py==2.7.0=np113py35_0
- anaconda==custom=py35_0
How can I solve this?
I am working on Mac OS X.
Conda v4.7 dropped a branch of the Anaconda Cloud repository called the free channel for the sake of improving solving performance. Unfortunately, this includes many older packages that never got ported to the repository branches that were retained. The requirements failing here are affected by this.
Restore free Channel Searching
Conda provides a means to restore access to this part of the repository through the restore_free_channel configuration option. You can verify that this is the issue by seeing that
conda search pytables=3.4.2[build=np113py35_0]
fails, whereas
CONDA_RESTORE_FREE_CHANNEL=1 conda search pytables=3.4.2[build=np113py35_0]
successfully finds the package, and similarly for the others.
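To check the other failing requirements the same way, the specs from the error message can be converted into the search syntax used above. A minimal Python sketch (the function name is made up):

```python
def to_search_query(spec: str) -> str:
    """Turn 'name==version=build' into 'name=version[build=build]'."""
    name, _, rest = spec.partition("==")
    version, _, build = rest.partition("=")
    if build:
        return f"{name}={version}[build={build}]"
    return f"{name}={version}" if version else name


failing = [
    "pytables==3.4.2=np113py35_0",
    "h5py==2.7.0=np113py35_0",
    "anaconda==custom=py35_0",
]
for spec in failing:
    print("conda search", to_search_query(spec))
```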
Option 1: Permanent Setting
If you expect to frequently need older packages, then you can globally set the option and then proceed with installing:
conda config --set restore_free_channel true
conda env create -f virtual_platform_mac.yml
Option 2: Temporary Setting
As with all Conda configuration options, you can also use the corresponding environment variable to temporarily restore access just for the command:
Unix/Linux
CONDA_RESTORE_FREE_CHANNEL=1 conda env create -f virtual_platform_mac.yml
Windows
SET CONDA_RESTORE_FREE_CHANNEL=1
conda env create -f virtual_platform_mac.yml
(Yes, I realize the cognitive dissonance of a ..._mac.yml, but Windows users need help too.)
Including Channel Manually
One can also manually include the channel as one to be searched:
conda search -c free pytables=3.4.2[build=np113py35_0]
Note that any of these approaches will only use the free channel in this particular search and any future searches or changes to the env will not search the channel.
Pro-Tip: Env-specific Settings
If you have a particular env that you always want to have access to the free channel but you don't want to set this option globally, you can instead set the configuration option only for the environment.
conda activate my_env
conda config --env --set restore_free_channel true
A similar effect can be accomplished by setting and unsetting the CONDA_RESTORE_FREE_CHANNEL variable in scripts placed in the etc/conda/activate.d and etc/conda/deactivate.d folders, respectively. See the documentation for an example.
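If you drive Conda from a Python script, the temporary setting can be applied to just one subprocess. This is a sketch under the assumption that conda is on PATH; create_env is defined but not called here, and on Windows the call may need adjusting.

```python
import os
import subprocess


def with_free_channel(base_env=None):
    """Return a copy of the environment with the free channel restored."""
    env = dict(os.environ if base_env is None else base_env)
    env["CONDA_RESTORE_FREE_CHANNEL"] = "1"
    return env


def create_env(yaml_path):
    """Run 'conda env create' with the variable set only for this call."""
    return subprocess.run(
        ["conda", "env", "create", "-f", yaml_path],
        env=with_free_channel(),
        check=True,
    )
```

The calling process's own environment is left untouched, which mirrors the one-off VAR=1 command form shown above.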
Another solution might be explained here. Basically, if you import an environment.yml file to a different OS (e.g., from macOS to Windows) you will get build errors.
The solution is to use the flag "--no-builds", but it does not guarantee that the environment.yml will actually be compatible. Some libraries, e.g. libgfortran, are not found on Windows channels for Anaconda (see here).
I would use
CONDA_RESTORE_FREE_CHANNEL=1 conda env create -f
to keep using outdated/older packages

How to backup Anaconda added packages?

I have Anaconda for Python 2. It came packed with a lot of useful packages. During my work, I have added several packages to it using the conda install command. Now I have to format my system, and I want to back up/pack all the added libraries, either as full packages or even just by knowing the installation command of each one.
I searched StackOverflow and found one unanswered question with a similar problem. That question suggested conda list -e >file_list.txt to create a file containing all the installed packages, but this is not sufficient for me. I want Anaconda to determine which packages were added by me, and by which command, or to pack the added packages in full.
Thanks for help.
I think you can find the solution you are looking for here.
Open the Anaconda prompt
Activate the environment you are interested in
Type conda env export > environment.yml
In the yml you will find all the dependencies and you can use it to create a new virtual environment as a copy of the current one.
For example, on the new/rebooted machine, you can do:
conda env create -f environment.yml

How can I install a conda environment when offline?

I would like to create a conda environment on a machine that has no network connection. What I've done so far is:
On a machine that is connected to the internet:
conda create -n python3 python=3.4 anaconda
Conda archived all of the relevant packages into \Anaconda\pkgs. I put these into a separate folder and moved it to the machine with no network connection. The folder has the path PATHTO\Anaconda_py3\win-64
I tried
conda create -n python=3.4 anaconda --offline --channel PATHTO\Anaconda_py3
This gives the error message
Fetching package metadata:
Error: No packages found in current win-64 channels matching: anaconda
You can search for this package on Binstar with
binstar search -t conda anaconda
What am I doing wrong? How do I tell conda to create an environment based on the packages in this directory?
You could try cloning root which is the base env.
conda create -n yourenvname --clone root
Short answer: copy the whole environment from another machine with the same OS.
Why
Dependency. A package depends on other packages. When you install a package online, the package manager conda analyzes the package dependencies and installs all the required packages for you.
The dependency is especially heavy in anaconda, because anaconda is a meta package that depends on another 160+ packages.
Meta packages are packages that do not contain actual software and simply depend on other packages to be installed.
It's totally absurd to download all these dependencies one by one and install them on the offline machine.
Detail Solution
Get conda installed on another machine with same OS. Install the packages you need in an isolated virtual environment.
# create an env named "myvenv", name it whatever you want
# and install the package into this env
conda create -n myvenv --copy anaconda
--copy is used to "Install all packages using copies instead of hard- or soft-linking."
Find where the environments are stored with
conda info
The 1st value of key "envs directories" is the location. Go there and package the whole sub-folder named "myvenv" (the env name in previous step) into an archive.
Copy the archive to your offline machine. Check "envs directories" from conda info. And extract the environment from the archive into the env directory on the offline machine.
Done.
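The packaging and extraction steps above can be sketched with Python's standard library. The function names are made up, and this only handles the archive part; as noted, both machines need the same OS.

```python
import shutil
from pathlib import Path


def pack_env(env_dir, archive_base):
    """Archive an env folder; the folder name stays the top-level entry."""
    env_dir = Path(env_dir)
    return shutil.make_archive(
        str(archive_base),
        "gztar",
        root_dir=str(env_dir.parent),
        base_dir=env_dir.name,
    )


def unpack_env(archive, envs_dir):
    """Extract the archive into the offline machine's envs directory."""
    shutil.unpack_archive(str(archive), str(envs_dir))
```

For example, pack_env on the "myvenv" folder produces a .tar.gz next to the given base path, and unpack_env into the offline "envs directories" location recreates the "myvenv" folder there.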
In addition to copying the pkgs folder, you need to index it, so that conda knows how to find the dependencies. See this ticket for more details and this script for an example of indexing the pkgs folder.
Using --unknown as @asmeurer suggests will only work if the package you're trying to install has no dependencies, otherwise you will get a "Could not find some dependencies" error.
Cloning is another option, but this will give you all root packages, which may not be what you want.
A lot of the answers here are not 100% related to the "when offline" part. They talk about the rest of OP's question, not reflected in question title.
If you came here because you need offline env creation on top of an existing Anaconda install you can try:
conda create --offline --name $NAME
You can find the --offline flag documented here
Have you tried without the --offline?
conda create -n anaconda python=3.4 --channel PATHTO\Anaconda_py3
This works for me when I am not connected to the Internet, provided I already have anaconda on the machine in another location. If you are connected to the Internet when you run this command, you will probably get an error associated with not finding something on Binstar.
I'm not sure whether this contradicts the other answers or is the same but I followed the instructions in the conda documentation and set up a channel on the local file system.
Then it's a simple matter of moving new package files to the local directory, running conda index on the channel sub-folder (which should have a name like linux-64).
I also set the Anaconda config setting offline to True as described here but not sure if that was essential.
Hope that helps.
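The "move new package files to the local directory" step could look roughly like this in Python. The function name is hypothetical, the default sub-folder is the linux-64 example from above, and running conda index afterwards is still a separate step.

```python
import shutil
from pathlib import Path


def stage_local_channel(package_files, channel_dir, subdir="linux-64"):
    """Copy package files into <channel_dir>/<subdir>; return a file:// URL."""
    target = Path(channel_dir) / subdir
    target.mkdir(parents=True, exist_ok=True)
    for pkg in package_files:
        shutil.copy2(str(pkg), str(target / Path(pkg).name))
    # The channel is the parent folder, not the platform sub-folder.
    return Path(channel_dir).resolve().as_uri()
```

After indexing, the returned URL is what you would pass as the channel when creating the environment.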
The pkgs directory is not a channel. The flag you are looking for is --unknown, which causes conda to include files in the pkgs directory even if they aren't found in one of the channels.
Here's what worked for me in Linux -
(a) Create a blank environment - Just create an empty directory under $CONDA_HOME/envs. Verify with - conda info --envs.
(b) Activate the new env - source activate
(c) Download the appropriate package (*.bz2) from https://anaconda.org/anaconda/repo on a machine with internet connection and move it to the isolated host.
(d) Install using local package - conda install . For example - conda install python-3.6.4-hc3d631a_1.tar.bz2, where python-3.6.4-hc3d631a_1.tar.bz2 exists in the current dir.
That's it. You can verify by the usual means (python -V, conda list -n ). All related packages can be installed in the same manner.
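When collecting package files by hand, it helps to check that each file is the name, version and build you expect. Conda package file names follow name-version-build.tar.bz2, so a small Python sketch can split them (the function name is made up):

```python
def parse_package_filename(filename: str):
    """Split 'name-version-build.tar.bz2' into (name, version, build)."""
    suffix = ".tar.bz2"
    if not filename.endswith(suffix):
        raise ValueError(f"not a .tar.bz2 conda package: {filename}")
    stem = filename[: -len(suffix)]
    # Split from the right, since package names may contain hyphens.
    name, version, build = stem.rsplit("-", 2)
    return name, version, build


print(parse_package_filename("python-3.6.4-hc3d631a_1.tar.bz2"))
```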
I found the simplest method to be as follows:
Run 'conda create --name name package' with no special switches
Copy the URL of the first package it tried (unsuccessfully) to download
Use the URL on a connected machine to fetch the tar.bz2
Copy the tar.bz2 to the offline machine's /home/user/anaconda3/pkgs
Deploy the tar.bz2 in place
Delete the now unneeded tar.bz2
Repeat until the 'conda create' command succeeds
Here's a solution that may help. It's not very pretty but it gets the job done. I suppose you have a machine with a conda environment in which you've installed all the packages you need. I will refer to this as ENV1. You will have to locate this environment's directory. It is usually found in \Anaconda3\envs. I suggest compressing the folder, but you could just use it as is. Copy the desired environment folder into your offline machine's directory for anaconda environments. This first step should get your new environment to respond to commands like conda activate.
You will notice though that software like spyder and jupyter don't work anymore (probably because of path differences). My solution to this was to clone the base environment on the offline machine into a new environment that I will refer to as ENV2. What you need to do then is copy the contents of ENV2 into those of ENV1 and replace files.
This should overwrite the files related to spyder, jupyter.. and keep your imported packages intact.
