When I run the following command:
conda env create -f virtual_platform_mac.yml
I get this error:
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
- pytables==3.4.2=np113py35_0
- h5py==2.7.0=np113py35_0
- anaconda==custom=py35_0
How can I solve this?
I am working on Mac OS X.
Conda v4.7 dropped a branch of the Anaconda Cloud repository called the free channel for the sake of improving solving performance. Unfortunately, this includes many older packages that never got ported to the repository branches that were retained. The requirements failing here are affected by this.
Restore free Channel Searching
Conda provides a means to restore access to this part of the repository through the restore_free_channel configuration option. You can verify that this is the issue by seeing that
conda search pytables=3.4.2[build=np113py35_0]
fails, whereas
CONDA_RESTORE_FREE_CHANNEL=1 conda search pytables=3.4.2[build=np113py35_0]
successfully finds the package, and similarly for the others.
Option 1: Permanent Setting
If you expect to frequently need older packages, then you can globally set the option and then proceed with installing:
conda config --set restore_free_channel true
conda env create -f virtual_platform_mac.yml
Option 2: Temporary Setting
As with all Conda configuration options, you can also use the corresponding environment variable to temporarily restore access just for the command:
Unix/Linux
CONDA_RESTORE_FREE_CHANNEL=1 conda env create -f virtual_platform_mac.yml
Windows
SET CONDA_RESTORE_FREE_CHANNEL=1
conda env create -f virtual_platform_mac.yml
(Yes, I realize the cognitive dissonance of using a ..._mac.yml on Windows, but Windows users need help too.)
Including Channel Manually
One can also manually include the channel as one to be searched:
conda search -c free pytables=3.4.2[build=np113py35_0]
Note that this approach only uses the free channel for this particular command; future searches or changes to the env will not search the channel.
Pro-Tip: Env-specific Settings
If you have a particular env that you always want to have access to the free channel but you don't want to set this option globally, you can instead set the configuration option only for the environment.
conda activate my_env
conda config --env --set restore_free_channel true
A similar effect can be accomplished by setting and unsetting the CONDA_RESTORE_FREE_CHANNEL variable in scripts placed in the etc/conda/activate.d and etc/conda/deactivate.d folders, respectively. See the documentation for an example.
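For instance, a minimal sketch, assuming a Unix shell (the script name free_channel.sh is arbitrary):
# $CONDA_PREFIX/etc/conda/activate.d/free_channel.sh
export CONDA_RESTORE_FREE_CHANNEL=1
# $CONDA_PREFIX/etc/conda/deactivate.d/free_channel.sh
unset CONDA_RESTORE_FREE_CHANNEL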
Another solution might be explained here. Basically, if you import an environment.yml file to a different OS (e.g., from macOS to Windows) you will get build errors.
The solution is to use the --no-builds flag, but it does not guarantee that the environment.yml will actually be compatible. Some libraries, e.g., libgfortran, are not found in the Windows channels for Anaconda (see here).
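That is, on the source OS you would export without build strings, e.g.:
conda env export --no-builds > environment.yml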
I would use
CONDA_RESTORE_FREE_CHANNEL=1 conda env create -f virtual_platform_mac.yml
to keep using outdated/older packages
Related
I have to connect to a server where my user has access to a small partition at /home/users/user_name, where I have a limited disk quota, and a bigger partition at /big_partition/users/user.
After logging into the server I start out in /home/users/user_name. From there, I do the following steps.
cd /big_partition/users/user
conda create --prefix=envs python=3.6
On the 4th line of the output it says Package plan for installation in environment /big_partition/users/user/envs:, which is OK.
I press y, and now I get the following message:
OSError: [Errno 122] Disk quota exceeded: '/home/users/user_name/.conda/envs/.pkgs/python-3.6.2-0/lib/python3.6/unittest/result.py'
Can anyone help me understand how to move the .conda folder from /home/users/user_name to /big_partition/users/user when creating this environment?
Configure Environment and Package Default Locations
I'd guess that, despite your efforts to put your environments on the large partition, there is still a default user-level package cache and that is filling up the home partition. At minimum, set up a new package cache and a default environments directory on the large partition:
# create a new pkgs_dirs (wherever, doesn't have to be hidden)
mkdir -p /big_partition/users/user/.conda/pkgs
# add it to Conda as your default
conda config --add pkgs_dirs /big_partition/users/user/.conda/pkgs
# create a new envs_dirs (again wherever)
mkdir -p /big_partition/users/user/.conda/envs
# add it to Conda as your default
conda config --add envs_dirs /big_partition/users/user/.conda/envs
Now you don't have to fuss around with using the --prefix flag any more - your named environments (conda create -n foo) will by default be created inside this directory and you can activate by name instead of directory (conda activate foo).
Transferring Previous Environments and Package Cache
Unfortunately, there's not a great way to move Conda environments across filesystems without destroying the hardlinks. Instead, you'll need to recreate your environments. Since you may or may not want to bother with this, I'm only going to outline it. I can elaborate if needed.
Archive environments. Use conda env export -n foo > foo.yaml (One per environment.)
Move package cache. Copy contents of old package cache (/home/users/user_name/.conda/envs/.pkgs/) to new package cache.
Recreate environments. Use conda env create -n foo -f foo.yaml.
Again, you could just skip this altogether. This is mainly if you want to be very thorough about transferring and not having to redownload stuff for environments you already created.
After this you can delete some of the stuff under the old ~/.conda/envs/pkgs folder.
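As a rough sketch of the whole transfer, using foo as a stand-in for each environment name:
# 1. archive each env definition
conda env export -n foo > foo.yaml
# 2. copy the old package cache into the new one configured above
cp -r /home/users/user_name/.conda/envs/.pkgs/* /big_partition/users/user/.conda/pkgs/
# 3. recreate the env (named envs now land on the big partition)
conda env create -n foo -f foo.yaml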
I found the solution. All I need to do is export CONDA_ENVS_PATH with the path where I want the .conda directory to be:
export CONDA_ENVS_PATH=.
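For example, a sketch that makes this persistent and points everything at the big partition (paths adapted from the question):
# in ~/.bashrc or similar
export CONDA_ENVS_PATH=/big_partition/users/user/.conda/envs
export CONDA_PKGS_DIRS=/big_partition/users/user/.conda/pkgs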
When I create conda environments for projects that use pytorch, the torch and torchvision packages take a long time to install due to the slow connection in my region (it sometimes takes hours).
Therefore, to start working on my projects quickly, I don't create a new env; I just use the packages in the base env. I know this will get hairy soon.
That's why I want to know if there is a way to make a newly created env inherit specific packages from the base env without re-installing them.
PS: I understand that conda leverages hard links, but I don't understand how to use that in this case. I appreciate your help.
Cloning
The simplest way to use only already installed packages in a new environment is to clone an existing environment (conda create --clone foo --name bar). Generally, I don't recommend cloning the base environment since it includes Conda and other infrastructure that is only needed in base.
At a workflow-level, it might be advantageous to consider creating some template environments which you can clone for different projects.
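For example, a sketch of that template workflow (the template name and package list are hypothetical):
# build the template once
conda create -n pytorch-template pytorch torchvision
# clone it cheaply (hardlinked) for each new project
conda create --clone pytorch-template --name project1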
YAML Definitions
However, OP mentions only wanting specific packages. I would still create a new env for this, but start with an existing env using an exported YAML.
conda env export -n foo > bar.yaml
Edit bar.yaml to remove whatever packages you don't want (again, if foo == base, remove conda), then create the new environment with
conda env create -f bar.yaml --name bar
This will ensure that exactly the packages from the previous environment are used.
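For instance, a trimmed bar.yaml might look something like this (package names and builds purely illustrative):
name: bar
channels:
  - defaults
dependencies:
  - python=3.8.5=h85f3143_2
  - numpy=1.19.1=py38h3b9f5b6_0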
Overall, if you use cloning and recreate environments from YAML files (which include build specifications), Conda will minimize downloading as well as physical disk usage.
The typical command to export an Anaconda environment to a YAML file is:
conda env export --name my_env > myenv.yml
However, one huge issue is the readability of this file, as it includes hard specifications for all of the libraries and all of their dependencies. Is there a way for Anaconda to export the smallest subset of specifications that would subsume these dependencies and make the YAML more readable? For example, if all you installed in a conda environment was pip and scipy, is there a way for Anaconda to realize that the file should just read:
name: my_env
channels:
- defaults
dependencies:
- scipy=1.3.1
- pip=19.2.3
That way, the anaconda environment will still have the exact same specification, if not an improved one (e.g., if an upstream bug is fixed), and anyone who looks at the yml file will understand what is "required" to run the code - in the sense that if they didn't want to or couldn't use the conda environment, they would know which packages they needed to install.
Options from the Conda CLI
This is sort of what the --from-history flag is for, but not exactly. Instead of including exact build info for each package, it will include only what are called explicit specifications, i.e., the specifications that a user has explicitly requested via the CLI (e.g., conda install scipy=1.3.1). Have a try:
conda env export --from-history --name my_env > myenv.yml
This will only include versions if the user originally included versions during installation. Hence, creating a new environment is very likely not going to use the exact same versions and builds. On the other hand, if the user originally included additional constraints beyond version and build they will also be included (e.g., a channel specification conda install conda-forge::numpy will lead to conda-forge::numpy).
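For illustration, if the only explicit installs had been conda install python=3.8 and conda install conda-forge::numpy, the exported file would look roughly like:
name: my_env
channels:
  - defaults
dependencies:
  - python=3.8
  - conda-forge::numpy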
Another option worth noting is the --no-builds flag, which will export every package in the YAML, but leave out the build specifiers. These flags work in a mutually exclusive manner.
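For example:
conda env export --no-builds --name my_env > myenv.yml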
conda-minify
If this is not sufficient, then there is an external utility called conda-minify that offers some functionality to export an environment that is minimized based on a dependency tree rather than through the user's explicit specifications.
Have a look at pipreqs. It creates a requirements.txt file based only on the imports that you are explicitly doing inside your project (and it even has a --no-pin option to ignore version numbers). You can later use this file to create a conda environment via conda install --file requirements.txt.
However, if you're aiming for an environment.yml file, you have to create it manually. But that's just copy and paste from the clean requirements.txt; you only have to separate conda from "pip-only" installs.
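A sketch of that workflow (the project path is hypothetical):
# generate requirements.txt from the project's actual imports, unpinned
pipreqs --no-pin /path/to/project
# install those packages into the active conda env
conda install --file /path/to/project/requirements.txt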
How do I get snakemake to activate a conda environment that already exists in my environment list?
I know you can use the --use-conda flag with a .yaml environment file, but that seems to generate a new environment, which is just annoying when the environment already exists. Any help with this would be much appreciated.
I have tried using:
conda:
    "path/to/some/yamlFile"
but it just returns "command not found" errors for packages in the environment.
It is possible. It is essentially an environment configuration issue: you need to call bash in the Snakemake rules and load a conda-initialized bash profile there. The example below works for me:
rule test_conda:
    shell:
        """
        bash -c '
            . $HOME/.bashrc  # if not loaded automatically
            conda activate base
            conda deactivate'
        """
In addition, the --use-conda flag is not necessary in this case at all.
Follow-up to the answer by liagy: since Snakemake runs shell commands in strict bash mode (with the set -u flag), conda activate or conda deactivate may throw an unbound-variable error related to the conda environment. I ended up editing the parent conda.sh file, which contains the activate function. The edit temporarily disables the u flag while activating or deactivating conda environments but preserves bash strict mode for the rest of the Snakemake workflow.
Here is what I did:
Edit (after backing up the original file) ~/anaconda3/etc/profile.d/conda.sh and add the following at the start of the __conda_activate() block:
__conda_activate() {
    if [[ "$-" =~ .*u.* ]]; then
        local bash_set_u
        bash_set_u="on"
        ## temporarily disable the u flag:
        ## allow unbound variables from the conda env
        ## during activate/deactivate commands in a
        ## subshell, else the script will fail under set -u
        ## https://github.com/conda/conda/issues/8186#issuecomment-532874667
        set +u
    else
        local bash_set_u
        bash_set_u="off"
    fi
    # ... rest of code from the original script
Also add the following code at the end of the __conda_activate() block, to re-enable bash strict mode only if it was enabled prior to the conda activate/deactivate call:
    ## re-enable set -u if it was enabled prior to
    ## the conda activate/deactivate operation
    if [[ "${bash_set_u}" == "on" ]]; then
        set -u
    fi
}
Then, in the Snakefile, you can use shell commands like the following to manage existing conda environments:
shell: """
    ## check current set flags
    echo "$-"
    ## switch conda env
    source ~/anaconda3/etc/profile.d/conda.sh && conda activate r-reticulate
    ## confirm that the set flags are the same as prior to the conda activate command
    echo "$-"
    ## switch conda env again
    conda activate dev
    echo "$-"
    which R
    samtools --version
    ## revert to previous: r-reticulate
    conda deactivate
"""
You do not need to apply the above patch to the __conda_deactivate function, as it sources the activate script.
PS: Editing ~/anaconda3/etc/profile.d/conda.sh is not ideal. Always back up the original and edited files; updating conda will most likely overwrite these changes.
Prefer Snakemake-managed environments
This is an old answer, from before Snakemake added a feature to allow user-managed environments. Other answers cover the newer functionality. Nevertheless, I am retaining this answer here because I believe it adds perspective to the problem and explains why use of this feature is still discouraged. Specifically, from the documentation:
"Importantly, one should be aware that this can hamper reproducibility, because the workflow then relies on this environment to be present in exactly the same way on any new system where the workflow is executed. Essentially, you will have to take care of this manually in such a case. Therefore, the approach using environment definition files described above is highly recommended and preferred." [emphasis in the original]
(Mostly) Original Answer
This wasn't previously possible and I'd still argue it was mostly a good thing. Snakemake having sole ownership of the environment helps improve reproducibility by requiring one to update the YAML instead of directly manipulating the environment with conda (install|update|remove). Note that such a practice of updating a YAML and recreating is a Conda best practice when mixing in Pip, and it definitely doesn't hurt to adopt it generally.
Conda does a lot of hardlinking, so I wouldn't sweat the duplication too much - it's mostly superficial. Moreover, if you create a YAML from the existing environment you wish to use (conda env export > env.yaml) and give that to Snakemake, then all the identical packages that you already have downloaded will be used in the environment that Snakemake creates.
If space really is such a tight resource, you can simply not use Snakemake's --use-conda flag and instead activate your named envs as part of the shell command or script you provide. I would be very careful not to manipulate those envs or at least be very diligent about tracking changes made to them. Perhaps, consider tracking the output of conda env export > env.yaml under version control and putting that YAML as an input file in the Snakemake rules that activate the environment. This way Snakemake can detect that the environment has mutated and the downstream files are potentially outdated.
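A sketch of that last idea (rule, file, and env names are hypothetical):
rule analyze:
    input:
        data="table.txt",
        env="env.yaml"  # exported via: conda env export > env.yaml
    output:
        "results/out.txt"
    shell:
        """
        source ~/anaconda3/etc/profile.d/conda.sh && conda activate my_env
        python scripts/analyze.py {input.data} > {output}
        """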
This question is still trending on Google, so an update:
Since snakemake=6.14.0 (2022-01-26) using an existing, named conda environment is a supported feature.
You simply put the name of the environment, some-env-name, into the rule's conda directive (instead of the path to a .yaml file) and use snakemake --use-conda:
rule NAME:
    input:
        "table.txt"
    output:
        "plots/myplot.pdf"
    conda:
        "some-env-name"
    script:
        "scripts/plot-stuff.R"
Documentation: https://snakemake.readthedocs.io/en/stable/snakefiles/deployment.html#using-already-existing-named-conda-environments
Note: It is recommended to use this feature sparingly and to prefer specifying an environment.yaml file instead, to improve reproducibility.
I would like to create a conda environment on a machine that has no network connection. What I've done so far is:
On a machine that is connected to the internet:
conda create -n python3 python=3.4 anaconda
Conda archived all of the relevant packages into \Anaconda\pkgs. I put these into a separate folder and moved it to the machine with no network connection. The folder has the path PATHTO\Anaconda_py3\win-64
I tried
conda create -n python=3.4 anaconda --offline --channel PATHTO\Anaconda_py3
This gives the error message
Fetching package metadata:
Error: No packages found in current win-64 channels matching: anaconda
You can search for this package on Binstar with
binstar search -t conda anaconda
What am I doing wrong? How do I tell conda to create an environment based on the packages in this directory?
You could try cloning root, which is the base env:
conda create -n yourenvname --clone root
Short answer: copy the whole environment from another machine with the same OS.
Why
Dependency. A package depends on other packages. When you install a package online, the package manager conda analyzes the package dependencies and installs all the required packages for you.
The dependencies are especially heavy for anaconda, because anaconda is a metapackage that depends on 160+ other packages.
Metapackages are packages that contain no actual software and simply depend on other packages being installed.
It's totally absurd to download all these dependencies one by one and install them on the offline machine.
Detail Solution
Get conda installed on another machine with the same OS. Install the packages you need in an isolated virtual environment.
# create a env named "myvenv", name it whatever you want
# and install the package into this env
conda create -n myvenv --copy anaconda
--copy is used to "Install all packages using copies instead of hard- or soft-linking."
Find where the environments are stored with
conda info
The first value under "envs directories" is the location. Go there and pack the whole sub-folder named "myvenv" (the env name from the previous step) into an archive.
Copy the archive to your offline machine, check "envs directories" via conda info there, and extract the environment from the archive into that env directory.
Done.
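Concretely, the packing and unpacking steps might look like this (paths illustrative):
# on the online machine, from the envs directory reported by conda info
tar -czf myvenv.tar.gz myvenv
# ...copy myvenv.tar.gz to the offline machine, then there:
tar -xzf myvenv.tar.gz -C /path/to/offline/envs/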
In addition to copying the pkgs folder, you need to index it, so that conda knows how to find the dependencies. See this ticket for more details and this script for an example of indexing the pkgs folder.
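Indexing is done with the conda index command (provided by conda-build); for example, for the folder from the question:
# index the copied packages so conda can resolve dependencies
conda index PATHTO\Anaconda_py3\win-64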
Using --unknown, as @asmeurer suggests, will only work if the package you're trying to install has no dependencies; otherwise you will get a "Could not find some dependencies" error.
Cloning is another option, but this will give you all root packages, which may not be what you want.
Many of the answers here are not strictly about the "when offline" part; they address the rest of OP's question, which is not reflected in the question title.
If you came here because you need offline env creation on top of an existing Anaconda install you can try:
conda create --offline --name $NAME
You can find the --offline flag documented here
Have you tried without the --offline?
conda create -n anaconda python=3.4 --channel PATHTO\Anaconda_py3
This works for me if I am not connected to the Internet if I do have anaconda already on the machine but in another location. If you are connected to the Internet when you run this command you will probably get an error associated with not finding something on Binstar.
I'm not sure whether this contradicts the other answers or is the same but I followed the instructions in the conda documentation and set up a channel on the local file system.
Then it's a simple matter of moving new package files to the local directory, running conda index on the channel sub-folder (which should have a name like linux-64).
I also set the Anaconda config setting offline to True as described here but not sure if that was essential.
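Putting the pieces together, a rough sketch on Linux (paths hypothetical):
# lay out the channel and index it
mkdir -p /opt/local-channel/linux-64
cp *.tar.bz2 /opt/local-channel/linux-64/
conda index /opt/local-channel/linux-64
# point conda at the local channel and stay offline
conda config --add channels file:///opt/local-channel
conda config --set offline true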
Hope that helps.
The pkgs directory is not a channel. The flag you are looking for is --unknown, which causes conda to include files in the pkgs directory even if they aren't found in one of the channels.
Here's what worked for me in Linux -
(a) Create a blank environment - Just create an empty directory under $CONDA_HOME/envs. Verify with - conda info --envs.
(b) Activate the new env - source activate <env_name>
(c) Download the appropriate package (*.bz2) from https://anaconda.org/anaconda/repo on a machine with internet connection and move it to the isolated host.
(d) Install using the local package file - conda install <package_file>. For example - conda install python-3.6.4-hc3d631a_1.tar.bz2, where python-3.6.4-hc3d631a_1.tar.bz2 exists in the current dir.
That's it. You can verify by the usual means (python -V, conda list -n <env_name>). All related packages can be installed in the same manner.
I found the simplest method to be as follows:
Run 'conda create --name <name> <package>' with no special switches
Copy the URL of the first package it tried (unsuccessfully) to download
Use the URL on a connected machine to fetch the tar.bz2
Copy the tar.bz2 to the offline machine's /home/user/anaconda3/pkgs
Deploy the tar.bz2 in place
Delete the now unneeded tar.bz2
Repeat until the 'conda create' command succeeds
Here's a solution that may help. It's not very pretty, but it gets the job done. Suppose you have a machine with a conda environment in which you've installed all the packages you need; I will refer to this as ENV1. Locate this environment's directory, usually found under \Anaconda3\envs. I suggest compressing the folder, but you could just use it as is. Copy the desired environment folder into your offline machine's directory for anaconda environments. This first step should get your new environment to respond to commands like conda activate.
You will notice, though, that software like spyder and jupyter no longer works (probably because of path differences). My solution was to clone the base environment on the offline machine into a new environment that I will refer to as ENV2. Then copy the contents of ENV2 into ENV1, replacing existing files.
This should overwrite the files related to spyder and jupyter while keeping your imported packages intact.