conda env create from yml locks up

In a simple test, I tried to create a basic Conda environment, export that environment to a YAML file, and re-create the environment from the YAML file on the exact same computing instance.
No matter what I try, the re-create step hangs and then fails with no error message (it eventually times out).
This is on an AWS EC2 Linux instance.
Command sequence as follows:
conda create -n myenv python=3.10.4
conda activate myenv
conda list ## output is shown below
conda env export > newenv.yml ## yml contents are shown below...so far so good
conda deactivate
conda env create -n newenv --f newenv.yml ## this is where it hangs up/freezes
<output> Collecting package metadata (repodata.json): -
<output> Collecting package metadata (repodata.json): - Killed ## after about 5mins
I have also tried multiple variations where I remove myenv before trying to re-create it under the same name (myenv). Permissions on the anaconda3/envs folder are 775, and creating environments directly (without the YAML file) works fine. I have already updated, cleaned, and re-initialized Conda and have rebooted my instance multiple times.
Any help/ideas would be greatly appreciated. This is my first Python project.
Conda list output
(myenv) [ec2-user@ip-172-31-93-141 ~]$ conda list
# packages in environment at /home/ec2-user/anaconda3/envs/myenv:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
ca-certificates 2022.9.24 ha878542_0 conda-forge
ld_impl_linux-64 2.39 hcc3a1bd_1 conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libsqlite 3.40.0 h753d276_0 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libuuid 1.41.5 h5eee18b_0
libzlib 1.2.13 h166bdaf_4 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
openssl 1.1.1s h0b41bf4_1 conda-forge
pip 22.3.1 pyhd8ed1ab_0 conda-forge
python 3.10.4 h12debd9_0
readline 8.1.2 h0f457ee_0 conda-forge
setuptools 65.5.1 pyhd8ed1ab_0 conda-forge
sqlite 3.40.0 h4ff8645_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
tzdata 2022g h191b570_0 conda-forge
wheel 0.38.4 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
YAML file output
name: myenv
channels:
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=2_gnu
- bzip2=1.0.8=h7f98852_4
- ca-certificates=2022.9.24=ha878542_0
- ld_impl_linux-64=2.39=hcc3a1bd_1
- libffi=3.3=h58526e2_2
- libgcc-ng=12.2.0=h65d4601_19
- libgomp=12.2.0=h65d4601_19
- libsqlite=3.40.0=h753d276_0
- libstdcxx-ng=12.2.0=h46fd767_19
- libuuid=1.41.5=h5eee18b_0
- libzlib=1.2.13=h166bdaf_4
- ncurses=6.3=h27087fc_1
- openssl=1.1.1s=h0b41bf4_1
- pip=22.3.1=pyhd8ed1ab_0
- python=3.10.4=h12debd9_0
- readline=8.1.2=h0f457ee_0
- setuptools=65.5.1=pyhd8ed1ab_0
- sqlite=3.40.0=h4ff8645_0
- tk=8.6.12=h27826a3_0
- tzdata=2022g=h191b570_0
- wheel=0.38.4=pyhd8ed1ab_0
- xz=5.2.6=h166bdaf_0
- zlib=1.2.13=h166bdaf_4
prefix: /home/ec2-user/anaconda3/envs/myenv

Possibly a channel mixing issue. I see python and libuuid come from the defaults (i.e., main/anaconda) channel, whereas everything else comes from Conda Forge. If you have channel_priority: strict, then any package available in the conda-forge channel (which the YAML gives priority because it is listed first) will mask any package of the same name in the lower-priority channel (defaults).
Concretely, Conda is told to find python with a specific build (h12debd9_0) that is only available on defaults, but because conda-forge has python packages available, only those will be considered. The hanging is probably just Conda trying to "explain" why it can't find a solution.
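If you want to see the mismatch directly, you could search each channel for the exact pinned build. A quick sketch (the spec is the one from the YAML; --override-channels just restricts the search to the named channel):

conda search 'python=3.10.4=h12debd9_0' --channel conda-forge --override-channels
conda search 'python=3.10.4=h12debd9_0' --channel defaults --override-channels

The first search should come back with no match while the second finds the package, which is exactly the combination strict priority cannot resolve.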
Immediate Workaround
If this is the issue, then changing the channel priority to flexible should get it working.
conda config --set channel_priority flexible
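You can check what the setting currently is before (and after) changing it; conda can print a single config key:

conda config --show channel_priority

If it was strict, that matches the behaviour described above; after switching to flexible, retry the conda env create command.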
Better: Avoid mixing channels
If you had started from conda-forge to begin with, everything would have been fine. Conda Forge is entirely self-sufficient these days, whereas Anaconda users often need Conda Forge packages to fill in the gaps. That practice of mixing channels (which Anaconda's documentation encourages with little warning about how problematic it is) generates a substantial portion of user issues.
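As a rough sketch of what that looks like for the test in the question (assuming conda-forge provides the same Python version):

conda create -n myenv -c conda-forge python=3.10.4
conda activate myenv
conda env export > newenv.yml
conda deactivate
conda env create -n newenv -f newenv.yml

With every package coming from a single channel, the exported file should re-solve without the masking problem described above.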
Personally, I recommend a base install of Mambaforge. It puts conda-forge as the default, installs a minimum of packages in the base environment, and comes with Mamba for fast environment solving.

The problem turned out to be a memory issue (as in, lack thereof). I was trying to do this on an AWS EC2 instance that was on the t2.micro free tier. This instance only had 1GB of memory (RAM) and buried in the feedback I got from the dmesg command (run from the terminal after the command failed) I noticed an "Out of Memory" line. I increased the instance size and the problem went away. Thanks for the help!
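For anyone who lands here with the same symptom on a small instance, a rough sketch of the check that surfaced this, plus an optional swap-file stopgap if resizing is not an option (the 2 GB size is only illustrative):

# check whether the kernel's OOM killer terminated the conda process
dmesg | grep -i -E "out of memory|killed process"

# optional stopgap: add a 2 GB swap file so the solver has room to finish
sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile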

Related

Why does an error occur regarding the boundary dataset file in Basemap in Python? [duplicate]

I have the same problem as this post:
Declaring a var usable by another function using a import in a secondary script, but the answer does not work on my side.
For context: basemap and basemap-data-hires are installed, yet when using resolution = 'f' it triggers the following error:
OSError: Unable to open boundary dataset file. Only the 'crude' and 'low',
resolution datasets are installed by default.
If you are requesting an, 'intermediate', 'high' or 'full'
resolution dataset, you may need to download and install those
files separately with
conda install -c conda-forge basemap-data-hires.
Here is the conda list output:
C:\Users\AlxndrLhr>conda list
# packages in environment at C:\Users\AlxndrLhr\Anaconda3\envs\map:
#
# Name Version Build Channel
basemap 1.2.2 py39h689385a_5 conda-forge
basemap-data 1.3.2 pyhd8ed1ab_0 conda-forge
basemap-data-hires 1.3.2 pyhd8ed1ab_0 conda-forge
As you can see, basemap-data-hires is present. I tried installing it in the base environment of conda, didn't work either.
Before basemap 1.3.0, the library was packaged in conda-forge with the heavy data files split into a separate basemap-data-hires conda package (whose files were installed in the share folder).
Since basemap 1.3.0, a complete reorganisation of the basemap package has been done upstream by splitting the library into basemap, basemap-data and basemap-data-hires. These three packages are Python packages and get installed in the corresponding Python site-packages folder. This new structuring is propagated to the conda-forge packages.
Your installation is mixing the old basemap conda package (pre-1.3.0) with the new basemap-data-hires conda package (post-1.3.0). You can solve the issue by pinning versions during installation, using either the following, to install the latest basemap:
conda install "basemap>=1.3.0" "basemap-data-hires>=1.3.0"
or the following to install the pre-1.3.0 version:
conda install "basemap==1.2.2" "basemap-data-hires==1.2.2"

python and conda: what to do if a package (openssl=1.1.1b=h1de35cc_0) is not found?

I cloned a project where the environment.yml file contains, for example
- openssl=1.1.1b=h1de35cc_0
When I try to create the env I see a lot of not resolved packages
Solving environment: failed
ResolvePackageNotFound:
....
- openssl=1.1.1b=h1de35cc_0
Following another SO question, I added the 'free' entry to the channel list:
channels:
- defaults
- free
Nothing changed.
I then manually searched from the console for the 1.1.1b version:
openssl 1.1.1b h0c8e037_0 pkgs/main
openssl 1.1.1b h0c8e037_1 pkgs/main
openssl 1.1.1b he774522_0 pkgs/main
openssl 1.1.1b he774522_1 pkgs/main
There is no 1.1.1b version with build h1de35cc_0, and I don't know what this hash is.
What can I do? Can I simply replace the pinned build with he774522_1, for example?
Simply try removing the h1de35cc_0 part (everything after the version number). These build strings pin exact builds, which are often too specific to be resolved on another OS, Python version, etc.
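For example, the relevant line in the environment.yml would go from the exact build pin to a version-only pin (sketch using the package from the question):

# before: exact build pinned, only resolvable on the platform it was exported from
- openssl=1.1.1b=h1de35cc_0
# after: version-only pin, lets conda pick a build available for your platform
- openssl=1.1.1b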

Anaconda - can't install package offline after downloading it

I'm trying to install some packages on a remote machine (with GPUs) that is not connected to the internet.
(Some people have suggested I should be using Docker and I may well do that but here's one last chance to get this working).
FYI: I'm following the instructions here.
What I've done so far:
Downloaded the Anaconda3-2019.03-Linux-x86_64.sh installer and installed it on the remote machine
$ conda --version
conda 4.6.14
Then downloaded the desired package from here and moved it to the remote machine.
$ ls pkgs-for-anaconda/linux-64/*tensorflow*
pkgs-for-anaconda/linux-64/tensorflow-gpu-1.9.0-hf154084_0.tar.bz2
Set up a new channel pointing to a path on the local file system.
$ conda config --prepend channels file:///home/billtubbs/pkgs-for-anaconda
Excerpt from config to confirm this worked:
channels:
- file:///home/billtubbs/pkgs-for-anaconda
- defaults
Install the package
$ conda install pkgs-for-anaconda/linux-64/tensorflow-gpu-1.9.0-hf154084_0.tar.bz2
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Index the packages
$ conda index pkgs-for-anaconda/
Subdir: noarch: 100%|████████████████████████████████████| 2/2 [00:00<00:00, 81.80it/s]
(base) [billtubbs@localhost ~]$
Is the issue that it only looked in noarch instead of linux-64?
Try to install the package
When I use the following to create a new environment with the desired package:
$ conda create -n tf tensorflow-gpu
I get:
Collecting package metadata: done
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- tensorflow-gpu -> _tflow_190_select==0.0.1=gpu
- tensorflow-gpu -> tensorflow==1.9.0
Current channels:
- file:///home/billtubbs/pkgs-for-anaconda/linux-64
- file:///home/billtubbs/pkgs-for-anaconda/noarch
- https://repo.anaconda.com/pkgs/main/linux-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/free/linux-64
- https://repo.anaconda.com/pkgs/free/noarch
- https://repo.anaconda.com/pkgs/r/linux-64
- https://repo.anaconda.com/pkgs/r/noarch
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
What I don't understand is that it shows my channel exists. And it even seems to be looking for the right version (1.9.0). But it says it can't find it.
Just to confirm, I did the following:
$ conda search tensorflow-gpu==1.9.0
Loading channels: done
# Name Version Build Channel
tensorflow-gpu 1.9.0 hf154084_0 pkgs-for-anaconda
tensorflow-gpu 1.9.0 hf154084_0 pkgs/main
Anyone know what I am doing wrong?
UPDATE:
Here is some of the output from
$ conda list --show-channel-urls
...
sympy 1.3 py37_0 defaults
tblib 1.3.2 py37_0 defaults
tensorflow-gpu 1.9.0 hf154084_0 file:///home/billtubbs/pkgs-for-anaconda
terminado 0.8.1 py37_1 defaults
testpath 0.4.2 py37_0 defaults
I recommend that you uninstall the version of Anaconda you currently have. When I downloaded the latest version of Anaconda I ran into problems; I remember that I couldn't install, for example, tensorflow or matplotlib.
The best version for working with tensorflow or matplotlib is an Anaconda that ships Python 3.6. Try installing Anaconda3-4.4.0-Windows-x86_64 or Anaconda3-4.4.0-Linux-x86_64.sh, which were released on 2017-05-26.
Anaconda versions
Then try to install tensorflow, matplotlib, pandas and numpy, but first run
conda update conda
to update Anaconda's packages.
Then you should be able to install those packages/libraries without problems.
Best regards.
PS: I also tried installing Docker, but I ran into more problems than with plain Python and pip, which is why I think Anaconda is the best solution.

Can't install ggplot with anaconda

(I'm aware of this question Cannot install ggplot with anaconda but that is aimed at Windows, and I'm running a Linux OS)
I'm attempting to install the ggplot package in a python3 (v3.6.0) Anaconda environment:
$ conda install ggplot
Fetching package metadata .............
PackageNotFoundError: Package missing in current linux-64 channels:
- ggplot
Close matches found; did you mean one of these?
ggplot: r-ggplot2, r-gplots
If I use conda search I get:
$ conda search ggplot
Fetching package metadata .............
r-ggplot2 1.0.0 0 defaults
1.0.0 0a defaults
1.0.1 r3.2.2_0 defaults
1.0.1 r3.2.0_0 defaults
1.0.1 r3.2.1_0 defaults
1.0.1 r3.2.1_0a defaults
1.0.1 r3.2.2_0a defaults
1.0.1 r3.2.0_0a defaults
2.1.0 r3.3.1_0 defaults
2.2.0 r3.3.1_0 defaults
2.2.0 r3.3.2_0 defaults
but if I search https://anaconda.org/search for ggplot I get lots of results.
The questions: why am I not seeing those results when using conda search? What is the difference between ggplot and r-ggplot2 (the package it offers to install when I search for ggplot)?
Why am I not seeing those results when using conda search?
The difference is that conda search only searches your configured channels, whereas anaconda search (or the search on anaconda.org) includes all public channels. The name in front of the package name is the channel; for example, in xyz/ggplot, xyz is the channel.
What is the difference between ggplot and r-ggplot2
It's probably a naming convention. Anaconda has several R-based packages, and they likely use the r- prefix to separate them from regular Python packages. So if you don't plan to use it with R, you should look for a suitable candidate without the r- prefix.
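If you do find the package on anaconda.org under some channel, you can point conda at that channel explicitly, for example (conda-forge is used here purely as an illustration; check anaconda.org for the channel that actually hosts ggplot):

$ conda search -c conda-forge ggplot
$ conda install -c conda-forge ggplot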

What does "the following packages will be superseded by a higher priority channel" mean?

I am trying to install fuzzywuzzy onto my Anaconda distribution on 64-bit Linux. When I do this, it tries to switch my conda and conda-env packages to the conda-forge channel, as follows:
I search Anaconda for fuzzywuzzy by writing:
anaconda search -t fuzzywuzzy
This showed that the most up-to-date version available for Anaconda on 64-bit Linux is 0.13, provided on the conda-forge channel.
To install, within the command line, I type:
conda install -c conda-forge fuzzywuzzy=0.13.0
I get the following output:
The following packages will be downloaded:
package | build
---------------------------|-----------------
conda-env-2.6.0 | 0 1017 B conda-forge
python-levenshtein-0.12.0 | py27_0 138 KB conda-forge
conda-4.2.13 | py27_0 375 KB conda-forge
fuzzywuzzy-0.11.0 | py27_0 15 KB conda-forge
------------------------------------------------------------
Total: 528 KB
The following new packages will be INSTALLED:
fuzzywuzzy: 0.11.0-py27_0 conda-forge
python-levenshtein: 0.12.0-py27_0 conda-forge
The following packages will be SUPERCEDED by a higher-priority channel:
conda: 4.2.13-py27_0 --> 4.2.13-py27_0 conda-forge
conda-env: 2.6.0-0 --> 2.6.0-0 conda-forge
Proceed ([y]/n)?
I do not understand what this is telling me.
What does this mean? Am I right in thinking that this is changing my default package manager channels? Can this be reversed if I go ahead and install it? Is there any way to complete the installation without changing the default channel? Or is favouring the superseding channels something that I should be doing?
I don't want to change my distribution just for one module, or cause further headaches.
This issue: https://github.com/conda/conda/issues/2898 sounds like it's telling me that I should just let it happen. What should I do?
(I am using anaconda version: 4.2.13 and Python 2.7.12)
When you ask conda to install fuzzywuzzy from conda-forge, fuzzywuzzy indicates that it needs conda and conda-env. Conda detects that you already have these installed, but it also knows that these were installed from the default channel and not conda-forge.
Now, as a user you might expect 4.2.13-py27_0 in the default channel and in the conda-forge channel to be exactly the same (and they should be), but conda cannot guarantee that this is the case. The developers could very well have uploaded different packages to the defaults and conda-forge channels.
That could cause some really subtle bugs, and in order to avoid them conda prefers to install the dependencies from the same channel as the new package. That is what the message indicates: a package being replaced by the same package, but from a different channel, which you gave higher priority by using -c conda-forge.
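If you go ahead and later want to see which channel each installed package actually came from, conda can show the channel column explicitly:

conda list --show-channel-urls

Any package installed from conda-forge will show that channel next to its name.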
