pip dependencies of dependencies when installed from conda environment.yaml - python

I am trying to create a conda environment.yml file for users of a project. One dependency is not distributed by conda, but available with pip+github. I assume based on this example that I can do this:
dependencies
- pip
- regular_conda_dep
- depend_of_blah
# Install in editable mode.
- -e git+https://github.com/ourgroup/blah.git
But what happens to the dependencies of blah (depend_of_blah)? Will the pip install after the conda so that as long as I am careful to include it gets installed before blah? Later on will blah update cleanly, getting as much as possible from conda?
Or do I need to add --no-deps to the pip line? Is it implied that this is done magically? I don't see a lot of advanced examples that deal with this, but in my experience is that it is a real danger in pip/conda mixes not to use --no-deps, with pip essentially hijacking anything that hasn't been explicitly handled first.

Conda parses the YAML, and partitions the dependency specifications into a Conda set and a Pip set (code). Only the Conda set is used to solve and create the initial environment.1 Once the environment has been successfully created, Conda writes all the Pip specifications to a temporary requirements.txt (code), and then using the python in the environment runs the command:
python -m pip install -U -r <requirements.txt>
So, to explicitly answer the question: If all the dependencies of blah are installed via Conda and they have sufficient versions installed, then Pip should only install blah and leave the Conda versions untouched. This is because the default value for --upgrade-strategy is only-if-needed.
Otherwise, if the Conda dependencies section does not include all the dependencies of blah, then Pip will install the necessary dependencies.
[1]: Technically, if there are create_default_packages set in the Conda configuration, Conda will first create the environment with just these packages, then subsequently install the dependencies specified in the YAML file.

You can tell pip to ignore dependencies via environment variables
PIP_NO_DEPS=1 conda env create -f myenv.yaml
From the documentation:
pip’s command line options can also be set with environment variables
using the format PIP_<UPPER_LONG_NAME> . Dashes (-) have to be
replaced with underscores (_).

Related

ML and Django: using Conda and Pip depending on what I am doing... no? [duplicate]

conda 4.2.13
MacOSX 10.12.1
I am trying to install packages from pip to a fresh environment (virtual) created using anaconda. In the Anaconda docs it says this is perfectly fine. It is done the same way as for virtualenv.
Activate the environment where you want to put the program, then pip install a program...
I created an empty environment in Ananconda like this:
conda create -n shrink_venv
Activate it:
source activate shrink_venv
I then can see in the terminal that I am working in my env (shrink_venv). Problem is coming up, when I try to install a package using pip:
(shrink_venv): pip install Pillow
Requirement already satisfied (use --upgrade to upgrade): Pillow in /Library/Python/2.7/site-packages
So I can see it thinks the requirement is satisfied from the system-wide package. So it seems the environment is not working correctly, definitely not like it said in the docs. Am I doing something wrong here?
Just a note, I know you can use conda install for the packages, but I have had an issue with Pillow from anaconda, so I wanted to get it from pip, and since the docs say that is fine.
Output of which -a pip:
/usr/local/bin/pip
/Users/my_user/anaconda/bin/pip
** UPDATE **
I see this is pretty common issue. What I have found is that the conda env doesn't play well with the PYTHONPATH. The system seems to always look in the PYTHONPATH locations even when you're using a conda environment. Now, I always run unset PYTHONPATH when using a conda environment, and it works much better. I'm on a mac.
For others who run into this situation, I found this to be the most straightforward solution:
Run conda create -n venv_name and conda activate venv_name, where venv_name is the name of your virtual environment.
Run conda install pip. This will install pip to your venv directory.
Find your anaconda directory, and find the actual venv folder. It should be somewhere like /anaconda/envs/venv_name/.
Install new packages by doing /anaconda/envs/venv_name/bin/pip install package_name.
This should now successfully install packages using that virtual environment's pip!
All you have to do is open Anaconda Prompt and type
pip install package-name
It will automatically install to the anaconda environment without having to use
conda install package-name
Since some of the conda packages may lack support overtime it is required to install using pip and this is one way to do it
If you have pip installed in anaconda you can run the following in jupyter notebook or in your python shell that is linked to anaconda
pip.main(['install', 'package-name'])
Check your version of pip with pip.__version__. If it is version 10.x.x or above, then install your python package with this line of code
subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--upgrade', 'package-name'])
In your jupyter notebook, you can install python packages through pip in a cell this way;
!pip install package-name
or you could use your python version associated with anaconda
!python3.6 -m pip install package-name
I solved this problem the following way:
If you have a non-conda pip as your default pip but conda python is your default python (as below)
>which -a pip
/home/<user>/.local/bin/pip
/home/<user>/.conda/envs/newenv/bin/pip
/usr/bin/pip
>which -a python
/home/<user>/.conda/envs/newenv/bin/python
/usr/bin/python
Then instead of just calling
pip install <package>, you can use the module flag -m with python so that it uses the anaconda python for the installation
python -m pip install <package>
This installs the package to the anaconda library directory rather than to the library directory associated with (the non-anaconda) pip
EDIT:
The reason this works is as follows:
the command pip references a specific pip file/shortcut (which -a pip tells you which one). Similarly, the command python references a specific python file (which -a python tells you which one). For one reason or another these two commands can become unsynchronized, so that your 'default' pip is in a different folder than your default python, and therefore is associated with a different version of python.
In contrast, the python -m pip construction does not use the shortcut that the pip command points to. Instead, it asks python to find its version of pip and use that version to install a package.
This is what worked for me (Refer to image linked)
Open Anaconda
Select Environments in the left hand pane below home
Just to the right of where you selected and below the "search environments" bar, you should see base(root). Click on it
A triangle pointing right should appear, click on it an select "open terminal"
Use the regular pip install command here. There is no need to point to an environment/ path
For future reference, you can find the folder your packages are downloading to if you happen to have a requirement already satisfied. You can see it if you scroll up in the terminal. It should read something like: requirement already satisfied and then the path
[]
If you didn't add pip when creating conda environment
conda create -n env_name pip
and also didn't install pip inside the environment
source activate env_name
conda install pip
then the only pip you got is the system pip, which will install packages globally.
Bus as you can see in this issue, even if you did either of the procedure mentioned above, the behavior of pip inside conda environment is still kind of undefined.
To ensure using the pip installed inside conda environment without having to type the lengthy /home/username/anaconda/envs/env_name/bin/pip, I wrote a shell function:
# Using pip to install packages inside conda environments.
cpip() {
ERROR_MSG="Not in a conda environment."
ERROR_MSG="$ERROR_MSG\nUse \`source activate ENV\`"
ERROR_MSG="$ERROR_MSG to enter a conda environment."
[ -z "$CONDA_DEFAULT_ENV" ] && echo "$ERROR_MSG" && return 1
ERROR_MSG='Pip not installed in current conda environment.'
ERROR_MSG="$ERROR_MSG\nUse \`conda install pip\`"
ERROR_MSG="$ERROR_MSG to install pip in current conda environment."
[ -e "$CONDA_PREFIX/bin/pip" ] || (echo "$ERROR_MSG" && return 2)
PIP="$CONDA_PREFIX/bin/pip"
"$PIP" "$#"
}
Hope this is helpful to you.
python -m pip install Pillow
Will use pip of current Python activated with
source activate shrink_venv
For those wishing to install a small number of packages in conda with pip then using,
sudo $(which pip) install <instert_package_name>
worked for me.
Explainaton
It seems, for me anyway, that which pip is very reliable for finding the conda env pip path to where you are. However, when using sudo, this seems to redirect paths or otherwise break this.
Using the $(which pip) executes this independently of the sudo or any of the commands and is akin to running /home/<username>/(mini)conda(3)/envs/<env_name>/pip in Linux. This is because $() is run separately and the text output added to the outer command.
All above answers are mainly based on use of virtualenv. I just have fresh installation of anaconda3 and don't have any virtualenv installed in it. So, I have found a better alternative to it without wondering about creating virtualenv.
If you have many pip and python version installed in linux, then first run below command to list all installed pip paths.
whereis pip
You will get something like this as output.
pip: /usr/bin/pip /home/prabhakar/anaconda3/bin/pip /usr/share/man/man1/pip.1.gz
Copy the path of pip which you want to use to install your package and paste it after sudo replacing /home/prabhakar/anaconda3/bin/pip in below command.
sudo /home/prabhakar/anaconda3/bin/pip install <package-name>
This worked pretty well for me. If you have any problem installing, please comment.
if you're using windows OS open Anaconda Prompt and type activate yourenvname
And if you're using mac or Linux OS open Terminal and type source activate yourenvname
yourenvname here is your desired environment in which you want to install pip package
after typing above command you must see that your environment name is changed from base to your typed environment yourenvname in console output (which means you're now in your desired environment context)
Then all you need to do is normal pip install command e.g pip install yourpackage
By doing so, the pip package will be installed in your Conda environment
I see a lot of good answers here but still wanted to share mine that worked for me especially if you are switching from pip-era to conda-era. By following this, you can install any packages using both conda and pip.
Background
PIP - Python package manager only
Conda - Both package and environment manager for many languages including Python
Install Pip by default every time you create a new conda environment
# this installs pip for your newly created environment
conda create -n my_new_env pip
# activate your new conda environment
conda activate my_new_env
# now you can install any packages using both conda and pip
conda install package_name
#or
pip install package_name
This gives you the flexibility to install any packages in conda environment even if they are not available in conda (e.g. wordcloud)
conda activate my_new_env
# will not work as wordcloud is not available in conda
conda install wordcloud
# works fine
pip install wordcloud
I was facing a problem in installing a non conda package on anaconda, I followed the most liked answer here and it didn't go well (maybe because my anaconda is in F directory and env created was in C and bin folder was not created, I have no idea but it didn't work).
According to anaconda pip is already installed ( which is found using the command "conda list" on anaconda prompt), but pip packages were not getting installed so here is what I did, I installed pip again and then pip installed the package.
conda install pip
pip install see
see is a non-conda package.
Depends on how did you configure your PATH environmental variable.
When your shell resolves the call to pip, which is the first bin it will find?
(test)$ whereis pip
pip: /home/borja/anaconda3/envs/test/bin/pip /home/borja/anaconda3/bin/pip
Make sure the bin folder from your anaconda installation is before /usr/lib (depending on how you did install pip). So an example:
(test) borja#xxxx:~$ pip install djangorestframework
....
Successfully installed asgiref-3.2.3 django-3.0.3 djangorestframework-3.11.0 pytz-2019.3 sqlparse-0.3.1
(test) borja#xxxx:~$ conda list | grep django
django 3.0.3 pypi_0 pypi
djangorestframework 3.11.0 pypi_0 pypi
We can see the djangorestframework was installed in my test environment but if I check my base:
(base) borja#xxxx:~$ conda list | grep django
It is empty.
Personally I like to handle all my PATH configuration using .pam_environment, here an example:
(base) borja#xxxx:~$ cat .pam_environment
PATH DEFAULT=/home/#{PAM_USER}/anaconda3/bin:${PATH}
One extra commet. The way how you install pip might create issues:
You should use: conda install pip --> new packages installed with pip will be added to conda list.
You shodul NOT use: sudo apt install python3-pip --> new packages will not be added to conda list (so are not managed by conda) but you will still be able to use them (chance of conflict).
Well I tried all the above methods. None worked for me because of an issue with the proxy settings within the corporate environment. Luckily I could open the pypi website from the browser. In the end, the following worked for me:
Activate your environment
Download the .whl package manually from
https://pypi.org/simple/<package_name>/
Navigate to the folder where you have downloaded the .whl from the command line with your environment activated
perform:
pip install package_name_whatever.whl
If you ONLY want to have a conda installation. Just remove all of the other python paths from your PATH variable.
Leaving only:
C:\ProgramData\Anaconda3
C:\ProgramData\Anaconda3\Scripts
C:\ProgramData\Anaconda3\Library\bin
This allows you to just use pip install * and it will install straight into your conda installation.
I know the original question was about conda under MacOS. But I would like to share the experience I've had on Ubuntu 20.04.
In my case, the issue was due to an alias defined in ~/.bashrc: alias pip='/usr/bin/pip3'. That alias was taking precedence on everything else.
So for testing purposes I've removed the alias running unalias pip command. Then the corresponding pip of the active conda environment has been executed properly.
The same issue was applicable to python command.
Given the information described in this Anaconda blog post, I think the best practice would be to create an environment file so that your conda environments can be created predictably.
I tried a few of the answers posted here without success and I didn't feel like messing around with python paths etc. Instead, I added an environment.yml file similar to this:
name: your-environment-name
channels:
- defaults
dependencies:
- python=3.9.12
- requests=2.28.1
- pandas=1.4.4
- pip=21.2.4
- pip:
- python-dotenv==0.19.2
This guarantees that you install all conda dependencies first, then install pip in the conda environment and use it to install dependencies that are unavailable through conda. This is predictable, reusable, and follows the advice described in the blog post.
You then create a new conda environment using the file with this command:
conda env create -f environment.yml
Uninstall the duplicated python installation. Just keep anaconda and create an env with the desired python version as specified here: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-python.html. Then your python and pip versions will change as you switch between envs.
I've looked at this answer and many other answers for hours today and couldn't figure this out with 30 years programming experience.
I ran:
conda create -n myenv python=3.9
conda activate myenv
and could not use pip. However, in another environment such as myenv2, myenv3, myenv4 it worked.
I was obtaining the dreaded urllib3 httpsconnection error.
So thought it has to be a missing urllib3 error or something else. It turns out that it was much more sinister than that. Unfortunately it works in other environments and for me I thought that it was related to the fact I'm using Debian on Windows 10 with WSL2. The fix was simple:
rm -rf $HOME/.cache
The pip cache was mangled from a previous install of the same environment. Probably due to the fact I had run an update on conda base and done a distribution upgrade. Because I'm wanting to run a production system with apache2 using a WSGI environment with flask, I want to always have the same conda instance name. So this was a must fix!

Best practice to manage dependencies between conda and pip

I'm developing a Python library, which depends on multiple packages. I'm struggling to find the most straightforward of managing all those dependencies with the following constraints :
Some of those dependencies are only available as conda packages (technically the source is available but the build process is not something I want to get into)
Other dependencies are only available via pip
I need to install my own library in editable or developer mode
I regularly need to keep the dependencies up-to-date
My current setup for the initial install :
Create a new conda environment
Install the conda-only dependencies with conda install ...
Install my library with pip install -e .
At this point, some packages were installed and are now managed by conda, others by pip. When I want to update my environment, I need to:
Update the conda part of the environment with conda update --all
Update the pip part of the environment by hand
My problem is that this is unstable : when I update all conda packages, it ensures the consistency of the packages it manages. However, I can't guarantee that the environment as a whole stays consistent, and I just realized that I was missing some updates because I forgot to check for updates in the pip part of the environment.
What's the best way to do this ? I've thought of :
Using conda's pip interoperability feature : this seems to work, but I've had some dubious results, probably because of my use of extras_require
Since pip can see the conda packages, the initial install is consistent, which means I can simply reinstall everything when I want to update. This works but is not exactly elegant.
The recommendation in the official documentation for managing a Conda environment that also requires PyPI-sourced or pip-installed local packages is to define all dependencies (both Conda and Pip) in a YAML file. Something like:
env.yaml
name: my_env
channels:
- defaults
dependencies:
- python=3.8
- numpy
- pip
- pip:
- some_pypi_only_pkg
- -e path/to/a/local/pkg
The workflow for updating in such an environment is to update the YAML file (which I would recommend to keep under version control) and then either create a new environment or use
conda env update -f env.yaml
Personally, I would tend to create new envs, rather than mutate (update) an existing one, and use minimal constraints (i.e., >=version) in the YAML. When creating a new env, it should automatically pull the latest consistent packages. Plus, one can keep the previous instances of the env around in case a regression is need during the development lifecycle.

Do these commands do the same action?

Basically I would like to know if those 2 snippets do the same thing :
conda install -n myEnv myPackage
VS
conda activate myEnv
pip install myPackage
Or in a different way, does a pip install when a conda environment is activated equal doing a conda install on myEnv ?
EDIT : I thought it was obvious but => more precisely, does the second snippet only install the package on the environment or on the overall system ?
PS : Asking because there's a package available with pip but not with conda and I want it to only be installed on myEnv
The Anaconda docs make it clear that if you use conda as your virtual environment manager, you should stick to conda install to install new packages as far as possible:
Unfortunately, issues can arise when conda and pip are used together
to create an environment, especially when the tools are used
back-to-back multiple times, establishing a state that can be hard to
reproduce. … Running conda after pip has the potential to
overwrite and potentially break packages installed via pip. Similarly,
pip may upgrade or remove a package which a conda-installed package
requires.
If you can't get all the packages you need from a conda channel, they say this, which is good advice even if you don't use pip:
If there is an expectation to install software using pip along-side
conda packages it is a good practice to do this installation into a
purpose-built conda environment to protect other environments from any
modifications that pip might make.
Finally the same document notes:
Use conda environments for isolation
create a conda environment to isolate any changes pip makes
environments take up little space thanks to hard links
care should be taken to avoid running pip in the “root” environment
Provided you activate the correct conda environment first, the pip install command(s) should use that environment's pip and install only into that environment.
Yes and no.
pip downloads and installs the package from PyPI whereas conda does the same from Anaconda repositories.
There are packages in PyPI not present in Anaconda, and the other way around.
For managing the environment I would choose one way or the other, since with pip you can freeze into a requirements.txt (pip freeze > requirements.txt) and conda you can either export the whole environment (conda env export) or the list of packages (conda list --export > requirements.txt). However if you try to use a conda-generated file from pip, it will most probably fail.

Does conda env support development dependencies?

Does conda allow you to install a dependency into an environment as a development dependency?
I'm thinking of something like how bower does this with --save-dev
AFAICT, no, it does not. This repo represents work around options that might be useful elsewhere:
https://github.com/dazza-codes/conda_container
In short, it supplements a conda install with subsequent pip installs from a requirements.txt and/or a requirements.dev file. Since there can be inconsistencies in conda vs. pip packages (like different name variants etc.), there are use cases for having a combination of conda and pip. Also conda can support a pip array in an environment.yml file but the version specs for conda vs. pip packages are not compatible. Liberal use of pip check is recommended for any combination of packages from different packaging systems.

Create conda environment from file without default packages

I'm trying to create conda environment from a file environment.yml.
conda env create -f environment.yml
This works but I want to avoid installing the default packages.
I found flag --no-default-packages but this applies only to conda create and this command doesn't accept the environment.yml file.
Is there a way how to use environment.yml and NOT install default packages?
EDIT:
My ultimate goal is to create environment which could be packaged (or the site-packages of the environment) as lambda for AWS. But it seems conda is installing way too much with every package.
E.g.:
bash-4.2# conda create --name test
bash-4.2# source activate test
(test) bash-4.2# conda install networkx
Fetching package metadata .........
Solving package specifications: .
Package plan for installation in environment /root/miniconda3/envs/test:
The following NEW packages will be INSTALLED:
certifi: 2016.2.28-py36_0
decorator: 4.1.2-py36_0
networkx: 1.11-py36_0
openssl: 1.0.2l-0
pip: 9.0.1-py36_1
python: 3.6.2-0
readline: 6.2-2
setuptools: 36.4.0-py36_1
sqlite: 3.13.0-0
tk: 8.5.18-0
wheel: 0.29.0-py36_0
xz: 5.2.3-0
zlib: 1.2.11-0
Why is this command installing Python, pip and other packages? Are these real dependencies of networkx?
On the other hand if I do pip install -t . networkx it installs just the networkx just as expected.
Is there a way how to use environment.yml and NOT install default packages?
Only the conda create command uses the create_default_packages. The conda env create already ignores the create_default_packages configuration by design.
For example, I just checked adding git to default packages, then created an environment from a YAML with wget as only dependency, and it did not install git.
Why is this command installing Python, pip and other packages? Are these real dependencies of networkx?
Yes, those are the actual dependencies. They may be dependencies of dependencies (and so on), but they are what is needed to install and run networkx according to the package builders.
...if I do pip install -t . networkx it installs just the networkx
The comparison to pip install is misleading. The pip install CLI command itself is an entry point defined by the pip module. An equivalent command would be:
python -m pip install networkx
which makes explicit the fact that one cannot install networkx without already having python and pip.
Keep in mind that Conda was designed from the outset to provide broader support for non-Python dependencies and thus facilitate more independence from the hosts system-level libraries. Hence, you should expect there to be more dependencies, especially ones that would never appear in a system Python using PyPI only.
These packages are part of the Anaconda metapackage which defines the core Anaconda dependencies.

Categories