TL;DR: I want the ability to pip install whichever version of a package will require the fewest changes to my currently installed versions.
Long version:
I support a pretty complex base container for geoscience research.
My users sometimes fork the environment to add a new package for their specific use-case. The pip install of the new package often causes a cascade of upgrades, which inevitably breaks something. pip install --no-deps is no help -- it just means the new package won't work, because it's missing dependencies.
What I always seem to end up doing is manually walking the version history of the new package, looking for a version that will cause minimum disturbance to my existing packages.
Is there any way of automatically finding this "minimal-disturbance" historical version of the new package?
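One approach that can sometimes automate this (a sketch under assumptions, not a guaranteed fix) is to hand pip a constraints file pinning everything already installed, so the resolver must choose a version of the new package that fits. Here newpackage is just a placeholder name:
pip freeze > constraints.txt
pip install newpackage -c constraints.txt
With the constraints in place, pip's resolver will not upgrade any pinned package; it will backtrack to an older release of newpackage that is compatible with them, or report a conflict if no such release exists.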
Related
Background
The official documentation and this blog post on the same website recommend installing as many requirements as possible with conda, then using pip for the rest. Apparently this is because conda is unaware of any changes pip makes to dependencies and therefore cannot resolve dependencies correctly afterwards.
Question
Now, if one uses pip exclusively and installs nothing with conda, it seems reasonable to expect that conda does not need to be aware of any changes made by pip, since conda effectively becomes a mere tool for isolating environments and managing versions. However, this goes against the official recommendation, as one will NOT install as many requirements as possible with conda.
So the question remains: is there any known drawback to exclusively using pip in a conda environment?
Similar Topics
A similar topic has been touched on here, but it does not cover the case of exclusively using pip in a conda environment. I have also looked at these:
Specific reasons to favor pip vs. conda when installing Python packages
What is the difference between pip and conda?
Using Pip to install packages to Anaconda Environment
Not sure one can give a comprehensive answer on this, but some of the major things that come to mind are:
Lack of deep support for non-Python dependency resolution. While more wheels that bundle non-Python resources have become available over time, it is nowhere near the coverage that Conda provides by being a general package manager rather than Python-specific. For anyone doing interoperable computing (e.g., reticulate), I would expect Conda to be favored.
Optimized libraries. Sort of related to the first point, but the Anaconda team has made an effort to build optimized versions of packages (e.g., MKL for numpy). Not sure if the equivalent is available through PyPI.[1]
Wasteful redundancy across environments. Conda uses hardlinking when packages and environments are on the same volume, and supports softlinking for spanning across volumes. This helps to minimize replicating any packages that are installed in multiple environments.
Complicates exporting. When exporting (conda env export), Conda doesn't pick up all pip-installed packages - only the ones that come from PyPI. That is, it'll miss things installed from GitHub, etc. If one did go the pip-only route, I think a more reliable export strategy would be to use pip freeze > requirements.txt, and then make a YAML like
channels:
- defaults
dependencies:
- python=3.8 # specify the version
- pip
- pip:
- -r requirements.txt
with which to recreate the environment.
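For concreteness, a sketch of the full round trip under that strategy (the file names environment.yml and requirements.txt, and the environment name myenv, are assumptions):
pip freeze > requirements.txt
# save the YAML above as environment.yml, then:
conda env create -n myenv -f environment.yml
Conda creates the environment, installs Python and pip from its own channels, and then hands requirements.txt off to pip.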
All that said, I could easily imagine that none of these matter to some people (most of them are conveniences), especially those who tend to work purely in Python. In such cases, however, I don't see why one would not simply forgo Conda altogether and use a Python-specific virtual environment manager.
[1] Someone please correct me if you know otherwise.
I tried to install the Twilio module:
sudo -H pip install twilio
And I got this error:
Installing collected packages: pyOpenSSL
Found existing installation: pyOpenSSL 0.13.1
Cannot uninstall 'pyOpenSSL'. It is a distutils installed project and
thus we cannot accurately determine which files belong to it which
would lead to only a partial uninstall.
Anyone know how to uninstall pyOpenSSL?
This error means that this package's metadata doesn't include a list of files that belong to it. Most probably, you have installed this package via your OS' package manager, so you need to use that rather than pip to update or remove it, too.
See e.g. Upgrading to pip 10: It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall. · Issue #5247 · pypa/pip for one such example where the package was installed with apt.
Alternatively, depending on your needs, it may be more productive not to use your system Python and/or its global environment, but to create a private Python installation and/or environment. There are many options here, including virtualenv, venv, pyenv, pipenv, and installing Python from source into
/usr/local or $HOME/.local (or /opt/<whatever>).
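For example, a minimal sketch of the environment route using the built-in venv module (the directory name is just a placeholder):
python3 -m venv ~/twilio-env
~/twilio-env/bin/pip install twilio
This leaves the system-managed pyOpenSSL untouched; the new environment gets its own copies of the dependencies.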
Finally, I must comment on the often-suggested (e.g. at pip 10 and apt: how to avoid "Cannot uninstall X" errors for distutils packages) --ignore-installed pip switch.
It may work (potentially for a long enough time for your business needs), but may just as well break things on the system in unpredictable ways. One thing is sure: it makes the system's configuration unsupported and thus unmaintainable -- because you have essentially overwritten files from your distribution with some other arbitrary stuff. E.g.:
If the new files are binary incompatible with the old ones, other software from the distribution built to link against the originals will segfault or otherwise malfunction.
If the new version has a different set of files, you'll end up with a mix of old and new files which may break dependent software as well as the package itself.
If you change the package with your OS' package manager later, it will overwrite pip-installed files, with similarly unpredictable results.
If there are things like configuration files, differences in them between the versions can also lead to all sorts of breakage.
I had the same error and was able to resolve it using the following steps:
pip install --ignore-installed pyOpenSSL
This will install the latest version of the package. Then, if you try the install again:
pip install twilio
It will work.
Generally, for similar errors, use this format:
pip install --ignore-installed [package name]==[package version]
I just had this error and the only way I was able to resolve it was by manually deleting the offending directory from site-packages.
After doing this you may need to reinstall the packages with --force-reinstall.
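If you take this route, a hedged sketch of the steps (the package and paths are examples only; the exact metadata directory name depends on how the package was originally installed):
pip show pyOpenSSL    # the Location: field tells you which site-packages directory to look in
rm -rf <site-packages>/OpenSSL <site-packages>/pyOpenSSL-*.egg-info
pip install --force-reinstall pyOpenSSL
Note that pyOpenSSL's importable directory is named OpenSSL, not pyOpenSSL.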
Reading the above comments, I realized that package A had been installed with conda, and the new package B that I was trying to install with pip was causing problems. I was lucky that package B was also available through conda, so installing package B with conda solved the problem.
In my case, I was installing a package from internal git using the following command:
python -m pip install package.whl --force
I was doing this because I didn't want to explicitly uninstall the previous version; I just wanted to replace it with a newer one. But --force also reinstalls all the dependencies, and I was getting the error in one of those packages. Removing --force fixed the problem.
I want to add that using --ignore-installed also worked for me; removing --force essentially does the same thing in my case.
When I install Python packages in Fedora, there are two ways:
use dnf install python-package
use pip install package
I notice that even after I use dnf update to bring my Fedora installation fully up to date, when I use pip it still tells me something like:
pip is an old version, please update pip
I guess dnf package management is different from python-pip package management.
So which one is recommended for installing Python packages?
Quoted from Gentoo Wiki:
It is important to understand that packages installed using pip will not be tracked by Portage. This is the case for installing any package through means other than the emerge command. Possible conflicts can be created when installing a Python package that is available in the Portage tree, then installing the same package using pip.
Decide which package manager will work best for the use case: either use emerge or pip for Python packages, but not both. Sometimes a certain Python package will not be available in the Portage tree; in these cases, the only option is to use pip. Be wise and make good choices!
This is true for almost any package manager nowadays. If you are using packages or specific package versions that only exist on PyPI, install them with pip, but don't try to install the same package from dnf as well. Doing so will not only cause file collisions but will also (most likely) break the package manager's knowledge of the system, which usually leads to major package management issues.
Another solution would be to use pip in user mode, without root permissions, which installs everything into your home directory.
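A sketch of that user-mode approach (somepackage is a placeholder):
pip install --user somepackage
This installs under ~/.local (libraries in its site-packages, scripts in ~/.local/bin), so nothing managed by dnf gets touched.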
So again, it's fine to use either pip or dnf, but don't mix the two package managers together.
I have an external package I want to install into my python virtualenv from a tar file.
What is the best way to install the package?
I've discovered 2 ways that can do it:
Extract the tar file, then run python setup.py install inside of the extracted directory.
pip install packagename.tar.gz from example # 7 in https://pip.pypa.io/en/stable/reference/pip_install/#examples
Is there any difference between doing it in these two ways?
On the surface, both do the same thing: doing either python setup.py install or pip install <PACKAGE-NAME> will install your python package for you, with a minimum amount of fuss.
However, using pip offers some additional advantages that make it much nicer to use.
pip will automatically download all dependencies for a package for you. In contrast, if you use setup.py, you often have to manually search out and download dependencies, which is tedious and can become frustrating.
pip keeps track of various metadata that lets you easily uninstall and update packages with a single command: pip uninstall <PACKAGE-NAME> and pip install --upgrade <PACKAGE-NAME>. In contrast, if you install a package using setup.py, you have to manually delete and maintain a package by hand if you want to get rid of it, which could be potentially error-prone.
You no longer have to manually download your files. If you use setup.py, you have to visit the library's website, figure out where to download it, extract the file, run setup.py... In contrast, pip will automatically search the Python Package Index (PyPI) to see if the package exists there, and will automatically download, extract, and install the package for you. With a few exceptions, almost every single genuinely useful Python library can be found on PyPI.
pip will let you easily install wheels, which is the new standard of Python distribution. More info about wheels.
pip offers additional benefits that integrate well with using virtualenv, which is a program that lets you run multiple projects that require conflicting libraries and Python versions on your computer. More info.
pip is bundled by default with Python as of Python 2.7.9 on the Python 2.x series, and as of Python 3.4.0 on the Python 3.x series, making it even easier to use.
So basically, use pip. It only offers improvements over using python setup.py install.
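To make the comparison concrete, here is a sketch of the two workflows from the question (packagename is a placeholder):
# setup.py route: download, extract, install, and resolve dependencies yourself
tar xzf packagename.tar.gz
cd packagename/
python setup.py install
# pip route: one command; dependencies are resolved and metadata is recorded for later uninstall
pip install ./packagename.tar.gz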
If you're using an older version of Python, can't upgrade, and don't have pip installed, you can find more information about installing pip at the following links:
Official instructions on installing pip for all operating systems
Instructions on installing pip on Windows (including solutions to common problems)
Instructions on installing pip for Mac OS X
pip, by itself, doesn't really require a tutorial. 90% of the time, the only command you really need is pip install <PACKAGE-NAME>. That said, if you're interested in learning more about the details of what exactly you can do with pip, see:
Quickstart guide
Official documentation.
It is also commonly recommended that you use pip and virtualenv together. If you're a beginner to Python, I personally think it'd be fine to start off with just using pip and installing packages globally, but eventually I do think you should transition to using virtualenv as you tackle more serious projects.
If you'd like to learn more about using pip and virtualenv together, see:
Why you should be using pip and virtualenv
A non-magical introduction to Pip and Virtualenv for Python beginners
Virtual Environments
python setup.py install is the analog of make install: it’s a limited way to compile and copy files to destination directories. This doesn’t mean that it’s the best way to really install software on your system.
pip is a package manager, which can install, upgrade, list, and uninstall packages, like familiar package managers such as dpkg, apt, yum, urpmi, ports, etc. Under the hood, it will run python setup.py install, but with specific options to control how and where things end up installed.
In summary: use pip.
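As a quick illustration of that package-manager role (packagename is a placeholder), pip gives you the usual lifecycle commands that python setup.py install does not:
pip list                          # list installed packages
pip install --upgrade packagename # upgrade to a newer version
pip uninstall packagename         # cleanly remove, using pip's recorded metadata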
The question is about the preferred method for installing a local tarball containing a Python package, NOT about the advantages of uploading a package to an indexing service like PyPI.
As far as I know, some software distributors do not upload their packages to PyPI, instead asking developers to download the package from their website and install it.
python setup.py install
This can work, but it is not recommended. It's not necessary to unpack the tarball and go into it just to run the setup.py file.
pip install ../path/to/packagename.tar.gz
This is the designed and preferred way. It is concise and aligns with PyPI-style packages.
More information about pip install can be found here: https://pip.readthedocs.io/en/stable/reference/pip_install/