conda before pip install - python

Whenever I want to install a Python package, I find pip install <package> instructions on most sites and in README.md documentation on GitHub, etc.
A colleague recently told me to try conda install <package> first and, if that fails (because the package is not available), to use pip install afterwards.
Is the initial conda installation step really necessary or beneficial, or can I just do the pip install directly?

It depends on your use case. Conda does more than pip does. Conda was developed after pip because the Conda folks did not think pip did enough. It aims to handle library dependencies outside the realm of Python, such as C libraries and R packages, as well as the Python packages themselves. This matters because such packages do not have the standard setup.py in their source code, so Python will not install them into the site-packages directory, which is useful for easy importing.
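For instance, conda can pull in non-Python dependencies directly; the package names below are only common examples, and availability depends on the channels you have configured:
conda install r-base   # an R distribution, not a Python package
conda install libpng   # a C library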
It is important to note that you can't use pip and conda interchangeably, as conda has a different packaging format.
To answer your question succinctly: whichever one you use, stick with it for the entirety of whatever you are doing. Don't use conda "until it doesn't work for something" and then switch to pip for installations that conda can't handle. That is a very good way to get into trouble you can't explain.
My advice: if you are sticking to python and python only, use pip. If you are considering outside libraries that have value to your project, conda is a good option.

Related

Cannot find package on Anaconda navigator after installing it using pip

I followed the instructions here: Can't find package on Anaconda Navigator. What to do next?
I clicked Open terminal from environment on Anaconda navigator, and then used "pip3 install lmfit" in the terminal. But after installing the lmfit package using pip3, I still cannot find it in the conda list. What should I do?
The Problem
At the time of this question, Conda builds of pip had only just started including a pip3 entrypoint,[1] so pip3 very likely refers to a non-Conda version of Python, and that is where the package was installed. Try checking which pip3 to find out where it went.
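For example (the path shown is only a hypothetical result):
which pip3
# /usr/bin/pip3  <- a system Python, not the Conda environment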
Recommendation
Conda First
Generally, it is preferable to use Conda to install packages in Conda environments, and in this case the package is available via the Conda Forge channel:
conda install -c conda-forge lmfit
Contrary to M. Newville's answer, this recommendation to prefer Conda packages is not about benefiting Conda developers, but instead a rule of thumb to help users avoid creating unstable or unreproducible environments. More info about the risks of mixing pip install and conda install can be found in the post "Using Pip in a Conda Environment".
Nevertheless, the point that not all packages (and specifically lmfit) are found in the default repository, which complicates installation by requiring third-party channels, is a fair one. In fact, because third parties are free to use different build stacks, there are known problems with mixing packages built by Anaconda and those from Conda Forge. However, these issues tend to be rare and limited to compiled code. Additionally, adding trusted channels to one's configuration and setting channel priorities largely resolves the issue.
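For example, a common configuration (assuming one trusts Conda Forge) is:
conda config --add channels conda-forge
conda config --set channel_priority strict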
As for the risks of using third-party channels: arbitrary Anaconda Cloud user channels are indeed risky, and one should only source packages from channels one trusts (just like anything else one installs). Conda Forge in particular is well reputed, and all its feedstocks are freely available on GitHub. Moreover, many Python package builds on Conda Forge are simply wrappers around the PyPI build of the package.
PyPI Last
Sometimes it isn't possible to avoid using PyPI. When one must resort to installing from PyPI, it is better practice to use the pip entrypoint from an activated environment, rather than pip3, since only some Conda builds of pip include pip3. For example,
conda activate my_env
pip install lmfit
Again, following the recommendations in "Using Pip in a Conda Environment", one should operate under the assumption that any subsequent calls to conda (install|upgrade|remove) in the environment could have undefined behavior.
PyPI Only
For the sake of completeness, I will note that a stable way of using Conda that is consistent with the recommendations is to limit Conda to the role of environment creation and use pip for all package installation.
This strategy is perhaps the least burdensome for the Python-only user, who doesn't want to deal with things like finding the Conda-equivalent package name or searching non-default channels. However, its applicability seems limited to Python-only environments, since other libraries may still need to be installed with conda install.
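A minimal sketch of this workflow, using my_env as a placeholder environment name and a Python version chosen only for illustration:
conda create -n my_env python=3.10 pip   # Conda only creates the environment
conda activate my_env
pip install lmfit                        # everything else comes from PyPI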
[1]: Conda Forge and Anaconda started consistently including pip3 entrypoints for the pip module after version 20.2.
Installing a pure Python package such as lmfit with the correct version of pip, using pip install lmfit, should be fine.
Conda first is recommended to make life easier for the conda maintainers and packagers, not for the user. FWIW, I maintain both kinds of packages,
and there is no reason to recommend conda install lmfit over pip install lmfit.
In fact, lmfit is not in the default Anaconda repository, so installing it requires going to a third-party conda channel such as conda-forge. That adds complexity and risk to your conda environment.
Really, pip install lmfit should be fine.

Python Conda environment confusion (as an example: a problem with gym)

Trying to use the gym open-ai package (and some others), I ran into some problems whose structure I don't really understand.
As an example:
I tried to install gym in three different conda environments.
One way to do this is
pip install gym
Another is:
git clone https://github.com/openai/gym.git
cd gym
pip install -e .
A third would be:
pip3 install gym
In some environments I would use Python 2, in others maybe Python 3.7.
Even more possibilities for installation would be:
sudo pip install gym
(and even more permutations would be possible if we took into account whether we activate an environment or not).
Things get even more complicated for me because I installed conda using a non-administrator user account in Ubuntu, so that conda (or rather the user itself) could not install any files in the /usr directory.
I began to test some of these possibilities and cases, because the installation of some libraries (e.g. keras-rl) seemed to need access to common resources (the /usr/ directory), even if installed in a local conda environment. But if so: would the installations in different conda environments interact?
And what if one installed a package as a local user in a conda environment and afterwards installed it with pip or pip3 as administrator? Would the admin installation overwrite (or overrule or interact with) the environment installation (or parts of it)?
While experimenting with the different possibilities (or rather: while trying to find an installation that did not produce any errors like "gym not found" or "attribute error ...") errors like the following occurred:
Found existing installation: gym 0.15.4
Can't uninstall 'gym'. No files were found to uninstall.
after executing:
sudo pip3 install gym --force
So on this basis my questions specifically would be:
(1) Is there a best practice for establishing good conda environments (which don't tend to interact, especially if some packages need sudo privileges)?
And (2) if some environments do interact with general (sudo) resources, how can this be resolved in a way that distinct environments can be tested and established alongside each other?
Annotation:
there was a similar question:
conda environment pip is trying to install dependencies globally
some time ago, but the advice not to use sudo seems difficult to follow if some packages require access to global resources.
So I would like to ask for a solution to these interactions a bit more specifically.
You should not use sudo to install something in a conda environment. Most likely the pip command being used does not stem from the actual (activated?) environment; instead the system-wide pip is used, which is why you would need sudo to install to a system-owned prefix.
You can check whether you are using the desired pip by invoking which pip. The path should point into your environment. If it does not, you should install pip inside your conda env.
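For example (my_env is just a placeholder environment name):
which pip
# if the path does not point into your environment, install pip there:
conda install -n my_env pip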
I had the same problem before. I activated the conda environment and installed with pip3 locally, since conda does not have support for the package. Warning: this can possibly wreck some packages.
The conda environment should ALWAYS be activated before installing anything, or else it ends up as a global installation.
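For example, with gym from the question and my_env as a placeholder name:
conda activate my_env   # activate first
pip install gym         # now installs into my_env, not globally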
Install a new conda environment without using sudo. If it asks for sudo, you need to remove the whole thing and clean up a bit. It's very easy to forget, but NEVER use sudo!
You can try installing a newer version of Python 3.x (Python 2 will be history very soon anyway, they said). pip = Python 2, pip3 = Python 3. And to answer one of your other questions, about whether installing globally will mess things up: not outside conda.
Google PyCharm and conda; you can use that editor to set up different types of Python environments. It's actually a darn good editor for Python coding. The rest is more Linux-related, when we talk about cleaning up PATHs etc.
I have nothing better to add! Hope you get it right.

How to move from pip to anaconda

I'm a beginner trying to play around with machine learning. I downloaded python, and used pip to download libraries like TensorFlow, Pandas, Numpy, etc.
Now, I find that Anaconda is a better package manager to use for machine learning. I'm not sure what I'm supposed to do. Do I have to download all the libraries with Anaconda (which I tried to do with Pandas, and it said the library is already downloaded)?
Could you guys explain to me how I can move from using pip to using anaconda? I really don't understand environments, and this package manager stuff, so please help me!
In principle there is no need to change your package manager. Simply switch to conda install the next time you would do pip install. Think of it like this: do you have to re-download everything when switching from Internet Explorer to Firefox? Some things probably work a little differently between conda and pip, but for a basic beginner these differences should be negligible.
You could freeze your pip packages and re-install them inside a conda environment to have everything (e.g. package dependencies) neatly managed by Anaconda, which is imho good practice. Pip packages will be available in every subsequently created conda environment, so if you want to use different packages in different environments, better re-install them using conda.
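A rough sketch of that freeze-and-reinstall approach (ml_env and the Python version are placeholders):
pip freeze > requirements.txt     # record the packages installed with pip
conda create -n ml_env python=3.10
conda activate ml_env
pip install -r requirements.txt   # re-install them inside the conda environment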
There is some non-trivial difference between conda and pip, mentioned here and here.
Best practice is to use different environments for different purposes. In a conda environment, download or re-download all required packages for that environment. Also, always install a conda package only after you are done with pip install. When using both, be sure not to use "--user" with pip, as conda has user-privilege issues connecting to packages installed by pip.
You can check this link for more information

Use pip or dnf to install python packages in Fedora?

When I install some python packages in Fedora, there're two ways:
use dnf install python-package
use pip install package
I notice that even if I use dnf update to bring my Fedora fully up to date,
when I use pip it still tells me something like
pip is an old version, please use pip update
I guess dnf package management is different from python-pip package management.
So which one is more recommended for installing python packages?
Quoted from Gentoo Wiki:
It is important to understand that packages installed using pip will not be tracked by Portage. This is the case for installing any package through means other than the emerge command. Possible conflicts can be created when installing a Python package that is available in the Portage tree, then installing the same package using pip.
Decide which package manager will work best for the use case: either use emerge or pip for Python packages, but not both. Sometimes a certain Python package will not be available in the Portage tree; in these cases the only option is to use pip. Be wise and make good choices!
This is true for almost any modern package manager. If you are using packages or certain package versions that only exist in pip, use it, but don't try to install the same package from dnf. Doing so will not only cause file collisions but will also (most probably) break the package manager's knowledge of the system, which usually leads to major package-management issues.
Another solution would be using pip in user mode, without root permissions, which installs everything into your home directory.
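For example (requests is only an illustrative package name):
sudo dnf install python3-requests   # system-wide, tracked by dnf/RPM
pip install --user requests         # per-user, tracked by pip, lands under ~/.local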
So again, it's both okay to use pip or dnf, but just don't mix these two package managers together.

Difference between 'python setup.py install' and 'pip install'

I have an external package I want to install into my python virtualenv from a tar file.
What is the best way to install the package?
I've discovered 2 ways that can do it:
Extract the tar file, then run python setup.py install inside of the extracted directory.
pip install packagename.tar.gz from example # 7 in https://pip.pypa.io/en/stable/reference/pip_install/#examples
I'm wondering if there is any difference between doing it these two ways.
On the surface, both do the same thing: doing either python setup.py install or pip install <PACKAGE-NAME> will install your python package for you, with a minimum amount of fuss.
However, using pip offers some additional advantages that make it much nicer to use.
pip will automatically download all dependencies for a package for you. In contrast, if you use setup.py, you often have to manually search out and download dependencies, which is tedious and can become frustrating.
pip keeps track of various metadata that lets you easily uninstall and update packages with a single command: pip uninstall <PACKAGE-NAME> and pip install --upgrade <PACKAGE-NAME>. In contrast, if you install a package using setup.py, you have to manually delete and maintain a package by hand if you want to get rid of it, which could be potentially error-prone.
You no longer have to manually download your files. If you use setup.py, you have to visit the library's website, figure out where to download it, extract the file, run setup.py... In contrast, pip will automatically search the Python Package Index (PyPI) to see if the package exists there, and will automatically download, extract, and install the package for you. With a few exceptions, almost every genuinely useful Python library can be found on PyPI.
pip will let you easily install wheels, which is the new standard for Python distribution. More info about wheels.
pip offers additional benefits that integrate well with using virtualenv, which is a program that lets you run multiple projects that require conflicting libraries and Python versions on your computer. More info.
pip is bundled by default with Python as of Python 2.7.9 on the Python 2.x series, and as of Python 3.4.0 on the Python 3.x series, making it even easier to use.
So basically, use pip. It only offers improvements over using python setup.py install.
If you're using an older version of Python, can't upgrade, and don't have pip installed, you can find more information about installing pip at the following links:
Official instructions on installing pip for all operating systems
Instructions on installing pip on Windows (including solutions to common problems)
Instructions on installing pip for Mac OS X
pip, by itself, doesn't really require a tutorial. 90% of the time, the only command you really need is pip install <PACKAGE-NAME>. That said, if you're interested in learning more about the details of what exactly you can do with pip, see:
Quickstart guide
Official documentation.
It is also commonly recommended that you use pip and virtualenv together. If you're a beginner to Python, I personally think it'd be fine to start off with just using pip and installing packages globally, but eventually I do think you should transition to using virtualenv as you tackle more serious projects.
If you'd like to learn more about using pip and virtualenv together, see:
Why you should be using pip and virtualenv
A non-magical introduction to Pip and Virtualenv for Python beginners
Virtual Environments
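As a quick illustration of the pip + virtualenv workflow described above (environment and package names are only examples):
pip install virtualenv
virtualenv my-project-env
source my-project-env/bin/activate   # on Windows: my-project-env\Scripts\activate
pip install requests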
python setup.py install is the analog of make install: it’s a limited way to compile and copy files to destination directories. This doesn’t mean that it’s the best way to really install software on your system.
pip is a package manager, which can install, upgrade, list and uninstall packages, like familiar package managers including: dpkg, apt, yum, urpmi, ports etc. Under the hood, it will run python setup.py install, but with specific options to control how and where things end up installed.
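For the tarball from the question, the two routes compare roughly like this (the extracted directory name will vary by package):
tar xzf packagename.tar.gz
cd packagename                     # directory name depends on the package
python setup.py install
pip install packagename.tar.gz     # pip handles extraction and metadata itself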
In summary: use pip.
The question is about the preferred method to install a local tarball containing a python package, NOT about the advantage of uploading a package to an indexing service like PyPI.
As far as I know, some software distributors do not upload their packages to PyPI, instead asking developers to download the package from their website and install it.
python setup.py install
This can work, but it is not recommended. It's not necessary to unpack the tarball and go into it to run the setup.py file.
pip install ../path/to/packagename.tar.gz
This is the designed and preferred way. It is concise and aligns with PyPI-style packages.
More information about pip install can be found here: https://pip.readthedocs.io/en/stable/reference/pip_install/
