I've recently been trying to use best practices when installing packages:
1. Only installing global packages when it makes sense.
2. Using virtual environments 99% of the time.
3. Uninstalling packages whenever they are no longer needed.
I find this works well for me. However, #3 is a pain when uninstalling packages that have a ton of dependencies, like tensorflow. In comes pip-autoremove to the rescue.
The issue with pip-autoremove is that it removes everything. Here's a scenario:
I pip install matplotlib because I need to show some plots.
I then realize that maybe seaborn could fit my needs a little better, so I pip install seaborn.
I realize that I was wrong and matplotlib will actually fit my needs just fine so I pip-autoremove seaborn.
Uh, oh! That removed seaborn AND matplotlib!
Now, I have to pip install matplotlib again before I can continue development.
Is there a better solution out there to this specific problem? Or, is this more of me needing to change my philosophy on package management?
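One partial workaround is to check a package's reverse dependencies yourself before autoremoving it. Below is a minimal sketch using only the standard library's importlib.metadata; the crude requirement-string parsing is illustrative, and a real tool would parse requirements properly:

```python
import re
from collections import defaultdict
from importlib.metadata import distributions

def reverse_dependencies():
    """Map each dependency (lowercased name) to the set of installed
    packages that declare it as a requirement."""
    rdeps = defaultdict(set)
    for dist in distributions():
        name = dist.metadata["Name"]
        if not name:
            continue
        for req in dist.requires or []:
            # Crude parse: the dependency name is the leading run of
            # letters, digits, dots, underscores, and hyphens.
            match = re.match(r"[A-Za-z0-9._-]+", req)
            if match:
                rdeps[match.group(0).lower()].add(name)
    return rdeps

rdeps = reverse_dependencies()
# Before autoremoving seaborn, see what still depends on matplotlib:
print(sorted(rdeps.get("matplotlib", set())))
```

If matplotlib has any dependent other than seaborn, or you installed it directly yourself, it is not safe to let an autoremove tool take it out.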
I usually recommend using Poetry to handle your virtual environments. Poetry creates a single .venv per project and records all your dependencies in two files that can be committed to git: pyproject.toml and poetry.lock.
When you use Poetry to remove a dependency (seaborn) that shares dependencies with another installed package (matplotlib), Poetry removes only the packages that seaborn alone pulled in, and rolls matplotlib's dependencies back to their previously installed versions.
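For example, a pyproject.toml for the scenario above might contain something like the following (the version constraints here are just placeholders):

```toml
# Hypothetical [tool.poetry] section after running
# `poetry add matplotlib` and `poetry add seaborn`
[tool.poetry.dependencies]
python = "^3.8"
matplotlib = "^3.5"
seaborn = "^0.11"
```

Because matplotlib is recorded as a direct dependency in its own right, `poetry remove seaborn` deletes only seaborn and whatever nothing else requires; matplotlib and its dependency tree stay in the lock file.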
Related
I'm testing poetry and I was wondering whether it is possible to install prebuilt packages from conda-forge, such as cartopy, without relying on conda (so keeping a 100% poetry process). I googled this a bit, but the only way I found is to install poetry within a conda venv using pip, install from conda-forge using conda, and then tweak the poetry files to make poetry aware of the conda venv so that the TOML is written properly.
Packages like cartopy are a pain to install from anything but a prebuilt version. If possible, I'd switch my conda stack to a poetry stack if something like poetry add [?conda-forge?] cartopy worked.
Thanks.
Not currently possible. Conda is a generic package manager, not just a Python package manager. Furthermore, there is no dedicated metadata in Conda packages to indicate whether or not they are Python packages, which I think would be a prerequisite for Poetry to determine whether a Conda package is even valid for installation. Hence, what the OP requests cannot be a thing, or at least it would be a major undertaking to make it one.
However, others have requested similar features, so someone hopeful for such functionality could subscribe to notifications on those, or follow the Feature Roadmap.
Background
The official documentation, and this blog post on the same website, recommend installing as many requirements as possible with conda, then using pip. Apparently this is because conda is unaware of any changes pip makes to the dependencies and therefore cannot resolve dependencies correctly.
Question
Now, if one uses pip exclusively and installs nothing with conda, it seems reasonable to expect that conda does not need to be aware of any changes made by pip, since conda effectively becomes a mere tool to isolate dependencies and manage versions. However, this goes against the official recommendation, as one will NOT install as many requirements as possible with conda.
So the question remains: is there any known drawback from exclusively using pip in a conda environment?
Similar Topics
A similar topic has been touched on a bit here, but it does not cover the case of exclusively using pip in a conda environment. I have also been through these:
Specific reasons to favor pip vs. conda when installing Python packages
What is the difference between pip and conda?
Using Pip to install packages to Anaconda Environment
Not sure one can give a comprehensive answer on this, but some of the major things that come to mind are:
Lack of deep support for non-Python dependency resolution. While more wheels that bundle non-Python resources have become available over time, it is nowhere near the coverage that Conda provides by being a general package manager rather than Python-specific. For anyone doing interoperable computing (e.g., reticulate), I would expect Conda to be favored.
Optimized libraries. Sort of related to the first point, but the Anaconda team has made an effort to build optimized versions of packages (e.g., MKL for numpy). Not sure if the equivalent is available through PyPI.[1]
Wasteful redundancy across environments. Conda uses hardlinking when packages and environments are on the same volume, and supports softlinking for spanning across volumes. This helps to minimize replicating any packages that are installed in multiple environments.
Complicates exporting. When exporting (conda env export), Conda doesn't pick up all pip-installed packages, only the ones that come from PyPI. That is, it will miss things installed from GitHub, etc. If one did go the pip-only route, I think a more reliable export strategy would be to use pip freeze > requirements.txt, and then make a YAML like
channels:
  - defaults
dependencies:
  - python=3.8  # specify the version
  - pip
  - pip:
    - -r requirements.txt
with which to recreate the environment.
All that said, I could easily imagine that none of these matter to some people (most of them are conveniences), especially those who tend to work purely in Python. In such cases, however, I don't see why one would not simply forgo Conda altogether and use a Python-specific virtual environment manager.
[1] Someone please correct me if you know otherwise.
I'm a beginner trying to play around with machine learning. I downloaded python, and used pip to download libraries like TensorFlow, Pandas, Numpy, etc.
Now, I find that Anaconda is a better package manager to use for machine learning. I'm not sure what I'm supposed to do. Do I have to download all the libraries with Anaconda (which I tried to do with Pandas, and it said the library is already downloaded)?
Could you guys explain to me how I can move from using pip to using anaconda? I really don't understand environments, and this package manager stuff, so please help me!
In principle there is no need to change your package manager. Simply run conda install the next time you would run pip install. Think of it like this: do you have to re-download everything when switching from Internet Explorer to Firefox? Granted, some things work a little differently between conda and pip, but for a beginner these differences should be negligible.
You could freeze your pip packages and re-install them inside a conda environment so that everything (e.g. package dependencies) is neatly managed by Anaconda, which is imho good practice. Pip packages will be available in every subsequently created conda environment, so if you want to use different packages in different environments, it is better to re-install them using conda.
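The freeze-and-reinstall step can be sketched in a few lines. This is just an illustration that rewrites pip's `==` pins into conda's `=` match specs; packages that only exist on PyPI would still need pip:

```python
def pip_pins_to_conda_specs(freeze_output):
    """Convert `pip freeze` lines (name==version) into conda-style
    match specs (name=version), skipping anything that isn't a pin."""
    specs = []
    for line in freeze_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip comments, blanks, and editable/VCS installs
        name, version = line.split("==", 1)
        specs.append(f"{name}={version}")
    return specs

# Feed the resulting specs to `conda install` inside the target environment.
print(pip_pins_to_conda_specs("pandas==1.3.5\nnumpy==1.21.6"))
# → ['pandas=1.3.5', 'numpy=1.21.6']
```

Anything the converter skips (editable installs, VCS URLs, names conda's channels don't carry) is exactly the set you would fall back to pip for.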
There are some non-trivial differences between conda and pip, mentioned here and here.
Best practice is to use different environments for different purposes. In a conda environment, download or re-download all the required packages for that environment. Also, always install a conda package only after you are done with pip install. When using both in the same environment, be sure not to use "--user" with pip, as conda has privilege issues connecting to packages installed by pip.
You can check this link for more information
I installed pyswarm package using pip (pip install pyswarm).
The problem is that this upgrades my numpy to version 1.14.xx which is something that I really don't want.
Is there any way to install a Python package without letting it manipulate other already-installed packages?
Actually, there is not much you can do here, because pyswarm depends on that specific version of numpy.
One solution is to use virtualenv to create separate Python environments for your projects, so that library versions won't conflict between projects. I recommend pipenv, which combines pip and virtualenv; it is very easy to use and has powerful features.
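If you want to stay in a single environment, another option is to stop pip from touching numpy at all. pip's `--no-deps` flag installs a package without its dependencies, and pip also supports a constraints file (`-c`) that pins versions without installing anything itself. The version below is an assumption; substitute whatever you currently have installed:

```
# constraints.txt -- pin numpy so resolving pyswarm's
# dependencies cannot upgrade it
numpy==1.13.3
```

Installing with `pip install -c constraints.txt pyswarm` keeps numpy where it is; if the pin genuinely conflicts with what pyswarm declares, modern pip will report the conflict instead of silently upgrading. `pip install --no-deps pyswarm` is the blunter alternative when you are sure the dependencies are already satisfied.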
I'm using the anaconda distribution of python, and the spyder IDE. Installing mayavi, via conda install mayavi, breaks spyder by downgrading numpy 1.10.4 -> 1.9.3 as seen via conda list --revisions. I can 'fix' this problem by manually upgrading numpy again, but I suspect there will be issues with Mayavi.
My question(s): Is there a better way to integrate Mayavi and spyder in anaconda? And, more generally, is there a recommended protocol for managing package dependencies? If installing mayavi hadn't broken the very next thing I used (spyder), it could have been quite difficult to track the source of this error. Actually, I thought package management was the value proposition of, say, the anaconda distribution...
(Related but different question arises here.)
I had the same issue and was using the same combination of tools.
The solution is to use conda environments. Environments are independent 'spaces' where you can install particular combinations of packages independent of the 'main' set of packages that are there somewhere else. Detailed article here
The workflow would basically involve this:
Open Anaconda Prompt and set up a new conda environment for Mayavi, e.g. called 'mayavi_environment':
conda create -n mayavi_environment python=<PYTHONVERSION>
where <PYTHONVERSION> is 2.7, 3.4, or whichever version you would like to create the environment with.
and once it's been created type :
activate mayavi_environment
Once this is done, the necessary package dependencies need to be installed. I too had issues with spyder, and this was sorted out by completely uninstalling it and installing it afresh in the environment. Here is a bunch of solutions for running spyder from the created environment.
Installing mayavi is a little bit complicated. It uses VTK, numpy==1.15.3, and the traits library, which can't be compiled without VC2015. However, you can find unofficial .whl files here:
https://www.lfd.uci.edu/~gohlke/pythonlibs/
There are some ways to manage these dependencies. You can use pipenv or virtualenvwrapper:
https://pipenv.readthedocs.io/en/latest/advanced/
https://virtualenvwrapper.readthedocs.io/en/latest/
Or you can use conda environments, of course. The links above are alternatives.