Python: How to detect unused packages and remove them [duplicate]

This question already has answers here:
Delete unused packages from requirements file
(5 answers)
Closed 3 months ago.
I use pip freeze > requirements.txt to gather all the packages I have installed. But after many days of development, some of those packages are no longer used.
How can I find these unused packages and remove them, to keep my project clean?

Inside PyCharm
Go to Code > Inspect Code.
Select the Whole project option and click OK.
In the inspection results panel, locate the Package requirements section under Python (note that this section is shown only if there is a requirements.txt or setup.py file). The section will contain one of the following messages:
Package requirement '<package>' is not satisfied if there is any package that is listed in requirements.txt but not used in any .py file.
Package '<package>' is not listed in project requirements if there is any package that is used in .py files but not listed in requirements.txt.
You now have all your required packages; remove or add them accordingly.

I just found this: https://pypi.python.org/pypi/pip_check_reqs/2.0.
Find packages that should or should not be in requirements for a project
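For what it's worth, that package installs two commands: pip-missing-reqs (packages imported in code but missing from requirements) and pip-extra-reqs (requirements never imported). A minimal sketch, with your_project as an illustrative directory:
pip install pip-check-reqs
pip-missing-reqs your_project/
pip-extra-reqs your_project/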

I doubt there can be a fully automatic way to do this. "Unused packages" is a very ambiguous statement: unused by whom? The only way for a system utility to figure out whether a package is used somewhere or not is to parse every Python script installed anywhere on the system; a rather impractical solution.
So, what you could do is look in every Python script and module you created and find what is being imported. Then, if you have two different requirements.txt files, one from before you installed the packages and one from after, it may be possible to figure out which ones you can uninstall without breaking anything. I do not recommend this, though.
A much better way is to use virtual environments, but you must do this before you start developing and installing new packages.
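A minimal sketch of that approach, done at the start of a project (the .venv name is just a convention, and some-package is illustrative):
python -m venv .venv
source .venv/bin/activate
pip install some-package   # install only what the project actually needs
pip freeze > requirements.txt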

Related

Remove Redundant Python dependencies which are not used anywhere in the environment [duplicate]

I am aware that pip freeze > requirements.txt exists, yet that prints out my system packages, of which only a few are needed by my directory/project.
I am not using a virtualenv, so I'm pretty sure I can't print out local packages like that.
I also know that pipdeptree exists, but I don't see how that solves my problem.
I believe tools like the following could help:
pipreqs
pigar
As far as I can tell, these tools read the code in the directory and try to figure out the required dependencies based on the import statements they find in the code.
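For example, pipreqs can be pointed at a project directory and will write a requirements.txt based on the imports it finds (the path is illustrative):
pip install pipreqs
pipreqs /path/to/project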
Related:
https://stackoverflow.com/a/61202584
https://stackoverflow.com/a/61540466
https://stackoverflow.com/questions/61143402/how-to-generate-requirements-txt-for-given-py-sources-folder-or-specific-py-file
https://stackoverflow.com/a/31684470

How to make pip check for already installed pkgs from multiple directories when installing to a --target dir?

For internal reasons my group shares a conda environment with a number of different groups. This limits the flexibility of package installation, because we don't want to accidentally update dependent packages (I know, we live in the past...). To get around the inflexibility, my group wants to install the packages we develop in a remote directory. Using pip to install the packages works fine with the --target flag to designate the new/remote install folder. We then modify our PYTHONPATH in our .bashrc to access our newly installed packages via a standard import x.
The issue I have is that the packages defined in our setup.py under install_requires=['pandas==0.24.1'] are also being installed in the remote directory, even though that requirement is already satisfied by the shared Python site-packages. What appears to be happening is that pip checks dependencies only in the remote packages directory. Is there some way to install our packages while also having pip look in multiple places for requirement satisfaction, specifically our Python installation's site-packages?
I was thinking pip would use PYTHONPATH to check if a dependency is met, but that does not seem to be the case.
Please let me know if this does not make sense; packaging is still new to me, so I am sure I used the wrong terms all over the place.
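For context, the setup described above amounts to something like this (paths and package names are illustrative):
pip install --target=/remote/dir/our-packages our-package
export PYTHONPATH=/remote/dir/our-packages:$PYTHONPATH   # e.g. in .bashrc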
I believe using "path configuration files" might help.
Say you have some packages installed in /path/to/external-packages and the regular location for site packages in the current environment is /path/to/site-packages.
Then you could add a file /path/to/site-packages/external-packages.pth with the following content:
/path/to/external-packages
I believe this should at least work for some pip commands: check, list, show, maybe more.
Be careful to read about and experiment with this technique, as it may have undesired side effects. Additionally, if I am not mistaken, there should be no need to modify the PYTHONPATH environment variable.
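A minimal sketch, assuming the paths above:
echo /path/to/external-packages > /path/to/site-packages/external-packages.pth
pip list   # should now also report the packages under /path/to/external-packages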

Installing third party modules in python3 - Ubuntu

In short, my question is: how do I install the latest version of scikit-image into my /usr/lib/python3/dist-packages so I can actually use it? I think there is a problem with my understanding of how third-party modules are installed. As a newb, I don't know how to rectify that, hence this post.
I need help understanding how to install packages in python3. Up until now I have used pip/pip3/apt-get/synaptic etc. and it has worked fine for many packages. However, I have hit several barriers (skimage, opencv, plantcv in python3). I must emphasise that the problem I am having is using these packages in python3, not 2.7.
For example, I want to use the latest version of scikit-image (0.14) with python3 (http://scikit-image.org/). I have tried following the installation instructions and have not yet successfully managed to install it. I have navigated to my /usr/lib/python3/dist-packages and copied scikit-image into this directory (I have all the dependencies installed in here already).
Image of my folder for dist-packages as proof
As you can see, the folder containing skimage is in the directory I want it installed in, so how do I actually install it? Do I have to extract skimage out of the folder into the directory and then run the install command? If I navigate to /usr/lib/python3/dist-packages/scikit-image and then run pip install -e . I get an error stating that I need numpy. If I write a python script using python3 I can clearly see I have numpy installed (and I have been using it for a long time). So there must be a problem with how this package sits in my file system. A janky workaround would be to copy all the modules into my working directory and import them as if they were modules I had made myself, but this obviously negates the whole point of installing packages.
This has also happened with another package called plantcv, where I went into the directory /usr/lib/python3/dist-packages, cloned the source from GitHub and installed as per the instructions. When I import plantcv in my python3 script, it imports fine. But there is nothing in it, as python cannot see the modules which are inside the folder at /usr/lib/python3/dist-packages/plantcv/plantcv.
There is clearly some comprehension here that I am missing, as I now have a similar problem with two packages. Please, Internet, help me understand what I am missing!
You simply need to copy the folder to /usr/lib/python3/dist-packages/package-name.
However, there are certain things that are specific to Python packages. The folder named package-name should be a valid package. A good indicator of that is that it will contain a file "__init__.py". It is very likely that every sub-directory inside this package directory will also contain an "__init__.py" file; it depends on whether there are modules inside those sub-directories.
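For illustration, a valid package layout might look like this (the names are examples):
/usr/lib/python3/dist-packages/skimage/
    __init__.py
    filters/
        __init__.py
        edges.py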
In your code, simply import the package like the following:
import skimage
where skimage is the importable name of the package (an import name cannot contain hyphens, which is why the folder must use a valid Python identifier).

Delete unused packages from requirements file

Is there any easy way to delete no-longer-used packages from a requirements file?
I wrote a bash script for this task, but it doesn't work as I expected, because some packages are not used under their PyPI project names. For example, the package
dj-database-url
is used as
dj_database_url
My project has many packages in its own requirements file, so searching them one by one is too messy, error-prone and takes too much time. As far as I have searched, IDEs don't have this feature yet.
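As an aside, on Python 3.10+ the standard library can map importable names back to the distribution names that provide them, which is exactly the mismatch described above; a minimal sketch:
from importlib.metadata import packages_distributions

# Maps top-level import names to the distributions providing them,
# e.g. 'dj_database_url' -> ['dj-database-url'].
print(packages_distributions())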
You can use Code Inspection in PyCharm.
Delete the contents of your requirements.txt but keep the empty file.
Load your project in PyCharm and go to Code -> Inspect Code....
Choose Whole project option in dialog and click OK.
In the inspection results panel, locate the Package requirements section under Python (note that this section will be shown only if there is a requirements.txt or setup.py file).
The section will contain one of the following messages:
Package requirement '<package>' is not satisfied if there is any package that is listed in requirements.txt but not used in any .py file.
Package '<package>' is not listed in project requirements if there is any package that is used in .py files, but not listed in requirements.txt.
You are interested in the second inspection.
You can add all used packages to requirements.txt by right-clicking the Package requirements section and selecting Apply Fix 'Add requirements '<package>' to requirements.txt'. Note that it will show only one package name, but it will actually add all used packages to requirements.txt if invoked on the section.
If you want, you can add them one by one instead: just right-click the inspection corresponding to a certain package and choose Apply Fix 'Add requirements '<package>' to requirements.txt', repeating for each inspection of this kind.
After that you can create a clean virtual environment and install the packages from the new requirements.txt.
Also note that PyCharm has an import optimisation feature, see Optimize imports.... It can be useful to run this feature before any of the other steps listed above.
The best bet is to use a fresh Python venv/virtual-env with no packages, or only those you definitely know you need, then test your package, installing missing packages with pip as you hit problems (which should be quite quick for most software), and finally use the pip freeze command to list the packages you really need. Better yet, you could use pip wheel to create a wheel containing the packages.
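A minimal sketch of that loop (fresh-env is an illustrative name):
python -m venv fresh-env
source fresh-env/bin/activate
# run your tests; pip install whatever is missing as you hit ImportErrors
pip freeze > requirements.txt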
The other approach would be to:
Use pylint to check each file for unused imports and delete them (you should be doing this anyway; a minimal invocation is sketched after this list),
Run your tests to make sure that it was right,
Use a tool like snakefood or snakefood3 to generate your new list of dependencies
Note that for any dependency checking to work well, it is advisable to avoid conditional imports and imports within functions.
Also note that to be sure you have everything, it is a good idea to build a new venv/virtual-env, install from your dependencies list, and then re-test your code.
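For the pylint step, an invocation that reports only unused imports might look like this (W0611 is pylint's unused-import message; the path is illustrative):
pylint --disable=all --enable=W0611 your_package/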
You can find obsolete dependencies by using deptry, a command line utility that checks for various issues with a project's dependencies, such as obsolete, missing or transitive dependencies.
Add it to your project with
pip install deptry
and then run
deptry .
Example output:
-----------------------------------------------------
The project contains obsolete dependencies:
Flask
scikit-learn
scipy
Consider removing them from your projects dependencies. If a package is used for development purposes, you should add
it to your development dependencies instead.
-----------------------------------------------------
Note that for the best results, you should be using a virtual environment for your project.
Disclaimer: I am the author of deptry.
In PyCharm, go to Tools -> Sync Python Requirements. There's a 'Remove unused requirements' checkbox.
I've used pip-check-reqs with success.
With the command pip-extra-reqs your_directory it will check for all unused dependencies in your_directory.
Install it with pip install pip-check-reqs.

Python Package and module version management

I plan to develop a medium-scale web application (with a lot of back-end work) using multiple packages (internal and external). Since it's my first experience with such scale, can I get some advice on:
How to ensure that all dependencies are satisfied (no required package/module is missing, packages are imported from the right location, etc.)
How to manage versions of packages/modules. I want to be able to introduce new versions of packages/modules with minimal code changes in other connected packages/modules.
Please let me know if there is a tool that can be of help in this.
If there is any other challenge (which I may not even be aware of) that comes with managing code at this scale, then please caution me and provide ways to resolve it.
What I know currently (and think that it may help in code management):
__all__ to define exportable modules.
Use of a preceding single underscore to mark modules as internal and keep them out of from package import *
__init__.py to manage imports at the package level (so when I do import package_name, __init__.py can centrally control other imports)
PEP 328 to manage relative imports, importlib and other such stuff around importing.
I suppose some third-party packages also define a __version__ or VERSION variable to check the version at runtime, but I am not sure if I can rely on it (see the sketch after this list).
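For illustration, a package's __init__.py often combines these conventions. A minimal sketch, with package_name and public_module as hypothetical names:
# package_name/__init__.py
__version__ = "1.0.0"        # a common convention, not enforced by Python
__all__ = ["public_module"]  # names exported by `from package_name import *`
from . import public_module  # __init__.py centrally controls package-level imports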
If there are books, blogs etc. that can help then please let me know.
EDIT: I am not sure if this type of question fits here on SO. If not, then I apologize.
How to ensure that all dependencies are satisfied (no required package/module is missing, packages are imported from the right location, etc.)
Use a virtualenv and freeze the packages with pip to a requirements file:
$ pip freeze > requirements.txt
$ cat requirements.txt
argparse==1.2.1
distribute==0.6.31
wsgiref==0.1.2
Basically, use virtualenv, distribute and pip whenever you have some code that'll need a module installed.
To install packages according to the version specified in your requirements.txt:
pip install -r requirements.txt
