How to compare requirement file and actually installed Python modules?

How to compare requirement file and actually installed Python modules? - python

Given requirements.txt and a virtualenv environment, what is the best way to check from a script whether requirements are met and possibly provide details in case of mismatch?
Pip changes it's internal API with major releases, so I seen advices not to use it's parse_requirements method.
There is a way of pkg_resources.require(dependencies), but then how to parse requirements file with all it's fanciness, like github links, etc.?
This should be something pretty simple, but can't find any pointers.
UPDATE: programmatic solution is needed.

You can save your virtualenv's current installed packages with pip freeze to a file, say current.txt
pip freeze > current.txt
Then you can compare this to requirements.txt with difflib using a script like this:
import difflib
req = open('requirements.txt')
current = open('current.txt')
diff = difflib.ndiff(req.readlines(), current.readlines())
delta = ''.join([x for x in diff if x.startswith('-')])
print(delta)
This should display only the packages that are in 'requirements.txt' that aren't in 'current.txt'.

Got tired of the discrepancies between requirements.txt and the actually installed packages (e.g. when deploying to Heroku, I'd often get ModuleNotFoundError for forgetting to add a module to requirements.)
This helps:
Use compare-requirements (GitHub)
(you'll need to pip install pipdeptree to use it.)
It's then as simple as...
cmpreqs --pipdeptree
...to show you (in "Input 2") which modules are installed, but missing from requirements.txt.
You can then examine the list and see which ones should in fact be added to requirements.txt.

Related

Question about Packaging in Python with pip

I saw this nice explanation video (link) of packaging using pip and I got two questions:
The first one is:
I write a code which I want to share with my colleagues, but I do not aim to share it via pypi. Thus, I want to share it internally, so everyone can install it within his/ her environment.
I actually needn't to create a wheel file with python setup.py bdist_wheel, right? I create the setup.py file and I can install it with the command pip install -e . (for editable use), and everyone else can do it so as well, after cloning the repository. Is this right?
My second question is more technical:
I create the setup.py file:
from setuptools import setup
setup(
name = 'helloandbyemate',
version = '0.0.1',
description="Say hello in slang",
py_modules=['hellomate'],
package_dir={"": "src"}
)
To test it, I write a file hellomate.py which contains a function printing hello, mate!. I put this function in src/. In the setup.py file I put only this module in the list py_modules. In src/ is another module called byemate.py. When I install the whole module, it installs the module byemate.py as well, although I only put hellomate in the list of py_modules. Has anyone an explanation for this behaviour?

I actually needn't to create a wheel file ... everyone else can do it so as well, after cloning the repository. Is this right?
This is correct. However, the installation from source is slower, so you may want to publish wheels to an index anyway if you would like faster installs.
When I install the whole module, it installs the module byemate.py as well, although I only put hellomate in the list of py_modules. Has anyone an explanation for this behaviour?
Yes, this is an artifact of the "editable" installation mode. It works by putting the src directory onto the sys.path, via a line in the path configuration file .../lib/pythonX.Y/site-packages/easy-install.pth. This means that the entire source directory is exposed and everything in there is available to import, whether it is packaged up into a release file or not.
The benefit is that source code is "editable" without reinstallation (adding/removing/modifying files in src will be reflected in the package immediately)
The drawback is that the editable installation is not exactly the same as a "real" installation, where only the files specified in the setup.py will be copied into site-packages directly
If you don't want other files such as byemate.py to be available to import, use a regular install pip install . without the -e option. However, local changes to hellomate.py won't be reflected until the installation step is repeated.
Strict editable installs
It is possible to get a mode of installation where byemate.py is not exposed at all, but live modifications to hellomate.py are still possible. This is the "strict" editable mode of setuptools. However, it is not possible using setup.py, you have to use a modern build system declaration in pyproject.toml:
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[project]
name = "helloandbyemate"
version = "0.0.1"
description = "Say hello in slang"
[tool.setuptools]
py-modules = ["hellomate"]
include-package-data = false
[tool.setuptools.package-dir]
"" = "src"
Now you can perform a strict install with:
pip install -e . --config-settings editable_mode=strict

Change .egg-info directory with pip install --editable

Is there a way to modify the location of the .egg-info directory that is generated upon:
pip install --editable .
I'm asking because I store my source code on (locally synchronized) cloud storage, and I want to install the package in editable mode on independent computers. So, ideally, the package directory would not be polluted with anything related to a given installation of the package.
I have tried using the --src option but this did not work; I don't understand what this option is meant to do.

You can achieve this by adding the egg_base option to setup.cfg:
[egg_info]
egg_base = relative/path/to/egg_info_folder
I have used this successfully in pip 19.3.1.
In my environment, the actual files that this altered are:
/anaconda/envs/my_env/lib/python3.6/site-packages/easy-install.pth
/anaconda/envs/my_env/lib/python3.6/site-packages/package_name.egg-link
Note, pip install raises an error if the egg_base base is not a relative path. But directly altering the files appears to work:
/anaconda/envs/my_env/lib/python3.6/site-packages/easy-install.pth:
/path/to/repository/folder
/anaconda/envs/my_env/lib/python3.6/site-packages/package_name.egg-link:
/path/to/egg_info/folder
/path/to/repository/folder/

Not sure if still relevant, but here is a setup.py based solution: https://jbhannah.net/articles/python-docker-disappearing-egg-info

Python Libraries - Making them work on PCs that arent mine

Apologies if this is a very stupid question but I am new to python and although I have done some googling I cannot think how to phrase my search query.
I am writing a python script that relies on some libraries (pandas, numpy and others). At some point in the future I will be passing this script onto my University so they can mark it etc. I am fairly confident that the lecturer will have python installed on their PC but I cannot be sure they will have the relevant libraries.
I have included a comments section at the top of the script outlining the install instructions for each library but is there a better way of doing this so I can be sure the script will work regardless of what libraries they have?
An example of my script header
############### - Instructions on how to import libraries - ###############
#using pip install openpyxl using the command - pip install openpyxl
#########################################################################
import openpyxl
import random
import datetime

Distributing code is a huge chapter where you can invest enormous amounts of time in order to get things right, according to the current best practices and what not. I think there is different degrees of rightness to solutions to your problem, with more rightness meaning more work. So you have to pick the degree you are comfortable with and are good to go.
The best route
Python supports packaging, and the safest way to distribute code is to package it. This allows you to specify requirements in a way that installing your code will automatically install all dependencies as well.
You can use existing cookiecutters, which are project-templates, to create the base you need to build packages:
pip install cookiecutter
cookiecutter https://github.com/audreyr/cookiecutter-pypackage
Running this, and answering the ensuing questions, will leave you with python code that can be packaged. You can add the packages you need to the setup.py file:
requirements = ['openpyxl']
Then you add your script under the source directory and build the package with:
pip wheel .
Let's say you called your project my_script, you got yourself a fresh my_script-0.1.0-py2.py3-none-any.wheel file that you can send to your lecturer. When they install it with pip, openpyxl will be automatically installed in case it isn't already.
Unfortunately, if they should also be able to execute your code you are not done yet. You need to add a __main__.py file to the my_script folder before packaging it, in which you import and execute the parts of your code that are runnable:
my_script/my_script/__main__.py:
from . import runnable_script
if __name__ == '__main__':
runnable_script.run()
The installed package can then be run as a module with python -m my_script
The next best route
If you really only have a single file and want to communicate to your lecturer which requirements are needed to run the script, send them both your script and a file called requirements.txt, which contains the following lines:
openpyxl
.. and that's it. If there are other requirements, put them on separate lines. If the lecturer has spent any amount of time working with python, they should know that running pip install -r requirements.txt will install the requirements needed to run the code you have submitted.
The if-you-really-have-to route
If all your lecturer knows how to do is entering python and then the name of your script, use DudeCoders approach. But be aware that silently installing requirements without even interactive prompts to the user is a huge no-no in the software-engineering world. If you plan to work in programming you should start with good practices rather sooner than later.

You can firstly make sure that the respective library is installed or not by using try | except, like so:
try:
import numpy
except ImportError:
print('Numpy is not installed, install now to continue')
exit()
Now, if numpy is installed in his computer, then system will just import numpy and will move on, but if Numpy is not installed, then the system will exit python logging the information required, i.e., x is not installed.
And implement the exact same for each and every library you are using.
But if you want to directly install the library which is not installed, you can use this:
Note: Installing libraries silently is not a recommended way.
import os
try:
import numpy
except ImportError:
print('Numpy is not installed, installing now......')
resultCode = os.system('pip install numpy')
if resultCode == 0:
print('Numpy installed!')
import numpy
else:
print('Error occured while installing numpy')
exit()
Here, if numpy is already installed, then the system will simply move on after installing that, but if that is not installed, then the system will firstly install that and then will import that.

How do I install modules on qpython3 (Android port of python)

I found this great module on within and downloaded it as a zip file. Once I extracted the zip file, i put the two modules inside the file(setup and the main one) on the module folder including an extra read me file I needed to run. I tried installing the setup file but I couldn't install it because the console couldn't find it. So I did some research and I tried using pip to install it as well, but that didn't work. So I was wondering if any of you could give me the steps to install it manually and with pip (keep in mind that the setup.py file needs to be installed in order for the main module to work).
Thanks!

The cleanest and simplest way I have found is to use pip from within QPython console as in This Answer
import pip
pip.main(['install', 'networkx'])

Step1: Install QPython.
Step2: Install AIPY for QPython.
Step3: Then go to QPython-->QPYPI-->AIPY and install from there Numpy, SciPy, Matplotlib, openCV etc.

Extract the zip file to the site-packages folder.
Find the qpyplus folder in that Lib/python3.2/site-packages extract here that's it.Now you can directly use your module from REPL terminal by importing it.

Is there any way to automatic detect required modules and packages in my own python project

I want to create a python distribution package, and need to check all the required packages for my own package.
For requirements.txt, it include all the packages it needs. However, I want to find a way to check all the packages I need, and also, for some of the packages, they are also some requirements of other packages in my project.
Is there any way to check which packages I need, and maintain the minimum required packages for my own project?

pip freeze will print whatever packages happended to be installed in your current envirenment. To list the packages that are actually being imported use pipreqs:
pip install pipreqs
pipreqs path_to_project

Usually people know their requirements by having separate virtual environments with required modules installed. In this case, it is trivial to make the requirements.txt file by running the following while being inside the virtual environment:
pip freeze > requirements.txt
Also, to avoid surprises in production and be confident about the code you have, would be good to have tests and a good test coverage. In case, there is a module imported but not installed, tests would show it.
Another way to find modules that cannot be imported is by using pylint static code analysis tool against the package. There is a special F0401 - Unable to import %s warning.
Demo:
imagine you have a test.py file that has a single import statement
import pandas
pandas module is not installed in the current python environment
here is the output of pylint test.py:
$ pylint test.py
No config file found, using default configuration
************* Module test
C: 1, 0: Missing module docstring (missing-docstring)
F: 1, 0: Unable to import 'pandas' (import-error)
W: 1, 0: Unused import pandas (unused-import)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.