Questions about several developers working on a Python Package - python

If multiple developers were to develop a Python package together, how should we deal with these problems:
(1) What is the best practice for making sure all developers use the same dev environment?
(2) On the dev end, do all of us need to use .../site-packages/mypackage as our dev path (map code from version control to there and develop the code there), as if the package had been installed to that path using pip?

I suggest using a virtual environment (venv) to control the Python version and package versions. Inside a venv, pip freeze > requirements.txt outputs only what your project needs. Other developers just need to create a fresh venv in their project folder and run pip install -r requirements.txt. This avoids version conflicts between different projects.
https://docs.python.org/3/library/venv.html
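For example, a minimal sketch of that workflow (requests is just a placeholder dependency):

# developer A: create the environment and record the dependencies
python -m venv venv
venv\Scripts\activate            # on Linux/macOS: source venv/bin/activate
pip install requests
pip freeze > requirements.txt    # commit this file to version control

# developer B: clone the repo, then recreate the same environment
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt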

Related

package not being permanently installed inside the python virtual environment

I'm building a REST API with the Django Python framework, using many external Python packages. I created a virtual environment (python -m venv venv), activated it (venv\Scripts\activate), and installed the requests package (python -m pip install requests). Then I pushed my project to my git repo and cloned it onto another machine. When I tried to run my Django project there, it asked me to install the requests package again.

Why is that, and how can I permanently install packages into my virtual environment, or someplace else where I wouldn't have to install them again on different machines? I'm looking for a solution similar to NodeJS/npm, where all packages are installed locally into the project's node_modules folder and you don't have to reinstall them on different machines. Thanks
The environment itself is not shareable in the way you describe: a venv hard-codes absolute paths and contains platform-specific binaries, so committing the venv folder and cloning it onto another machine won't work. I'd recommend using Docker for this use case. If you create a Docker image with the correct dependencies, you can easily operate in the same environment on different computers. A Python venv cannot be used this way.
Nevertheless, if your requirements.txt file pins package versions, then the venvs you create on the two machines should be relatively similar (depending, of course, on other parameters like the OS, Python version, etc.).
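In npm terms, requirements.txt plays roughly the role of package.json: you commit it (and add venv/ to .gitignore), then rebuild the environment on each machine after cloning:

python -m venv venv
venv\Scripts\activate
python -m pip install -r requirements.txt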

How to reinstall all user packages after updating Python version in Windows?

I have a Windows 7 machine running Python 3.8.5 with a very large number of physics/electronics/data analysis/simulation packages. As it turned out, I must have - for some inexplicable reason - installed the 32-bit version of Python instead of the 64-bit one despite having a 64-bit system. And I didn't notice until very recently when I was trying to install some packages that require 64-bit Python. Hence I've now downloaded and installed the latest Python version that is supported by Windows 7, which seems to be 3.8.10.
Question: What is the easiest and also fail-safe way to reinstall all the user packages - that I currently have under 3.8.5 - to 3.8.10?
For some reason, I couldn't find any "canonical" solution for this online. As it seems, Python does not come with any built-in support for updating or system migration and I'm honestly wondering why...
Anyway, my first idea was to get a list of all user (= "local"?) packages currently installed under 3.8.5, but I don't know how. Doing help('modules') inside the interpreter lists all packages, but I don't see a way to "selectively apply" pip to a specific Python version; e.g. something like python-3.8.5 -m pip list --local is not supported.
After getting a list of the user packages, I was thinking of packing it into a single batch command, pip install package_1 package_2 <...> package_N, thus reinstalling everything under Python 3.8.10, and afterwards uninstalling Python 3.8.5 and removing all of its environment variables from the system PATH.
Is this the proper way to do this?
Anyway, my first idea was to get a list of all user (= "local"?) packages currently installed under 3.8.5, but I don't know how.
Create a list of installed packages with pip freeze > pkglist.txt or pip list --format=freeze. If you already have one, that's great.
Then uninstall 32-bit Python 3.8.5 and clean your path for all Python related variables. Now, install 64-bit Python 3.8.10.
After reinstalling Python, you can install all the packages back with pip install -r pkglist.txt, and it will restore the exact versions.
If you insist on having both 32-bit and 64-bit versions installed and also have the Python Launcher installed, you could invoke 32 and 64 bit versions separately with py -3.8-64 -m pip and py -3.8-32 -m pip.
I don't see a way to "selectively apply" pip to a specific Python version.
This is possible with the Python Launcher on Windows, though only between major/minor versions, not patch versions, according to its help message.
I would also recommend creating a virtual environment this time before installing the packages, and leaving the root environment alone. You can create one named venv with just python -m venv venv, activate it with venv\Scripts\activate, and proceed with the installation of packages.
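Putting those steps together, a rough sketch of the whole migration (assuming the Py Launcher is installed):

py -3.8-32 -m pip freeze > pkglist.txt
# uninstall 32-bit Python 3.8.5, install 64-bit Python 3.8.10, then:
py -3.8-64 -m venv venv
venv\Scripts\activate
python -m pip install -r pkglist.txt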
Nope, doesn't work. After installing the packages with the newer Python version in PATH, Jupyter, for example, won't start.
If the Jupyter error persists, you could try pinning packages to their most recent patch/minor versions to update them and yet not break your code.
As a last resort, you could try installing Python 3.10 alongside your current Python installation (without uninstall or editing the PATH) and then installing the absolute latest versions of the packages in a 3.10 virtual environment to see if it works for you. You would invoke the two versions with Py Launcher, e.g. py -3.10 and py -3.8.
If I understood correctly, you have multiple packages like NumPy, pandas, etc. installed on your machine, and you want to reinstall them "automatically" on a fresh installation of Python.
The method I use for such an operation is to create a file named setup.py that lists all the packages as install requirements.
Below is an example of such a file from one of my projects:
from setuptools import setup, find_packages

setup(
    name='surface_quality_tools',
    version='0.1',
    # everything listed here is installed automatically as a dependency;
    # note: the PyPI name for sklearn is "scikit-learn"
    install_requires=["matplotlib", "psutil", "numpy", "scipy", "pandas", "trimesh",
                      "pyglet", "networkx", "protobuf", "numpy-stl", "scikit-learn",
                      "opencv-python", "seaborn", "scikit-image", "flask", "tqdm", "pytest"],
    # ship any JSON data files found inside the packages
    package_data={'': ['*.json']},
    packages=find_packages(),
)
To run the installation, open a command prompt inside the project directory and run:
pip install -e .
You can find a nice example in this blog post.
One common way of handling packages in Python is via virtual environments. You can use Anaconda (conda), venv or any of several other solutions. For example, see this post:
https://towardsdatascience.com/virtual-environments-104c62d48c54
The way this works is that each virtual environment gets its own site-packages directory containing all the necessary packages, kept separate from the system-wide Python installation.
Probably the main reason Python doesn't feature migration tools (at least as part of the standard library) is that pip, the main package tool, doesn't handle conflict resolution all that well. When you update your Python version, it may happen (especially with niche packages) that some of them no longer work, and pip often won't be able to resolve the dependencies. This is why it's a good idea to keep a separate venv for each Python version and each project.
The other tool you could use for easy migration is Docker, which provides containers that run on top of your host OS and usually contain some Linux distribution, Python, and the packages needed for running and development.
It takes a bit of time to set up a container image initially, but afterwards setting everything up on a new machine or in the cloud becomes a breeze.
Listing currently installed packages is done via the pip freeze command, whose output you can pipe into a file to keep a record of project requirements, for example pip freeze > requirements.txt.

How to modify / edit an installed anaconda package

I have some issues with a published package and wish to edit the code myself (I may generate a pull request later to contribute). I am quite confused about how to do this, since there seems to be a lack of step-by-step guidance. Could anybody give me very detailed instructions on how this is done (or a link)? My understanding of the workflow, and my questions about it, are:
Fork the package through git/github and have a local synced copy (done!).
Create a new Anaconda environment (done!)?
Install the package as normal: $conda install xxx or $python setup.py develop?
Do I make changes to the package directly in the package folder in Anaconda if I use python setup.py develop?
Or make changes to the local forked copy and install/update again and what are the commands for this?
Do I need to update the setup.py file as well before running it either way?
You can simply git-clone the package repo to your local computer and then install it in "development" or "editable" mode. This way you can easily make changes to the code while at the same time incorporating it into your own projects. Of course, this will also allow you to create pull requests later on.
Using Anaconda (or Miniconda) you have 2 equivalent options for this:
using conda (conda-develop):
conda develop <path_to_local_repo>
using pip (pip install options)
pip install --editable <path_to_local_repo>
What these commands basically do is create a link to the local repo folder inside the environment's site-packages folder.
Note that for "editable" pip installs you need at least a basic setup.py:
import setuptools
setuptools.setup(name=<anything>)
On the other hand, the conda develop <path_to_local_repo> command unfortunately doesn't work in environment.yml files.
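Putting it together, a sketch of the whole workflow (the repository URL and package name are placeholders):

git clone https://github.com/<user>/<package>.git
cd <package>
pip install --editable .
# edits made in this folder now take effect on the next import, without reinstalling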

Keeping Installed Packages When Updating Python

I've got python 3.7 installed on Windows 10. The recommended way to upgrade to 3.8 appears to be to do a new installation, which means I will have both versions installed. I don't need both versions, but I would like to keep all the packages I installed for version 3.7.
How do I achieve this, please? Also, will the new path variable for 3.8 replace the one for 3.7?
The process for such a common use case seems strangely complex. Am I missing something?
A simple solution would be to run the following in CMD:
pip freeze > packages.txt
This will write all your currently installed packages to the text file packages.txt.
Then uninstall Python 3.7 as you would any Windows program, install Python 3.8, and in CMD run:
pip install -r packages.txt
This will install all the packages that you had before.
Though I would recommend using conda, as it handles Python versions, packages, and environments for you.
One way to do this is to run:
python3.7 -m pip freeze > installed.txt
(on Windows, the equivalent Py Launcher form is py -3.7 -m pip freeze > installed.txt)
Then, after installing the new Python version, you can install the packages with:
python3.8 -m pip install -r installed.txt
(or py -3.8 -m pip install -r installed.txt)
There is a chance that the packages you installed for your old Python installation are not compatible with the new version. For that reason it is safer to keep both Python installations and then use virtual environments for each of your projects.
You can create a virtualenv for each of your projects, using the Python version you need for that project, and install your dependencies only in the virtualenv for that specific project. This way you can avoid the situation where project A requires an old version of a certain package but project B requires a newer one. If you install all your packages globally you run into problems in this case.
See also What is a virtualenv, and why should I use one?
I would recommend moving over to conda to manage your environments.
https://docs.conda.io/projects/conda/en/latest/user-guide/install/windows.html
On most of the development projects involving Python that I've worked on, the current thinking is that the Python version and libraries are specified on a per-project basis. Conda allows you to freeze the environment so that it's more portable: you can generate an environment.yml file that lets someone recreate your environment from scratch, and you can maintain only the packages needed for a given project.
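As a short sketch of that workflow (the environment name comes from the yml file):

conda env export > environment.yml     # record the active environment
conda env create -f environment.yml    # recreate it on another machine
conda activate <env_name>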
As for your original question, you can set PYTHONPATH to point to the old and new directories, but I can't guarantee that the libraries will work, since there could be version-compatibility issues.

How to install python packages without root privileges?

I am using numpy / scipy / pynest to do some research computing on Mac OS X. For performance, we rent a 400-node cluster (with Linux) from our university so that the tasks can be done in parallel. The problem is that we are NOT allowed to install any extra packages on the cluster (no sudo or any installation tool); they only provide raw Python itself.
How can I run my scripts on the cluster then? Is there any way to integrate the modules (numpy and scipy also have some compiled binaries I think) so that it could be interpreted and executed without installing packages?
You don't need root privileges to install packages in your home directory. You can do that with a command such as
pip install --user numpy
or from source
python setup.py install --user
See https://stackoverflow.com/a/7143496/284795
The first alternative is much more convenient, so if the server doesn't have pip or easy_install, you should politely ask the admins to add it, explaining the benefit to them (they won't be bothered anymore by requests for individual packages).
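To confirm where --user installs end up, you can ask Python for the user site-packages directory:

python -m site --user-site

If installed scripts aren't found afterwards, the matching ~/.local/bin directory may need to be added to PATH.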
You could create a virtual environment through the virtualenv package.
This creates a folder (say venv) with a new copy of the Python executable and a new site-packages directory, into which you can "install" any number of packages without needing any kind of administrative access at all. Thus, activating the environment through source venv/bin/activate will give Python an environment that's equivalent to having those packages installed.
I know this works for SGE clusters, although how the virtual environment is activated might depend on your cluster's configuration.
You can try installing virtualenv on your cluster within your own site-packages directory using the following steps:
Download virtualenv from here, put it on your cluster
Install it using setup.py to a specific, local directory to serve as your own site-packages:
python setup.py build
python setup.py install --install-base /path/to/local-site-packages
Add that directory to your PYTHONPATH:
export PYTHONPATH="/path/to/local-site-packages:${PYTHONPATH}"
Create a virtualenv:
virtualenv venv
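From there, a natural continuation of these steps (package names are just examples) is to activate the environment and install into it:

source venv/bin/activate
pip install numpy scipy
python -c "import numpy; print(numpy.__version__)"    # quick sanity check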
You can also import a module from an arbitrary path at runtime by appending that path to sys.path before the import:
import sys
sys.path.append("/path/to/local-site-packages")  # directory containing the package
import numpy
The Python distribution Anaconda solves many of the issues discussed in this question. Anaconda does not require admin or root access and is able to install to your home directory. It comes with many of the packages in question (scipy, numpy, sklearn, etc.) as well as the conda installer for adding further packages should they be necessary.
It can be downloaded from https://www.continuum.io/downloads
