Using virtual environments to manage develop/deploy packages in Jupyterhub

Using virtual environments to manage develop/deploy packages in Jupyterhub - python

I have a package that I am developing for a local server. I would like to have the current stable release importable in a Jupyter notebook using import my_package and the current development state importable (for end-to-end testing and stuff) with import my_package_dev, or something like that.
The package is version controlled with git. The master branch holds the stable release, and new development work is done in the develop branch.
I currently pulled these two branches into two different folders:
my_package/ # tracks master branch of repository
setup.py
requirements.txt
my_package/
__init__.py
# other stuff
my_package_dev/ # tracks develop branch of repository
setup.py
requirements.txt
my_package/
__init__.py
# other stuff for dev branch
My setup.py file looks like this:
from setuptools import setup
setup(
name='my_package', # or 'my_package_dev' for the dev version
# metadata stuff...
)
I can pip install my_package just fine, but I have been unable to get anything to link to the name my_package_dev in Python.
Things I have tried
pip install my_package_dev
Doesn't seem to overwrite the existing my_package, but doesn't seem to make my_package_dev available either, even though pip says it finishes OK.
pip install -e my_package_dev
makes an egg and puts the development package path in easy-install.pth, but I cannot import my_package_dev, and my_package is still the old content.
Adding a file my_package_dev.pth to site-packages directory and filling it with /path/to/my_package_dev
causes no visible change. Still does not allow me to import my_package_dev.
Thoughts on a solution
It looks like the best approach is going to be to use virtual environments, as discussed in the answers.

With pip install you install packages by its name in setup.py's name attribute. If you have installed both and execute pip freeze, you will see both packages listed. Which code is available depends on how they are included in Python path.
The issue is those two packages contains just a python module named my_package, that it why you can not import my_package_dev (it does not exist).
I would suggest you to have an working copy for each version (without modifying package name) and use virtualenv to keep environments isolated (one virtualenv for stable version and the other for dev).
You could also use pip's editable install to keep the environment updated with the working copies.
Note: Renaming my_package_dev's my_package module directory to my_package_dev, will also work. But it will be harder to merge changes from one version to the other.

The answer provided by Gonzalo got me on the right track: use virtual environments to manage two different builds. I created the virtual environment for the master (stable) branch with:
$ cd my_package
$ virtualenv venv # make the virtual environment
$ source venv/bin/activate
(venv) $ pip install -r requirements.txt # install everything listed as a requirement
(venv) $ pip install -e . # install my_package dynamicially so that any changes are visible right away
(venv) $ sudo venv/bin/python -m ipykernel install --name 'master' --display-name 'Python 3 (default)'
And for the develop branch, I followed the same procedure in my my_package_dev folder, giving it a different --name and --display-name value.
Note that I needed to use sudo for the final ipykernel install command because I kept getting permission denied errors on my system. I would recommend trying without sudo first, but for this system it needed to be installed system-wide.
Finally, to switch between which version of the tools I am using, I just have to select Kernel -> Change kernel and choose Python 3 (default) or Python 3 (develop). The import stays the same (import my_package), so nothing in the notebook has to change.
This isn't quite my ideal scenario since it means that I will then have to re-run the whole notebook any time I change kernels, but it works!

Related

How to modify / edit an installed anaconda package

I have some issues with a published package and wish to edit the code myself (may generate a pull request later to contribute). I am quite confused about how to do this since it seems there is a lack of step-by-step guidance. Could anybody give me a very detailed instruction about how this is done (or a link)? My understanding and also my questions about the workflow are:
Fork the package through git/github and have a local synced copy (done!).
Create a new Anaconda environment (done!)?
Install the package as normal: $conda install xxx or $python setup.py develop?
Do I make changes to the package directly in the package folder in Anaconda if I use python setup.py develop?
Or make changes to the local forked copy and install/update again and what are the commands for this?
Do I need to update the setup.py file as well before running it either way?

You can simply git-clone the package repo to your local computer and then install it in "development" or "editable" mode. This way you can easily make changes to the code while at the same time incorporating it into your own projects. Of course, this will also allow you to create pull requests later on.
Using Anaconda (or Miniconda) you have 2 equivalent options for this:
using conda (conda-develop):
conda develop <path_to_local_repo>
using pip (pip install options)
pip install --editable <path_to_local_repo>
What these commands basically do is creating a link to the local repo-folder inside the environments site-packages folder.
Note that for "editable" pip installs you need a a basic setup.py:
import setuptools
setuptools.setup(name=<anything>)
On the other hand the conda develop <path_to_local_repo> command unfortunately doesn't work in environment.yml files.

Installing my own python module inside a virtual environment

What I have:
local Python3 files that I want to turn into a module test_module
test_module folder containing an empty __init__.py, a setup.py file (see below) and subdirectories with several source
files
What I want:
continuously work on and improve test_module locally
have an easy way to install test_module and all its dependencies locally in my own virtual environment (created using python3 -m venv my_environment)
run files that make use of the module via python myexample.py, without having to take care of adapting my local PYTHONPATH variable each time i enter or exit the my_environment
share my python code with others via git, and allow them to install their code locally on their machines using the same procedure (as simple as possible)
learn best practices on how to create my own module
How I'm doing it at the moment:
pip freeze > requirements.txt and pip install -r requirements.txt for installing dependencies
adding export PYTHONPATH="${PYTHONPATH}:." to my_environment/bin/activate, to have my own module in the search path
(as found here: How do you set your pythonpath in an already-created virtualenv?)
I'd like to know if there are "cleaner" solutions based on setup.py, possibly involving something like pip install ./test_module or similar that takes care of 2.-3. automagically.
My current setup.py file looks as follows
from setuptools import setup
setup(
name='test_module',
version='0.1',
description='Some really good stuff, that I am still working on',
author='Bud Spencer',
author_email='bud.spencer#stackoverflow.com',
packages=['test_module'], # same as name
install_requires=['numpy', 'scipy', 'sklearn', 'argparse'], # external packages as dependencies
)

It sounds like you want to run pip install -e <path/url> from within your virtual env, which will install a package (with a setup.py file as you have) from either a local path or a Git repo. See https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support for an explanation on the syntax of the latter.
Example:
pip install -e git+https://github.com/me/test_module/#egg=test-module
If you have already installed and want to pull the latest code from the repo, add an --upgrade switch to the above.

Does python pip have the equivalent of node's package.json?

In NodeJS's npm you can create a package.json file to track your project dependencies. When you want to install them you just run npm install and it looks at your package file and installs them all with that single command.
When distributing my code, does python have an equivalent concept or do I need to tell people in my README to install each dependency like so:
pip install package1
pip install package2
Before they can use my code?

Once all necessary packages are added
pip freeze > requirements.txt
creates a requirement file.
pip install -r requirements.txt
installs those packages again, say during production.

The best way may be pipenv! I personally use it!
However in this guide i'll explain how to do it with just python and pip! And without pipenv! That's the first part! And it will give us a good understanding about how pipenv works! There is a second part that treat pipenv! Check the section pipenv (The more close to npm).
Python and pip
To get it all well with python! Here the main elements:
virtual environment
requirements file (listing of packages)
pip freeze command
How to install packages from a requirements file
Virtual environment and why
Note that for this the package venv is to be used! It's the official thing! And shiped with python 3 installation starting from 3.3+ !
To know well the what is it and the why check this out
https://docs.python.org/3/tutorial/venv.html
In short! A virtual environment will help us manage an isolated version of python interpreter! And so too installed packages! In this way! Different project will not have to depends on the same packages installation and have to conflict! Read the link above explain and show it well!
... This means it may not be possible for one Python installation to meet the requirements of every application. If application A needs version 1.0 of a particular module but application B needs version 2.0, then the requirements are in conflict and installing either version 1.0 or 2.0 will leave one application unable to run.
You may like to check the explanation on flask framework doc!
https://flask.palletsprojects.com/en/1.1.x/installation/#virtual-environments
Why we care about this and should use it! To isolate the projects! (each have it's environment)! And then freeze command will work per project base! Check the last section
Usage
Here a good guide on how to setup and work:
https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/
Check the installation section first!
Then
To create a virtual environment you go to your project directory and run:
On macOS and Linux:
> python3 -m venv env
On Windows:
> py -m venv env
Note You should exclude your virtual environment directory from your version control system using .gitignore or similar.
To start using the environment in the console, you have to activate it
https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/#activating-a-virtual-environment
On macOS and Linux:
> source env/bin/activate
On Windows:
> .\env\Scripts\activate
See the part on how you check that you are in the environment (using which (linux, unix) or where (windows)!
To deactivate you use
> deactivate
Requirement files
https://pip.pypa.io/en/latest/user_guide/#requirements-files
“Requirements files” are files containing a list of dependencies to be installed using pip install like so
(How to Install requirements files)
pip install -r requirements.txt
Requirements files are used to hold the result from pip freeze for the purpose of achieving repeatable installations. In this case, your requirement file contains a pinned version of everything that was installed when pip freeze was run.
python -m pip freeze > requirements.txt
python -m pip install -r requirements.txt
Some of the syntax:
pkg1
pkg2==2.1.0
pkg3>=1.0,<=2.0
== for precise!
requests==2.18.4
google-auth==1.1.0
Force pip to accept earlier versions
ProjectA
ProjectB<1.3
Using git with a tag (fixing a bug yourself and not waiting)
git+https://myvcs.com/some_dependency#sometag#egg=SomeDependency
Again check the link https://pip.pypa.io/en/latest/user_guide/#requirements-files
I picked all the examples from them! You should see the explanations! And details!
For the format details check: https://pip.pypa.io/en/latest/cli/pip_install/#requirements-file-format
Freeze command
Pip can export a list of all installed packages and their versions using the freeze comman! At the run of the command! The list of all installed packages in the current environment get listed!
pip freeze
Which will output something like:
cachetools==2.0.1
certifi==2017.7.27.1
chardet==3.0.4
google-auth==1.1.1
idna==2.6
pyasn1==0.3.6
pyasn1-modules==0.1.4
requests==2.18.4
rsa==3.4.2
six==1.11.0
urllib3==1.22
We can write that to a requirements file as such
pip freeze > requirements.txt
https://pip.pypa.io/en/latest/cli/pip_freeze/#pip-freeze
Installing packages Resume
By using venv (virtual environment) for each project! The projects are isolated! And then freeze command will list only the packages installed on that particular environmnent! Which make it by project bases! Freeze command make the listing of the packages at the time of it's run! With the exact versions matching! We generate a requirements file from it (requirements.txt)! Which we can add to a project repo! And have the dependencies installed!
The whole can be done in this sense:
Linux/unix
python3 -m venv env
source env/bin/activate
pip3 install -r requirements.txt
Windows
py -m venv env
.\env\Scripts\activate
pip3 install -r requirements.txt
First time setup after cloning a repo!
Creating the new env!
Then activating it!
Then installing the needed packages to it!
Otherwise here a complete guide about installing packages using requiremnets files and virtual environment from the official doc: https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/
This second guide show all well too: https://docs.python.org/3/tutorial/venv.html
Links listing (already listed):
https://pip.pypa.io/en/latest/user_guide/#requirements-files
https://pip.pypa.io/en/latest/cli/pip_install/#requirements-file-format
https://pip.pypa.io/en/latest/cli/pip_freeze/#pip-freeze
pipenv (The more close to npm)
https://pipenv.pypa.io/en/latest/
pipenv is a tool that try to be like npm for python! Is a super set of pip!
pipenv create virtual environment for us! And manage the dependencies!
A good feature too is the ability to writie packages.json like files! With scripts section too in them!
Executing pipfile scripts
run python command with alias in command line like npm
Installation
https://pipenv.pypa.io/en/latest/install/
virtualenv-mapping-caveat
https://pipenv.pypa.io/en/latest/install/#virtualenv-mapping-caveat
For me having the env created within the project (just like node_modules) should be even the default! Make sure to activate it! By setting the environment variable!
pipenv can seems just more convenient!
Mainly managing run scripts is too good to miss on! And a one tool that simplify it all!
Basic usage and comparing to npm
https://pipenv.pypa.io/en/latest/basics/
(make sure to check the guide above to get familiar with the basics)
Note that the equivalent of npm package.json is the PipFile file!
An example:
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
flask = "*"
simplejson = "*"
python-dotenv = "*"
[dev-packages]
watchdog = "*"
[scripts]
start = "python -m flask run"
[requires]
python_version = "3.9"
There is Pipfile.lock like package.lock
To run npm install equivalent! You run pipenv install!
To insall a new package
pipenv install <package>
This will create a Pipfile if one doesn’t exist. If one does exist, it will automatically be edited with the new package you provided.
Just like with npm!
$ pipenv install "requests>=1.4" # will install a version equal or larger than 1.4.0
$ pipenv install "requests<=2.13" # will install a version equal or lower than 2.13.0
$ pipenv install "requests>2.19" # will install 2.19.1 but not 2.19.0
If the PIPENV_VENV_IN_PROJECT=1 env variable is set! To make pipenv set the virtual environmnent within the project! Which is created in a directory named .venv (equiv to node_modules).
Also running pipenv install without a PipFile in the directory! Neither a virtual environment! Will create the virtual environment on .venv directory (node_modules equiv)! And generate a PipFile and Pipfile.lock!
Installing flask example:
pipenv install flask
Installing as dev dependency
pipenv install watchdog -d
or
pipenv install watchdog -dev
just like with npm!
pipenv all commands (pipenv -h)
Commands:
check Checks for PyUp Safety security vulnerabilities and against PEP
508 markers provided in Pipfile.
clean Uninstalls all packages not specified in Pipfile.lock.
graph Displays currently-installed dependency graph information.
install Installs provided packages and adds them to Pipfile, or (if no
packages are given), installs all packages from Pipfile.
lock Generates Pipfile.lock.
open View a given module in your editor.
run Spawns a command installed into the virtualenv.
scripts Lists scripts in current environment config.
shell Spawns a shell within the virtualenv.
sync Installs all packages specified in Pipfile.lock.
uninstall Uninstalls a provided package and removes it from Pipfile.
update Runs lock, then sync.
Command help
pipenv install -h
importing from requirements.txt
https://pipenv.pypa.io/en/latest/basics/#importing-from-requirements-txt
environment management with pipenv
https://pipenv.pypa.io/en/latest/basics/#environment-management-with-pipenv
pipenv run
To run anything with the project virtual environment you need to use pipenv run
As like pipenv run python server.py!
Custom scripts shortcuts
scripts in npm!
https://pipenv.pypa.io/en/latest/advanced/#custom-script-shortcuts
[scripts]
start = "python -m flask run"
And to run
pipenv run start
Just like with npm!
If you’d like a requirements.txt output of the lockfile, run $ pipenv lock -r. This will include all hashes, however (which is great!). To get a requirements.txt without hashes, use $ pipenv run pip freeze.
To mention too the pipenv cli rendering is well done:
Make sure to read the basics guide!
And you can see how rich is pipenv!

Yes, it's called the requirements file:
https://pip.pypa.io/en/stable/cli/pip_install/#requirement-specifiers
You can specify the package name & version number.
You can also specify a git url or a local path.
In the usual case, you would specify the package followed by the version number, e.g.
sqlalchemy=1.0.1
You can install all the packages specified in a requirements.txt file through the command
pip install -r requirements.txt

Once all the packages have been installed, run
pip freeze > requirements.txt
This will save the package details in the file requirements.txt.
For installation, run
pip install -r requirements.txt
to install the packages specified by requirements.txt.

I would like to propose pipenv here. Managing packages with Pipenv is easier as it manages the list and the versions of packages for you because I think you need to run pip freeze command each time you make changes to your packages.
It will need a Pipfile. This file will contain all of your required packages and their version just like package.json.
You can delete/update/add projects using pipenv install/uninstall/update <package>
This also generates a dependency tree for your project. Just like package-lock.json
Checkout this post on Pipfiles
Learn more about Pipenv

Python - Distributing Library With Source

I'm writing a program that uses some cryptography for a class. Since I'm low on time, I'd like to go with Python for this assignment. The issue that I run into is that the code must be able to work on the Linux machines at the school. We are able to SSH into those machines and run the code, but we aren't allowed to install anything. I'm using the Cryptography library for Python:
pip install cryptography
Is there a straightforward way that I can include this with my .py file such that the issue of not being able to install the library on the Linux machines won't be a problem?

You have few options:
virtualenv
Install into virtualenv (assuming command virtualenv is installed):
$ cd projectdir
$ virtualenv venv
$ source venv/bin/activate
(venv)$ pip install cryptography
(venv)$ vim mycode.py
(venv)$ python mycode.py
The trick is, you install into local virtual environment, which does not
requires root priviledges.
tox
tox is great tool. After investing a bit of time, you can easily create multiple virtualenvs.
It assumes, you have tox installed in your system.
$ tox-quickstart
$ ...accept all defaults
$ vim tox.ini
The tox.ini my look like:
[tox]
envlist = py27
skipsdist = true
[testenv]
commands = python --version
deps =
cryptography
then run (with virtualenvs being deactivated):
$ tox
it will create virtualenv in directory .tox/py27
Activate it (still being in the same dir):
$ source .tox/py27/bin/activate
(py27)$ pip freeze
cryptography==1.2.2
... and few more...
Install into --user python profile
While this allows installing without root priviledges, it is not recommended as
it soon ends in one big mess.
EDIT (reaction to MattDMo comment):
If one user has two project with conflicting requirements (e.g. different
package versions), --user installation will not work as the packages are
living in one scope shared across all user projects.
With virtualenvs you may keep virtualenv inside of project folders and feel
free to destroy and recreate or modify any of them without affecting any other
project.
Virtualenvs have no problem with "piling up": if you can find your project
folder, you shall be able to find and manage related virtualenv(s) in it.
Use of virtualenv became de-facto recommended standard. I remember numerous
examples starting with creating virtualenv, but I cannot remember one case
using $ pip install --user.

Renaming a virtualenv folder without breaking it

I've created folder and initialized a virtualenv instance in it.
$ mkdir myproject
$ cd myproject
$ virtualenv env
When I run (env)$ pip freeze, it shows the installed packages as it should.
Now I want to rename myproject/ to project/.
$ mv myproject/ project/
However, now when I run
$ . env/bin/activate
(env)$ pip freeze
it says pip is not installed. How do I rename the project folder without breaking the environment?

You need to adjust your install to use relative paths. virtualenv provides for this with the --relocatable option. From the docs:
Normally environments are tied to a
specific path. That means that you
cannot move an environment around or
copy it to another computer. You can
fix up an environment to make it
relocatable with the command:
$ virtualenv --relocatable ENV
NOTE: ENV is the name of the virtual environment and you must run this from outside the ENV directory.
This will make some of the files
created by setuptools or distribute
use relative paths, and will change
all the scripts to use
activate_this.py instead of using the
location of the Python interpreter to
select the environment.
Note: you must run this after you've
installed any packages into the
environment. If you make an
environment relocatable, then install
a new package, you must run virtualenv
--relocatable again.

I believe "knowing why" matters more than "knowing how". So, here is another approach to fix this.
When you run . env/bin/activate, it actually executes the following commands (using /tmp for example):
VIRTUAL_ENV="/tmp/myproject/env"
export VIRTUAL_ENV
However, you have just renamed myproject to project, so that command failed to execute.
That is why it says pip is not installed, because you haven't installed pip in the system global environment and your virtualenv pip is not sourced correctly.
If you want to fix this manually, this is the way:
With your favorite editor like Vim, modify /tmp/project/env/bin/activate usually in line 42:
VIRTUAL_ENV='/tmp/myproject/env' => VIRTUAL_ENV='/tmp/project/env'
Modify /tmp/project/env/bin/pip in line 1:
#!/tmp/myproject/env/bin/python => #!/tmp/project/env/bin/python
After that, activate your virtual environment env again, and you will see your pip has come back again.

NOTE: As #jb. points out, this solution only applies to easily (re)created virtualenvs. If an environment takes several hours to install this solution is not recommended
Virtualenvs are great because they are easy to make and switch around; they keep you from getting locked into a single configuration. If you know the project requirements, or can get them, Make a new virtualenv:
Create a requirements.txt file
(env)$ pip freeze > requirements.txt
If you can't create the requirements.txt file, check env/lib/pythonX.X/site-packages before removing the original env.
Delete the existing (env)
deactivate && rm -rf env
Create a new virtualenv, activate it, and install requirements
virtualenv env && . env/bin/activate && pip install -r requirements.txt
Alternatively, use virtualenvwrapper to make things a little easier as all virtualenvs are kept in a centralized location
$(old-venv) pip freeze > temp-reqs.txt
$(old-venv) deactivate
$ mkvirtualenv new-venv
$(new-venv) pip install -r temp-reqs.txt
$(new-venv) rmvirtualenv old-venv

I always install virtualenvwrapper to help out. From the shell prompt:
pip install virtualenvwrapper
There is a way documented in the virtualenvwrapper documents - cpvirtualenv
This is what you do. Make sure you are out of your environment and back to the shell prompt. Type in this with the names required:
cpvirtualenv oldenv newenv
And then, if necessary:
rmvirtualenv oldenv
To go to your newenv:
workon newenv

You can fix your issue by following these steps:
rename your directory
rerun this: $ virtualenv ..\path\renamed_directory
virtualenv will correct the directory associations while leaving your packages in place
$ scripts/activate
$ pip freeze to verify your packages are in place
An important caveat, if you have any static path dependencies in script files in your virtualenv directory, you will have to manually change those.

Yet another way to do it that worked for me many times without problems is virtualenv-clone:
pip install virtualenv-clone
virtualenv-clone old-dir/env new-dir/env

Run this inside your project folder:
cd bin
sed -i 's/old_dir_name/new_dir_name/g' *
Don't forget to deactivate and activate.

In Python 3.3+ with built-in venv
As of Python 3.3 the virtualenv package is now built-in to Python as the venv module. There are a few minor differences, one of which is the --relocatable option has been removed. As a result, it is normally best to recreate a virtual environment rather than attempt to move it. See this answer for more information on how to do that.
What is the purpose behind wanting to move rather than just recreate any virtual environment? A virtual environment is intended to manage the dependencies of a module/package with the venv so that it can have different and specific versions of a given package or module it is dependent on, and allow a location for those things to be installed locally.
As a result, a package should provide a way to recreate the venv from scratch. Typically this is done with a requirements.txt file and sometimes also a requirements-dev.txt file, and even a script to recreate the venv in the setup/install of the package itself.
One part that may give headaches is that you may need a particular version of Python as the executable, which is difficult to automate, if not already present. However, when recreating an existing virtual environment, one can simply run python from the existing venv when creating the new one. After that it is typically just a matter of using pip to reinstall all dependencies from the requirements.txt file:
From Git Bash on Windows:
python -m venv mynewvenv
source myvenv/Scripts/activate
pip install -r requirements.txt
It can get a bit more involved if you have several local dependencies from other locally developed packages, because you may need to update local absolute paths, etc. - though if you set them up as proper Python packages, you can install from a git repo, and thus avoid this issue by having a static URL as the source.

virtualenv --relocatable ENV is not a desirable solution. I assume most people want the ability to rename a virtualenv without any long-term side effects.
So I've created a simple tool to do just that. The project page for virtualenv-mv outlines it in a bit more detail, but essentially you can use virtualenv-mv just like you'd use a simple implementation of mv (without any options).
For example:
virtualenv-mv myproject project
Please note however that I just hacked this up. It could break under unusual circumstances (e.g. symlinked virtualenvs) so please be careful (back up what you can't afford to lose) and let me know if you encounter any problems.

Even easier solution which worked for me: just copy the site-packages folder of your old virtual environment into a new one.

Using Visual Studio Code (vscode), I just opened the ./env folder in my project root, and did a bulk find/replace to switch to my updated project name. This resolved the issue.
Confirm with which python

If you are using an conda env,
conda create --name new_name --clone old_name
conda remove --name old_name --all # or its alias: `conda env remove --name old_name`

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.