How should I write a simple installer for python package?

I'm writing a simple python package with many helper functions to be used with other projects.
How should I handle packaging and make an installer for said package?
Directory layout
package:
package (/git/dev/package)
|
+-- __init__.py
|
+-- foo.py
|
+-- bar.py
project using said package:
project (/git/dev/project)
|
+-- project.py
How do I make this package available to every local python project (I don't need to distribute it publicly)? Should installer add current package location to path or use some other way?
Current preferred workflow:
1. checkout package from version control
2. do something so python finds and can use that said package
3. use package in some project
4. edit package (project should use edited package even before I push those changes to repo)
File contents
project.py:
# Doesn't work currently since package isn't added to path
from package import foo
from package import bar
foo.do_stuff()
bar.do_things()

How do I make this package available to every python project?
The standard way to distribute packages is as source distributions on PyPI (or a private pip-compatible repository). The details take up many pages, so I can't explain them all here, but the Python Packaging User Guide has everything you want to know.
The basic idea is that you create a setup.py file that tells Python how to install your program. Look at the official sample linked from the tutorial.
Now, someone can just download and unzip your package and run python setup.py install and it will install your program for them. Or, better, they can type pip install . from the unpacked directory.
But, even better, you can python setup.py sdist to create a source distribution, test it out, upload it to PyPI, and then users don't have to download anything, they can just pip install mylib.
If you want binary installations for different platforms, setuptools knows how to make both Windows installers, which can be double-clicked and run, and wheel files, which can be installed with pip (in fact, it'll automatically find and use wheels instead of source distributions). If you don't have any compiled C extensions, python setup.py bdist_wheel --universal is enough to build a wheel that works everywhere.
Should installer add current package location to path or use some other way?
The installer should install the package into the user's system or user (or current-virtual-environment-system) site-packages. If you use setuptools, pip will take care of this automatically.
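To check where that is for the interpreter you are running, the standard library can tell you. This is only a small sketch; the helper name describe_install_target is made up for illustration:

```python
import sys
import sysconfig


def describe_install_target():
    """Return (site_packages_dir, in_virtualenv) for the running interpreter."""
    # "purelib" is the directory that receives pure-Python packages,
    # i.e. the site-packages directory of this interpreter.
    site_packages = sysconfig.get_paths()["purelib"]
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    in_virtualenv = sys.prefix != sys.base_prefix
    return site_packages, in_virtualenv


if __name__ == "__main__":
    target, in_venv = describe_install_target()
    print("site-packages:", target)
    print("virtual environment active:", in_venv)
```

Running it once inside and once outside a virtual environment shows how the install target moves with the environment.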
If you're not trying to make this public, you can still use setuptools and pip. Just check the source distribution into source control, then you can install the latest version at any time like this:
pip install --upgrade git+https://github.com/mycompany/mylib
This also means you can skip a lot of the PyPI metadata in your setup.py file (e.g., nobody cares about your project's classifiers if it's not going to end up in the PyPI repository).
But you can still take advantage of all of the parts of the Python packaging system that make your life easier, while skipping the parts that aren't relevant.

Related

What is the minimal setup.py needed to develop poetry packages?

I am developing a python package managed by poetry. The package has some complex requirements that are very difficult to install successfully on my system. I want the ability to install this in editable mode, with the ability to ignore dependencies (something which the developer of poetry frowns on). Unfortunately, I do not have the option of converting this package to a more mature packaging system.
Apparently the simple solution is to create a setup.py for the project and pip install -e that. Since unfortunately poetry has spread like a cancer to many projects now, I will have to employ such a workaround frequently. As such, I want to minimize the tedium by not copying over fields like description which are irrelevant to developing the package.
What is the minimal setup.py file that I can use as a template for such poetry projects? I assume it must at least include the package name, version and location. Is there anything else?
I am also planning to not put any requirements in the setup.py file, since the whole point is to bypass the requirements defined by poetry and pyproject.toml. I am fine with manually resolving ModuleNotFoundError: No module named 'foo' errors by typing pip install foo.

It appears sufficient to create the following file:
# distutils was removed in Python 3.12; setuptools provides the same setup()
from setuptools import setup

setup(
    name="<PACKAGE_NAME>",
    version="<PACKAGE_VERSION>",
)
You also need to comment out the entire [build-system] block in the pyproject.toml file (see also How do I configure git to ignore some files locally? so you don't accidentally commit that change).
I think the package name and version can be automatically pulled from the toml file as well, but I'm not sure right now how to do it.

How does setuptools install test dependencies for the python setup.py test command?

When I use the command python setup.py test, all of the documentation I've seen says setuptools will handle installing the testing dependencies. Where does it install them and are they deleted from the machine after the test suite runs? I've noticed none of the testing modules are actually installed into my virtual environment after this command completes.
I understand it takes all of the modules in the tests_require list and installs them somewhere but I'm not sure where, what it does with them afterward and why it does this. Also, is there any way to pass arguments to the command without using flags, like with a config file or something?

Avoid python setup.py test and tests_require: they are crufty and now deprecated.
That old feature just downloads the test dependencies into the project's setup directory, which is seldom what the developer wanted or expected to happen! That doesn't work well in modern CI workflows with virtual environments, where you would want your dependencies installed to site-packages.
The recommended way to do it using setuptools these days is with an extras_require entry. See here for an example.

It installs them as .egg files into an automatically created subdirectory of the code base named .eggs. That's because eggs are designed to be importable from any location.
This will thus most likely not work in a modern environment, because packages are no longer distributed as eggs (which lost out to wheels, .whl), so setuptools has to build them from source (with bdist_egg). That is likely to fail for many widely used binary packages with nontrivial build requirements (not to mention the time needed, and the fact that packages are not tested as eggs either and may fail when packaged like this).
Instead, listing build requirements in requirements.txt and invoking pip install -r requirements.txt before the build seems to have become widespread practice. This does not make setup.py automatically buildable from source by pip though.
I tried to install them myself from setup.py but this proved to be fragile (e.g. if the user doesn't have write access to site-packages).
The best solution adopted by at least a number of high-profile projects seems to be to just make setup.py fail if they are not present. This is especially useful if the requirements are not Python but C libraries as setup.py doesn't know how to install these in the specific environment anyway. As you can see, this complements requirements.txt naturally.
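The "fail if they are not present" approach could look roughly like the sketch below, placed near the top of setup.py. The helper names are made up, and importlib.util.find_spec only tells you whether a top-level module can be imported, not which version it is:

```python
import importlib.util


def find_missing(modules):
    """Return the top-level module names from `modules` that cannot be imported."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def require_build_modules(modules):
    """Abort with a readable message if any build requirement is missing."""
    missing = find_missing(modules)
    if missing:
        raise SystemExit(
            "Missing build requirements: " + ", ".join(missing)
            + "\nInstall them first, e.g. pip install -r requirements.txt"
        )


if __name__ == "__main__":
    # "json" ships with Python, so this particular check passes.
    require_build_modules(["json"])
    print("all build requirements present")
```

For C libraries the same idea applies, but the check itself would have to be something else (for example, trying to locate a header or shared library).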

requirements.txt vs setup.py

I started working with Python. I've added requirements.txt and setup.py to my project. But, I am still confused about the purpose of both files. I have read that setup.py is designed for redistributable things and that requirements.txt is designed for non-redistributable things. But I am not certain this is accurate.
How are those two files truly intended to be used?

requirements.txt:
This helps you to set up your development environment.
Programs like pip can be used to install all packages listed in the file in one fell swoop. After that you can start developing your python script. Especially useful if you plan to have others contribute to the development or use virtual environments.
This is how you use it:
pip install -r requirements.txt
It can be produced easily by pip itself:
pip freeze > requirements.txt
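Note that pip freeze lists every package installed in the current environment (by default omitting pip itself and similar packaging tools), so run it inside a fresh virtual environment if you want the produced file to stay minimal.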
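If you want to look at the installed packages from Python instead of the shell, importlib.metadata (Python 3.8+) can produce a similar name==version listing. This is only an approximation for illustration, not how pip freeze is implemented, and the helper name is made up:

```python
from importlib import metadata


def frozen_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(frozen_requirements()))
```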
setup.py:
This helps you to create packages that you can redistribute.
The setup.py script is meant to install your package on the end user's system, not to prepare the development environment as pip install -r requirements.txt does. See this answer for more details on setup.py.
The dependencies of your project are listed in both files.

The short answer is that requirements.txt is for listing package requirements only. setup.py on the other hand is more like an installation script. If you don't plan on installing the python code, typically you would only need requirements.txt.
The file setup.py describes, in addition to the package dependencies, the set of files and modules that should be packaged (or compiled, in the case of native modules (i.e., written in C)), and metadata to add to the python package listings (e.g. package name, package version, package description, author, ...).
Because both files list dependencies, this can lead to a bit of duplication. Read below for details.
requirements.txt
This file lists python package requirements. It is a plain text file (optionally with comments) that lists the package dependencies of your python project (one per line). It does not describe the way in which your python package is installed. You would generally consume the requirements file with pip install -r requirements.txt.
The filename of the text file is arbitrary, but is often requirements.txt by convention. When exploring source code repositories of other python packages, you might stumble on other names, such as dev-dependencies.txt or dependencies-dev.txt. Those serve the same purpose as requirements.txt but generally list additional dependencies of interest to developers of the particular package, namely for testing the source code (e.g. pytest, pylint, etc.) before release. Users of the package generally wouldn't need the entire set of developer dependencies to run the package.
If multiple requirements-X.txt variants are present, then usually one will list runtime dependencies, and the others build-time or test dependencies. Some projects also cascade their requirements files, i.e. one requirements file includes another (example). Doing so can reduce repetition.
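How such a cascade expands can be sketched with a small resolver. This is not pip's parser: it only understands comments, blank lines and -r includes, and the file names are made up for the demo:

```python
import re
import tempfile
from pathlib import Path

# A comment starts at the beginning of a line or after whitespace.
COMMENT = re.compile(r"(^|\s+)#.*$")


def expand_requirements(path):
    """Return requirement lines from `path`, following '-r other.txt' includes."""
    path = Path(path)
    result = []
    for raw in path.read_text().splitlines():
        line = COMMENT.sub("", raw).strip()
        if not line:
            continue
        if line.startswith("-r "):
            # Included files are looked up relative to the including file.
            result.extend(expand_requirements(path.parent / line[3:].strip()))
        else:
            result.append(line)
    return result


def demo():
    """Build a two-file cascade in a temp dir and return the expanded list."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp, "requirements.txt")
        dev = Path(tmp, "requirements-dev.txt")
        base.write_text("requests>=2\n")
        dev.write_text("-r requirements.txt  # runtime deps first\npytest\n")
        return expand_requirements(dev)


if __name__ == "__main__":
    print(demo())
```

Here requirements-dev.txt pulls in everything from requirements.txt and then adds its own entries, so the runtime list is written down only once.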
setup.py
This is a python script which uses the setuptools module to define a python package (name, files included, package metadata, and installation). It will, like requirements.txt, also list runtime dependencies of the package. Setuptools is the de-facto way to build and install python packages, but it has its shortcomings, which over time have spurred the development of new "meta-package managers", like pip. Example shortcomings of setuptools are its inability to install multiple versions of the same package, and lack of an uninstall command.
When a python user does pip install ./pkgdir_my_module (or pip install my-module), pip will run setup.py in the given directory (or module). Similarly, any module which has a setup.py can be pip-installed, e.g. by running pip install . from the same folder.
Do I really need both?
Short answer is no, but it's nice to have both. They achieve different purposes, but they can both be used to list your dependencies.
There is one trick you may consider to avoid duplicating your list of dependencies between requirements.txt and setup.py. If you have written a fully working setup.py for your package already, and your dependencies are mostly external, you could consider having a simple requirements.txt with only the following:
# requirements.txt
#
# installs dependencies from ./setup.py, and the package itself,
# in editable mode
-e .
# (the -e above is optional). you could also just install the package
# normally with just the line below (after uncommenting)
# .
The -e is a special pip install option which installs the given package in editable mode. When pip install -r requirements.txt is run on this file, pip will install your dependencies via the list in ./setup.py. The editable option places a link to your source directory in the install location (instead of an egg or archived copy). It allows developers to edit code in place from the repository without reinstalling.
You can also take advantage of what's called "setuptools extras" when you have both files in your package repository. You can define optional packages in setup.py under a custom category, and install those packages from just that category with pip:
# setup.py
from setuptools import setup

setup(
    name="FOO",
    # ...
    extras_require={
        "dev": ["pylint"],
        "build": ["requests"],
    },
    # ...
)
and then, in the requirements file:
# install packages in the [build] category, from setup.py
# (path/to/mypkg is the directory where setup.py is)
-e path/to/mypkg[build]
This would keep all your dependency lists inside setup.py.
Note: You would normally execute pip and setup.py from a sandbox, such as those created with the program virtualenv. This will avoid installing python packages outside the context of your project's development environment.

For the sake of completeness, here is how I see it from 4 different angles.
Their design purposes are different
This is the precise description quoted from the official documentation (emphasis mine):
Whereas install_requires (in setup.py) defines the dependencies for a single project, Requirements Files are often used to define the requirements for a complete Python environment.
Whereas install_requires requirements are minimal, requirements files often contain an exhaustive listing of pinned versions for the purpose of achieving repeatable installations of a complete environment.
That might still not be easy to understand, so the next section gives 2 concrete examples to demonstrate how the 2 approaches are supposed to be used differently.
Their actual usages are therefore (supposed to be) different
If your project foo is going to be released as a standalone library (meaning, others would probably do import foo), then you (and your downstream users) would want to have a flexible declaration of dependency, so that your library is not "picky" about the exact versions of YOUR dependencies (and it must not be). So, typically, your setup.py would contain lines like this:
install_requires=[
    'A>=1,<2',
    'B>=2',
]
If you just want to somehow "document" or "pin" your EXACT current environment for your application bar, meaning, you or your users would like to use your application bar as-is, i.e. running python bar.py, you may want to freeze your environment so that it would always behave the same. In such case, your requirements file would look like this:
A==1.2.3
B==2.3.4
# It could even contain some dependencies NOT strictly required by your library
pylint==3.4.5
In reality, which one do I use?
If you are developing an application bar which will be used by python bar.py, even if that is "just a script for fun", you are still recommended to use requirements.txt because, who knows, next week (which happens to be Christmas) you might receive a new computer as a gift, and you would need to set up your exact environment there again.
If you are developing a library foo which will be used by import foo, you have to prepare a setup.py. Period.
But you may still choose to also provide a requirements.txt at the same time, which can:
(a) either be in the A==1.2.3 style (as explained in #2 above);
(b) or just contain a magical single line:
.
The latter essentially uses the conventional requirements.txt habit to document that your installation step is pip install ., which means "install the requirements based on setup.py" without duplication. Personally I consider that this last approach kind of blurs the line and adds to the confusion, but it is nonetheless a convenient way to explicitly opt out of dependency pinning when running in a CI environment. The trick was derived from an approach mentioned by Python packaging maintainer Donald in his blog post.
Different lower bounds.
Assuming there is an existing engine library with this history:
engine 1.1.0 Use steam
...
engine 1.2.0 Internal combustion is invented
engine 1.2.1 Fix engine leaking oil
engine 1.2.2 Fix engine overheat
engine 1.2.3 Fix occasional engine stalling
engine 2.0.0 Introducing nuclear reactor
You follow the above 3 criteria and correctly decide that your new library hybrid-engine will use a setup.py to declare its dependency engine>=1.2.0,<2, and that your separate application reliable-car will use requirements.txt to declare its dependency engine>=1.2.3,<2 (or you may want to just pin engine==1.2.3). As you see, the lower bounds you chose are subtly different, and neither of them uses the latest engine==2.0.0. Here is why.
hybrid-engine depends on engine>=1.2.0 because the needed add_fuel() API was first introduced in engine 1.2.0, and that capability is what hybrid-engine requires, regardless of whether that version had some (minor) bugs that were fixed in the subsequent versions 1.2.1, 1.2.2 and 1.2.3.
reliable-car depends on engine>=1.2.3 because that is the earliest version WITHOUT known issues, so far. Sure, there are new capabilities in later versions, e.g. the "nuclear reactor" introduced in engine 2.0.0, but they are not necessarily desirable for project reliable-car. (Yet another new project of yours, time-machine, would likely use engine>=2.0.0, but that is a different topic.)
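To make the two lower bounds concrete, here is a sketch that checks the engine history against both declarations. The comparison is deliberately simplified (plain dotted integers only, not full PEP 440 semantics), and the helper names are made up; a real project would typically lean on an existing library for this:

```python
import operator

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       ">": operator.gt, "<": operator.lt}


def parse(version):
    """Turn '1.2.3' into (1, 2, 3); only plain dotted integers are handled."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, spec):
    """Check `version` against a comma-separated spec such as '>=1.2.0,<2'."""
    v = parse(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators before their one-character prefixes.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = parse(clause[len(symbol):])
                # Pad with zeros so that '2' compares like '2.0.0'.
                width = max(len(v), len(bound))
                left = v + (0,) * (width - len(v))
                right = bound + (0,) * (width - len(bound))
                if not OPS[symbol](left, right):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


HISTORY = ["1.1.0", "1.2.0", "1.2.1", "1.2.2", "1.2.3", "2.0.0"]

# Versions acceptable to the library hybrid-engine (setup.py).
library_ok = [v for v in HISTORY if satisfies(v, ">=1.2.0,<2")]
# Versions acceptable to the application reliable-car (requirements.txt).
application_ok = [v for v in HISTORY if satisfies(v, ">=1.2.3,<2")]

if __name__ == "__main__":
    print("hybrid-engine accepts:", library_ok)
    print("reliable-car accepts:", application_ok)
```

The library accepts every 1.2.x release, while the application narrows that down to the one release without known issues.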

TL;DR
requirements.txt lists concrete dependencies
setup.py lists abstract dependencies
A common misunderstanding with respect to dependency management in Python is whether you need to use a requirements.txt or setup.py file in order to handle dependencies.
The chances are you may have to use both in order to ensure that dependencies are handled appropriately in your Python project.
The requirements.txt file is supposed to list the concrete dependencies. In other words, it should list pinned dependencies (using the == specifier). This file will then be used in order to create a working virtual environment that will have all the dependencies installed, with the specified versions.
On the other hand, the setup.py file should list the abstract dependencies. This means that it should list the minimal dependencies for running the project. Apart from dependency management though, this file also serves the package distribution (say on PyPI).
For a more comprehensive read, you can read the article requirements.txt vs setup.py in Python on TDS.
Now going forward and as of PEP-517 and PEP-518, you may have to use a pyproject.toml in order to specify that you want to use setuptools as the build-tool and an additional setup.cfg file to specify the details.
For more details you can read the article setup.py vs setup.cfg in Python.

How can I install a python package without pip or virtualenv

I have to deploy a python application to a production server (Ubuntu) that I do not control, and I do not have permissions to use apt-get, pip, virtualenv, etc. Currently, the server is running python 2.6+. I need to install pycrypto as a dependency for the application but given my limited permissions, I'm not sure how to do it. The only thing I have permissions to do is wget a resource and unpack it, or things along those lines.
First off, is it possible to use it without getting it installed in the aforementioned approach? If not, could I download the package then drop in __init__.py files in the pycrypto dir so python knows how to find it like so:
/my_app
    /pycrypto
        /__init__.py
        /pycrypto.py

According to PEP370, starting with python 2.6 you can have a per-user site directory (see the What's new in Python 2.6?).
So you can use the --user option of easy_install to install the package into the per-user directory instead of system-wide. I believe a similar option exists for pip too.
This doesn't require any privileges since it only uses current user directories.
If you don't have any installer installed you can manually unpack the package into:
~/.local/lib/python2.6/site-packages
Or, if you are on Windows, into:
%APPDATA%/Python/Python26/site-packages
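Rather than typing the path by hand, you can ask the interpreter for its per-user directory. A minimal sketch using the standard site module, assuming a reasonably current Python (the helper name is made up):

```python
import site


def user_site_info():
    """Return (user_site_dir, enabled) for the running interpreter."""
    # getusersitepackages() computes the path even if the directory
    # has not been created yet.
    user_site = site.getusersitepackages()
    # ENABLE_USER_SITE is truthy only when Python adds that directory
    # to sys.path; it is switched off in most virtual environments.
    enabled = bool(site.ENABLE_USER_SITE)
    return user_site, enabled


if __name__ == "__main__":
    path, enabled = user_site_info()
    print("per-user site-packages:", path)
    print("used for imports:", enabled)
```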
In the case of pycrypto, the package requires building before installation because it contains some C code. The sources should contain a setup.py file. You have to build the library running
python setup.py build
Afterwards you can install it in the user directory by giving:
python setup.py install --user
Note that the building phase might require some C library to already be installed.
If you don't want to do this, the only option is to ship the library together with your application.
By the way: I believe easy_install doesn't really check whether you are root before performing a system wide install. It simply checks whether it can write in the system-wide site directory. So, if you do have the privileges to write there, there's no need to use sudo in the first place. However this would be really odd...

Use easy_install. It should be installed already on Ubuntu for python 2.6+. If not take a look at these install instructions.

Best practice for installing python modules from an arbitrary VCS repository

I'm newish to the python ecosystem, and have a question about module editing.
I use a bunch of third-party modules, distributed on PyPi. Coming from a C and Java background, I love the ease of easy_install <whatever>. This is a new, wonderful world, but the model breaks down when I want to edit the newly installed module for two reasons:
The egg files may be stored in a folder or archive somewhere crazy on the file system.
Using an egg seems to preclude using the version control system of the originating project, just as using a debian package precludes development from an originating VCS repository.
What is the best practice for installing modules from an arbitrary VCS repository? I want to be able to continue to import foomodule in other scripts. And if I modify the module's source code, will I need to perform any additional commands?

Pip lets you install packages given a URL to a Subversion, Git, Mercurial or Bazaar repository.
pip install -e svn+http://path_to_some_svn/repo#egg=package_name
Example:
pip install -e hg+https://rwilcox@bitbucket.org/ianb/cmdutils#egg=cmdutils
This downloads the latest version of cmdutils (a random package I decided to pull).
I installed this into a virtualenv (using the -E parameter), and pip installed cmdutils into a src folder at the top level of my virtualenv folder.
pip install -E thisIsATest -e hg+https://rwilcox@bitbucket.org/ianb/cmdutils#egg=cmdutils
$ ls thisIsATest/src
cmdutils

Are you wanting to do development but have the developed version be handled as an egg by the system (for instance to get entry-points)? If so then you should check out the source and use Development Mode by doing:
python setup.py develop
If the project happens to not be a setuptools based project, which is required for the above, a quick work-around is this command:
python -c "import setuptools; execfile('setup.py')" develop
Almost everything you ever wanted to know about setuptools (the basis of easy_install) is available from the setuptools docs. Also there are docs for easy_install.
Development mode adds the project to your import path in the same way that easy_install does. Any changes you make will be available to your apps the next time they import the module.
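The underlying mechanism can be illustrated with a path configuration (.pth) file: each line naming a directory gets added to the import path when the containing site directory is processed. The sketch below imitates that with temporary directories. The names foomodule and dev.pth are invented, and this shows the general mechanism rather than the exact files that setup.py develop writes:

```python
import importlib
import site
import sys
import tempfile
from pathlib import Path


def demo():
    """Import a module through a .pth entry, edit it, and import it again."""
    saved_path = list(sys.path)
    with tempfile.TemporaryDirectory() as tmp:
        checkout = Path(tmp, "checkout")        # stands in for a VCS working copy
        site_dir = Path(tmp, "site-packages")   # stands in for site-packages
        checkout.mkdir()
        site_dir.mkdir()
        source = checkout / "foomodule.py"
        source.write_text("VALUE = 'original'\n")
        # The .pth file lists extra directories to put on sys.path.
        (site_dir / "dev.pth").write_text(str(checkout) + "\n")
        try:
            site.addsitedir(str(site_dir))  # processes dev.pth
            importlib.invalidate_caches()
            module = importlib.import_module("foomodule")
            before = module.VALUE
            # Edit the working copy in place; no reinstall step is needed.
            source.write_text("VALUE = 'edited'\n")
            # reload() stands in for "the next time the app imports it".
            after = importlib.reload(module).VALUE
        finally:
            sys.path[:] = saved_path
            sys.modules.pop("foomodule", None)
    return before, after


if __name__ == "__main__":
    print(demo())
```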
As others mentioned, you can also directly use version control URLs if you just want to get the latest version as it is now without the ability to edit, but that will only take a snapshot, and indeed creates a normal egg as part of the process. I know for sure it does Subversion and I thought it did others but I can't find the docs on that.

You can use the PYTHONPATH environment variable or symlink your code to somewhere in site-packages.
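The PYTHONPATH variant can be tried end to end with a throwaway directory. In this sketch the module name foomodule is hypothetical, and the child interpreter gets a PYTHONPATH that replaces (rather than extends) any existing value, for brevity:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def demo():
    """Run a child interpreter that imports a module found via PYTHONPATH."""
    with tempfile.TemporaryDirectory() as tmp:
        # The directory plays the role of your source checkout.
        Path(tmp, "foomodule.py").write_text("def greet():\n    return 'hello'\n")
        env = dict(os.environ, PYTHONPATH=tmp)
        result = subprocess.run(
            [sys.executable, "-c", "import foomodule; print(foomodule.greet())"],
            env=env, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(demo())
```

Because the child process imports straight from the checkout directory, edits to the source are picked up on the next run without any install step.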

Packages installed by easy_install tend to come from snapshots of the developer's version control, generally made when the developer releases an official version. You're therefore going to have to choose between convenient automatic downloads via easy_install and up-to-the-minute code updates via version control. If you pick the latter, you can build and install most packages seen in the python package index directly from a version control checkout by running python setup.py install.
If you don't like the default installation directory, you can install to a custom location instead, and export a PYTHONPATH environment variable whose value is the path of the installed package's parent folder.
