I want to package a python module containing python source and a native c++ library. Cppyy is used to dynamically generate the bindings so the library is really just a normal library. The build system for the library is meson and should not be replaced. The whole thing is in a git repository. I only care about Linux.
My question is how to get from this to “pip install url_to_package builds/installs everything.” in the least complicated way possible.
What I’ve tried:
Extending setuptools with a custom build command:
…that executes meson compile and copies the result in the right place. But pip install will perform its work in some random split-off temporary directory and I can’t find my C++ sources from there.
The Meson python module:
…can build my library and install files directly into some python env. Does not work with pip and has very limited functionality.
Wheels:
…are incredibly confusing and overkill for me. I will likely be the only user of this module. Actually, all I want is to easily use the module in projects that live in different directories…
Along the way, I also came across different CMake solutions, but those are disqualified because of my build system choice. What should I do?
I have a virtualenv that serves a few separate projects. I'd like to write a utility library that each of these projects can use. It seems to make sense to stick that in the virtualenv. By all means shoot me down now but please give an alternative if you do.
Assuming I'm not completely crazy though, Where's the best place to stick my library?
My virtualenv sticks everything I install with pip in lib/pyton2.7/site-packages. I wonder if it would make more sense to follow suit or to hack in a separate home (further up the directory tree) so if things do ever clash, my work isn't overwritten by pip (et al).
If your project follows the standard packaging practices with setuptools, then all you have to do is run python setup.py develop inside the virtualenvs that you want the library to be used for. A .egg-link file will be created pointing to your package from which your other libraries will use much like any other packages, with the added benefit that your latest changes will be available to all packages at the same time (if that's your intention). If not, then either you could call python setup.py install or use multiple versions at different locations on the file system.
To get yourself started, take a look at Getting Started With setuptools and setup.py (you can skip the part on registering your package on pypi if this is your private work).
Another relevant stackoverflow thread: setup.py examples? The Hitchhiker's Guide to Packaging can also be quite useful.
I had my fair chance of getting through the python management of modules, and every time is a challenge: packaging is not what people do every day, and it becomes a burden to learn, and a burden to remember, even when you actually do it, since this happens normally once.
I would like to collect here the definitive overview of how import, package management and distribution works in python, so that this question becomes the definitive explanation for all the magic that happens under the hood. Although I understand the broad level of the question, these things are so intertwined that any focused answer will not solve the main problem: understand how all works, what is outdated, what is current, what are just alternatives for the same task, what are the quirks.
The list of keywords to refer to is the following, but this is just a sample out of the bunch. There's a lot more and you are welcome to add additional details.
PyPI
setuptools / Distribute
distutils
eggs
egg-link
pip
zipimport
site.py
site-packages
.pth files
virtualenv
handling of compiled modules in eggs (with and without installation via easy_install)
use of get_data()
pypm
bento
PEP 376
the cheese shop
eggsecutable
Linking to other answers is probably a good idea. As I said, this question is for the high-level overview.
For the most part, this is an attempt to look at the packaging/distribution side, not the mechanics of import. Unfortunately, packaging is the place where Python provides way more than one way to do it. I'm just trying to get the ball rolling, hopefully others will help fill what I miss or point out mistakes.
First of all there's some messy terminology here. A directory containing an __init__.py file is a package. However, most of what we're talking about here are specific versions of packages published on PyPI, one of it's mirrors, or in a vendor specific package management system like Debian's Apt, Redhat's Yum, Fink, Macports, Homebrew, or ActiveState's pypm.
These published packages are what folks are trying to call "Distributions" going forward in an attempt to use "Package" only as the Python language construct. You can see some of that usage in PEP-376 PEP-376.
Now, your list of keywords relate to several different aspects of the Python Ecosystem:
Finding and publishing python distributions:
PyPI (aka the cheese shop)
PyPI Mirrors
Various package management tools / systems: apt, yum, fink, macports, homebrew
pypm (ActiveState's alternative to PyPI)
The above are all services that provide a place to publish Python distributions in various formats. Some, like PyPI mirrors and apt / yum repositories can be run on your local machine or within your companies network but folks typically use the official ones. Most, if not all provide a tool (or multiple tools in the case of PyPI) to help find and download distributions.
Libraries used to create and install distributions:
setuptools / Distribute
distutils
Distutils is the standard infrastructure on which Python packages are compiled and built into distributions. There's a ton of functionality in distutils but most folks just know:
from distutils.core import setup
setup(name='Distutils',
version='1.0',
description='Python Distribution Utilities',
author='Greg Ward',
author_email='gward#python.net',
url='http://www.python.org/sigs/distutils-sig/',
packages=['distutils', 'distutils.command'],
)
And to some extent that's a most of what you need. With the prior 9 lines of code you have enough information to install a pure Python package and also the minimal metadata required to publish that package a distribution on PyPI.
Setuptools provides the hooks necessary to support the Egg format and all of it's features and foibles. Distribute is an alternative to Setuptools that adds some features while trying to be mostly backwards compatible. I believe Distribute is going to be included in Python 3 as the successor to Distutil's from distutils.core import setup.
Both Setuptools and Distribute provide a custom version of the distutils setup command
that does useful things like support the Egg format.
Python Distribution Formats:
source
eggs
Distributions are typically provided either as source archives (tarball or zipfile). The standard way to install a source distribution is by downloading and uncompressing the archive and then running the setup.py file inside.
For example, the following will download, build, and install the Pygments syntax highlighting library:
curl -O -G http://pypi.python.org/packages/source/P/Pygments/Pygments-1.4.tar.gz
tar -zxvf Pygments-1.4.tar.gz
cd Pygments-1.4
python setup.py build
sudo python setup.py install
Alternatively you can download the Egg file and install it. Typically this is accomplished by using easy_install or pip:
sudo easy_install pygments
or
sudo pip install pygments
Eggs were inspired by Java's Jarfiles and they have quite a few features you should read about here
Python Package Formats:
uncompressed directories
zipimport (zip compressed directories)
A normal python package is just a directory containing an __init__.py file and an arbitrary number of additional modules or sub-packages. Python also has support for finding and loading source code within *.zip files as long as they are included on the PYTHONPATH (sys.path).
Installing Python Packages:
easy_install: the original egg installation tool, depends on setuptools
pip: currently the most popular way to install python packages. Similar to easy_install but more flexible and has some nice features like requirements files to help document dependencies and reproduce deployments.
pypm, apt, yum, fink, etc
Environment Management / Automated Deployment:
bento
buildout
virtualenv (and virtualenvwrapper)
The above tools are used to help automate and manage dependencies for a Python project. Basically they give you tools to describe what distributions your application requires and automate the installation of those specific versions of your dependencies.
Locations of Packages / Distributions:
site-packages
PYTHONPATH
the current working directory (depends on your OS and environment settings)
By default, installing a python distribution is going to drop it into the site-packages directory. That directory is usually something like /usr/lib/pythonX.Y/site-packages.
A simple programmatic way to find your site-packages directory:
from distuils import sysconfig
print sysconfig.get_python_lib()
Ways to modify your PYTHONPATH:
Python's import statement will only find packages that are located in one of the directories included in your PYTHONPATH.
You can inspect and change your path from within Python by accessing:
import sys
print sys.path
sys.path.append("/home/myname/lib")
Besides that, you can set the PYTHONPATH environment variable like you would any other environment variable on your OS or you could use:
.pth files: *.pth files located in directories that are already on your PYTHONPATH are read and each line of the *.pth file is added to your PYTHONPATH. Basically any time you would copy a package into a directory on your PYTHONPATH you could instead create a mypackages.pth. Read more about *.pth files: site module
egg-link files: Internal structure of python eggs they are a cross platform alternative to symbolic links. Creating an egg link file is similar to creating a pth file.
site.py modifications
To add the above /home/myname/lib to site-packages with a *.pth file you'd create a *.pth file. The name of the file doesn't matter but you should still probably choose something sensible.
Let's create myname.pth:
# myname.pth
/home/myname/lib
That's it. Drop that into sysconfig.get_python_lib() on your system or any other directory in your PYTHONPATH and /home/myname/lib will be added to the path.
For packaging question, this should help http://guide.python-distribute.org/
For import, the old article from Fredrik Lundh http://effbot.org/zone/import-confusion.htm still a very good starting point.
I recommend Tarek Ziadek's Book on Python. There's a chapter dedicated to packaging and distribution.
I don't think import needs to be explored (Python's namespacing and importing functionality is intuitive IMHO).
I use pip exclusively now. I haven't run into any issues with it.
However, the topic of packaging and distribution is something worth exploring. Instead of giving a lengthy answer, I will say this:
I learned how to package and distribute my own "packages" by simply copying how Pylons or many other open-source packages do it. I then combined that sort-of template with reading up of the docs to flesh it out even further and have come up with a solid distribution method.
When you grok package management and distribution for python (distutils and pypi) it's actually quite powerful. I like it a lot.
[edit]
I also wanted to add in a bit about virtualenv. USE IT. I create a virtualenv for every project and I always use --no-site-packages; I install all the packages I need for that particular project (even if it's something common amongst them all, like lxml) inside the virtualev. It keeps everything isolated and it's much easier for me to maintain the grouping in my head (rather than trying to keep track of what's where and for which version of python!)
[/edit]
I downloaded an open source project http://gmapcatcher.googlecode.com/files/GMapCatcher-0.7.2.0.tar.gz and I am trying to modify a few things in the code but don't know how to test the code!
I tried to make an installer for the project but nothing worked till now maybe I didn't follow the right steps or I am missing somthing.
my question is how can I modify the code and test it ? and how can I make an installer for this project (I know there is an installer already in google but I want to make it myself).
Looks like the package has a setup.py for the use with distutils. The setup.py works kind of like Makefile for python. The way you use it is (in the directory where setup.py is located:
$ python setup.py command
Where "command" is... well... a command. Type
$ python setup.py --help
for more information. The two basic commands are build and install. install installs, as the name suggests, the package to your system. It is not probably going to do anything like create shortcuts on your desktop or anything like that. It simply installs the Python package into your Python installation. Judging by the contents of setup.py, it seems they're somehow using py2exe (google it; being a newbie I can only include one hyperlink in my answer) to prepare the Windows installer.
To simply run the software, however, it seems all you need to do is to unpack it and do
$ python maps.py
in the package's root directory - provided you have all the necessary dependencies already installed, of course.
I want to develop a common python package, I got other packages depends on it. For example:
packageA/
packageB/
packageC/
commonPackage/
packageA, packageB and packageC can all be executed directly, but they are all depend on commonPackage. I want to install the commonPackage into lib/site-packages, but I don't want it copys the source code. Instead, I want it creates a commonPackage.pth in lib/site-packages with the path of where the commonPackage at. So that when I modify commonPackage or update it from SVN, I don't need to install it again. Here comes the problem, how can I write the setup.py or use the options of python setup.py install so that it would do what I want?
Oops, I just find exactly what I want here. The develop command of setuptools do what I said. Here you type
python setup.py develop
It creates .pth rather than copying everything into site-packages.
You can always take a look at virtualenv which will allow you to create a python environment for each of your projects - this is the ideal way to develop/build/deploy your app without having load up your site-packages directory with all and sundry.
There's a good tutorial here:
http://iamzed.com/2009/05/07/a-primer-on-virtualenv/
Good luck !