packaging with numpy and test suite - python

Introduction
Disclaimer: I'm very new to Python packaging with distutils. So far I've just organized everything into modules and packages manually and developed on top of that. I've never written a setup.py file before.
I have a Fortran module that I want to use in my python code with numpy. I figured the best way to do that would be f2py, since it is included in numpy. To automate the build process I want to use distutils and the corresponding numpy enhancement, which includes convenience functions for f2py wrappers.
I do not understand how I should organize my files, and how to include my test suite.
What I want is the ability to use ./setup.py for building, installing, testing, and developing.
My directory structure looks as follows:
volterra
├── setup.py
└── volterra
    ├── __init__.py
    ├── integral.f90
    ├── test
    │   ├── __init__.py
    │   └── test_volterra.py
    └── volterra.f90
And the setup.py file contains this:
def configuration(parent_package='', top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('volterra', parent_package, top_path)
    config.add_extension('_volterra',
                         sources=['volterra/integral.f90', 'volterra/volterra.f90'])
    return config

if __name__ == '__main__':
    from numpy.distutils.core import setup
    setup(**configuration(top_path='').todict())
After running ./setup.py build I get:
build/lib.linux-x86_64-2.7/
└── volterra
    └── _volterra.so
This includes neither the __init__.py file nor the tests.
Questions
Is it really necessary to add the path to every single source file of the extension (i.e. volterra/integral.f90)? Can't I give a parameter that says to look for stuff in volterra/? The top_path and package_dir parameters didn't do the trick.
Currently, the __init__.py file is not included in the build. Why is that?
How can I run my tests in this setup?
What's the best workflow for doing development in such an environment? I don't want to install my package for every single change I do. How do you do development in the source directory when you need to compile some extension modules?

Here is a setup.py that works for me:
# pkg - A fancy software package
# Copyright (C) 2013 author (email)
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see http://www.gnu.org/licenses/gpl.html.
"""pkg: a software suite for
Hey look at me I'm a long description
But how long am I?
"""
from __future__ import division, print_function
#ideas for setup/f2py came from:
# -numpy setup.py: https://github.com/numpy/numpy/blob/master/setup.py 2013-11-07
# -winpython setup.py: http://code.google.com/p/winpython/source/browse/setup.py 2013-11-07
# -needing to use
# import setuptools; from numpy.distutils.core import setup, Extension:
# http://comments.gmane.org/gmane.comp.python.f2py.user/707 2013-11-07
# -wrapping FORTRAN code with f2py: http://www2-pcmdi.llnl.gov/cdat/tutorials/f2py-wrapping-fortran-code 2013-11-07
# -numpy distutils: http://docs.scipy.org/doc/numpy/reference/distutils.html 2013-11-07
# -manifest files in distutils:
#  'distutils doesn't properly update MANIFEST. When the contents of directories change.'
#  https://github.com/numpy/numpy/blob/master/setup.py
# -if things are not working try deleting build, sdist, egg directories and try again:
# https://stackoverflow.com/a/9982133/2530083 2013-11-07
# -getting fortran extensions to be installed in their appropriate sub package
# i.e. "my_ext = Extension(name = 'my_pack._fortran', sources = ['my_pack/code.f90'])"
# Note that sources is a list even if one file:
# http://numpy-discussion.10968.n7.nabble.com/f2py-and-setup-py-how-can-I-specify-where-the-so-file-goes-tp34490p34497.html 2013-11-07
# -install fortran source files into their appropriate sub-package
# i.e. "package_data={'': ['*.f95','*.f90']}# Note it's a dict and list":
# https://stackoverflow.com/a/19373744/2530083 2013-11-07
# -Chapter 9 Fortran Programming with NumPy Arrays:
# Langtangen, Hans Petter. 2013. Python Scripting for Computational Science. 3rd edition. Springer.
# -Hitchhikers guide to packaging :
# http://guide.python-distribute.org/
# -Python Packaging: Hate, hate, hate everywhere :
# http://lucumr.pocoo.org/2012/6/22/hate-hate-hate-everywhere/
# -How To Package Your Python Code:
# http://www.scotttorborg.com/python-packaging/
# -install testing requirements:
# https://stackoverflow.com/a/7747140/2530083 2013-11-07
import setuptools
from numpy.distutils.core import setup, Extension
import os
import os.path as osp
def readme(filename='README.rst'):
    with open(filename) as f:
        text = f.read()
    return text
def get_package_data(name, extlist):
    """Return data files for package *name* with extensions in *extlist*"""
    # modified slightly from http://code.google.com/p/winpython/source/browse/setup.py 2013-11-7
    flist = []
    # Workaround to replace os.path.relpath (not available until Python 2.6):
    offset = len(name) + len(os.sep)
    for dirpath, _dirnames, filenames in os.walk(name):
        for fname in filenames:
            if not fname.startswith('.') and osp.splitext(fname)[1] in extlist:
                # flist.append(osp.join(dirpath, fname)[offset:])
                flist.append(osp.join(dirpath, fname))
    return flist
DOCLINES = __doc__.split("\n")
CLASSIFIERS = """\
Development Status :: 1 - Planning
License :: OSI Approved :: GNU Lesser General Public License v3 or later (LGPLv3+)
Programming Language :: Python :: 2.7
Topic :: Scientific/Engineering
"""
NAME = 'pkg'
MAINTAINER = "me"
MAINTAINER_EMAIL = "me@me.com"
DESCRIPTION = DOCLINES[0]
LONG_DESCRIPTION = "\n".join(DOCLINES[2:])#readme('readme.rst')
URL = "http://meeeee.mmemem"
DOWNLOAD_URL = "https://github.com/rtrwalker/geotecha.git"
LICENSE = 'GNU General Public License v3 or later (GPLv3+)'
CLASSIFIERS = [_f for _f in CLASSIFIERS.split('\n') if _f]
KEYWORDS=''
AUTHOR = "me"
AUTHOR_EMAIL = "me.com"
PLATFORMS = ["Windows"]#, "Linux", "Solaris", "Mac OS-X", "Unix"]
MAJOR = 0
MINOR = 1
MICRO = 0
ISRELEASED = False
VERSION = '%d.%d.%d' % (MAJOR, MINOR, MICRO)
INSTALL_REQUIRES=[]
ZIP_SAFE=False
TEST_SUITE='nose.collector'
TESTS_REQUIRE=['nose']
DATA_FILES = [(NAME, ['LICENSE.txt','README.rst'])]
PACKAGES=setuptools.find_packages()
PACKAGES.remove('tools')
PACKAGE_DATA={'': ['*.f95', '*.f90']}
ext_files = get_package_data(NAME,['.f90', '.f95','.F90', '.F95'])
ext_module_names = ['.'.join(osp.splitext(v)[0].split(osp.sep)) for v in ext_files]
EXT_MODULES = [Extension(name=x,sources=[y]) for x, y in zip(ext_module_names, ext_files)]
setup(
    name=NAME,
    version=VERSION,
    maintainer=MAINTAINER,
    maintainer_email=MAINTAINER_EMAIL,
    description=DESCRIPTION,
    long_description=LONG_DESCRIPTION,
    url=URL,
    download_url=DOWNLOAD_URL,
    license=LICENSE,
    classifiers=CLASSIFIERS,
    author=AUTHOR,
    author_email=AUTHOR_EMAIL,
    platforms=PLATFORMS,
    packages=PACKAGES,
    data_files=DATA_FILES,
    install_requires=INSTALL_REQUIRES,
    zip_safe=ZIP_SAFE,
    test_suite=TEST_SUITE,
    tests_require=TESTS_REQUIRE,
    package_data=PACKAGE_DATA,
    ext_modules=EXT_MODULES,
)
To install, at the command line I use:
python setup.py install
python setup.py clean --all
The only issue I seem to have is a minor one. When I look in site-packages for my package, it is installed inside the egg folder C:\Python27\Lib\site-packages\pkg-0.1.0-py2.7-win32.egg\pkg. Most other packages I see there have a C:\Python27\Lib\site-packages\pkg folder separate from the egg folder. Does anyone know how to get that separation?
As for testing, after installing, I type the following at the command line:
nosetests package_name -v
Try investigating python setup.py develop (Python setup.py develop vs install) for not having to install the package after every change.
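For example (the pip form is the modern equivalent; this assumes pip is available):
python setup.py develop
pip install -e .
Both link the package into site-packages instead of copying it, so edits to the Python sources take effect immediately; only the compiled Fortran extensions need rebuilding after a change.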
As I commented in the code, I found the references listed there useful, plus one more:
'python setup.py develop':
https://stackoverflow.com/a/19048754/2530083

Here is the setup.py from a project I made. I have found figuring out setup.py and packaging to be frustrating, with no solid answers and definitely not pythonic in the sense of having one and only one obvious way to do something. Hopefully this will help a little.
The points you may find useful are:
find_packages, which removes the drudgery of including lots of files or messing around with generating a manifest
package_data, which allows you to easily specify non-.py files to be included
install_requires / tests_require
You'll need to find the source for distribute_setup.py if you don't have it already.
Is it really necessary to add the path to every single source file of the extension (i.e. volterra/integral.f90)? Can't I give a parameter that says to look for stuff in volterra/? The top_path and package_dir parameters didn't do the trick.
Currently, the __init__.py file is not included in the build. Why is that?
Hopefully find_packages() will solve both of those. I don't have much experience packaging but I haven't had to go back to manual inclusion yet.
How can I run my tests in this setup?
I think this is probably a different question with many answers depending on how you are doing tests. Maybe you can ask it separately?
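If you do end up using nose for this, one convenient pattern is a test() hook inside the package itself, so the suite can be run from an interpreter after a build; a minimal sketch (the hook name and argv are my assumptions, not from the question):
def test():
    import nose
    # run the bundled volterra.test package verbosely; assumes nose is installed
    return nose.run(argv=['nosetests', 'volterra.test', '-v'])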
As a side note, I am under the impression that the standard is to put your tests directory at the top level. I.e. volterra/volterra and volterra/tests.
What's the best workflow for doing development in such an environment? I don't want to install my package for every single change I do. How do you do development in the source directory when you need to compile some extension modules?
This might be worth another question as well. I don't see why you would need to install your package for every single change. If you are uploading the package, just don't install it on your dev system (except to test installation) and work directly from your development copy. Maybe I'm missing something though since I don't work with compiled extensions.
Here is the example:
try:
    from setuptools import setup, find_packages
except ImportError:
    from distribute_setup import use_setuptools
    use_setuptools()
    from setuptools import setup, find_packages

setup(
    # ... other stuff
    py_modules=['distribute_setup'],
    packages=find_packages(),
    package_data={'': ['*.png']},  # for me to include anything with png
    install_requires=['numpy', 'treenode', 'investigators'],
    tests_require=['mock', 'numpy', 'treenode', 'investigators'],
)
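As a quick sanity check, you can run find_packages() from the project root and print what it discovers; for the volterra layout from the question I would expect something like this (the expected output is my assumption):
from setuptools import find_packages
print(find_packages())  # e.g. ['volterra', 'volterra.test']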

Related

Packaging a Python extension for a C++ library: what to include, how to include it in .whl?

I'm writing an extension for a C++ library to make it available in Python using Pybind11. The library itself depends on a couple of other C++ libraries.
What I don't get is what files I am supposed to include in my distribution package, and how. After mixing up some code from the Python Packaging Guide and Building C++ Extensions, I got the following files:
setup.py
from setuptools import setup, Extension
# from distutils.core import setup, Extension  # used this at first; switched to setuptools, didn't see a difference

src = ['module.cpp']  # ... other cpp files
include = ['MyLibrary/include']  # ... other header dirs for 3rd-party libs

module = Extension(
    'TestlibPy',
    sources=src,
    include_dirs=include,
    libraries=[],      # library names
    library_dirs=[],   # library dirs
    language='c++',
)

setup(
    ext_modules=[module],
)
setup.cfg
[metadata]
name = TestlibPy
version = 0.0.1
description = Python interface for Test library
classifiers =
    Programming Language :: Python :: 3

[options]
packages = find:
python_requires = >=3.6
pyproject.toml
[build-system]
requires = [
    "setuptools>=42",
    "wheel"
]
build-backend = "setuptools.build_meta"
After building with
py -m build
I got the bare minimum packages, which don't even include the headers (and I don't get the logic here: it builds the distribution with them in mind, from the directory provided in include_dirs, yet doesn't think other users will need those headers?).
So I wrote a MANIFEST.in:
graft MyLibC++/include
graft MyLib/3rdpartyLibs
After another build I get
package.tar.gz - contains everything I asked for and works on my other laptop after installation, but it's a plain archive that gives away the source code to anyone who bothers to look (I obviously don't want that);
package.whl - ignores my MANIFEST.in and doesn't seem to work (is it supposed to? Did all the necessary information get into the .pyd file without me knowing any better?)
My questions are:
Is it alright to include all the 3rd party C++ libraries + pybind11 in my distribution package, or is there some better way to do things?
Should the C++ library be in .dll format, or can I get away with a bunch of .cpp files?
Can I somehow write the .hpp and 3rdparty files to the .whl package? Or it should work without them?
That's my first time working with Python, extensions and packages, so maybe I'm asking all the wrong questions. Any advice would be helpful.
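One quick way to check what actually got into the .whl is to list its contents, since a wheel is just a zip archive; a sketch, with a hypothetical wheel filename:
import zipfile
print(zipfile.ZipFile('dist/TestlibPy-0.0.1-cp39-cp39-win_amd64.whl').namelist())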

Let sphinx use version from setup.py

If I do sphinx-quickstart I get asked about the version of the project.
I want to avoid having two places for the version of my project.
How is this done in the Python packaging world?
The easiest (and probably cleanest) way is to define __version__ in the __init__.py of your top-level package, and then import that package and read the version in both setup.py and your Sphinx project's conf.py.
So let's say your project is called myproject.
Move your current version out of setup.py, and make it a variable in myproject/__init__.py instead:
myproject/__init__.py:
# import foo
# ...
__version__ = '1.5'
Import myproject in your project's setup.py, and replace the hardcoded version with myproject.__version__:
setup.py:
from setuptools import setup
from myproject import __version__

project = "myproject"

setup(
    name=project,
    version=__version__,
    # ...
)
In your Sphinx project's conf.py, do the same. So edit the generated conf.py along these lines:
docs/conf.py:
from myproject import __version__
# ...
# The short X.Y version.
version = __version__
# The full version, including alpha/beta/rc tags.
release = version
For an example of a library that does this pretty much exactly like this, have a look at the requests module (__init__.py | setup.py | conf.py).
This will take care of the auto-generated texts where the project version is used (like the links to the front page of the documentation). If you want to use your version in specific custom places, you can use the rst_epilog directive to dynamically insert configuration values defined in conf.py.
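For example, a minimal sketch of that rst_epilog approach (the substitution name is illustrative):
# docs/conf.py
rst_epilog = '.. |ProjectVersion| replace:: %s' % version
After which |ProjectVersion| can be used in any .rst file of the documentation.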
Maybe an even cleaner option is to actually build sphinx from the setup.py command as described in http://www.sphinx-doc.org/en/master/setuptools.html:
setup.py
# this is only necessary when not using setuptools/distribute
from sphinx.setup_command import BuildDoc
cmdclass = {'build_sphinx': BuildDoc}

from setuptools import setup

name = 'My project'
version = '1.2'
release = '1.2.0'
setup(
    name=name,
    author='Bernard Montgomery',
    version=release,
    cmdclass=cmdclass,
    # these are optional and override conf.py settings
    command_options={
        'build_sphinx': {
            'project': ('setup.py', name),
            'version': ('setup.py', version),
            'release': ('setup.py', release),
            'source_dir': ('setup.py', 'doc'),
        },
    },
)
Then build documentation using
$ python setup.py build_sphinx
Benefits:
Makes setup.py the single source of version number
Avoids having to make packages out of your project folders unnecessarily
You could have a look at the bumpversion module:
"A small command line tool to simplify releasing software by updating all version strings in your source code by the correct increment"
You may use a configuration file .bumpversion.cfg for complex multi-file operations.
Another way is integrating setuptools_scm in your project. This way you can, in your conf.py:
from setuptools_scm import get_version
version = get_version()
Here is a straightforward solution, ironically from the setuptools_scm PyPI page:
# contents of docs/conf.py
from importlib.metadata import version
release = version('myproject')
# for example take major/minor
version = '.'.join(release.split('.')[:2])
Here is their explanation why it is discouraged to use their package from Sphinx:
The underlying reason is, that services like Read the Docs sometimes change the working directory for good reasons and using the installed metadata prevents using needless volatile data there.
Extract Information from pyproject.toml
If you use a pyproject.toml you could also parse it in the conf.py with tomli or use the equivalent tomllib when you are on python ^3.11.
Like this you can extract the information from the pyproject.toml and use it in your sphinx documentation configuration.
Here is a short, incomplete example using tomli, assuming conf.py is located at <project-root>/docs/source/conf.py:
# conf.py
import tomli

with open("../../pyproject.toml", "rb") as f:
    toml = tomli.load(f)

# -- Project information -----------------------------------------------------
pyproject = toml["tool"]["poetry"]

project = pyproject["name"]
version = pyproject["version"]
release = pyproject["version"]
copyright = ...
author = ...
# and the rest of the configuration
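On Python 3.11+ the same approach works with the stdlib tomllib, which likewise expects the file opened in binary mode; a minimal sketch:
# conf.py (Python 3.11+)
import tomllib
with open("../../pyproject.toml", "rb") as f:
    toml = tomllib.load(f)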

How to include license file in setup.py script?

I have written a Python extension module in C++.
I plan to distribute the module with setuptools.
There will be binary distributions for 32- and 64-bit Windows (built with setup.py bdist_egg) and a source distribution for UNIX-like platforms (built with setup.py sdist).
I plan to license the module under the BSD license.
In my source tree, the file LICENSE.txt is in the top folder along with setup.py.
How should I include it in the installation package?
I tried the following setup.py script:
from setuptools import setup, Extension
from glob import glob

setup(
    name='Foo',
    version='0.1.0',
    ext_modules=[Extension('Foo', glob('Source/*.cpp'))],
    package_data={'': ['LICENSE.txt']}
)
It did not work; the license file is not included in the installation package. Maybe this is because the setup.py file does not define any packages, only a single extension module.
How do I fix this?
Write a setup.cfg file and in there specify:
[metadata]
license_files = LICENSE.txt
For this to work, it seems that wheel needs to be installed. That is:
pip install wheel
If you have wheel already installed and it doesn't work, try to update it:
pip install --upgrade wheel
Then when installing the package via pip install <path> the LICENSE file gets included.
Since setuptools 42.0.0 you can use the license_files key to specify a list of license files to be included into a distribution. Since version 56.0.0 it supports pattern matching and defaults to ('LICEN[CS]E*', 'COPYING*', 'NOTICE*', 'AUTHORS*').
Note that due to implementation details there's actually no need to put this key into setup.cfg file (as another answer suggests). You could supply it as an argument to setup() function instead:
(documentation was unclear on this at the time of writing)
from setuptools import setup

setup(
    ...
    license_files=('LICENSE.txt',),
    ...
)
Also note that while these files will be included in both binary (wheel) and source distributions, they won't be installed with your package from setup.py-style source distribution if the user doesn't have a wheel package installed!
To ensure the license files will be installed along with your package you need to make some additional modifications to your setup script:
from setuptools import setup
from setuptools.command.egg_info import egg_info

class egg_info_ex(egg_info):
    """Includes license file into `.egg-info` folder."""

    def run(self):
        # don't duplicate license into `.egg-info` when building a distribution
        if not self.distribution.have_run.get('install', True):
            # `install` command is in progress, copy license
            self.mkpath(self.egg_info)
            self.copy_file('LICENSE.txt', self.egg_info)

        egg_info.run(self)

setup(
    ...
    license_files=('LICENSE.txt',),
    cmdclass={'egg_info': egg_info_ex},
    ...
)
If your project is a pyproject.toml-style project and you think it will be installed by PEP 517-compatible frontend (e.g. pip>=19), a wheel will be forcibly built from your sources and the license files will be installed into .dist-info folder automatically.
Since version 61.0.0 you could specify project metadata and other configuration options in pyproject.toml file instead.
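For instance, a minimal pyproject.toml sketch using the PEP 621 license table (the file name is taken from the question):
[project]
name = "Foo"
version = "0.1.0"
license = { file = "LICENSE.txt" }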
Using a MANIFEST.in file, the license can be included in both the source package and wheels automatically:
MANIFEST.in
include README.md
include COPYING
Check out an example here:
https://github.com/node40/smsh
New setuptools (40.x) allows metadata, including the license, to be stored in setup.cfg's "metadata" section. If you use an older setuptools, you could provide the license using the "license" named argument in your setup():
import os
from glob import glob

from setuptools import setup, Extension

base_path = os.path.dirname(__file__)

def read_text(file_name: str):
    return open(os.path.join(base_path, file_name)).read()

setup(
    name='Foo',
    version='0.1.0',
    ext_modules=[Extension('Foo', glob('Source/*.cpp'))],
    # package_data = {'': ['LICENSE.txt']}
    license=read_text("LICENSE.txt"),
)
You have to move the LICENSE.txt file into the package directory for your project. It cannot reside at the top level: Python package directories get deployed, not the surrounding project directory. If you create a Python package, that package actually contains a number of subpackages, and each subpackage must contain ALL the files relevant to deployment.
Do not use data_files, as it will actually distribute the files as a separate package. (I've heard package_data works, but I have yet to see a working example of this.)
For example:
setup(
    ...
    license="ZPL",
    classifiers=[
        ...
        'License :: OSI Approved :: Zope Public License',
        ...
    ],
    ...
)
Additionally, you can insert your license text into 'long_description':
setup(
    ...
    long_description="Package description. \nLicense Text",
    ...
)

Installing my sdist from PyPI puts the files in unexpected places

My problem is that when I upload my Python package to PyPI, and then install it from there using pip, my app breaks because it installs my files into completely different locations than when I simply install the exact same package from a local sdist.
Installing from the local sdist puts files on my system like this:
/Python27/
    Lib/
        site-packages/
            gloopy-0.1.alpha-py2.7.egg/ (egg and install info files)
                data/ (images and shader source)
                doc/ (html)
                examples/ (.py scripts that use the library)
                gloopy/ (source)
This is much as I'd expect, and works fine (e.g. my source can find my data dir, because they lie next to each other, just like they do in development.)
If I upload the same sdist to PyPI and then install it from there, using pip, then things look very different:
/Python27/
    data/ (images and shader source)
    doc/ (html)
    Lib/
        site-packages/
            gloopy-0.1.alpha-py2.7.egg/ (egg and install info files)
                gloopy/ (source files)
    examples/ (.py scripts that use the library)
This doesn't work at all: my app can't find its data files, plus obviously it's a mess, polluting the top-level /Python27 directory with all my junk.
What am I doing wrong? How do I make the pip install behave like the local sdist install? Is that even what I should be trying to achieve?
Details
I have setuptools installed, and also distribute, and I'm calling distribute_setup.use_setuptools()
WindowsXP, Python2.7.
My development directory looks like this:
/gloopy
    /data (image files and GLSL shader source read at runtime)
    /doc (html files)
    /examples (some scripts to show off the library)
    /gloopy (the library itself)
My MANIFEST.in mentions all the files I want to be included in the sdist, including everything in the data, examples and doc directories:
recursive-include data *.*
recursive-include examples *.py
recursive-include doc/html *.html *.css *.js *.png
include LICENSE.txt
include TODO.txt
My setup.py is quite verbose, but I guess the best thing is to include it here, right? It also includes duplicate references to the same data / doc / examples directories as are mentioned in the MANIFEST.in, because I understand this is required in order for these files to be copied from the sdist to the system during install.
NAME = 'gloopy'
VERSION = __import__(NAME).VERSION
RELEASE = __import__(NAME).RELEASE
SCRIPT = None
CONSOLE = False

def main():
    import sys
    from glob import glob
    from pprint import pprint
    from setup_utils import distribute_setup
    from setup_utils.sdist_setup import get_sdist_config
    distribute_setup.use_setuptools()
    from setuptools import setup, find_packages

    description, long_description = read_description()
    config = dict(
        name=NAME,
        version=VERSION,
        description=description,
        long_description=long_description,
        keywords='',
        packages=find_packages(),
        data_files=[
            ('examples', glob('examples/*.py')),
            ('data/shaders', glob('data/shaders/*.*')),
            ('doc', glob('doc/html/*.*')),
            ('doc/_images', glob('doc/html/_images/*.*')),
            ('doc/_modules', glob('doc/html/_modules/*.*')),
            ('doc/_modules/gloopy', glob('doc/html/_modules/gloopy/*.*')),
            ('doc/_modules/gloopy/geom', glob('doc/html/_modules/gloopy/geom/*.*')),
            ('doc/_modules/gloopy/move', glob('doc/html/_modules/gloopy/move/*.*')),
            ('doc/_modules/gloopy/shapes', glob('doc/html/_modules/gloopy/shapes/*.*')),
            ('doc/_modules/gloopy/util', glob('doc/html/_modules/gloopy/util/*.*')),
            ('doc/_modules/gloopy/view', glob('doc/html/_modules/gloopy/view/*.*')),
            ('doc/_static', glob('doc/html/_static/*.*')),
            ('doc/_api', glob('doc/html/_api/*.*')),
        ],
        classifiers=[
            'Development Status :: 1 - Planning',
            'Intended Audience :: Developers',
            'License :: OSI Approved :: BSD License',
            'Operating System :: Microsoft :: Windows',
            'Programming Language :: Python :: 2.7',
        ],
        # see classifiers http://pypi.python.org/pypi?:action=list_classifiers
    )
    config.update(dict(
        author='Jonathan Hartley',
        author_email='tartley@tartley.com',
        url='http://bitbucket.org/tartley/gloopy',
        license='New BSD',
    ))

    if '--verbose' in sys.argv:
        pprint(config)

    setup(**config)

if __name__ == '__main__':
    main()
The data_files parameter is for data files that aren't part of the package. You should probably use package_data instead.
See https://docs.python.org/3/distutils/setupscript.html#installing-package-data
That wouldn't install the data in site-packages/data, but in my opinion that's not where it should be installed anyway, since you wouldn't know which package it's a part of. It should be installed in site-packages/gloopy-0.1.alpha-py2.7.egg/[data|doc|examples] IMO.
If you really do think the data is not package data, then you should use data_files and in that case pip installs it correctly, while I'd claim setup.py install installs it in the wrong place. But in my opinion, in this case, it is package_data, as it's related to the package, and not used by other software.
You can load package data with pkgutil.get_data(); it will find exactly where the package data is installed.
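A sketch of what that looks like (the resource path is illustrative, assuming data/ is moved inside the gloopy package as suggested above):
import pkgutil
# returns the file contents as bytes, wherever the package is installed
shader_src = pkgutil.get_data('gloopy', 'data/shaders/lighting.glsl')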
Here is a nice blog post about including data files in packages: Including data files into Python packages

Standard way to embed version into Python package?

Is there a standard way to associate a version string with a Python package in such a way that I could do the following?
import foo
print(foo.version)
I would imagine there's some way to retrieve that data without any extra hardcoding, since minor/major strings are specified in setup.py already. An alternative solution I found was to have import __version__ in my foo/__init__.py and then have __version__.py generated by setup.py.
Not directly an answer to your question, but you should consider naming it __version__, not version.
This is almost a quasi-standard: many modules in the standard library use __version__, as do lots of third-party modules.
Usually, __version__ is a string, but sometimes it's also a float or tuple.
As mentioned by S.Lott (Thank you!), PEP 8 says it explicitly:
Module Level Dunder Names
Module level "dunders" (i.e. names with two leading and two trailing
underscores) such as __all__, __author__, __version__, etc.
should be placed after the module docstring but before any import
statements except from __future__ imports.
You should also make sure that the version number conforms to the format described in PEP 440 (PEP 386 a previous version of this standard).
I use a single _version.py file as the "one canonical place" to store version information:
1. It provides a __version__ attribute.
2. It provides the standard metadata version. Therefore it will be detected by pkg_resources or other tools that parse the package metadata (EGG-INFO and/or PKG-INFO, PEP 0345).
3. It doesn't import your package (or anything else) when building your package, which can cause problems in some situations. (See the comments below about what problems this can cause.)
4. There is only one place that the version number is written down, so there is only one place to change it when the version number changes, and there is less chance of inconsistent versions.
Here is how it works: the "one canonical place" to store the version number is a .py file, named "_version.py" which is in your Python package, for example in myniftyapp/_version.py. This file is a Python module, but your setup.py doesn't import it! (That would defeat feature 3.) Instead your setup.py knows that the contents of this file is very simple, something like:
__version__ = "3.6.5"
And so your setup.py opens the file and parses it, with code like:
import re
VERSIONFILE="myniftyapp/_version.py"
verstrline = open(VERSIONFILE, "rt").read()
VSRE = r"^__version__ = ['\"]([^'\"]*)['\"]"
mo = re.search(VSRE, verstrline, re.M)
if mo:
verstr = mo.group(1)
else:
raise RuntimeError("Unable to find version string in %s." % (VERSIONFILE,))
Then your setup.py passes that string as the value of the "version" argument to setup(), thus satisfying feature 2.
To satisfy feature 1, you can have your package (at run-time, not at setup time!) import the _version file from myniftyapp/__init__.py like this:
from ._version import __version__
Here is an example of this technique that I've been using for years.
The code in that example is a bit more complicated, but the simplified example that I wrote into this comment should be a complete implementation.
Here is example code of importing the version.
If you see anything wrong with this approach, please let me know.
Rewritten 2017-05
After 13+ years of writing Python code and managing various packages, I came to the conclusion that DIY is maybe not the best approach.
I started using the pbr package for dealing with versioning in my packages. If you are using git as your SCM, this will fit into your workflow like magic, saving your weeks of work (you will be surprised about how complex the issue can be).
As of today, pbr has 12M monthly downloads, and reaching this level didn't include any dirty tricks. It was only one thing: fixing a common packaging problem in a very simple way.
pbr can take on more of the package maintenance burden, and is not limited to versioning, but it does not force you to adopt all its benefits.
So to give you an idea about how it looks to adopt pbr in one commit, have a look at switching packaging to pbr.
You have probably observed that the version is not stored at all in the repository. PBR detects it from Git branches and tags.
No need to worry about what happens when you do not have a git repository, because pbr "compiles" and caches the version when you package or install the application, so there is no runtime dependency on git.
Old solution
Here is the best solution I've seen so far and it also explains why:
Inside yourpackage/version.py:
# Store the version here so:
# 1) we don't load dependencies by storing it in __init__.py
# 2) we can import it in setup.py for the same reason
# 3) we can import it into your module
__version__ = '0.12'
Inside yourpackage/__init__.py:
from .version import __version__
Inside setup.py:
exec(open('yourpackage/version.py').read())

setup(
    ...
    version=__version__,
    ...
)
Per the deferred [STOP PRESS: rejected] PEP 396 (Module Version Numbers), there is a proposed way to do this. It describes, with rationale, an (admittedly optional) standard for modules to follow. Here's a snippet:
When a module (or package) includes a version number, the version SHOULD be available in the __version__ attribute.
For modules which live inside a namespace package, the module SHOULD include the __version__ attribute. The namespace package itself SHOULD NOT include its own __version__ attribute.
The __version__ attribute's value SHOULD be a string.
There is a slightly simpler alternative to some of the other answers:
__version_info__ = ('1', '2', '3')
__version__ = '.'.join(__version_info__)
(And it would be fairly simple to convert auto-incrementing portions of version numbers to a string using str().)
Of course, from what I've seen, people tend to use something like the previously-mentioned version when using __version_info__, and as such store it as a tuple of ints; however, I don't quite see the point in doing so, as I doubt there are situations where you would perform mathematical operations such as addition and subtraction on portions of version numbers for any purpose besides curiosity or auto-incrementation (and even then, int() and str() can be used fairly easily). (On the other hand, there is the possibility of someone else's code expecting a numerical tuple rather than a string tuple and thus failing.)
This is, of course, my own view, and I would gladly like others' input on using a numerical tuple.
As shezi reminded me, (lexical) comparisons of number strings do not necessarily have the same result as direct numerical comparisons; leading zeroes would be required to provide for that. So in the end, storing __version_info__ (or whatever it would be called) as a tuple of integer values would allow for more efficient version comparisons.
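A quick illustration of that pitfall:
print(('1', '10', '0') < ('1', '9', '0'))  # True: lexically, '10' sorts before '9'
print((1, 10, 0) < (1, 9, 0))              # False: integer tuples compare correctly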
Many of these solutions here ignore git version tags which still means you have to track version in multiple places (bad). I approached this with the following goals:
Derive all python version references from a tag in the git repo
Automate git tag/push and setup.py upload steps with a single command that takes no inputs.
How it works:
From a make release command, the last tagged version in the git repo is found and incremented. The tag is pushed back to origin.
The Makefile stores the version in src/_version.py where it will be read by setup.py and also included in the release. Do not check _version.py into source control!
setup.py command reads the new version string from package.__version__.
Details:
Makefile
# remove optional 'v' and trailing hash "v1.0-N-HASH" -> "v1.0-N"
git_describe_ver = $(shell git describe --tags | sed -E -e 's/^v//' -e 's/(.*)-.*/\1/')
git_tag_ver = $(shell git describe --abbrev=0)
next_patch_ver = $(shell python versionbump.py --patch $(call git_tag_ver))
next_minor_ver = $(shell python versionbump.py --minor $(call git_tag_ver))
next_major_ver = $(shell python versionbump.py --major $(call git_tag_ver))

.PHONY: ${MODULE}/_version.py
${MODULE}/_version.py:
	echo '__version__ = "$(call git_describe_ver)"' > $@

.PHONY: release
release: test lint mypy
	git tag -a $(call next_patch_ver)
	$(MAKE) ${MODULE}/_version.py
	python setup.py check sdist upload  # (legacy "upload" method)
	# twine upload dist/*  (preferred method)
	git push origin master --tags
The release target always increments the 3rd version digit, but you can use next_minor_ver or next_major_ver to increment the other digits. The commands rely on the versionbump.py script that is checked into the root of the repo.
versionbump.py
"""An auto-increment tool for version strings."""
import sys
import unittest
import click
from click.testing import CliRunner # type: ignore
__version__ = '0.1'
MIN_DIGITS = 2
MAX_DIGITS = 3
#click.command()
#click.argument('version')
#click.option('--major', 'bump_idx', flag_value=0, help='Increment major number.')
#click.option('--minor', 'bump_idx', flag_value=1, help='Increment minor number.')
#click.option('--patch', 'bump_idx', flag_value=2, default=True, help='Increment patch number.')
def cli(version: str, bump_idx: int) -> None:
"""Bumps a MAJOR.MINOR.PATCH version string at the specified index location or 'patch' digit. An
optional 'v' prefix is allowed and will be included in the output if found."""
prefix = version[0] if version[0].isalpha() else ''
digits = version.lower().lstrip('v').split('.')
if len(digits) > MAX_DIGITS:
click.secho('ERROR: Too many digits', fg='red', err=True)
sys.exit(1)
digits = (digits + ['0'] * MAX_DIGITS)[:MAX_DIGITS] # Extend total digits to max.
digits[bump_idx] = str(int(digits[bump_idx]) + 1) # Increment the desired digit.
# Zero rightmost digits after bump position.
for i in range(bump_idx + 1, MAX_DIGITS):
digits[i] = '0'
digits = digits[:max(MIN_DIGITS, bump_idx + 1)] # Trim rightmost digits.
click.echo(prefix + '.'.join(digits), nl=False)
if __name__ == '__main__':
cli() # pylint: disable=no-value-for-parameter
This does the heavy lifting of processing and incrementing the version number from git.
__init__.py
The my_module/_version.py file is imported into my_module/__init__.py. Put any static install config here that you want distributed with your module.
from ._version import __version__
__author__ = ''
__email__ = ''
setup.py
The last step is to read the version info from the my_module module.
from setuptools import setup, find_packages

pkg_vars = {}

with open("{MODULE}/_version.py") as fp:
    exec(fp.read(), pkg_vars)

setup(
    version=pkg_vars['__version__'],
    ...
)
Of course, for all of this to work you'll have to have at least one version tag in your repo to start.
git tag -a v0.0.1
I use a JSON file in the package dir. This fits Zooko's requirements.
Inside pkg_dir/pkg_info.json:
{"version": "0.1.0"}
Inside setup.py:
from distutils.core import setup
import json

with open('pkg_dir/pkg_info.json') as fp:
    _info = json.load(fp)

setup(
    version=_info['version'],
    ...
)
Inside pkg_dir/__init__.py:
import json
from os.path import dirname

with open(dirname(__file__) + '/pkg_info.json') as fp:
    _info = json.load(fp)

__version__ = _info['version']
I also put other information in pkg_info.json, like author. I like to use JSON because I can automate management of metadata.
Lots of work toward uniform versioning and in support of conventions has been completed since this question was first asked. Palatable options are now detailed in the Python Packaging User Guide. Also noteworthy is that version number schemes are relatively strict in Python per PEP 440, and so keeping things sane is critical if your package will be released to the Cheese Shop.
Here's a shortened breakdown of versioning options:
1. Read the file in setup.py (setuptools) and get the version.
2. Use an external build tool (to update both __init__.py as well as source control), e.g. bump2version, changes or zest.releaser.
3. Set the value to a __version__ global variable in a specific module.
4. Place the value in a simple VERSION text file for both setup.py and code to read.
5. Set the value via a setup.py release, and use importlib.metadata to pick it up at runtime. (Warning, there are pre-3.8 and post-3.8 versions.)
6. Set the value to __version__ in sample/__init__.py and import sample in setup.py.
7. Use setuptools_scm to extract versioning from source control so that it's the canonical reference, not code.
NOTE that (7) might be the most modern approach (build metadata is independent of code, published by automation). Also NOTE that if setup.py is used for the package release, a simple python3 setup.py --version will report the version directly.
Also worth noting: as well as __version__ being a semi-standard in Python, so is __version_info__, which is a tuple; in the simple cases you can just do something like:
__version__ = '1.2.3'
__version_info__ = tuple([ int(num) for num in __version__.split('.')])
...and you can get the __version__ string from a file, or whatever.
There doesn't seem to be a standard way to embed a version string in a Python package. Most packages I've seen use some variant of your solution, i.e. either
Embed the version in setup.py and have setup.py generate a module (e.g. version.py) containing only version info, that's imported by your package, or
The reverse: put the version info in your package itself, and import that to set the version in setup.py
arrow handles it in an interesting way.
Now (since 2e5031b)
In arrow/__init__.py:
__version__ = 'x.y.z'
In setup.py:
from arrow import __version__
setup(
    name='arrow',
    version=__version__,
    # [...]
)
Before
In arrow/__init__.py:
__version__ = 'x.y.z'
VERSION = __version__
In setup.py:
import re

def grep(attrname):
    pattern = r"{0}\W*=\W*'([^']+)'".format(attrname)
    strval, = re.findall(pattern, file_text)
    return strval

file_text = read(fpath('arrow/__init__.py'))

setup(
    name='arrow',
    version=grep('__version__'),
    # [...]
)
I also saw another style:
>>> django.VERSION
(1, 1, 0, 'final', 0)
After several hours of trying to find the simplest reliable solution, here are the parts:
Create a version.py file INSIDE the folder of your package, "/mypackage":
# Store the version here so:
# 1) we don't load dependencies by storing it in __init__.py
# 2) we can import it in setup.py for the same reason
# 3) we can import it into your module
__version__ = '1.2.7'
in setup.py:
exec(open('mypackage/version.py').read())
setup(
name='mypackage',
version=__version__,
in the main folder __init__.py:
from .version import __version__
The exec() call reads the version without importing your package, since setup.py runs before the module can be imported. You still only need to manage the version number in one file in one place, but unfortunately it is not in setup.py. (That's the downside, but having no import bugs is the upside.)
pbr with bump2version
This solution was derived from this article
The use case: a Python GUI package distributed via PyInstaller that needs to show version info.
Here is the structure of the project packagex:
packagex
├── packagex
│   ├── __init__.py
│   ├── main.py
│   └── _version.py
├── packagex.spec
├── LICENSE
├── README.md
├── .bumpversion.cfg
├── requirements.txt
├── setup.cfg
└── setup.py
where setup.py is
# setup.py
import os

import setuptools

about = {}
with open("packagex/_version.py") as f:
    exec(f.read(), about)

os.environ["PBR_VERSION"] = about["__version__"]

setuptools.setup(
    setup_requires=["pbr"],
    pbr=True,
    version=about["__version__"],
)
packagex/_version.py contains just
__version__ = "0.0.1"
and packagex/__init__.py
from ._version import __version__
and for .bumpversion.cfg
[bumpversion]
current_version = 0.0.1
commit = False
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+)(?P<build>\d+))?
serialize =
    {major}.{minor}.{patch}-{release}{build}
    {major}.{minor}.{patch}

[bumpversion:part:release]
optional_value = prod
first_value = dev
values =
    dev
    prod

[bumpversion:file:packagex/_version.py]
Using setuptools and pbr
There is not a standard way to manage version, but the standard way to manage your packages is setuptools.
The best solution I've found overall for managing version is to use setuptools with the pbr extension. This is now my standard way of managing version.
Setting up your project for full packaging may be overkill for simple projects, but if you need to manage version, you are probably at the right level to just set everything up. Doing so also makes your package releasable at PyPi so everyone can download and use it with Pip.
PBR moves most metadata out of the setup.py tools and into a setup.cfg file that is then used as a source for most metadata, which can include version. This allows the metadata to be packaged into an executable using something like pyinstaller if needed (if so, you will probably need this info), and separates the metadata from the other package management/setup scripts. You can directly update the version string in setup.cfg manually, and it will be pulled into the *.egg-info folder when building your package releases. Your scripts can then access the version from the metadata using various methods (these processes are outlined in sections below).
When using Git for VCS/SCM, this setup is even better, as it will pull in a lot of the metadata from Git so that your repo can be your primary source of truth for some of the metadata, including version, authors, changelogs, etc. For version specifically, it will create a version string for the current commit based on git tags in the repo.
PyPA - Packaging Python Packages with SetupTools - Tutorial
PBR latest build usage documentation - How to setup an 8-line setup.py and a setup.cfg file with the metadata.
As PBR will pull version, author, changelog and other info directly from your git repo, so some of the metadata in setup.cfg can be left out and auto generated whenever a distribution is created for your package (using setup.py)
Get the current version in real-time
setuptools will pull the latest info in real-time using setup.py:
python setup.py --version
This will pull the latest version either from the setup.cfg file, or from the git repo, based on the latest commit that was made and tags that exist in the repo. This command doesn't update the version in a distribution though.
Updating the version metadata
When you create a distribution with setup.py (e.g. py setup.py sdist), all the current info will be extracted and stored in the distribution. This essentially runs the setup.py --version command and then stores that version info into the package.egg-info folder in a set of files that store distribution metadata.
Note on process to update version meta-data:
If you are not using pbr to pull version data from git, then just update your setup.cfg directly with new version info (easy enough, but make sure this is a standard part of your release process).
If you are using git, and you don't need to create a source or binary distribution (using python setup.py sdist or one of the python setup.py bdist_xxx commands) the simplest way to update the git repo info into your <mypackage>.egg-info metadata folder is to just run the python setup.py install command. This will run all the PBR functions related to pulling metadata from the git repo and update your local .egg-info folder, install script executables for any entry-points you have defined, and other functions you can see from the output when you run this command.
Note that the .egg-info folder is generally excluded from being stored in the git repo itself in standard Python .gitignore files (such as from Gitignore.IO), as it can be generated from your source. If it is excluded, make sure you have a standard "release process" to get the metadata updated locally before release, and any package you upload to PyPi.org or otherwise distribute must include this data to have the correct version. If you want the Git repo to contain this info, you can exclude specific files from being ignored (i.e. add !*.egg-info/PKG_INFO to .gitignore)
Accessing the version from a script
You can access the metadata from the current build within Python scripts in the package itself. For version, for example, there are several ways to do this I have found so far:
## This one is a new built-in as of Python 3.8.0 and should become the standard
from importlib.metadata import version
v0 = version("mypackage")
print('v0 {}'.format(v0))
## I don't like this one because the version method is hidden
import pkg_resources # part of setuptools
v1 = pkg_resources.require("mypackage")[0].version
print('v1 {}'.format(v1))
# Probably best for pre v3.8.0 - the output without .version is just a longer string with
# both the package name, a space, and the version string
import pkg_resources # part of setuptools
v2 = pkg_resources.get_distribution('mypackage').version
print('v2 {}'.format(v2))
## This one seems to be slower, and with pyinstaller makes the exe a lot bigger
from pbr.version import VersionInfo
v3 = VersionInfo('mypackage').release_string()
print('v3 {}'.format(v3))
You can put one of these directly in your __init__.py for the package to extract the version info as follows, similar to some other answers:
__all__ = (
    '__version__',
    'my_package_name',
)

import pkg_resources  # part of setuptools
__version__ = pkg_resources.get_distribution("mypackage").version
Create a file named _version.txt in the same folder as __init__.py and write the version as a single line:
0.8.2
Read this information from the file _version.txt in __init__.py:
import os

def get_version():
    with open(os.path.join(os.path.abspath(os.path.dirname(__file__)), "_version.txt")) as f:
        return f.read().strip()

__version__ = get_version()
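A caveat: since _version.txt is not a .py file, it must also be declared as package data, or it won't be present after installation; a sketch, assuming the package is named mypackage:
# setup.py
from setuptools import setup

setup(
    ...
    package_data={'mypackage': ['_version.txt']},
)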
I described a standard and modern way here, relying on setuptools_scm.
This pattern has worked successfully for dozens of published packages over the past years, so I can warmly recommend it.
Note that you do not need the getversion package to implement this pattern. It just happens that the getversion documentation hosts this tip.
I prefer to read the package version from the installation environment.
This is my src/foo/_version.py:
from pkg_resources import get_distribution
__version__ = get_distribution('foo').version
Make sure foo is always already installed; that's why a src/ layout is required, to prevent foo from being imported without installation.
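For reference, a sketch of the src/ layout this implies:
project-root
├── setup.py
└── src
    └── foo
        ├── __init__.py
        └── _version.py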
In the setup.py, I use setuptools-scm to generate the version automatically.
Update in 2022.7.5:
There is another way, which is my favourite now. Use setuptools-scm to generate a _version.py file:
setup(
    ...
    use_scm_version={
        'write_to': 'src/foo/_version.py',
        'write_to_template':
            '"""Generated version file."""\n'
            '__version__ = "{version}"\n',
    },
)
Using setuptools and pyproject.toml
Setuptools now offers a way to dynamically get the version in pyproject.toml.
Reproducing the example here, you can create something like the following in your pyproject.toml:
# ...
[project]
name = "my_package"
dynamic = ["version"]
# ...
[tool.setuptools.dynamic]
version = {attr = "my_package.__version__"}
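The attr lookup then needs the attribute to actually exist in the package; a minimal sketch, using the package name from the example above:
# my_package/__init__.py
__version__ = "1.2.3"  # read by setuptools via [tool.setuptools.dynamic]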
Use a version.py file containing only a __version__ = <VERSION> assignment. In the setup.py file, import the __version__ value and set it like this:
version=__version__
Another way is to use just a setup.py file with version=<CURRENT_VERSION> - the CURRENT_VERSION is hardcoded.
Since we don't want to manually change the version in the file every time we create a new tag (ready to release a new package version), we can use the following:
I highly recommend bumpversion package. I've been using it for years to bump a version.
Start by adding version=<VERSION> to your setup.py file if you don't have it already.
You should use a short script like this every time you bump a version:
bumpversion (patch|minor|major) - choose only one option
git push
git push --tags
Then add one file per repo called .bumpversion.cfg:
[bumpversion]
current_version = <CURRENT_TAG>
commit = True
tag = True
tag_name = {new_version}
[bumpversion:file:<RELATIVE_PATH_TO_SETUP_FILE>]
Note:
You can use the __version__ parameter in a version.py file, as suggested in other posts, and update the bumpversion file like this:
[bumpversion:file:<RELATIVE_PATH_TO_VERSION_FILE>]
You must git commit or git reset everything in your repo, otherwise you'll get a dirty repo error.
Make sure that your virtual environment includes the bumpversion package; without it, this will not work.
For what it's worth, if you're using NumPy distutils, numpy.distutils.misc_util.Configuration has a make_svn_version_py() method that embeds the revision number inside package.__svn_version__ in the variable version.
If you use CVS (or RCS) and want a quick solution, you can use:
__version__ = "$Revision: 1.1 $"[11:-2]
__version_info__ = tuple([int(s) for s in __version__.split(".")])
(Of course, the revision number will be substituted for you by CVS.)
This gives you a print-friendly version and a version info that you can use to check that the module you are importing has at least the expected version:
import my_module
assert my_module.__version_info__ >= (1, 1)
