How to perform custom build steps in setup.py?

The distutils module allows resource files to be included and installed together with Python modules. How do I properly include them if the resource files must be generated during the build process?
For example, the project is a web application containing CoffeeScript sources that should be compiled into JavaScript and then included in the Python package. Is there a way to integrate this into the normal sdist/bdist process?

I spent a fair while figuring this out; the various suggestions out there are broken in various ways: they break installation of dependencies, they don't work with pip, and so on. Here's my solution:
In setup.py:
import sys

from setuptools import setup, find_packages
from setuptools.command.install import install
from distutils.command.install import install as _install

class install_(install):
    # inject your own code into this func as you see fit
    def run(self):
        ret = None
        if self.old_and_unmanageable or self.single_version_externally_managed:
            ret = _install.run(self)
        else:
            caller = sys._getframe(2)
            caller_module = caller.f_globals.get('__name__', '')
            caller_name = caller.f_code.co_name
            if caller_module != 'distutils.dist' or caller_name != 'run_commands':
                _install.run(self)
            else:
                self.do_egg_install()
        # This is just an example, a post-install hook.
        # It's a nice way to get at your installed module, though.
        import site
        site.addsitedir(self.install_lib)
        sys.path.insert(0, self.install_lib)
        from mymodule import install_hooks
        install_hooks.post_install()
        return ret
Then, in your call to the setup function, pass the argument:
cmdclass={'install': install_}
You could use the same idea for build instead of install, write yourself a decorator to make it easier, and so on. This has been tested both via pip and via direct 'python setup.py install' invocation.

The best way would be to write a custom build_coffeescript command and make it a subcommand of build. More details are given in other replies to similar/duplicate questions, for example this one:
https://stackoverflow.com/a/1321345/150999
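For illustration, here is a minimal sketch of that approach. The spawn call (running a coffee compiler over a hypothetical src/coffee directory into mypackage/static/js) is an assumption; substitute your real compilation step:

from distutils.command.build import build as _build
from setuptools import Command, setup

class build_coffeescript(Command):
    description = 'compile CoffeeScript sources to JavaScript'
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        # self.spawn runs an external program and respects --dry-run.
        self.spawn(['coffee', '--compile', '--output',
                    'mypackage/static/js', 'src/coffee'])

class build(_build):
    # Run the CoffeeScript compilation before the standard build steps.
    sub_commands = [('build_coffeescript', None)] + _build.sub_commands

setup(
    # ... your usual metadata ...
    cmdclass={'build': build, 'build_coffeescript': build_coffeescript},
)

Because build_coffeescript is registered as a subcommand of build, it also runs automatically whenever build runs, e.g. during bdist_wheel or install.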

Related

How do you get the filename of a Python wheel when running setup.py?

I have a build process that creates a Python wheel using the following command:
python setup.py bdist_wheel
The build process can be run on many platforms (Windows, Linux, py2, py3 etc.) and I'd like to keep the default output names (e.g. mapscript-7.2-cp27-cp27m-win_amd64.whl) to upload to PyPI.
Is there any way to get the generated wheel's filename (e.g. mapscript-7.2-cp27-cp27m-win_amd64.whl) and save it to a variable, so I can then install the wheel later on in the script for testing?
Ideally the solution would be cross-platform. My current approach is to clear the folder, list all files, and select the first (and only) file in the list, but this seems very hacky.
setuptools
If you are using a setup.py script to build the wheel distribution, you can use the bdist_wheel command to query the wheel file name. The drawback of this method is that it relies on bdist_wheel's private API, so the code may break on a wheel package update if the authors decide to change it.
from setuptools.dist import Distribution

def wheel_name(**kwargs):
    # create a fake distribution from arguments
    dist = Distribution(attrs=kwargs)
    # finalize bdist_wheel command
    bdist_wheel_cmd = dist.get_command_obj('bdist_wheel')
    bdist_wheel_cmd.ensure_finalized()
    # assemble wheel file name
    distname = bdist_wheel_cmd.wheel_dist_name
    tag = '-'.join(bdist_wheel_cmd.get_tag())
    return f'{distname}-{tag}.whl'
The wheel_name function accepts the same arguments you pass to the setup() function. Example usage:
>>> wheel_name(name="mydist", version="1.2.3")
'mydist-1.2.3-py3-none-any.whl'
>>> wheel_name(name="mydist", version="1.2.3", ext_modules=[Extension("mylib", ["mysrc.pyx", "native.c"])])
'mydist-1.2.3-cp36-cp36m-linux_x86_64.whl'
Notice that the source files for native libs (mysrc.pyx or native.c in the above example) don't have to exist to assemble the wheel name. This is helpful in case the sources for the native lib don't exist yet (e.g. you are generating them later via SWIG, Cython or whatever).
This makes wheel_name easily reusable in the setup.py script where you define the distribution metadata:
# setup.py
from setuptools import setup, find_packages, Extension
from setup_helpers import wheel_name

setup_kwargs = dict(
    name='mydist',
    version='1.2.3',
    packages=find_packages(),
    ext_modules=[Extension(...), ...],
    ...
)
file = wheel_name(**setup_kwargs)
...
setup(**setup_kwargs)
If you want to use it outside of the setup script, you have to organize access to the setup() args yourself (e.g. by reading them from a setup.cfg file or wherever else they live).
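For instance, a rough sketch assuming the name and version live in a [metadata] section of setup.cfg (adjust to however your project actually stores them):

import configparser

# Read the distribution metadata from setup.cfg and feed it to wheel_name.
config = configparser.ConfigParser()
config.read('setup.cfg')
print(wheel_name(name=config['metadata']['name'],
                 version=config['metadata']['version']))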
This part is loosely based on my other answer to "setuptools, know in advance the wheel filename of a native library".
poetry
Things can be simplified a lot (it's practically a one-liner) if you use poetry, because all the relevant metadata is stored in pyproject.toml. Again, this uses an undocumented API:
from clikit.io import NullIO
from poetry.factory import Factory
from poetry.masonry.builders.wheel import WheelBuilder
from poetry.utils.env import NullEnv

def wheel_name(rootdir='.'):
    builder = WheelBuilder(Factory().create_poetry(rootdir), NullEnv(), NullIO())
    return builder.wheel_filename
The rootdir argument is the directory containing your pyproject.toml file.
flit
AFAIK flit can't build wheels with native extensions, so it can only give you the purelib name. Nevertheless, it may be useful if your project uses flit for distribution building. Notice this also uses an undocumented API:
from flit_core.wheel import WheelBuilder
from io import BytesIO
from pathlib import Path

def wheel_name(rootdir='.'):
    config = str(Path(rootdir, 'pyproject.toml'))
    builder = WheelBuilder.from_ini_path(config, BytesIO())
    return builder.wheel_filename
Implementing your own solution
I'm not sure whether it's worth it. Still, if you want to go down this path, consider using packaging.tags before you reach for some old deprecated stuff or decide to query the platform yourself. You will still have to fall back on private stuff to assemble the correct wheel name, though.
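For example, a quick sketch of what packaging.tags exposes (the packaging library is a separate dependency; the exact tags printed depend on your interpreter and platform):

from packaging import tags

# sys_tags() yields the tags supported by the running interpreter,
# most specific first, e.g. cp310-cp310-manylinux_2_17_x86_64.
for tag in list(tags.sys_tags())[:3]:
    print(tag)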
My current approach to install the wheel is to point pip to the folder containing the wheel and let it search itself:
python -m pip install --no-index --find-links=build/dist mapscript
twine can also be pointed directly at a folder, without needing to know the exact wheel name.
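For example, the following uploads whatever was built into dist/ without naming the wheel explicitly:
twine upload dist/*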
I used a modified version of hoefling's solution. My goal was to copy the built wheel to a "latest" wheel file. The setup() function returns an object with all the info you need, so you can find out what it actually built, which seems simpler than the solution above. Assuming you have a version variable in use, the following gets the file name of the wheel just built and then copies it:
import shutil
import setuptools

setup = setuptools.setup(
    # whatever options you currently have
)
wheel_built = 'dist/{}-{}.whl'.format(
    setup.command_obj['bdist_wheel'].wheel_dist_name,
    '-'.join(setup.command_obj['bdist_wheel'].get_tag()))
wheel_latest = wheel_built.replace(version, 'latest')
shutil.copy(wheel_built, wheel_latest)
print('Copied {} >> {}'.format(wheel_built, wheel_latest))
I guess one possible drawback is that you have to actually do the build to get the name, but since that was part of my workflow anyway, I was OK with that. hoefling's solution has the benefit of letting you plan the name without doing the build, but it is more complex.

setup_requires only for some commands

I have a distutils-style Python package which requires a specific, and quite large, dependency for its build step. Currently, this dependency is specified under the setup_requires argument to distutils.setup. Unfortunately, this means the dependency will be built for any execution of setup.py, including when running setup.py clean. This creates the rather ironic situation of the clean step sometimes causing a large amount of code to be compiled.
As I said, this setup dependency is only required for the build step. Is there a way to encode this logic in setup.py so that all commands that do not invoke the build command run without it?
You can always instruct the Distribution to fetch some packages explicitly, the same way they would be fetched if you defined them in setup_requires. Here is an example with a numpy dependency required for the build command only:
from distutils.command.build import build as build_orig
from setuptools import setup, find_packages, Command, dist

class build(build_orig):
    def run(self):
        self.distribution.fetch_build_eggs(['numpy'])
        # numpy becomes available after this line. Test it:
        import numpy
        print(numpy.__version__)
        super().run()

setup(
    name='spam',
    packages=find_packages(),
    cmdclass={'build': build},
    ...
)
The dependencies are passed the same way they would be defined in the setup_requires arg, so version specs are also OK:
self.distribution.fetch_build_eggs(['numpy>=1.13'])
Although I must note that fetching dependencies via setup_requires is usually much slower than installing them via pip (especially when you have some heavy dependencies that must be built from source first), so if you can be sure pip is available (or you use Python 3.4 or newer), the approach suggested by phd in his answer will save you time. Fetching eggs via the distribution may, however, come in handy when building for old Python versions or obscure Python installations like the system Python on macOS.
import sys

# The requested command appears in sys.argv, e.g. ['setup.py', 'build', ...]
if 'build' in sys.argv:
    kw = {'setup_requires': [req1, req2, …]}
else:
    kw = {}

setup(
    …,
    **kw
)
Another approach is to override the build command with a custom cmdclass:
import subprocess
from setuptools.command.build import build as _build

class build(_build):
    def run(self):
        subprocess.call(["pip", "install", req1, req2…])
        _build.run(self)

setup(
    …,
    cmdclass={'build': build},
)
and avoid setup_requires altogether.

python setuptools install_requires is ignored when overriding cmdclass

I have a setup.py that looks like this:
from setuptools import setup
from subprocess import call
from setuptools.command.install import install

class MyInstall(install):
    def run(self):
        call(["pip install -r requirements.txt --no-clean"], shell=True)
        install.run(self)

setup(
    author='Attila Zseder',
    version='0.1',
    name='entity_extractor',
    packages=['...'],
    install_requires=['DAWG', 'mrjob', 'cchardet'],
    package_dir={'': 'modules'},
    scripts=['...'],
    cmdclass={'install': MyInstall},
)
I need MyInstall because I want to install some libraries from GitHub, and I didn't want to use the dependency_links option because it's discouraged (for example here), so I can do this with requirements.txt.
When I install this package with pip, everything works fine, but for various reasons I have to solve this in a way that also works with plain python setup.py install. And it doesn't.
When overriding cmdclass in setup() with my own class, install_requires seems to be ignored. As soon as I comment out that line, those packages are installed.
I know that install_requires is not supported in distutils, for example (if I remember correctly), but it is in setuptools, so cmdclass shouldn't have any effect on install_requires.
I googled this problem for hours and found a lot of loosely related answers on Stack Overflow, but nothing for this particular problem.
With every needed package put into requirements.txt, everything works fine, but I would still like to understand why this is happening. Thanks!
The same problem just happened to me. It somehow seems that something triggers setuptools to do an 'old-style install' with distutils, which indeed does not support install_requires.
You call install.run(self), which calls run(self) in setuptools/command/install.py, lines 51-74:
https://bitbucket.org/pypa/setuptools/src/8e8c50925f18eafb7e66fe020aa91a85b9a4b122/setuptools/command/install.py?at=default
def run(self):
    # Explicit request for old-style install? Just do it
    if self.old_and_unmanageable or self.single_version_externally_managed:
        return _install.run(self)

    # Attempt to detect whether we were called from setup() or by another
    # command. If we were called by setup(), our caller will be the
    # 'run_command' method in 'distutils.dist', and *its* caller will be
    # the 'run_commands' method. If we were called any other way, our
    # immediate caller *might* be 'run_command', but it won't have been
    # called by 'run_commands'. This is slightly kludgy, but seems to
    # work.
    #
    caller = sys._getframe(2)
    caller_module = caller.f_globals.get('__name__', '')
    caller_name = caller.f_code.co_name

    if caller_module != 'distutils.dist' or caller_name != 'run_commands':
        # We weren't called from the command line or setup(), so we
        # should run in backward-compatibility mode to support bdist_*
        # commands.
        _install.run(self)
    else:
        self.do_egg_install()
I'm not sure whether this behaviour is intended, but replacing
install.run(self)
with
install.do_egg_install(self)
should solve your problem. At least it works for me, but I would also appreciate a more detailed answer. Thanks!
According to https://stackoverflow.com/a/20196065, a more correct way to do this may be to override the bdist_egg command.
You could try:
from subprocess import call
from setuptools.command.bdist_egg import bdist_egg as _bdist_egg

class bdist_egg(_bdist_egg):
    def run(self):
        call(["pip install -r requirements.txt --no-clean"], shell=True)
        _bdist_egg.run(self)
...
setup(...
    cmdclass={'bdist_egg': bdist_egg},  # override bdist_egg
)
It worked for me, and install_requires is no longer ignored. Nevertheless, I still don't understand why most people seem to override the install cmdclass and do not complain about install_requires being ignored.
I know this is an old question, but I ran into a similar problem. The solution I found that fixes this for me is subtle: the install class you're setting in cmdclass must physically be named install. See this answer on a related issue.
Note that I use the class name install for my derived class because that is what python setup.py --help-commands will use.
You should also use self.execute(_func_name, (), msg="msg") in your post-install step instead of calling the function directly.
Implementing something like the following should let you avoid the do_egg_install workaround implemented above by KEgg:
from setuptools.command.install import install as _install
...

def _post_install():
    pass  # code here

class install(_install):
    def run(self):
        _install.run(self)
        self.execute(_post_install, (), msg="message here")

Execute a Python script post install using distutils / setuptools

Note: distutils is deprecated and the accepted answer has been updated to use setuptools
I'm trying to add a post-install task to Python distutils as described in How to extend distutils with a simple post install script?. The task is supposed to execute a Python script in the installed lib directory. This script generates additional Python modules the installed package requires.
My first attempt is as follows:
from distutils.core import setup
from distutils.command.install import install

class post_install(install):
    def run(self):
        install.run(self)
        from subprocess import call
        call(['python', 'scriptname.py'],
             cwd=self.install_lib + 'packagename')

setup(
    ...
    cmdclass={'install': post_install},
)
This approach works but, as far as I can tell, has two deficiencies:
If the user has invoked setup.py with a Python interpreter other than the one picked up from PATH, the post-install script will be executed with a different interpreter, which might cause problems.
It's not safe against dry-run etc., which I might be able to remedy by wrapping it in a function and calling it with distutils.cmd.Command.execute.
How could I improve my solution? Is there a recommended way / best practice for doing this? I'd like to avoid pulling in another dependency if possible.
The way to address these deficiencies is:
Get the full path to the Python interpreter executing setup.py from sys.executable.
Classes inheriting from setuptools.Command (such as setuptools.command.install.install, which we use here) implement the execute method, which executes a given function in a "safe way", i.e. respecting the dry-run flag.
Note, however, that the --dry-run option is currently broken and does not work as intended anyway.
I ended up with the following solution:
import os, sys
from setuptools import setup
from setuptools.command.install import install as _install

def _post_install(dir):
    from subprocess import call
    call([sys.executable, 'scriptname.py'],
         cwd=os.path.join(dir, 'packagename'))

class install(_install):
    def run(self):
        _install.run(self)
        self.execute(_post_install, (self.install_lib,),
                     msg="Running post install task")

setup(
    ...
    cmdclass={'install': install},
)
Note that I use the class name install for my derived class because that is what python setup.py --help-commands will use.
I think the easiest way to perform the post-install, and keep the requirements, is to decorate the call to setup(...):

from setuptools import setup

def _post_install(setup):
    def _post_actions():
        do_things()
    _post_actions()
    return setup

setup = _post_install(
    setup(
        name='NAME',
        install_requires=['...'],
    )
)
This will run setup() when it is declared. Once done with the requirements installation, it will run the _post_install() function, which runs the inner function _post_actions().

How to depend on a system command with python/distutils?

I'm looking for the most elegant way to notify users of my library that they need a specific unix command to ensure that it will work.
When is the best time for my lib to raise an error:
Installation?
When my app calls the command?
At the import of my lib?
Both?
And also, how should you detect that the command is missing? Something like: if not commands.getoutput("which CommandIDependsOn"): raise Exception("you need CommandIDependsOn").
I need advice.
IMO, the best way is to check at install time whether the user has this specific *nix command.
If you're using distutils to distribute your package, installing it is done with:
python setup.py build
python setup.py install
or simply
python setup.py install (in that case python setup.py build is implicit)
To check whether the *nix command is installed, you can subclass the build command in your setup.py like this:
from distutils.core import setup
from distutils.command.build import build as _build

class build(_build):
    description = "Custom Build Process"
    user_options = _build.user_options[:]
    # You can also define extra options like this:
    #user_options.extend([('opt=', None, 'Name of optional option')])

    def initialize_options(self):
        # Initialize your extra options here... Not needed in your case:
        #self.opt = None
        _build.initialize_options(self)

    def finalize_options(self):
        # Finalize your options; you can modify values here:
        #if self.opt is None:
        #    self.opt = "default value"
        _build.finalize_options(self)

    def run(self):
        # Extra check:
        # enter your code here to verify that the *nix command is present.
        # .................
        # Then start the "classic" build command:
        _build.run(self)

setup(
    ....
    # Don't forget to register your custom build command
    cmdclass={'build': build},
    ....
)
But what if the user uninstalls the required command after the package installation? To solve this problem, the only "good" solution is to use a packaging system such as deb or rpm and declare a dependency between the command and your package.
Hope this helps.
I wouldn't have any check at all. Document that your library requires this command, and if the user tries to use whatever part of your library needs it, an exception will be raised by whatever runs the command. It should still be possible to import your library and use it, even if only a subset of the functionality is offered.
(PS: the commands module is old and broken and shouldn't be used in new code. subprocess is the hot new stuff.)
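If you do want an explicit check, here is a minimal sketch using the standard library instead of the deprecated commands module; CommandIDependsOn is the placeholder name from the question, and raising at the point of use (rather than at import) matches the advice above:

import shutil

def _require_command(name):
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(name) is None:
        raise RuntimeError(
            'This library requires the %r command; please install it first.' % name)

# Call this right before running the command, not at import time.
_require_command('CommandIDependsOn')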
