I would like to create an egg from two directories and want to include .config and .log files. The structure of the directories is the following:
MSKDataDownloader
|_______configs
        |________sensors.config
MSKSubscriber
|_______doc
        |________dependencies.log
Here's my setup.py file:
from setuptools import setup, find_packages
setup(
    name='MSKDataDownloader',
    version='1.0.0',
    description='Data Downloader',
    packages=find_packages(),
    include_package_data=True,
    package_data={
        'MSKDataDownloader': ['config/*.config'],
        'MSKSubscriber': ['doc/*.log'],
        'MSKSubscriber': ['config/*.config']
    }
)
What am I doing wrong? Why are the .config and .log files not included in the egg?
The problem is that include_package_data=True doesn't mean what you think it means (or what most reasonable people would think it means). The short version is, just get rid of it.
From the docs:
If set to True, this tells setuptools to automatically include any data files it finds inside your package directories that are specified by your MANIFEST.in file. For more information, see the section below on Including Data Files.
If you follow the link, you'll see that it in fact makes setuptools ignore whatever you told it explicitly in package_data, and instead look for every file mentioned in MANIFEST.in and find it within your directory tree (or source control tree):
If using the setuptools-specific include_package_data argument, files specified by package_data will not be automatically added to the manifest unless they are listed in the MANIFEST.in file.
And, since you don't have a MANIFEST.in, this means you end up with nothing.
So, you want to do one of two things:
Remove include_package_data=True.
Create a MANIFEST.in and remove package_data=….
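Whichever option you pick, it can also help to check that each package_data pattern matches a real file. Here is a minimal sketch of such a check; match_package_data is a hypothetical helper written for this answer, not part of setuptools, and it only approximates how patterns are resolved relative to each package directory:

```python
import glob
import os


def match_package_data(root, package_data):
    """Return {package: [matched relative paths]} for the given patterns.

    Hypothetical diagnostic helper (not a setuptools API): each glob
    pattern is resolved relative to the package's directory under root.
    """
    matches = {}
    for package, patterns in package_data.items():
        package_dir = os.path.join(root, *package.split("."))
        found = []
        for pattern in patterns:
            for path in glob.glob(os.path.join(package_dir, pattern)):
                if os.path.isfile(path):
                    found.append(os.path.relpath(path, package_dir))
        matches[package] = sorted(found)
    return matches
```

If the directory listing in the question is accurate, the folder is called configs while the pattern says config/, so a call such as match_package_data('.', {'MSKDataDownloader': ['config/*.config']}) would come back with an empty list for that package.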
This is all complicated by the fact that there are lots of examples and blog posts and tutorials left over from the distribute days [1] that are just plain wrong for modern setuptools. In fact, there are a whole lot more out-of-date and wrong posts out there than correct ones.
The obvious answer is to use only the tutorials and examples from the PyPA on pypa.org… but unfortunately, they haven't yet written tutorials that cover anywhere near everything that you need.
So, often, you pretty much have to read old tutorials, then look up everything they tell you in the reference docs to see which parts are wrong.
1. IIRC, in distribute, the include_package_data=True would cause your extra files to get added to an sdist, just not to anything else. Which still sounds useless, right? Except that you could make your egg and other distributions depend on building the sdist then running a script that generates the MANIFEST.in programmatically. Which was useful for… I forget, something to do with pulling version files from source control maybe?
Related
The setuptools documentation is very explicit about adding code to __init__.py files from namespaces:
You must NOT include any other code and data in a namespace package's __init__.py. Even though it may appear to work during development, or when projects are installed as .egg files, it will not work when the projects are installed using "system" packaging tools -- in such cases the __init__.py files will not be installed, let alone executed.
Yet, I do not understand what these "system" packaging tools are. What are they? How could I reproduce this situation where the __init__.py files are gone?
@Anzel's comment looked like a good answer, and I'd say PEP-420 confirms that. In its Rationale section, we read:
Namespace packages are designed to support being split across multiple directories (and hence found via multiple sys.path entries). In this configuration, it doesn't matter if multiple portions all provide an __init__.py file, so long as each portion correctly initializes the namespace package. However, Linux distribution vendors (amongst others) prefer to combine the separate portions and install them all into the same file system directory. This creates a potential for conflict, as the portions are now attempting to provide the same file on the target system - something that is not allowed by many package managers. Allowing implicit namespace packages means that the requirement to provide an __init__.py file can be dropped completely, and affected portions can be installed into a common directory or split across multiple directories as distributions see fit.
So yes, we cannot add any more code to our __init__.py files because OS package managers (and others) would prefer to merge them into only one directory tree.
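To see the mechanism the PEP describes, here is a minimal sketch assuming Python 3.3 or later. The names demo_ns, portion_a and portion_b are made up for illustration. Two separate directories each hold one portion of the same namespace, and neither contains an __init__.py:

```python
import importlib
import os
import sys
import tempfile


def build_split_namespace(namespace="demo_ns"):
    """Create two directories that each hold one portion of `namespace`.

    No __init__.py is written, so Python 3.3+ treats `namespace` as an
    implicit (PEP 420) namespace package.
    """
    roots = []
    for module_name in ("portion_a", "portion_b"):
        root = tempfile.mkdtemp()
        package_dir = os.path.join(root, namespace)
        os.makedirs(package_dir)
        with open(os.path.join(package_dir, module_name + ".py"), "w") as handle:
            handle.write("NAME = %r\n" % module_name)
        roots.append(root)
    return roots


# Put both roots on sys.path, as two separately installed projects would be.
roots = build_split_namespace()
sys.path[:0] = roots
importlib.invalidate_caches()

# Both portions import through the same namespace package.
portion_a = importlib.import_module("demo_ns.portion_a")
portion_b = importlib.import_module("demo_ns.portion_b")

# The namespace's __path__ lists one directory per portion.
namespace_paths = list(importlib.import_module("demo_ns").__path__)
```

Since no portion ships an __init__.py, a packager can merge both directories into one tree without two packages trying to own the same file.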
Currently I'm using the auto-tools to build/install and package a project of mine, but I would really like to move to something that feels more "pythonic".
My project consists of two scripts, one module, two glade GUI descriptions, and two .desktop files. It's currently a pure python project, though that's likely to change soon-ish.
Looking at setuptools I can easily see how to deal with everything except the .desktop files; they have to end up in a specific directory so that Gnome can find them.
Is using distutils/setuptools a good idea to begin with?
I managed to get this to work, but it kinda feels to me more like a workaround.
Don't know what's the preferred way to handle this...
I used the following setup.py file (full version is here):
from setuptools import setup
setup(
    # ...
    data_files=[
        ('share/icons/hicolor/scalable/apps', ['data/mypackage.svg']),
        ('share/applications', ['data/mypackage.desktop'])
    ],
    entry_points={
        'console_scripts': ['startit=mypackage.cli:run']
    }
)
The starter script through entry_points works. But the data_files were put in an egg file and not in the folders specified, so they can't be accessed by the desktop shell.
To work around this, I used the following setup.cfg file:
[install]
single-version-externally-managed=1
record=install.txt
This works. Both data files are created in the right place and the .desktop file is recognized by Gnome.
In general, yes - everything is better than autotools when building Python projects.
I have good experiences with setuptools so far. However, installing files into fixed locations is not a strength of setuptools - after all, it's not a tool for building installers for Python apps, but for distributing Python libraries.
For the installation of files which are not application data files (like images, UI files etc) but provide integration into the operating system, you are better off with using a real packaging format (like RPM or deb).
That said, nothing stops you from having the build process based on setuptools and a small make file for installing everything into its rightful place.
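If you would rather keep that install step in Python than in a make file, a rough sketch could look like the following. install_data_files is a hypothetical helper, not a setuptools or distutils function. It takes the same (target directory, [files]) pairs used in a data_files list and copies them below a prefix of your choosing:

```python
import os
import shutil


def install_data_files(data_files, prefix, source_root="."):
    """Copy (target_dir, [files]) pairs below `prefix`; return the copies.

    Hypothetical helper: `prefix` would be something like /usr/local or
    ~/.local, and `source_root` is the directory holding the source tree.
    """
    installed = []
    for target_dir, sources in data_files:
        destination_dir = os.path.join(prefix, target_dir)
        os.makedirs(destination_dir, exist_ok=True)
        for source in sources:
            destination = os.path.join(destination_dir, os.path.basename(source))
            shutil.copyfile(os.path.join(source_root, source), destination)
            installed.append(destination)
    return installed
```

For example, install_data_files([('share/applications', ['data/mypackage.desktop'])], prefix) would place the .desktop file under prefix/share/applications.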
You can try to use python-distutils-extra. The DistUtilsExtra.auto module automatically supports .desktop files, as well as Glade/GtkBuilder .ui files, Python modules and scripts, misc data files, etc.
It should work both with Distutils and Setuptools.
I've created https://pypi.python.org/pypi/install-freedesktop. It creates .desktop files automatically for the gui_scripts entry points, which can be customized through a setup argument, and supports --user as well as system-wide installation. Compared to DistUtilsExtra, it's more narrow in scope and IMHO more pythonic (explicit is better than implicit).
Following a (hopefully) common practice, I have a Python package that includes several modules and an executable script in a separate scripts directory, as can be seen here.
The documentation for the script, apart from the auto-generated help given by optparse, is together with the package documentation in a Sphinx subdirectory. I am trying to:
1. generate the man page for the script from the existing documentation
2. include the man page in the distribution
I can easily do #1 with Sphinx, the man_pages setting and sphinx-build -b man. So I can call python setup.py build_sphinx -b man and have the man page generated in the build/sphinx/man directory.
Now I would like to be able to have the generated man page included in the distribution tarball, so GNU/Linux packagers can find it and install it to the proper location. Various options like package_data do not seem to work here because the man page is not there until it is generated by Sphinx. This could also apply to i18n files (.mo vs .po files).
Including files that are not part of the source in MANIFEST.in doesn't seem right. The possibility of committing the generated files to the source repository looks like an awful thing to do and I would like to avoid it.
There should be one-- and preferably only one --obvious way to do it.
To add static man pages to your distribution, you can list them in the MANIFEST.in file.
recursive-include docs *.txt
recursive-include po *.po
recursive-include sample_data *
recursive-include data *.desktop *.svg *.png
include COPYING.txt
include README.txt
recursive-include man_pages *
Where man_pages is the directory containing the copies of generated man pages.
See also: http://linuxmanpages.com/man1/man.1.php
I would cause setup.py to generate the man pages probably before calling distutils.core.setup. Remember that setup.py at one level is python code. You want to test and make sure that it works even if sphinx is not installed (unless you require sphinx). So, if the man pages already exist and sphinx is not available do not fail. That way someone who unpacks your source distribution without sphinx can still run setup.py build and other targets.
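A minimal sketch of that idea, under a few assumptions: ensure_man_pages is a hypothetical helper you would call near the top of setup.py, the directory arguments are placeholders, and man pages are recognised simply by a numeric file extension such as prog.1. It tries sphinx-build and falls back to pages that already exist:

```python
import os
import subprocess


def ensure_man_pages(source_dir, output_dir, runner=subprocess.call):
    """Try to build man pages with Sphinx; fall back to existing ones.

    Returns the sorted man page file names found in output_dir.
    Raises RuntimeError only if no pages exist after the attempt.
    """
    command = ["sphinx-build", "-b", "man", source_dir, output_dir]
    try:
        built = runner(command) == 0
    except OSError:
        # sphinx-build is not installed or could not be started.
        built = False

    pages = []
    if os.path.isdir(output_dir):
        # Keep files whose extension is a plain section number (prog.1).
        pages = sorted(
            name for name in os.listdir(output_dir)
            if name.rsplit(".", 1)[-1].isdigit()
        )

    if not pages:
        reason = "produced no man pages" if built else "was not available or failed"
        raise RuntimeError("sphinx-build %s and none exist yet" % reason)
    return pages
```

The runner parameter exists only so the behaviour can be exercised without Sphinx installed.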
The other option is to check in the man pages, but like you, I find that ugly.
The thing that I have seen done before is to provide a build target for your docs and make it clear in the README file that the documentation includes man pages and can be built by running that build target. Package maintainers then build your docs and package them during the package creation process.
The fedora 18 rpm for hawkey, for example, builds this way. I have also seen other rpms follow the model of building documentation at the same time as the source is built, then packaging it.
This question deserves a better answer, and not only because this issue has been bothering me for a while. So here is my implementation.
Download build_manpage.py from my github project (here is a link to build_manpage)
Save it somewhere you can import it to your setup.py
# inside setup.py
from setuptools import setup
from build_manpage import BuildManPage
...
...
setup(
    ...
    ...
    cmdclass={
        'build_manpage': BuildManPage,
    },
)
Now you can invoke setup.py like this:
$ python setup.py build_manpage --output=prog.1 --parser=yourmodule:argparser
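The --parser value has the shape module:attribute. For illustration only, this is one common way to turn such a string into an object. resolve_reference is a hypothetical name, and this is an assumption about the general technique, not a copy of what build_manpage.py does internally:

```python
import importlib


def resolve_reference(spec):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, _, attribute_path = spec.partition(":")
    if not module_name or not attribute_path:
        raise ValueError("expected 'module:attribute', got %r" % spec)
    target = importlib.import_module(module_name)
    # Allow dotted attribute paths such as 'module:Class.method'.
    for part in attribute_path.split("."):
        target = getattr(target, part)
    return target
```

With this, resolve_reference('yourmodule:argparser') would import yourmodule and return its argparser attribute.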
I have an egg distribution of a PyQt application which I build myself, and it contains sphinx generated documentation. When I call the help file from the application it opens the sphinx index.html in a QtWebKit.QWebView window. Apparently, only the index.html file is extracted from the egg into the OS's egg-directory (e.g. [..]\Application Data\Python-Eggs\ under Windows).
This results in broken css, broken images, and broken links, because these other files don't seem to get unpacked; they are present in the egg file, but not in the egg-directory.
Am I missing something here? Is there a way to force unpacking all html, css, and image files immediately?
I see that you've already found another way to do it, but for future reference, here's the non-workaround way to do it automatically, from the documentation at http://peak.telecommunity.com/DevCenter/setuptools#automatic-resource-extraction [emphasis added]:
If you are using tools that expect your resources to be "real" files, or your project includes non-extension native libraries or other files that your C extensions expect to be able to access, you may need to list those files in the eager_resources argument to setup(), so that the files will be extracted together
So, in this case, what you want to do is have:
eager_resources=['doc/sphinx/build/html', 'doc/sphinx/build/html/index.html']
in your setup.py, which will cause the 'html' directory to be recursively extracted when you ask for the index.html (assuming that 'doc' in your example is a top-level package).
(You can find out more about the eager_resources keyword in the docs at http://peak.telecommunity.com/DevCenter/setuptools#new-and-changed-setup-keywords)
def get_help_url(self):
    from pkg_resources import resource_filename
    from doc import sphinx
    import os
    from PyQt4.QtCore import QUrl
    html_path = resource_filename(sphinx.__name__, os.path.join('build', 'html'))
    return QUrl(os.path.join(html_path, 'index.html'))
instead of
    html = resource_filename(sphinx.__name__, os.path.join('build', 'html', 'index.html'))
    return QUrl(html)
did the trick.
Probable cause: not all the files are included in the egg in the first place.
Check this by unzipping the .egg (you might need to rename it to a .zip file for that on windows). Check if all the contents are there.
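You can also do this check from Python without renaming anything, since a zipped egg is an ordinary zip archive. A small sketch, where list_egg_members is a hypothetical helper name and the suffix list is just an example:

```python
import zipfile


def list_egg_members(egg_path, suffixes=(".html", ".css", ".png")):
    """Return the sorted archive members whose names end in `suffixes`.

    Assumes egg_path points at a zipped .egg file, not an unpacked
    egg directory.
    """
    with zipfile.ZipFile(egg_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(tuple(suffixes))
        )
```

If the css and image files are missing from the returned list, they never made it into the egg and the problem is on the packaging side.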
Look at how you made the egg. Do you use a MANIFEST.in file to tell setuptools which files to include? If not, you're probably relying on setuptools' automatic inclusion of subversion files. All subversion'ed files automatically end up in the egg, python files do too, the rest does not.
The sphinx documentation is probably generated, so it is not in subversion, so it doesn't get included automatically.
Two solutions:
Use a MANIFEST.in file to manually specify (wildcards do work) all the files that should be included. Fail-safe as long as you're complete.
Or specify the html files as package_data, see How does setuptools decide which files to keep for sdist/bdist?
When packaging a Python package with a setup.py that uses setuptools:
from setuptools import setup
...
the source distribution created by:
python setup.py sdist
not only includes, as usual, the files specified in MANIFEST.in, but it also, gratuitously, includes all of the files that Subversion lists as being version controlled beneath the package directory. This is vastly annoying. Not only does it make it difficult to exercise any sort of explicit control over what files get distributed with my package, but it means that when I build my package following an "svn export" instead of an "svn checkout", the contents of my package might be quite different, since without the .svn metadata setuptools will make different choices about what to include.
My question: how can I turn off this terrible behavior, so that "setuptools" treats my project the same way whether I'm using Subversion, or version control it's never heard of, or a bare tree created with "svn export" that I've created at the end of my project to make sure it builds cleanly somewhere besides my working directory?
The best I have managed so far is an ugly monkey-patch:
from setuptools.command import sdist
del sdist.finders[:]
But this is Python, not the jungle, so of course I want a better solution that involves no monkeys at all. How can I tame setuptools, turn off its magic, and have it behave sensibly by looking at the visible, predictable rules in my MANIFEST.in instead?
I know you know much of this, Brandon, but I'll try to give as complete an answer as I can (although I'm no setuptools guru) for the benefit of others.
The problem here is that setuptools itself involves quite a lot of black magick, including using an entry point called setuptools.file_finders where you can add plugins to find files to include. I am, however, at a complete loss as to how to REMOVE plugins from it...
Quick workaround: svn export your package to a temporary directory and run the setup.py from there. That means you have no svn, so the svn finder finds no files to include. :)
Longer workaround: Do you really need setuptools? Setuptools has a lot of features, so the answer is likely yes, but mainly those features are dependencies (so your dependencies get installed by easy_install), namespace packages (foo.bar), and entry points. Namespace packages can actually be created without setuptools as well. But if you use none of these you might actually get away with just using distutils.
Ugly workaround: The monkeypatch you gave to sdist in your question, which simply makes the plugin not have any finders, and exit quickly.
So as you see, this answer, although as complete as I can make it, is still embarrassingly incomplete. I can't actually answer your question, though I think the answer is "You can't".
Create a MANIFEST.in file with:
recursive-exclude . *
# other MANIFEST.in commands go here
# to explicitly include whatever files you want
See http://docs.python.org/distutils/commandref.html#sdist-cmd for the MANIFEST.in syntax.
Simple solution, do not use setuptools for creating the source distribution, downgrade to distutils for that command:
from distutils.command.sdist import sdist
from setuptools import setup
setup(
    # ... all the usual setup arguments ...
    cmdclass = {'sdist': sdist},
)
Probably the answer is in your setup.py. Do you use find_packages? This function by default uses the VCS (e.g. subversion, hg, ...). If you don't like it, just write a different Python function which collects only the things you want.
I would argue that the default sdist behavior is correct. When you are building a source distribution, I would expect it to contain everything that is checked into Subversion. Of course it would be nice to be able to override it cleanly in special circumstances.
Compare sdist to bdist_egg; I bet only the files that are specified explicitly get included.
I did a simple test with three files, all in svn: empty dummy.lkj and foobar.py files, and a setup.py looking like this:
import setuptools
setuptools.setup(name='foobar', version='0.1', py_modules=['foobar'])
sdist creates a tarball that includes dummy.lkj. bdist_egg creates an egg that does not include dummy.lkj.
You probably want something like this:
from distutils.core import setup
def packages():
    import os
    packages = []
    for path, dirs, files in os.walk("yourprogram"):
        # Skip Subversion metadata directories while walking.
        if ".svn" in dirs:
            dirs.remove(".svn")
        # Any directory with an __init__.py counts as a package.
        if "__init__.py" in files:
            packages.append(path.replace(os.sep, "."))
    return packages
setup(
    # name, version, description, etc...
    packages = packages(),
    # package_data, data_files, etc...
)