setup.py not installing data files - python

I have a Python library that, in addition to regular Python modules, has some data files that need to go in /usr/local/lib/python2.7/dist-package/mylibrary.
Unfortunately, I have been unable to convince setup.py to actually install the data files there. Note that this behaviour is under install - not sdist.
Here is a slightly redacted version of setup.py
module_list = list_of_files
setup(name ='Modules',
version ='1.33.7',
description ='My Sweet Module',
author ='PN',
author_email ='email',
url ='url',
packages = ['my_module'],
# I tried this. It got installed in /usr/my_module. Not ok.
# data_files = [ ("my_module", ["my_module/data1",
# "my_module/data2"])]
# This doesn't install it at all.
package_data = {"my_module" : ["my_module/data1",
"my_module/data2"] }
)
This is in Python 2.7 (will have to run in 2.6 eventually), and will have to run on some Ubuntu between 10.04 and 12+. Developing it right now on 12.04.

UPD:
package_data accepts dict in format {'package': ['list', 'of?', 'globs*']}, so to make it work, one should specify shell globs relative to package dir, not the file paths relative to the distribution root.
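To make the difference concrete, here is a rough sketch of how such patterns resolve. It is a simplification, not the real distutils code: the helper name match_package_data is made up for illustration, and it simply joins each pattern onto the package directory before globbing.

```python
import glob
import os
import tempfile


def match_package_data(package_dir, patterns):
    """Return files matched by patterns, each taken relative to package_dir.

    Hypothetical helper that imitates how package_data globs are anchored
    at the package directory rather than at the distribution root.
    """
    matches = []
    for pattern in patterns:
        full_pattern = os.path.join(package_dir, pattern)
        for path in sorted(glob.glob(full_pattern)):
            if os.path.isfile(path):
                matches.append(os.path.relpath(path, package_dir))
    return matches


def demo():
    """Build a throwaway my_module/ tree and try both styles of pattern."""
    root = tempfile.mkdtemp()
    package_dir = os.path.join(root, "my_module")
    os.mkdir(package_dir)
    for name in ("__init__.py", "data1", "data2"):
        open(os.path.join(package_dir, name), "w").close()

    relative_to_package = match_package_data(package_dir, ["data*"])
    relative_to_root = match_package_data(package_dir, ["my_module/data*"])
    return relative_to_package, relative_to_root
```

In this throwaway layout the pattern "data*" picks up both files, while "my_module/data*" matches nothing, which is the symptom described in the question.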
data_files has a different meaning, and, in general, one should avoid using this parameter.
With setuptools you only need include_package_data=True, but data files should be under version control system, known to setuptools (by default it recognizes only CVS and SVN, install setuptools-git or setuptools-hg if you use git or hg...)
with setuptools you can:
- in MANIFEST.in:
include my_module/data*
- in setup.py:
setup(
...
include_package_data = True,
...
)

http://docs.python.org/distutils/setupscript.html#installing-additional-files
If directory is a relative path, it is interpreted relative to the
installation prefix (Python’s sys.prefix for pure-Python packages,
sys.exec_prefix for packages that contain extension modules).
This will probably do it:
data_files = [ ("my_module", ["local/lib/python2.7/dist-package/my_module/data1",
"local/lib/python2.7/dist-package/my_module/data2"])]
Or just use join to add the prefix:
data_dir = os.path.join(sys.prefix, "local/lib/python2.7/dist-package/my_module")
data_files = [ ("my_module", [os.path.join(data_dir, "data1"),
os.path.join(data_dir, "data2")])]

The following solution worked fine for me.
You should have MANIFEST.in file where setup.py is located.
Add the following code to the manifest file
recursive-include mypackage *.json *.md # can be extended with more extensions or file names.
Another solution is adding the following code to the MANIFEST.in file.
graft mypackage # will copy the entire package including non-python files.
global-exclude __pycache__ *.txt # list files you don't want to include here.
Now, when you do pip install all the necessary files will be included.
Hope this helps.
UPDATE:
Make sure that you also have include_package_data=True in the setup file

Related

Proper way to copy configuration files during installation?

I am trying to distribute an mplstyle I wrote so that I can share it easily. It boils down to copying a text file to the proper configuration directory (which is known for any architecture) during installation. I want to be able to install using either python setup.py install or pip install .... Currently neither of the two ways seems to work robustly (see current approach below).
Installing with pip install ... does not seem to invoke the copying at all.
Installing with python setup.py install works well on my machine, but ReadTheDocs gives me the following error:
python setup.py install --force
running install
error: [Errno 2] No such file or directory: u'/home/docs/.config/matplotlib/stylelib/goose.mplsty
What is the proper way to copy configuration files during installation in a robust way?
Current approach
File structure
setup.py
goosempl/
| __init__.py
| stylelib/
| goose.mplstyle
| ...
setup.py
from setuptools import setup
from setuptools.command.install import install
class PostInstallCommand(install):
    def run(self):
        import goosempl
        goosempl.copy_style()
        install.run(self)
setup(
    name = 'goosempl',
    ...,
    install_requires = ['matplotlib>=2.0.0'],
    packages = ['goosempl'],
    cmdclass = {'install': PostInstallCommand},
    package_data = {'goosempl/stylelib':['goosempl/stylelib/goose.mplstyle']},
)
goosempl/__init__.py
def copy_style():
    import os
    import matplotlib
    from pkg_resources import resource_string
    files = [
        'stylelib/goose.mplstyle',
    ]
    for fname in files:
        path = os.path.join(matplotlib.get_configdir(),fname)
        text = resource_string(__name__,fname).decode()
        print(path, text)
        open(path,'w').write(text)
Upload to PyPi
python setup.py bdist_wheel --universal
twine upload dist/*
First of all, based on the project structure you've provided, you're not specifying the package_data correctly. If goosempl is a package and stylelib a directory inside it containing the mplstyle files (what I assume from your code), then your package_data configuration line should be:
package_data = {'goosempl': ['stylelib/goose.mplstyle']},
As described in Building and Distributing Packages with Setuptools:
The package_data argument is a dictionary that maps from package names to lists of glob patterns. The globs may include subdirectory names, if the data files are contained in a subdirectory of the package.
So your package is goosempl and stylelib/goose.mplstyle is the file to be included in package data for goosempl.
Your second issue (No such file or directory) is simple: in the copy_style() function, you never check if the parent directory of the file exists before writing the file. You should be able to reproduce this locally by removing the directory /home/<user>/.config/matplotlib/stylelib/ (or moving it temporarily).
The fix is also quite simple, and there are several ways to do it. Use whichever you prefer to create the missing directories.
distutils.dir_util.mkpath is suitable for both python2 and python3 (up to 3.11; distutils was removed from the standard library in 3.12):
for fname in files:
    path = os.path.join(matplotlib.get_configdir(), fname)
    distutils.dir_util.mkpath(os.path.dirname(path))
My preferred one is using pathlib, which is available only since Python 3.4 (the exist_ok argument requires 3.5):
for fname in files:
    path = pathlib.Path(matplotlib.get_configdir(), fname)
    path.parent.mkdir(parents=True, exist_ok=True)
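Putting the directory check and the write together, a minimal sketch could look like the following. The function name write_style and the sample style text are invented for illustration, and a temporary directory stands in for matplotlib.get_configdir() so the example runs without matplotlib installed.

```python
import pathlib
import tempfile


def write_style(config_dir, fname, text):
    """Write text to config_dir/fname, creating parent directories first."""
    path = pathlib.Path(config_dir, fname)
    # parents=True creates every missing level; exist_ok=True makes reruns safe.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return path


def demo():
    """Write into a fresh directory that has no stylelib/ subdirectory yet."""
    config_dir = tempfile.mkdtemp()  # stand-in for matplotlib.get_configdir()
    return write_style(config_dir, "stylelib/goose.mplstyle", "lines.linewidth: 2\n")
```

Because the parent directory is created on demand, the same call works on a clean machine (such as a ReadTheDocs build) and on one where stylelib/ already exists.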

Python Package depending on XML file

I created a private Python package that requires an XML file. When I run the package locally and on CircleCi, everything works great. Now, when I run code that installs the package as a dependency, I keep getting an error:
<urlopen error [Errno 2] No such file or directory: '/home/ubuntu/virtualenvs/venv-system/local/lib/python2.7/site-packages/...../metadata_wsdl.xml'>
Does anyone know what could be wrong? I have not been able to figure this one out.
You need to explicitly include any resources that aren't Python source code (*.py) in your setuptools distribution.
There are several ways to do this. The one I'd recommend is to use a combination of include_package_data = True in your setup() function and a MANIFEST.in file.
So assuming your distribution is laid out as my.package/my/package (i.e., with no intermediate src or lib directory), you could use something along these lines:
setup.py
from setuptools import setup, find_packages
setup(
...
packages = find_packages('my'), # include all packages under my/
include_package_data = True, # include everything in source control
# or included in MANIFEST.in
)
MANIFEST.in
recursive-include my *
recursive-include docs *
global-exclude *.pyc
global-exclude ._*
global-exclude *.mo
This would recursively include any type of file below my.package/my/ as well as my.package/docs/, and globally exclude some other types of files unwanted in a released distribution.
Please refer to Building and Distributing Packages with Setuptools » Including Data Files for more details on the available methods to include data files, and The MANIFEST.in template for more information about how to define your MANIFEST.
Once you've successfully included your data files in your distribution, you should make sure to use the ResourceManager API to access them from your code (as opposed to __file__ trickery or other path hacks, which won't work for certain platforms or zipped eggs).
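As a rough illustration of loader-based access, the sketch below reads a bundled file without touching __file__. Note the substitution: it uses the standard library's pkgutil.get_data rather than the pkg_resources ResourceManager API mentioned above, so that it runs without setuptools. The package name demo_wsdl_pkg, the file name metadata.xml, and its contents are all made up for the demonstration.

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def make_demo_package(name="demo_wsdl_pkg"):
    """Create a throwaway package with one XML data file and make it importable."""
    root = tempfile.mkdtemp()
    package_dir = os.path.join(root, name)
    os.mkdir(package_dir)
    open(os.path.join(package_dir, "__init__.py"), "w").close()
    with open(os.path.join(package_dir, "metadata.xml"), "w") as handle:
        handle.write("<definitions/>")
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    return name


def load_resource(package, resource):
    """Read a data file through the package's loader instead of __file__."""
    data = pkgutil.get_data(package, resource)
    if data is None:
        # get_data returns None when the package's loader cannot serve data.
        raise FileNotFoundError(resource)
    return data.decode("utf-8")
```

Calling load_resource(make_demo_package(), "metadata.xml") returns the file's text. The call only succeeds if the file was actually shipped inside the installed package, which is the part the MANIFEST.in and include_package_data settings take care of.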

jinja2 and distutils - how can I make distutils install template together with modules [duplicate]

How do I make setup.py include a file that isn't part of the code? (Specifically, it's a license file, but it could be any other thing.)
I want to be able to control the location of the file. In the original source folder, the file is in the root of the package. (i.e. on the same level as the topmost __init__.py.) I want it to stay exactly there when the package is installed, regardless of operating system. How do I do that?
Probably the best way to do this is to use the setuptools package_data directive. This does mean using setuptools (or distribute) instead of distutils, but this is a very seamless "upgrade".
Here's a full (but untested) example:
from setuptools import setup, find_packages
setup(
name='your_project_name',
version='0.1',
description='A description.',
packages=find_packages(exclude=['ez_setup', 'tests', 'tests.*']),
package_data={'': ['license.txt']},
include_package_data=True,
install_requires=[],
)
Note the specific lines that are critical here:
package_data={'': ['license.txt']},
include_package_data=True,
package_data is a dict of package names (empty = all packages) to a list of patterns (can include globs). For example, if you want to only specify files within your package, you can do that too:
package_data={'yourpackage': ['*.txt', 'path/to/resources/*.txt']}
The solution here is definitely not to rename your non-py files with a .py extension.
See Ian Bicking's presentation for more info.
UPDATE: Another [Better] Approach
Another approach that works well if you just want to control the contents of the source distribution (sdist) and have files outside of the package (e.g. top-level directory) is to add a MANIFEST.in file. See the Python documentation for the format of this file.
Since writing this response, I have found that using MANIFEST.in is typically a less frustrating approach to just make sure your source distribution (tar.gz) has the files you need.
For example, if you wanted to include the requirements.txt from top-level, recursively include the top-level "data" directory:
include requirements.txt
recursive-include data *
Nevertheless, in order for these files to be copied at install time to the package’s folder inside site-packages, you’ll need to supply include_package_data=True to the setup() function. See Adding Non-Code Files for more information.
To accomplish what you're describing will take two steps...
The file needs to be added to the source tarball
setup.py needs to be modified to install the data file to the source path
Step 1: To add the file to the source tarball, include it in the MANIFEST
Create a MANIFEST template in the folder that contains setup.py
The MANIFEST is basically a text file with a list of all the files that will be included in the source tarball.
Here's what the MANIFEST for my project looks like:
CHANGELOG.txt
INSTALL.txt
LICENSE.txt
pypreprocessor.py
README.txt
setup.py
test.py
TODO.txt
Note: While sdist does add some files automatically, I prefer to explicitly specify them to be sure instead of predicting what it does and doesn't.
Step 2: To install the data file to the source folder, modify setup.py
Since you're looking to add a data file (LICENSE.txt) to the source install folder you need to modify the data install path to match the source install path. This is necessary because, by default, data files are installed to a different location than source files.
To modify the data install dir to match the source install dir...
Pull the install dir info from distutils with:
from distutils.command.install import INSTALL_SCHEMES
Modify the data install dir to match the source install dir:
for scheme in INSTALL_SCHEMES.values():
    scheme['data'] = scheme['purelib']
And, add the data file and location to setup():
data_files=[('', ['LICENSE.txt'])]
Note: The steps above should accomplish exactly what you described in a standard manner without requiring any extension libraries.
It is 2019, and here is what is working.
Despite the scattered advice elsewhere, the approach I found (only half documented online) is to use setuptools_scm, passed as an option to setuptools.setup. It includes any data files that are versioned in your VCS, be it git or any other, in the wheel package, and makes "pip install" from the git repository bring those files along.
So, I just added these two lines to the setup call on "setup.py". No extra installs or import required:
setup_requires=['setuptools_scm'],
include_package_data=True,
No need to manually list package_data, or in a MANIFEST.in file - if it is versioned, it is included in the package. The docs on "setuptools_scm" put emphasis on creating a version number from the commit position, and disregard the really important part of adding the data files. (I can't care less if my intermediate wheel file is named "*0.2.2.dev45+g3495a1f" or will use the hardcoded version number "0.3.0dev0" I've typed in - but leaving crucial files for the program to work behind is somewhat important)
create MANIFEST.in in the project root with recursive-include to the required directory or include with the file name.
include LICENSE
include README.rst
recursive-include package/static *
recursive-include package/templates *
documentation can be found here
Step 1: create a MANIFEST.in file in the same folder with setup.py
Step 2: include the relative path to the files you want to add in MANIFEST.in
include README.rst
include docs/*.txt
include funniest/data.json
Step 3: set include_package_data=True in the setup() function to copy these files to site-package
Reference is here.
I wanted to post a comment on one of the answers, but I don't have enough reputation to do that.
Here's what worked for me (came up with it after referring the docs):
package_data={
    'mypkg': ['../*.txt']
},
include_package_data=False
The last line was, strangely enough, also crucial for me (you can also omit this keyword argument - it works the same).
What this does is it copies all text files in your top-level or root directory (one level up from the package mypkg you want to distribute).
None of the above really worked for me. What saved me was this answer.
Apparently, in order for these data files to be extracted during installation, I had to do a couple of things:
Like already mentioned - Add a MANIFEST.in to the project and specify the folder/files you want to be included. In my case: recursive-include folder_with_extra_stuff *
Again, like already mentioned - Add include_package_data=True to your setup.py. This is crucial, because without it only the files that match *.py will be brought.
This is what was missing! - Add an empty __init__.py to your data folder. For me I had to add this file to my folder-with-extra-stuff.
Extra - Not sure if this is a requirement, but with my own python modules I saw that they're zipped inside the .egg file in site-packages. So I had to add zip_safe=False to my setup.py file.
Final Directory Structure
my-app/
├─ app/
│ ├─ __init__.py
│ ├─ __main__.py
├─ folder-with-extra-stuff/
│ ├─ __init__.py
│ ├─ data_file.json
├─ setup.py
├─ MANIFEST.in
This works in 2020!
As others said, create "MANIFEST.in" where your setup.py is located.
Next, include or exclude everything necessary in the manifest. Be careful with the syntax here.
For example, let's say we have a template folder to be included in the source package.
In the manifest file, do this:
recursive-include template *
Make sure you leave a space between the directory name and the file pattern, as above.
Don't write it the way you would in .gitignore:
recursive-include template/* [this won't work]
Another option is to use include. There are a bunch of other commands; look them up in the docs for MANIFEST.in.
And the final important step: include this parameter in your setup.py and you are good to go!
setup(
...
include_package_data=True,
......
)
Hope that helps! Happy Coding!
In setup.py, inside the setup( call:
setup(
    name = 'foo library',
    ...
    package_data={
        'foolibrary.folderA': ['*'], # All files from folder A
        'foolibrary.folderB': ['*.txt'] # All text files from folder B
    },
)
Here is a simpler answer that worked for me.
First, per a Python Dev's comment above, setuptools is not required:
package_data is also available to pure distutils setup scripts
since 2.3. – Éric Araujo
That's great because putting a setuptools requirement on your package means you will have to install it also. In short:
from distutils.core import setup
setup(
# ...snip...
packages = ['pkgname'],
package_data = {'pkgname': ['license.txt']},
)
I just wanted to follow up on something I found working with Python 2.7 on Centos 6. Adding the package_data or data_files as mentioned above did not work for me. I added a MANIFEST.IN with the files I wanted which put the non-python files into the tarball, but did not install them on the target machine via RPM.
In the end, I was able to get the files into my solution using the "options" in the setup/setuptools. The option files let you modify various sections of the spec file from setup.py. As follows.
from setuptools import setup
setup(
name='theProjectName',
version='1',
packages=['thePackage'],
url='',
license='',
author='me',
author_email='me#email.com',
description='',
options={'bdist_rpm': {'install_script': 'filewithinstallcommands'}},
)
file - MANIFEST.in:
include license.txt
file - filewithinstallcommands:
mkdir -p $RPM_BUILD_ROOT/pathtoinstall/
#this line installs your python files
python setup.py install -O1 --root=$RPM_BUILD_ROOT --record=INSTALLED_FILES
#install license.txt into /pathtoinstall folder
install -m 700 license.txt $RPM_BUILD_ROOT/pathtoinstall/
echo /pathtoinstall/license.txt >> INSTALLED_FILES
None of the answers worked for me because my files were at the top level, outside the package. I used a custom build command instead.
import os
import setuptools
from setuptools.command.build_py import build_py
from shutil import copyfile
HERE = os.path.abspath(os.path.dirname(__file__))
NAME = "thepackage"
class BuildCommand(build_py):
    def run(self):
        build_py.run(self)
        if not self.dry_run:
            target_dir = os.path.join(self.build_lib, NAME)
            for fn in ["VERSION", "LICENSE.txt"]:
                copyfile(os.path.join(HERE, fn), os.path.join(target_dir,fn))
setuptools.setup(
    name=NAME,
    cmdclass={"build_py": BuildCommand},
    description=DESCRIPTION,
    ...
)
For non-python files to be included in an installation, they must be within one of the installed package directories. If you specify non-python files outside of your package directories in MANIFEST.in, they will be included in your distribution, but they will not be installed. The "documented" ways of installing arbitrary files outside of your package directories do not work reliably (as everyone has noticed by now).
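One way to catch this early is a small sanity check run before packaging. The sketch below is only an illustration of the rule stated above: the helper name files_outside_packages and the sample paths are hypothetical, and it compares paths only, without consulting setuptools at all.

```python
import os


def files_outside_packages(data_files, package_dirs):
    """Return the data files that do not live under any package directory."""
    package_dirs = [os.path.abspath(d) for d in package_dirs]
    outside = []
    for path in data_files:
        absolute = os.path.abspath(path)
        # A file is "inside" if some package directory is a prefix of its path.
        inside = any(
            os.path.commonpath([absolute, d]) == d for d in package_dirs
        )
        if not inside:
            outside.append(path)
    return outside
```

For example, files_outside_packages(["my_package/file1", "data/file2"], ["my_package"]) flags only data/file2, the kind of file that would be shipped in the distribution but not installed.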
The above answer from Julian Mann copies the files to your package directory in the build directory, so it does work, but not if you are installing in editable/develop mode (pip install -e or python setup.py develop). Based on this answer to a related question (and Julian's answer), below is an example that copies files to your installed package location either way after all the other install/develop tasks are done. The assumption here is that files file1 and file2 in your root-level data directory will be copied to your installed package directory (my_package), and that they will be accessible from python modules in your package using os.path.join(os.path.dirname(__file__), 'file1'), etc.
Remember to also do the MANIFEST.in stuff described above so that these files are also included in your distribution. Why setuptools would include files in your distribution and then silently never install them, is beyond my ken. Though installing them outside of your package directory is probably even more dubious.
import os
from setuptools import setup
from setuptools.command.develop import develop
from setuptools.command.install import install
from shutil import copyfile
HERE = os.path.abspath(os.path.dirname(__file__))
NAME = 'my_package'
def copy_files (target_path):
    source_path = os.path.join(HERE, 'data')
    for fn in ["file1", "file2"]:
        copyfile(os.path.join(source_path, fn), os.path.join(target_path,fn))
class PostDevelopCommand(develop):
    """Post-installation for development mode."""
    def run(self):
        develop.run(self)
        copy_files (os.path.abspath(NAME))
class PostInstallCommand(install):
    """Post-installation for installation mode."""
    def run(self):
        install.run(self)
        copy_files (os.path.abspath(os.path.join(self.install_lib, NAME)))
setup(
    name=NAME,
    cmdclass={
        'develop': PostDevelopCommand,
        'install': PostInstallCommand,
    },
    version='0.1.0',
    packages=[NAME],
    include_package_data=True,
    setup_requires=['setuptools_scm'],
)
Figured out a workaround: I renamed my lgpl2.1_license.txt to lgpl2.1_license.txt.py, and put some triple quotes around the text. Now I don't need to use the data_files option nor to specify any absolute paths. Making it a Python module is ugly, I know, but I consider it less ugly than specifying absolute paths.

Python setuptools: how to include a config file for distribution into <prefix>/etc

How can I write setup.py so that:
The binary egg distribution (bdist_egg) includes a sample configuration file and
Upon installation puts it into the {prefix}/etc directory?
A sample project source directory looks like this:
bin/
myapp
etc/
myapp.cfg
myapp/
__init__.py
[...]
setup.py
The setup.py looks like this:
from distutils.command.install_data import install_data
packages = ['myapp', ]
scripts = ['bin/myapp',]
cmdclasses = {'install_data': install_data}
data_files = [('etc', ['etc/myapp.cfg'])]
setup_args = {
'name': 'MyApp',
'version': '0.1',
'packages': packages,
'cmdclass': cmdclasses,
'data_files': data_files,
'scripts': scripts,
# 'include_package_data': True,
'test_suite': 'nose.collector'
}
try:
    from setuptools import setup
except ImportError:
    from distutils.core import setup
setup(**setup_args)
setuptools are installed in both the build environment and in the installation environment.
The 'include_package_data' commented out or not does not help.
I was doing some research on this issue and I think the answer is in the setuptools documentation: http://peak.telecommunity.com/DevCenter/setuptools#non-package-data-files
Next, I quote the extract that I think has the answer:
Non-Package Data Files
The distutils normally install general "data files" to a
platform-specific location (e.g. /usr/share). This feature intended to
be used for things like documentation, example configuration files,
and the like. setuptools does not install these data files in a
separate location, however. They are bundled inside the egg file or
directory, alongside the Python modules and packages. The data files
can also be accessed using the Resource Management API [...]
Note, by the way, that this encapsulation of data files means that you
can't actually install data files to some arbitrary location on a
user's machine; this is a feature, not a bug. You can always include a
script in your distribution that extracts and copies your the
documentation or data files to a user-specified location, at their
discretion. If you put related data files in a single directory, you
can use resource_filename() with the directory name to get a
filesystem directory that then can be copied with the shutil module.
[...]
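The "script that extracts and copies" idea from the quoted passage could be sketched roughly as below. Several things here are assumptions rather than anything from the question: the sample config is assumed to have been moved inside the package (as an etc/ subdirectory), the package name demo_myapp_pkg and the file contents are invented, and the standard library's pkgutil.get_data stands in for resource_filename() plus shutil so the example runs without setuptools.

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def make_demo_app(name="demo_myapp_pkg"):
    """Create a throwaway package that bundles etc/myapp.cfg (made-up content)."""
    root = tempfile.mkdtemp()
    etc_dir = os.path.join(root, name, "etc")
    os.makedirs(etc_dir)
    open(os.path.join(root, name, "__init__.py"), "w").close()
    with open(os.path.join(etc_dir, "myapp.cfg"), "w") as handle:
        handle.write("[myapp]\n")
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    return name


def extract_resource(package, resource, dest_dir):
    """Copy one bundled data file out of a package into dest_dir."""
    data = pkgutil.get_data(package, resource)
    if data is None:
        raise FileNotFoundError(resource)
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, os.path.basename(resource))
    with open(target, "wb") as handle:
        handle.write(data)
    return target
```

A console script could call extract_resource(package, "etc/myapp.cfg", chosen_dir) with whatever directory the user picks, which keeps the decision about where the file lands with the user, as the documentation recommends.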

How can I get my setup.py to use a relative path to my files?

I'm trying to build a Python distribution with distutils. Unfortunately, my directory structure looks like this:
/code
/mypackage
__init__.py
file1.py
file2.py
/subpackage
__init__.py
/build
setup.py
Here's my setup.py file:
from distutils.core import setup
setup(
name = 'MyPackage',
description = 'This is my package',
packages = ['mypackage', 'mypackage.subpackage'],
package_dir = { 'mypackage' : '../mypackage' },
version = '1',
url = 'http://www.mypackage.org/',
author = 'Me',
author_email = 'me#here.com',
)
When I run python setup.py sdist it correctly generates the manifest file, but doesn't include my source files in the distribution. Apparently, it creates a directory to contain the source files (i.e. mypackage1) then copies each of the source files to mypackage1/../mypackage which puts them outside of the distribution.
How can I correct this, without forcing my directory structure to conform to what distutils expects?
What directory structure do you want inside of the distribution archive file? The same as your existing structure?
You could package everything one directory higher (code in your example) with this modified setup.py:
from distutils.core import setup
setup(
name = 'MyPackage',
description = 'This is my package',
packages = ['mypackage', 'mypackage.subpackage'],
version = '1',
url = 'http://www.mypackage.org/',
author = 'Me',
author_email = 'me#here.com',
script_name = './build/setup.py',
data_files = ['./build/setup.py']
)
You'd run this (in the code directory):
python build/setup.py sdist
Or, if you want to keep dist inside of build:
python build/setup.py sdist --dist-dir build/dist
I like the directory structure you're trying for. I've never thought setup.py was special enough to warrant being in the root code folder. But like it or not, I think that's where users of your distribution will expect it to be. So it's no surprise that you have to trick distutils to do something else. The data_files parameter is a hack to get your setup.py into the distribution in the same place you've located it.
Have it change to the parent directory first, perhaps?
import os
os.chdir(os.pardir)
from distutils.core import setup
etc.
Or if you might be running it from anywhere (this is overkill, but...):
import os.path
my_path = os.path.abspath(__file__)
os.chdir(os.path.normpath(os.path.join(my_path, os.pardir)))
etc. Not sure this works, but should be easy to try.
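For the run-from-anywhere case, the path arithmetic can be checked in isolation. This is a minimal sketch, assuming setup.py lives exactly one level below the directory that holds the packages (code/build/setup.py in the question); the helper name project_root is made up.

```python
import os


def project_root(setup_file):
    """Return the parent of the directory that contains setup_file."""
    setup_dir = os.path.dirname(os.path.abspath(setup_file))
    return os.path.normpath(os.path.join(setup_dir, os.pardir))
```

Inside setup.py this would be used as os.chdir(project_root(__file__)) before calling setup(). Note that the directory of the script is taken first and only then its parent; joining os.pardir directly onto the script's own path only reaches the directory containing setup.py.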
Run setup.py from the root folder of the project
In your case, place setup.py in code/
code/ should also include:
LICENSE.txt
README.txt
INSTALL.txt
TODO.txt
CHANGELOG.txt
Then when you run "setup.py sdist" it should auto-generate a MANIFEST including:
- any files specified in py_modules and/or packages
- setup.py
- README.txt
To add more files just hand-edit the MANIFEST file to include whatever other files your project needs.
For a somewhat decent explanation of this read this.
To see a working example checkout my project.
Note: I don't put the MANIFEST under version control so you won't find it there.
A sorta lame workaround, but I'd probably just use a Makefile that rsyncs ./mypackage to ./build/mypackage and then use the usual distutils syntax from inside ./build. Fact is, distutils expects to unpack setup.py into the root of the sdist and have the code under there, so you're going to have a devil of a time convincing it to do otherwise.
You can always nuke the copy when you make clean so you don't have to mess up your vcs.
Also a lame workaround, but a junction/link of the package directory inside of the build project should work.
