Cannot include non-python files with setup.py - python

I have read a lot of answers to this question, but no solution works for me.
Project layout:
generators_data\
en_family_names.txt
en_female_names.txt
__init__.py
generators.py
setup.py
I want to include "generators_data" with its content in the installation. My setup.py:
from distutils.core import setup
setup(name='generators',
version='1.0',
package_data={'generators': ['generators_data/*']}
)
I tried
python setup.py install
got
running install
running build
running install_egg_info
Removing c:\Python27\Lib\site-packages\generators-1.0-py2.7.egg-info
Writing c:\Python27\Lib\site-packages\generators-1.0-py2.7.egg-info
but the generators_data directory doesn't appear in "c:\Python27\Lib\site-packages\". Why?

The code you posted contains two issues: setup.py should be sibling to the package you want to distribute, not inside it, and you need to list packages in setup.py.
Try this layout:
generators/ # project root, the directory you get from git clone or equivalent
setup.py
generators/ # Python package
__init__.py
# other modules
generators_data/
names.txt
And this setup.py:
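from distutils.core import setup  # as in the question; setuptools.setup accepts the same arguments
setup(name='generators',
      version='1.0',
      packages=['generators'],  # the package directory next to setup.py
      package_data={'generators': ['generators_data/*']},  # globs are relative to the package
)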
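The patterns in package_data are glob patterns resolved relative to the package directory, not to the project root. The following is a minimal sketch of that lookup using only the standard library; it is not how distutils or setuptools implement it internally, and the helper names `match_package_data` and `demo_generators_layout` are made up for this illustration.

```python
import glob
import os
import tempfile


def match_package_data(project_root, package, patterns):
    """Return the files under `package` that the glob `patterns` select.

    Patterns are resolved relative to the package directory, mirroring how
    package_data entries are interpreted. Only regular files are kept.
    """
    package_dir = os.path.join(project_root, *package.split("."))
    matched = []
    for pattern in patterns:
        full_pattern = os.path.join(glob.escape(package_dir), pattern)
        for path in glob.glob(full_pattern):
            if os.path.isfile(path):
                rel = os.path.relpath(path, package_dir)
                matched.append(rel.replace(os.sep, "/"))
    return sorted(matched)


def demo_generators_layout():
    """Build the layout from the answer in a temp dir and match the pattern."""
    with tempfile.TemporaryDirectory() as root:
        data_dir = os.path.join(root, "generators", "generators_data")
        os.makedirs(data_dir)
        open(os.path.join(root, "setup.py"), "w").close()
        open(os.path.join(root, "generators", "__init__.py"), "w").close()
        with open(os.path.join(data_dir, "names.txt"), "w") as fh:
            fh.write("Smith\n")  # placeholder content, not real data
        return match_package_data(root, "generators", ["generators_data/*"])
```

With the original layout from the question there is no generators/ directory next to setup.py, so the same pattern has nothing to match against.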

Related

PyPI - folder inside the package folder not getting uploaded [duplicate]

When using setuptools, I can not get the installer to pull in any package_data files. Everything I've read says that the following is the correct way to do it. Can someone please advise?
setup(
name='myapp',
packages=find_packages(),
package_data={
'myapp': ['data/*.txt'],
},
include_package_data=True,
zip_safe=False,
install_requires=['distribute'],
)
where myapp/data/ is the location of the data files.
I realize that this is an old question, but for people finding their way here via Google: package_data is a low-down, dirty lie. It is only used when building binary packages (python setup.py bdist ...) but not when building source packages (python setup.py sdist ...). This is, of course, ridiculous -- one would expect that building a source distribution would result in a collection of files that could be sent to someone else to build the binary distribution.
In any case, using MANIFEST.in will work both for binary and for source distributions.
I just had this same issue. The solution was simply to remove include_package_data=True.
After reading here, I realized that include_package_data aims to include files from version control, as opposed to merely "include package data" as the name implies. From the docs:
The data files [of include_package_data] must be under CVS or Subversion control
...
If you want finer-grained control over what files are included (for example, if
you have documentation files in your package directories and want to exclude
them from installation), then you can also use the package_data keyword.
Taking that argument out fixed it, which is coincidentally why it also worked when you switched to distutils, since it doesn't take that argument.
Following @Joe's recommendation to remove the include_package_data=True line also worked for me.
To elaborate a bit more, I have no MANIFEST.in file. I use Git and not CVS.
Repository takes this kind of shape:
/myrepo
- .git/
- setup.py
- myproject
- __init__.py
- some_mod
- __init__.py
- animals.py
- rocks.py
- config
- __init__.py
- settings.py
- other_settings.special
- cool.huh
- other_settings.xml
- words
- __init__.py
- word_set.txt
setup.py:
from setuptools import setup, find_packages
import os.path
setup (
name='myproject',
version = "4.19",
packages = find_packages(),
# package_dir={'mypkg': 'src/mypkg'}, # didnt use this.
package_data = {
# If any package contains *.txt or *.rst files, include them:
'': ['*.txt', '*.xml', '*.special', '*.huh'],
},
#
# Oddly enough, include_package_data=True prevented package_data from working.
# include_package_data=True, # Commented out.
data_files=[
# ('bitmaps', ['bm/b1.gif', 'bm/b2.gif']),
('/opt/local/myproject/etc', ['myproject/config/settings.py', 'myproject/config/other_settings.special']),
('/opt/local/myproject/etc', [os.path.join('myproject/config', 'cool.huh')]),
#
('/opt/local/myproject/etc', [os.path.join('myproject/config', 'other_settings.xml')]),
('/opt/local/myproject/data', [os.path.join('myproject/words', 'word_set.txt')]),
],
install_requires=[ 'jsonschema',
'logging', ],
entry_points = {
'console_scripts': [
# Blah...
], },
)
I run python setup.py sdist to build a source distribution (I haven't tried a binary one).
Then, inside a brand-new virtual environment, I take the resulting myproject-4.19.tar.gz file
and run
(venv) pip install ~/myproject-4.19.tar.gz
...
Besides everything being installed into my virtual environment's site-packages, those special data files are installed to /opt/local/myproject/data and /opt/local/myproject/etc.
include_package_data=True worked for me.
If you use git, remember to include setuptools-git in install_requires. It is far less tedious than maintaining a MANIFEST.in or listing every path in package_data (in my case it's a Django app with all kinds of static files).
(I pasted the comment I made, as k3-rnc mentioned it's actually helpful as is.)
Using setup.cfg (setuptools ≥ 30.3.0)
Starting with setuptools 30.3.0 (released 2016-12-08), you can keep your setup.py very small and move the configuration to a setup.cfg file. With this approach, you could put your package data in an [options.package_data] section:
[options.package_data]
* = *.txt, *.rst
hello = *.msg
In this case, your setup.py can be as short as:
from setuptools import setup
setup()
For more information, see configuring setup using setup.cfg files.
There is some talk of deprecating setup.cfg in favour of pyproject.toml as proposed in PEP 518, but this is still provisional as of 2020-02-21.
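To show how the [options.package_data] section above corresponds to the package_data dictionary used in setup.py, here is a rough sketch built on the standard-library configparser. It is a simplified illustration rather than setuptools' own parser: the function name `read_package_data` is hypothetical, and it assumes values are comma-separated on a single line and that the `*` key stands for "every package" (the empty-string key in package_data).

```python
import configparser

SETUP_CFG = """
[options.package_data]
* = *.txt, *.rst
hello = *.msg
"""


def read_package_data(cfg_text):
    """Turn an [options.package_data] section into a package_data-style dict.

    Simplified: splits each value on commas and maps the `*` key to the
    empty string, which package_data uses to mean "all packages".
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    result = {}
    for key, value in parser["options.package_data"].items():
        patterns = [item.strip() for item in value.split(",") if item.strip()]
        result["" if key == "*" else key] = patterns
    return result
```

Calling `read_package_data(SETUP_CFG)` gives the same mapping you would otherwise write by hand as the package_data argument.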
Update: This answer is old and the information is no longer valid. All setup.py configs should use import setuptools. I've added a more complete answer at https://stackoverflow.com/a/49501350/64313
I solved this by switching to distutils. Looks like distribute is deprecated and/or broken.
from distutils.core import setup
setup(
name='myapp',
packages=['myapp'],
package_data={
'myapp': ['data/*.txt'],
},
)
I had the same problem for a couple of days but even this thread wasn't able to help me as everything was confusing. So I did my research and found the following solution:
Basically in this case, you should do:
from setuptools import setup
setup(
name='myapp',
packages=['myapp'],
package_dir={'myapp':'myapp'}, # the one line where all the magic happens
package_data={
'myapp': ['data/*.txt'],
},
)
The full explanation is in the other Stack Overflow answer.
I found this post while stuck on the same problem.
My experience contradicts the experiences in the other answers.
include_package_data=True does include the data in the
bdist! The explanation in the setuptools
documentation
lacks context and troubleshooting tips, but
include_package_data works as advertised.
My setup:
Windows / Cygwin
git version 2.21.0
Python 3.8.1 Windows distribution
setuptools v47.3.1
check-manifest v0.42
Here is my how-to guide.
How-to include package data
Here is the file structure for a project I published on PyPI.
(It installs the application in __main__.py).
├── LICENSE.md
├── MANIFEST.in
├── my_package
│ ├── __init__.py
│ ├── __main__.py
│ └── _my_data <---- folder with data
│ ├── consola.ttf <---- data file
│ └── icon.png <---- data file
├── README.md
└── setup.py
Starting point
Here is a generic starting point for the setuptools.setup() in
setup.py.
setuptools.setup(
...
packages=setuptools.find_packages(),
...
)
setuptools.find_packages() includes all of my packages in the
distribution. My only package is my_package.
The sub-folder with my data, _my_data, is not considered a
package by Python because it does not contain an __init__.py,
and so find_packages() does not find it.
A solution often-cited, but incorrect, is to put an empty
__init__.py file in the _my_data folder.
This does make it a package, so it does include the folder
_my_data in the distribution. But the data files inside
_my_data are not included.
So making _my_data into a package does not help.
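The point about find_packages() can be made concrete with a small stand-in written against the standard library only. This is a deliberately simplified approximation, not the real setuptools.find_packages() (which has more rules and options); the names `find_packages_sketch` and `demo_my_package` are invented for this example. It only reports which directories count as packages and says nothing about the data files inside them.

```python
import os
import tempfile


def find_packages_sketch(where):
    """List dotted package names under `where`.

    A directory counts as a package only if it contains __init__.py, and
    the walk does not descend into directories that are not packages.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(where):
        if dirpath == where:
            continue  # the project root itself is not a package
        if "__init__.py" not in filenames:
            dirnames[:] = []  # not a package: skip everything below it
            continue
        found.append(os.path.relpath(dirpath, where).replace(os.sep, "."))
    return sorted(found)


def demo_my_package():
    """Compare the result without and with _my_data/__init__.py."""
    with tempfile.TemporaryDirectory() as root:
        data_dir = os.path.join(root, "my_package", "_my_data")
        os.makedirs(data_dir)
        open(os.path.join(root, "my_package", "__init__.py"), "w").close()
        open(os.path.join(data_dir, "icon.png"), "w").close()  # empty placeholder
        before = find_packages_sketch(root)
        open(os.path.join(data_dir, "__init__.py"), "w").close()
        after = find_packages_sketch(root)
        return before, after
```

Adding the __init__.py changes only the list of packages; icon.png is never part of that list either way, which matches the observation that the data files are still missing.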
The solution is:
the sdist already contains the data files
add include_package_data=True to include the data files in the bdist as well
Experiment (how to test the solution)
There are three steps to make this a repeatable experiment:
$ rm -fr build/ dist/ my_package.egg-info/
$ check-manifest
$ python setup.py sdist bdist_wheel
I will break these down step-by-step:
Clean out the old build:
$ rm -fr build/ dist/ my_package.egg-info/
Run check-manifest to be sure MANIFEST.in matches the
Git index of files under version control:
$ check-manifest
If MANIFEST.in does not exist yet, create it from the Git
index of files under version control:
$ check-manifest --create
Here is the MANIFEST.in that is created:
include *.md
recursive-include my_package *.png
recursive-include my_package *.ttf
There is no reason to manually edit this file.
As long as everything that should be under version control is
under version control (i.e., is part of the Git index),
check-manifest --create does the right thing.
Note: files are not part of the Git index if they are either:
ignored in a .gitignore
excluded in a .git/info/exclude
or simply new files that have not been added to the index yet
And if any files are under version control that should not be
under version control, check-manifest issues a warning and
specifies which files it recommends removing from the Git index.
Build:
$ python setup.py sdist bdist_wheel
Now inspect the sdist (source distribution) and bdist_wheel
(build distribution) to see if they include the data files.
Look at the contents of the sdist (only the relevant lines are
shown below):
$ tar --list -f dist/my_package-0.0.1a6.tar.gz
my_package-0.0.1a6/
...
my_package-0.0.1a6/my_package/__init__.py
my_package-0.0.1a6/my_package/__main__.py
my_package-0.0.1a6/my_package/_my_data/
my_package-0.0.1a6/my_package/_my_data/consola.ttf <-- yay!
my_package-0.0.1a6/my_package/_my_data/icon.png <-- yay!
...
So the sdist already includes the data files because they are
listed in MANIFEST.in. There is nothing extra to do to include
the data files in the sdist.
Look at the contents of the bdist (it is a .zip file, parsed
with zipfile.ZipFile):
$ python check-whl.py
my_package/__init__.py
my_package/__main__.py
my_package-0.0.1a6.dist-info/LICENSE.md
my_package-0.0.1a6.dist-info/METADATA
my_package-0.0.1a6.dist-info/WHEEL
my_package-0.0.1a6.dist-info/entry_points.txt
my_package-0.0.1a6.dist-info/top_level.txt
my_package-0.0.1a6.dist-info/RECORD
Note: you need to create your own check-whl.py script to produce the
above output. It is just three lines:
from zipfile import ZipFile
path = "dist/my_package-0.0.1a6-py3-none-any.whl" # <-- CHANGE
print('\n'.join(ZipFile(path).namelist()))
As expected, the bdist is missing the data files.
The _my_data folder is completely missing.
What if I create a _my_data/__init__.py? I repeat the
experiment and I find the data files are still not there! The
_my_data/ folder is included but it does not contain the data
files!
Solution
Contrary to the experience of others, this does work:
setuptools.setup(
...
packages=setuptools.find_packages(),
include_package_data=True, # <-- adds data files to bdist
...
)
With the fix in place, redo the experiment:
$ rm -fr build/ dist/ my_package.egg-info/
$ check-manifest
$ python.exe setup.py sdist bdist_wheel
Make sure the sdist still has the data files:
$ tar --list -f dist/my_package-0.0.1a6.tar.gz
my_package-0.0.1a6/
...
my_package-0.0.1a6/my_package/__init__.py
my_package-0.0.1a6/my_package/__main__.py
my_package-0.0.1a6/my_package/_my_data/
my_package-0.0.1a6/my_package/_my_data/consola.ttf <-- yay!
my_package-0.0.1a6/my_package/_my_data/icon.png <-- yay!
...
Look at the contents of the bdist:
$ python check-whl.py
my_package/__init__.py
my_package/__main__.py
my_package/_my_data/consola.ttf <--- yay!
my_package/_my_data/icon.png <--- yay!
my_package-0.0.1a6.dist-info/LICENSE.md
my_package-0.0.1a6.dist-info/METADATA
my_package-0.0.1a6.dist-info/WHEEL
my_package-0.0.1a6.dist-info/entry_points.txt
my_package-0.0.1a6.dist-info/top_level.txt
my_package-0.0.1a6.dist-info/RECORD
How not to test if data files are included
I recommend troubleshooting/testing using the approach outlined
above to inspect the sdist and bdist.
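The inspection of both archives can be automated with a short script. The sketch below uses only zipfile and tarfile from the standard library; the function names `archive_members` and `missing_files` are made up here, and matching by filename suffix is an assumption chosen to keep the example short.

```python
import tarfile
import zipfile


def archive_members(path):
    """Return the member names of a wheel (zip) or an sdist (tar archive)."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return archive.namelist()
    with tarfile.open(path) as archive:  # compression is detected automatically
        return archive.getnames()


def missing_files(path, expected_suffixes):
    """Return the expected suffixes that no archive member ends with."""
    names = archive_members(path)
    return [
        suffix
        for suffix in expected_suffixes
        if not any(name.endswith(suffix) for name in names)
    ]
```

For the project above, one would call `missing_files("dist/my_package-0.0.1a6-py3-none-any.whl", ["_my_data/consola.ttf", "_my_data/icon.png"])` and the same for the .tar.gz; an empty list means every expected data file is present in that archive.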
pip install in editable mode is not a valid test
Note: pip install -e . does not show if data files are
included in the bdist.
The symbolic link causes the installation to behave as if the
data files are included (because they already exist locally on
the developer's computer).
After pip install my_package, the data files are in the
virtual environment's lib/site-packages/my_package/ folder,
using the exact same file structure shown above in the list of
the whl contents.
Publishing to TestPyPI is a slow way to test
Publishing to TestPyPI and then installing and looking in
lib/site-packages/my_package is a valid test, but it is too
time-consuming.
This is an ancient question, and yet Python's package management still leaves a lot to be desired. My use case was installing locally with pip into a specified directory, and I was surprised that neither the package_data nor the data_files paths worked out. I was not keen on adding yet another file to the repo, so I ended up leveraging data_files together with the setup.py option --install-data, something like this:
pip install . --install-option="--install-data=$PWD/package" -t package
Moving the folder containing the package data into the module folder solved the problem for me.
See this question: MANIFEST.in ignored on "python setup.py install" - no data files installed?
Just remove the line:
include_package_data=True,
from your setup script, and it will work fine. (Tested just now with latest setuptools.)
Like others in this thread, I'm more than a little surprised that a question this old still lacks a clear answer, but the best answer for me was to use check-manifest, as recommended in the answer from @mike-gazes.
So, using just a setup.cfg and no setup.py and additional text and python files required in the package, what worked for me was keeping this in setup.cfg:
[options]
packages = find:
include_package_data = true
and updating the MANIFEST.in based on the check-manifest output:
include *.in
include *.txt
include *.yml
include LICENSE
include tox.ini
recursive-include mypkg *.py
recursive-include mypkg *.txt
For a directory structure like:
foo/
├── foo
│   ├── __init__.py
│   ├── a.py
│   └── data.txt
└── setup.py
and setup.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from setuptools import setup
NAME = 'foo'
DESCRIPTION = 'Test library to check how setuptools works'
URL = 'https://none.com'
EMAIL = 'gzorp@bzorp.com'
AUTHOR = 'KT'
REQUIRES_PYTHON = '>=3.6.0'
setup(
name=NAME,
version='0.0.0',
description=DESCRIPTION,
author=AUTHOR,
author_email=EMAIL,
python_requires=REQUIRES_PYTHON,
url=URL,
license='MIT',
classifiers=[
'Programming Language :: Python',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.6',
],
packages=['foo'],
package_data={'foo': ['data.txt']},
include_package_data=True,
install_requires=[],
extras_require={},
cmdclass={},
)
python setup.py bdist_wheel works.
Starting with Setuptools 62.3.0, you can use recursive wildcards ("**") to include a (sub)directory recursively. This way you can include whole folders together with all of their subfolders and files.
For example, when using a pyproject.toml file, this is how you include two folders recursively:
[tool.setuptools.package-data]
"ema_workbench.examples.data" = ["**"]
"ema_workbench.examples.models" = ["**"]
But you can also only include certain file-types, in a folder and all subfolders. If you want to include all markdown (.md) files for example:
[tool.setuptools.package-data]
"ema_workbench.examples.data" = ["**/*.md"]
It should also work when using setup.py or setup.cfg.
See https://github.com/pypa/setuptools/pull/3309 for the details.
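To get a feel for what a pattern such as "**/*.md" selects, the standard-library glob module can be used with recursive=True. This is only an approximation for experimenting with patterns, under the assumption that setuptools' matching behaves similarly; details may differ. The helper names `recursive_matches` and `demo_markdown_files` are invented for the example.

```python
import glob
import os
import tempfile


def recursive_matches(package_dir, pattern):
    """Return files under `package_dir` matched by `pattern`; `**` recurses."""
    full_pattern = os.path.join(glob.escape(package_dir), pattern)
    return sorted(
        os.path.relpath(path, package_dir).replace(os.sep, "/")
        for path in glob.glob(full_pattern, recursive=True)
        if os.path.isfile(path)
    )


def demo_markdown_files():
    """Create a small tree with nested .md files and match '**/*.md'."""
    with tempfile.TemporaryDirectory() as package_dir:
        os.makedirs(os.path.join(package_dir, "sub", "deeper"))
        for rel in ("top.md", "sub/notes.md", "sub/deeper/more.md", "sub/data.csv"):
            with open(os.path.join(package_dir, *rel.split("/")), "w") as fh:
                fh.write("demo\n")  # placeholder content
        return recursive_matches(package_dir, "**/*.md")
```

In this sketch the pattern picks up the Markdown file at the top level as well as those in nested folders, while sub/data.csv is left out.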

include extra file in a Python package using setuptools

I am attempting to build a python wheel using setuptools. The package needs to include two files:
mymodule.py - a python module in the same directory as setup.py
myjar.jar - a java .jar file that exists outside of my package directory
I am building my package using python3 setup.py bdist_wheel.
If I call setup() like so:
setup(
name="mypkg",
py_modules=["mymodule"],
data_files=[('jars', ['../target/scala-2.11/myjar.jar'])]
)
then myjar.jar does get included in the .whl (good so far). However, when I pip install mypkg, the jar is placed at /usr/local/myjar.jar (this kinda explains why), which isn't what I want at all: I want it to sit in the same place as mymodule.py, i.e. /usr/local/lib/python3.7/site-packages/
If I change setup.py to
setup(
name="mypkg",
py_modules=["mymodule"],
package_data={'jars': '../target/scala-2.11/myjar.jar'}
)
or
setup(
name="mypkg",
py_modules=["mymodule"],
package_data={'jars': ['../target/scala-2.11/myjar.jar']}
)
then myjar.jar simply doesn't get included in the .whl. I tried copying myjar.jar into the same directory and changing setup.py to:
setup(
name="mypkg",
py_modules=["mymodule"],
package_data={'jars': 'myjar.jar'}
)
or
setup(
name="mypkg",
py_modules=["mymodule"],
package_data={'jars': ['myjar.jar']}
)
but still myjar.jar does not get included in the .whl.
I've been tearing my hair out over this for hours, hence why I'm here.
I've read a myriad of SO posts on this:
How to include package data with setuptools/distribute?
MANIFEST.in ignored on "python setup.py install" - no data files installed?
How do you add additional files to a wheel?
setuptools: adding additional files outside package
which suggest different combinations of data_files, package_data, include_package_data=True and/or use of a MANIFEST.in file, but I still can't get this working as I would like, so I'm here hoping someone can advise what I'm doing wrong.
The data files (in that case myjar.jar) should really be package data files, and as such they should be part of a Python package. So having such files in parent directories makes things much more complicated, but probably not impossible. So let's start with a simpler example. I believe something like the following should work...
Project directory structure:
MyProject
├ MANIFEST.in
├ mymodule.py
├ setup.py
└ myjars
├ __init__.py
└ myjar.jar
MANIFEST.in:
recursive-include myjars *.jar
setup.py:
#!/usr/bin/env python3
import setuptools
setuptools.setup(
name='MyProject',
version='1.2.3',
#
include_package_data=True,
packages=['myjars'],
py_modules=["mymodule"],
)
myjars/__init__.py might not be strictly necessary, but I believe it's better to have it. And as always, an empty __init__.py file is perfectly good enough.
(This assumes the myjars/myjar.jar file exists before the source distribution sdist is built.)
As to dealing with the data files in parent directories, my recommendation would be to simply copy (or symlink) those files before calling setup.py, maybe as part of a shell script or anything like that. There are probably ways to do the copy as part of a custom setuptools command in setup.py, but it's not worth the effort in my opinion, and really it's not part of setup.py's job.
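The copy step could equally be a few lines of Python run before setup.py. The following is a minimal sketch under that assumption; `stage_data_file` is a hypothetical helper name, and the paths in the usage note are the ones from the question.

```python
import os
import shutil


def stage_data_file(source, package_dir):
    """Copy `source` into `package_dir` so it can ship as package data.

    Returns the path of the copy. Raises FileNotFoundError when `source`
    is missing, so a build script stops early instead of producing a
    distribution without the file.
    """
    if not os.path.isfile(source):
        raise FileNotFoundError(source)
    os.makedirs(package_dir, exist_ok=True)
    return shutil.copy2(source, package_dir)
```

A build script would call something like `stage_data_file("../target/scala-2.11/myjar.jar", "myjars")` and then run the usual build command, so that the jar already sits inside the package when the sdist is created.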

Why current working directory affects install path of setup.py? How to prevent that?

I have created a custom python package following this guide, so I have the following structure:
mypackage/ <-- VCS root
mypackage/
submodule1/
submodule2/
setup.py
And setup.py contains exactly the same information as in the guide:
from setuptools import setup, find_packages
setup(name='mypackage',
version='0.1',
description='desc',
url='vcs_url',
author='Hodossy, Szabolcs',
author_email='myemail@example.com',
license='MIT',
packages=find_packages(),
install_requires=[
# deps
],
zip_safe=False)
I have noticed if I go into the folder where setup.py is, and then call python setup.py install in a virtual environment, in site-packages the following structure is installed:
.../site-packages/mypackage-0.1-py3.6.egg/mypackage/
submodule1/
submodule2/
but if I call it from one folder up like python mypackage/setup.py install, then the structure is the following:
.../site-packages/mypackage-0.1-py3.6.egg/mypackage/
mypackage/
submodule1/
submodule2/
This latter one ruins all imports from my module, as the path is different for the submodules.
Could you explain what is happening here and how to prevent that kind of behaviour?
This is experienced with Python 3.6 on both Windows and Linux.
Your setup.py does not contain any paths; it only finds the files via find_packages, which searches relative to the current working directory. So of course the result depends on where you run it from. The setup.py isn't strictly tied to its location. Of course you could do things like chdir to the directory part (dirname) of the setup file path in sys.argv[0], but that's rather ugly.
The question is, WHY do you want to build it that way? It looks more like you would want a structure like
mypackage-source
mypackage
submodule1
submodule2
setup.py
And then execute setup.py from the work directory. If you want to be able to run it from anywhere, the better workaround would be to put a shell script next to it, like
#!/bin/sh
cd "$(dirname "$0")"
python setup.py "$@"
which separates the task of changing to the right directory (here I assume the directory with setup.py in the workdir) from running setup.py.
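For completeness, the in-script alternative mentioned above (changing into the directory that contains setup.py before anything else runs) might look like the sketch below. The function name `enter_script_dir` is hypothetical, and whether this is preferable to a wrapper script is a matter of taste.

```python
import os


def enter_script_dir(script_path):
    """Change the working directory to the one containing `script_path`.

    Returns the directory so the caller can log or reuse it.
    """
    script_dir = os.path.dirname(os.path.abspath(script_path))
    os.chdir(script_dir)
    return script_dir


# In setup.py this would be called before setup(...), for example:
#     enter_script_dir(__file__)
# so that find_packages() searches next to setup.py regardless of where
# the command was started.
```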

How to distribute python module under a specific name

I have a Python module named models.py that I would like to upload to PyPI. When I afterwards install it using pip, I would like it to appear as a package.
To explain myself, I have following project structure:
my_utils
mapper/
models.py
MANIFEST.in
setup.py
README
logger/
logger.py
logger_conf.json
MANIFEST.in
README
setup.py
I would like to create a distributable package out of mapper.models, but when it is installed on the target machine, I would like it to appear in site-packages under mapper_tools.models.
My setup.py:
from distutils.core import setup
setup(
name='mapper_tools',
version='0.1.0',
description='some description',
author='myname',
author_email='my@email.com',
url='https://github.com/mapper-tools',
py_modules=['models'],
)
My MANIFEST.in:
include models.py README
Currently, after running pip install mapper-tools, I find models.py right under site-packages and I would like it to appear under mapper_tools.
Can I specify the structure that should be installed without changing the layout of my project?

Build a python package with setup.py in CMake

EDIT: The question is a bit too long. Here is my real question: How can I build and install a Python package with setuptools (setup.py) inside CMake? The details of my code are shown below (this is the out-of-source build; the in-source build works).
I have a project where I need to distribute my own python package. I made a setup.py script but I would like to build & install it with CMake.
I followed Using CMake with setup.py but it only works with one CMakeLists.txt alongside the setup.py and the python folder and without executing cmake from a build directory.
With this layout :
Project/
--build/
--lib/
----python/
------folder1/
------folder2/
------data/
------...
------__init__.py
----setup.py
----CMakeLists.txt
--CMakeLists.txt
and with CMakeLists.txt:
cmake_minimum_required(VERSION 2.8.8 FATAL_ERROR)
add_subdirectory(lib)
(..)
and with lib/CMakeLists.txt:
find_program(PYTHON "python")
if (PYTHON)
set(SETUP_PY_IN "${CMAKE_CURRENT_SOURCE_DIR}/setup.py")
set(SETUP_PY "${CMAKE_CURRENT_BINARY_DIR}/setup.py")
set(DEPS "${CMAKE_CURRENT_SOURCE_DIR}/python/__init__.py")
set(OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/build")
configure_file(${SETUP_PY_IN} ${SETUP_PY})
add_custom_command(OUTPUT ${OUTPUT}
COMMAND ${PYTHON}
ARGS setup.py build
DEPENDS ${DEPS})
add_custom_target(target ALL DEPENDS ${OUTPUT})
install(CODE "execute_process(COMMAND ${PYTHON} ${SETUP_PY} install)")
endif()
and with setup.py:
from setuptools import setup, find_packages
setup(name="python",
version="xx",
author="xx",
packages = find_packages(),
package_data = {'': ['*.txt']},
description="Python lib for xx")
When I run CMake from build directory and then make, the target is built but with nothing. It is as if no packages were found. The installation installs the python package without .py files.
setuptools doesn't know about the out of source build and therefore doesn't find any python source files (because you do not copy them to the binary dir, only the setup.py file seems to exist there). In order to fix this, you would have to copy the python source tree into the CMAKE_CURRENT_BINARY_DIR.
https://bloerg.net/2012/11/10/cmake-and-distutils.html suggests setting package_dir to ${CMAKE_CURRENT_SOURCE_DIR} in setup.py.
As pointed out previously you can copy your python files to the build folder, e.g. something like this
set(TARGET_NAME YourLib)
file(GLOB_RECURSE pyfiles python/*.py)
foreach (filename ${pyfiles})
get_filename_component(target "${filename}" NAME)
message(STATUS "Copying ${filename} to ${TARGET_NAME}/${target}")
configure_file("${filename}"
"${CMAKE_CURRENT_BINARY_DIR}/${TARGET_NAME}/${target}" COPYONLY)
endforeach (filename)
and then have a build target like this
add_custom_target(PyPackageBuild
COMMAND "${PYTHON_EXECUTABLE}" -m pip wheel .
WORKING_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}"
COMMENT "Building python wheel package"
)
add_dependencies(PyPackageBuild ${TARGET_NAME})
In case you do not want to use pip you have to adjust the PyPackageBuild target.
If you want to include a shared library, e.g. one written in C++, which is built by other parts of your CMake project, you have to copy the shared object file to the binary folder as well
set_target_properties(${TARGET_NAME} PROPERTIES
PREFIX "${PYTHON_MODULE_PREFIX}"
SUFFIX "${PYTHON_MODULE_EXTENSION}"
BUILD_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}"
LIBRARY_OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/${TARGET_NAME}/libs"
BUILD_WITH_INSTALL_RPATH TRUE)
set(TARGET_PYMODULE_NAME "${PYTHON_MODULE_PREFIX}${TARGET_NAME}${PYTHON_MODULE_EXTENSION}")
and add it to the package_data in setup.py
....
package_data={
'': ['libs/YourLib.cpython-39-x86_64-linux-gnu.so']
}
You can find a working example using pybind11 for C++ bindings at https://github.com/maximiliank/cmake_python_r_example
