I'm writing a Python package to which I'm adding package data. The package-data section of my setup.cfg looks like this:
include_package_data = True
[options.package_data]
mypymodule = dir1/*/*, dir2/*/*
Both dir1 and dir2 are directories inside the mypymodule source folder, which is part of "mypackage". Building, installing and using "mypackage" is not an issue.
The trouble I'm facing is the following:
Both dir1 and dir2 have their subdirectories with some files inside each that I want to include in "mypackage" installation, and then access them as resources when "mypackage" is running.
For dir1, everything is fine: all dir1 subfolders are correctly installed within "mypackage".
But for dir2, it looks like the installation is not recursive: only files located directly in the dir2 folder are installed within "mypackage"; all dir2 subfolders are ignored during installation, and I am not able to access their content as resources.
The difference between dir1 and dir2 is:
dir1 is a regular directory in "mypackage" git repo
dir2 is added in "mypackage" git repo as a git submodule
What am I missing? Are there any limitations on adding git submodules as package_data in a Python package's setup.cfg?
Hope you can help! Thank you!
I've tried changing the "*" notation for subdirectories in the setup.cfg file, but with no success. I expect dir1 and dir2 to be treated the same, whether they are regular directories or git submodules.
The issue was the subdirectory notation I used; it has nothing to do with one of the dirs being a git submodule or not. I simply mistook the subdirectory level I wanted to include.
To recursively include everything below a directory, the double-star notation ** is needed. Slashes plus single stars (/*/*) only match down to the denoted level.
The correct notation in my example is:
include_package_data = True
[options.package_data]
mypymodule = dir1/**, dir2/**
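For intuition, you can preview what each pattern matches with pathlib, whose glob semantics are similar (though not identical) to the globbing setuptools applies to package_data:

from pathlib import Path

pkg = Path("mypymodule")  # the package root from the question

# Each single star matches exactly one path level, so this stops at
# entries exactly two levels below dir2:
print(sorted(pkg.glob("dir2/*/*")))

# A double star matches any number of levels, so deeper files such as
# dir2/a/b/c.txt are matched as well:
print(sorted(pkg.glob("dir2/**/*")))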
I have a private pip package located in my virtual environment site-packages folder and I'd like to cythonize it for a speed boost and added protection.
My script successfully converts the files to .c; however, it places the build/ folder for the temporary .so files in the local working directory. It then tries to copy those .so files to a local folder that does not exist. Instead, I want it to copy those files into the venv site-packages package where it has just created the .c files.
Is there a way I can specify where to copy those files over?
Or can I specify where that build folder gets created?
my_app/ (working dir/main program folder)
├── app_gui/ (my application)
├── build/ (build folder generated by cython)
virtualenvs/lib/python/site-packages/
├── my_pip_installed_package/
├── folder/
from pathlib import Path
from setuptools import setup
from Cython.Build import cythonize

def main():
    all_files = get_all_files(BASE_DIR)  # helper returning Path objects under BASE_DIR
    py_files = [file for file in all_files if file.suffix == '.py']
    py_file_strings = convert_to_str(py_files)
    setup(ext_modules=cythonize(py_file_strings, compiler_directives={"language_level": "3"}))
I found what I would call a workaround: instead of running my cythonize script locally on the main application, I changed directories to the site-packages folder, pointed BASE_DIR there, and ran the script from that location.
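For reference, the same workaround can be scripted rather than done by hand. This is only a sketch: the virtualenv path and package name are placeholder assumptions, while the build_ext options --inplace and --build-temp are real knobs for relocating the build artifacts:

import os
from pathlib import Path
from setuptools import setup
from Cython.Build import cythonize

# Placeholder path; point this at your actual virtualenv.
SITE_PACKAGES = Path("~/.virtualenvs/proj-env/lib/python3.10/site-packages").expanduser()

# build_ext --inplace resolves its targets relative to the cwd,
# so run from inside site-packages.
os.chdir(SITE_PACKAGES)

py_files = [str(p) for p in Path("my_pip_installed_package").rglob("*.py")]

setup(
    ext_modules=cythonize(py_files, compiler_directives={"language_level": "3"}),
    # --inplace puts the compiled .so files next to their sources;
    # --build-temp relocates the temporary build/ folder.
    script_args=["build_ext", "--inplace", "--build-temp", "/tmp/cython_build"],
)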
I am using setuptools to create a python package. This is the directory structure:
mypackage
.git
__init__.py
pyproject.toml
setup.cfg
module1.py
module2.py
I have structured this as a flat hierarchy so I can clone this repository (or copy this directory) into a parent project/repository and directly write from mypackage import something, instead of having to pip install it or play around with PYTHONPATH.
What do I specify in the setup.cfg file, such that this is installed as a single package, given the file structure?
# setup.cfg
[options]
package_dir = # What is here?
I tried out the custom discovery options in setuptools, which seem to work. I have to manually specify the root directory for packages. However, setuptools fails when it finds multiple top-level modules/packages, so I have to use the find: option to explicitly include the discovered modules in a single package.
[options]
# Use find: to discover modules/packages
packages = find:
# The root directory is the current directory, relative to setup.cfg
package_dir =
    = .

[options.packages.find]
# Options for find:. By default every package below the
# given directory, here ".", is included.
where = .
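An untested variant, in case you want the directory installed under an explicit package name regardless of what the folder is called ("mypackage" is a stand-in here): package_dir can map the name directly to the current directory:

[options]
packages = mypackage
package_dir =
    mypackage = .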
Background
I'm developing a python package with roughly the following directory structure:
mossutils/
setup.py
mossutils/
__init__.py
init.py
data/
style.css
script.js
...
My package's setup.py declares console_scripts and includes package_data files:
setup(
    name='mossutils',
    packages=['mossutils'],
    package_data={"mossutils": ["data/*"]},
    entry_points={
        "console_scripts": ['mu-init = mossutils.init:main']
    },
    ...)
Installing the package via pip install works as expected: everything is installed in my Python's Lib\site-packages, including the data directory and all files in it, and script mu-init can be executed from the shell (or rather, command prompt, since I'm using Windows).
Goal
Script mu-init is supposed to do some kind of project scaffolding in the current working directory it is invoked from. In particular, it should copy all package_data files (data/style.css, data/script.js, ...) to the current directory.
Solution Attempt
Using module pkgutil, I can read the content of a file, e.g.
import pkgutil
...
data = pkgutil.get_data(__name__, "data/style.css")
Questions
Is there a way for my init.py script to iterate over the contents of the data directory, without hard-coding the file names (in init.py)?
Can the files from the data directory be copied to the current working directory, without opening the source file, reading the content, and writing it to a destination file?
You can get the list of files in the directory using the pkg_resources library, which is distributed together with setuptools.
import pkg_resources
pkg_resources.resource_listdir("mossutils", "data")
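On Python 3.9+, the standard library's importlib.resources covers both of your questions at once. A minimal sketch, assuming the mossutils/data names from your question:

import shutil
from importlib.resources import files, as_file
from pathlib import Path

data_dir = files("mossutils") / "data"
for resource in data_dir.iterdir():  # iterate without hard-coding file names
    if resource.is_file():
        with as_file(resource) as src:  # yields a real filesystem path, even from a zip
            shutil.copy(src, Path.cwd() / src.name)  # copy without manual read/write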
I'm doing a project with this layout:
project/
bin/
my_bin.py
CHANGES.txt
docs/
LICENSE.txt
README.txt
MANIFEST.in
setup.py
project/
__init__.py
some_thing.py
default_data.json
other_datas/
default/
other_default_datas.json
The problem is that when I install this using pip, it puts default_data.json and the other_datas folder in a different place from the rest of the app.
What am I supposed to do to make them end up in the same place?
They end up on "/home/user/.virtualenvs/proj-env/project"
instead of "/home/user/.virtualenvs/proj-env/lib/python2.6/site-packages/project"
In the setup.py I'm doing it like this:
import os

inside_dir = 'project'
data_folder = os.path.join(inside_dir, 'other_datas')
data_files = [(inside_dir, [os.path.join(inside_dir, 'default_data.json')])]
for dirpath, dirnames, filenames in os.walk(data_folder):
    data_files.append((dirpath, [os.path.join(dirpath, f) for f in filenames]))
From https://docs.python.org/3.4/distutils/setupscript.html#installing-additional-files:
If directory is a relative path, it is interpreted relative to the installation prefix (Python’s sys.prefix for pure-Python packages, sys.exec_prefix for packages that contain extension modules).
Each file name in files is interpreted relative to the setup.py script at the top of the package source distribution.
So the described behavior is simply how data_files work.
If you want to include the data files within your package you need to use package_data instead:
package_data={'project': ['default_data.json', 'other_datas/default/*.json']}
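For context, a minimal sketch of how that looks inside setup.py, using only the names from the question's layout:

from setuptools import setup, find_packages

setup(
    name='project',
    packages=find_packages(),
    package_data={
        'project': ['default_data.json', 'other_datas/default/*.json'],
    },
)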
Take a look at the datafolder package (https://pypi.python.org/pypi/datafolder). It makes it easy for your package and its users to install and use data files (*.conf, *.ini, *.db, ...).
Change your MANIFEST.in to include those .json files.
This will probably work:
recursive-include project *.json
If all of my __init__.py files are empty, do I have to store them into version control, or is there a way to make distutils create empty __init__.py files during installation?
In Python, __init__.py files actually have a meaning! They mark the folder they are in as a Python package. As such, they play a real role in your code and should most probably be stored in version control.
You could well imagine a folder in your source tree that is NOT a Python package, for example a folder containing only resources (e.g. images) and no code. That folder would not need an __init__.py file. So how would distutils tell the difference between folders where it should create those files and folders where it should not?
Is there a reason you want to avoid putting empty __init__.py files in version control? If you do this, you won't be able to import your packages from the source directory without first running distutils.
If you really want to, I suppose you can create the __init__.py files in setup.py. It has to happen before running distutils.setup, so that setup itself is able to find your packages:
from distutils.core import setup
import os

for path in my_package_directories:  # placeholder for your list of package directories
    filename = os.path.join(path, '__init__.py')
    if not os.path.exists(filename):
        open(filename, 'w').close()  # create an empty __init__.py

setup(
    ...
)
but... what would you gain from this, compared to having the empty __init__.py files there in the first place?