How to create nested namespace packages for setuptools distribution - python

I'm developing a python project that will have separately distributable parts.
I have been able to accomplish part of my goal by making a namespace package. I have "sub1" and "sub2", both in namespace "lvl1". I can pip install these in development mode using "pip install -e" or python setup.py develop. I can import them with import lvl1.sub1 and import lvl1.sub2.
However, the project is massive and calls for nested namespaces. I want to import lvl1.lvl2.sub1 and import lvl1.lvl2.sub2. So both subpackages are in the same namespace ("lvl2"), which is itself in a namespace ("lvl1").
Desired conceptual structure:
lvl1/
    lvl2/
        sub1/
            code.py
            more_code.py
            ...
        sub2/
            code.py
            ...
Is there a way to do this and how?

Yes, there is more than one way. Please read the section "Nested namespace packages" in PEP 420.
In Python >= 3.3, the easiest way to make a nested namespace is to omit the __init__.py file from the namespace folders ("lvl1" and "lvl2") in every distributable part. In each setup.py, explicitly list all the packages in the deepest namespace.
"lvl1_part1/setup.py"
setup(
    name='lvl1_part1',
    ...
    zip_safe=False,
    packages=['lvl1.lvl2.sub1']
)
"lvl1_part2/setup.py"
setup(
    name='lvl1_part2',
    ...
    zip_safe=False,
    packages=['lvl1.lvl2.sub2']
)
The file structure for testing:
lvl1_part1/
    setup.py
    lvl1/
        lvl2/
            sub1/
                __init__.py
lvl1_part2/
    setup.py
    lvl1/
        lvl2/
            sub2/
                __init__.py
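The layout above can be tried out without installing anything. The following is only a minimal sketch: it builds the two trees in a temporary directory and puts both roots on sys.path, which stands in for what installing the two distributions would do. The make_part helper and the NAME attribute are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_part(root: Path, part: str, sub: str) -> Path:
    """Create <root>/<part>/lvl1/lvl2/<sub>/__init__.py (hypothetical helper).

    No __init__.py is written into lvl1/ or lvl2/, so both stay native
    (PEP 420) namespace packages.
    """
    pkg_dir = root / part / "lvl1" / "lvl2" / sub
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text(f"NAME = {sub!r}\n")
    return root / part


tmp = Path(tempfile.mkdtemp())
for part, sub in [("lvl1_part1", "sub1"), ("lvl1_part2", "sub2")]:
    # Putting each part's root on sys.path simulates installing it.
    sys.path.insert(0, str(make_part(tmp, part, sub)))

importlib.invalidate_caches()
sub1 = importlib.import_module("lvl1.lvl2.sub1")
sub2 = importlib.import_module("lvl1.lvl2.sub2")
lvl2 = importlib.import_module("lvl1.lvl2")

print(sub1.NAME, sub2.NAME)
# The namespace package's __path__ spans both separately created trees.
print(list(lvl2.__path__))
```

Both imports resolve even though sub1 and sub2 live in different directory trees, because Python merges every lvl1/lvl2 directory it finds on the path into one namespace.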
To make the above packages compatible with older Python versions, add the pkgutil magic file (an __init__.py that extends __path__ via pkgutil.extend_path) to each of the "lvl1" and "lvl2" folders.
Credits: The example above is modified from https://github.com/pypa/sample-namespace-packages/tree/master/pkgutil
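For the pkgutil variant, here is a minimal sketch of what the "magic file" looks like and how it behaves. It assumes the same two-part layout as above, built in a temporary directory with sys.path standing in for installation; make_part and NAME are hypothetical names used only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# The one-line pkgutil-style __init__.py placed in every namespace folder.
PKGUTIL_INIT = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"


def make_part(root: Path, part: str, sub: str) -> Path:
    """Build one distributable part with pkgutil-style lvl1/ and lvl1/lvl2/."""
    base = root / part
    sub_dir = base / "lvl1" / "lvl2" / sub
    sub_dir.mkdir(parents=True)
    (base / "lvl1" / "__init__.py").write_text(PKGUTIL_INIT)
    (base / "lvl1" / "lvl2" / "__init__.py").write_text(PKGUTIL_INIT)
    # The real subpackage gets an ordinary __init__.py.
    (sub_dir / "__init__.py").write_text(f"NAME = {sub!r}\n")
    return base


tmp = Path(tempfile.mkdtemp())
for part, sub in [("lvl1_part1", "sub1"), ("lvl1_part2", "sub2")]:
    sys.path.insert(0, str(make_part(tmp, part, sub)))

importlib.invalidate_caches()
sub1 = importlib.import_module("lvl1.lvl2.sub1")
sub2 = importlib.import_module("lvl1.lvl2.sub2")
print(sub1.NAME, sub2.NAME)
```

Every part ships an identical __init__.py in lvl1/ and lvl1/lvl2/, and extend_path stitches the matching directories from all sys.path entries together at import time.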

python setup tools - install a sub package from within a project

I have two projects (trysetup1 and trysetup2) with the following structure:
I want to pip install package1 and use module1 from project trysetup2
my setup.py that under package1 looks like this:
import setuptools
setuptools.setup(
    name="common",
    version="1.0.2",
    packages=setuptools.find_packages(),
)
I want to use module1 like this: from package1.module1 import ClassOne, because I still need to use it from package2.
Importing it from module2 works just fine,
but when I try to use it from module3 (in the other project, after pip installing it) I get an "Unresolved reference 'package1'" problem.
I know I could use module1 by putting it inside another package under package1, but I need this exact structure in order to use it from the rest of the project 'trysetup1'.
Thanks!
My answer was found here:
https://docs.python.org/3/distutils/examples.html
Actually, all I needed to do was change my setup.py file to look like this:
setuptools.setup(
    name="common",
    version="1.0.2",
    package_dir={'package1': ''},
    packages=['package1'],
)
The package_dir parameter tells setup() that the files in my root directory (package1) belong to the package1 package, and the packages parameter makes it distribute package1. If you then open:
/..../venv/lib/python3.8/site-packages/common-1.0.2-py3.8.egg-info/top_level.txt
you'll see the following content:

Python Namespace Packages in Python3

The topic of namespace packages seems a bit confusing for the uninitiated, and it doesn't help that prior versions of Python have implemented it in a few different ways or that a lot of the Q&A on StackOverflow are dated. I am looking for a solution in Python 3.5 or later.
#The scenario:
I'm in the process of refactoring a bunch of Python code into modules and submodules, and working to get each of these projects set up to operate independently of each other while sitting in the same namespace.
We're eventually going to be using an internal PyPi server, serving these packages to our internal network and don't want to confuse them with external (public) PyPi packages.
Example: I have 2 modules, and I would like to be able to perform the following:
from org.client.client1 import mod1
from org.common import config
The reflected modules would be separated as such:
Repository 1:
org_client_client1_mod1/
    setup.py
    mod1/
        __init__.py
        somefile.py
Repository 2:
org_common_config/
    setup.py
    config/
        __init__.py
        someotherfile.py
My Git repositories are already setup as org_client_client1_mod1 and org_common_config, so I just need to perform the setup on the packaging and __init__.py files, I believe.
Questions:
#1
With the __init__.py, which of these should I be using (if any)?:
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
Or:
import pkg_resources
pkg_resources.declare_namespace(__name__)
#2
With setup.py, do I still need to add the namespace_modules parameter, and if so, would I use namespace_modules=['org.common'],
or namespace_modules=['org', 'common']?
#3
Could I forgo all of the above by just implementing this differently somehow? Perhaps something simpler or more "pythonic"?
Late to the party, but never hurts to help fellow travellers down the namespace path in Python!
#1:
With the __init__.py, which of these should I be using (if any)?:
It depends. There are three ways to do namespace packages, as listed here:
Use native namespace packages. This type of namespace package is defined in PEP 420 and is available in Python 3.3 and later. This is recommended if packages in your namespace only ever need to support Python 3 and installation via pip.
Use pkgutil-style namespace packages. This is recommended for new packages that need to support Python 2 and 3 and installation via both pip and python setup.py install.
Use pkg_resources-style namespace packages. This method is recommended if you need compatibility with packages already using this method or if your package needs to be zip-safe.
If you are using #2 (pkgutil-style) or #3 (pkg_resources-style), then you will have to use the corresponding style for the __init__.py files. If you use native namespaces, then there must be no __init__.py in the namespace directory.
#2:
With setup.py, do I still need to add the namespace_modules parameter, and if so, would I use namespace_modules=['org.common'], or namespace_modules=['org', 'common']?
If you use pkg_resources-style namespace packages, then yes, you will need namespace_packages (note: not namespace_modules) in your setup(). Native and pkgutil-style namespace packages don't need it.
#3:
Could I forgo all of the above by just implementing this differently somehow? Perhaps something simpler or more "pythonic"?
Since you've ended up at a fairly complex topic in Python, it seems you know what you want and have identified that a namespace package is the way to get it. This would be considered a Pythonic way to solve the problem.
Adding to your questions, here are a few things I discovered:
I read PEP420, the Python Packaging guide and spent a lot of time understanding the namespace packages, and I generally understood how it worked. I read through a couple of answers here, here, here, and this thread on SO as well - the example here and on the Git link shared by Rob.
My problem, however, came after I created my package. As all the instructions and sample code explicitly listed the package in the setuptools.setup(packages=[]) argument, my code failed: my sub-packages/directories were not included. Digging deeper, I found out that setuptools has a find_namespace_packages() function that helps in adding sub-packages too.
EDIT:
Link to find_namespace_packages() (setuptools version 40.1.0 or later): https://setuptools.readthedocs.io/en/latest/setuptools.html#find-namespace-packages
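To see the difference between the two finders, here is a small sketch that runs both against a throwaway tree. It assumes setuptools 40.1.0 or later is installed; the temporary directory and the exact folder names are only illustrative.

```python
import tempfile
from pathlib import Path

from setuptools import find_namespace_packages, find_packages

# Hypothetical project root: org/ and org/client/ are namespace folders
# (no __init__.py), while client1/ and mod1/ are regular packages.
root = Path(tempfile.mkdtemp())
mod1 = root / "org" / "client" / "client1" / "mod1"
mod1.mkdir(parents=True)
(root / "org" / "client" / "client1" / "__init__.py").write_text("")
(mod1 / "__init__.py").write_text("")

# find_packages() stops at the first folder that lacks an __init__.py.
regular_found = find_packages(where=str(root))
# find_namespace_packages() also descends into folders without one.
namespace_found = sorted(find_namespace_packages(where=str(root)))

print(regular_found)
print(namespace_found)
```

Because find_namespace_packages() treats every directory as a potential package, in a real project it is usually worth narrowing it, for example with include=['org.*'], so that unrelated folders such as tests or docs are not picked up.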
EDIT (08/09/2019):
To complete the answer, let me also restructure with an example.
The following solution is assuming Python 3.3+ which has support for implicit namespace packages
Since you are looking for a solution for Python version 3.5 or later, let's take the code samples provided and elaborate further.
Let's assume the following:
Namespace/Python package name : org
Distribution packages: org_client, org_common
Python: 3.3+
setuptools: 40.1.0
For you to do the following
from org.client.client1 import mod1
from org.common import config
And keeping your top level directories the same, viz. org_client_client1_mod1 and org_common_config, you can change your structure to the following
Repository 1:
org_client_client1_mod1/
    setup.py
    org/
        client/
            client1/
                __init__.py
                submod1/
                    __init__.py
                mod1/
                    __init__.py
                    somefile.py
                file1.py
Updated setup.py
from setuptools import find_namespace_packages, setup
setup(
    name="org_client",
    ...
    packages=find_namespace_packages(),  # Follows similar lookup as find_packages()
    ...
)
Repository 2:
org_common_config/
    setup.py
    org/
        common/
            __init__.py
            config/
                __init__.py
                someotherfile.py
Updated setup.py:
from setuptools import find_namespace_packages, setup
setup(
    name="org_common",
    ...
    packages=find_namespace_packages(),  # Follows similar lookup as find_packages()
    ...
)
To install (using pip):
(venv) $ pip3 install org_common_config/
(venv) $ pip3 install org_client_client1_mod1/
Updated pip list will show the following:
(venv) $ pip3 list
...
org_client
org_common
...
But they won't be importable under those names. To import them, you have to use the org.client and org.common notation.
To understand why, you can browse here (assuming inside venv):
(venv) $ cd venv/lib/python3.5/site-packages/
(venv) $ ls -l | grep org
You'll see that there are no org_client or org_common directories; they are interpreted as a namespace package.
(venv) $ cd venv/lib/python3.5/site-packages/org/
(venv) $ ls -l
client/
common/
...
This is a tough subject. All the -'s, _'s, and __init__.py's everywhere don't exactly make it easy on us.
First, I'll answer your questions:
With the __init__.py, which of these should I be using (if any)?
__init__.py can be completely empty, it just needs to be in the correct place. Namely (pun) they should be in any subpackage containing python code (excluding setup.py.) Follow those rules and you should be fine.
With setup.py, do I still need to add the namespace_modules parameter, and if so, would I use namespace_modules=['org.common'], or namespace_modules=['org', 'common']?
Nope! Only name= and packages=. However, note the format of the packages= arg compared against the directory structure.
Here's the format of the packages= arg:
Here's the corresponding directory structure:
Could I forgo all of the above by just implementing this differently somehow? Perhaps something simpler or more "pythonic"?
If you want to be able to install multiple features individually, but under the same top-level namespace, you're on the right track.
I'll spend the rest of this answer re-implementing your namespace package in native format:
I'll put all helpful documentation I've been able to find at the bottom of the post.
K so I'm going to assume you want native namespace packages. First let's look at the current structure of your 2 repos:
org_client_client1_mod1/
    setup.py
    mod1/
        __init__.py
        somefile.py
&
org_common_config/
    setup.py
    config/
        __init__.py
        someotherfile.py
This^ would be too easy!!!
To get what you want:
My brain isn't elastic enough to know if we can go 3-levels deep with namespace packages, but to do what you want, here's what I'm pretty sure you'd want to do:
org-client/
    setup.py
    org/
        client/
            client1/
                __init__.py
                mod1/
                    __init__.py
                    somefile.py
&
org-common-but-also-note-this-name-doesnt-matter/
    setup.py
    org/
        common/
            __init__.py
            config/
                __init__.py
                someotherfile.py
Basically, the key is going to be specifying the correct name= and packages= args to setuptools.setup() inside each setup.py.
These are going to be:
name='org_client',
...
packages=['org.client']
&
name='org_common',
...
packages=['org.common']
respectively.
Then just install each one with pip install . inside each top-level dir.
Installing the first one will give you access to the somefile.py module, and installing the second will give you access to someotherfile.py. It also won't get confused about you trying to install 2 packages named org in the same environment.
K so the most helpful section of the docs: https://packaging.python.org/guides/packaging-namespace-packages/#packaging-namespace-packages
And then here's how I actually came to understand this: https://github.com/pypa/sample-namespace-packages/tree/master/native

Install python repository without parent directory structure

I have inherited a repository that is used by a lot of teams, lots of scripts call it, and it seems like it's going to be a real headache to make any structural changes to it. I would like to make this repo installable somehow. It is structured like this:
my_repo/
    scripts.py
If it was my repository, I would change the structure like so and make it installable, and run python setup.py install:
my_repo/
    setup.py
    my_repo/
        __init__.py
        scripts.py
If this is not feasible (and it sounds like it might not be), can I somehow do something like:
my_repo/
    setup.py
    __init__.py
    scripts.py
And add something to setup.py to let it know that the repo is structured funny like this, so that I can install it?
You can do what you suggest.
my_repo/
    setup.py
    __init__.py
    scripts.py
The only thing is you will need to import modules in your package via their name if they are in the base level. So for example if your structure looked like this:
my_repo/
    setup.py
    __init__.py
    scripts.py
    lib.py
    pkg/
        __init__.py
        pkgmodule.py
Then your imports in scripts.py might look like
from lib import func1, func2
from pkg.pkgmodule import stuff1, stuff2
So in your base directory, imports are essentially by module name, not by package. This could clash with the namespaces of your other packages if you're not careful, for example if another dependency has a package named lib. So it would be best to run these scripts in a virtualenv and to test that the namespacing doesn't get messed up.
There is a directive in the setup.py file to set the name of the package to install and where it should get its modules from. That lets you keep the desired directory structure. For instance, given this directory structure:
my_repo/
    setup.py
    __init__.py
    scripts.py
You could write a setup.py such as:
setup(
    # -- Package structure ----
    packages=['my_repo'],
    package_dir={'my_repo': '.'})
Thus anyone installing the contents of my_repo with the command "./setup.py install" or "pip install ." would end up with an installed copy of my_repo's modules.
As a side note, relative imports work differently in Python 2 and Python 3. In the latter, relative imports must be written explicitly (for example, from . import scripts). This method of installing my_repo will work in Python 3 when importing in an absolute fashion:
from my_repo import scripts

How to exclude a single file from package with setuptools and setup.py

I am working on blowdrycss. The repository is here.
I want the settings file blowdrycss_settings.py to be excluded from the final package on PyPI. The intention is to dynamically build a custom settings file that will be placed in the user's virtualenv / project folder.
In setup.py, I have the following:
packages=find_packages(exclude=['blowdrycss_settings.py', ]),
I also tried exclude_package_data:
exclude_package_data={
    '': ['blowdrycss_settings.py'],
    '': ['blowdrycss/blowdrycss_settings.py'],
    'blowdrycss': ['blowdrycss_settings.py'],
},
I then run python setup.py sdist bdist.
However, when I look in the build folder I still see blowdrycss_settings.py:
- build
    - lib
        - blowdrycss_settings.py
It seems like it should be simple to just exclude a file.
How do I exclude blowdrycss_settings.py from the distributed package?
Imagine you have a project
root
├── setup.py
└── spam
    ├── __init__.py
    ├── bacon.py
    └── eggs.py
and you want to exclude spam/eggs.py from packaging:
import fnmatch
from setuptools import find_packages, setup
from setuptools.command.build_py import build_py as build_py_orig

# File paths (relative to the project root) to leave out of the build.
excluded = ['spam/eggs.py']

class build_py(build_py_orig):
    def find_package_modules(self, package, package_dir):
        modules = super().find_package_modules(package, package_dir)
        # Keep only the modules whose file path matches no excluded pattern.
        return [
            (pkg, mod, file)
            for (pkg, mod, file) in modules
            if not any(fnmatch.fnmatchcase(file, pat=pattern) for pattern in excluded)
        ]

setup(
    packages=find_packages(),
    cmdclass={'build_py': build_py}
)
Globs and multiple entries in excluded list will work too because it is consumed by fnmatch, so you can e.g. declare
excluded = [
    'modules_in_directory/*.py',
    'modules_in_subtree/**/*.py',
    'path/to/other/module.py'
]
etc.
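The filtering step can be tried in isolation, without running a build. This is only a sketch: the modules list below is made-up sample data shaped like the (package, module, file) tuples the recipe filters, and the paths use forward slashes (on Windows the real paths would likely use backslashes, so patterns would need to match that).

```python
import fnmatch

excluded = ["spam/eggs.py", "modules_in_directory/*.py"]

# Made-up sample of (package, module, file) tuples.
modules = [
    ("spam", "__init__", "spam/__init__.py"),
    ("spam", "bacon", "spam/bacon.py"),
    ("spam", "eggs", "spam/eggs.py"),
    ("modules_in_directory", "a", "modules_in_directory/a.py"),
]

# Same filter expression as in the build_py override.
kept = [
    (pkg, mod, file)
    for (pkg, mod, file) in modules
    if not any(fnmatch.fnmatchcase(file, pattern) for pattern in excluded)
]

kept_files = [file for _, _, file in kept]
print(kept_files)
```

Note that fnmatch does not treat the path separator specially, so a single * already matches across directory levels.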
This recipe is based on my other answer to the question "setup.py exclude some python files from bdist". The difference is that this recipe excludes modules based on file globs, while the other one excludes modules based on qualnames, for example
excluded = ['spam.*', '*.settings']
will exclude all submodules of spam package and all modules named settings in every package and subpackage etc.
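How those qualname patterns match can be checked directly with fnmatch. This is not the code of the other answer, only an illustration of the pattern semantics; the module names are invented sample data.

```python
import fnmatch

excluded = ["spam.*", "*.settings"]

# Invented dotted module names to test the patterns against.
qualnames = ["spam", "spam.bacon", "spam.eggs", "ham.settings", "ham.core"]

kept = [
    name
    for name in qualnames
    if not any(fnmatch.fnmatchcase(name, pattern) for pattern in excluded)
]
print(kept)
```

The top-level name spam itself survives, because the pattern spam.* requires a dot after spam; only its submodules are dropped.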
Here is my solution.
Underneath of blowdrycss, I created a new module called settings so the directory structure now looks like this:
blowdrycss
    blowdrycss
        settings
            blowdrycss_settings.py
Based on this reference, inside of setup.py I have the following:
packages=find_packages(exclude=['*.settings', ]),
To build the distribution:
Delete the build, dist, and .egg-info folders.
Run python setup.py sdist bdist
In retrospect, it is good that I was unable to do what I was originally attempting. The new structure feels cleaner and is more modular.
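The reason the restructuring helps is that find_packages(exclude=...) matches package names, not file names. The sketch below shows this on a temporary tree; it assumes setuptools is installed and that both blowdrycss and settings contain an __init__.py (the listing above does not show them, but find_packages only discovers directories that have one).

```python
import tempfile
from pathlib import Path

from setuptools import find_packages

# Throwaway project root with a settings subpackage.
root = Path(tempfile.mkdtemp())
settings = root / "blowdrycss" / "settings"
settings.mkdir(parents=True)
(root / "blowdrycss" / "__init__.py").write_text("")
(settings / "__init__.py").write_text("")
(settings / "blowdrycss_settings.py").write_text("")

everything = sorted(find_packages(where=str(root)))
# A file name never matches a package name, so nothing is excluded.
by_filename = sorted(find_packages(where=str(root), exclude=["blowdrycss_settings.py"]))
# A package-name pattern removes the whole settings subpackage.
by_package = sorted(find_packages(where=str(root), exclude=["*.settings"]))

print(everything)
print(by_filename)
print(by_package)
```

This is consistent with the original attempt having no effect: the exclude list was given a file name, which find_packages never compares against.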
The easiest way to remove a single file, or at least a few specific files, from a package with setuptools is just to use MANIFEST.in. For example, you can exclude all files named foo.py from a package by simply specifying global-exclude foo.py. There's no need for setuptools hacking or changing the structure of your package if you just use the MANIFEST.in method.
See the dedicated PyPA article on using MANIFEST.in for more commands you can use.
