I've been coding in Python for a long time, but I've never actually tried to package a piece of code so that I can share it. I started reading
https://python-packaging.readthedocs.io/en/latest/.
I started with the simplest possible case: say I want to share a module named 'clipper', and the only important thing is a class called Clipper. It seems that if I use setuptools, I should create the folders
clipper/clipper
and inside the inner clipper, place a file __init__.py
with the definition of class Clipper. So far so good. Theoretically, after installing the package, the way to use the class would be:
import clipper
cl = clipper.Clipper()
My problem is, I am assuming that while I am developing and before any installation, the same code should work. I mean, the previous code should create an instance of the object. But how would that work? How should I set PYTHONPATH so that the previous import would actually work?
Maybe I got something really wrong. I thought packaging would be easier than coding, but I've spent some time on it and I still don't get it. Any help, please?
Rather than modifying your Python path, install the packaged module as an editable version and your environment will handle this for you. When you have it running as an editable version you'll be able to make changes to the code on your local development instance.
For example, assuming you have Pip you can run the following command in the first 'clipper' folder (the same folder as the setup.py file you created during packaging):
pip install -e .
-e means editable
the . means install the package located in the current folder.
More detail here in an SO answer from 2015: "pip install --editable ./" vs "python setup.py develop"
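Once the editable install is in place, one quick way to check that an import resolves to your working copy is to ask the import system where it would load the module from. The sketch below uses the standard-library importlib.util.find_spec. The helper name module_origin is made up for illustration, and json only stands in for your own package name (for example clipper).

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if it is not importable."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# "json" is a stand-in: replace it with your own package name after `pip install -e .`.
print(module_origin("json"))
# A name that is not installed anywhere yields None instead of raising.
print(module_origin("package_that_does_not_exist_demo"))
```

If the printed path points into your project folder rather than into site-packages, the editable install is doing its job.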
Your directory tree would look a bit like this:
~/clipper/
    setup.py
    clipper/
        __init__.py
        clipper.py
The setup.py file contains information telling Python how to 'parse' your package. Things like the name of your project, the version and what packages to include are defined here. For your example, setup.py may look like this:
# distutils was removed from the standard library in Python 3.12; setuptools provides the same setup() function
from setuptools import setup
setup(
    name="Clipper",
    # A name for your package, typically your project name
    description="My first package",
    # A short description
    version="1.0.0",
    # A version specification
    packages=["clipper"]
    # A list of packages to include
)
Within clipper, clipper.py contains the actual Clipper class:
class Clipper(object):
    def __init__(self):
        pass
    def foo(self):
        print("Invoked foo!")
__init__.py is a special type of file. It defines the public interface for interacting with your package. Typically, it imports all public functions and classes:
from .clipper import Clipper
Finally, to build a source distribution of this package, run python3 setup.py sdist [1]. Building is not what makes the import work, though: when you start Python from ~/clipper/, that directory is on the import path, so the clipper package inside it can be found. Let's try that now. Navigate back to ~/clipper/ and start Python:
>>> from clipper import Clipper
>>> c = Clipper()
>>> c.foo()
Invoked foo!
>>>
And here's a 'real' example of what a package directory would look like:
~/calculator/
    setup.py
    calculator/
        __init__.py
        add.py
        substract.py
setup.py
from setuptools import setup
setup(
    name="Calculator",
    description="Calculate stuff!",
    version="1.0.0",
    packages=["calculator"]
)
__init__.py
from .add import *
from .substract import *
add.py
def add(a, b):
    """Return `a` + `b`."""
    return a + b
substract.py
def substract(a, b):
    """Return `a` - `b`."""
    return a - b
For more information, see the Python tutorial on packaging.
1 You may get some warnings about missing information, but you can ignore that for now.
My directory is structured like this
>project
    >tests
        >test_to_json.py
    >app.py
    >x.db
in my test_to_json.py I want to access a class from app.py
How would I import it? I've tried from ..app import myClass but that gives me ImportError: attempted relative import with no known parent package. Is there something I'm missing?
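Relative imports that use .. only work inside a package. When you run test_to_json.py directly, Python treats it as a top-level script with no parent package, so from ..app import myClass has nothing to resolve against and fails with the error you see. The usual resolution for this problem is to convert your project into a Python package.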
Extensive tutorials for doing so can be found here, but I will give an example of how to convert your project into a package.
Step 1
The new file structure should look like this:
>project
    >tests
        >__init__.py #Note this file
        >test_to_json.py
    >__init__.py #Note this file
    >setup.py #Note this file
    >app.py
    >x.db
Step 2
Write your setup.py.
Here is a generic setup.py that should work for your project:
from setuptools import setup, find_packages
setup(
    name='project_package', #the name of your desired import
    version='0.0.1',
    author='An Awesome Coder',
    packages=find_packages(),
    description='An awesome package that does something',
    install_requires=[], # a list of python dependencies for your package
)
find_packages() walks the folder that contains setup.py and treats every directory with an __init__.py file as a package to include.
Step 3
Install your package. In the folder with your new setup.py, run pip install -e . (including the trailing dot). This installs your package in editable mode on your computer, which basically adds your project files to Python's import path.
Step 4
From any python terminal on your computer you should now be able to import your package using the package_name you specified.
import project_package
from project_package.app import myClass
myClass()
.
.
.
Background
I'm trying to create a Python package with a semi-complicated structure. I have published a few packages before, and because they were simple, I put all the classes and functions necessary in __init__.py itself. For example, the file structure of one of my simple packages would be:
Package/
venv # my virtual environment
package-name/
__init__.py
setup.py # with setuptools, etc.
A sample __init__.py file from a sample package would be:
# __init__.py from a sample simple package
import requests
class User:
    def __init__(self, something):
        self.something = requests.get(url).json()['something']
    def do_something(self):
        return float(self.something) * 10
As a sample package like this is basic and only requires the User class, this would suffice. After installing the package with pip, import package_name gives access to the User class.
# random computer with simple package installed
import package_name
api = package_name.User()
And this works fine.
Question
A more complicated package like the one I'm working on cannot contain all classes and functions directly in the __init__.py file. The structure is below:
Package/
venv # my virtual environment
package-name/
__init__.py
related.py
something.py
setup.py
The problem is that I can't quite figure out how to get the contents of related.py and something.py to work implicitly. By implicitly, I mean that when the user executes import package_name, they can use package_name.attribute to access any attributes from any of the files, whether it's in __init__.py, related.py, or something.py.
Currently, if I structure the __init__.py file like this:
# complex package __init__.py
from package_name import related
from package_name import something
The user still has to call related and something as attributes of package_name:
# random computer user file with package_name installed
import package_name
x = package_name.related.x()
y = package_name.something.y()
I want them to be able to do this instead:
# random computer user file with package_name installed
import package_name
x = package_name.x()
y = package_name.y()
Without having to themselves write: from package_name import related and from package_name import something on their own computer.
Apologies for the long question but I wanted to be as clear as possible about what I'm asking because there are a lot of moving parts.
Whatever names are available in your __init__.py will be available as package_name.whatever when someone does import package_name. So if you want to make names from submodules available, you can do from package_name.related import x (or from .related import x) inside your __init__.py.
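As a minimal sketch of that re-export pattern, the script below builds a throwaway package in a temporary directory and imports it. The package name demo_pkg and the functions x and y are hypothetical stand-ins for package_name and its contents; in a real project the three files would simply live in your source tree.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (demo_pkg, x and y are made-up names).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "related.py").write_text("def x():\n    return 'x from related'\n")
(pkg / "something.py").write_text("def y():\n    return 'y from something'\n")

# The __init__.py re-exports the submodule names at package level.
(pkg / "__init__.py").write_text(
    "from .related import x\n"
    "from .something import y\n"
)

# Make the temporary directory importable, then import the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
import demo_pkg

# Users can now reach x and y directly on the package.
print(demo_pkg.x())
print(demo_pkg.y())
```

The submodules stay importable as demo_pkg.related and demo_pkg.something, so this only adds shorter names and does not hide anything.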
The following structure (in Python 3.7) is not allowing me to import class A in module B:
package:
    package:
        __init__.py
        a:
            __init__.py
            a.py
        b:
            __init__.py
            b.py
The top-level __init__.py is blank. Here are the remaining files:
a
# package/package/a/__init__.py
from .a import A
# package/package/a/a.py
class A:
    def __init__(self):
        pass
b:
# package/package/b/__init__.py
from .b import B
# package/package/b/b.py
from package.a.a import A
class B:
    def __init__(self):
        pass
Without doing anything else, on Windows, if I try to run b.py (from within the b folder), I get the following error:
ModuleNotFoundError: No module named 'package.a'
If I add a main.py at the top level:
package:
    package:
        __init__.py
        main.py
        a:
            __init__.py
            a.py
        b:
            __init__.py
            b.py
containing
# package/package/main.py
import a
import b
and run main.py (from within package/package), I get the same error.
If I change b.py to
# package/package/b/b.py
from ..a.a import A
class B:
    def __init__(self):
        pass
and run b.py (from within the b folder) or main.py (from within package/package), I get the error that
ValueError: attempted relative import beyond top-level package
The python docs make it seem like I should be able to do this though!
Can someone please explain why I am getting these errors? I've seen a couple similar posts to this, but they haven't fixed my problem:
Importing Submodules Python 3
Python submodule importing correctly in python 3.7 but not 3.6
Whatever module is being run by Python is called top-level.
In your shell, when you run > py main.py ($ python3 main.py on Linux), the file main.py is top-level and is called the top-level module.
In the interpreter, the interpreter itself is always top-level, and is called the top-level environment (for proof, type >>> __name__ into the interpreter and it will return '__main__')
Unfortunately (IMO), the term "top-level" is not well-defined in the python docs as it is used in several different contexts. Regardless, it is important to understand that Python always renames __name__ of the top-level entity to '__main__'.
PEP 328 (explained in this SO post) states
relative imports use the module's __name__ attribute to determine its position in the package hierarchy.
Since Python renames the __name__ of the top-level module to '__main__', the top-level module has no package information because it has no dots in its name.
Hence, top-level modules are not considered part of any package (even though they very well may be!). In other words, for packages imported from the current directory, '__main__' determines what is top-level. Packages at the same level as '__main__' (a and b in my example) are top-level packages.
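To see this renaming in action, here is a small experiment, offered as a sketch rather than as part of the original answer. It writes a one-line module into a temporary package (the name demo_toplevel_pkg is made up) and runs it twice: once by file path, and once with python -m, an invocation the answer above does not use but which keeps the package information.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A throwaway package with a module that reports its own identity.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_toplevel_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "mod.py").write_text("print(__name__, repr(__package__))\n")

# 1) Run the file directly: it becomes '__main__' and has no package.
as_script = subprocess.run(
    [sys.executable, str(pkg / "mod.py")],
    capture_output=True, text=True, check=True,
).stdout.strip()

# 2) Run it as a module from the directory that contains the package:
#    still '__main__', but __package__ records where it lives.
as_module = subprocess.run(
    [sys.executable, "-m", "demo_toplevel_pkg.mod"],
    cwd=root, capture_output=True, text=True, check=True,
).stdout.strip()

print(as_script)   # __main__ None
print(as_module)   # __main__ 'demo_toplevel_pkg'
```

In the first run there is no package for a relative import to resolve against, which is why running a file inside a package directly tends to break its relative imports.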
Confusingly, the python docs and PEP 328 give a misleading example. The "correct usages" shown for relative imports are only valid in a specific context.
Recall that import searches paths listed in sys.path to find packages and modules to import. The current directory is in sys.path, along with the paths to builtin packages (like math and os) and installed packages (i.e. pip installed package). Python does not rename the __name__ of non-top-level packages.
Therefore, the python docs and PEP 328 examples are valid only for packages and modules NOT in the top-level directory.
What I should have written was from a.a import A:
# package/package/b/b.py
from a.a import A
class B:
    def __init__(self):
        pass
Since package is above the top-level module, trying to do an absolute import (like from package.a.a import A) results in an ImportError even though main.py is inside of the package package.
That being said, if you go to PyPI and GitHub and look at released packages, you will find they have absolute imports like import package.a.a! In fact, if I were to publish my package and leave the import as from a.a import A, end users would get an ImportError because they installed package package and don't have a package a! Furthermore, in the current configuration, I'm unable to test with unittest or pytest that users can import and use my package as expected because I cannot do from package.a.a import A myself!
So the question becomes how do you write and test your own custom packages?
The trick is that these packages were written in development mode. To install your package in development mode, run > pip install -e . from the top-level directory (assuming you have a setup.py written; see the NOTE below).
When this is done, python treats your package like a typical library package (i.e. a pip installed package), so python does not change its __name__ to __main__. Thus, you can
import it with absolute imports
test and use your package like an end user would, and
any edits you make to it take effect immediately, without requiring you to re-install it with pip, just like packages in the top-level directory do
This key difference between developing packages vs. standalone programs is a huge source of confusion and frustration for most first-time developers (myself included) and is very important to keep in mind. I hope this answer provides clarification for others and may be added to documentation in the future. Please let me know in the comments below if this helped you.
NOTE: pip install -e ., where -e stands for "editable", puts a link (a *.pth file) in your python installation folder so that the package is treated as an installed package, while any changes you write in it still take effect immediately (see the Python Packaging Tutorial). Hence, you can use this to develop your own packages or to install and edit third-party packages to your needs. This requires that you create a setup.py, but all your test code, client code, etc., will be able to import your package the usual way (i.e. you can treat your package like any other pip installed package). You can achieve the same effect with poetry and flit by configuring your pyproject.toml file.
Here are some additional useful references:
realpython.com: Python Modules and Packages - An Introduction
realpython.com: Python import: Advanced Techniques and Tips
realpython.com: Absolute vs Relative Imports in Python
I've also stumbled upon a similar issue. I've decided to create a new Python import library to solve this and other issues.
The result is ultraimport. It allows file based imports in Python and it does not care about any top-level module. If you know the path, you can import the file. I've used your structure as one of the examples which you can also find in the repository in the examples folder.
After changing your b.py to:
import ultraimport
A = ultraimport('__dir__/../a/a.py', 'A')
print(A)
class B:
    def __init__(self):
        pass
you can execute it as expected from the b folder:
package/package/b$ python ./b.py
<class 'a.A'>
Also running it through main.py works now:
package/package$ python ./main.py
<class 'a.A'>
I want to inherit from a class in a file that lies in a directory above the current one.
Is it possible to relatively import that file?
from ..subpkg2 import mod
Per the Python docs: When inside a package hierarchy, use two dots, as the import statement doc says:
When specifying what module to import you do not have to specify the absolute name of the module. When a module or package is contained within another package it is possible to make a relative import within the same top package without having to mention the package name. By using leading dots in the specified module or package after from you can specify how high to traverse up the current package hierarchy without specifying exact names. One leading dot means the current package where the module making the import exists. Two dots means up one package level. Three dots is up two levels, etc. So if you execute from . import mod from a module in the pkg package then you will end up importing pkg.mod. If you execute from ..subpkg2 import mod from within pkg.subpkg1 you will import pkg.subpkg2.mod. The specification for relative imports is contained within PEP 328.
PEP 328 deals with absolute/relative imports.
import sys
sys.path.append("..") # Adds higher directory to python modules path.
@gimel's answer is correct if you can guarantee the package hierarchy he mentions. If you can't -- if your real need is as you expressed it, exclusively tied to directories and without any necessary relationship to packaging -- then you need to work on __file__ to find out the parent directory (a couple of os.path.dirname calls will do;-), then (if that directory is not already on sys.path) temporarily insert said dir at the very start of sys.path, __import__, remove said dir again -- messy work indeed, but, "when you must, you must" (and Python strives to never stop the programmer from doing what must be done -- just like the ISO C standard says in the "Spirit of C" section in its preface!-).
Here is an example that may work for you:
import sys
import os.path
sys.path.append(
    os.path.abspath(os.path.join(os.path.dirname(__file__), os.path.pardir)))
import module_in_parent_dir
Import module from a directory which is exactly one level above the current directory:
from .. import module
How to load a module that is a directory up
preface: I did a substantial rewrite of a previous answer in the hope of easing people into python's ecosystem, and of giving everyone the best chance of success with python's import system.
This will cover relative imports within a package, which I think is the most probable case to OP's question.
Python is a modular system
This is why we write import foo to load a module "foo" from the root namespace, instead of writing:
foo = dict()  # please avoid doing this
with open(os.path.join(os.path.dirname(__file__), '../foo.py')) as foo_fh:  # please avoid doing this
    exec(compile(foo_fh.read(), 'foo.py', 'exec'), foo)  # please avoid doing this
Python isn't coupled to a file-system
This is why we can embed python in environments where there isn't a de facto filesystem without providing a virtual one, such as Jython.
Being decoupled from a filesystem lets imports be flexible, this design allows for things like imports from archive/zip files, import singletons, bytecode caching, cffi extensions, even remote code definition loading.
So if imports are not coupled to a filesystem what does "one directory up" mean? We have to pick out some heuristics but we can do that, for example when working within a package, some heuristics have already been defined that makes relative imports like .foo and ..foo work within the same package. Cool!
If you sincerely want to couple your source code loading patterns to a filesystem, you can do that. You'll have to choose your own heuristics, and use some kind of importing machinery, I recommend importlib
Python's importlib example looks something like so:
import importlib.util
import os  # needed for os.path below
import sys
# For illustrative purposes.
file_path = os.path.join(os.path.dirname(__file__), '../foo.py')
module_name = 'foo'
foo_spec = importlib.util.spec_from_file_location(module_name, file_path)
# foo_spec is a ModuleSpec specifying a SourceFileLoader
foo_module = importlib.util.module_from_spec(foo_spec)
sys.modules[module_name] = foo_module
foo_spec.loader.exec_module(foo_module)
foo = sys.modules[module_name]
# foo is the sys.modules['foo'] singleton
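The same importlib steps can be wrapped in a small helper so that the path handling lives in one place. This is only a sketch under stated assumptions: load_module_from_path and the module name foo_from_path_demo are hypothetical, and the demo file is created in a temporary directory so that the example runs on its own.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(module_name, file_path):
    """Load a .py file as a module named module_name and register it in sys.modules."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialised module behind if its code fails.
        sys.modules.pop(module_name, None)
        raise
    return module


# Demo: write a tiny module to a temporary directory and load it by path.
demo_file = Path(tempfile.mkdtemp()) / "foo.py"
demo_file.write_text("ANSWER = 42\n")
foo_module = load_module_from_path("foo_from_path_demo", str(demo_file))
print(foo_module.ANSWER)
```

In a real script you would typically compute the path from __file__, as the example above does, instead of using a temporary directory.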
Packaging
There is a great example project available officially here: https://github.com/pypa/sampleproject
A python package is a collection of information about your source code, that can inform other tools how to copy your source code to other computers, and how to integrate your source code into that system's path so that import foo works for other computers (regardless of interpreter, host operating system, etc)
Directory Structure
Let's have a package named foo, in some directory (preferably an empty directory).
some_directory/
    foo.py # `if __name__ == "__main__":` lives here
My preference is to create setup.py as sibling to foo.py, because it makes writing the setup.py file simpler, however you can write configuration to change/redirect everything setuptools does by default if you like; for example putting foo.py under a "src/" directory is somewhat popular, not covered here.
some_directory/
    foo.py
    setup.py
.
#!/usr/bin/env python3
# setup.py
import setuptools
setuptools.setup(
    name="foo",
    ...
    py_modules=['foo'],
)
.
python3 -m pip install --editable ./ # or path/to/some_directory/
"editable" aka -e will yet again redirect the importing machinery to load the source files in this directory, instead of copying the current exact files to the installing environment's library. This can also cause behavioral differences on a developer's machine, so be sure to test your code!
There are tools other than pip, however I'd recommend pip be the introductory one :)
I also like to make foo a "package" (a directory containing __init__.py) instead of a module (a single ".py" file). Both "packages" and "modules" can be loaded into the root namespace, but packages allow for nested namespaces, which is helpful if we want to have a "relative one directory up" import.
some_directory/
    foo/
        __init__.py
    setup.py
.
#!/usr/bin/env python3
# setup.py
import setuptools
setuptools.setup(
    name="foo",
    ...
    packages=['foo'],
)
I also like to make a foo/__main__.py, this allows python to execute the package as a module, eg python3 -m foo will execute foo/__main__.py as __main__.
some_directory/
    foo/
        __init__.py
        __main__.py # `if __name__ == "__main__":` lives here, `def main():` too!
    setup.py
.
#!/usr/bin/env python3
# setup.py
import setuptools
setuptools.setup(
    name="foo",
    ...
    packages=['foo'],
    ...
    entry_points={
        'console_scripts': [
            # "foo" will be added to the installing-environment's text mode shell, eg `bash -c foo`
            'foo=foo.__main__:main',
        ]
    },
)
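To illustrate what a package-level __main__.py buys you, the following sketch creates a tiny package in a temporary directory and runs it with python -m. The package name foo_cli_demo and its greeting are made up for the demonstration; no installation or setup.py is involved here, the package is simply found because the command runs from its parent directory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A throwaway package whose __main__.py defines and calls main().
root = Path(tempfile.mkdtemp())
pkg = root / "foo_cli_demo"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "__main__.py").write_text(
    "def main():\n"
    "    print('hello from foo_cli_demo')\n"
    "\n"
    "if __name__ == '__main__':\n"
    "    main()\n"
)

# `python -m foo_cli_demo` executes foo_cli_demo/__main__.py as __main__.
result = subprocess.run(
    [sys.executable, "-m", "foo_cli_demo"],
    cwd=root, capture_output=True, text=True, check=True,
)
output = result.stdout.strip()
print(output)
```

Keeping the logic in a main() function is what lets a console_scripts entry point refer to the same code as foo.__main__:main.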
Let's flesh this out with some more modules:
Basically, you can have a directory structure like so:
some_directory/
    bar.py               # `import bar`
    foo/
        __init__.py      # `import foo`
        __main__.py
        baz.py           # `import foo.baz`
        spam/
            __init__.py  # `import foo.spam`
            eggs.py      # `import foo.spam.eggs`
    setup.py
setup.py conventionally holds metadata information about the source code within, such as:
what dependencies are needed to install (named "install_requires"),
what name should be used for package management (install/uninstall "name"), I suggest this match your primary python package name in our case foo, though substituting underscores for hyphens is popular
licensing information
maturity tags (alpha/beta/etc),
audience tags (for developers, for machine learning, etc),
single-page documentation content (like a README),
shell names (names you type at user shell like bash, or names you find in a graphical user shell like a start menu),
a list of python modules this package will install (and uninstall)
a de facto "run tests" entry point python ./setup.py test
It's very expansive; it can even compile C extensions on the fly if a source module is being installed on a development machine. For an everyday example I recommend the PyPA Sample Repository's setup.py
If you are releasing a build artifact, e.g. a copy of the code that is meant to run on nearly identical computers, a requirements.txt file is a popular way to snapshot exact dependency information, whereas "install_requires" is a good way to capture minimum and maximum compatible versions. However, given that the target machines are nearly identical anyway, I highly recommend creating a tarball of an entire python prefix. This can be tricky and is too detailed to get into here. Check out pip install's --target option, or virtualenv aka venv for leads.
back to the example
how to import a file one directory up:
From foo/spam/eggs.py, if we wanted code from foo/baz we could ask for it by its absolute namespace:
import foo.baz
If we wanted to reserve capability to move eggs.py into some other directory in the future with some other relative baz implementation, we could use a relative import like:
from .. import baz
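Note that Python only accepts relative imports in the from ... import form, so "one package up" is spelled from .. import baz. As a sketch of how that resolves, the script below recreates the foo/baz.py and foo/spam/eggs.py layout in a temporary directory. The top-level name foo_demo is used instead of foo purely to avoid clashing with anything already installed, and the file contents are invented for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the layout: foo_demo/baz.py and foo_demo/spam/eggs.py.
root = Path(tempfile.mkdtemp())
(root / "foo_demo" / "spam").mkdir(parents=True)
(root / "foo_demo" / "__init__.py").write_text("")
(root / "foo_demo" / "baz.py").write_text("NAME = 'baz'\n")
(root / "foo_demo" / "spam" / "__init__.py").write_text("")
(root / "foo_demo" / "spam" / "eggs.py").write_text(
    "from .. import baz  # two dots: up one package level, to foo_demo\n"
    "GREETING = 'eggs sees ' + baz.NAME\n"
)

# The directory that CONTAINS foo_demo must be importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
import foo_demo.spam.eggs

print(foo_demo.spam.eggs.GREETING)
```

The relative import works here because eggs.py is imported as foo_demo.spam.eggs; running eggs.py directly as a script would lose that package context.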
Here's a three-step, somewhat minimalist version of ThorSummoner's answer for the sake of clarity. It doesn't quite do what I want (I'll explain at the bottom), but it works okay.
Step 1: Make directory and setup.py
filepath_to/project_name/
    setup.py
In setup.py, write:
import setuptools
setuptools.setup(name='project_name')
Step 2: Install this directory as a package
Run this code in console:
python -m pip install --editable filepath_to/project_name
Instead of python, you may need to use python3 or something, depending on how your python is installed. Also, you can use -e instead of --editable.
Now, your directory will look more or less like this. I don't know what the egg stuff is.
filepath_to/project_name/
    setup.py
    test_3.egg-info/
        dependency_links.txt
        PKG-INFO
        SOURCES.txt
        top_level.txt
This folder is considered a python package and you can import from files in this parent directory even if you're writing a script anywhere else on your computer.
Step 3. Import from above
Let's say you make two files, one in your project's main directory and another in a sub directory. It'll look like this:
filepath_to/project_name/
    top_level_file.py
    subdirectory/
        subfile.py
    setup.py          |
    test_3.egg-info/  |----- Ignore these guys
    ...               |
Now, if top_level_file.py looks like this:
x = 1
Then I can import it from subfile.py, or really any other file anywhere else on your computer.
# subfile.py OR some_other_python_file_somewhere_else.py
import random # This is a standard package that can be imported anywhere.
import top_level_file # Now, top_level_file.py works similarly.
print(top_level_file.x)
This is different from what I was looking for: I hoped python had a one-line way to import from a file above. Instead, I have to treat the script like a module, do a bunch of boilerplate, and install it globally for the entire python installation to have access to it. It's overkill. If anyone has a simpler method that doesn't involve the above process or importlib shenanigans, please let me know.
A polished version of @alex-martelli's answer, using pathlib:
import pathlib
import sys
_parentdir = pathlib.Path(__file__).parent.parent.resolve()
sys.path.insert(0, str(_parentdir))
import module_in_parent_dir
sys.path.remove(str(_parentdir))
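The insert, import, remove sequence can also be packaged as a context manager, so the clean-up happens even if the import raises. This is a sketch, not part of the answer above: temporary_sys_path is a hypothetical helper, and the module module_in_parent_dir_demo is written to a temporary directory only so that the example is runnable by itself.

```python
import contextlib
import importlib
import sys
import tempfile
from pathlib import Path


@contextlib.contextmanager
def temporary_sys_path(directory):
    """Put `directory` at the front of sys.path for the duration of the with-block."""
    directory = str(directory)
    sys.path.insert(0, directory)
    try:
        yield
    finally:
        # Remove our entry again, even if the import inside the block failed.
        with contextlib.suppress(ValueError):
            sys.path.remove(directory)


# Demo: a stand-in for the parent directory, containing one module.
parent = Path(tempfile.mkdtemp())
(parent / "module_in_parent_dir_demo.py").write_text("VALUE = 42\n")

with temporary_sys_path(parent):
    importlib.invalidate_caches()
    import module_in_parent_dir_demo

print(module_in_parent_dir_demo.VALUE)
```

In a real script, the directory would be pathlib.Path(__file__).parent.parent.resolve(), as in the snippet above.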
To run python /myprogram/submodule/mymodule.py which imports /myprogram/mainmodule.py, e.g., via
from mainmodule import *
on Linux (e.g., in the python Docker image), I had to add the program root directory to PYTHONPATH:
export PYTHONPATH=/myprogram
It is 2022 and none of the answers really worked for me. Here is what worked in the end
import sys
sys.path.append('../src')  # the directory that contains my_class.py, relative to the notebook's working directory
import my_class
My directory structure:
src
--my_class.py
notebooks
-- mynotebook.ipynb
I imported my_class from mynotebook.ipynb.
You can use the sys.path.append() method to add the directory containing the package to the list of paths searched for modules. For example, if the package is located two directories above the current directory, you can use the following code:
import sys
sys.path.append("../../")
If the package is located one directory above the current directory, you can use the code below:
import sys
sys.path.append("..")
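One caveat with relative entries such as ".." is that they are interpreted relative to the current working directory of the process, not relative to the file that contains the code. A sketch of a more robust variant, which resolves the directory from the file's own location, is shown below. The helper ancestor_dir and the project/notebooks/script.py layout are assumptions made for illustration; in your own script you would pass __file__.

```python
import sys
import tempfile
from pathlib import Path


def ancestor_dir(file_path, levels_up=1):
    """Return the directory `levels_up` levels above the directory that contains file_path."""
    return Path(file_path).resolve().parents[levels_up]


# Demo layout: <base>/project/notebooks/script.py (stands in for __file__).
base = Path(tempfile.mkdtemp()).resolve()
script = base / "project" / "notebooks" / "script.py"
script.parent.mkdir(parents=True)
script.write_text("")

one_up = ancestor_dir(script, 1)  # <base>/project
two_up = ancestor_dir(script, 2)  # <base>

# Append the absolute path, so the result no longer depends on the working directory.
if str(one_up) not in sys.path:
    sys.path.append(str(one_up))

print(one_up)
print(two_up)
```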
Python is a modular system
Python doesn't rely on a file system
To load python code reliably, have that code in a module, and that module installed in python's library.
Installed modules can always be loaded from the top level namespace with import <name>
There is a great sample project available officially here: https://github.com/pypa/sampleproject
Basically, you can have a directory structure like so:
the_foo_project/
    setup.py
    bar.py               # `import bar`
    foo/
        __init__.py      # `import foo`
        baz.py           # `import foo.baz`
        faz/             # `import foo.faz`
            __init__.py
            daz.py       # `import foo.faz.daz` ... etc.
.
Be sure to declare your setuptools.setup() in setup.py,
official example: https://github.com/pypa/sampleproject/blob/master/setup.py
In our case we probably want to export bar.py and foo/__init__.py, my brief example:
setup.py
#!/usr/bin/env python3
import setuptools
setuptools.setup(
    ...
    py_modules=['bar'],
    packages=['foo'],
    ...
    entry_points={},
    # Note, any changes to your setup.py, like adding to `packages`, or
    # changing `entry_points` will require the module to be reinstalled;
    # `python3 -m pip install --upgrade --editable ./the_foo_project`
)
.
Now we can install our module into the python library;
with pip, you can install the_foo_project into your python library in edit mode,
so we can work on it in real time
python3 -m pip install --editable=./the_foo_project
# if you get a permission error, you can always use
# `pip ... --user` to install in your user python library
.
Now from any python context, we can load our shared py_modules and packages
foo_script.py
#!/usr/bin/env python3
import bar
import foo
print(dir(bar))
print(dir(foo))
The topic of namespace packages seems a bit confusing for the uninitiated, and it doesn't help that prior versions of Python have implemented it in a few different ways or that a lot of the Q&A on StackOverflow are dated. I am looking for a solution in Python 3.5 or later.
#The scenario:
I'm in the process of refactoring a bunch of Python code into modules and submodules, and working to get each of these projects set up to operate independently of each other while sitting in the same namespace.
We're eventually going to be using an internal PyPi server, serving these packages to our internal network and don't want to confuse them with external (public) PyPi packages.
Example: I have 2 modules, and I would like to be able to perform the following:
from org.client.client1 import mod1
from org.common import config
The reflected modules would be separated as such:
Repository 1:
org_client_client1_mod1/
    setup.py
    mod1/
        __init__.py
        somefile.py
Repository 2:
org_common_config/
    setup.py
    config/
        __init__.py
        someotherfile.py
My Git repositories are already setup as org_client_client1_mod1 and org_common_config, so I just need to perform the setup on the packaging and __init__.py files, I believe.
Questions:
#1
With the __init__.py, which of these should I be using (if any)?:
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
Or:
import pkg_resources
pkg_resources.declare_namespace(__name__)
#2
With setup.py, do I still need to add the namespace_modules parameter, and if so, would I use namespace_modules=['org.common'],
or namespace_modules=['org', 'common']?
#3
Could I forgo all of the above by just implementing this differently somehow? Perhaps something simpler or more "pythonic"?
Late to the party, but never hurts to help fellow travellers down the namespace path in Python!
#1:
With the __init__.py, which of these should I be using (if any)?:
It depends. There are three ways to do namespace packages, as listed here:
Use native namespace packages. This type of namespace package is defined in PEP 420 and is available in Python 3.3 and later. This is recommended if packages in your namespace only ever need to support Python 3 and installation via pip.
Use pkgutil-style namespace packages. This is recommended for new packages that need to support Python 2 and 3 and installation via both pip and python setup.py install.
Use pkg_resources-style namespace packages. This method is recommended if you need compatibility with packages already using this method or if your package needs to be zip-safe.
If you are using #2 (pkgutil-style) or #3 (pkg_resources-style), then you will have to use the corresponding style for __init__.py files. If you use native namespaces, then there must be no __init__.py in the namespace directory.
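A minimal sketch of the native (PEP 420) style: two separate directories stand in for two repositories, each contributing a different subpackage to the same namespace. The namespace name org_demo and the NAME constants are made up for the demonstration, and the directories are put on sys.path by hand instead of being installed with pip.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two independent "repositories", each providing part of the org_demo namespace.
repo1 = Path(tempfile.mkdtemp())
repo2 = Path(tempfile.mkdtemp())

(repo1 / "org_demo" / "client").mkdir(parents=True)
(repo1 / "org_demo" / "client" / "__init__.py").write_text("NAME = 'client'\n")

(repo2 / "org_demo" / "common").mkdir(parents=True)
(repo2 / "org_demo" / "common" / "__init__.py").write_text("NAME = 'common'\n")

# Note: neither org_demo/ directory contains an __init__.py.
sys.path[:0] = [str(repo1), str(repo2)]
importlib.invalidate_caches()

import org_demo
from org_demo import client, common

print(client.NAME, common.NAME)
# The namespace package spans both directories.
print(list(org_demo.__path__))
```

If either org_demo/ directory gained an __init__.py, it would become a regular package and shadow the portion in the other directory, which is the reason for the "no __init__.py" rule.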
#2:
With setup.py, do I still need to add the namespace_modules parameter, and if so, would I use namespace_modules=['org.common'], or namespace_modules=['org', 'common']?
If your choice of namespace package is not native style, then yes, you will need namespace_packages in your setup().
#3:
Could I forgo all of the above by just implementing this differently somehow? Perhaps something simpler or more "pythonic"?
Since you have ended up at a fairly complex topic in Python, it seems you know what you are doing and what you want, and you have identified that a Python namespace package is the way to do it. That would be considered a pythonic way to solve the problem.
Adding to your questions, here are a few things I discovered:
I read PEP 420 and the Python Packaging guide, spent a lot of time understanding namespace packages, and generally understood how they work. I also read through a couple of answers here, here, here, and this thread on SO, as well as the example here and on the Git link shared by Rob.
My problem, however, came after I created my package. All the instructions and sample code explicitly listed the packages in setuptools.setup(packages=[...]), and when I did the same, my code failed: my sub-packages/directories were not included. Digging deeper, I found out that setuptools has a find_namespace_packages() function that helps with adding sub-packages too.
EDIT:
Link to find_namespace_packages() (requires setuptools 40.1.0 or later): https://setuptools.readthedocs.io/en/latest/setuptools.html#find-namespace-packages
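To show why the sub-packages went missing, here is a rough stdlib-only approximation of the two lookups. It is not setuptools' actual code, and the names discover() and make_tree() are invented for this sketch. The assumption it illustrates: find_packages() only accepts directories that contain __init__.py, while find_namespace_packages() accepts plain directories as well.

```python
import os
import tempfile


def discover(root, require_init):
    """Return dotted names of package directories found under root.

    require_init=True roughly mimics find_packages(): a directory counts only
    if it has __init__.py, and the walk does not go below one that lacks it.
    require_init=False roughly mimics find_namespace_packages(): every
    directory counts.
    """
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        # This sketch skips dotted directory names and __pycache__.
        dirnames[:] = sorted(d for d in dirnames if "." not in d and d != "__pycache__")
        if require_init:
            dirnames[:] = [
                d for d in dirnames
                if os.path.isfile(os.path.join(dirpath, d, "__init__.py"))
            ]
        for d in dirnames:
            rel = os.path.relpath(os.path.join(dirpath, d), root)
            found.append(rel.replace(os.sep, "."))
    return sorted(found)


def make_tree(root):
    """Create a small native-namespace layout: no __init__.py in org/ or org/client/."""
    for rel in ("org/client/client1/__init__.py", "org/client/client1/mod1/__init__.py"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
```

On a tree built by make_tree(), discover(root, require_init=True) finds nothing, because org/ itself has no __init__.py, whereas discover(root, require_init=False) lists org, org.client, org.client.client1 and org.client.client1.mod1.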
EDIT (08/09/2019):
To complete the answer, let me also restructure with an example.
The following solution assumes Python 3.3+, which supports implicit namespace packages.
Since you are looking for a solution for Python version 3.5 or later, let's take the code samples provided and elaborate further.
Let's assume the following:
Namespace/Python package name : org
Distribution packages: org_client, org_common
Python: 3.3+
setuptools: 40.1.0+
To be able to do the following:
from org.client.client1 import mod1
from org.common import config
And keeping your top level directories the same, viz. org_client_client1_mod1 and org_common_config, you can change your structure to the following
Repository 1:
org_client_client1_mod1/
    setup.py
    org/
        client/
            client1/
                __init__.py
                submod1/
                    __init__.py
                mod1/
                    __init__.py
                    somefile.py
                file1.py
Updated setup.py:
from setuptools import find_namespace_packages, setup

setup(
    name="org_client",
    ...
    packages=find_namespace_packages(),  # Follows similar lookup as find_packages()
    ...
)
Repository 2:
org_common_config/
    setup.py
    org/
        common/
            __init__.py
            config/
                __init__.py
                someotherfile.py
Updated setup.py:
from setuptools import find_namespace_packages, setup

setup(
    name="org_common",
    ...
    packages=find_namespace_packages(),  # Follows similar lookup as find_packages()
    ...
)
To install (using pip):
(venv) $ pip3 install org_common_config/
(venv) $ pip3 install org_client_client1_mod1/
Updated pip list will show the following:
(venv) $ pip3 list
...
org_client
org_common
...
But those names won't be importable; for importing, you will have to use the org.client and org.common notation.
To understand why, you can browse here (assuming inside venv):
(venv) $ cd venv/lib/python3.5/site-packages/
(venv) $ ls -l | grep org
You'll see that there are no org_client or org_common package directories; both distributions are installed into a shared org namespace package.
(venv) $ cd org/
(venv) $ ls -l
client/
common/
...
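The same behaviour can be checked without installing anything. The sketch below recreates the two repository layouts in a temporary directory, puts both roots on sys.path (standing in for what pip does when it copies both into site-packages), and runs the two imports from the question. The WHO constants and the helper names are invented for the demo.

```python
import importlib
import os
import sys
import tempfile


def touch(root, rel, text=""):
    """Create root/rel (and its parent directories) with the given text."""
    path = os.path.join(root, *rel.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        fh.write(text)


def demo_native_namespace():
    """Import from two source trees that share the native namespace org."""
    with tempfile.TemporaryDirectory() as base:
        client_root = os.path.join(base, "org_client_client1_mod1")
        common_root = os.path.join(base, "org_common_config")
        # No __init__.py in org/ or org/client/: those are namespace directories.
        touch(client_root, "org/client/client1/__init__.py")
        touch(client_root, "org/client/client1/mod1/__init__.py", "WHO = 'mod1'\n")
        touch(common_root, "org/common/__init__.py")
        touch(common_root, "org/common/config/__init__.py", "WHO = 'config'\n")

        sys.path[:0] = [client_root, common_root]
        importlib.invalidate_caches()
        try:
            from org.client.client1 import mod1
            from org.common import config
            import org

            # A namespace package has several path entries and no file of its own.
            portions = len(list(org.__path__))
            has_file = getattr(org, "__file__", None) is not None
            return mod1.WHO, config.WHO, portions, has_file
        finally:
            # Undo the sys.path and sys.modules changes made for the demo.
            sys.path.remove(client_root)
            sys.path.remove(common_root)
            for name in [m for m in sys.modules if m == "org" or m.startswith("org.")]:
                del sys.modules[name]
```

Both imports succeed, org.__path__ holds one entry per source tree, and org has no file of its own, which is what marks it as a namespace package rather than a regular one.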
This is a tough subject. All the -'s, _'s, and __init__.py's everywhere don't exactly make it easy on us.
First, I'll answer your questions:
With the __init__.py, which of these should I be using (if any)?
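__init__.py can be completely empty; it just needs to be in the correct place. Namely (pun), there should be one in every subpackage directory containing Python code (the directory holding setup.py excluded). With native namespace packages, the namespace directory itself (org/ here) must not have one. Follow those rules and you should be fine.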
With setup.py, do I still need to add the namespace_modules parameter, and if so, would I use namespace_modules=['org.common'], or namespace_modules=['org', 'common']?
Nope! Only name= and packages=. However, note the format of the packages= arg compared against the directory structure.
The format of the packages= arg mirrors the directory structure: each dotted name (for example 'org.client') corresponds to a nested directory (org/client/).
Could I forgo all of the above by just implementing this differently somehow? Perhaps something simpler or more "pythonic"?
If you want to be able to install multiple features individually, but under the same top-level namespace, you're on the right track.
I'll spend the rest of this answer re-implementing your namespace package in native format:
I'll put all helpful documentation I've been able to find at the bottom of the post.
K so I'm going to assume you want native namespace packages. First let's look at the current structure of your 2 repos:
org_client_client1_mod1/
    setup.py
    mod1/
        __init__.py
        somefile.py
&
org_common_config/
    setup.py
    config/
        __init__.py
        someotherfile.py
This^ would be too easy!!!
To get what you want:
My brain isn't elastic enough to know if we can go 3-levels deep with namespace packages, but to do what you want, here's what I'm pretty sure you'd want to do:
org-client/
    setup.py
    org/
        client/
            client1/
                __init__.py
                mod1/
                    __init__.py
                    somefile.py
&
org-common-but-also-note-this-name-doesnt-matter/
    setup.py
    org/
        common/
            __init__.py
            config/
                __init__.py
                someotherfile.py
Basically then the key is going to be specifying the correct name= & packages= args to setuptools.setup() inside of each setup.py.
These are going to be:
name='org_client',
...
packages=['org.client']
&
name='org_common',
...
packages=['org.common']
respectively.
Then just install each one with pip install . inside each top-level dir.
Installing the first one will give you access to the somefile.py module, and installing the second will give you access to someotherfile.py. It also won't get confused about you trying to install 2 packages named org in the same environment.
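One caveat worth checking before you install: as far as I understand setuptools, an explicit packages= list is taken literally, so only the modules directly inside each listed directory get picked up and nested subpackages have to be listed as well (or discovered with find_namespace_packages()). The sketch below is a stdlib-only sanity check; the function names unlisted_packages() and make_client_tree() are invented for it.

```python
import os
import tempfile


def unlisted_packages(root, packages):
    """Return dotted names of directories under root that hold .py files
    but are missing from an explicit packages= list."""
    listed = set(packages)
    missing = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = sorted(d for d in dirnames if d != "__pycache__")
        if dirpath == root:
            continue  # setup.py lives here; it is not part of any package
        if any(f.endswith(".py") for f in filenames):
            dotted = os.path.relpath(dirpath, root).replace(os.sep, ".")
            if dotted not in listed:
                missing.append(dotted)
    return missing


def make_client_tree(root):
    """Recreate the org-client layout shown above with empty files."""
    for rel in (
        "setup.py",
        "org/client/client1/__init__.py",
        "org/client/client1/mod1/__init__.py",
        "org/client/client1/mod1/somefile.py",
    ):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
```

For the org-client tree with packages=['org.client'], this reports org.client.client1 and org.client.client1.mod1 as unlisted, so those names would need to be added to packages= for somefile.py to be installed.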
K so the most helpful section of the docs: https://packaging.python.org/guides/packaging-namespace-packages/#packaging-namespace-packages
And then here's how I actually came to understand this: https://github.com/pypa/sample-namespace-packages/tree/master/native