I'm trying to build a library with multiple sub-modules. The general structure looks like so:
package_test/
|___ setup.py
|___ README
|___ package_name/
|___|___ __init__.py
|___|___ submodule/
|___|___|___ __init__.py
|___|___|___ superawesome.py
|___|___ submodule2/
|___|___|___ __init__.py
|___|___|___ prettyokay.py
The goal is to allow a user to do something like:
from package_name.submodule import superawesome
from package_name.submodule2 import prettyokay
The __init__.py for each submodule currently contains the following (with the appropriate name for submodule2):
from .superawesome import superawesome
__all__ = ['superawesome']
The __init__.py inside package_name contains:
__version__ = 'v0.0.alpha'
__all__ = ['submodule','submodule2']
When I attempt to use this, IPython shows this error:
from package_name.submodule import superawesome
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-1-03be4241edd5> in <module>()
----> 1 from package_name.submodule import superawesome
ModuleNotFoundError: No module named 'package_name.submodule'
My setup.py currently contains:
from distutils.core import setup
setup(
    name='package_name',
    packages=['package_name'],  # this must be the same as the name above
    version='0.1.1',
    long_description=open('README').read(),
)
I need help figuring out how to set up this layered import structure so the submodules are properly importable as part of the whole package. The names here are obviously contrived, but in the real project each submodule contains a set of modules that belong together, and all of the submodules together make up the library, so I really want to keep this structure.
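For what it's worth, one common cause of this exact error is that setup.py only lists the top-level package, so package_name.submodule never gets installed. Here is a hedged sketch of a setup.py that picks up the subpackages as well (it assumes setuptools is available; it is not the original poster's file):
from setuptools import setup, find_packages

setup(
    name='package_name',
    # find_packages() discovers package_name, package_name.submodule and
    # package_name.submodule2, since each directory has an __init__.py
    packages=find_packages(),
    version='0.1.1',
    long_description=open('README').read(),
)
Listing them explicitly, packages=['package_name', 'package_name.submodule', 'package_name.submodule2'], works just as well.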
This is the structure of my project at the moment.
classes/
├─ __init__.py
├─ card.py
├─ find.py
├─ player.py
├─ table.py
.gitignore
main.py
My __init__.py:
from .card import *
from .player import *
from .table import *
The problem is when I want to import, for example, Player into table.py, like this:
from classes import Player
from classes import Card, CardType, CardValue, HandValue
I get the following error message:
Traceback (most recent call last):
File "/home/adrian/Desktop/Programozas/Poker/classes/table.py", line 1, in <module>
from classes import Player
ModuleNotFoundError: No module named 'classes'
Do you have any idea?
Notice that this error is occurring in ".../classes/table.py", line 1, as you are trying to import classes.Player when both table.py and player.py are contained in classes. To do this in table.py, you would use from .player import Player just like you did in __init__.py (instead of from classes import Player).
Anything at the same level as the classes directory or above it can import from classes this way, but a module inside classes cannot reach back up to classes when it is run directly, since in that case nothing in there even knows the classes package exists.
Edit: if you need to be able to run both main.py and table.py, you could do something like:
# in table.py
try:
    from player import Player   # this should work when running table.py directly
except ImportError:
    from .player import Player  # this should work when table.py is imported from main.py (relative import)
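For completeness, a hedged sketch of the other side: main.py sits next to the classes/ folder and imports through the package, which is what makes the relative-import branch above work (Table and seat_player below are hypothetical names, purely for illustration):
# in main.py, at the project root next to classes/
from classes import Player, Table   # goes through classes/__init__.py

if __name__ == "__main__":
    table = Table()                  # hypothetical constructor
    table.seat_player(Player())      # hypothetical method, just to exercise the imports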
I'm trying to make a library out of a Python project I don't own.
The project has the following directory layout:
.
├── MANIFEST.in
├── pyproject.toml
└── src
├── all.py
├── the.py
└── sources.py
In pyproject.toml I have:
[tool.setuptools]
packages = ["mypkg"]
[tool.setuptools.package-dir]
mypkg = "src"
The problem I'm facing is that when I build and install this package, I can't use it, because the author imports things without the mypkg prefix in the various source files.
For example, in all.py:
from the import SomeThing
Since I don't own the package I can't go modify all the sources but I still want to be able to build a library from it by just adding MANIFEST.in and pyproject.toml.
Is it possible to somehow instruct setuptools to build a package that won't litter site-packages with all the sources while still allowing them to be imported without the mypkg prefix?
It isn't possible without shipping a custom import hook with the package. The hook takes the form of a module included with the package, and it must be imported before use from your module (e.g. in src/all.py).
src/mypkgimp.py
import sys
import importlib
import importlib.abc
import importlib.util


class MyPkgLoader(importlib.abc.Loader):
    def find_spec(self, name, path=None, target=None):
        # update the list with modules that should be treated special
        if name in ['sources', 'the']:
            return importlib.util.spec_from_loader(name, self)
        return None

    def create_module(self, spec):
        # Uncomment if "normal" imports should have precedence
        # try:
        #     sys.meta_path = [x for x in sys.meta_path[:] if x is not self]
        #     return importlib.import_module(spec.name)
        # except ImportError:
        #     pass
        # finally:
        #     sys.meta_path = [self] + sys.meta_path
        # Otherwise, this will unconditionally shadow normal imports
        module = importlib.import_module('.' + spec.name, 'mypkg')
        # Final step: inject the module under the "shortened" name
        sys.modules[spec.name] = module
        return module

    def exec_module(self, module):
        pass


if not hasattr(sys, 'frozen'):
    sys.meta_path = [MyPkgLoader()] + sys.meta_path
Yes, the above uses different methods from those described in the thread linked previously, as importlib deprecated those methods in Python 3.10; refer to the documentation for details.
Anyway, for the demo, put some dummy classes in the modules:
src/the.py
class SomeThing: ...
src/sources.py
class Source: ...
Now, modify src/all.py to have the following:
import mypkg.mypkgimp
from the import SomeThing
Example usage:
>>> from sources import Source
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'sources'
>>> from mypkg import all
>>> all.SomeThing
<class 'mypkg.the.SomeThing'>
>>> from sources import Source
>>> Source
<class 'mypkg.sources.Source'>
>>> from sources import Error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'Error' from 'mypkg.sources' (/tmp/mypkg/src/sources.py)
Note how the import initially didn't work, but after mypkg.all got imported, the sources import works globally from then on. Hence some care may be needed to avoid shadowing "real" imports, which is why the example includes the commented-out branch that lets the "default"[*] import mechanism take precedence.
If you want the module names themselves to look different (i.e. to drop the mypkg. prefix), that would be a separate question, as code typically doesn't check its own module name for functionality (and never mind that the output above actually shows how the namespace is implicitly used). Changing the actual name is more akin to a module relocation; yes, this can be done, but it is a bit more complicated, and this answer is long enough as it is.
[*] "default" as in not including behaviors introduced by this custom import hook - other import hooks may do their own other weird shenanigans.
I want to import constants from a constants module from two different modules, but I get the following error:
Traceback (most recent call last):
File "C:\Temp\tmp\pycircular\pycircular\pycircular.py", line 2, in <module>
from my_classes.foo import Foo
File "C:\Temp\tmp\pycircular\pycircular\my_classes\foo.py", line 1, in <module>
from pycircular.constants import ANOTHER_CONSTANT
File "C:\Temp\tmp\pycircular\pycircular\pycircular.py", line 2, in <module>
from my_classes.foo import Foo
ImportError: cannot import name 'Foo' from partially initialized module 'my_classes.foo' (most likely due to a circular import) (C:\Temp\tmp\pycircular\pycircular\my_classes\foo.py)
My project structure is the following:
|-constants.py
|-my_classes
| |-foo.py
| |-__init__.py
|-pycircular.py
|-__init__.py
# =============
# pycircular.py
# =============
from constants import SOME_CONSTANT
from my_classes.foo import Foo
def main():
    print(SOME_CONSTANT)
    my_foo = Foo()
    my_foo.do_something()


if __name__ == "__main__":
    main()
# =============
# foo.py
# =============
from pycircular.constants import ANOTHER_CONSTANT
class Foo:
    def do_something(self):
        print(ANOTHER_CONSTANT)
# =============
# constants.py
# =============
ANOTHER_CONSTANT = "ANOTHER"
SOME_CONSTANT = "CONSTANT"
I assume that it is the same problem as solved here https://stackoverflow.com/a/62303448/2021763.
But I really do not get why the from my_classes.foo import Foo line in pycircular.py is executed a second time.
Update:
After renaming the package pycircular to pycircular_pack it worked in PyCharm.
But it only works because in PyCharm the option Add content roots to PYTHONPATH is enabled by default.
The output of sys.path is ['C:\\Temp\\tmp\\pycircular\\pycircular_pack', 'C:\\Temp\\tmp\\pycircular', 'C:\\Tools\\miniconda\\envs\\my_env\\python39.zip', 'C:\\Tools\\miniconda\\envs\\my_env\\DLLs', 'C:\\Tools\\miniconda\\envs\\my_env\\lib', 'C:\\Tools\\miniconda\\envs\\my_env', 'C:\\Tools\\miniconda\\envs\\my_env\\lib\\site-packages']
Without the option the output is ['C:\\Temp\\tmp\\pycircular\\pycircular_pack', 'C:\\Tools\\miniconda\\envs\\my_env\\python39.zip', 'C:\\Tools\\miniconda\\envs\\my_env\\DLLs', 'C:\\Tools\\miniconda\\envs\\my_env\\lib', 'C:\\Tools\\miniconda\\envs\\my_env', 'C:\\Tools\\miniconda\\envs\\my_env\\lib\\site-packages']
And without the option I only get it to work with absolute imports.
# pycircular.py
from constants import SOME_CONSTANT
from my_classes.foo import Foo
...
# foo.py
from constants import ANOTHER_CONSTANT
To elaborate based on the comments and edit:
After renaming the package pycircular to pycircular_pack it worked in PyCharm. But it only works because in PyCharm the option Add content roots to PYTHONPATH is enabled by default.
You should make sure the package directory is not set as a content root or source root. The directory hosting the package directory should be set as source root.
C:\Temp\tmp\pycircular # <- source root
|- pycircular_pack # <- not set as anything
| |- constants.py
| |- my_classes
| | |- foo.py
| | |- __init__.py
| |- pycircular.py
| |- __init__.py
|- other_file.py # <- for illustration's sake
Now your sys.path will be set to include C:\Temp\tmp\pycircular only and there will be exactly one way to import things from your module.
Namely,
- other_file.py (outside the package) will be able to use the package as pycircular_pack
- pycircular_pack/*.py can refer to modules in the pycircular_pack package by either
  - e.g. from .constants import ... (relative import from the current package), or
  - e.g. from pycircular_pack.constants import ... (absolute import)
- pycircular_pack/my_classes/*.py can refer to modules in the pycircular_pack package by either
  - e.g. from ..constants import ... (relative import from the parent package), or
  - e.g. from pycircular_pack.constants import ... (absolute import)
If your pycircular_pack package contains a runnable script, e.g. a CLI at pycircular_pack/cli.py, then the correct way to run that script from the command line is python -m pycircular_pack.cli; this has Python set up the path just as we want here, whereas python pycircular_pack/cli.py would not do the right thing.
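To tie this back to the original traceback, here is a hedged sketch of foo.py under the layout above, using the relative form from the list (the absolute form works equally well):
# pycircular_pack/my_classes/foo.py
from ..constants import ANOTHER_CONSTANT   # relative import from the parent package


class Foo:
    def do_something(self):
        print(ANOTHER_CONSTANT)
pycircular_pack/pycircular.py can then use from .constants import SOME_CONSTANT and from .my_classes.foo import Foo, and running python -m pycircular_pack.pycircular from the source root resolves everything without the name collision that caused the circular import.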
I have the following Python package with 2 modules:
-pack1
|-__init__
|-mod1.py
|-mod2.py
-import_test.py
with the code:
# in mod1.py
a = 1
and
# in mod2.py
from mod1 import a
b = 2
and the __init__ code:
# in __init__.py
__all__ = ['mod1', 'mod2']
Next, I am trying to import the package:
# in import_test.py
from pack1 import *
But I get an error:
ModuleNotFoundError: No module named 'mod1'
If I remove the from mod1 import a dependency in mod2.py, the import works correctly. But with that dependency in place, the import fails with the ModuleNotFoundError shown above.
???
The issue here is that, from mod2's perspective, from mod1 import a is an absolute import: Python searches sys.path for a top-level mod1 rather than looking inside pack1 (here I am assuming that pack1 is not on your PYTHONPATH and that you are running import_test.py from the directory that contains pack1).
This means that if pack1 lives at /dir/to/pack1 and mod2.py does:
from mod1 import a
Python will look for mod1 next to pack1 (i.e., in /dir/to and the rest of sys.path), not inside /dir/to/pack1, and the import fails.
To solve your issue it is enough to do either:
from pack1.mod1 import a
or use an explicit relative import:
from .mod1 import a
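As a hedged sketch of the relative-import option applied to the files above:
# in mod2.py
from .mod1 import a   # resolved inside pack1, regardless of where the caller runs from

b = 2
With that change, from pack1 import * in import_test.py imports both submodules as declared in __all__.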
As a side note, unless this is a must for you, I do not recommend designing your package to be used via from pack1 import *, even though __all__ exists to give you better control over your public API.
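If the goal is a convenient top-level API rather than a star import, one common alternative (a sketch, not part of the original answer) is to re-export the names from __init__.py:
# in pack1/__init__.py
from .mod1 import a
from .mod2 import b   # assumes mod2 uses the relative import shown above

__all__ = ['a', 'b']
Callers can then write from pack1 import a, b explicitly.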
Consider the following package structure:
.
├── module
│ ├── __init__.py
│ └── submodule
│ ├── attribute.py
│ ├── data.txt
│ └── __init__.py
└── test.py
and the following piece of code:
import pkgutil
data = pkgutil.get_data('module.submodule', 'data.txt')
import module.submodule.attribute
retval = module.submodule.attribute.hello()
Running this will raise the error:
Traceback (most recent call last):
File "test.py", line 7, in <module>
retval = module.submodule.attribute.hello()
AttributeError: module 'module' has no attribute 'submodule'
However, if you run the following:
import pkgutil
import module.submodule.attribute
data = pkgutil.get_data('module.submodule', 'data.txt')
retval = module.submodule.attribute.hello()
or
import pkgutil
import module.submodule.attribute
retval = module.submodule.attribute.hello()
it works fine.
Why does running pkgutil.get_data disrupt the future import?
First of all, this was a great question and a great opportunity to learn something new about Python's import system. So let's dig in!
If we look at the implementation of pkgutil.get_data we see something like this:
def get_data(package, resource):
    spec = importlib.util.find_spec(package)
    if spec is None:
        return None
    loader = spec.loader
    if loader is None or not hasattr(loader, 'get_data'):
        return None
    # XXX needs test
    mod = (sys.modules.get(package) or
           importlib._bootstrap._load(spec))
    if mod is None or not hasattr(mod, '__file__'):
        return None

    # Modify the resource name to be compatible with the loader.get_data
    # signature - an os.path format "filename" starting with the dirname of
    # the package's __file__
    parts = resource.split('/')
    parts.insert(0, os.path.dirname(mod.__file__))
    resource_name = os.path.join(*parts)
    return loader.get_data(resource_name)
And the answer to your question is in this part of the code:
mod = (sys.modules.get(package) or
       importlib._bootstrap._load(spec))
It looks at the already loaded modules in sys.modules, and if the package we're looking for (module.submodule in this example) is there, it uses it; if not, it tries to load the package using importlib._bootstrap._load.
So let's look at the implementation of importlib._bootstrap._load to see what's going on.
def _load(spec):
    """Return a new module object, loaded by the spec's loader.

    The module is not added to its parent.

    If a module is already in sys.modules, that existing module gets
    clobbered.

    """
    with _ModuleLockManager(spec.name):
        return _load_unlocked(spec)
Well, there it is! The docstring says "The module is not added to its parent."
It means module.submodule is loaded, but it is not added as an attribute of the module package. So when we try to access submodule via module, there is no connection, hence the AttributeError.
It makes sense for get_data to use this function, as it just wants some other file from the package; there is no need to import the whole package and attach it to its parent, its parent's parent, and so on.
To see this for yourself, I suggest using a debugger and setting some breakpoints; then you can watch what happens step by step along the way.
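If a debugger feels like overkill, here is a minimal sketch (assuming the same module/submodule layout as in the question, with empty __init__.py files) that shows the missing parent attribute directly:
import sys
import pkgutil

data = pkgutil.get_data('module.submodule', 'data.txt')

print('module.submodule' in sys.modules)            # True: get_data loaded the subpackage...
print(hasattr(sys.modules['module'], 'submodule'))  # False: ...but it was never attached to its parent

import module.submodule.attribute                   # 'module.submodule' is already in sys.modules,
                                                    # so the parent attribute still does not get set
print(hasattr(sys.modules['module'], 'submodule'))  # still False, hence the AttributeError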