Importing From Sister Subdirectories in Python? - python

So, I've seen a few similar questions on Stack Overflow, but nothing seems to address my issue, or the general case. So, hopefully this question fixes that, and stops my headaches. I have a git repo of the form:
repo/
    __init__.py
    sub1/
        __init__.py
        sub1a/
            __init__.py
            mod1.py
    sub2/
        __init__.py
        mod2.py
How do I import mod2.py from mod1.py and vice versa, and how does this change depending on whether mod1.py or mod2.py are scripts (when each respectively is importing-- not being imported)?

The simplest solution is to put the directory containing repo in your PYTHONPATH, and then just use absolute-path imports, e.g. import repo.sub2.mod2 and so on.
Any other solution is going to involve some hackery if you want it to cover cases where you're invoking both the python files directly as scripts from arbitrary directories - most likely sys.path mangling to effectively accomplish the same thing as setting PYTHONPATH, but without having to have the user set it.
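The sys.path mangling mentioned above might look roughly like the following sketch. It assumes the layout from the question, with the calling file at repo/sub1/sub1a/mod1.py, so the directory containing repo sits three levels above the script's own directory. The helper names ancestor_dir and ensure_on_sys_path are made up for illustration.

```python
import sys
from pathlib import Path


def ancestor_dir(script_path, levels_up):
    """Return the directory `levels_up` levels above the script's own directory."""
    # parents[0] is the script's directory, parents[1] its parent, and so on.
    return Path(script_path).resolve().parents[levels_up]


def ensure_on_sys_path(directory):
    """Put `directory` at the front of sys.path unless it is already listed."""
    directory = str(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# Hypothetical use at the top of repo/sub1/sub1a/mod1.py:
#     ensure_on_sys_path(ancestor_dir(__file__, 3))
#     import repo.sub2.mod2
```

This has the same effect as setting PYTHONPATH, but every script that may be launched directly has to carry the two lines, which is why the answer calls it hackery.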

If you are using Python 2.6+, you have two choices:
Relative imports
Adding repo to your PYTHONPATH
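With relative imports, a special dot syntax is used. Each leading dot climbs one package level, starting from the package that contains the importing module:
in package sub1:
    from ..sub2.mod2 import thing
in package sub1a:
    from ...sub2.mod2 import otherthing
Note that relative imports only work in the from ... import form; a plain import statement (import module) is always absolute.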
A better solution would be using absolute imports with your Python path set correctly (example in bash):
export PYTHONPATH=/where/your/project/is:$PYTHONPATH
More info:
How to do relative imports in Python?
Permanently add a directory to PYTHONPATH
Import a module from a relative path
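To see how relative imports behave with the layout from the question, here is a small self-contained sketch. It writes the tree into a temporary directory and runs mod1.py twice: once as a module inside the package (python -m) and once as a plain script. The file contents (greet and its message) are invented for the demonstration; only the directory layout comes from the question.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents; only the directory layout comes from the question.
FILES = {
    "repo/__init__.py": "",
    "repo/sub1/__init__.py": "",
    "repo/sub1/sub1a/__init__.py": "",
    "repo/sub1/sub1a/mod1.py": (
        "from ...sub2.mod2 import greet  # three dots: sub1a -> sub1 -> repo\n"
        "print(greet())\n"
    ),
    "repo/sub2/__init__.py": "",
    "repo/sub2/mod2.py": "def greet():\n    return 'hello from mod2'\n",
}


def build_tree(root):
    """Write the example package tree under `root`."""
    for relative_name, source in FILES.items():
        path = Path(root) / relative_name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(source)


def run(root, *args):
    """Run the interpreter with the directory containing repo on PYTHONPATH."""
    env = dict(os.environ, PYTHONPATH=str(root))
    return subprocess.run(
        [sys.executable, *args], cwd=root, env=env,
        capture_output=True, text=True,
    )


root = tempfile.mkdtemp()
build_tree(root)
as_module = run(root, "-m", "repo.sub1.sub1a.mod1")  # runs inside the package
as_script = run(root, "repo/sub1/sub1a/mod1.py")     # runs as a bare script

print("as module:", as_module.returncode, as_module.stdout.strip())
print("as script:", as_script.returncode)
```

The module run succeeds, while the script run exits with an ImportError about a relative import, even though PYTHONPATH is set in both cases. A file that uses relative imports has to be executed as part of its package.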

A script or module can import modules that are either
on the system path, or
part of the same package as the importing script/module.
For modules these rules apply without exception. For scripts, the rules apply, but the wrinkle is that by default when you run a script, it is not considered to be part of a package.
This means that by default a script can only import modules that are on the system path. By default the path includes the current directory, so if you run a script, it can import modules in the same directory, or packages that are subdirectories. But that's it. A script has no notion of "where it is" in the directory tree, so it can't do any imports that require specific relative path information about enclosing directories. That means you cannot import things "from the parent directory" or "from a sibling directory". Things that are in those directories can only be imported if they are on the system path.
If you want to make a script "know" that it is in a package, you can give it a __package__ attribute. See this previous question. You can then use explicit relative imports (e.g., from ...sub2 import mod2) normally from within that script.
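A rough sketch of the __package__ approach follows. It assumes the script lives at repo/sub1/sub1a/mod1.py, so the package name is repo.sub1.sub1a and the directory containing repo is three levels above the script's directory. The NAME constant in mod2.py is invented for the demonstration, and the details of how the import system treats __package__ may differ between Python versions, so treat this as an illustration, not a recommended pattern.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical script body for repo/sub1/sub1a/mod1.py.
MOD1_SOURCE = '''\
import sys
from pathlib import Path

if __name__ == "__main__" and not __package__:
    # Make the directory containing repo importable, then declare the package.
    sys.path.insert(0, str(Path(__file__).resolve().parents[3]))
    import repo.sub1.sub1a  # load the parent package; older interpreters require it
    __package__ = "repo.sub1.sub1a"

from ...sub2 import mod2

print(mod2.NAME)
'''

FILES = {
    "repo/__init__.py": "",
    "repo/sub1/__init__.py": "",
    "repo/sub1/sub1a/__init__.py": "",
    "repo/sub1/sub1a/mod1.py": MOD1_SOURCE,
    "repo/sub2/__init__.py": "",
    "repo/sub2/mod2.py": "NAME = 'mod2'\n",  # invented content
}

root = Path(tempfile.mkdtemp())
for relative_name, source in FILES.items():
    path = root / relative_name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(source)

# Run the file directly as a script, from an unrelated working directory.
result = subprocess.run(
    [sys.executable, str(root / "repo" / "sub1" / "sub1a" / "mod1.py")],
    cwd=tempfile.gettempdir(), capture_output=True, text=True,
)
print(result.returncode, result.stdout.strip())
```

Note that the script still has to put the directory containing repo on sys.path itself; setting __package__ only tells the import system where the relative dots start from.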

Related

How to work with absolute imports when developing a package

I have been struggling with imports in packages for a while. When I develop a package, I read everywhere that it is preferable to use absolute imports in submodules of that package. I understand that and I prefer it as well. But I also read that you shouldn't use sys.path.append('/path/to/package') to use your package in development, and I don't like that approach either...
So my question is, how do you develop such a package from scratch, using absolute imports directly? At the moment I develop the package using relative imports, since that lets me test the code I am writing before packaging and installing; then I change the imports once I have a release and build the package.
What is the correct way of doing this? In PyCharm, for example, you would mark the folder as 'source root' and be able to work as if the package folder was in the path. Still, I read that this is not the proper way... what am I missing? How do you develop a package while testing its code?
Your mileage may vary but this is what I usually do:
Within a package (foo), absolute (import foo.bar) or relative (from . import bar) doesn't matter to me as long as it works. Sometimes I prefer relative imports, especially when the project is large and one day I might decide to move a number of source files into a subdirectory.
How do I test? My $PYTHONPATH usually has . in it, and my directory hierarchy is like this:
/path/to/foo_project
    /setup.py
    /foo
        /__init__.py
        /bar.py
    /test
        /test1.py
        /test2.py
Then the script in foo_project/test/test1.py uses the package the way you normally would, with import foo.bar. When I test my code, I change into the directory foo_project and run python test/test1.py. Since I have . in my $PYTHONPATH, Python finds the directory foo and uses it as a package.
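The workflow described above can be reproduced with a short sketch. It creates a throwaway foo_project in a temporary directory and runs test/test1.py from the project root with . in PYTHONPATH. The contents of bar.py and test1.py (the answer function and the printed message) are invented for the demonstration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents; the layout mirrors foo_project above.
FILES = {
    "foo/__init__.py": "",
    "foo/bar.py": "def answer():\n    return 42\n",
    "test/test1.py": (
        "import foo.bar  # absolute import, as a user of the package would write\n"
        "assert foo.bar.answer() == 42\n"
        "print('test1 passed')\n"
    ),
}

project = Path(tempfile.mkdtemp())
for relative_name, source in FILES.items():
    path = project / relative_name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(source)

# Equivalent to: cd foo_project && PYTHONPATH=. python test/test1.py
env = dict(os.environ, PYTHONPATH=".")
result = subprocess.run(
    [sys.executable, "test/test1.py"],
    cwd=project, env=env, capture_output=True, text=True,
)
print(result.returncode, result.stdout.strip())
```

The test script lives outside the package and imports it by its absolute name, so the same imports keep working once the package is installed.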

Python package cannot be imported

I have a Python package called util. I would like to import a module in this package, _util.py, from another script in the scripts folder.
Even though the util package has an empty __init__.py file, it does not appear as a Python package but as a normal directory, without the small dot on the folder icon.
How can I import this module?
A preliminary answer to your question is that modules or methods beginning with underscores are meant to be used internally.
_my_method() should only be referenced from within the module holding it
_my_module should only be referenced from within the package holding it
That being said, this convention is meant to be a hint to other developers, not a strict prohibition. Perhaps the first step you can take to solve the import issue is to rename _util.py to util.py and proceed from there.

Dealing with module name collision

Occasionally, module name collisions happen between the application and an internal file in a third-party package. For example, a file named profile.py in the current folder will cause jupyter notebook to crash as it attempts to import it instead of its own profile.py. What's a good way to avoid this problem, from the perspective of the package user? (Or is this something that the package developer should somehow prevent?)
Note: while a similar problem occurs due to a collision between application and built-in names (e.g., time.py or socket.py), at least it's relatively easy to remember the names of standard library modules and other built-in objects.
The current directory is the directory which contains the main script of the application. If you want to avoid name collisions in this directory, don't put any modules in it.
Instead, use a namespace. Create a uniquely-named package in the directory of the main script, and import everything from that. The main script should be very simple, and contain nothing more than this:
if __name__ == '__main__':
    from mypackage import myapp
    myapp.run()
All the modules inside the package should also use from imports to access the other modules within the package. For example, myapp.py might contain:
from mypackage import profile

python - absolute import for module in the same directory

I have this package:
mypackage/
    __init__.py
    a.py
    b.py
And I want to import things from module a to module b, does it make sense to write in module b
from mypackage.a import *
or should I just use
from a import *
Both options will work, I'm just wondering which is better (the 2nd makes sense because it's in the same level but I'm considering the 1st to avoid collisions, for example if the system is running from a folder that contains a file named a.py).
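You can safely use number 2 because there shouldn't be any collisions - you'll always be importing a module from the same package as the current one. Note that this relies on Python 2's implicit relative imports; in Python 3, from a import * is treated as an absolute import and fails unless the directory containing a.py is itself on sys.path. Please also note that if your module has the same name as one of the standard library modules, it will be imported instead of the standard one. From the documentation: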
When a module named spam is imported, the interpreter first searches
for a built-in module with that name. If not found, it then searches
for a file named spam.py in a list of directories given by the
variable sys.path. sys.path is initialized from these locations:
the directory containing the input script (or the current directory).
PYTHONPATH (a list of directory names, with the same syntax as the
shell variable PATH).
the installation-dependent default.
After initialization, Python programs can modify sys.path. The
directory containing the script being run is placed at the beginning
of the search path, ahead of the standard library path. This means
that scripts in that directory will be loaded instead of modules of
the same name in the library directory. This is an error unless the
replacement is intended. See section Standard Modules for more
information.
The option from mypackage.a import * can be used for consistency reasons all over the project. In some modules you will have to do absolute imports anyway. Thus you won't have to think whether the module is in the same package or not and simply use a uniform style in the entire project. Additionally this approach is more reliable and predictable.
Python style guidelines don't recommend using relative imports:
Relative imports for intra-package imports are highly discouraged.
Always use the absolute package path for all imports. Even now that
PEP 328 is fully implemented in Python 2.5, its style of explicit
relative imports is actively discouraged; absolute imports are more
portable and usually more readable.
Since Python 2.5, a new syntax for intra-package relative imports has been available. You can use . to refer to the current package and .. to refer to the package one level above.
from . import echo
from .. import formats
from ..filters import equalizer
You should use from mypackage.a import things, you, want.
There are two issues here, the main one is relative vs absolute imports, the semantics of which changed in Python 3, and can optionally be used in Python 2.6 and 2.7 using a __future__ import. By using mypackage.a you guarantee that you will get the code you actually want, and it will work reliably on future versions of Python.
The second thing is that you should avoid import *, as it can potentially mask other code. What if the a.py file gained a function called sum? It would silently override the builtin one. This is especially bad when importing your own code in other modules, as you may well have reused variable or function names.
Therefore, you should only ever import the specific functions you need. Using pyflakes on your source code will then warn you when you have potential conflicts.
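The masking problem with import * is easy to reproduce. The sketch below builds a throwaway in-memory module (demo_a and its sum function are invented) and compares a star import with a plain import of the module.

```python
import sys
import types

# A throwaway module that happens to define its own sum(); names are invented.
demo_a = types.ModuleType("demo_a")
exec(
    "def sum(values):\n"
    "    return 'not the builtin'\n",
    demo_a.__dict__,
)
sys.modules["demo_a"] = demo_a

# With a star import, the module's sum silently replaces the builtin.
star_namespace = {}
exec("from demo_a import *\nresult = sum([1, 2, 3])", star_namespace)

# Importing the module itself leaves the builtin untouched.
explicit_namespace = {}
exec("import demo_a\nresult = sum([1, 2, 3])", explicit_namespace)

print(star_namespace["result"])
print(explicit_namespace["result"])
```

Nothing warns about the replacement in the first case, which is why importing specific names (or the module itself) is the safer habit.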

Deploying a python application with shared package

I'm thinking how to arrange a deployed python application which will have a
Executable script located in /usr/bin/ which will provide a CLI to functionality implemented in
A library installed to wherever the current site-packages directory is.
Now, currently, I have the following directory structure in my sources:
foo.py
foo/
    __init__.py
    ...
which I guess is not the best way to do things. During development, everything works as expected, however when deployed, the "from foo import FooObject" code in foo.py seemingly attempts to import foo.py itself, which is not the behaviour I'm looking for.
So the question is what is the standard practice of orchestrating situations like this? One of the things I could think of is, when installing, rename foo.py to just foo, which stops it from importing itself, but that seems rather awkward...
Another part of the problem, I suppose, is that it's a naming challenge. Perhaps call the executable script foo-bin.py?
This article is pretty good, and shows you a good way to do it. The second item from the Do list answers your question.
shameless copy paste:
Filesystem structure of a Python project
by Jp Calderone
Do:
name the directory something related to your project. For example, if your project is named "Twisted", name the top-level directory for its source files Twisted. When you do releases, you should include a version number suffix: Twisted-2.5.
create a directory Twisted/bin and put your executables there, if you have any. Don't give them a .py extension, even if they are Python source files. Don't put any code in them except an import of and call to a main function defined somewhere else in your projects.
If your project is expressable as a single Python source file, then put it into the directory and name it something related to your project. For example, Twisted/twisted.py. If you need multiple source files, create a package instead (Twisted/twisted/, with an empty Twisted/twisted/__init__.py) and place your source files in it. For example, Twisted/twisted/internet.py.
put your unit tests in a sub-package of your package (note - this means that the single Python source file option above was a trick - you always need at least one other file for your unit tests). For example, Twisted/twisted/test/. Of course, make it a package with Twisted/twisted/test/__init__.py. Place tests in files like Twisted/twisted/test/test_internet.py.
add Twisted/README and Twisted/setup.py to explain and install your software, respectively, if you're feeling nice.
Don't:
put your source in a directory called src or lib. This makes it hard to run without installing.
put your tests outside of your Python package. This makes it hard to run the tests against an installed version.
create a package that only has a __init__.py and then put all your code into __init__.py. Just make a module instead of a package, it's simpler.
try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path (either via PYTHONPATH or some other mechanism). You will not correctly handle all cases and users will get angry at you when your software doesn't work in their environment.
Distutils supports installing modules, packages, and scripts. If you create a distutils setup.py which refers to foo as a package and foo.py as a script, then foo.py should get installed to /usr/local/bin or whatever the appropriate script install path is on the target OS, and the foo package should get installed to the site-packages directory.
You should call the executable just foo, not foo.py, then attempts to import foo will not use it.
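A minimal sketch of the "thin executable" idea follows, assuming the library is a package named foolib with a cli module. Both names, the main function, and the single name argument are invented for illustration. The logic lives in the package, and the installed script does nothing except import main and call it.

```python
import argparse


# Hypothetical contents of foolib/cli.py: all real logic lives in the package.
def main(argv=None):
    """Parse command-line arguments and run; return a process exit code."""
    parser = argparse.ArgumentParser(prog="foo", description="Example CLI")
    parser.add_argument("name", nargs="?", default="world")
    args = parser.parse_args(argv)
    print(f"hello, {args.name}")
    return 0


# Hypothetical contents of the installed executable, a file named just "foo":
#
#     #!/usr/bin/env python
#     import sys
#     from foolib.cli import main
#     sys.exit(main())
```

Because the executable has no .py extension and the package has a different name, import foolib can never pick up the script by accident. Passing argv explicitly, as in main(["Ada"]), also makes the entry point easy to call from tests.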
As for naming it properly: this is difficult to answer in the abstract; we would need to know what specifically it does. For example, if it configures and controls, calling it -config or ctl might be appropriate. If it is a shell API for the library, it should have the same name as the library.
Your CLI module is one thing, the package that supports it is another thing. Don't confuse the names of the module foo (in a file foo.py) and the package foo (in a directory foo with a file __init__.py).
You have two things named foo: a module and a package. What else do you want to name foo? A class? A function? A variable?
Pick a distinctive name for the foo module or the foo package. foolib, for example, is a popular package name.
