Proper way to import across a Python package

Let's say I have a couple of Python packages.
/package_name
    __init__.py
    /dohickey
        __init__.py
        stuff.py
        other_stuff.py
        shiny_stuff.py
    /thingamabob
        __init__.py
        cog_master.py
        round_cogs.py
        teethless_cogs.py
    /utilities
        __init__.py
        important.py
        super_critical_top_secret_cog_blueprints.py
What's the best way to utilize the utilities package? Say shiny_stuff.py needs to import important.py, what's the best way to go about that?
Currently I'm thinking
from ..utilities import important
But is that the best way? Would it make more sense to add utilities to the path and import it that way?
import os
import sys
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import utilities.super_critical_top_secret_cog_blueprints
That seems clunky to add to each of my files.

I think the safest way is always to use absolute imports, so in your case:
from package_name.utilities import important
This way you won't have to change your code if you decide to move shiny_stuff.py to some other package (assuming that package_name will still be on your sys.path).
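A minimal sketch of what that looks like from inside the package (the file path in the comment is just the layout from the question):

# package_name/dohickey/shiny_stuff.py
from package_name.utilities import important

# The line above keeps working even if this file later moves to
# another subpackage, since it never refers to its own location.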

According to Nick Coghlan (who is a Python core developer):
"Never add a package directory, or any directory inside a package, directly to the Python path." (under the heading "The double import trap")
Adding the package directory to the path gives two separate ways for the same module to be referred to, which means you can potentially end up with two copies of a single module, and that is something you don't want. Nick Coghlan's blog post is an excellent read on the Python import system. Your relative import from ..utilities import important is fine, and the absolute import import package_name.utilities.important is also fine.
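For the curious, here is a hedged sketch of the double import trap itself, assuming a throwaway script sitting next to package_name that (wrongly) puts the package directory on the path:

import os
import sys

# Wrong: add the package directory itself to the path.
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'package_name'))

import utilities.important               # loaded as 'utilities.important'
import package_name.utilities.important  # loaded again under its real name

# The same file now backs two distinct module objects:
print(sys.modules['utilities.important']
      is sys.modules['package_name.utilities.important'])  # False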

A "best" out-of-context choice probably doesn't exist, but you can use some criteria to choose what is better for your use cases, and for such a judgment you should know the different possible approaches and their characteristics. Probably the best source of information is PEP 328 itself, which contains the rationale for the distinct possibilities.
A common approach is to use the "absolute import", in your case it would be something like:
from package_name.utilities import important
This way you can even run the file as a script (provided package_name is importable from sys.path). The module is somewhat independent from its siblings, fixed mainly by the package's location. If you need to move one single module from its location, its own absolute imports stay unchanged, but every module that imports it has to change. Of course you can also import the __init__.py files, as:
from package_name import utilities
And these imports have the same characteristics. Be careful, though: utilities.important looks for a name important inside utilities/__init__.py, not for the file important.py, so having a from . import important line in that __init__.py helps avoid mistakes caused by the distinction between the file structure and the namespace structure.
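For instance, a one-line sketch of that __init__.py (assuming the layout from the question):

# package_name/utilities/__init__.py
from . import important  # exposes utilities.important as an attribute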
Another way to do that is the relative approach, by using:
from ..utilities import important
A single dot (from .stuff import ___, or from . import ___ for the __init__.py itself) means "the current [sub]package"; each additional dot climbs to a parent package. Relative imports generally aren't allowed in a file run directly as a script/executable, but you can read about explicit relative imports (PEP 366) if you care about scripts that use them.
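As a hedged illustration of that PEP 366 machinery, boilerplate along these lines at the top of shiny_stuff.py (the path arithmetic assumes the exact layout from the question) lets the file run directly as a script while still using a relative import:

if __name__ == "__main__" and __package__ in (None, ""):
    import os
    import sys
    # Make the directory *containing* package_name importable.
    sys.path.insert(0, os.path.dirname(os.path.dirname(
        os.path.dirname(os.path.abspath(__file__)))))
    import package_name.dohickey  # initialise the parent package
    __package__ = "package_name.dohickey"

from ..utilities import important  # now valid even when run as a script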
A justification for relative import can be found on the PEP 328 itself:
With the shift to absolute imports, the question arose whether relative imports should be allowed at all. Several use cases were presented, the most important of which is being able to rearrange the structure of large packages without having to edit sub-packages. In addition, a module inside a package can't easily import itself without relative imports.
In either case, the modules are tied to their subpackages, in the sense that package_name is imported first no matter what the user tried to import, unless you use sys.path to search for subpackages as top-level packages (i.e., put the package root itself on sys.path)... but that sounds weird; why would one do that?
The __init__.py can also re-export names from its modules; for that, one should care about its namespace contents. For example, say important.py has an object called top_secret, which is a dictionary. To find it from anywhere you would need:
from package_name.utilities.important import top_secret
Perhaps you want to be less specific:
from package_name.utilities import top_secret
That would be done with an __init__.py with the following line inside it:
from .important import top_secret
That's perhaps mixing relative and absolute import styles, but for an __init__.py you probably know that the subpackage makes sense as a subpackage, i.e., as an abstraction by itself. If it's just a bunch of files located in the same place, each needing an explicit module name, the __init__.py would probably be empty (or almost empty). But to spare the user from spelling explicit module names, the same idea can be applied in the root __init__.py, with:
from .utilities import top_secret
Completely indirect, but the namespace stays flat this way while the files remain nested for internal organization. For example, the wx package (wxPython) does that: everything can be imported directly with from wx import ___.
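Putting the pieces together for the question's layout (a sketch combining the two __init__.py lines already shown):

# package_name/utilities/__init__.py
from .important import top_secret

# package_name/__init__.py
from .utilities import top_secret

# client code, anywhere:
from package_name import top_secret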
You can also use some metaprogramming to find the contents if you want to follow this approach, for example using __all__ to list the names a module has, or looking at the file locations to know which modules/subpackages are available to import. However, some simpler code-completion utilities might get lost when you do that.
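For example, an __init__.py can discover its own submodules with the standard library's pkgutil; a minimal sketch of the idea:

# package_name/utilities/__init__.py
import pkgutil

# __path__ exists inside a package; iter_modules yields a
# (finder, name, ispkg) tuple for each module found there.
__all__ = [name for _, name, _ in pkgutil.iter_modules(__path__)]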
For some contexts you might have other kinds of constraints. For example, macropy does some "magic" with imports and doesn't work on the file you call as a script, so you'll need at least two modules just to use that package.
Anyhow, you should always ask whether nesting into subpackages is really needed for your code or API organization. PEP 20 tells us that "Flat is better than nested", which isn't a law but a point of view suggesting that you keep a flat package structure unless nesting is needed for some reason. Likewise, you don't need a separate module for each class, nor anything like that.

Use absolute imports in case you need to move the module to a different location.

Related

Import from parent directory for a test sub-directory without using packaging, Python 2.7

TL;DR
For a fixed and unchangeable non-package directory structure like this:
some_dir/
    mod.py
    test/
        test_mod.py
        example_data.txt
what is a non-package way to enable test_mod.py to import from mod.py?
I am restricted to using Python 2.7 in this case.
I want to write a few tests for the functions in mod.py. I want to create a new directory test that sits alongside mod.py and inside it there is one test file test/test_mod.py which should import the functions from mod.py and test them.
Because of the well-known limitations of relative imports, which rely on package-based naming, you can't do this in the straightforward way. Yet all advice on the topic suggests building the script as a package and then using relative imports, which is impossible for my use case.
In my case, it is not allowable for mod.py to be built as a package and I cannot require users of mod.py to install it. They are instead free to merely check the file out from version control and just begin using it however they wish, and I am not able to change that circumstance.
Given this, what is a way to provide a simple, straightforward test directory?
Note: not just a test file that sits alongside mod.py, but an actual test directory since there will be other assets like test data that come with it, and the sub-directory organization is critical.
I apologize if this is a duplicate, but out of the dozen or so permutations of this question I've seen in my research before posting, I haven't seen a single one that addresses how to do this. They all say to use packaging, which is not a permissible option for my case.
Based on @mgilson's comment, I added a file import_helper.py to the test directory.
some_dir/
    mod.py
    test/
        test_mod.py
        import_helper.py
        example_data.txt
Here is the content of import_helper.py:
import sys as _sys
import os.path as _ospath
import inspect as _inspect
from contextlib import contextmanager as _contextmanager

@_contextmanager
def enable_parent_import():
    path_appended = False
    try:
        current_file = _inspect.getfile(_inspect.currentframe())
        current_directory = _ospath.dirname(_ospath.abspath(current_file))
        parent_directory = _ospath.dirname(current_directory)
        _sys.path.insert(0, parent_directory)
        path_appended = True
        yield
    finally:
        if path_appended:
            _sys.path.pop(0)
and then in the import section of test_mod.py, prior to an attempt to import mod.py, I have added:
import unittest
from import_helper import enable_parent_import

with enable_parent_import():
    from mod import some_mod_function_to_test
It is unfortunate to need to manually mangle sys.path, but writing it as a context manager helps a little: it restores sys.path to its original state as soon as the parent-directory import is done.
In order for this solution to scale across multiple instances of this problem (say tomorrow I am asked to write a widget.py module for some unrelated tasks and it also cannot be distributed as a package), I have to replicate my helper function and ensure a copy of it is distributed with any tests, or I have to write that small utility as a package, ensure it gets globally installed across my user base, and then maintain it going forward.
When you manage a lot of Python code internal to a company, often one dysfunctional code distribution mode that occurs is that "installing" some Python code becomes equivalent to checking out the new version from version control.
Since the code is often extremely localized and specific to a small set of tasks for a small subset of a larger team, maintaining the overhead for sharing code via packaging (even if it is a better idea in general) will simply never happen.
As a result, I feel the use case I describe above is extremely common for real-world Python, and it would be nice if import tools added this functionality for modifying sys.path, with sensible default choices (like adding the parent directory) made very easy.
That way you could rely on this at least being part of the standard library, and not needing to roll your own code and ensure it's either shipped with your tests or installed across your user base.

Python - from . import

I'm taking my first stab at a library, and I've noticed the easiest way to solve the issue of intra-library imports is by using constructions like the following:
from . import x
from ..some_module import y
Something about this strikes me as 'bad.' Maybe it's just the fact that I can't remember seeing it very often, although in fairness I haven't poked around the guts of a ton of libraries.
Just wanted to see if this is considered good practice and, if not, what's the better way to do this?
There is a PEP for everything.
Quote from PEP8: Imports
Explicit relative imports are an acceptable alternative to absolute imports, especially when dealing with complex package layouts where using absolute imports would be unnecessarily verbose:
Guido's decision is in PEP 328, Imports: Multi-Line and Absolute/Relative.
Copy-pasta from PEP 328:
Here's a sample package layout:
package/
    __init__.py
    subpackage1/
        __init__.py
        moduleX.py
        moduleY.py
    subpackage2/
        __init__.py
        moduleZ.py
    moduleA.py
Assuming that the current file is either moduleX.py or subpackage1/__init__.py, the following are all correct usages of the new syntax:
from .moduleY import spam
from .moduleY import spam as ham
from . import moduleY
from ..subpackage1 import moduleY
from ..subpackage2.moduleZ import eggs
from ..moduleA import foo
from ...package import bar
from ...sys import path
Explicit is better than implicit. At least according to the Zen of Python.
I find dot-based imports confusing, especially if you build or work in lots of libraries. If I don't know the package structure by heart, it's going to be less obvious where something comes from this way.
If someone wants to do something similar to (but not the same as) what I'm doing inside one of my library's modules, and the full package structure is specified in the import, they can simply copy and paste the import line.
Refactoring and restructuring are more difficult with dots because they will mean something different if you move a module around in a package structure or if you move a module to a different package.
If you want convenient access to something in your package, it's likely other people do too, so you might as well solve that problem by building a good library rather than leaning on the language to keep your import lines under 80 characters. In these cases, if you have a package mypackage with subpackage stuff containing module things, and class Whatever needs to be imported frequently in your code and your users' code, you can put an import into the __init__.py for mypackage:
__all__ = ['Whatever']
from mypackage.stuff.things import Whatever
and then you and anyone else who wants to use Whatever can just do:
from mypackage import Whatever
But getting less verbose or less explicit than that will more than likely cause you or someone else difficulty down the line.

What is an absolute import in Python?

I am new to Python. I am developing a small project and need to follow coding standards from the start. How do I use import statements properly? Right now I am working with Python 2.7. If I move to 3.x, are there any conflicts with absolute imports? And what is the difference between absolute and relative imports?
The distinction between absolute and relative that's being drawn here is very similar to the way we talk about absolute and relative file paths or even URLs.
An absolute {import, path, URL} tells you exactly how to get the thing you are after, usually by specifying every part:
import os, sys
from datetime import datetime
from my_package.module import some_function
Relative {imports, paths, URLs} are exactly what they say they are: they're relative to their current location. That is, if the directory structure changes or the file moves, these may break (because they no longer mean the same thing).
from .module_in_same_dir import some_function
from ..module_in_parent_dir import other_function
Hence, absolute imports are preferred for code that will be shared.
I was asked in comments to provide an example of how from __future__ import absolute_import ties into this, and how it is meant to be used. In trying to formulate this example, I realized I couldn't quite explain its behavior either, so I asked a new question. This answer gives a code sample showing a correctly working implementation of from __future__ import absolute_import, where it actually resolves an ambiguity.
The accepted answer goes into more detail about why this works the way it does, including a discussion of the confusing wording of the Python 2.5 changelog. Essentially, the scope of this directive (and by extension the distinction between absolute and relative imports in Python) is very, very narrow. If you find yourself needing these distinctions to make your code work, you're probably better off renaming your local module if at all possible.
Imports should usually be on separate lines:
Yes:
import os
import sys

No:
import sys, os
It's okay to say this though:
from subprocess import Popen, PIPE
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
Imports should be grouped in the following order:
Standard library imports.
Related third party imports.
Local application/library specific imports.
You should put a blank line between each group of imports.
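For example (the third-party and local names below are placeholders, not from PEP 8):

# Standard library imports.
import os
import sys

# Related third party imports.
import requests

# Local application/library specific imports.
from mypackage.utilities import important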
As per PEP 8:
Absolute imports are recommended, as they are usually more readable and tend to be better behaved (or at least give better error messages) if the import system is incorrectly configured (such as when a directory inside a package ends up on sys.path):
import mypkg.sibling
from mypkg import sibling
from mypkg.sibling import example
However, explicit relative imports are an acceptable alternative to absolute imports, especially when dealing with complex package layouts where using absolute imports would be unnecessarily verbose:
from . import sibling
from .sibling import example
Standard library code should avoid complex package layouts and always use absolute imports.
Implicit relative imports should never be used and have been removed in Python 3.
When importing a class from a class-containing module, it's usually okay to spell this:
from myclass import MyClass
from foo.bar.yourclass import YourClass
If this spelling causes local name clashes, then spell them explicitly:
import myclass
import foo.bar.yourclass
and use "myclass.MyClass" and "foo.bar.yourclass.YourClass".
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools. There is one defensible use case for a wildcard import, which is to republish an internal interface as part of a public API (for example, overwriting a pure Python implementation of an interface with the definitions from an optional accelerator module and exactly which definitions will be overwritten isn't known in advance).
https://www.python.org/dev/peps/pep-0008/#imports
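That defensible wildcard case looks roughly like this (a sketch with hypothetical module names; the standard library uses the same pattern for optional C accelerator modules such as _heapq):

# mypkg/api.py
from mypkg._pure_python_impl import *  # baseline pure-Python definitions

try:
    # Overwrite the hot functions with the optional accelerator module;
    # exactly which names it overrides isn't known here in advance.
    from mypkg._accelerated_impl import *
except ImportError:
    pass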

I'd like Python to look into a package first for imports

I might just be at odds with Python's way of thinking here, but to me, when it comes to a package (or any project folder system), the contents of a package should always be more important than anything outside of that package, including PYTHONPATH.
Take this hierarchy for example:
somewhere/
    foo/
        __init__.py
    bar/
        __init__.py
        foo/
            __init__.py
If somewhere is in PYTHONPATH, and nothing else here is, and in somewhere/bar/__init__.py I do a simple import foo, I feel bar should import its child, somewhere/bar/foo, not a total stranger, somewhere/foo from the path variable. Path should be where you go if you can't find something right inside your own system.
In my tests, though, it seems that PYTHONPATH trumps direct descendants, which would be a shame, because it's a less powerful, less flexible system, and it doesn't properly honor the DAG nature of hierarchies. Children come first, not siblings, and certainly not ancestors or complete non-relations. However, when I remove PYTHONPATH, suddenly it uses the foo inside bar.
Am I just doing something wrong, or does Python really work this way? Is there something I can do to make it work the way I think it should? If I remove somewhere/bar/foo, then it can look in the path, but if I explicitly put a foo in bar, then it should use that, just as an instance variable will override a class variable.
PEP 328 is about absolute and relative imports.
As I understood it, from . import foo inside bar/__init__.py would import the right thing (the child somewhere/bar/foo). Reading that PEP could help you understand the different ways of importing modules.
It points out that absolute imports are the default, because the plain syntax can be used for everything:
import foo
import bar.foo
import sys
In contrast, relative imports must use the from form (import ..foo is not valid syntax):

from .. import foo
from . import foo
import sys  # absolute
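Applied to the question's layout, the explicit relative form removes the ambiguity entirely (a minimal sketch):

# somewhere/bar/__init__.py
from . import foo  # always somewhere/bar/foo, regardless of PYTHONPATH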

Constants in a python package

I'm writing a python package and am wondering where the best place is to put constants?
I know you can create a file called 'constants.py' in the package and then refer to them with module.constants.const, but shouldn't there be a way to associate the constant with the whole package? E.g. you can call numpy.pi; how would I do something like that?
Also, where in the package is the best place to put paths to directories outside of the package where I want to read/write files?
Put them where you feel they can most easily be maintained. Usually that means in the module to which the constants logically belong.
You can always import the constants into the __init__.py file of your package to make it easier for someone to find them. If you did decide on a constants module, I'd add an __all__ sequence to state which values are public, then in the __init__.py file do:
from .constants import *
to make the same names available at the package level.
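A minimal sketch of that arrangement (the package and constant names are hypothetical):

# mypackage/constants.py
import os.path

__all__ = ['PI', 'DATA_DIR']

PI = 3.14159
# A read/write location outside the package, resolved relative to this file:
DATA_DIR = os.path.normpath(
    os.path.join(os.path.dirname(os.path.abspath(__file__)), '..', 'data'))

# mypackage/__init__.py
from .constants import *

# client code:
import mypackage
print(mypackage.PI)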
