I am new to Python and developing a small project. I want to follow coding standards from the start. What is the proper way to use import statements? I am currently working on Python 2.7. If I move to 3.x, are there any conflicts with absolute imports? And what is the difference between absolute and relative imports?
The distinction between absolute and relative that's being drawn here is very similar to the way we talk about absolute and relative file paths or even URLs.
An absolute {import, path, URL} tells you exactly how to get the thing you are after, usually by specifying every part:
import os, sys
from datetime import datetime
from my_package.module import some_function
Relative {imports, paths, URLs} are exactly what they say they are: they're relative to their current location. That is, if the directory structure changes or the file moves, these may break (because they no longer mean the same thing).
from .module_in_same_dir import some_function
from ..module_in_parent_dir import other_function
Hence, absolute imports are preferred for code that will be shared.
I was asked in comments to provide an example of how from __future__ import absolute_import ties into this, and how it is meant to be used. In trying to formulate this example, I realized I couldn't quite explain its behavior either, so I asked a new question. This answer gives a code sample showing a correctly working implementation of from __future__ import absolute_import, where it actually resolves an ambiguity.
The accepted answer goes into more detail about why this works the way it does, including a discussion of the confusing wording of the Python 2.5 changelog. Essentially, the scope of this directive (and by extension the distinction between absolute and relative imports in Python) is very, very narrow. If you find yourself needing these distinctions to make your code work, you're probably better off renaming your local module if at all possible.
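To sketch the narrow case it actually covers (all names here are hypothetical): in Python 2, a module inside a package can shadow a standard-library module of the same name, and the directive restores the absolute meaning of the bare import:
# Hypothetical Python 2 layout:
# mypkg/
#     __init__.py
#     string.py      <- local module shadowing the stdlib "string"
#     main.py

# mypkg/main.py
from __future__ import absolute_import  # bare imports are now absolute

import string                   # the standard library module, not mypkg/string.py
from .string import local_name  # the sibling module must be named explicitly
                                # (local_name is a made-up attribute)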
Imports should usually be on separate lines:
Yes: import os
import sys
No: import sys, os
It's okay to say this though:
from subprocess import Popen, PIPE
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
Imports should be grouped in the following order:
Standard library imports.
Related third party imports.
Local application/library specific imports.
You should put a blank line between each group of imports.
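For example, a module header following those rules might look like this (requests stands in for any third-party package, and my_package for your local code):
# Standard library imports.
import os
import sys

# Related third-party imports.
import requests

# Local application/library imports (my_package is a placeholder).
from my_package import my_module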
As per PEP 8:
Absolute imports are recommended, as they are usually more readable and tend to be better behaved (or at least give better error messages) if the import system is incorrectly configured (such as when a directory inside a package ends up on sys.path):
import mypkg.sibling
from mypkg import sibling
from mypkg.sibling import example
However, explicit relative imports are an acceptable alternative to absolute imports, especially when dealing with complex package layouts where using absolute imports would be unnecessarily verbose:
from . import sibling
from .sibling import example
Standard library code should avoid complex package layouts and always use absolute imports.
Implicit relative imports should never be used and have been removed in Python 3.
When importing a class from a class-containing module, it's usually okay to spell this:
from myclass import MyClass
from foo.bar.yourclass import YourClass
If this spelling causes local name clashes, then spell them explicitly:
import myclass
import foo.bar.yourclass
and use "myclass.MyClass" and "foo.bar.yourclass.YourClass".
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools. There is one defensible use case for a wildcard import, which is to republish an internal interface as part of a public API (for example, overwriting a pure Python implementation of an interface with the definitions from an optional accelerator module and exactly which definitions will be overwritten isn't known in advance).
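That "republish" case can be sketched like this (mypkg._speedups is a hypothetical optional C accelerator module):
# mypkg/api.py
def process(data):
    # Slow pure-Python reference implementation.
    return sorted(data)

# Replace the slow definitions with the accelerated ones when available;
# exactly which names get overwritten isn't known here in advance.
try:
    from mypkg._speedups import *
except ImportError:
    pass  # accelerator not built; keep the pure-Python definitions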
https://www.python.org/dev/peps/pep-0008/#imports
Related
Suppose I were to create a module called, for example, imp_mod.py, and inside it I imported all the (subjectively chosen) modules that I frequently use.
Would importing this module into my main program allow me access to the imports contained inside imp_mod.py?
If so, what disadvantages would this bring?
I guess a major advantage would be a reduction in time spent importing, even though it's only a couple of seconds saved...
Yes, it would allow you to access them. If you place these imports in imp_mod.py:
from os import listdir
from collections import defaultdict
from copy import deepcopy
Then, you could do this in another file, say, myfile.py:
import imp_mod
imp_mod.listdir
imp_mod.defaultdict
imp_mod.deepcopy
You're wrong about the reduction in importing time; what happens is actually the opposite. Python will need to import imp_mod and then import the other modules afterwards, whereas that first import would not be needed if you imported these modules in myfile.py itself. And if you do the same imports in another file, they will already be in the cache, so virtually no time is spent on the next import.
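You can watch the cache at work via sys.modules (json is just an example module; in a fresh interpreter it typically isn't preloaded):
import sys

print('json' in sys.modules)   # typically False in a fresh interpreter
import json                    # loaded and cached on first import
print('json' in sys.modules)   # True: later imports are mere cache lookups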
The real disadvantage here is less readability. Whoever looks at imp_mod.listdir, for example, will wonder what that method is and why it has the same name as the os module's method. When they have to open imp_mod.py just to find out that it is the same method, they probably won't be happy. I wouldn't be.
As lucasnadalutti mentioned, you can access them by importing your module.
In terms of advantages, it can make your main program care less about where the imports come from, since imp_mod handles them all. However, as your program gets more complex and starts to include more namespaces, this approach can get messy. You can handle some of this by using __init__.py within directories to do a similar thing, but as things get more complex, I personally feel it adds complexity rather than removing it. I'd rather just know where a module came from so I can look it up.
TL;DR
For a fixed and unchangeable non-package directory structure like this:
some_dir/
    mod.py
    test/
        test_mod.py
        example_data.txt
what is a non-package way to enable test_mod.py to import from mod.py?
I am restricted to using Python 2.7 in this case.
I want to write a few tests for the functions in mod.py. I want to create a new directory test that sits alongside mod.py and inside it there is one test file test/test_mod.py which should import the functions from mod.py and test them.
Because of the well-known limitations of relative imports, which rely on package-based naming, you can't do this in the straightforward way. Yet all advice on the topic suggests building the script as a package and then using relative imports, which is impossible for my use case.
In my case, it is not allowable for mod.py to be built as a package and I cannot require users of mod.py to install it. They are instead free to merely check the file out from version control and just begin using it however they wish, and I am not able to change that circumstance.
Given this, what is a way to provide a simple, straightforward test directory?
Note: not just a test file that sits alongside mod.py, but an actual test directory since there will be other assets like test data that come with it, and the sub-directory organization is critical.
I apologize if this is a duplicate, but out of the dozen or so permutations of this question I've seen in my research before posting, I haven't seen a single one that addresses how to do this. They all say to use packaging, which is not a permissible option for my case.
Based on @mgilson's comment, I added a file import_helper.py to the test directory.
some_dir/
    mod.py
    test/
        test_mod.py
        import_helper.py
        example_data.txt
Here is the content of import_helper.py:
import sys as _sys
import os.path as _ospath
import inspect as _inspect
from contextlib import contextmanager as _contextmanager

@_contextmanager
def enable_parent_import():
    path_appended = False
    try:
        # Locate this file's directory, then its parent (some_dir), and
        # put the parent at the front of the module search path.
        current_file = _inspect.getfile(_inspect.currentframe())
        current_directory = _ospath.dirname(_ospath.abspath(current_file))
        parent_directory = _ospath.dirname(current_directory)
        _sys.path.insert(0, parent_directory)
        path_appended = True
        yield
    finally:
        # Restore sys.path to its original state on the way out.
        if path_appended:
            _sys.path.pop(0)
and then in the import section of test_mod.py, prior to an attempt to import mod.py, I have added:
import unittest
from import_helper import enable_parent_import

with enable_parent_import():
    from mod import some_mod_function_to_test
It is unfortunate to need to manually mangle sys.path, but writing it as a context manager helps a little, and it restores sys.path to its original state after the parent-directory modification.
In order for this solution to scale across multiple instances of this problem (say tomorrow I am asked to write a widget.py module for some unrelated tasks and it also cannot be distributed as a package), I have to replicate my helper function and ensure a copy of it is distributed with any tests, or I have to write that small utility as a package, ensure it gets globally installed across my user base, and then maintain it going forward.
When you manage a lot of Python code internal to a company, one dysfunctional code-distribution mode that often occurs is that "installing" some Python code becomes equivalent to checking out the new version from version control.
Since the code is often extremely localized and specific to a small set of tasks for a small subset of a larger team, maintaining the overhead for sharing code via packaging (even if it is a better idea in general) will simply never happen.
As a result, I feel the use case I describe above is extremely common in real-world Python, and it would be nice if the standard import tools supported this kind of sys.path modification, with some sensible default choices (like adding the parent directory) made very easy.
That way you could rely on this at least being part of the standard library, and not needing to roll your own code and ensure it's either shipped with your tests or installed across your user base.
Let's say I have a couple of Python packages.
package_name/
    __init__.py
    dohickey/
        __init__.py
        stuff.py
        other_stuff.py
        shiny_stuff.py
    thingamabob/
        __init__.py
        cog_master.py
        round_cogs.py
        teethless_cogs.py
    utilities/
        __init__.py
        important.py
        super_critical_top_secret_cog_blueprints.py
What's the best way to utilize the utilities package? Say shiny_stuff.py needs to import important.py; what's the best way to go about that?
Currently I'm thinking
from .utilities import important
But is that the best way? Would it make more sense to add utilities to the path and import it that way?
import os
import sys
# Intended: add the package root (two levels up from this file) to the path.
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import utilities.super_critical_top_secret_cog_blueprints
That seems clunky to add to each of my files.
I think the safest way is always to use absolute imports, so in your case:
from package_name.utilities import important
This way you won't have to change your code if you decide to move your shiny_stuff.py in some other package (assuming that package_name will still be in your sys.path).
According to Nick Coghlan (who is a Python core developer):
"“Never add a package directory, or any directory inside a package, directly to the Python path.” (Under the heading "The double import trap")
Adding the package directory to the path gives two separate ways for the module to be referred to. The link above is an excellent blog post about the Python import system. Adding it to the path directly means you can potentially have two copies of a single module, which you don't want. Your relative import from .utilities import important is fine, and an absolute import import package_name.utilities.important is also fine.
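A sketch of the trap with the layout above, run from the directory containing package_name (this is exactly what not to do):
import sys
sys.path.append('package_name')  # the package directory itself: don't do this

import utilities.important as a               # loaded as top-level "utilities.important"
import package_name.utilities.important as b  # loaded again under its real name

print(a is b)  # False: two copies of one module, each with its own globals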
A "best" out-of-context choice probably doesn't exist, but you can have some criteria choosing which is better for your use cases, and for such a judgment one should know are the different possible approaches and their characteristics. Probably the best source of information is the PEP 328 itself, which contains some rationale about declaring distinct possibilities for that.
A common approach is to use the "absolute import", in your case it would be something like:
from package_name.utilities import important
This way, you can also run the file as a script. It is somewhat independent from other modules and packages, tied mainly to its location. If you have a package structure and need to move one single module to another location, an absolute path lets this single file stay unchanged, but every module that imports it has to change. Of course you can also import from the __init__.py files, as in:
from package_name import utilities
And these imports have the same characteristics. Be careful, though: utilities.important looks for a name important inside __init__.py, not in important.py itself, so having an import of important inside __init__.py helps avoid mistakes caused by the difference between the file structure and the namespace structure.
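A small sketch of that gotcha, using the layout above:
# With an empty package_name/utilities/__init__.py:
from package_name import utilities
utilities.important    # AttributeError: nothing ever imported the submodule

# After adding "from . import important" to that __init__.py:
from package_name import utilities
utilities.important    # works: __init__.py bound the submodule as an attribute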
Another way to do that is the relative approach, by using:
from ..utilities import important
The first dot (from .stuff import ___ or from . import ___) means "the module in this [sub]package", or __init__.py itself when there's only the dot. A second dot refers to the parent directory. Generally, imports starting with dots aren't allowed in a script/executable, but you can read about explicit relative imports (PEP 366) if you care about scripts with relative imports.
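Inside package_name/dohickey/shiny_stuff.py from the layout above, the dots work like this (some_name is a made-up attribute):
from . import stuff                 # sibling module: package_name.dohickey.stuff
from .other_stuff import some_name  # one name from a sibling module
from ..utilities import important   # one level up: package_name.utilities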
A justification for relative import can be found on the PEP 328 itself:
With the shift to absolute imports, the question arose whether relative imports should be allowed at all. Several use cases were presented, the most important of which is being able to rearrange the structure of large packages without having to edit sub-packages. In addition, a module inside a package can't easily import itself without relative imports.
In either case, the modules are tied to the subpackages in the sense that package_name is imported first no matter which module the user tried to import, unless you use sys.path to search for the subpackages as top-level packages (i.e., put the package root itself inside sys.path)... but that sounds weird; why would one do that?
The __init__.py can re-export module contents automatically; to do that, you should pay attention to its namespace contents. For example, say important.py has an object called top_secret, which is a dictionary. To find it from anywhere you would need:
from package_name.utilities.important import top_secret
Perhaps you want to be less specific:
from package_name.utilities import top_secret
That would be done with an __init__.py with the following line inside it:
from .important import top_secret
That's perhaps mixing relative and absolute imports, but inside an __init__.py you probably know that the subpackage makes sense as a subpackage, i.e., as an abstraction by itself. If it's just a bunch of files located in the same place that need explicit module names, the __init__.py would probably be empty (or almost empty). But to avoid explicit module names for the user, the same idea can be applied in the root __init__.py, with:
from .utilities import top_secret
Completely indirect, but the namespace becomes flat this way while the files stay nested for internal organization. For example, the wx package (wxPython) does that: everything can be found from wx import ___ directly.
You can also use some metaprogramming to find the contents if you want to follow this approach, for example using __all__ to detect all the names a module has, or looking at the file locations to know which modules/subpackages are available to import. However, some simpler code-completion utilities might get lost when you do that.
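For example, a sketch of discovering the modules available in a subpackage from its file location:
import pkgutil
from package_name import utilities

# Iterate over the modules that live in the package's directory.
for importer, name, ispkg in pkgutil.iter_modules(utilities.__path__):
    print(name)  # e.g. "important", "super_critical_top_secret_cog_blueprints"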
For some contexts you might have other kind of constraints. For example, macropy makes some "magic" with imports and doesn't work on the file you call as a script, so you'll need at least 2 modules just to use this package.
Anyhow, you should always ask whether nesting into subpackages is really needed for your code or API organization. PEP 20 tells us that "Flat is better than nested", which isn't a law but a point of view suggesting you should keep a flat package structure unless nesting is needed for some reason. Likewise, you don't need a module for each class or anything like that.
Use absolute imports; they keep meaning the same thing even if the importing file is moved to a different location.
I'm reading App Engine's documentation and I saw that I could nest some imports, for example:
from google.appengine.ext import db, webapp  # and so on
Is it a bad thing to do so? If it's not (since it works), what's the limit/advantages/disadvantages of that?
That's not nesting, that's just importing specific names. There's nothing wrong with that per se, but as always consult PEP 8 for style guidelines.
No, it's not. See the PEP 8 Python style guide. Here's the pertinent excerpt:
Imports
- Imports should usually be on separate lines, e.g.:
Yes: import os
import sys
No: import sys, os
it's okay to say this though:
from subprocess import Popen, PIPE
- Imports are always put at the top of the file, just after any module
comments and docstrings, and before module globals and constants.
Imports should be grouped in the following order:
1. standard library imports
2. related third party imports
3. local application/library specific imports
I would avoid it. The examples given in PEP 8, quoted by sdolan, are imports of specific names from inside a module, but what you're asking about is importing multiple different packages on a single line. It won't break, but separating it out is neater, and makes it easier to refactor and change imports later.
I've noticed that importing a module will import its functions and methods, and the functions and methods of the modules it imports as well. Is there a set rule for how many levels down Python will import when you import an upper-level module?
edit
Sorry, I think I've been misunderstood by the answers so far, which respond about multiple imports of the same dependencies. I'm thinking of nested folders: e.g., in Django, if you import django, you can access django.contrib.auth, but you can't access django.contrib.auth.views unless you import that specifically. I was just wondering if it's always two levels down in such a case.
second edit
To clarify again: in the Django example, the layout is /django/contrib/auth/views.py, where each of the subfolders has an __init__.py making it a package, and none of them define any __all__ attributes. Is my example bad, since maybe you can't use the dot syntax to navigate to a file within a package directory?
No, Python will import what it needs to import. However, each module is only imported once. For example, if one module does import sys and another module also does import sys, it will not physically be done twice.
Not really. A module imports things from other modules because it needs to use them in that module; otherwise it would break.
There is no pre-defined import depth level. Import statements are executed, just like any other Python statement.
But, you may wonder, how are cycles avoided? Modules are added to sys.modules (i.e., cached) when they get imported for the first time, and that is the first location examined when an import statement is executed. So each module is loaded just once, although it may appear in many import statements.
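A minimal sketch of why a cycle doesn't recurse forever (a.py and b.py are hypothetical sibling modules):
# a.py
import b           # starts loading b before a has finished loading
def f():
    return 'a.f'

# b.py
import a           # a is already in sys.modules (partially initialized),
                   # so this is only a cache lookup -- no infinite recursion
def g():
    return 'b.g'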