Best practices for imports for installed package - python

Imagine I have a package "foolibrary" that is installed via setup.py, and I'm the primary developer. Which is the preferred means of calling imports inside the package? Imagine foolibrary has two modules (a.py, b.py) and I need to access them in c.py:
foolibrary/
    a.py
    b.py
    c.py
In c.py, what is the preferred way to import these and why?
from a import blah
vs
from foolibrary.a import blah
vs
from .a import blah
I've seen all three and generally use the foolibrary.a import style, but mostly out of habit.

The relative import syntax, from .a import blah, is the modern way to do things. See PEP 328, https://www.python.org/dev/peps/pep-0328/ , for why it's superior to the alternatives. (Admittedly, PEP 8 prefers absolute imports, but it also allows within-package explicit relative imports as an acceptable alternative.)
Personally, BTW, I only ever import modules, never "stuff" (functions, classes, whatever) from inside a module.
But, this is a style constraint that is far from universal (it is, however, part of https://google-styleguide.googlecode.com/svn/trunk/pyguide.html -- and having been at Google 10 years now and helped shape parts of its Python practices and style, I'm understandably biased in favor of that style:-).
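To make the relative-import style concrete, here is a minimal sketch that rebuilds the question's layout in a temporary directory and imports it. Putting the temporary directory on sys.path is only a stand-in for a real install via setup.py, and the body of blah and the helper call_blah are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the question's layout in a temp directory.
root = Path(tempfile.mkdtemp())
pkg = root / "foolibrary"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "a.py").write_text("def blah():\n    return 'blah from a'\n")
# c.py uses an explicit relative import and imports the module, not the name.
(pkg / "c.py").write_text(
    "from . import a\n"
    "\n"
    "def call_blah():\n"
    "    return a.blah()\n"
)

# Stand-in for installing the package: make its parent directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

c = importlib.import_module("foolibrary.c")
result = c.call_blah()
print(result)
```

Inside c.py the relative form resolves to foolibrary.a, so the module keeps working even if the top-level package is later renamed.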

Related

Python packages import

I want to know why we use from module import module?
For example from BeautifulSoup import BeautifulSoup instead of only import BeautifulSoup.
Are both of those different?
And if yes, then how?
Yes, they are different.
There are several ways to import a module
import module
The module is imported, but all its functions, classes and variables remain within the namespace of 'module'. To reference them, they must be prefixed with the module name:
module.somefunction()
module.someclass()
module.somevariable = 1
import module as md
The module is imported, but bound to a new name, 'md'. To reference the functions, classes and variables, they must be prefixed with the new name:
md.somefunction()
md.someclass()
md.somevariable = 1
This keeps the namespaces separated, but provides a shorthand notation which makes the code more readable.
from module import *
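All public functions, classes and variables from the module (every name without a leading underscore, or the names listed in the module's __all__ if it defines one) are imported into the current namespace as if they were defined in the current module. They can be called with their own name:
somefunction()
someclass()
somevariable = 1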
The disadvantage is that there might be overlapping names!
from module import something
from module import something, somethingelse
This imports only 'something' (and 'somethingelse'), whether a function, a class or a variable, into the current namespace.
The module itself is still loaded in full, so this saves little at import time, but it does reduce the risk of overlapping names.
A module name and a class inside the module may have the same name. Don't let that confuse you:
import BeautifulSoup                      # reference: BeautifulSoup.BeautifulSoup()
from BeautifulSoup import BeautifulSoup   # reference: BeautifulSoup()
import BeautifulSoup as bs                # reference: bs.BeautifulSoup()
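The same module-versus-class naming situation exists in the standard library, so it can be tried without installing anything. The sketch below uses datetime (a module that contains a class also called datetime) as a stand-in for BeautifulSoup; the date values are arbitrary.

```python
import datetime
import datetime as dt

# Plain import: the class is reached through the module name.
a = datetime.datetime(2020, 1, 2)

# Aliased import: same module object, shorter prefix.
b = dt.datetime(2020, 1, 2)

# from-import: the bare name `datetime` is now rebound to the class,
# so it no longer refers to the module in this namespace.
from datetime import datetime

c = datetime(2020, 1, 2)

print(a == b == c)
print(datetime is dt.datetime)
```

All three spellings build the same object; they differ only in which name ends up bound in the current namespace.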
I think it comes down to what the individual library (Python package) owner wants to do and however they think it will be easiest for people to use their product. As part of the Python community, they may take into account the current conventions. Those may be done formally as a PEP like PEP 8, informally through common practice, or as third-party formalized linters.
It can be confusing that there are some distinctions, like package vs module, that have the potential to help with understanding, but do not really mean anything to a Python application itself. For example, with your example, it is not from module import module; it is more like from package import class for BeautifulSoup. But it could be argued that a package is an external module. I have heard some call any file a module and any directory a package.
More background, if you are new to python:
In Python it is all about namespacing. Namespacing allows for very clean and flexible code organization compared to earlier languages. However, it also means that a name can refer to a package, module, class, function, method, variable, or anything else, so it can be hard for someone writing a client application to know what is what. There is even a convenience class in the standard library for building an ad hoc namespace: types.SimpleNamespace. Then there are metaclasses and dynamic programming as well. Anyway, yes, there are common conventions followed to reduce confusion, but they have to be learned separately from the language itself (like learning the customs of a country, not only its language, to be able to understand common phrases).
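As a small illustration of an ad hoc namespace, the standard library's types.SimpleNamespace lets you group names behind a prefix without writing a module or a class. The attribute names below are invented for the example.

```python
from types import SimpleNamespace

# Group a few related values behind one name, much like a tiny module.
config = SimpleNamespace(host="localhost", port=8080)
config.debug = True  # attributes can be added after creation

# vars() exposes the namespace as an ordinary dictionary.
print(vars(config))
```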
What PEP 20 has to say about namespaces:
import this
Back to PEP 8: it is generally accepted that class names use CapWords (class SomeClass:) without underscores, and that functions, variables and methods are all lowercase with words separated by underscores (some_variable). That is not always the case, but those are probably the most widely accepted styles. One place I see things going against what is "pythonic" and/or commonplace is when a bindings library is a thin wrapper around a library from another language (e.g. C++), or when the Python code is tightly coupled with code in another language, so the styles smear together for easier transitions. Nothing says anyone has to follow a specific style.
Some people prefer terseness, so they may shorten and combine words for variable names (e.g. foo bar as fb). Others may like to be explicit (e.g. foo bar as foo_bar). And still others may prefer to be generic and use comments to describe their variables (why? it may be convenient for large complex equations):
# x is for foo bar.
# y is for spam and z is for ham.
assert x + y != z
Some people really like formatters like Black that follow very strict rules, but others like flexibility or have reason to go against convention. Even in the standard libraries that come with Python, there are inconsistencies from legacy code that the maintainers have left alone, or that was accepted before the community settled on the common practices it now goes against.

What are all the ways to import modules in Python?

I've done some research, and I came across the following article: http://effbot.org/zone/import-confusion.htm. While this seems to be a great guide, it was written in 1999, a while back. I am using Python 3.4.3, so I am thinking that some things have changed, which worries me, because I don't want to learn what is not applicable. Therefore, in Python 3, what are all of the ways to import packages and modules, in detail? Which ways are the most common and should be used above others?
The only ways that matter for ordinary usage are the first three ways listed on that page:
import module
from module import this, that, tother
from module import *
These haven't changed in Python 3. (Some of the details about where Python looks for the module.py file to load module have been tweaked, but the behavior of the import itself still works as described on the page you linked.)
One thing has been added, before Python 3 but since that article. That is explicit relative imports. These let you do things like from ..module import blah. This kind of import can only be used from inside a package; it lets modules in a package refer to other modules in the same package in a way that is relative to the package (i.e., without having to specify how to import the top-level package). You can read the details in PEP 328. Even this, though, is basically just a new variation on the from module import blah style syntax mentioned on the page you linked to.
__import__ also still works in Python 3. This is an internal function that you only would need to use if doing something rather unusual. The same applies to various functions in the importlib module (and the deprecated imp module). The exact level of wizardliness of these importing functions varies from one to another, but for ordinary usage of "I just want to import this module and use it", you essentially never need to use them. They're only needed if you want to do something like dynamically import a module whose name isn't known until runtime.
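For the "name isn't known until runtime" case, a minimal sketch using importlib.import_module might look like the following. The helper load_function is hypothetical, and json is used only because it ships with Python.

```python
import importlib


def load_function(module_name, function_name):
    """Import a module by its string name and return one attribute from it."""
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# The module name arrives as data, for example from a config file.
dumps = load_function("json", "dumps")
encoded = dumps({"a": 1})
print(encoded)
```

importlib.import_module is generally the recommended entry point for this; calling __import__ directly is rarely needed.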
The Zen of Python gives you some hints:
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
So, given that the simple, obvious method is import module_name, and that it preserves namespaces, I would say stick with it until you have a good reason not to. There are several other import methods, as you can see from the Python 3 manual entry, and you can extend them by overriding the __import__() function or by rolling your own.
The fact that __import__() is surrounded by double underscores is also a hint to leave it alone.
If you are looking to understand the design decisions behind the import mechanisms, start with the manual and then follow up with the PEPs; 302 and 420 are good starting points.
I think the parenthesized (tuple-style) import is much better for readability and for the maximum line length in PEP 8. From PEP 328:
The import statement has two problems:
Long import statements can be difficult to write, requiring various contortions to fit Pythonic style guidelines.
Imports can be ambiguous in the face of packages; within a package, it's not clear whether import foo refers to a module within the package or some module outside the package.
Go has a similar grouped import form,
so I would prefer to write imports like this:
from package import (x, y)
instead of this
from authentication.views import SignupView, LoginView, VerificationView, SignupDetailView
https://legacy.python.org/dev/peps/pep-0328/
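A runnable version of the parenthesized form, using os.path from the standard library in place of the hypothetical authentication.views module, could look like this. The path components are arbitrary.

```python
# One name per line inside parentheses keeps each line short and makes
# later additions or removals show up as one-line diffs.
from os.path import (
    basename,
    dirname,
    join,
    splitext,
)

path = join("pkg", "views", "signup.py")
stem, ext = splitext(basename(path))
parent = dirname(path)
print(stem, ext)
```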
We can import modules in Python using the following ways
import module
from module import function
from module import *
However, from module import * is not good practice, for two reasons. Readability: other programmers cannot tell which names are actually used in the current module. Namespace pollution: every public name of the module is bound in the current namespace, where it can silently shadow an existing name.
Best practices for using import in a module:
Say you have Python modules (mymod1.py and mymod2.py, containing different functions) inside a package mypkg (a folder containing an __init__.py file, which can be empty).
#mymod1.py
def add_fun(a, b):
    return a + b
def sub_fun(a, b):
    return a - b
def mul_fun(a, b):
    return a * b
def div_fun(a, b):
    return a / b
#mymod2.py
def fun1(...):
    ........
    ........
def fun2(...):
    ........
    ........
Following are different ways to import:
from mypkg.mymod1 import * #import all the functions from mymod1
add_fun(10, 20)
mul_fun(10, 2)
from mypkg.mymod1 import add_fun,div_fun #import only the needed functions from mymod1
add_fun(10, 20)
div_fun(10, 2)
from mypkg import mymod1 #import the mymod1 module
mymod1.add_fun(10, 20)
mymod1.mul_fun(10, 2)
import mypkg.mymod1 #import the package together with the submodule you need (a bare "import mypkg" does not load mymod1 when __init__.py is empty)
mypkg.mymod1.add_fun(10, 20)
mypkg.mymod1.mul_fun(10, 2)
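One detail about the package-level form is worth checking: importing a package does not, by itself, import its submodules. The sketch below recreates mypkg with an empty __init__.py in a temporary directory (an assumption made so the example is self-contained) and shows when mypkg.mymod1 becomes available.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the package from the answer with an empty __init__.py.
root = Path(tempfile.mkdtemp())
pkg = root / "mypkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "mymod1.py").write_text("def add_fun(a, b):\n    return a + b\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import mypkg

# The submodule has not been loaded yet, so the attribute is missing.
before = hasattr(mypkg, "mymod1")

import mypkg.mymod1

# Importing the submodule binds it as an attribute of the package.
after = hasattr(mypkg, "mymod1")
total = mypkg.mymod1.add_fun(10, 20)
print(before, after, total)
```

If you want a bare import mypkg to expose mymod1, the package's __init__.py has to import it, for example with from . import mymod1.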

Python - from . import

I'm taking my first stab at a library, and I've noticed the easiest way to solve the issue of intra-library imports is by using constructions like the following:
from . import x
from ..some_module import y
Something about this strikes me as 'bad.' Maybe it's just the fact that I can't remember seeing it very often, although in fairness I haven't poked around the guts of a ton of libraries.
Just wanted to see if this is considered good practice and, if not, what's the better way to do this?
There is a PEP for everything.
Quote from PEP8: Imports
Explicit relative imports are an acceptable alternative to absolute imports, especially when dealing with complex package layouts where using absolute imports would be unnecessarily verbose:
Guido's decision in PEP328 Imports: Multi-Line and Absolute/Relative
Copy Pasta from PEP328
Here's a sample package layout:
package/
    __init__.py
    subpackage1/
        __init__.py
        moduleX.py
        moduleY.py
    subpackage2/
        __init__.py
        moduleZ.py
    moduleA.py
Assuming that the current file is either moduleX.py or subpackage1/__init__.py , the following are all correct usages of the new syntax:
from .moduleY import spam
from .moduleY import spam as ham
from . import moduleY
from ..subpackage1 import moduleY
from ..subpackage2.moduleZ import eggs
from ..moduleA import foo
from ...package import bar
from ...sys import path
Explicit is better than implicit. At least according to the zen of python.
I find .-based imports confusing, especially if you build or work in lots of libraries. If I don't know the package structure by heart, it's going to be less obvious where something comes from this way.
If someone wants to do something similar to (but not the same as) what I'm doing inside one of my library's modules, if the full package structure is specified in the import, people can copy and paste the import line.
Refactoring and restructuring are more difficult with dots because they will mean something different if you move a module around in a package structure or if you move a module to a different package.
If you want convenient access to something in your package, it's likely other people do too, so you might as well solve that problem by building a good library rather than leaning on the language to keep your import lines under 80 characters. In these cases, if you have a package mypackage with a subpackage stuff containing a module things, and the class Whatever needs to be imported frequently in your code and your users' code, you can put an import into the __init__.py for mypackage:
__all__ = ['Whatever']
from mypackage.stuff.things import Whatever
and then you and anyone else who wants to use Whatever can just do:
from mypackage import Whatever
But getting less verbose or less explicit than that will more than likely cause you or someone else difficulty down the line.
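The re-export pattern described above can be tried end to end with a throwaway package. This sketch builds mypackage/stuff/things.py in a temporary directory; the empty body of Whatever and the use of sys.path are assumptions made only so the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
stuff = root / "mypackage" / "stuff"
stuff.mkdir(parents=True)

# Top-level __init__.py re-exports the frequently used class.
(root / "mypackage" / "__init__.py").write_text(
    "from mypackage.stuff.things import Whatever\n"
    "\n"
    "__all__ = ['Whatever']\n"
)
(stuff / "__init__.py").write_text("")
(stuff / "things.py").write_text("class Whatever:\n    pass\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from mypackage import Whatever
from mypackage.stuff.things import Whatever as Original

# Both import paths lead to the very same class object.
same = Whatever is Original
print(same, Whatever.__module__)
```

The class still reports its defining module, so tracebacks and documentation tools can show where it really lives.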

Proper way to import across Python package

Let's say I have a couple of Python packages.
/package_name
    __init__.py
    /dohickey
        __init__.py
        stuff.py
        other_stuff.py
        shiny_stuff.py
    /thingamabob
        __init__.py
        cog_master.py
        round_cogs.py
        teethless_cogs.py
    /utilities
        __init__.py
        important.py
        super_critical_top_secret_cog_blueprints.py
What's the best way to utilize the utilities package? Say shiny_stuff.py needs to import important.py, what's the best way to go about that?
Currently I'm thinking
from .utilities import important
But is that the best way? Would it make more sense to add utilities to the path and import it that way?
import os
import sys
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import utilities.super_critical_top_secret_cog_blueprints
That seems clunky to add to each of my files.
I think the safest way is always to use an absolute import, so in your case:
from package_name.utilities import important
This way you won't have to change your code if you decide to move your shiny_stuff.py in some other package (assuming that package_name will still be in your sys.path).
According to Nick Coghlan (who is a Python core developer):
"Never add a package directory, or any directory inside a package, directly to the Python path." (Under the heading "The double import trap")
Adding the package directory to the path gives two separate ways for the module to be referred to. The link above is an excellent blog post about the Python import system. Adding it to the path directly means you can potentially have two copies of a single module, which you don't want. Your relative import from .utilities import important is fine, and an absolute import import package_name.utilities.important is also fine.
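The "two copies of a single module" problem can be reproduced in a few lines. The sketch below uses a hypothetical package trap_pkg with one module, important.py, and deliberately puts both the parent directory and the package directory on sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "trap_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "important.py").write_text("registry = []\n")

sys.path.insert(0, str(root))  # correct: the directory containing the package
sys.path.insert(0, str(pkg))   # the mistake: the package directory itself
importlib.invalidate_caches()

import trap_pkg.important as via_package
import important as via_path

# The same file has been executed twice under two different names.
via_package.registry.append("cog")
print(via_package is via_path)
print(via_path.registry)
```

State stored in one copy is invisible to the other, which is the kind of bug that is very hard to track down later.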
A "best" out-of-context choice probably doesn't exist, but you can use some criteria to choose which is better for your use cases, and for such a judgment one should know the different possible approaches and their characteristics. Probably the best source of information is PEP 328 itself, which contains some rationale for offering distinct possibilities.
A common approach is to use the "absolute import", in your case it would be something like:
from package_name.utilities import important
This way, you can also run this file as a script. It is somewhat independent from other modules and packages, fixed mainly by its location. If you have a package structure and need to move one single module, having absolute paths means that file itself can be kept unchanged, but every file that uses the module has to change. Of course you can also import through the __init__.py files:
from package_name import utilities
And these imports have the same characteristics. Be careful: utilities.important looks for a name important in the namespace defined by __init__.py, not for the file important.py, so having "from . import important" in __init__.py helps avoid a mistake due to the distinction between file structure and namespace structure.
Another way to do that is the relative approach, by using:
from ..utilities import important
The first dot (from .stuff import ___ or from . import ___) says "the module in this [sub]package", or __init__.py when there's only the dot. From the second dot we are talking about parent directories. Generally, starting with dots in any import isn't allowed in a script/executable, but you can read about explicit relative imports (PEP 366) if you care about scripts with relative imports.
A justification for relative import can be found on the PEP 328 itself:
With the shift to absolute imports, the question arose whether relative imports should be allowed at all. Several use cases were presented, the most important of which is being able to rearrange the structure of large packages without having to edit sub-packages. In addition, a module inside a package can't easily import itself without relative imports.
In either case, the modules are tied to the subpackages in the sense that package_name is imported first no matter which one the user tried to import first, unless you use sys.path to search for subpackages as packages (i.e., use the package root inside sys.path)...but that sounds weird, why would one do that?
The __init__.py can auto-import module names, for that one should care about its namespace contents. For example, say important.py has an object called top_secret, which is a dictionary. To find it from anywhere you would need
from package_name.utilities.important import top_secret
Perhaps you want to be less specific:
from package_name.utilities import top_secret
That would be done with an __init__.py with the following line inside it:
from .important import top_secret
That's perhaps mixing relative and absolute imports, but for an __init__.py you probably know that the subpackage makes sense as a subpackage, i.e., as an abstraction by itself. If it's just a bunch of files located in the same place that need an explicit module name, the __init__.py would probably be empty (or almost empty). But to spare the user from explicit module names, the same idea can be applied in the root __init__.py, with
from .utilities import top_secret
Completely indirect, but the namespace gets flat this way while the files are nested for some internal organization. For example, the wx package (wxPython) does that: everything can be found with from wx import ___ directly.
You can also use some metaprogramming for finding the contents if you want to follow this approach, for example, using __all__ to detect all the names a module has, or looking at the file location to know which modules/subpackages are available there to import. However, some simpler code completion utilities might get lost when doing that.
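As a rough sketch of that kind of discovery, the standard library already offers both pieces: a package's __all__ lists its public names, and pkgutil.iter_modules lists the submodules found at its location. The json package is used here only because it is always available.

```python
import json
import pkgutil

# Public names the package chooses to export.
public_names = list(json.__all__)

# Submodules found by scanning the package's directory.
submodules = sorted(info.name for info in pkgutil.iter_modules(json.__path__))

print(public_names)
print(submodules)
```

A root __init__.py could loop over such a list and import each entry, at the cost of making the package harder for static tools to follow.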
For some contexts you might have other kind of constraints. For example, macropy makes some "magic" with imports and doesn't work on the file you call as a script, so you'll need at least 2 modules just to use this package.
Anyhow, you should always ask whether nesting into subpackages is really needed for your code or API organization. PEP 20 tells us that "Flat is better than nested", which isn't a law but a point of view that suggests you should keep a flat package structure unless nesting is needed for some reason. Likewise, you don't need a module for each class or anything like that.
Use absolute import in case you need to move to a different location.

In python, what are the pros and cons of importing a class vs. importing the class's module?

I'm authoring a set of python coding guidelines for a team of ~30 developers. As a basis for my document, so far I've studied the Google python style guide and the PEP 8 style guide, and incorporated information from both.
One place where the Google style guide is more restrictive than PEP 8 is with imports. The Google guide requests that developers import packages and modules only, and then refer to items within by a more-qualified name. For example:
from pkg import module
...
my_class = module.MyClass()
The justification is that the "source of each identifier is indicated in a consistent way". For our project, we intend to organize with packages two or three levels deep, so to know the full source of the identifier, the reader will likely need to examine the import statement anyway. I'd like to advocate this style of import as a "preferred style":
from pkg.module import MyClass
...
my_class = MyClass()
IMHO, the readability in python constructs such as list comprehensions is improved when the names are more succinct.
What I'm unclear on is what the python interpreter might do behind the scenes. For example, is MyClass now part of the global namespace for both this module, and all importers of this module? (This would be bad, could lead to some weird bugs; if this were true, I'd advocate the Google style).
My python development experience is limited to about 6 months (and there are not many experts on our project to consult), so I wanted to get more information from the community. Here are some items I've researched already:
effbot - discussion on imports
stack overflow - import vs. from import
python documentation - modules
python documentation - import
Thank you for your responses!
In Python, there is no such thing as a variable that is global across more than one module. If you do from pkg.module import MyClass, then MyClass is in the global namespace of the module where you do that, but not of any other module (including modules that import the module that imports MyClass).
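This is easy to verify with three tiny modules. The sketch below writes them to a temporary directory; the names ns_source, ns_middle and ns_outer are invented for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
(root / "ns_source.py").write_text("class MyClass:\n    pass\n")
(root / "ns_middle.py").write_text("from ns_source import MyClass\n")
(root / "ns_outer.py").write_text("import ns_middle\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import ns_middle
import ns_outer

# MyClass is a global of the module that ran the from-import...
in_middle = "MyClass" in vars(ns_middle)
# ...but not of a module that merely imports that module.
in_outer = "MyClass" in vars(ns_outer)
# It is still reachable through the qualified name.
reachable = ns_outer.ns_middle.MyClass is ns_middle.MyClass
print(in_middle, in_outer, reachable)
```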
As for your more general question, either import mechanism can be acceptable depending on the situation. If the module name is long, you can get some shortening by importing it under a different name:
# Awkward
from package import reallylongmodule
reallylongmodule.MyClass()
# Less awkward
from package import reallylongmodule as rlm
rlm.MyClass()
Importing just the class can be okay if the class name is distinctive enough that you can tell where it comes from and what it is. However, if you have multiple modules that define classes with relatively undescriptive names (e.g., "Processor", "Unit", "Data", "Manager"), then it can be a good idea to access them via the module name to clarify what you're doing.
Style guides are ultimately guides and not laws. My own preference would be to choose a mechanism that maximizes clarity and readability. That involves a tradeoff between avoiding long and cumbersome names, and also avoiding short, vague, or cryptic names. How you make that tradeoff depends on the particular libraries you're using and how you're using them (e.g., how many modules you import, how many things you import from them).
I suggest you use automatic code checkers, like pylint, pep8, pyflakes, instead of writing code guides.
I personally prefer to use from pkg import module, because of possible name collisions.
from package import module
def my_fun():
    module.function()
The interpreter must do 3 hash-table lookups: in the local function namespace, in the current module's global namespace, and in the imported module's namespace.
In
from package.module import function
def my_fun():
    function()
it will do only 2 lookups: the last one is performed at import time.
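The extra lookup can be made visible with the dis module. This is only a sketch for CPython: opcode names differ between versions (the attribute access may appear as LOAD_ATTR or LOAD_METHOD), and math.sqrt stands in for the hypothetical module.function.

```python
import dis
import math
from math import sqrt


def via_module():
    return math.sqrt(2.0)


def via_name():
    return sqrt(2.0)


def attribute_loads(func):
    """Count bytecode instructions that look up an attribute on an object."""
    return sum(
        1
        for ins in dis.get_instructions(func)
        if ins.opname in ("LOAD_ATTR", "LOAD_METHOD")
    )


module_style = attribute_loads(via_module)
name_style = attribute_loads(via_name)
print(module_style, name_style)
```

The difference is one attribute lookup per call, which is usually negligible next to readability concerns unless the call sits in a very hot loop.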
