python package best practice: managing imports - python

Coming from R, I'm trying to wrap my head around the package system in python.
My question (in short) is: what is the best practice for managing external library imports?
Suppose I have a package (call it pointless) with the following directory structure.
pointless/
setup.py
...etc
pointless/
__init__.py
module1.py
module2.py
And suppose both module1 and module2 had the header:
from __future__ import division
import numpy as np
...
My issue is that when I import pointless I get a double-whammy of np and division in both pointless.module1 and pointless.module2. There has to be a better way?
EDIT
Apologies if that wasn't clear. It bugs me that when I run (ipython):
>>> import pointless
>>> pointless.module1.<TAB>
pointless.module1.np
pointless.module.division
...
>>> pointless.module2.<TAB>
pointless.module1.np
pointless.module.division
...
I can see the np namespace in both modules, which seems messy and way overkill.
Is there a way I can "centralize" my external library imports so that I don't see them in every module? Or am I missing something?

This is related to this question: what happens when i import module twice in python. Long story short: If you import a module twice, it is loaded only once, so your example is not problematic at all.

Related

Structuring python packages for straightforward importing

I have a number of functions that I have written myself over time, over time I have placed each of these functions in their own modules (as I thought this was best practice).
I would like take the next step and organise these modules that I can import in fewer lines of code. However, I'm finding that I need to need to use a lot of lines of 'repeated' code to import the functions I want to have access to.
my_functions
-src
--__init__.py
--func1.py
--func2.py
--func3.py
I have followed this talk tutorial to build this collection of modules into a package thinking that I could use something like
import my_fucntions as mf
but I have to import each module
import func1 as fu1
import func2 as fu2
...
a = fu1.func1()
my question is, what do I have to do to be able to import my functions as a package, the same way that I would import something like pandas
import my_functions as mf
a = mf.func1(arg)
I'm finding it difficult to find either a tutorial or a clear simple example of how to structure this so any guidance at all would be useful. If it's just not doable without building something as complex and pandas or numpy thats ok too, I just said I'd try one last shot at it.
Create a __init__.py file in both my_functions directory and src directory.
Now in the my_functions __init__.py file. No need to edit __init__ file inside src.
from src.func1 import func1
from src.func2 import func2
__all__ = [
'func1',
'func2'
]
Now you can use my_functions like below
import my_functions
my_functions.f1()
__all__ gets called when you try to import my_functions and allow you to import things you have mentioned using dot notation.
To call a python script from the root call it using dot notation as below
python3 -m <some_directory>.<file>

Pylint gives warnings for centralised imports

In a python project I would like to globber imports into a single file called common_imports.py in order to reduce number of import statements in python files.
Instead of writing
file1.py
import foo
import bar
import baz
[...]
file2.py
import foo
import bar
import baz
[...]
I would like to write
file1.py
from common_imports import *
[...]
file2.py
from common_imports import *
[...]
common_imports.py
import foo
import bar
import baz
However, this gives me a lot of pylint false positives.
I can disable pylint warnings in the common_imports.py file by adding a pylint disable comment. I can disable wildcard imports. Unfortunately, I can disable unused imports only globally but not specific for all imports from common_imports.py.
Somebody has an idea howto get pylint on the track?
Summarising my comments above into a proper answer:
TL;DR:
While the reusable code motive is commendable, it's not fit for purpose here. Listen to the linter, and save your hard-earned respect among your colleagues. :-)
Pythonic Viewpoint:
Don't
Why? Python convention, in all its organisational glory and documented structure, states that if you use a library in a module, import it in the module. Plain and simple.
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
-- PEP8 - Imports
At a lower level, the sys.modules dict, which tracks imports, will only import a library if it hasn’t been imported already. So from an efficiency point of view, there is no gain.
Maintainer's Viewpoint:
Don't
Why? If (when) the code is changed / optimised in a module, thus alleviating the need for a specific import … "remind me where I look to find where that library is imported? Oh ya, here. But this other module needs that import, but not this new library I’m using to optimise this code. Where should I import that? Ugh!!!"
You've lost the hard-earned respect of following maintainers.

Given only source file, what files does it import?

I'm building a dependency graph in python3 using the ast module. How do I know what file(s) will be imported if that import statement were to be executed?
Not a complete answer, but here are some bits you should be aware of:
Imports might happen in conditional or try-catch blocks. So depending on a setting of an environment variable, module A might or might not import module B.
There's a wide variety of import syntax: import A, from A import B, from A import *, from . import A, from .. import A, from ..A import B as well as their versions with A replaced with sub-modules.
Imports can happen in any executable context - the top-level of the file, in a function, in a class definition etc.
eval can evaluate code with imports. Up to you if you consider such code to be a dependency.
The standard library modulefinder module might help.
As suggested in a comment: the other answers are valid, but one of the fundamental problems is that your examples only work for 'simple' scripts or files: A lot of more complex code will use things like dynamic imports: consider the following:
path, task_name = "module.function".rsplit(".", 1);
module = importlib.import_module(path);
real_func = getattr(module, task_name);
real_func();
The actual original string could be obfuscated, or pulled from a DB, or a file or...
There are alternatives to importlib, but this is on top of the exec type stuff you might see in #horia's good answer.

Circular function import

I'm trying to do the following in python 2.6.
my_module.py:-
from another_module import another_factory
def my_factory(name):
pass
another_module.py:-
from my_module import my_factory
def another_factory(name):
pass
Both modules in the same folder.
It gives me the error:
Error: cannot import name my_factory
As seen from the comments, you are trying to do a circle import which is impossible.
If in your module A you try to import something from the module B, and when loading the module B (to satisfy this dependency) you are trying to import something from the module A, you are where you started and you got a circle import: A needs B and B needs A!!, it is somehow like saying that A needs A, which is quite unlogic.
For instance:
# moduleA
from moduleB import functionB
...
So the interpreter tries to load the moduleB, which looks like the following:
# moduleB
from moduleA import functionA
...
And goes back to the moduleA, which tries again to import B, and, etc. Therefore python just raises the error and stops the insanity for a greater good.
Dependencies don't work like this. Define what module needs the other one, and just do a simple import. In your example, it seems that another_module needs my_module, so change my_module and eliminate the dependency on another_module.
If both modules actually need each other, it is a clear sign that they belong to the same logical concept, and should be merged.
PD: in some cases to avoid huge files, you can split a logical unit in two, and to avoid the circle dependencies, you write your imports inside of the functions (which are not executed at load time), so that there is not a circle. This is however in general something to avoid.
The real question is... do you consider each file as a module or are they part of a package ?
Trying to import modules outside a package is sometimes painful. You should rather build a package by simply creating an empty __init__.py module in the directory. Though, if you have
__init__.py
my_module.py
another_module.py
If you have te following function in my_module.py,
def my_factory(x):
return x * x
You should be able to access the my_factory() function from another_module.py by writing this :
from my_module import my_factory
But, if you don't have the __init__.py file/module, the import function will be (somehow) lost and will only use the sys.path for searching other modules. You may then add the following lines (before the import) in the another_module.py file :
sys.path.append(os.path.dirname(os.path.expanduser('.')))
You may also use the various packages available to help importing modules, like imp or import_file (see the documentation). Or you can decide to use load_source (also see the doc : https://docs.python.org/2/library/imp.html)

Python: Importing everything from a python namespace / package

all.
I'd think that this could be answered easily, but it isn't. As long as I've been searching for an answer, I keep thinking that I'm overlooking something simple.
I have a python workspace with the following package structure:
MyTestProject
/src
/TestProjectNamespace
__init__.py
Module_A.py
Module_B.py
SecondTestProject
/src
/SecondTestProjectNamespace
__init__.py
Module_1.py
Module_2.py
...
Module_10.py
Note that MyTestProjectNamespace has a reference to SecondTestProjectNamespace.
In MyTestProjectNamespace, I need to import everything in SecondTestProjectNamespace. I could import one module at a time with the following statement(s):
from SecondTestProjectNamespace.Module_A import *
from SecondTestProjectNamespace.Module_B import *
...but this isn't practical if the SecondTestProject has 50 modules in it.
Does Python support a way to import everything in a namespace / package? Any help would be appreciated.
Thanks in advance.
Yes, you can roll this using pkgutil.
Here's an example that lists all packages under twisted (except tests), and imports them:
# -*- Mode: Python -*-
# vi:si:et:sw=4:sts=4:ts=4
import pkgutil
import twisted
for importer, modname, ispkg in pkgutil.walk_packages(
path=twisted.__path__,
prefix=twisted.__name__+'.',
onerror=lambda x: None):
# skip tests
if modname.find('test') > -1:
continue
print(modname)
# gloss over import errors
try:
__import__(modname)
except:
print 'Failed importing', modname
pass
# show that we actually imported all these, by showing one subpackage is imported
print twisted.python
I have to agree with the other posters that star imports are a bad idea.
No. It is possible to set up SecondTestProject to automatically import everything in its submodules, by putting code in __init__.py to do the from ... import * you mention. It's also possible to automate this to some extent using the __import__ function and/or the imp module. But there is no quick and easy way to take a package that isn't set up this way and make it work this way.
It's probably not a good idea anyway. If you have 50 modules, importing everything from all of them into your global namespace is going to cause a proliferation of names, and very likely conflicts among those names.
As other had put it - it might not be a good idea. But there are ways of keeping your namespaces and therefore avoiding naming conflicts - and having all the modules/sub-packages in a module available to the package user with a single import.
Let's suppose I have a package named "pack", within it a module named "a.py" defining some "b" variable. All I want to do is :
>>> import pack
>>> pack.a.b
1
One way of doing this is to put in pack/__init__.py a line that says
import a - thus in your case you'd need fifty such lines, and keep them up to date.
Not that bad.
However, the documentation at http://docs.python.org/tutorial/modules.html#importing-from-a-package - says that if you have a string list named __all__ in your __init__.py file, all module/sub-package names in that list are imported when one does from pack import *
That alone would half-work - but would require users of your package to perform the not-recommended "from x import *" form.
But -- you can do the "... import *" inside __init__.py itself, after defining the __all__ variable - so all you have to do is to keep the __all__ up to date:
With the TestProjectNamespace/__init__.py being like this:
__all__ = ["Module_A", "Module_B", ...]
from TestProjectNamespace import *
your users would have
TestProjectNamespace.Module_A (and others) available upon import of TestProjectNamespace.
And, of course - you could automate the creation of __all__ - it is just a variable, after all - but I would not recommend that.
Does Python support a way to import everything in a namespace / package?
No. A package is not a super-module -- it's a collection of modules grouped together.
At least part of the reason is that it's not trivial to determine what 'everything' means inside a folder: there are problems like network drives, soft links, hard links, ...

Categories