Python: Optimizing imports

Does it matter where modules are imported in the code?
Or should they all be declared at the top, since at load time the external modules will have to be loaded regardless of where they are declared in the code?
Example:
from os import popen
try:
    popen('echo hi')
    doSomethingIllegal  # undefined name, raises NameError
except Exception:
    import logging  # Module imported only when needed?
    logging.exception("Record to logger")
or is this optimized by the compiler the same way as:
from os import popen
import logging  # Module is always loaded regardless
try:
    popen('echo hi')
    doSomethingIllegal  # undefined name, raises NameError
except Exception:
    logging.exception("Record to logger")

This indicates it may make a difference:
"import statements can be executed just about anywhere. It's often useful to place them inside functions to restrict their visibility and/or reduce initial startup time. Although Python's interpreter is optimized to not import the same module multiple times, repeatedly executing an import statement can seriously affect performance in some circumstances."
These two SO questions, "local import statements?" and "import always at top of module?", discuss this at length.
Finally, if you are curious about your specific case you could profile/benchmark your two alternatives in your environment.
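A rough way to measure this yourself is the standard library's timeit module. The sketch below is only an illustration (the two function names are made up for this example). It compares a function that re-executes an import statement on every call with one that relies on a top-level import. Absolute numbers depend on your machine and Python version, so treat the output as a relative comparison only.

```python
import logging  # top-level import, executed once when this file loads
import timeit


def with_local_import():
    # After the first call this is a sys.modules lookup plus a local name
    # binding, not a fresh load of the module.
    import logging
    return logging.INFO


def with_top_level_import():
    return logging.INFO


local_time = timeit.timeit(with_local_import, number=100_000)
top_time = timeit.timeit(with_top_level_import, number=100_000)
print(f"local import:     {local_time:.4f}s")
print(f"top-level import: {top_time:.4f}s")
```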
I prefer to put all of my import statements at the top of the source file, following stylistic conventions and for consistency. It also makes later changes easier, since you don't have to hunt through the source file for import statements scattered throughout.

The general rule of thumb is that imports should be at the top of the file, as that makes code easier to follow, and that makes it easier to figure out what a module will need without having to go through all the code.
The Python style guide covers some basic guidelines for how imports should look: http://www.python.org/dev/peps/pep-0008/#imports
In practice, though, there are times when it makes sense to import from within a particular function. This comes up with imports that would be circular:
# Module 1
from module2 import B
class A(object):
    def do_something(self):
        my_b = B()
        ...
# Module 2
from module1 import A
class B(object):
    def do_something(self):
        my_a = A()
        ...
That won't work as is, but you could get around the circularity by moving the import:
# Module 1
from module2 import B
class A(object):
    def do_something(self):
        my_b = B()
        ...
# Module 2
class B(object):
    def do_something(self):
        from module1 import A
        my_a = A()
        ...
Ideally, you would design the classes such that this would never come up, and maybe even include them in the same module. In that toy example, having each import the other really doesn't make sense. However, in practice, there are some cases where it makes more sense to include an import for one method within the method itself, rather than throwing everything into the same module, or extracting the method in question out to some other object.
But, unless you have good reason to deviate, I say go with the top-of-the-module convention.
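To see the deferred import working end to end, here is a small self-contained sketch. It writes the two modules to a temporary directory and imports them. The module names circ_demo_mod1 and circ_demo_mod2 are hypothetical stand-ins for "Module 1" and "Module 2", chosen so they don't clash with real modules, and the methods return the created objects only so the result can be inspected.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module sources mirroring the example above.
MOD1 = '''
from circ_demo_mod2 import B

class A(object):
    def do_something(self):
        return B()
'''

MOD2 = '''
class B(object):
    def do_something(self):
        from circ_demo_mod1 import A  # deferred until the method is called
        return A()
'''

tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "circ_demo_mod1.py").write_text(MOD1)
Path(tmp_dir, "circ_demo_mod2.py").write_text(MOD2)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import circ_demo_mod1

b_obj = circ_demo_mod1.A().do_something()
a_obj = b_obj.do_something()
print(type(b_obj).__name__)  # B
print(type(a_obj).__name__)  # A
```

By the time B.do_something runs, circ_demo_mod1 has finished loading, so the import inside the method finds a fully initialized module.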

Related

Why does using import on two files/modules at once give me errors, but not when I only do it on one?

I know the issue of circular imports in python has come up many times before and I have read these discussions. The comment that is made repeatedly in these discussions is that a circular import is a sign of a bad design and the code should be reorganised to avoid the circular import.
Could someone tell me how to avoid a circular import in this situation?: I have two classes and I want each class to have a constructor (method) which takes an instance of the other class and returns an instance of the class.
More specifically, one class is mutable and one is immutable. The immutable class is needed for hashing, comparing and so on. The mutable class is needed to do things too. This is similar to sets and frozensets or to lists and tuples.
I could put both class definitions in the same module. Are there any other suggestions?
A toy example would be class A which has an attribute which is a list and class B which has an attribute which is a tuple. Then class A has a method which takes an instance of class B and returns an instance of class A (by converting the tuple to a list) and similarly class B has a method which takes an instance of class A and returns an instance of class B (by converting the list to a tuple).
Consider the following example python package where a.py and b.py depend on each other:
/package
    __init__.py
    a.py
    b.py
Types of circular import problems
Circular import dependencies typically fall into two categories depending
on what you're trying to import and where you're using it inside each
module. (And whether you're using python 2 or 3).
1. Errors importing modules with circular imports
In some cases, just importing a module with a circular import dependency
can result in errors even if you're not referencing anything from the
imported module.
There are several standard ways to import a module in python
import package.a # (1) Absolute import
import package.a as a_mod # (2) Absolute import bound to different name
from package import a # (3) Alternate absolute import
import a # (4) Implicit relative import (deprecated, python 2 only)
from . import a # (5) Explicit relative import
Unfortunately, only the 1st and 4th options actually work when you
have circular dependencies (the rest all raise ImportError
or AttributeError). In general, you shouldn't be using the
4th syntax, since it only works in python2 and runs the risk of
clashing with other 3rd party modules. So really, only the first
syntax is guaranteed to work.
EDIT: The ImportError and AttributeError issues only occur in
python 2. In python 3 the import machinery has been rewritten and all
of these import statements (with the exception of 4) will work, even with
circular dependencies. While the solutions in this section may help refactoring python 3 code, they are mainly intended
for people using python 2.
Absolute Import
Just use the first import syntax above. The downside to this method is
that the import names can get super long for large packages.
In a.py
import package.b
In b.py
import package.a
Defer import until later
I've seen this method used in lots of packages, but it still feels
hacky to me, and I dislike that I can't look at the top of a module
and see all its dependencies, I have to go searching through all the
functions as well.
In a.py
def func():
    from package import b
In b.py
def func():
    from package import a
Put all imports in a central module
This also works, but has the same problem as the first method, where
all the package and submodule calls get super long. It also has two
major flaws -- it forces all the submodules to be imported, even if
you're only using one or two, and you still can't look at any of the
submodules and quickly see their dependencies at the top, you have to
go sifting through functions.
In __init__.py
from . import a
from . import b
In a.py
import package
def func():
    package.b.some_object()
In b.py
import package
def func():
    package.a.some_object()
2. Errors using imported objects with circular dependencies
Now, while you may be able to import a module with a circular import
dependency, you won't be able to import any objects defined in the module
or actually be able to reference that imported module anywhere
in the top level of the module where you're importing it. You can,
however, use the imported module inside functions and code blocks that don't
get run on import.
For example, this will work:
package/a.py
import package.b
def func_a():
    return "a"
package/b.py
import package.a
def func_b():
    # Notice how package.a is only referenced *inside* a function
    # and not the top level of the module.
    return package.a.func_a() + "b"
But this won't work
package/a.py
import package.b
class A(object):
    pass
package/b.py
import package.a
# package.a is referenced at the top level of the module
class B(package.a.A):
    pass
You'll get an exception
AttributeError: module 'package' has no attribute 'a'
Generally, in most valid cases of circular dependencies, it's possible
to refactor or reorganize the code to prevent these errors and move
module references inside a code block.
Only import the module, don't import from the module:
Consider a.py:
import b
class A:
    def bar(self):
        return b.B()
and b.py:
import a
class B:
    def bar(self):
        return a.A()
This works perfectly fine.
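A quick way to check this is to generate both files and run them. In this sketch the modules are named circ_a and circ_b (hypothetical names, used instead of a and b to avoid shadowing anything on your path). Otherwise the code is the same as above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module names; the bodies mirror a.py and b.py above.
SOURCES = {
    "circ_a.py": (
        "import circ_b\n"
        "class A:\n"
        "    def bar(self):\n"
        "        return circ_b.B()\n"
    ),
    "circ_b.py": (
        "import circ_a\n"
        "class B:\n"
        "    def bar(self):\n"
        "        return circ_a.A()\n"
    ),
}

tmp_dir = tempfile.mkdtemp()
for filename, source in SOURCES.items():
    Path(tmp_dir, filename).write_text(source)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import circ_a

b_instance = circ_a.A().bar()
a_instance = b_instance.bar()
print(type(b_instance).__name__, type(a_instance).__name__)  # B A
```

The circular reference is harmless here because each module only looks up attributes on the other one at call time, after both have finished loading.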
We do a combination of absolute imports and functions for better reading and shorter access strings.
Advantage: Shorter access strings compared to pure absolute imports
Disadvantage: a bit more overhead due to extra function call
main/sub/a.py
import main.sub.b
b_mod = lambda: main.sub.b
class A():
    def __init__(self):
        print('in class "A":', b_mod().B.__name__)
main/sub/b.py
import main.sub.a
a_mod = lambda: main.sub.a
class B():
    def __init__(self):
        print('in class "B":', a_mod().A.__name__)

How do I do python unittest doc's recommended method of lazy import?

Python's docs say that there's an alternative to local imports to prevent loading the module on startup:
https://docs.python.org/3/library/unittest.mock-examples.html#mocking-imports-with-patch-dict
...to prevent “up front costs” by delaying the import.
This can also be solved in better ways than an unconditional local
import (store the module as a class or module attribute and only do
the import on first use).
I don't understand the explanation in brackets. How do I do this? However I think about it, I seem to end up with local imports anyway.
The documentation likely refers to the use of importlib.import_module, which exposes Python's import functionality:
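import importlib
class Example():
    TO_IMPORT = "os"  # the name of the module to lazily import
    _module = None  # cached on the class after the first import
    def __init__(self):
        # Assign on the class, not on self, so the cache is shared by
        # all instances and the import runs only on first instantiation.
        if Example._module is None:
            Example._module = importlib.import_module(self.TO_IMPORT)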
Note that this way the module is only imported once when the class is first instantiated and is not available in global namespace.
Further, it allows you to change which module is imported, which could be useful e.g. in cases where the same class is used as an interface to different backends:
import importlib
class Example():
    def __init__(self, backend="some_module"):
        self.module = importlib.import_module(backend)

Importing in Python for multiple files

I am struggling to figure out how to handle importing dependencies that are used in multiple files.
Let's say I want to import an external API for example, and two classes depend on this import. Putting the import into the 'index' file, as an attempt to make it global does not work. I can import it in each class file fine, but that seems to be a violation of DRY, as well as setting myself up for failure later on.
So is there a way to import once, in a single file that is globally accessible? What I experimented with was creating an index.py, foo.py (for the foo class) and bar.py (for the bar class):
Index:
from example import API
import foo
import bar
foo()
bar()
foo.py:
class foo:
... (try to put the example API to use)
bar.py: (same as foo.py really, just here to make the case for using the same dependency in two different places)
This failed to work, as the classes appeared to not be able to access exampleAPI. What is the correct way to do this, or am I looking at it wrong? Thanks!
In general, you should import each module you need in each of your own modules that needs to use it. You don't need to worry about duplication, since each module has its own global namespace. Furthermore, modules are cached (in the sys.modules dictionary) so you don't need to worry about extra work being done to load the module multiple times.
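The caching is easy to confirm. This short sketch uses json purely as an example module and shows that a repeated import hands back the very same module object that is stored in sys.modules.

```python
import importlib
import sys

import json
first = json
import json  # second import statement: no reload, just a cache lookup

print(first is json)                            # True
print(sys.modules["json"] is json)              # True
print(importlib.import_module("json") is json)  # True
```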
That said, there can be some exceptions. For instance, if the specific source of an API is considered "private" information (e.g. because it's an implementation detail or because it might be configurable and not always come from the same place all the time), it might make sense to import it into some namespace where all other users will look for it.
On the other hand, your example suggests you may be splitting up your code more than you should. Unlike some other languages (such as Java), in Python it's neither required nor recommended for each class to live in its own file. Instead, you should divide your code up into modules dictated by how closely they interact with each other. Closely related classes should be part of the same module, while pieces that don't interact at all might make more sense in separate modules (especially if some other code might need one part but not the other). It may not be inappropriate for your whole program to be in a single module! Obviously, some of this is a matter of style and taste, so there's not a single best option for every programmer in every situation.
For your example code, if you want to keep separate modules, I'd suggest something like this:
index.py:
from foo import Foo  # no need to import API here if you're not using it directly
from bar import Bar
foo = Foo()  # create an instance of the Foo class
result = foo.some_method()  # call methods on it
bar = Bar(foo)  # you can also pass your instances around to other classes
foo.py:
from example import API
class Foo:
    def some_method(self):
        return API.whatever()  # use the API in some way
bar.py:
from example import API  # don't worry about importing the API more than once
class Bar:
    def __init__(self, foo):
        self.foo = foo
    def blah(self):
        API.something_else(self.foo.some_method())
Note that I changed some names around. Python's convention is to use CapitalizedNames for classes, and lowercase_names_with_underscores (sometimes known as "snake case") for modules, variables and methods. Your original code seemed to confuse the module names foo and bar with the classes of the same names inside them. Using different styles for the different kinds of names helps avoid that confusion.

Can I "fake" a package (or at least a module) in python for testing purposes?

I want to fake a package in python. I want to define something so that the code can do
from somefakepackage.morefakestuff import somethingfake
And somefakepackage is defined in code and so is everything below it. Is that possible? The reason for doing this is to trick my unittest that I got a package ( or as I said in the title, a module ) in the python path which actually is just something mocked up for this unittest.
Sure. Define a class, put the stuff you need inside that, assign the class to sys.modules["classname"].
class fakemodule(object):
    @staticmethod
    def method(a, b):
        return a+b
import sys
sys.modules["package.module"] = fakemodule
You could also use a separate module (call it fakemodule.py):
import fakemodule, sys
sys.modules["package.module"] = fakemodule
Yes, you can make a fake module:
from types import ModuleType
m = ModuleType("fake_module")
import sys
sys.modules[m.__name__] = m
# some scripts may expect a file
# even though this file doesn't exist,
# it may be used by Python in error messages or for introspection.
m.__file__ = m.__name__ + ".py"
# Add a function
def my_function():
    return 10
m.my_function = my_function
Note that this example uses an actual module (an instance of ModuleType), since some Python code may expect real modules instead of a dummy class.
This can be made into a utility function:
def new_module(name, doc=None):
    import sys
    from types import ModuleType
    m = ModuleType(name, doc)
    m.__file__ = name + '.py'
    sys.modules[name] = m
    return m
print(new_module("fake_module", doc="doc string"))
Now other scripts can run:
import fake_module
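The same idea extends to the dotted import from the question. The following is a minimal sketch, assuming that somefakepackage does not exist as a real package on your path. The body of somethingfake is invented for the illustration.

```python
import sys
from types import ModuleType

# Build the fake package and its submodule.
package = ModuleType("somefakepackage")
package.__path__ = []  # an (empty) __path__ marks the module as a package
submodule = ModuleType("somefakepackage.morefakestuff")


def somethingfake():
    return "fake result"


submodule.somethingfake = somethingfake
package.morefakestuff = submodule

# Register both names so the import system finds them in its cache.
sys.modules["somefakepackage"] = package
sys.modules["somefakepackage.morefakestuff"] = submodule

from somefakepackage.morefakestuff import somethingfake as imported_fake

print(imported_fake())  # fake result
```

In a test suite you would normally remove the two sys.modules entries again in the teardown step so the fake does not leak into other tests.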
I took some of the ideas from the other answers and turned them into a Python decorator @modulize which converts a function into a module. This module can then be imported as usual. Here is an example.
@modulize('my_module')
def my_dummy_function(__name__):  # the function takes one parameter __name__
    # put module code here
    def my_function(s):
        print(s, 'bar')
    # the function must return locals()
    return locals()
# import the module as usual
from my_module import my_function
my_function('foo')  # foo bar
The code for the decorator is as follows
import sys
from types import ModuleType
class MockModule(ModuleType):
    def __init__(self, module_name, module_doc=None):
        ModuleType.__init__(self, module_name, module_doc)
        if '.' in module_name:
            package, module = module_name.rsplit('.', 1)
            get_mock_module(package).__path__ = []
            setattr(get_mock_module(package), module, self)
    def _initialize_(self, module_code):
        self.__dict__.update(module_code(self.__name__))
        self.__doc__ = module_code.__doc__
def get_mock_module(module_name):
    if module_name not in sys.modules:
        sys.modules[module_name] = MockModule(module_name)
    return sys.modules[module_name]
def modulize(module_name, dependencies=[]):
    for d in dependencies: get_mock_module(d)
    return get_mock_module(module_name)._initialize_
The project can be found here on GitHub. In particular, I created this for programming contests which only allow the contestant to submit a single .py file. This allows one to develop a project with multiple .py files and then combine them into one .py file at the end.
You could fake it with a class which behaves like somethingfake:
try:
    from somefakepackage.morefakestuff import somethingfake
except ImportError:
    class somethingfake(object):
        # define what you'd expect of somethingfake, e.g.:
        @staticmethod
        def somefunc():
            ...
        somefield = ...
TL;DR
Patch sys.modules using unittest.mock:
mock.patch.dict(
    sys.modules,
    {'somefakepackage': mock.Mock()},
)
Explanation
Other answers correctly recommend modifying sys.modules, but a proper way to do it is to patch it using mock.patch.dict. That means replacing entries temporarily (only while the tests run) with a fake object that optionally imitates the desired behaviour, and restoring the original state once the tests have finished so that other test cases are not affected.
The code in the TL;DR section simply keeps your missing package from raising ImportError. To give the fake package contents and imitate the desired behaviour, instantiate mock.Mock(…) with suitable arguments (e.g. add attributes via Mock's **kwargs).
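For a quick experiment outside a test class, patch.dict can also be used as a context manager. The sketch below assumes that no real package called somefakepackage is installed. The attribute some_func and its return value 123 are made up for the illustration.

```python
import sys
from unittest import mock

# A Mock configured through keyword arguments: some_func() returns 123.
fake_package = mock.Mock(some_func=mock.Mock(return_value=123))

with mock.patch.dict(sys.modules, {"somefakepackage": fake_package}):
    import somefakepackage  # resolved from the patched sys.modules
    result = somefakepackage.some_func()
    print(result)  # 123

# Leaving the block restores sys.modules to its previous contents.
print("somefakepackage" in sys.modules)  # False
```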
Full code example
The code below temporarily patches sys.modules so that it includes somefakepackage and makes it importable from the dependent modules without ImportError.
import sys
import unittest
from unittest import mock
class SomeTestCase(unittest.TestCase):
    def test_smth(self):
        # implement your testing logic, for example:
        self.assertEqual(
            123,
            somefakepackage_dependent.some_func(),
        )
    @classmethod
    def setUpClass(cls):  # called once before all the tests
        # define what to patch sys.modules with
        cls._modules_patcher = mock.patch.dict(
            sys.modules,
            {'somefakepackage': mock.Mock()},
        )
        # actually patch it
        cls._modules_patcher.start()
        # make the package globally visible and import it,
        # just as if you had imported it in the usual way
        # with an import statement at the top of the file,
        # but relying on a patched dependency
        global somefakepackage_dependent
        import somefakepackage_dependent
    @classmethod  # called once after all tests
    def tearDownClass(cls):
        # restore the initial sys.modules state
        cls._modules_patcher.stop()
To read more about setUpClass/tearDownClass methods, see unittest docs.
unittest's built-in mock subpackage is actually a very powerful tool. Better dive deeper into its documentation to get a better understanding.

How does import keyword in python actually work?

Let's say I have 3 files:
a.py
from d import d
class a:
    def type(self):
        return "a"
    def test(self):
        try:
            x = b()
        except:
            print "EXCEPT IN A"
            from b import b
            x = b()
        return x.type()
b.py
import sys
class b:
    def __init__(self):
        if "a" not in sys.modules:
            print "Importing a!"
            from a import a
        pass
    def type(self):
        return "b"
    def test(self):
        for modules in sys.modules:
            print modules
        x = a()
        return x.type()
c.py
from b import b
import sys
x = b()
print x.test()
and run python c.py
Python comes back complaining:
NameError: global name 'a' is not defined
But, a IS in sys.modules:
copy_reg
sre_compile
locale
_sre
functools
encodings
site
__builtin__
operator
__main__
types
encodings.encodings
abc
errno
encodings.codecs
sre_constants
re
_abcoll
ntpath
_codecs
nt
_warnings
genericpath
stat
zipimport
encodings.__builtin__
warnings
UserDict
encodings.cp1252
sys
a
codecs
os.path
_functools
_locale
b
d
signal
linecache
encodings.aliases
exceptions
sre_parse
os
And I can alter b.py such that:
x = a()
changes to
x = sys.modules["a"].a()
And python will happily run that.
A couple questions arise from this:
Why does python say it doesn't know what a is, when it is in sys.modules?
Is using sys.modules a "proper" way to access class and function definitions?
What is the "right" way to import modules?
ie
from module import x
or
import module
I guess it's a problem of scoping, if you import a module in your constructor you can only use it in your constructor, after the import statement.
According to the Python documentation,
Import statements are executed in two steps: (1) find a module, and initialize it if necessary; (2) define a name or names in the local namespace (of the scope where the import statement occurs).
So the problem is that even though module a has been imported, the name a has only been bound in the scope of the b.__init__ method, not the entire scope of b.py. So in the b.test method, there is no such name a, and so you get a NameError.
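The effect is easy to reproduce in isolation. In this sketch (function names invented for the example, with colorsys as an arbitrary standard-library module) one function imports a module locally and another one tries to use the name. The module ends up in sys.modules, but the second function still fails with a NameError.

```python
import sys


def load():
    import colorsys  # binds the name "colorsys" only inside load()
    return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)


def use():
    # "colorsys" is neither a local name here nor a global of this file.
    return colorsys.rgb_to_hsv(0.0, 1.0, 0.0)


load()
print("colorsys" in sys.modules)  # True: the module itself is loaded

error = None
try:
    use()
except NameError as exc:
    error = exc
print(type(error).__name__)  # NameError
```

Moving the import statement to the top of the file binds the name at module level, which makes it visible to both functions.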
You might want to read this article on importing Python modules, as it helps to explain best practices for working with imports.
In your case, a is in sys.modules, but not everything in sys.modules is in b's scope. If you want to use re, you'd have to import that as well.
Conditional importing is occasionally acceptable, but this isn't one of those occasions. For one thing, the circular dependency between a and b in this case is unfortunate and should be avoided (there are lots of patterns for doing so in Fowler's Refactoring). That said, there's no need to conditionally import here.
b ought to simply import a. What were you trying to avoid by not importing it directly at the top of the file?
It is bad style to conditionally import code modules based on program logic. A name should always mean the same thing everywhere in your code. Think about how confusing this would be to debug:
if something:
    from office import desk
else:
    from home import desk
# ... somewhere later in the code ...
desk()
Even if you don't have scoping issues (which you most likely will have), it's still confusing.
Put all your import statements at the top of your file. That's where other coders will look for them.
As for whether to use "from foo import bar" versus just "import foo", the tradeoff is more typing (having to type "foo.bar()" instead of just "bar()") versus clarity and specificity. If you want your code to be really readable and unambiguous, just say "import foo" and fully qualify the call everywhere. Remember, it's much harder to read code than it is to write it.
