PEP 8 and deferred import - python

I am working on a large Python program which makes use of a multitude of modules depending on command-line options, in particular, numpy. We have recently found a need to run this on a small embedded module which precludes the use of numpy. From our perspective, this is easy enough (just don't use the problematic command line options.)
However, following PEP 8, our import numpy is at the beginning of each module that might need it, and the program will crash due to numpy not being installed. The straightforward solution is to move import numpy from the top of the file to the functions that need it. The question is, "How bad is this"?
(An alternative solution is to wrap import numpy in a try .. except. Is this better?)
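A minimal sketch of the deferred-import option described in the question. The function name column_means and the error message are made up for illustration; the point is only that the import, and any failure from it, is confined to the code path that actually needs numpy.

```python
def column_means(rows):
    """Return the per-column mean of a list of equal-length rows."""
    try:
        # Deferred import: this line only runs when the function is called,
        # so importing the enclosing module never requires numpy.
        import numpy as np
    except ImportError as exc:
        raise RuntimeError(
            "column_means() requires numpy, which is not installed"
        ) from exc
    return np.asarray(rows, dtype=float).mean(axis=0).tolist()
```

After the first successful call the module is cached in sys.modules, so repeated calls pay only for a dictionary lookup rather than a full import.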

Here is a best practice pattern to check if a module is installed and make code branch depending on it.
# GOOD
import pkg_resources
try:
    pkg_resources.get_distribution('numpy')
except pkg_resources.DistributionNotFound:
    HAS_NUMPY = False
else:
    HAS_NUMPY = True
    # You can also import numpy here unless you want to import it inside the function
Do this in every module that has a soft dependency on numpy. More information is in the Plone CMS coding conventions.
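The same check can probably be written without pkg_resources on Python 3.8 and later, using the standard library's importlib.metadata. This is a swapped-in technique, not what the answer above uses, and the helper name has_distribution is hypothetical.

```python
from importlib import metadata


def has_distribution(name):
    """Return True if an installed distribution called `name` is found."""
    try:
        metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return True


# Module-level flag, computed once at import time.
HAS_NUMPY = has_distribution("numpy")
```

Note that this looks up the distribution name (what you pass to pip), which is not always the same as the name you import.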

Another idiom which I've seen is to import the module as None if unavailable:
try:
    import numpy as np
except ImportError:
    np = None
Or, as in the other answer, you can use the pkg_resources.get_distribution above, rather than try/except (see the blog post linked to from the plone docs).
That way, you can guard each use of numpy with an if block:
if np:
    # do something with numpy
    pass
else:
    # do something in vanilla python
    pass
The key is to ensure your CI tests cover both environments, with and without numpy (and if you are measuring coverage, both blocks should count as covered).

Related

What's the Pythonic way to write conditional statements based on installed modules?

Coming from a C++ world I got used to write conditional compilation based on flags that are determined at compilation time with tools like CMake and the like. I wonder what's the most Pythonic way to mimic this functionality. For instance, this is what I currently set depending on whether a module is found or not:
import imp
try:
    imp.find_module('petsc4py')
    HAVE_PETSC = True
except ImportError:
    HAVE_PETSC = False
Then I can use HAVE_PETSC throughout the rest of my Python code. This works, but I wonder if it's the right way to do it in Python.
Yes, it is ok. You can even issue an import directly, and use the module name itself as the flag, like this:
try:
    import petsc4py
except ImportError:
    petsc4py = None
And before any use, just test for the truthfulness of petsc4py itself.
Actually, checking if it exists, and only then trying to import it, feels unpythonic due to the redundancy, as both actions trigger an ImportError all the same. But having a HAVE_PETSC variable for the checks is ok; it can be created after the try/except above with HAVE_PETSC = bool(petsc4py)
The way you're doing it is more-or-less fine. In fact, the Python standard library uses a similar paradigm of "try to import something and if it's not valid for some reason then set a variable somehow" in multiple places. Checking if a boolean is set later in the program is going to be faster than doing a separate try/except block every single time.
In your case it would probably just be better to do this, though:
try:
    import petsc4py
    HAVE_PETSC = True
except ImportError:
    HAVE_PETSC = False
What you have works on a paradigm level, but there's no real reason to go through imp or importlib in this case (and you probably shouldn't use imp anyway: it has been deprecated since Python 3.4 and was removed in Python 3.12).
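If you do want to test for a module without importing it, the modern counterpart of imp.find_module is importlib.util.find_spec. The sketch below assumes Python 3.4 or later and only top-level module names; the helper name module_available is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` could be imported."""
    # find_spec returns None when no finder can locate the module.
    return importlib.util.find_spec(name) is not None


HAVE_PETSC = module_available("petsc4py")
```

For dotted names, find_spec imports the parent package first and raises ModuleNotFoundError if the parent is missing, so a helper like this is best restricted to top-level names.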

What are the rules for importing with "as" in Python without using from

I was trying to import the following function in Python 2.7
import scipy.signal.savgol_filter as sgolay
I received the following error:
ImportError: No module named savgol_filter
savgol_filter is a function, not a module, so the error makes some sense. My question then is, is it not possible to import, without the use of the word "from" anything besides a module?
In other words, the following works:
from scipy.signal import savgol_filter as sgolay
But in general, does the following "sub_part" need to be a module?
import my_module.sub_part as some_name
I've seen lots of writing suggesting "sub_part" does not need to be a module. Is there something tricky going on with scipy that is making this not work?
Thanks,
Jim
In general, if you do import thing, import thing.subthing, import thing.subthing.subsubthing, etc., the far-right thing needs to be a module. Only the from form allows importing things that aren't modules. If you want a definitive statement of the forms of the import statement and what it allows, the Python language reference explains it in great detail, but it's a pretty dense read.
import is for importing modules. If the thing you want is not a module (here it is a function), import the module that contains it and bind the attribute to a name yourself:
import scipy.signal
sgolay = scipy.signal.savgol_filter
# Other stuff to do...

Conventions for 'import ... as'

Typically, one uses import numpy as np to import the module numpy.
Are there general conventions for naming?
What about other modules, in particular from scientific computing like scipy, sympy and pylab or submodules like scipy.sparse.
SciPy recommends import scipy as sp in its documentation, though personally I find that rather useless since it only gives you access to re-exported NumPy functionality, not anything that SciPy adds to that. I find myself doing import scipy.sparse as sp much more often, but then I use that module heavily. Also
import matplotlib as mpl
import matplotlib.pyplot as plt
import networkx as nx
You might encounter more of these as you start using more libraries. There's no registry or anything for these shorthands and you're free to invent new ones as you see fit. There's also no general convention except that import lln as library_with_a_long_name obviously won't occur very often.
Aside from these shorthands, there's a habit among Python 2.x programmers to do things like
# Try to import the C implementation of StringIO; if that doesn't work
# (e.g. in IronPython or Jython), import the pure Python version.
# Make sure the imported module is called StringIO locally.
try:
    import cStringIO as StringIO
except ImportError:
    import StringIO
Python 3.x is putting an end to this, though, because it no longer offers partial C implementations of StringIO, pickle, etc.

Supporting Multiple Python Versions In Your Code?

Today I tried using pyPdf 1.12 in a script I was writing that targets Python 2.6. When running my script, and even when just importing pyPdf, I get complaints about deprecated functionality (md5 -> hashlib, sets). I'd like to contribute a patch to make this work cleanly in 2.6, but I imagine the author does not want to break compatibility with older versions (2.5 and earlier).
Searching Google and Stack Overflow has so far turned up nothing. I feel like I have seen try/except blocks around import statements before that accomplish something similar, but I can't find any examples. Is there a generally accepted best practice for supporting multiple Python versions?
There are two ways to do this:
(1) Just like you described: Try something and work around the exception for old versions. For example, you could try to import the json module and import a userland implementation if this fails:
try:
    import json
except ImportError:
    import myutils.myjson as json
This is an example from Django (they use this technique often):
try:
    reversed
except NameError:
    from django.utils.itercompat import reversed # Python 2.3 fallback
If the built-in reversed is available, they use it. Otherwise, they import their own implementation from the utils package.
(2) Explicitly compare the version of the Python interpreter:
import sys
if sys.version_info < (2, 6, 0):
    # Do stuff for old version...
    pass
else:
    # Do 2.6+ stuff
    pass
sys.version_info is a tuple that can easily be compared with similar version tuples.
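A present-day instance of the same version check, as a sketch: str.removeprefix was added in Python 3.9, so older interpreters need a hand-written fallback. The function name strip_prefix is made up for this example.

```python
import sys

if sys.version_info >= (3, 9):
    def strip_prefix(text, prefix):
        # str.removeprefix exists from Python 3.9 onwards
        return text.removeprefix(prefix)
else:
    def strip_prefix(text, prefix):
        # Equivalent fallback for older interpreters
        if text.startswith(prefix):
            return text[len(prefix):]
        return text
```

The comparison works because tuples compare element by element, so (3, 8, 10) is less than (3, 9) regardless of the trailing fields.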
You can certainly do
try:
    import v26
except ImportError:
    import v25
Dive Into Python—Using Exceptions for Other Purposes
Multiple versions of Python are supported here. You can a) conditionally use the newer version, which takes a little work, or b) turn off the warnings, which should really be the default (and is on newer Pythons).

What are good rules of thumb for Python imports?

I am a little confused by the multitude of ways in which you can import modules in Python.
import X
import X as Y
from A import B
I have been reading up about scoping and namespaces, but I would like some practical advice on what is the best strategy, under which circumstances and why. Should imports happen at a module level or a method/function level? In the __init__.py or in the module code itself?
My question is not really answered by "Python packages - import by class, not file" although it is obviously related.
In production code in our company, we try to follow the following rules.
We place imports at the beginning of the file, right after the main file's docstring, e.g.:
"""
Registry related functionality.
"""
import wx
# ...
Now, if we import a class that is one of few in the imported module, we import the name directly, so that in the code we only have to use the last part, e.g.:
from RegistryController import RegistryController
from ui.windows.lists import ListCtrl, DynamicListCtrl
There are modules, however, that contain dozens of classes, e.g. a list of all possible exceptions. Then we import the module itself and refer to it in the code:
from main.core import Exceptions
# ...
raise Exceptions.FileNotFound()
We use the import X as Y as rarely as possible, because it makes searching for usage of a particular module or class difficult. Sometimes, however, you have to use it if you wish to import two classes that have the same name, but exist in different modules, e.g.:
from Queue import Queue
from main.core.MessageQueue import Queue as MessageQueue
As a general rule, we don't do imports inside methods: they simply make the code slower and less readable. Some find this an easy way to resolve cyclic import problems, but a better solution is to reorganize the code.
Let me just paste part of a conversation on the django-dev mailing list started by Guido van Rossum:
[...]
For example, it's part of the Google Python style guides[1] that all
imports must import a module, not a class or function from that
module. There are way more classes and functions than there are
modules, so recalling where a particular thing comes from is much
easier if it is prefixed with a module name. Often multiple modules
happen to define things with the same name -- so a reader of the code
doesn't have to go back to the top of the file to see from which
module a given name is imported.
Source: http://groups.google.com/group/django-developers/browse_thread/thread/78975372cdfb7d1a
1: http://code.google.com/p/soc/wiki/PythonStyleGuide#Module_and_package_imports
I would normally use import X on module level. If you only need a single object from a module, use from X import Y.
Only use import X as Y in case you're otherwise confronted with a name clash.
I only use imports on function level to import stuff I need when the module is used as the main module, like:
def main():
    import sys
    if len(sys.argv) > 1:
        pass
HTH
Someone above said that
from X import A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P
is equivalent to
import X
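These are not equivalent. With import X you reach A-P through the module object, so X.A always reflects the module's current value, and assigning to X.A changes it for every user of the module. from X import ... binds local names to the objects that A-P refer to at import time (no copy is made). If X later rebinds one of those names, your local name still points to the old object, and if you rebind your local name, X does not know about your modification.
If A-P are functions that are never rebound, you won't notice the difference.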
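The difference can be demonstrated with a throwaway module built in memory. The module name demo_settings and its attributes LIMIT and ITEMS are invented for this sketch.

```python
import sys
import types

# Build a small module in memory so the example is self-contained.
settings = types.ModuleType("demo_settings")
settings.LIMIT = 1
settings.ITEMS = []
sys.modules["demo_settings"] = settings

import demo_settings
from demo_settings import LIMIT, ITEMS

demo_settings.LIMIT = 2  # rebinds the name inside the module
ITEMS.append("x")        # mutates the single list both names refer to

print(LIMIT, demo_settings.LIMIT)  # the local name still holds the old value
print(demo_settings.ITEMS)         # the mutation is visible through the module
```

Rebinding is invisible across the two names, while mutation of a shared object is visible through both, which is why "copy" is a misleading description of what from X import does.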
Others have covered most of the ground here but I just wanted to add one case where I will use import X as Y (temporarily), when I'm trying out a new version of a class or module.
So if we were migrating to a new implementation of a module, but didn't want to cut the code base over all at one time, we might write a xyz_new module and do this in the source files that we had migrated:
import xyz_new as xyz
Then, once we cut over the entire code base, we'd just replace the xyz module with xyz_new and change all of the imports back to
import xyz
DON'T do this:
from X import *
unless you are absolutely sure that you will use each and every thing in that module. And even then, you should probably consider a different approach.
Other than that, it's just a matter of style.
from X import Y
is good and saves you lots of typing. I tend to use that when I'm using something in it fairly frequently. But if you're importing a lot from that module, you could end up with an import statement that looks like this:
from X import A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P
You get the idea. That's when imports like
import X
become useful. Either that or if I'm not really using anything in X very frequently.
I generally try to use the regular import modulename, unless the module name is long, or used often..
For example, I would do..
from BeautifulSoup import BeautifulStoneSoup as BSS
..so I can do soup = BSS(html) instead of BeautifulSoup.BeautifulStoneSoup(html)
Or..
from xmpp import XmppClientBase
..instead of importing the entire of xmpp when I only use the XmppClientBase
Using import x as y is handy if you want to import very long method names, or to avoid clobbering an existing import/variable/class/method (something you should try to avoid completely, but it's not always possible)
Say I want to run a main() function from another script, but I already have a main() function..
from my_other_module import main as other_module_main
..wouldn't replace my main function with my_other_module's main
Oh, one thing - don't do from x import * - it makes your code very hard to understand, as you cannot easily see where a method came from (from x import *; from y import *; my_func() - where is my_func defined?)
In all cases, you could just do import modulename and then do modulename.subthing1.subthing2.method("test")...
The from x import y as z stuff is purely for convenience - use it whenever it'll make your code easier to read or write!
When a library is well written, which is sometimes the case in Python, you should just import it and use it as it is. A well-written library tends to take on a life and language of its own, resulting in pleasant-to-read code in which you rarely have to spell out the library name. In that case you should rarely need renaming or anything else.
import gat
node = gat.Node()
child = node.children()
Sometimes it's not possible to write it this way, or you want to pull names out of the imported library into your own namespace:
from gat import Node, SubNode
node = Node()
child = SubNode(node)
Sometimes you do this for a lot of names. If the import line overflows 80 columns, it's a good idea to do this:
from gat import (
    Node, SubNode, TopNode, SuperNode, CoolNode,
    PowerNode, UpNode
)
The best strategy is to keep all of these imports at the top of the file, preferably ordered alphabetically, with import statements first and from ... import statements after them.
Now let me explain why this is the best convention.
Python could perfectly well have had an automatic import that searches the main imports for a name when it can't be found in the global namespace. But that would not be a good idea. Aside from being more complicated to implement than a simple import, it would stop programmers from thinking much about dependencies, and finding out where something was imported from would have to be done some other way than just looking at the imports.
The need to find out dependencies is one reason why people hate "from ... import *". Some unfortunate cases where you need it do exist, though, for example OpenGL wrappers.
So the import statements are actually valuable as a definition of the program's dependencies, and that is how you should exploit them. From them you can quickly check where some odd function is imported from.
The import X as Y is useful if you have different implementations of the same module/class.
With some nested try..import..except ImportError..imports you can hide the implementation from your code. See lxml etree import example:
try:
    from lxml import etree
    print("running with lxml.etree")
except ImportError:
    try:
        # Python 2.5
        import xml.etree.cElementTree as etree
        print("running with cElementTree on Python 2.5+")
    except ImportError:
        try:
            # Python 2.5
            import xml.etree.ElementTree as etree
            print("running with ElementTree on Python 2.5+")
        except ImportError:
            try:
                # normal cElementTree install
                import cElementTree as etree
                print("running with cElementTree")
            except ImportError:
                try:
                    # normal ElementTree install
                    import elementtree.ElementTree as etree
                    print("running with ElementTree")
                except ImportError:
                    print("Failed to import ElementTree from any known place")
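The same fallback chain can be written without the nesting by looping over candidate module names with importlib.import_module. This is an alternative sketch, not the code from the lxml documentation; the helper name first_available is hypothetical and the candidate list is shortened to two entries that are relevant on Python 3.

```python
import importlib


def first_available(*names):
    """Import and return the first module in `names` that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate
    raise ImportError("none of these modules could be imported: %s"
                      % ", ".join(names))


# Prefer lxml when it is installed, otherwise use the standard library.
etree = first_available("lxml.etree", "xml.etree.ElementTree")
```

The rest of the code then uses etree without caring which implementation was picked, as long as it sticks to the API the candidates share.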
I'm with Jason on not using
from X import *
But in my case (I'm not an expert programmer, so my code does not follow the coding style too well) I usually create a file in my programs with all the constants, such as the program version, authors, error messages and so on, so that the file contains only definitions. Then I do the import
from const import *
That saves me a lot of time. But it's the only file that has that import, and it's because all inside that file are just variable declarations.
Doing that kind of import in a file with classes and definitions might be useful, but when you have to read that code you spend lots of time locating functions and classes.
