the pythonic way of optional imports [duplicate] - python

This question already has answers here:
What's Python good practice for importing and offering optional features?
(7 answers)
Closed last month.
I have a package that allows the user to use any one of 4 packages they want to connect to a database. It works great but I'm unhappy with the way I'm importing things.
I could simply import all the packages, but I don't want to do that in case the specific user doesn't ever need to use turbodbc for example:
import pyodbc
import pymssql
import turbodbc
from ibmdbpy.base import IdaDataBase
Currently, I have the following situation. I try to import all of them, and if one fails to import, that's not a problem: my program simply assumes it will not be called, and if it is, it errors:
# some envs may not have all these packages installed so we try each:
try:
    import pyodbc
except:
    pass
try:
    import pymssql
except:
    pass
try:
    import turbodbc
except:
    pass
try:
    from ibmdbpy.base import IdaDataBase
except:
    pass
This doesn't feel pythonic. So I know there are packages such as holoviews or tensorflow that allow you to specify a backend. They are of course orders of magnitude more complicated than mine, but they have to handle the same pattern.
How can I make this code right? It is technically buggy: if a user intends to use pyodbc but doesn't have it installed, my program will not warn them; it will error at runtime. So this really goes beyond aesthetics or philosophy; this is error-prone code.
How would you handle this situation?
Fyi, here is an example of how the code is called:
connect('Playground', package='pymssql')

try:
    import pyodbc
except ImportError:
    pyodbc = None
then later:
if pyodbc is None and user_wants_to_use_pyodbc:
    print_warning()
    raise SomeConfigurationErrorOrSuch()
This approach works well for a small number of options. If you have enough options that you need to abstract out this approach, then you can use the importlib module to import modules under the control of your program.

I would use import_module from importlib:
from importlib import import_module

# IdaDataBase is a class inside ibmdbpy.base, so import the module that holds it.
modules_to_import = ['pyodbc', 'pymssql', 'turbodbc', 'ibmdbpy.base']
for m in modules_to_import:
    try:
        # Binds the module under its last path component, e.g. 'base' for 'ibmdbpy.base'.
        globals()[m.split('.')[-1]] = import_module(m)
    except ModuleNotFoundError:
        print('Module {} not found'.format(m))
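If the goal is only to find out which optional packages are present, without importing any of them yet, importlib.util.find_spec can be used instead. A minimal sketch, assuming top-level module names only (for dotted names find_spec imports the parent package and may raise); is_available is a hypothetical helper name:

```python
from importlib.util import find_spec


def is_available(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


# Names taken from the question; which ones are found depends on the environment.
candidates = ["pyodbc", "pymssql", "turbodbc", "ibmdbpy"]
available = [name for name in candidates if is_available(name)]
```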

I've used something similar to the answers above, but sometimes you might need to mock an object to fool lint.
try:
    from neuralprophet import NeuralProphet
    using_neuralprophet = True
except ImportError:
    class NeuralMock:
        whatever = False
    using_neuralprophet = False
    NeuralProphet = NeuralMock()
Source: timemachines

You can put imports in places other than the beginning of the file. "Re-importing" something doesn't actually do anything, so it's not computationally expensive to import x frequently:
def switch(x, file):
    if x == 'a':
        import json
        return json.load(file)
    elif x == 'b':
        import pandas as pd
        return pd.read_csv(file)
You can also use importlib to dynamically import modules. This is especially useful if you have multiple implementations of the same API that you want to choose between:
import importlib

class Connection:
    def __init__(self, driver_module, driver_name):
        # or driver_module, driver_name = full_path.rsplit('.', 1)
        self.driver = getattr(importlib.import_module(driver_module), driver_name)()

    def method(self):
        return self.driver.do()
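The full_path variant mentioned in the comment can be sketched as a small helper. The name resolve is hypothetical, and a standard-library class stands in for a real driver so that the example runs anywhere:

```python
import importlib


def resolve(full_path):
    """Import `package.module.Name` and return the `Name` attribute."""
    module_name, attr_name = full_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Demonstrated with a standard-library class in place of a database driver.
decoder_cls = resolve("json.JSONDecoder")
decoder = decoder_cls()
```

A missing module surfaces as ImportError and a missing attribute as AttributeError, so callers can tell the two failure modes apart.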

Related

how to only import module if necessary and only once

I have a class which can be plotted using matplotlib, but it can (and will) also be used without plotting it.
I would like to only import matplotlib if necessary, ie. if the plot method is called on an instance of the class, but at the same time I would like to only import matplotlib once if at all.
Currently what I do is:
class Cheese:
    def plot(self):
        from matplotlib import pyplot as plt
        # *plot some cheese*
...but I suppose that this may lead to importing multiple times.
I can think of lots of ways to accomplish only importing once, but they are not pretty.
What is a pretty and "pythonic" way of doing this?
I don't mean for this to be "opinion based", so let me clarify what I mean by "pretty":
using the fewest lines of code.
most readable
most efficient
least error-prone
etc.
If a module is already loaded then it won't be loaded again; you will just get a reference to it. If you don't plan to use this class locally and just want to satisfy the type checker, then you can do the following:
# imports
# import whatever you need locally
from typing import TYPE_CHECKING
if TYPE_CHECKING:  # False at runtime
    from matplotlib import pyplot as plt
Optional import in Python:
try:
    import something
    import_something = True
except ImportError:
    import something_else
    import_something_else = True
Conditional import in Python:
if condition:
    import something
    # something library related code
elif other_condition:
    # code without library
    pass
Import related to one function:
def foo():
    import some_library_to_use_only_inside_foo
TL;DR: Python already does this for you, for free.
Python's import machinery imports a module only once, even if it is imported multiple times, and even from different files (docs).
The most Pythonic way to import something is to do so at the beginning of the file, unless you have special needs, like importing different modules depending on some condition, e.g. the platform (Windows, Linux).
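The caching behaviour can be observed directly through sys.modules. A small sketch using a standard-library module; the function and variable names are only for illustration:

```python
import sys


def load_twice():
    import json as first   # first import: may execute the module
    import json as second  # second import: a lookup in the sys.modules cache
    return first, second


first, second = load_twice()
same_object = first is second           # both names refer to one module object
cached = sys.modules["json"] is first   # the cache holds that same object
```

This is why an import statement inside a method such as Cheese.plot only pays the loading cost on the first call.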

Python: Define a function only if package exists

Is it possible to tell Python 2.7 to only parse a function definition if a package exists?
I have a script that is run on multiple machines. There are some functions defined in the script that are very nice to have, but aren't required for the core operations the script performs. Some of the machines the script is run on don't have the package that the function imports, (and the package can't be installed on them). Currently I have to comment out the function definition before cloning the repo onto those machines. Another solution would be to maintain two different branches but that is even more tedious. Is there a solution that prevents us from having to constantly comment out code before pushing?
There are already solutions for when the function is called, such as this:
try:
    someFunction()
except NameError:
    print("someFunction() not found.")
Function definitions and imports are just code in Python, and like other code, you can wrap them in a try:
try:
    import bandana
except ImportError:
    pass  # Hat-wearing functions are optional
else:
    def wear(hat):
        bandana.check(hat)
        ...
This would define the wear function only if the bandana module is available.
Whether this is a good idea or not is up to you - I think it would be fine in your own scripts, but you might not want to do this in code other people will use. Another idea might be to do something like this:
def wear(hat):
    try:
        import bandana
    except ImportError:
        raise NotImplementedError("You need the bandana package to wear hats")
    else:
        bandana.check(hat)
        ...
This would make it clearer why you can't use the wear function.
A somewhat improved solution is as follows:
In file header:
try:
    # Optional dependency
    import psutil
except ImportError as e:
    psutil = e
Later, at the beginning of your function or inside the __init__ method:
if isinstance(psutil, ImportError):
    raise psutil
Pros: you get the original exception message when you access the optional functionality, just as if you had simply written import psutil.
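When several functions depend on the same optional package, the isinstance check can be factored into a decorator. This is a sketch of one possible arrangement, not part of the answer above; requires is a hypothetical name:

```python
import functools

try:
    import psutil  # optional dependency
except ImportError as exc:
    psutil = exc   # keep the original error for later


def requires(dependency):
    """Hypothetical decorator: re-raise a stored ImportError before calling."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if isinstance(dependency, ImportError):
                raise dependency
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires(psutil)
def cpu_count():
    return psutil.cpu_count()
```

Importing the module always succeeds; only calling cpu_count() without psutil installed raises the stored ImportError.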

How should one write the import procedures in a module that uses imported modules in a limited way?

I have a module that features numerous functions. Some of these functions are dependent on other modules. The module is to be used in some environments that are not going to have these other modules. In these environments, I want the functionality of the module that is not dependent on these other unavailable modules to be usable. What is a reasonable way of coding this?
I am happy for an exception to be raised when a function's dependency module is not met, but I want the module to be usable for other functions that do not have dependency problems.
At present, I'm thinking of a module-level function something like the following:
def tryImport(moduleName = None):
    try:
        module = __import__(moduleName)
        return(module)
    except:
        raise(Exception("module {moduleName} import error".format(
            moduleName = moduleName)))
        sys.exit()
This would then be used within functions in a way such as the following:
def function1():
    pyfiglet = tryImport(moduleName = "pyfiglet")
For your use case, it sounds like there's nothing wrong with putting the imports you need inside functions, rather than at the top of your module (don't worry, it costs virtually nothing to reimport):
def some_function():
    import foo  # Throws if there is no foo
    return foo.bar()
Full stop, please.
The first thing is that the problem you described indicates that you should redesign your module. The most obvious solution is to break it into a couple of modules in one package. Each of them would contain group of functions with common external dependencies. You can easily define how many groups you need if you only know which dependencies might be missing on the target machine. Then you can import them separately. Basically in a given environment you import only those you need and the problem doesn't exist anymore.
If you still believe it's not the case and insist on keeping all the functions in one module then you can always do something like:
if env_type == 'env1':
    import pyfiglet
if env_type in ('env1', 'env2'):
    import pynose
    import gzio
How you deduce the env_type is up to you. It might come from a configuration file or an environment variable.
Then you have your functions in this module. No problem occurs as long as no consumer of the module calls a function that relies on a module that is unavailable in the given environment.
I don't see any point in throwing your custom exception. A NameError will be raised anyway on any attempt to access a name that was not imported. By the way, your sys.exit() would never be executed anyway.
If you don't want to define environment types, you can still achieve the goal with the following code:
try: import pyfiglet
except ImportError: pass
try: import pynose
except ImportError: pass
try: import gzio
except ImportError: pass
Both code snippets are supposed to be used on module level and not inside functions.
TL;DR I would just break this module into several parts. But if you really have to keep it monolithic, just use the basic language features and don't over-engineer it by using __import__ and a dedicated function.

Modify the strategy importing module by hacking the sys.modules

As I read the question, I came up with an idea, but I don't know the consequences of my guesswork.
My idea is to change the import strategy by modifying sys.modules, so that what gets imported changes without modifying the old code.
Edit 1
A situation that uses this method
Hack code:
try:
    import concurrent.futures
except ImportError:
    concurrent.futures = wrapper_futures
Then this code can be used for both Python 2 and Python 3.
Old code:
from concurrent.futures import Future
try:
    from servicelibrary.simple import synchronous
except ImportError:
    from servicelibrary.simple import alternative as synchronous
is probably a better way to do it, if I understand your question properly.

Choose adapter dynamically depending on librarie(s) installed

I am designing a library with adapters that support a wide range of libraries. I want the library to dynamically choose whichever adapter has its underlying library installed on the machine when importing specific classes.
The goal is to be able to change the library that the program depends on without having to make modifications to the code. This particular feature is for handling RabbitMQ connections; as we have had a lot of problems with pika, we want to be able to change to a different library, e.g. pyAMPQ or rabbitpy, without having to change the underlying code.
I was thinking of implementing something like this in the __init__.py file of servicelibrary.simple.
try:
    # import pika  # Is pika installed?
    from servicelibrary.simple.synchronous import Publisher
    from servicelibrary.simple.synchronous import Consumer
except ImportError:
    # import amqp  # Is amqp installed?
    from servicelibrary.simple.alternative import Publisher
    from servicelibrary.simple.alternative import Consumer
Then when the user imports the library
from servicelibrary.simple import Publisher
The underlying layer looks something like this
alternative.py
import amqp
class Publisher(object):
    ......
class Consumer(object):
    ......
synchronous.py
import pika
class Publisher(object):
    ......
class Consumer(object):
    ......
This would automatically pick the second one when the first one is not installed.
Is there a better way of implementing something like this? If anyone could link a library/adapter with a similar implementation that would be helpful as well.
[Edit]
What would be the cleanest way to implement something like this? In the future I would also like to be able to change the default preference. Ultimately I may just settle for using the library installed, as I can control that, but it would be a nice feature to have.
Alexanders suggestion is interesting, but I would like to know if there is a cleaner way.
[Edit2]
The original example was simplified. Each module may contain multiple types of imports, e.g. Consumer and Publisher.
The importlib.import_module function might do what you need:
import importlib

INSTALLED = ['synchronous', 'alternative']
for mod_name in INSTALLED:
    try:
        module = importlib.import_module('servicelibrary.simple.' + mod_name)
        Publisher = getattr(module, 'Publisher', None)
        if Publisher:
            break  # found what we needed
    except ImportError:
        continue
I guess this is not the most advanced technique, but the idea should be clear.
You could take a look at the imp module as well, though note that it has been deprecated since Python 3.4 and was removed in Python 3.12; importlib is its replacement.
A flexible solution, using importlib. This is a complete, working solution that I've tested.
First, the header:
import importlib
parent = 'servicelibrary.simple'
modules = {'.synchronous': ['.alternative', '.alternative_2']}
success = False  # an indicator, default is False,
                 # changed to True when the import succeeds.
We import the required module, set our indicator, and specify our modules. modules is a dictionary, with the key set as the default module, and the value as a list of alternatives.
Next, the import-ant part:
# Obtain the module
for default, alternatives in modules.items():
    try:  # we will try to import the default module first
        mod = importlib.import_module(parent + default)
        success = True
    except ImportError:  # the default module fails, try the alternatives
        for alt in alternatives:
            try:  # try the first alternative, if it still fails, try the next one.
                mod = importlib.import_module(parent + alt)
                success = True
                # Stop searching for alternatives!
                break
            except ImportError:
                continue
print('Success: {}'.format(success))
And to have the classes, simply do:
Publisher = mod.Publisher
Consumer = mod.Consumer
With this solution, you can have multiple alternatives at once. For example, you can use both rabbitpy and pyAMPQ as your alternatives.
Note: Works with both Python 2 and Python 3.
If you have more questions, feel free to comment and ask!
You've got the right idea. Your case works because each subobject has the same sort of classes e.g. both APIs have a class called Publisher and you can just make sure the correct version is imported.
If this isn't true (if possible implementation A and B are not similar) you write your own facade, which is just your own simple API that then calls the real API with the correct methods/parameters for that library.
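A facade of this kind can be sketched as follows. Two standard-library serializers stand in for two third-party libraries with different call conventions; the class names, BACKENDS and get_backend are all hypothetical and only illustrate the shape:

```python
import json
import pickle


class JsonBackend:
    """Adapter over json, which works with text rather than bytes."""
    def dumps(self, obj):
        return json.dumps(obj).encode("utf-8")

    def loads(self, data):
        return json.loads(data.decode("utf-8"))


class PickleBackend:
    """Adapter over pickle, which works with bytes and takes a protocol argument."""
    def dumps(self, obj):
        return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)

    def loads(self, data):
        return pickle.loads(data)


BACKENDS = {"json": JsonBackend, "pickle": PickleBackend}


def get_backend(name):
    """Return an adapter instance exposing the same dumps/loads interface."""
    return BACKENDS[name]()
```

Calling code only ever sees dumps and loads, so the differences between the underlying libraries stay inside the adapters.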
Obviously switching between choices may require some overhead. (I don't know your case, but for instance, let's say you had two libraries to walk through an open file, and the library handles opening the file. You can't just switch to the second library in the middle of the file and expect it to start where the first library stopped.) But it's just a matter of saving it:
accessmethods = {}
try:
    from modA.modB import classX as apiA_classX
    from modA.modB import classY as apiA_classY
    accessmethods['apiA'] = [apiA_classX, apiA_classY]
    classX = apiA_classX
    classY = apiA_classY
except ImportError:
    pass
try:
    from modC.modD import classX as apiB_classX
    from modC.modD import classY as apiB_classY
    accessmethods['apiB'] = [apiB_classX, apiB_classY]
    classX = apiB_classX
    classY = apiB_classY
except ImportError:
    pass
def switchMethod(method):
    global classX
    global classY
    try:
        classX, classY = accessmethods[method]
    except KeyError:
        raise ValueError('Method %s not currently available' % method)
etc.
I know two methods: one is widely used and the other is my guesswork. You can choose whichever suits your situation.
The first one is widely used, for example by from tornado.concurrent import Future.
try:
    from concurrent import futures
except ImportError:
    futures = None
# define _DummyFuture here...
if futures is None:
    Future = _DummyFuture
else:
    Future = futures.Future
Then you can use from tornado.concurrent import Future in other files.
The second one is my guesswork. I wrote a simple demo, but I haven't used it in a production environment because I haven't needed it.
import sys
try:
    import servicelibrary.simple.synchronous
except ImportError:
    import servicelibrary.simple.alternative
    sys.modules['servicelibrary.simple.synchronous'] = servicelibrary.simple.alternative
You can run this script before any other script imports servicelibrary.simple.synchronous. Then you can use the module as before:
from servicelibrary.simple.synchronous import Publisher
from servicelibrary.simple.synchronous import Consumer
The only thing I wonder about is what the consequences of my guesswork are.
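The mechanism itself can be checked with a standard-library module standing in for the adapter. In this sketch preferred_json is a made-up name that does not exist on disk; the import succeeds only because the import statement consults sys.modules first:

```python
import json
import sys

# "preferred_json" is a made-up module name used purely for illustration.
sys.modules["preferred_json"] = json

import preferred_json              # served from the sys.modules cache
from preferred_json import dumps   # attribute lookup on the aliased module

aliased = preferred_json is json
reported_name = preferred_json.__name__  # still "json", not "preferred_json"
```

One visible consequence is that the aliased module keeps its original __name__, so anything that relies on the module's own name (log messages, pickled class paths, reloads) still refers to the real module rather than the alias.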
Based on the answers I ended up with the following implementation for Python 2.7.
Examples are simplified for stackoverflow..
from importlib import import_module
PARENT = 'myservicelib.rabbitmq'
MODULES = ['test_adapter',
           'test_two_adapter']
SUCCESS = False
for _module in MODULES:
    try:
        __module = import_module('{0}.{1}'.format(PARENT, _module))
        Consumer = getattr(__module, 'Consumer')
        Publisher = getattr(__module, 'Publisher')
        SUCCESS = True
        break
    except ImportError:
        pass
if not SUCCESS:
    raise NotImplementedError('no supported rabbitmq library installed.')
Although, as I also had some of my projects running Python 2.6 I had to either modify the code, or include importlib. The problem with a production platform is that it isn't always easy to include new dependencies.
This is the compromise I came up with, based on __import__ instead of importlib.
It might be worth checking if sys.modules actually contains the namespace so you don't get a KeyError raised, but it is unlikely.
import sys
PARENT = 'myservicelib.rabbitmq'
MODULES = ['test_adapter',
           'test_two_adapter']
SUCCESS = False
for _module in MODULES:
    try:
        __module_namespace = '{0}.{1}'.format(PARENT, _module)
        __import__(__module_namespace)
        __module = sys.modules[__module_namespace]
        Consumer = getattr(__module, 'Consumer')
        Publisher = getattr(__module, 'Publisher')
        SUCCESS = True
        break
    except ImportError:
        pass
if not SUCCESS:
    raise NotImplementedError('no supported rabbitmq library installed.')
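On Python 3 the same loop can be wrapped in a reusable helper that also reports why each candidate failed. This is a sketch under that assumption; first_available is a hypothetical name, and standard-library modules stand in for the adapters so it runs anywhere:

```python
from importlib import import_module


def first_available(candidates):
    """Return the first module in `candidates` that imports cleanly."""
    errors = []
    for name in candidates:
        try:
            return import_module(name)
        except ImportError as exc:
            errors.append("{}: {}".format(name, exc))
    raise ImportError(
        "none of the candidates could be imported: " + "; ".join(errors)
    )


# The first name is made up and fails; the second is a stand-in that succeeds.
backend = first_available(["no_such_adapter_xyz", "json"])
```

Note that catching ImportError also swallows import failures raised inside an adapter itself, which is why the collected messages are included in the final error.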
