Choose an adapter dynamically depending on which libraries are installed - Python

I am designing a library with adapters that support a wide range of third-party libraries. When specific classes are imported, I want the library to dynamically choose whichever adapter's underlying library is installed on the machine.
The goal is to be able to change the library that the program depends on without modifying the code. This particular feature is for handling RabbitMQ connections: we have had a lot of problems with pika, so we want to be able to switch to a different library, e.g. py-amqp or rabbitpy, without changing the underlying code.
I was thinking of implementing something like this in the __init__.py file of servicelibrary.simple.
try:
    # import pika  # Is pika installed?
    from servicelibrary.simple.synchronous import Publisher
    from servicelibrary.simple.synchronous import Consumer
except ImportError:
    # import amqp  # Is amqp installed?
    from servicelibrary.simple.alternative import Publisher
    from servicelibrary.simple.alternative import Consumer
Then when the user imports the library
from servicelibrary.simple import Publisher
The underlying layer looks something like this
alternative.py
import amqp

class Publisher(object):
    ...

class Consumer(object):
    ...
synchronous.py
import pika

class Publisher(object):
    ...

class Consumer(object):
    ...
This would automatically pick the second one when the first one is not installed.
Is there a better way of implementing something like this? If anyone could link a library/adapter with a similar implementation that would be helpful as well.
[Edit]
What would be the cleanest way to implement something like this? In the future I would also like to be able to change the default preference. Ultimately I may just settle for using the library installed, as I can control that, but it would be a nice feature to have.
Alexander's suggestion is interesting, but I would like to know if there is a cleaner way.
[Edit2]
The original example was simplified. Each module may contain multiple types of imports, e.g. Consumer and Publisher.

importlib.import_module might do what you need:
import importlib

INSTALLED = ['synchronous', 'alternative']

for mod_name in INSTALLED:
    try:
        module = importlib.import_module('servicelibrary.simple.' + mod_name)
        Publisher = getattr(module, 'Publisher', None)
        if Publisher:
            break  # found what we needed
    except ImportError:
        continue
I guess this is not the most advanced technique, but the idea should be clear.
And you can take a look at the (now-deprecated) imp module as well.
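On Python 3.4+, a modern alternative to imp is importlib.util.find_spec, which checks whether a module can be imported without actually importing it. A minimal sketch (the module names in the list are just stand-ins for the demo):

```python
import importlib.util

def first_available(candidates):
    """Return the name of the first importable module, or None."""
    for name in candidates:
        # find_spec returns None when no top-level module by that name exists
        if importlib.util.find_spec(name) is not None:
            return name
    return None

print(first_available(['no_such_module_xyz', 'json']))  # -> json
```

This lets you pick the adapter before committing to the import, which keeps the "which backend won?" decision in one place.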

A flexible solution, using importlib. This is a complete, working solution that I've tested.
First, the header:
import importlib

parent = 'servicelib.simple'
modules = {'.synchronous': ['.alternative', '.alternative_2']}
success = False  # indicator; default is False,
                 # changed to True when the import succeeds.
We import the required module, set our indicator, and specify our modules. modules is a dictionary, with the key set as the default module, and the value as a list of alternatives.
Next, the import-ant part:
# Obtain the module
for default, alternatives in modules.items():
    try:  # we will try to import the default module first
        mod = importlib.import_module(parent + default)
        success = True
    except ImportError:  # the default module fails, try the alternatives
        for alt in alternatives:
            try:  # try the first alternative; if it still fails, try the next one
                mod = importlib.import_module(parent + alt)
                success = True
                break  # stop searching for alternatives!
            except ImportError:
                continue

print('Success:', success)
And to have the classes, simply do:
Publisher = mod.Publisher
Consumer = mod.Consumer
With this solution, you can have multiple alternatives at once. For example, you can use both rabbitpy and py-amqp as your alternatives.
Note: Works with both Python 2 and Python 3.
If you have more questions, feel free to comment and ask!
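To address the [Edit] about changing the default preference later: one hedged sketch is to let an environment variable override the search order. The variable name ADAPTER_ORDER and the module names are made up for this demo; the stdlib json module stands in for the one installed backend:

```python
import importlib
import os

# Hypothetical: an environment variable overrides the built-in preference order.
DEFAULT_ORDER = 'no_such_backend_xyz,json'  # stand-in names for the demo
order = os.environ.get('ADAPTER_ORDER', DEFAULT_ORDER).split(',')

mod = None
for name in order:
    try:
        mod = importlib.import_module(name)
        break  # first importable candidate wins
    except ImportError:
        continue

print(mod.__name__)  # -> json (the first candidate is missing)
```

The same loop works unchanged whether the order comes from an environment variable, a config file, or a hard-coded list.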

You've got the right idea. Your case works because each module exposes the same sort of classes, e.g. both APIs have a class called Publisher, so you can just make sure the correct version is imported.
If this isn't true (if possible implementations A and B are not similar), you write your own facade: a simple API of your own that then calls the real API with the correct methods/parameters for that library.
Obviously, switching between choices may require some overhead. (I don't know your case but, for instance, say you had two libraries for walking through an open file, and each library handles opening the file. You can't just switch to the second library in the middle of the file and expect it to start where the first library stopped.) But it's just a matter of saving the choices:
accessmethods = {}

try:
    from modA.modB import classX as apiA_classX
    from modA.modB import classY as apiA_classY
    accessmethods['apiA'] = [apiA_classX, apiA_classY]
    classX = apiA_classX
    classY = apiA_classY
except ImportError:
    pass

try:
    from modC.modD import classX as apiB_classX
    from modC.modD import classY as apiB_classY
    accessmethods['apiB'] = [apiB_classX, apiB_classY]
    classX = apiB_classX
    classY = apiB_classY
except ImportError:
    pass

def switchMethod(method):
    global classX
    global classY
    try:
        classX, classY = accessmethods[method]
    except KeyError:
        raise ValueError('Method %s not currently available' % method)
etc.
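A minimal sketch of such a facade, loosely modeled on a pika-style basic_publish versus a generic send (all class and method names here are illustrative, not the real pika or amqp APIs):

```python
class SimplePublisher(object):
    """Facade: one publish() call, whatever the backend looks like."""

    def __init__(self, backend):
        self._backend = backend  # any object; duck-typed below

    def publish(self, message):
        # Translate our simple API into whatever the backend expects.
        if hasattr(self._backend, 'basic_publish'):   # pika-like backend
            return self._backend.basic_publish(body=message)
        return self._backend.send(message)            # amqp-like backend

class FakeAmqpChannel(object):
    """Stand-in backend so the sketch runs without RabbitMQ."""
    def send(self, message):
        return ('sent', message)

p = SimplePublisher(FakeAmqpChannel())
print(p.publish('hello'))  # -> ('sent', 'hello')
```

The callers only ever see SimplePublisher.publish(); which real library does the work becomes an implementation detail.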

I know two methods: one is widely used and the other is my guesswork. You can choose whichever fits your situation.
The first one is widely used, e.g. in from tornado.concurrent import Future:
try:
    from concurrent import futures
except ImportError:
    futures = None

# define _DummyFuture ...

if futures is None:
    Future = _DummyFuture
else:
    Future = futures.Future
Then you can use from tornado.concurrent import Future in other files.
The second one is my guesswork; I wrote a simple demo, but I haven't used it in a production environment because I haven't needed it.
import sys

try:
    import servicelibrary.simple.synchronous
except ImportError:
    import servicelibrary.simple.alternative
    sys.modules['servicelibrary.simple.synchronous'] = servicelibrary.simple.alternative
Run this script before any other script imports servicelibrary.simple.synchronous. Then you can use it as before:
from servicelibrary.simple.synchronous import Publisher
from servicelibrary.simple.synchronous import Consumer
The only thing I wonder about is what the consequences of my guesswork are.
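The trick works because the import system consults sys.modules before searching the filesystem, so an alias planted there is found first. A self-contained demo using stdlib modules (the name fastjson is made up for the demo):

```python
import sys
import json

# Make the name "fastjson" resolve to the stdlib json module.
# Imports check sys.modules first, so no file lookup ever happens.
sys.modules['fastjson'] = json

import fastjson  # resolves to the alias planted above

print(fastjson is json)          # -> True
print(fastjson.dumps({'a': 1}))  # -> {"a": 1}
```

One consequence to be aware of: every consumer in the process now shares the substituted module, so the alias must be installed before the first real import happens anywhere.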

Based on the answers, I ended up with the following implementation for Python 2.7.
The examples are simplified for Stack Overflow.
from importlib import import_module

PARENT = 'myservicelib.rabbitmq'
MODULES = ['test_adapter',
           'test_two_adapter']
SUCCESS = False

for _module in MODULES:
    try:
        __module = import_module('{0}.{1}'.format(PARENT, _module))
        Consumer = getattr(__module, 'Consumer')
        Publisher = getattr(__module, 'Publisher')
        SUCCESS = True
        break
    except ImportError:
        pass

if not SUCCESS:
    raise NotImplementedError('no supported rabbitmq library installed.')
However, as some of my projects run Python 2.6, I had to either modify the code or vendor in importlib. The problem with a production platform is that it isn't always easy to add new dependencies.
This is the compromise I came up with, based on __import__ instead of importlib.
It might be worth checking whether sys.modules actually contains the namespace so a KeyError isn't raised, but that is unlikely.
import sys

PARENT = 'myservicelib.rabbitmq'
MODULES = ['test_adapter',
           'test_two_adapter']
SUCCESS = False

for _module in MODULES:
    try:
        __module_namespace = '{0}.{1}'.format(PARENT, _module)
        __import__(__module_namespace)
        __module = sys.modules[__module_namespace]
        Consumer = getattr(__module, 'Consumer')
        Publisher = getattr(__module, 'Publisher')
        SUCCESS = True
        break
    except ImportError:
        pass

if not SUCCESS:
    raise NotImplementedError('no supported rabbitmq library installed.')

Related

the pythonic way of optional imports [duplicate]

This question already has answers here:
What's Python good practice for importing and offering optional features?
(7 answers)
Closed last month.
I have a package that allows the user to use any one of 4 packages to connect to a database. It works great, but I'm unhappy with the way I'm importing things.
I could simply import all the packages, but I don't want to do that in case a given user never needs, say, turbodbc:
import pyodbc
import pymssql
import turbodbc
from ibmdbpy.base import IdaDataBase
Currently, I have the following situation: I try to import all of them, and the ones that don't import are no problem. My program simply assumes they will not be called, and if they are, it errors:
# some envs may not have all these packages installed so we try each:
try:
    import pyodbc
except:
    pass
try:
    import pymssql
except:
    pass
try:
    import turbodbc
except:
    pass
try:
    from ibmdbpy.base import IdaDataBase
except:
    pass
This doesn't feel pythonic. I know there are packages such as holoviews or tensorflow that allow you to specify a backend; they are of course orders of magnitude more complicated than mine, but they have to handle the same pattern.
How can I make this code right? It is technically buggy: if a user intends to use pyodbc but doesn't have it installed, my program will not warn them; it will error at runtime. So this goes beyond aesthetics or philosophy; this is technically error-prone code.
How would you handle this situation?
Fyi, here is an example of how the code is called:
connect('Playground', package='pymssql')
try:
    import pyodbc
except ImportError:
    pyodbc = None
then later:
if pyodbc is None and user_wants_to_use_pyodbc:
    print_warning()
    raise SomeConfigurationErrorOrSuch()
This approach works well for a small number of options. If you have enough options that you need to abstract out this approach, then you can use the importlib module to import modules under the control of your program.
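Applied to the asker's connect('Playground', package='pymssql') call, a hedged sketch of validating the requested package up front might look like this (the connect function, the SUPPORTED tuple, and the error messages are all made up for illustration, not the asker's real API):

```python
import importlib

# Hypothetical whitelist of backends this package knows how to drive.
SUPPORTED = ('pyodbc', 'pymssql', 'turbodbc', 'ibmdbpy')

def connect(db_name, package='pymssql'):
    if package not in SUPPORTED:
        raise ValueError('unsupported package: %r' % package)
    try:
        driver = importlib.import_module(package)
    except ImportError:
        # Fail at connect() time with a clear message, not deep in a query.
        raise ImportError('%s was requested but is not installed' % package)
    # ... open db_name using driver here ...
    return driver
```

This converts the silent "error at runtime" failure the asker complains about into an immediate, explicit error at the point where the user states their choice.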
I would use import_module from importlib:
from importlib import import_module

# Note: import_module takes module paths; ibmdbpy.base.IdaDataBase is a
# class, so import ibmdbpy.base and pull the class off the module afterwards.
modules_to_import = ['pyodbc', 'pymssql', 'turbodbc', 'ibmdbpy.base']

for m in modules_to_import:
    try:
        globals()[m.split('.')[-1]] = import_module(m)
    except ModuleNotFoundError:
        print('Module {} not found'.format(m))
I've used something similar to the answers above, but sometimes you might need to mock an object to fool lint.
try:
    from neuralprophet import NeuralProphet
    using_neuralprophet = True
except ImportError:
    class NeuralMock:
        whatever = False
    using_neuralprophet = False
    NeuralProphet = NeuralMock()
Source: timemachines
You can put imports in places other than the beginning of the file. "Re-importing" something doesn't actually do anything (the module is cached in sys.modules), so it's not computationally expensive to import x frequently:
def switch(x):
    if x == 'a':
        import json
        json.load(file)
    elif x == 'b':
        import pandas as pd
        pd.read_csv(file)
You can also use importlib to dynamically import modules. This is especially useful if you have multiple implementations of the same API that you want to choose between:
import importlib

class Connection:
    def __init__(self, driver_module, driver_name):
        # or driver_module, driver_name = full_path.rsplit('.', 1)
        self.driver = getattr(importlib.import_module(driver_module), driver_name)()

    def method(self):
        return self.driver.do()

How can I get a module from a URL and import it such that its problematic dependencies are ignored?

OK, this is extremely hacky and silly programming. I have a function that can import a module file from a URL. It works fine and is hardly secure. I want to import a module file that has a problematic dependency (the system I'm on can't support that particular dependency), but the functionality I want from that module doesn't rely on the problematic dependency, so it's fine if the dependency can be ignored.
My thinking is that I could use this smuggle function (shown below) to get the module file and then somehow import it using FuckIt.py, but I'm not sure how to make the two ideas work together.
How could this be done?
import imp
import urllib

def smuggle(
    module_name = None,
    URL         = None
):
    if module_name is None:
        module_name = URL
    try:
        module = __import__(module_name)
        return(module)
    except:
        try:
            module_string = urllib.urlopen(URL).read()
            module = imp.new_module("module")
            exec module_string in module.__dict__
            return(module)
        except:
            raise(
                Exception(
                    "module {module_name} import error".format(
                        module_name = module_name
                    )
                )
            )
            sys.exit()

damned_silly_module = smuggle(
    module_name = "damned_silly_module",
    URL         = "https://raw.githubusercontent.com/https://github.com/justsomefuckingguy/damned_silly_module/master/damned_silly_module.py"
)
damned_silly_module.some_function_or_other()
Putting aside FuckIt.py: if this is about a particular module with particular failing dependencies, the best way to get this to work is to make the import of the dependency succeed. Provide a mock sub-module with the same name, with stubs for whatever will be asked for. For example, if damned_silly_module tries to import silly_walks, which you don't have, make a mock silly_walks module and arrange for it to be found.
import sys
sys.path.insert(0, "path/to/mock/modules")
module = imp.new_module("module")
Or something like that. You could even catch ImportError and do this only if the module in question is absent. This is analogous to the Python 2 custom of importing, say, cPickle as pickle and falling back to import pickle if that is unavailable.
If you want this to work in general, with modules you'll see in the future, you'd need to catch ImportError, examine it to figure out what's missing, mock it on the fly and try again.
Incidentally, your exception handling needs some work. Never catch everything (except: with no arguments); catch ImportError (and perhaps NameError, if the import succeeds but a later name lookup fails). Never raise an undifferentiated Exception; raise ImportError. In this case, it may be better to re-raise the exception you just caught, with a simple
raise
And get rid of sys.exit(): it's dead code that will never be reached. (Also, raise is a statement, not a function, so you don't need parentheses around its argument.)
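A self-contained sketch of the mock-module idea above, using the silly_walks example (the module name and its assess function are of course made up, and types.ModuleType is used instead of a file on a mock path so the sketch runs anywhere):

```python
import sys
import types

# Build a stub for a dependency we know is absent.
stub = types.ModuleType('silly_walks')
stub.assess = lambda walk: 0   # stub out whatever will be asked for
sys.modules['silly_walks'] = stub

import silly_walks  # resolves to the stub planted in sys.modules

print(silly_walks.assess('john cleese'))  # -> 0
```

Once the stub is registered, the problematic module's own `import silly_walks` succeeds too, so the parts of it that never touch the real dependency become usable.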

Python: Define a function only if package exists

Is it possible to tell Python 2.7 to only parse a function definition if a package exists?
I have a script that is run on multiple machines. There are some functions defined in the script that are very nice to have but aren't required for its core operations. Some of the machines the script runs on don't have the package that the function imports (and the package can't be installed on them). Currently I have to comment out the function definition before cloning the repo onto those machines. Another solution would be to maintain two different branches, but that is even more tedious. Is there a solution that prevents us from having to constantly comment out code before pushing?
There are already solutions for when the function is called, such as this:
try:
    someFunction()
except NameError:
    print("someFunction() not found.")
Function definitions and imports are just code in Python, and like other code, you can wrap them in a try:
try:
    import bandana
except ImportError:
    pass  # Hat-wearing functions are optional
else:
    def wear(hat):
        bandana.check(hat)
        ...
This would define the wear function only if the bandana module is available.
Whether this is a good idea or not is up to you - I think it would be fine in your own scripts, but you might not want to do this in code other people will use. Another idea might be to do something like this:
def wear(hat):
    try:
        import bandana
    except ImportError:
        raise NotImplementedError("You need the bandana package to wear hats")
    else:
        bandana.check(hat)
        ...
This would make it clearer why you can't use the wear function.
A somewhat improved solution is as follows:
In file header:
try:
    # Optional dependency
    import psutil
except ImportError as e:
    psutil = e
Later in the beginning of your function or inside __init__ method:
if isinstance(psutil, ImportError):
    raise psutil
Pros: you get the original exception message when you access the optional functionality, just as if you had simply done import psutil.
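A runnable demo of the same pattern with a module that is guaranteed to be missing (the module name is a stand-in, and cpu_percent stands in for whatever psutil call the function would make):

```python
try:
    import not_a_real_module_xyz as psutil_like
except ImportError as e:
    psutil_like = e  # keep the original exception as a sentinel

def cpu_report():
    # Re-raise the *original* ImportError, message intact, only on use.
    if isinstance(psutil_like, ImportError):
        raise psutil_like
    return psutil_like.cpu_percent()

try:
    cpu_report()
except ImportError as e:
    print('deferred error:', e)
```

The import failure is deferred from load time to the first call, and the traceback still names the package that is actually missing.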

How should one write the import procedures in a module that uses imported modules in a limited way?

I have a module that features numerous functions. Some of these functions are dependent on other modules. The module is to be used in some environments that are not going to have these other modules. In these environments, I want the functionality of the module that is not dependent on these other unavailable modules to be usable. What is a reasonable way of coding this?
I am happy for an exception to be raised when a function's dependency module is not met, but I want the module to be usable for other functions that do not have dependency problems.
At present, I'm thinking of a module-level function something like the following:
def tryImport(moduleName = None):
    try:
        module = __import__(moduleName)
        return(module)
    except:
        raise(Exception("module {moduleName} import error".format(
            moduleName = moduleName)))
        sys.exit()
This would then be used within functions in a way such as the following:
def function1():
    pyfiglet = tryImport(moduleName = "pyfiglet")
For your use case, it sounds like there's nothing wrong with putting the imports you need inside functions, rather than at the top of your module (don't worry, it costs virtually nothing to reimport):
def some_function():
    import foo  # Throws if there is no foo
    return foo.bar()
Full stop, please.
The first thing is that the problem you described indicates that you should redesign your module. The most obvious solution is to break it into a couple of modules in one package, each containing a group of functions with common external dependencies. You can easily work out how many groups you need once you know which dependencies might be missing on the target machines. Then you can import them separately: in a given environment you import only the ones you need, and the problem no longer exists.
If you still believe it's not the case and insist on keeping all the functions in one module then you can always do something like:
if env_type == 'env1':
    import pyfiglet
if env_type in ('env1', 'env2'):
    import pynose
    import gzio
How you deduce the env_type is up to you; it might come from a configuration file or an environment variable.
Then you have your functions in this module. No problem occurs as long as no consumer of the module calls a function that uses a module unavailable in the given environment.
I don't see any point in throwing your custom exception: a NameError will be thrown anyway when you try to access a name that was not imported. By the way, your sys.exit() would never be executed anyway.
If you don't want to define environment types, you can still achieve the goal with the following code:
try: import pyfiglet
except ImportError: pass
try: import pynose
except ImportError: pass
try: import gzio
except ImportError: pass
Both code snippets are supposed to be used on module level and not inside functions.
TL;DR I would just break this module into several parts. But if you really have to keep it monolithic, just use the basic language features and don't over-engineer it by using __import__ and a dedicated function.

Load modules conditionally Python

I wrote a main Python module that needs to load a file parser to work. Initially there was only one text parser module, but I need to add more parsers for different cases.
parser_class1.py
parser_class2.py
parser_class3.py
Only one is required for every running instance, so I'm thinking of loading it via the command line:
mmain.py -p parser_class1
With this purpose in mind, I wrote this code to select the parser to load when the main module is called:
#!/usr/bin/env python
import argparse

aparser = argparse.ArgumentParser()
aparser.add_argument('-p',
                     action='store',
                     dest='module',
                     help='-p module to import')
results = aparser.parse_args()

if not results.module:
    aparser.error('Error! no module')

try:
    exec("import %s" % (results.module))
    print '%s imported done!' % (results.module)
except ImportError, e:
    print e
But I've read that this approach is dangerous and maybe not standard.
So, is this approach OK, or must I find another way to do it? Why?
Thanks, any comment are welcome.
You could actually just execute the import statement inside a conditional block:
if x:
    import module1a as module1
else:
    import module1b as module1
You can account for various whitelisted module imports in different ways using this, but effectively the idea is to pre-program the imports, and then essentially use a GOTO to make the proper imports... If you do want to just let the user import any arbitrary argument, then the __import__ function would be the way to go, rather than eval.
Update:
As #thedox mentioned in the comment, the as module1 section is the idiomatic way for loading similar APIs with different underlying code.
In the case where you intend to do completely different things with entirely different APIs, that's not the pattern to follow.
A more reasonable pattern in this case would be to include the code related to a particular import with that import statement:
if ...:
    import module1
    # do some stuff with module1 ...
else:
    import module2
    # do some stuff with module2 ...
As for security, if you allow the user to cause an import of some arbitrary code-set (e.g. their own module, perhaps?), it's not much different than using eval on user-input. It's essentially the same vulnerability: the user can get your program to execute their own code.
I don't think there's a truly safe manner to let the user import arbitrary modules, at all. The exception here is if they have no access to the file-system, and therefore cannot create new code to be imported, in which case you're basically back to the whitelist case, and may as well implement an explicit whitelist to prevent future-vulnerabilities if/when at some point in the future the user does gain file-system access.
Here is how to use __import__():
allowed_modules = ['os', 're', 'your_module', 'parser_class1', 'parser_class2']

if not results.module:
    aparser.error('Error! no module')

try:
    if results.module in allowed_modules:
        module = __import__(results.module)
        print '%s imported as "module"' % (results.module)
    else:
        print 'hey what are you trying to do?'
except ImportError, e:
    print e

module.your_function(your_data)
Note that __import__ takes module names, not filenames, so the whitelist entries must not include the .py extension.
EVAL vs __IMPORT__()
Using exec or eval allows the user to run any code on your computer. Don't do that. __import__() only allows the user to load modules, apparently without letting them run arbitrary code; but it is only apparently safer.
The proposed function without allowed_modules is still risky, since it can load an arbitrary module that may run malicious code when it is loaded. Potentially, an attacker can place a file somewhere (a shared folder, an FTP folder, an upload folder managed by your web server, ...) and load it via your argument.
WHITELISTS
Using allowed_modules mitigates the problem but does not solve it completely. To harden things further, you still have to check whether an attacker has written an "os.py", "re.py", "your_module.py", or "parser_class1.py" into your script folder, since Python searches for modules there first (docs).
Eventually you may compare the parser_class*.py code against a list of hashes, as sha1sum does.
FINAL REMARKS: In the end, if the user has write access to your script folder, you cannot guarantee absolutely safe code.
You should think of all the possible modules you may import for that parsing function and then use a mapping (a dictionary) to load the correct one. For example:
import parser_class1, parser_class2, parser_class3

parser_map = {
    'class1': parser_class1,
    'class2': parser_class2,
    'class3': parser_class3,
}

if not args.module:
    # report error
    parser = None
else:
    parser = parser_map[args.module]

# perform work with parser
If loading any of the parser_classN modules in this example is expensive, you can define lambdas or functions that return the module (i.e. def get_class1(): import parser_class1; return parser_class1) and change the line to parser = parser_map[args.module]().
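The lazy variant described above can be sketched like this, with stdlib modules standing in for the parser_classN modules:

```python
import importlib

# Lazy variant: map option names to loader callables;
# nothing is imported until an option is actually chosen.
parser_map = {
    'json': lambda: importlib.import_module('json'),
    'csv':  lambda: importlib.import_module('csv'),
}

parser = parser_map['json']()    # the import happens only here
print(parser.loads('{"a": 1}'))  # -> {'a': 1}
```

Because only the whitelisted keys can ever be loaded, this also sidesteps the exec/__import__ safety concerns raised in the other answers.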
The exec option can be very dangerous because you're executing unvalidated user input. Imagine if your user did something like:
mmain.py -p "parser_class1; some_function_or_code_that_is_malicious()"
