How to provide an API to extend a Python program with plugins? - python

I was wondering how I can provide an API for my Python program to enable others to extend it with plugins.
I thought about something like from myProgram.plugins import aClassToExtendByOthers, registerThatClass.
But I have no idea how to provide this.
I could use an exec statement within my loadPlugins function for every plugin in the plugins folder, but this would not allow importing the things I would like to provide for people writing those plugins.

For a system that I use in several of my programs, I define a directory of plugins and provide a base plugin class for all plugins to subclass. I then import all modules in the directory, selectively initialize them (by checking whether they subclass my base plugin class), and store the plugin instances in a dictionary (or list). I have found that the command dispatch pattern works well as a way to structure the plugins and pass events. The base plugin class (or another optional interface class) can provide the methods the plugins need to interact with the application. I hope this helps. It may not be the best way to do it, but it has worked for me.
You could also do additional filtering, such as requiring plugin files to have a prefix (e.g. plug_file.py and plug_other.py).
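A minimal sketch of that pattern (the names PluginBase, load_plugins, and the plug_ prefix are illustrative, not from an actual project):

```python
import importlib
import inspect
import os
import sys

class PluginBase:
    """Base class that every plugin must subclass."""
    name = "base"

    def handle(self, command, *args):
        raise NotImplementedError

def load_plugins(plugins_dir, prefix="plug_"):
    """Import every prefixed .py file in plugins_dir and instantiate
    each PluginBase subclass found inside, keyed by its name."""
    plugins = {}
    sys.path.insert(0, plugins_dir)
    importlib.invalidate_caches()
    try:
        for fname in sorted(os.listdir(plugins_dir)):
            if not (fname.startswith(prefix) and fname.endswith(".py")):
                continue
            module = importlib.import_module(fname[:-3])
            # Keep only classes that subclass the base, excluding the base itself.
            for _, obj in inspect.getmembers(module, inspect.isclass):
                if issubclass(obj, PluginBase) and obj is not PluginBase:
                    plugins[obj.name] = obj()
    finally:
        sys.path.remove(plugins_dir)
    return plugins
```

The dictionary of instances can then back a command dispatcher: look up the plugin by name and call its handle method.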

You can use the imp module (see docs.python.org); note that it is deprecated in Python 3 in favour of importlib:

import glob
import imp
import os
import sys

sys.path.insert(0, pluginsDir)
# Collect the module name of every .py file in the plugins directory.
names = [os.path.splitext(os.path.basename(p))[0]
         for p in glob.glob(os.path.join(pluginsDir, "*.py"))]
for name in names:
    try:
        f, fn, d = imp.find_module(name, [pluginsDir])
    except ImportError:
        continue
    try:
        loaded = imp.load_module(name, f, fn, d)
    finally:
        if f:
            f.close()
For a fully functional example, see the loader of the Ojuba Control Center:
http://git.ojuba.org/cgit/occ/tree/OjubaControlCenter/loader.py
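On Python 3, where imp is deprecated (and removed in 3.12), the same idea can be sketched with importlib.util, without touching sys.path at all; the function name here is illustrative:

```python
import glob
import importlib.util
import os

def load_plugin_modules(plugins_dir):
    """Load every .py file in plugins_dir as a module object,
    without modifying sys.path; return them keyed by module name."""
    modules = {}
    for path in sorted(glob.glob(os.path.join(plugins_dir, "*.py"))):
        name = os.path.splitext(os.path.basename(path))[0]
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the plugin's top-level code
        modules[name] = module
    return modules
```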

Related

Python: Determine which package contains the currently running code (in order to retrieve package resources)?

Learning about Python package resources. I understand I can include arbitrary files and directories in my package using include_package_data.
Let's say I write a package that contains a single class:
(mypackage/__init__.py)

from importlib_resources import files

class myClass:
    def getResource(self, filename):
        return files('mypackage').joinpath(filename).read_text()
Now I package up this code into an installable Python package using a setup.py file, and install it.
Now, another developer comes along and wants to subclass my class. That user is also going to package their code in a package and release it, with a dependency being my package. And they want the getResource function to pull resources from their package, not mine:
(otherpackage/__init__.py)

from importlib_resources import files
from mypackage import myClass

class myNewClass(myClass):
    def getResource(self, filename):
        return files('otherpackage').joinpath(filename).read_text()
This is hardly ideal, since the other developer must override the getResource method, copy my code, and change only the package name. (Edit: this could be accomplished with super and passing in the package name, but it's still an extra step and could be argued to violate the DRY principle.)
The question is: Is there a way to get the name of the Python package that the currently executing code is part of? (Or perhaps a better approach would be determining which package a class belongs to) That way, my package could simply discover that it's now running inside a subclass of otherpackage, and get the resources from otherpackage appropriately. (I could code a fallback, so that if the user doesn't include a resource in their package, I can retrieve it from a default resource in mypackage).
In this example, I want code running in an instance of myClass to get mypackage, but code running in an instance of myNewClass to get otherpackage, so I can pass this as the parameter to files(). I'd expect that if the code were not running in a package, I could simply get None back, and handle that situation using a fallback utilizing __file__.
Put yet another way, I wish for a function such that if the developer directly instantiates myClass and the function is called in the class's code, it should pull resources from mypackage; if the developer instantiates myNewClass it should use that developer's package. If the developer instantiated myNewClass outside of a package, I should be able to detect that and fallback on locating resources by way of path composition with __file__. I would like this to happen without the other developer ever needing to explicitly provide the name of their package to my code - the DRY principle applies here.
Secondary question: if this second developer sets up their package in dev/link mode (using -e on pip install), will this work while they are developing the package? In other words, can the other developer use the importlib.resources library to access the files in their dev tree from their code transparently?
Naturally, I would document how developers who subclass the API must deal with package resources (e.g. where to put files), so they would simply follow any directory names or other conventions I built into myClass's expectations.
In order to find the module where a certain class was defined, you can use inspect.getmodule:

from importlib.resources import read_text
from inspect import getmodule

class myClass:
    def getResource(self, filename):
        return read_text(getmodule(self).__name__, filename)
If you want to take into account where a certain class gets instantiated, you'd have to rely on frame inspection, e.g. via inspect.stack:
from inspect import stack

class myClass:
    def getResource(self, filename):
        caller_filename = stack()[1].filename
        ...  # do something based on `caller_filename`
        return read_text(getmodule(self).__name__, filename)
However, your problem description sounds like namespace packages would be useful (see also PEP 420), where developers can just drop their own stuff into the namespace package. That way only your distribution would contain the logic for picking up the various resources.

Is it possible to overwrite a Python built-in for the whole Python package?

I have a Python package with dozens of subpackages/modules. Nearly every module uses the built-in open function. I have written my own implementation of the file-opening function and would like to "redirect" all open calls in the package's modules to my my_open function.
I am aware that it is possible to write open = my_open_file at the top of a module to shadow open within that module, but this would mean editing each module. Alternatively, I could put open = my_open_file in the __init__.py of the package and then do from package_name import open, which also means adding a single line of code to each module.
Is it possible to define the my_open_file function for the package scope just in a single place? Or adding a single line of code in each module is my only option?
Think about what you're asking to do with this module: you've written your own package, a personal change to the way a built-in function is linked. You want to redefine a standard language feature. That's okay ... part of the power of a language is the capability to do things the way you define.
The problem comes when you want to do this by default, overriding that standard capability for code that hasn't explicitly requested that change. You cannot do this without some high-powered authority. The most straightforward way is to rebuild your Python system from sources, replacing the standard open with your own.
In reality, the "normal" way is to have each of your application modules "opt-in", declaring explicitly that it wants to use the new open instead of the one defined by the language. This is the approach you've outlined in your second paragraph. How is this a problem for you? Inserting a line in each of a list of parameterized files is a single system command.
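The opt-in approach can be sketched like this (the module name fileutils and the logging behaviour are illustrative): the package defines its replacement once, and each module that wants it shadows the built-in with one import line.

```python
# mypackage/fileutils.py (hypothetical module): define the replacement once.
import builtins

def open(path, mode="r", *args, **kwargs):
    """Drop-in replacement for the built-in open that also logs the call."""
    print("opening %r in mode %r" % (path, mode))
    return builtins.open(path, mode, *args, **kwargs)

# Each module that opts in then needs only a single line:
# from mypackage.fileutils import open
```

Because the replacement delegates to builtins.open, code that has not opted in is completely unaffected.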

Robot Framework Library Dynamic Import Not Remaining Global

Some background
I'm using Robot Framework with Python to create a small framework for test automation. I have a few different libraries; a couple are application-specific and one has keywords that I would like to be always available. This always-available library is my common library and I need it to be accessible from the functions in my other libraries as well.
The way I have accomplished this until now has been some boilerplate at the top of my other libraries. Specifically, in my other libraries I have:
try:
self.common_library = BuiltIn().get_library_instance("my_common_lib")
except RuntimeError:
BuiltIn().import_library("my_common_lib", True)
self.common_library = BuiltIn().get_library_instance("my_common_lib")
This code checks the current Robot context for the existence of the common library and gets a reference to it, importing the library first if necessary. This means that I have a reference to the common library within all my other libraries and means that whenever I import any of my libraries in the Robot settings table, I have access to the common library keywords too.
The issue is that when running multiple Robot tests in sequence, the common library seems to disappear. I have a few Robot scripts in a directory and run "robot *.robot". In each test, I run a keyword from the common library. I never import the common library in the settings table, as it should be automatically imported by the other libraries as outlined above. In the first test, the common library exists and the keywords in it work fine. In all the following tests, I am getting a keyword not found error. When I print the results of BuiltIn().get_library_instance(all=True) I can see that, while my application-specific library is still loaded, the common library no longer exists.
The Problem
All my libraries have ROBOT_LIBRARY_SCOPE = 'GLOBAL' in them, including the common library. My common library is dynamically imported via BuiltIn and has a global scope defined, but it seems to fall out of scope when running subsequent tests in one command. Is there a reason that dynamically imported libraries fall out of scope even if they have a global library scope?
Essentially, I want this common library to always be available in my robot scripts and have each of my custom libraries maintain a reference to the common library. If there is a better way to accomplish this or some way to make what I'm currently doing work, please let me know! Thanks.
A solution may be to unconditionally import the common library in all your custom libraries. E.g. in their constructors (__init__()) call this:
BuiltIn().import_library("my_common_lib", True)
Thus you will always have its keywords in scope. Naturally, if that common lib does a step that must be run only once (one that affects some resource, for example), that has to be accommodated for in it (using a singleton pattern, or something similar).
Edit: come to think of it, this might also not work: __init__() will be called just once, since the libraries have global scope, and thus the common library's keywords will once again not be imported into the suite's namespace.
Enter RF's listener interface :) : in your custom libraries define the listener method start_suite() and move the try-except block into it. At the start of every suite that uses such a library, the method will be executed and the common keywords will be available.
The same precaution as two paragraphs above - make sure reimporting the common library has no side effects.
Another solution may be to change the custom libraries' scope to 'TEST SUITE', as you have already deduced yourself (and are reluctant to do, based on the comments :).
Thus the custom libraries will be reinstantiated on every import in the suites, and they will import the common library into the suite's namespace.

How to consume/integrate third-party modules in Python

I am new to Python and recently started writing a script which essentially reads a MySQL database and archives some files by uploading them to Amazon Glacier. I am using the Amazon-provided boto module along with a few other modules.
I noticed that I seem to be replicating the same pattern over and over again when installing and taking advantage of these modules that connect to external services. First, I write a wrapper module which reads my global config values and then defines a connection function, then I start writing functions in that module which perform the various tasks. For example, at the moment, my boto wrapper module is named awsbox and it consists of functions like getConnection and glacierUpload. Here's a brief example:
import config, sys, os
import boto, uuid

_awsConfig = config.get()['aws']

def getGlacierConnection():
    return boto.connect_glacier(aws_access_key_id=_awsConfig['access_key_id'],
                                aws_secret_access_key=_awsConfig['secret_access_key'])

def glacierUpload(filePath):
    if not os.path.isfile(filePath):
        return False
    awsConnect = getGlacierConnection()
    vault = awsConnect.get_vault(_awsConfig['vault'])
    vault.upload_archive(filePath)
    return True
My question is, should I be writing these "wrapper" modules? Is this the Pythonic way to consume these third-party modules? This method makes sense to me but I wonder if creating these interfaces makes my code less portable or modular, or whether or not there is a better way to integrate these various disparate modules into my main script structure.
You're using the modules as intended. You import them and then use them. As I see it, awsbox is the module that holds the implementation of the functions that match your needs.
So, answering your questions:
Should I be writing these "wrapper" modules? Yes (though you can stop calling them "wrappers"); an error would be to rewrite those installed modules.
Is this the Pythonic way to consume these third-party modules? It is the Python way. Authors write modules for you to use (import).

How do I use multiple .mo files simultaneously for gettext translation?

In short, can I use many .mo files for the same language at the same time in Python?
In my python application, I need to use gettext for I18N. This app uses kind of a plug-in system. This means you can download a plug-in and put it in the appropriate directory, and it runs like any other python package. The main application stores the .mo file it uses in let's say ./locale/en/LC_MESSAGES/main.mo. And the plug-in nr 1 has its own .mo file called plugin1.mo in that same directory.
I would use this to load the main.mo I18N messages:
gettext.install('main', './locale', unicode=False)
How can I install the others too, so that all the plug-ins are translated the way they should be?
The solutions I thought of:
Should I gettext.install() in each package's namespace? But this would override the _() defined previously and mess the future translations of the main application.
Is there a way to combine two .mo files in one (when a new plug-in is installed for example)?
At runtime can I combine them into one GNUTranslation object? Or override the default _() method that is added to the global namespace? Then, how would I go with that option?
Instead of _('Hello World'), I would use _('plugin1', 'Hello World in plug-in 1')
Note: The application is not supposed to be aware of all the plug-ins to be installed, so it cannot already have all the messages translated in its main.mo file.
gettext.install() installs the one and only, unalterable _ into the builtins (module __builtin__ in Python 2, builtins in Python 3) - app-global. This way there is no flexibility.
Note: Python name resolution order is: locals > module-globals > builtins .
So in any case gettext.translation() (the class-based API) or even gettext.GNUTranslations() (e.g. for custom .mo path schemes) would be used explicitly for having multiple translations separately or mixed at the same time.
Some options:
Via t = gettext.translation(...); _ = t.ugettext you could simply put separate translations as _ into each module-global namespace - probably the more automated way in the real world. Perhaps the main translation could still go into the builtins too (via main_t.install()).
When a mix of all/many translations is OK or is what you want, you could chain several translations globally via t.install(); t.add_fallback(t_plugin1); t.add_fallback(t_plugin2); ... - and preserve an app-global approach otherwise.
gettext keywords other than _ could be used - and could be fed to the extraction tool via the xgettext -k other_keyword option. But I'd dislike lengthy and module-unique names.
( Yet personally I generally prefer the keyword I over _, and I also enable an operator scheme like I % "some text" instead of _("some text") or I("some text") - effectively via I = t; t.__call__ = t.__mod__ = t.ugettext, plus a small patch of pygettext.py. This pattern is more pleasant to type, looks more readable and Pythonic to me, and avoids the crucial/ugly name collision of _ in Python with the interactive last-result anaphor (see sys.displayhook) when using gettext'ed modules at interactive Python prompts. _ is also "reserved" as my preferred (local) placeholder for unused values in expressions like _, _, x, y, z, _ = some_tuple.
After all gettext is a rather simple module & mechanism, and things are easily customizable in Python.)
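The fallback-chaining option above can be sketched as follows, using the Python 3 API (where gettext replaces ugettext); the domain names are illustrative, and fallback=True is used so the snippet also runs when no compiled .mo catalogs exist on disk:

```python
import gettext

# fallback=True yields NullTranslations when a .mo file is missing,
# so this sketch runs even without compiled catalogs on disk.
main_t = gettext.translation("main", localedir="./locale", fallback=True)
plugin1_t = gettext.translation("plugin1", localedir="./locale", fallback=True)
plugin2_t = gettext.translation("plugin2", localedir="./locale", fallback=True)

# Chain the plugin catalogs behind the main one: a msgid missing from
# main.mo is then looked up in plugin1.mo, then plugin2.mo.
main_t.add_fallback(plugin1_t)
main_t.add_fallback(plugin2_t)

# Install _ app-globally, just as gettext.install('main', ...) would.
main_t.install()
```

With this setup a newly installed plugin only has to get its catalog registered as one more fallback; the main application's _ keeps working unchanged.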
You should use different domains for each plugin. The domain can be package name to prevent conflicts.
I do not understand why you need to translate something outside the plugin by using plugin's domain, but if you really need to, then you should disambiguate the domain each time.
Each plugin can provide its own "underscore", readily bound to the plugin's domain:
from my.plugin import MessageFactory as _my_plugin
Please note that the underscore is only a convention, so that the extraction tools can find i18n-enabled messages in the program. Plugins' messages should be marked with the underscore in their respective packages (you do put them into separate packages, right?). In all other places, you are free to call these factories by some other name and leave the underscore for the main program's translation domain.
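A sketch of such a per-plugin message factory (the my.plugin package and its domain name are illustrative):

```python
# my/plugin/__init__.py (hypothetical): the plugin binds its own
# translation function to its own domain, once, at import time.
import gettext

_t = gettext.translation("my.plugin", localedir="./locale", fallback=True)
MessageFactory = _t.gettext

# A consumer imports it under a local alias, keeping _ free for the
# main program's own domain:
# from my.plugin import MessageFactory as _my_plugin
```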
I am less sure about .mo-files, but you can surely compile all your .po files into one .mo file. However, if plugins are written by independent uncooperative authors, there could be msgid conflicts.
UPDATE:
If plugins are in the same package as the main app, then there is no sense in using individual translation domains for them (this is not your case). If plugins are in separate packages, then extraction should be done independently for those packages. In both cases you have no problem with the variable _. If for some reason the main app wants plugins' translations in its code, use some other name for _, as in the answer. Of course, extraction tools will not identify anything but the underscore.
In other words, plugins should care about their translations on their own. The main app could use the plugin-specific translation function as part of the plug-in API. Extraction or manual addition of strings into po/mo files is also not a concern for the main app: it's up to the plugin author to provide translations.
