Factory methods vs inject framework in Python - what is cleaner? - python

What I usually do in my applications is that I create all my services/dao/repo/clients using factory methods
class Service:
def init(self, db):
self._db = db
#classmethod
def from_env(cls):
return cls(db=PostgresDatabase.from_env())
And when I create app I do
service = Service.from_env()
what creates all dependencies
and in tests when I dont want to use real db I just do DI
service = Service(db=InMemoryDatabse())
I suppose that is quite far from clean/hex architecture since Service knows how to create a Database and knows which database
type it creates (could be also InMemoryDatabse or MongoDatabase)
I guess that in clean/hex architecture I would have
class DatabaseInterface(ABC):
#abstractmethod
def get_user(self, user_id: int) -> User:
pass
import inject
class Service:
#inject.autoparams()
def __init__(self, db: DatabaseInterface):
self._db = db
And I would set up injector framework to do
# in app
inject.clear_and_configure(lambda binder: binder
.bind(DatabaseInterface, PostgresDatabase()))
# in test
inject.clear_and_configure(lambda binder: binder
.bind(DatabaseInterface, InMemoryDatabse()))
And my questions are:
Is my way really bad? Is it not a clean architecture anymore?
What are the benefits of using inject?
Is it worth to bother and use inject framework?
Are there any other better ways of separating the domain from the outside?

There are several main goals in Dependency Injection technique, including (but not limited to):
Lowering coupling between parts of your system. This way you can change each part with less effort. See "High cohesion, low coupling"
To enforce stricter rules about responsibilities. One entity must do only one thing on its level of abstraction. Other entities must be defined as dependencies to this one. See "IoC"
Better testing experience. Explicit dependencies allow you to stub different parts of your system with some primitive test behaviour that has the same public API than your production code. See "Mocks arent' stubs"
The other thing to keep in mind is that we usually shall rely on abstractions, not implementations. I see a lot of people who use DI to inject only particular implementation. There's a big difference.
Because when you inject and rely on an implementation, there's no difference in what method we use to create objects. It just does not matter. For example, if you inject requests without proper abstractions you would still require anything similar with the same methods, signatures, and return types. You would not be able to replace this implementation at all. But, when you inject fetch_order(order: OrderID) -> Order it means that anything can be inside. requests, database, whatever.
To sum things up:
What are the benefits of using inject?
The main benefit is that you don't have to assemble your dependencies manually. However, this comes with a huge cost: you are using complex, even magical, tools to solve problems. One day or another complexity will fight you back.
Is it worth to bother and use inject framework?
One more thing about inject framework in particular. I don't like when objects where I inject something knows about it. It is an implementation detail!
How in a world Postcard domain model, for example, knows this thing?
I would recommend to use punq for simple cases and dependencies for complex ones.
inject also does not enforce a clean separation of "dependencies" and object properties. As it was said, one of the main goal of DI is to enforce stricter responsibilities.
In contrast, let me show how punq works:
from typing_extensions import final
from attr import dataclass
# Note, we import protocols, not implementations:
from project.postcards.repository.protocols import PostcardsForToday
from project.postcards.services.protocols import (
SendPostcardsByEmail,
CountPostcardsInAnalytics,
)
#final
#dataclass(frozen=True, slots=True)
class SendTodaysPostcardsUsecase(object):
_repository: PostcardsForToday
_email: SendPostcardsByEmail
_analytics: CountPostcardInAnalytics
def __call__(self, today: datetime) -> None:
postcards = self._repository(today)
self._email(postcards)
self._analytics(postcards)
See? We even don't have a constructor. We declaratively define our dependencies and punq will automatically inject them. And we do not define any specific implementations. Only protocols to follow. This style is called "functional objects" or SRP-styled classes.
Then we define the punq container itself:
# project/implemented.py
import punq
container = punq.Container()
# Low level dependencies:
container.register(Postgres)
container.register(SendGrid)
container.register(GoogleAnalytics)
# Intermediate dependencies:
container.register(PostcardsForToday)
container.register(SendPostcardsByEmail)
container.register(CountPostcardInAnalytics)
# End dependencies:
container.register(SendTodaysPostcardsUsecase)
And use it:
from project.implemented import container
send_postcards = container.resolve(SendTodaysPostcardsUsecase)
send_postcards(datetime.now())
See? Now our classes have no idea who and how creates them. No decorators, no special values.
Read more about SRP-styled classes here:
Enforcing Single Responsibility Principle in Python
Are there any other better ways of separating the domain from the outside?
You can use functional programming concepts instead of imperative ones. The main idea of function dependency injection is that you don't call things that relies on context you don't have. You schedule these calls for later, when the context is present. Here's how you can illustrate dependency injection with just simple functions:
from django.conf import settings
from django.http import HttpRequest, HttpResponse
from words_app.logic import calculate_points
def view(request: HttpRequest) -> HttpResponse:
user_word: str = request.POST['word'] # just an example
points = calculate_points(user_words)(settings) # passing the dependencies and calling
... # later you show the result to user somehow
# Somewhere in your `word_app/logic.py`:
from typing import Callable
from typing_extensions import Protocol
class _Deps(Protocol): # we rely on abstractions, not direct values or types
WORD_THRESHOLD: int
def calculate_points(word: str) -> Callable[[_Deps], int]:
guessed_letters_count = len([letter for letter in word if letter != '.'])
return _award_points_for_letters(guessed_letters_count)
def _award_points_for_letters(guessed: int) -> Callable[[_Deps], int]:
def factory(deps: _Deps):
return 0 if guessed < deps.WORD_THRESHOLD else guessed
return factory
The only problem with this pattern is that _award_points_for_letters will be hard to compose.
That's why we made a special wrapper to help the composition (it is a part of the returns:
import random
from typing_extensions import Protocol
from returns.context import RequiresContext
class _Deps(Protocol): # we rely on abstractions, not direct values or types
WORD_THRESHOLD: int
def calculate_points(word: str) -> RequiresContext[_Deps, int]:
guessed_letters_count = len([letter for letter in word if letter != '.'])
awarded_points = _award_points_for_letters(guessed_letters_count)
return awarded_points.map(_maybe_add_extra_holiday_point) # it has special methods!
def _award_points_for_letters(guessed: int) -> RequiresContext[_Deps, int]:
def factory(deps: _Deps):
return 0 if guessed < deps.WORD_THRESHOLD else guessed
return RequiresContext(factory) # here, we added `RequiresContext` wrapper
def _maybe_add_extra_holiday_point(awarded_points: int) -> int:
return awarded_points + 1 if random.choice([True, False]) else awarded_points
For example, RequiresContext has special .map method to compose itself with a pure function. And that's it. As a result you have just simple functions and composition helpers with simple API. No magic, no extra complexity. And as a bonus everything is properly typed and compatible with mypy.
Read more about this approach here:
Typed functional dependency injection
returns docs

The initial example is pretty close to a "proper" clean/hex. What's missing is the idea of a Composition Root, and you can do clean/hex without any injector framework. Without it, you'd do something like:
class Service:
def __init__(self, db):
self._db = db
# In your app entry point:
service = Service(PostGresDb(config.host, config.port, config.dbname))
which goes by Pure/Vanilla/Poor Man's DI, depending on who you talk to. An abstract interface is not absolutely necessary, since you can rely on duck-typing or structural typing.
Whether or not you want to use a DI framework is a matter of opinion and taste, but there are other simpler alternatives to inject like punq that you could consider, if you choose to go down that path.
https://www.cosmicpython.com/ is a good resource that looks at these issues in depth.

you may want to use a different database and you want to have the flexibility to do it in a simple way, for this reason, I consider dependency injection a better way to configure your service

Related

How can I specify a generic MutableSet, that demands existence of a update method, in a typed signature (Python >3.9)?

I have written a library. Some of its functions and methods operate on sets of Hashables, e.g.:
def some_function(my_set: set[Hashable]) -> None:
...
my_set.update(...)
...
How can I define something like an UpdatableSet (and use it instead of "set" in the signature of some_function), that demands existence of an update method, but allows for using some other class (from an external library) than set, that provides all necessary methods, in function calls?
def some_function(my_set: UpdatableSet[Hashable]) -> None:
...
my_set.update(...)
...
from intbitset import intbitset # see PyPI
some_set = intbitset(rhs=100)
some_function(some_set)
MutableSet[Hashable] is not enough, since it does not guarantee that there is an update method.
I use MyPy for type checking.
I thought of something like the following, but the register method is not found. And I do not know, if this is the right approach. Maybe defining some generic protocol would be the right way.
class UpdatableSet(MutableSet[_T], Generic[_T], ABC):
def update(self, other) -> None:
pass
UpdatableSet.register(set)
UpdatableSet.register(intbitset)
The comment of #SUTerliakov answers the question, and I was able to solve the problem this way:
The proper and type-safe solution would be generic Protocol[T] that defines all
methods of set you need. If there's only 5-6 methods, it's also convenient enough. –
SUTerliakov
Apr 12 at 19:34

Is it bad practice to modify attributes of one module from another module?

I want to define a bunch of config variables that can be imported in all the modules in my project. The values of those variables will be constant during runtime but are not known before runtime; they depend on the input. Usually I'd define a dict in my top module which would be passed to all functions and classes from other modules; however, I was thinking it may be cleaner to simply create a blank config.py module which would be dynamically filled with config variables by the top module:
# top.py
import config
config.x = x
# config.py
x = None
# other.py
import config
print(config.x)
I like this approach because I don't have to save the parameters as attributes of classes in my other modules; which makes sense to me because parameters do not describe classes themselves.
This works but is it considered bad practice?
The question as such may be disputed. But I would generally say yes, it's "bad practice" because scope and impact of change is really getting blurred. Note the use case you're describing really is not about sharing configuration, but about different parts of the program functions, objects, modules exchanging data and as such it's a bit of a variation on (meta)global variable).
Reading common configuration values could be fine, but changing them along the way... you may lose track of what happened where and also in which order as modules get imported / values get modified. For instance assume the config.py and two modules m1.py:
import config
print(config.x)
config.x=1
and m2.py:
import config
print(config.x)
config.x=2
and a main.py that just does:
import m1
import m2
import config
print(config.x)
or:
import m2
import m1
import config
print(config.x)
The state in which you find config in each module and really any other (incl. main.py here) depends on order in which imports have occurred and who assigned what value when. Even for a program entirely under your control, this may get confusing (and source of mistakes) rather quickly.
For runtime data and passing information between objects and modules (and your example is really that and not configuration that is predefined and shared between modules) I would suggest you look into describing the information perhaps in a custom state (config) object and pass it around through appropriate interface. But really just a function / method argument may be all that is needed. The exact form depends on what exactly you're trying to achieve and what your overall design is.
In your example, other.py behaves differently when called or imported before top.py which may still seem obvious and manageable in a minimal example, but really is not a very sound design. Anyone reading the code (incl. future you) should be able to follow its logic and this IMO breaks its flow.
The most trivial (and procedural) example of what for what you've described and now I hopefully have a better grasp of would be other.py recreating your current behavior:
def do_stuff(value):
print(value) # We did something useful here
if __name__ == "__main__":
do_stuff(None) # Could also use config with defaults
And your top.py presumably being the entry point and orchestrating importing and execution doing:
import other
x = get_the_value()
other.do_stuff(x)
You can of course introduce an interface to configure do_stuff perhaps a dict or a custom class even with default implementation in config.py:
class Params:
def __init__(self, x=None):
self.x = x
and your other.py:
def do_stuff(params=config.Params()):
print(params.x) # We did something useful here
And on your top.py you can use:
params = config.Params(get_the_value())
other.do_stuff(params)
But you could also have any use case specific source of value(s):
class TopParams:
def __init__(self, url):
self.x = get_value_from_url(url)
params = TopParams("https://example.com/value-source")
other.do_stuff(params)
x could even be a property which you retrieve every time you access it... or lazily when needed and then cached... Again, it really then is a matter of what you need to do.
"Is it bad practice to modify attributes of one module from another module?"
that it is considered as bad practice - violation of the law of demeter, which means in fact "talk to friends, not to strangers".
Objects should expose behaviour and functions, but should HIDE the data.
DataStructures should EXPOSE data, but should not have any methods (which are exposed). The law of demeter does not apply to such DataStructures. OOP Purists might cover such DataStructures with setters and getters, but it really adds no value in Python.
there is a lot of literature about that like : https://en.wikipedia.org/wiki/Law_of_Demeter
and of course, a must to read: "Clean Code", by Robert C. Martin (Uncle Bob), check it out on Youtube also.
For procedural programming it is perfectly normal to keep data in a DataStructure which does not have any (exposed) methods.
The procedures in the program work with that data. Consider to use the module attrs, see : https://www.attrs.org/en/stable/ for easy creation of such classes.
my prefered method for keeping config is (here without using attrs):
# conf_xy.py
"""
config is code - so why use damned parsers, textfiles, xml, yaml, toml and all that
if You just can use testable code as config that can deliver the correct types, etc.
as well as hinting in Your favorite IDE ?
Here, for demonstration without using attrs package - usually I use attrs (read the docs)
"""
class ConfXY(object):
def __init__(self) -> None:
self.x: int = 1
self.z: float = get_z_from_input()
...
conf_xy=ConfXY()
# other.py
from conf_xy import conf_xy
...
y = conf_xy.x * 2
...

Is there a way to get ad-hoc polymorphism in Python?

Many languages support ad-hoc polymorphism (a.k.a. function overloading) out of the box. However, it seems that Python opted out of it. Still, I can imagine there might be a trick or a library that is able to pull it off in Python. Does anyone know of such a tool?
For example, in Haskell one might use this to generate test data for different types:
-- In some testing library:
class Randomizable a where
genRandom :: a
-- Overload for different types
instance Randomizable String where genRandom = ...
instance Randomizable Int where genRandom = ...
instance Randomizable Bool where genRandom = ...
-- In some client project, we might have a custom type:
instance Randomizable VeryCustomType where genRandom = ...
The beauty of this is that I can extend genRandom for my own custom types without touching the testing library.
How would you achieve something like this in Python?
Python is not a strongly typed language, so it really doesn't matter if yo have an instance of Randomizable or an instance of some other class which has the same methods.
One way to get the appearance of what you want could be this:
types_ = {}
def registerType ( dtype , cls ) :
types_[dtype] = cls
def RandomizableT ( dtype ) :
return types_[dtype]
Firstly, yes, I did define a function with a capital letter, but it's meant to act more like a class. For example:
registerType ( int , TheLibrary.Randomizable )
registerType ( str , MyLibrary.MyStringRandomizable )
Then, later:
type = ... # get whatever type you want to randomize
randomizer = RandomizableT(type) ()
print randomizer.getRandom()
A Python function cannot be automatically specialised based on static compile-time typing. Therefore its result can only depend on its arguments received at run-time and on the global (or local) environment, unless the function itself is modifiable in-place and can carry some state.
Your generic function genRandom takes no arguments besides the typing information. Thus in Python it should at least receive the type as an argument. Since built-in classes cannot be modified, the generic function (instance) implementation for such classes should be somehow supplied through the global environment or included into the function itself.
I've found out that since Python 3.4, there is #functools.singledispatch decorator. However, it works only for functions which receive a type instance (object) as the first argument, so it is not clear how it could be applied in your example. I am also a bit confused by its rationale:
In addition, it is currently a common anti-pattern for Python code to inspect the types of received arguments, in order to decide what to do with the objects.
I understand that anti-pattern is a jargon term for a pattern which is considered undesirable (and does not at all mean the absence of a pattern). The rationale thus claims that inspecting types of arguments is undesirable, and this claim is used to justify introducing a tool that will simplify ... dispatching on the type of an argument. (Incidentally, note that according to PEP 20, "Explicit is better than implicit.")
The "Alternative approaches" section of PEP 443 "Single-dispatch generic functions" however seems worth reading. There are several references to possible solutions, including one to "Five-minute Multimethods in Python" article by Guido van Rossum from 2005.
Does this count for ad hock polymorphism?
class A:
def __init__(self):
pass
def aFunc(self):
print "In A"
class B:
def __init__(self):
pass
def aFunc(self):
print "In B"
f = A()
f.aFunc()
f = B()
f.aFunc()
output
In A
In B
Another version of polymorphism
from module import aName
If two modules use the same interface, you could import either one and use it in your code.
One example of this is from xml.etree.ElementTree import XMLParser

How to do dependency injection python-way?

I've been reading a lot about python-way lately so my question is
How to do dependency injection python-way?
I am talking about usual scenarios when, for example, service A needs access to UserService for authorization checks.
It all depends on the situation. For example, if you use dependency injection for testing purposes -- so you can easily mock out something -- you can often forgo injection altogether: you can instead mock out the module or class you would otherwise inject:
subprocess.Popen = some_mock_Popen
result = subprocess.call(...)
assert some_mock_popen.result == result
subprocess.call() will call subprocess.Popen(), and we can mock it out without having to inject the dependency in a special way. We can just replace subprocess.Popen directly. (This is just an example; in real life you would do this in a much more robust way.)
If you use dependency injection for more complex situations, or when mocking whole modules or classes isn't appropriate (because, for example, you want to only mock out one particular call) then using class attributes or module globals for the dependencies is the usual choice. For example, considering a my_subprocess.py:
from subprocess import Popen
def my_call(...):
return Popen(...).communicate()
You can easily replace only the Popen call made by my_call() by assigning to my_subprocess.Popen; it wouldn't affect any other calls to subprocess.Popen (but it would replace all calls to my_subprocess.Popen, of course.) Similarly, class attributes:
class MyClass(object):
Popen = staticmethod(subprocess.Popen)
def call(self):
return self.Popen(...).communicate(...)
When using class attributes like this, which is rarely necessary considering the options, you should take care to use staticmethod. If you don't, and the object you're inserting is a normal function object or another type of descriptor, like a property, that does something special when retrieved from a class or instance, it would do the wrong thing. Worse, if you used something that right now isn't a descriptor (like the subprocess.Popen class, in the example) it would work now, but if the object in question changed to a normal function future, it would break confusingly.
Lastly, there's just plain callbacks; if you just want to tie a particular instance of a class to a particular service, you can just pass the service (or one or more of the service's methods) to the class initializer, and have it use that:
class MyClass(object):
def __init__(self, authenticate=None, authorize=None):
if authenticate is None:
authenticate = default_authenticate
if authorize is None:
authorize = default_authorize
self.authenticate = authenticate
self.authorize = authorize
def request(self, user, password, action):
self.authenticate(user, password)
self.authorize(user, action)
self._do_request(action)
...
helper = AuthService(...)
# Pass bound methods to helper.authenticate and helper.authorize to MyClass.
inst = MyClass(authenticate=helper.authenticate, authorize=helper.authorize)
inst.request(...)
When setting instance attributes like that, you never have to worry about descriptors firing, so just assigning the functions (or classes or other callables or instances) is fine.
How about this "setter-only" injection recipe?
http://code.activestate.com/recipes/413268/
It is quite pythonic, using the "descriptor" protocol with __get__()/__set__(), but rather invasive, requiring to replace all your attribute-setting code with a RequiredFeature instance initialized with the str-name of the Feature required.
After years using Python without any DI autowiring framework and Java with Spring I've come to realize plain simple Python code often doesn't need frameworks for dependency injection without autowiring (autowiring is what Guice and Spring both do in Java), i.e., just doing something like this is enough:
def foo(dep = None): # great for unit testing!
self.dep = dep or Dep() # callers can not care about this too
...
This is pure dependency injection (quite simple) but without magical frameworks for automatically injecting them for you (i.e., autowiring) and without Inversion of Control.
Though as I dealt with bigger applications this approach wasn't cutting it anymore. So I've come up with injectable a micro-framework that wouldn't feel non-pythonic and yet would provide first class dependency injection autowiring.
Under the motto Dependency Injection for Humans™ this is what it looks like:
# some_service.py
class SomeService:
#autowired
def __init__(
self,
database: Autowired(Database),
message_brokers: Autowired(List[Broker]),
):
pending = database.retrieve_pending_messages()
for broker in message_brokers:
broker.send_pending(pending)
# database.py
#injectable
class Database:
...
# message_broker.py
class MessageBroker(ABC):
def send_pending(messages):
...
# kafka_producer.py
#injectable
class KafkaProducer(MessageBroker):
...
# sqs_producer.py
#injectable
class SQSProducer(MessageBroker):
...
What is dependency injection?
Dependency injection is a principle that helps to decrease coupling and increase cohesion.
Coupling and cohesion are about how tough the components are tied.
High coupling. If the coupling is high it’s like using a superglue or welding. No easy way to disassemble.
High cohesion. High cohesion is like using the screws. Very easy to disassemble and assemble back or assemble a different way. It is an opposite to high coupling.
When the cohesion is high the coupling is low.
Low coupling brings a flexibility. Your code becomes easier to change and test.
How to implement the dependency injection?
Objects do not create each other anymore. They provide a way to inject the dependencies instead.
before:
import os
class ApiClient:
def __init__(self):
self.api_key = os.getenv('API_KEY') # <-- dependency
self.timeout = os.getenv('TIMEOUT') # <-- dependency
class Service:
def __init__(self):
self.api_client = ApiClient() # <-- dependency
def main() -> None:
service = Service() # <-- dependency
...
if __name__ == '__main__':
main()
after:
import os
class ApiClient:
def __init__(self, api_key: str, timeout: int):
self.api_key = api_key # <-- dependency is injected
self.timeout = timeout # <-- dependency is injected
class Service:
def __init__(self, api_client: ApiClient):
self.api_client = api_client # <-- dependency is injected
def main(service: Service): # <-- dependency is injected
...
if __name__ == '__main__':
main(
service=Service(
api_client=ApiClient(
api_key=os.getenv('API_KEY'),
timeout=os.getenv('TIMEOUT'),
),
),
)
ApiClient is decoupled from knowing where the options come from. You can read a key and a timeout from a configuration file or even get them from a database.
Service is decoupled from the ApiClient. It does not create it anymore. You can provide a stub or other compatible object.
Function main() is decoupled from Service. It receives it as an argument.
Flexibility comes with a price.
Now you need to assemble and inject the objects like this:
main(
service=Service(
api_client=ApiClient(
api_key=os.getenv('API_KEY'),
timeout=os.getenv('TIMEOUT'),
),
),
)
The assembly code might get duplicated and it’ll become harder to change the application structure.
Conclusion
Dependency injection brings you 3 advantages:
Flexibility. The components are loosely coupled. You can easily extend or change a functionality of the system by combining the components different way. You even can do it on the fly.
Testability. Testing is easy because you can easily inject mocks instead of real objects that use API or database, etc.
Clearness and maintainability. Dependency injection helps you reveal the dependencies. Implicit becomes explicit. And “Explicit is better than implicit” (PEP 20 - The Zen of Python). You have all the components and dependencies defined explicitly in the container. This provides an overview and control on the application structure. It is easy to understand and change it.
—-
I believe that through the already presented example you will understand the idea and be able to apply it to your problem, ie the implementation of UserService for authorization.
I recently released a DI framework for python that might help you here. I think its a fairly fresh take on it, but I'm not sure how 'pythonic' it is. Judge for yourself. Feedback is very welcome.
https://github.com/suned/serum

Building a minimal plugin architecture in Python

I have an application, written in Python, which is used by a fairly technical audience (scientists).
I'm looking for a good way to make the application extensible by the users, i.e. a scripting/plugin architecture.
I am looking for something extremely lightweight. Most scripts, or plugins, are not going to be developed and distributed by a third-party and installed, but are going to be something whipped up by a user in a few minutes to automate a repeating task, add support for a file format, etc. So plugins should have the absolute minimum boilerplate code, and require no 'installation' other than copying to a folder (so something like setuptools entry points, or the Zope plugin architecture seems like too much.)
Are there any systems like this already out there, or any projects that implement a similar scheme that I should look at for ideas / inspiration?
Mine is, basically, a directory called "plugins" which the main app can poll and then use imp.load_module to pick up files, look for a well-known entry point possibly with module-level config params, and go from there. I use file-monitoring stuff for a certain amount of dynamism in which plugins are active, but that's a nice-to-have.
Of course, any requirement that comes along saying "I don't need [big, complicated thing] X; I just want something lightweight" runs the risk of re-implementing X one discovered requirement at a time. But that's not to say you can't have some fun doing it anyway :)
module_example.py:
def plugin_main(*args, **kwargs):
print args, kwargs
loader.py:
def load_plugin(name):
mod = __import__("module_%s" % name)
return mod
def call_plugin(name, *args, **kwargs):
plugin = load_plugin(name)
plugin.plugin_main(*args, **kwargs)
call_plugin("example", 1234)
It's certainly "minimal", it has absolutely no error checking, probably countless security problems, it's not very flexible - but it should show you how simple a plugin system in Python can be..
You probably want to look into the imp module too, although you can do a lot with just __import__, os.listdir and some string manipulation.
Have a look at at this overview over existing plugin frameworks / libraries, it is a good starting point. I quite like yapsy, but it depends on your use-case.
While that question is really interesting, I think it's fairly hard to answer, without more details. What sort of application is this? Does it have a GUI? Is it a command-line tool? A set of scripts? A program with an unique entry point, etc...
Given the little information I have, I will answer in a very generic manner.
What means do you have to add plugins?
You will probably have to add a configuration file, which will list the paths/directories to load.
Another way would be to say "any files in that plugin/ directory will be loaded", but it has the inconvenient to require your users to move around files.
A last, intermediate option would be to require all plugins to be in the same plugin/ folder, and then to active/deactivate them using relative paths in a config file.
On a pure code/design practice, you'll have to determine clearly what behavior/specific actions you want your users to extend. Identify the common entry point/a set of functionalities that will always be overridden, and determine groups within these actions. Once this is done, it should be easy to extend your application,
Example using hooks, inspired from MediaWiki (PHP, but does language really matters?):
import hooks
# In your core code, on key points, you allow user to run actions:
def compute(...):
try:
hooks.runHook(hooks.registered.beforeCompute)
except hooks.hookException:
print('Error while executing plugin')
# [compute main code] ...
try:
hooks.runHook(hooks.registered.afterCompute)
except hooks.hookException:
print('Error while executing plugin')
# The idea is to insert possibilities for users to extend the behavior
# where it matters.
# If you need to, pass context parameters to runHook. Remember that
# runHook can be defined as a runHook(*args, **kwargs) function, not
# requiring you to define a common interface for *all* hooks. Quite flexible :)
# --------------------
# And in the plugin code:
# [...] plugin magic
def doStuff():
# ....
# and register the functionalities in hooks
# doStuff will be called at the end of each core.compute() call
hooks.registered.afterCompute.append(doStuff)
Another example, inspired from mercurial. Here, extensions only add commands to the hg commandline executable, extending the behavior.
def doStuff(ui, repo, *args, **kwargs):
# when called, a extension function always receives:
# * an ui object (user interface, prints, warnings, etc)
# * a repository object (main object from which most operations are doable)
# * command-line arguments that were not used by the core program
doMoreMagicStuff()
obj = maybeCreateSomeObjects()
# each extension defines a commands dictionary in the main extension file
commands = { 'newcommand': doStuff }
For both approaches, you might need common initialize and finalize for your extension.
You can either use a common interface that all your extension will have to implement (fits better with second approach; mercurial uses a reposetup(ui, repo) that is called for all extension), or use a hook-kind of approach, with a hooks.setup hook.
But again, if you want more useful answers, you'll have to narrow down your question ;)
Marty Allchin's simple plugin framework is the base I use for my own needs. I really recommand to take a look at it, I think it is really a good start if you want something simple and easily hackable. You can find it also as a Django Snippets.
I am a retired biologist who dealt with digital micrograqphs and found himself having to write an image processing and analysis package (not technically a library) to run on an SGi machine. I wrote the code in C and used Tcl for the scripting language. The GUI, such as it was, was done using Tk. The commands that appeared in Tcl were of the form "extensionName commandName arg0 arg1 ... param0 param1 ...", that is, simple space-separated words and numbers. When Tcl saw the "extensionName" substring, control was passed to the C package. That in turn ran the command through a lexer/parser (done in lex/yacc) and then called C routines as necessary.
The commands to operate the package could be run one by one via a window in the GUI, but batch jobs were done by editing text files which were valid Tcl scripts; you'd pick the template that did the kind of file-level operation you wanted to do and then edit a copy to contain the actual directory and file names plus the package commands. It worked like a charm. Until ...
1) The world turned to PCs and 2) the scripts got longer than about 500 lines, when Tcl's iffy organizational capabilities started to become a real inconvenience. Time passed ...
I retired, Python got invented, and it looked like the perfect successor to Tcl. Now, I have never done the port, because I have never faced up to the challenges of compiling (pretty big) C programs on a PC, extending Python with a C package, and doing GUIs in Python/Gt?/Tk?/??. However, the old idea of having editable template scripts seems still workable. Also, it should not be too great a burden to enter package commands in a native Python form, e.g.:
packageName.command( arg0, arg1, ..., param0, param1, ...)
A few extra dots, parens, and commas, but those aren't showstoppers.
I remember seeing that someone has done versions of lex and yacc in Python (try: http://www.dabeaz.com/ply/), so if those are still needed, they're around.
The point of this rambling is that it has seemed to me that Python itself IS the desired "lightweight" front end usable by scientists. I'm curious to know why you think that it is not, and I mean that seriously.
added later: The application gedit anticipates plugins being added and their site has about the clearest explanation of a simple plugin procedure I've found in a few minutes of looking around. Try:
https://wiki.gnome.org/Apps/Gedit/PythonPluginHowToOld
I'd still like to understand your question better. I am unclear whether you 1) want scientists to be able to use your (Python) application quite simply in various ways or 2) want to allow the scientists to add new capabilities to your application. Choice #1 is the situation we faced with the images and that led us to use generic scripts which we modified to suit the need of the moment. Is it Choice #2 which leads you to the idea of plugins, or is it some aspect of your application that makes issuing commands to it impracticable?
When i searching for Python Decorators, found a simple but useful code snippet. It may not fit in your needs but very inspiring.
Scipy Advanced Python#Plugin Registration System
class TextProcessor(object):
PLUGINS = []
def process(self, text, plugins=()):
if plugins is ():
for plugin in self.PLUGINS:
text = plugin().process(text)
else:
for plugin in plugins:
text = plugin().process(text)
return text
#classmethod
def plugin(cls, plugin):
cls.PLUGINS.append(plugin)
return plugin
#TextProcessor.plugin
class CleanMarkdownBolds(object):
def process(self, text):
return text.replace('**', '')
Usage:
processor = TextProcessor()
processed = processor.process(text="**foo bar**", plugins=(CleanMarkdownBolds, ))
processed = processor.process(text="**foo bar**")
I enjoyed the nice discussion on different plugin architectures given by Dr Andre Roberge at Pycon 2009. He gives a good overview of different ways of implementing plugins, starting from something really simple.
Its available as a podcast (second part following an explanation of monkey-patching) accompanied by a series of six blog entries.
I recommend giving it a quick listen before you make a decision.
I arrived here looking for a minimal plugin architecture, and found a lot of things that all seemed like overkill to me. So, I've implemented Super Simple Python Plugins. To use it, you create one or more directories and drop a special __init__.py file in each one. Importing those directories will cause all other Python files to be loaded as submodules, and their name(s) will be placed in the __all__ list. Then it's up to you to validate/initialize/register those modules. There's an example in the README file.
Actually setuptools works with a "plugins directory", as the following example taken from the project's documentation:
http://peak.telecommunity.com/DevCenter/PkgResources#locating-plugins
Example usage:
plugin_dirs = ['foo/plugins'] + sys.path
env = Environment(plugin_dirs)
distributions, errors = working_set.find_plugins(env)
map(working_set.add, distributions) # add plugins+libs to sys.path
print("Couldn't load plugins due to: %s" % errors)
In the long run, setuptools is a much safer choice since it can load plugins without conflicts or missing requirements.
Another benefit is that the plugins themselves can be extended using the same mechanism, without the original applications having to care about it.
Expanding on the #edomaur's answer may I suggest taking a look at simple_plugins (shameless plug), which is a simple plugin framework inspired by the work of Marty Alchin.
A short usage example based on the project's README:
# All plugin info
>>> BaseHttpResponse.plugins.keys()
['valid_ids', 'instances_sorted_by_id', 'id_to_class', 'instances',
'classes', 'class_to_id', 'id_to_instance']
# Plugin info can be accessed using either dict...
>>> BaseHttpResponse.plugins['valid_ids']
set([304, 400, 404, 200, 301])
# ... or object notation
>>> BaseHttpResponse.plugins.valid_ids
set([304, 400, 404, 200, 301])
>>> BaseHttpResponse.plugins.classes
set([<class '__main__.NotFound'>, <class '__main__.OK'>,
<class '__main__.NotModified'>, <class '__main__.BadRequest'>,
<class '__main__.MovedPermanently'>])
>>> BaseHttpResponse.plugins.id_to_class[200]
<class '__main__.OK'>
>>> BaseHttpResponse.plugins.id_to_instance[200]
<OK: 200>
>>> BaseHttpResponse.plugins.instances_sorted_by_id
[<OK: 200>, <MovedPermanently: 301>, <NotModified: 304>, <BadRequest: 400>, <NotFound: 404>]
# Coerce the passed value into the right instance
>>> BaseHttpResponse.coerce(200)
<OK: 200>
As one another approach to plugin system, You may check Extend Me project.
For example, let's define simple class and its extension
# Define base class for extensions (mount point)
class MyCoolClass(Extensible):
my_attr_1 = 25
def my_method1(self, arg1):
print('Hello, %s' % arg1)
# Define extension, which implements some aditional logic
# or modifies existing logic of base class (MyCoolClass)
# Also any extension class maby be placed in any module You like,
# It just needs to be imported at start of app
class MyCoolClassExtension1(MyCoolClass):
def my_method1(self, arg1):
super(MyCoolClassExtension1, self).my_method1(arg1.upper())
def my_method2(self, arg1):
print("Good by, %s" % arg1)
And try to use it:
>>> my_cool_obj = MyCoolClass()
>>> print(my_cool_obj.my_attr_1)
25
>>> my_cool_obj.my_method1('World')
Hello, WORLD
>>> my_cool_obj.my_method2('World')
Good by, World
And show what is hidden behind the scene:
>>> my_cool_obj.__class__.__bases__
[MyCoolClassExtension1, MyCoolClass]
extend_me library manipulates class creation process via metaclasses, thus in example above, when creating new instance of MyCoolClass we got instance of new class that is subclass of both MyCoolClassExtension and MyCoolClass having functionality of both of them, thanks to Python's multiple inheritance
For better control over class creation there are few metaclasses defined in this lib:
ExtensibleType - allows simple extensibility by subclassing
ExtensibleByHashType - similar to ExtensibleType, but haveing ability
to build specialized versions of class, allowing global extension
of base class and extension of specialized versions of class
This lib is used in OpenERP Proxy Project, and seems to be working good enough!
For real example of usage, look in OpenERP Proxy 'field_datetime' extension:
from ..orm.record import Record
import datetime
class RecordDateTime(Record):
""" Provides auto conversion of datetime fields from
string got from server to comparable datetime objects
"""
def _get_field(self, ftype, name):
res = super(RecordDateTime, self)._get_field(ftype, name)
if res and ftype == 'date':
return datetime.datetime.strptime(res, '%Y-%m-%d').date()
elif res and ftype == 'datetime':
return datetime.datetime.strptime(res, '%Y-%m-%d %H:%M:%S')
return res
Record here is extesible object. RecordDateTime is extension.
To enable extension, just import module that contains extension class, and (in case above) all Record objects created after it will have extension class in base classes, thus having all its functionality.
The main advantage of this library is that, code that operates extensible objects, does not need to know about extension and extensions could change everything in extensible objects.
setuptools has an EntryPoint:
Entry points are a simple way for distributions to “advertise” Python
objects (such as functions or classes) for use by other distributions.
Extensible applications and frameworks can search for entry points
with a particular name or group, either from a specific distribution
or from all active distributions on sys.path, and then inspect or load
the advertised objects at will.
AFAIK this package is always available if you use pip or virtualenv.
You can use pluginlib.
Plugins are easy to create and can be loaded from other packages, file paths, or entry points.
Create a plugin parent class, defining any required methods:
import pluginlib
#pluginlib.Parent('parser')
class Parser(object):
#pluginlib.abstractmethod
def parse(self, string):
pass
Create a plugin by inheriting a parent class:
import json
class JSON(Parser):
_alias_ = 'json'
def parse(self, string):
return json.loads(string)
Load the plugins:
loader = pluginlib.PluginLoader(modules=['sample_plugins'])
plugins = loader.plugins
parser = plugins.parser.json()
print(parser.parse('{"json": "test"}'))
I have spent time reading this thread while I was searching for a plugin framework in Python now and then. I have used some but there were shortcomings with them. Here is what I come up with for your scrutiny in 2017, a interface free, loosely coupled plugin management system: Load me later. Here are tutorials on how to use it.
I've spent a lot of time trying to find small plugin system for Python, which would fit my needs. But then I just thought, if there is already an inheritance, which is natural and flexible, why not use it.
The only problem with using inheritance for plugins is that you dont know what are the most specific(the lowest on inheritance tree) plugin classes are.
But this could be solved with metaclass, which keeps track of inheritance of base class, and possibly could build class, which inherits from most specific plugins ('Root extended' on the figure below)
So I came with a solution by coding such a metaclass:
class PluginBaseMeta(type):
def __new__(mcls, name, bases, namespace):
cls = super(PluginBaseMeta, mcls).__new__(mcls, name, bases, namespace)
if not hasattr(cls, '__pluginextensions__'): # parent class
cls.__pluginextensions__ = {cls} # set reflects lowest plugins
cls.__pluginroot__ = cls
cls.__pluginiscachevalid__ = False
else: # subclass
assert not set(namespace) & {'__pluginextensions__',
'__pluginroot__'} # only in parent
exts = cls.__pluginextensions__
exts.difference_update(set(bases)) # remove parents
exts.add(cls) # and add current
cls.__pluginroot__.__pluginiscachevalid__ = False
return cls
#property
def PluginExtended(cls):
# After PluginExtended creation we'll have only 1 item in set
# so this is used for caching, mainly not to create same PluginExtended
if cls.__pluginroot__.__pluginiscachevalid__:
return next(iter(cls.__pluginextensions__)) # only 1 item in set
else:
name = cls.__pluginroot__.__name__ + 'PluginExtended'
extended = type(name, tuple(cls.__pluginextensions__), {})
cls.__pluginroot__.__pluginiscachevalid__ = True
return extended
So when you have Root base, made with metaclass, and have tree of plugins which inherit from it, you could automatically get class, which inherits from the most specific plugins by just subclassing:
class RootExtended(RootBase.PluginExtended):
... your code here ...
Code base is pretty small (~30 lines of pure code) and as flexible as inheritance allows.
If you're interested, get involved # https://github.com/thodnev/pluginlib
You may also have a look at Groundwork.
The idea is to build applications around reusable components, called patterns and plugins. Plugins are classes that derive from GwBasePattern.
Here's a basic example:
from groundwork import App
from groundwork.patterns import GwBasePattern
class MyPlugin(GwBasePattern):
def __init__(self, app, **kwargs):
self.name = "My Plugin"
super().__init__(app, **kwargs)
def activate(self):
pass
def deactivate(self):
pass
my_app = App(plugins=[MyPlugin]) # register plugin
my_app.plugins.activate(["My Plugin"]) # activate it
There are also more advanced patterns to handle e.g. command line interfaces, signaling or shared objects.
Groundwork finds its plugins either by programmatically binding them to an app as shown above or automatically via setuptools. Python packages containing plugins must declare these using a special entry point groundwork.plugin.
Here are the docs.
Disclaimer: I'm one of the authors of Groundwork.
In our current healthcare product we have a plugin architecture implemented with interface class. Our tech stack are Django on top of Python for API and Nuxtjs on top of nodejs for frontend.
We have a plugin manager app written for our product which is basically pip and npm package in adherence with Django and Nuxtjs.
For new plugin development(pip and npm) we made plugin manager as dependency.
In Pip package:
With the help of setup.py you can add entrypoint of the plugin to do something with plugin manager(registry, initiations, ...etc.)
https://setuptools.readthedocs.io/en/latest/setuptools.html#automatic-script-creation
In npm package:
Similar to pip there are hooks in npm scripts to handle the installation.
https://docs.npmjs.com/misc/scripts
Our usecase:
plugin development team is separate from core devopment team now. The scope of plugin development is for integrating with 3rd party apps which are defined in any of the categories of the product. The plugin interfaces are categorised for eg:- Fax, phone, email ...etc plugin manager can be enhanced to new categories.
In your case: Maybe you can have one plugin written and reuse the same for doing stuffs.
If plugin developers has need to use reuse core objects that object can be used by doing a level of abstraction within plugin manager so that any plugins can inherit those methods.
Just sharing how we implemented in our product hope it will give a little idea.

Categories