Global state in Python module

I am writing a Python wrapper for a C library using cffi.
The C library has to be initialized and shut down. Also, cffi needs some place to keep the handle returned from ffi.dlopen().
I can see two paths here:
Either I wrap this whole stateful business in a class like this
class wrapper(object):
    def __init__(self):
        self.c = ffi.dlopen("mylibrary")
        self.c.initialize()

    def __del__(self):
        self.c.terminate()
Or I provide two global functions that hide the state in a global variable
def initialize():
    global __library
    __library = ffi.dlopen("mylibrary")
    __library.initialize()

def terminate():
    global __library
    __library.terminate()
    del __library
The first path is somewhat cumbersome in that it requires the user to always create an object that serves no purpose other than managing the library's state. On the other hand, it makes sure that terminate() is actually called every time.
The second path seems to result in a somewhat easier API. However, it exposes some hidden global state, which might be a bad thing. Also, if the user forgets to call terminate(), the C library is not unloaded correctly (which is not a big problem on the C side).
Which one of these paths would be more pythonic?

Exposing a wrapper object only makes sense in Python if the library actually supports something like multiple instances in one application. If it doesn't support that, or it isn't really relevant, go for kindall's suggestion: just initialize the library when it is imported and add an atexit handler for cleanup.
Adding wrappers around a stateless API, or an API without support for keeping different sets of state, is not really Pythonic and would raise the expectation that different instances have some kind of isolation.
Example code:
import atexit

# Normal library initialization
__library = ffi.dlopen("mylibrary")
__library.initialize()

# Private library cleanup function
def __terminate():
    __library.terminate()

# Register the function to be called on clean interpreter termination
atexit.register(__terminate)
For more details about atexit, see this question and, of course, the Python documentation.
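If the library also needs to be initialized and torn down within a single run, a context manager is another idiomatic option. A minimal sketch, reusing the question's ffi object and library name:

from contextlib import contextmanager

@contextmanager
def library():
    lib = ffi.dlopen("mylibrary")
    lib.initialize()
    try:
        yield lib
    finally:
        lib.terminate()  # guaranteed even if the block raises

Used as `with library() as c: c.some_call()`, this gives deterministic cleanup without relying on __del__ or on the user remembering to call terminate().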

Related

How to extend built-in classes in python

I'm writing some code for an ESP8266 microcontroller using MicroPython, which has some different classes as well as some additional methods on the standard built-in classes. To allow me to debug on my desktop I've built some helper classes so that the code will run. However, I've run into a snag with MicroPython's time module, which has a time.sleep_ms method because the standard time.sleep method on MicroPython does not accept floats. I tried using the following code to extend the built-in time class, but it fails to import properly. Any thoughts?
class time(time):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def sleep_ms(self, ms):
        super().sleep(ms/1000)
This code exists in a file time.py. Secondly, I know I'll have issues with having to import time.time, which I would like to fix. I also realize I could call this something else and put traps for it in my microcontroller code, but I would like to avoid any special functions in what's loaded onto the controller, to save space and cycles.
You're not trying to override a class; you're trying to monkey-patch a module.
First off, if your module is named time.py, it will never be loaded in preference to the built-in time module. Truly built-in (as in compiled into the interpreter core, not just C extension modules that ship with CPython) modules are special: they are always loaded without checking sys.path, so you can't even attempt to shadow the time module, even if you wanted to (you generally don't, and doing so is incredibly ugly). In this case, the built-in time module shadows you; you can't import your module under the plain name time at all, because the built-in will be found without even looking at sys.path.
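You can see the distinction yourself: truly built-in modules are listed in sys.builtin_module_names. A quick check (on a typical CPython 3 build):

import sys

print('time' in sys.builtin_module_names)  # True: compiled into the interpreter
print('json' in sys.builtin_module_names)  # False: an ordinary stdlib module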
Secondly, assuming you use a different name and import it for the sole purpose of monkey-patching time (or do something terrible like adding the monkey-patch to a custom sitecustomize module), it's not trivial to make the function truly native to the monkey-patched module (defining it in any normal way gives it the scope of the module where it was defined, not the same scope as other functions from the time module). If you don't need it to be "truly" defined as part of time, the simplest approach is just:
import time

def sleep_ms(ms):
    return time.sleep(ms / 1000)

time.sleep_ms = sleep_ms
Of course, as mentioned, sleep_ms is still part of your module, and carries your module's scope around with it (that's why you do time.sleep, not just sleep; you could do from time import sleep to avoid qualifying it, but it's still a local alias that might not match time.sleep if someone else monkey-patches time.sleep later).
If you want to make it behave like it's part of the time module, so you can reference arbitrary things in time's namespace without qualification and always see the current function in time, you need to use eval to compile your code in time's scope:
import time

# Compile a string of the function's source to a code object that's not
# attached to any scope at all.
# The filename argument is garbage; it's just for exception traceback
# reporting and the like.
code = compile('def sleep_ms(ms): sleep(ms / 1000)', 'time.py', 'exec')

# eval the compiled code with a scope of the globals of the time module,
# which both binds it to the time module's scope and inserts the newly
# defined function directly into the time module's globals, without
# defining it in your own module at all.
eval(code, vars(time))

del code, time  # May as well leave your monkey-patch module completely empty
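Once that patch module has been imported anywhere in the process, the new function is visible to every user of time. A quick usage sketch (patch_time is a hypothetical name for the module above):

import patch_time  # hypothetical module containing the monkey-patch above
import time

time.sleep_ms(50)  # sleeps for 0.05 seconds via time.sleep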

Python abstract module possible?

I've built a module in Python in one single file without using classes. I do this so that using some API module becomes easier. Basically like this:
# the_module.py
from some_api_module import some_api_call, another_api_call

def method_one(a, b):
    return some_api_call(a + b)

def method_two(c, d, e):
    return another_api_call(c * d * e)
I now need to built many similar modules, for different api modules, but I want all of them to have the same basic set of methods so that I can import any of these modules and call a function knowing that this function will behave the same in all the modules I built. To ensure they are all the same, I want to use some kind of abstract base module to build upon. I would normally grab the Abstract Base Classes module, but since I don't use classes at all, this doesn't work.
Does anybody know how I can implement an abstract base module on which I can build several other modules without using classes? All tips are welcome!
You are not using classes, but you could easily rewrite your code to do so.
A class is basically a namespace which contains functions and variables, as is a module.
It should not make a huge difference whether you call mymodule.method_one() or mymodule.myclass.method_one().
In Python there is no such thing as interfaces, which you might know from Java.
The paradigm in Python is duck typing; that means, more or less, that for a given module you can tell whether it implements your API by checking whether it provides the right methods.
Python does this, for example, to determine what to do if you call myobject[i] on an instance of your class myclass: it looks whether the class has the method __getitem__, and if it does, it replaces myobject[i] with myobject.__getitem__(i).
You don't have to tell Python that your class supports this kind of access; Python just figures it out from the way you defined your class.
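For example, a minimal illustration of that protocol:

class Squares(object):
    def __getitem__(self, i):
        return i * i

s = Squares()
print s[4]  # Python rewrites this as s.__getitem__(4) and prints 16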
The same way, you should determine whether your module implements your API.
Maybe you want to look inside the hidden dictionary mymodule.__dict__ after import mymodule, which contains all the function names of your module and pointers to them. You could then check whether the right functions are present and raise an error otherwise:
import my_module_4

# Check if my_module_4 implements the API
if all(func in my_module_4.__dict__ for func in ("method_one", "method_two")):
    print "API implemented"
else:
    print "Warning: Not all API functions found in my_module_4"

Injecting arbitrary code into a Python SimpleXMLRPC Server

In the Python docs of SimpleXMLRPCServer, it is mentioned:
Warning Enabling the allow_dotted_names option allows intruders to access your module’s global variables and may allow intruders to execute arbitrary code on your machine. Only use this option on a secure, closed network.
Now I have a server with the following code:
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.server import SimpleXMLRPCRequestHandler

# Restrict to a particular path (as in the docs example)
class RequestHandler(SimpleXMLRPCRequestHandler):
    rpc_paths = ('/RPC2',)

server = SimpleXMLRPCServer(("localhost", 8000),
                            requestHandler=RequestHandler)
server.register_introspection_functions()
server.register_function(pow)

def adder_function(x, y):
    return x + y
server.register_function(adder_function, 'add')

class MyFuncs:
    def mul(self, x, y):
        return x * y

server.register_instance(MyFuncs(), allow_dotted_names=True)
server.serve_forever()
Please explain how the vulnerability can be exploited to inject arbitrary code onto the server. If my above code is not vulnerable, then give an example of one which can be exploited, and the client code to do so.
MyFuncs().mul is not just a callable function; it is (like all Python functions) a first-class object with its own properties.
Apart from a load of __xxx__ magic methods (which you can't access because SimpleXMLRPCServer blocks access to anything beginning with _), there are internal method members im_class (pointing to the class object), im_self (pointing to the MyFuncs() instance) and im_func (pointing to the function definition for mul). That function object itself has a number of accessible properties - most notably including func_globals which give access to the variable scope dictionary for the containing file.
So, by calling mul.im_func.func_globals.get, an attacker would be able to read arbitrary global variables you had set in your script, or use update() on the dictionary to alter them. In the above example that's not exploitable, because you have nothing sensitive in the global variables. But that's probably not something you want to rely on always staying true.
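A Python 2 client sketch of that traversal, against a server registered the same way with the old SimpleXMLRPCServer module (SECRET_KEY is a hypothetical global in the server script):

import xmlrpclib

proxy = xmlrpclib.ServerProxy("http://localhost:8000/RPC2")
# allow_dotted_names resolves each dot with getattr() on the server side,
# so this evaluates MyFuncs().mul.im_func.func_globals.get("SECRET_KEY") there.
print proxy.mul.im_func.func_globals.get("SECRET_KEY")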
Full 'execute arbitrary code' is pretty unlikely, but you might imagine a writable global codeToExecute variable that gets evaled later, for example, or someone registering a whole module with register_instance, allowing all the modules it imported to be accessible (typical example: os and os.system).
In Python 3 this particular attack is no longer reachable, because the function/method internal properties were renamed to double-underscore versions, which get blocked. But in general it seems like a bad idea to 'default open' and allow external access to any property on an instance just based on name - there is no guarantee that no other non-underscore names will ever exist in the future, or that properties won't be added to the accessible built-in types (tuple, dict) of those properties that could be exploited in some way.
If you really need nested property access, it would seem safer to come up with a version of SimpleXMLRPCServer that requires something like an @rpc_accessible decorator to define what should be visible.

How to do dependency injection python-way?

I've been reading a lot about the Python way of doing things lately, so my question is:
How do you do dependency injection the Python way?
I am talking about the usual scenarios when, for example, service A needs access to UserService for authorization checks.
It all depends on the situation. For example, if you use dependency injection for testing purposes -- so you can easily mock out something -- you can often forgo injection altogether: you can instead mock out the module or class you would otherwise inject:
import subprocess

subprocess.Popen = some_mock_Popen
result = subprocess.call(...)
assert some_mock_Popen.result == result
subprocess.call() will call subprocess.Popen(), and we can mock it out without having to inject the dependency in a special way. We can just replace subprocess.Popen directly. (This is just an example; in real life you would do this in a much more robust way.)
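A more robust version of the same idea uses a scoped patch, so the real object is restored automatically afterwards. A sketch using the standard library's unittest.mock:

from unittest import mock

def test_code_that_spawns_processes():
    with mock.patch('subprocess.Popen') as mock_popen:
        mock_popen.return_value.communicate.return_value = (b'output', b'')
        # exercise the code under test here; lookups of subprocess.Popen
        # inside this block resolve to the mock
        ...
    # on exit the patch is undone, even if the test raised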
If you use dependency injection for more complex situations, or when mocking whole modules or classes isn't appropriate (because, for example, you want to only mock out one particular call) then using class attributes or module globals for the dependencies is the usual choice. For example, considering a my_subprocess.py:
# my_subprocess.py
from subprocess import Popen

def my_call(...):
    return Popen(...).communicate()
You can easily replace only the Popen call made by my_call() by assigning to my_subprocess.Popen; it wouldn't affect any other calls to subprocess.Popen (but it would replace all calls to my_subprocess.Popen, of course.) Similarly, class attributes:
import subprocess

class MyClass(object):
    Popen = staticmethod(subprocess.Popen)

    def call(self):
        return self.Popen(...).communicate(...)
When using class attributes like this, which is rarely necessary considering the options, you should take care to use staticmethod. If you don't, and the object you're inserting is a normal function object or another type of descriptor, like a property, that does something special when retrieved from a class or instance, it would do the wrong thing. Worse, if you used something that right now isn't a descriptor (like the subprocess.Popen class, in the example) it would work now, but if the object in question changed to a normal function in the future, it would break confusingly.
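To see why staticmethod matters, compare what happens when a plain Python function is stored on the class (my_popen is a hypothetical stand-in):

import subprocess

def my_popen(*args, **kwargs):
    return subprocess.Popen(*args, **kwargs)

class Broken(object):
    Popen = my_popen  # plain function: instance lookup binds it as a method

class Fixed(object):
    Popen = staticmethod(my_popen)  # staticmethod suppresses the binding

# Broken().Popen(['ls']) would call my_popen(<Broken instance>, ['ls']),
# silently passing the instance as the first argument.
# Fixed().Popen(['ls']) calls my_popen(['ls']) as intended.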
Lastly, there are just plain callbacks: if you just want to tie a particular instance of a class to a particular service, you can just pass the service (or one or more of the service's methods) to the class initializer, and have it use that:
class MyClass(object):
    def __init__(self, authenticate=None, authorize=None):
        if authenticate is None:
            authenticate = default_authenticate
        if authorize is None:
            authorize = default_authorize
        self.authenticate = authenticate
        self.authorize = authorize

    def request(self, user, password, action):
        self.authenticate(user, password)
        self.authorize(user, action)
        self._do_request(action)
...
helper = AuthService(...)
# Pass the bound methods helper.authenticate and helper.authorize to MyClass.
inst = MyClass(authenticate=helper.authenticate, authorize=helper.authorize)
inst.request(...)
When setting instance attributes like that, you never have to worry about descriptors firing, so just assigning the functions (or classes or other callables or instances) is fine.
How about this "setter-only" injection recipe?
http://code.activestate.com/recipes/413268/
It is quite Pythonic, using the "descriptor" protocol with __get__()/__set__(), but rather invasive, requiring you to replace all your attribute-setting code with a RequiredFeature instance initialized with the string name of the required Feature.
After years using Python without any DI autowiring framework and Java with Spring, I've come to realize that plain simple Python code often doesn't need a framework for dependency injection; dependency injection without autowiring (autowiring is what Guice and Spring both do in Java) is enough, i.e., just doing something like this:
def __init__(self, dep=None):  # great for unit testing!
    self.dep = dep or Dep()    # callers need not care about this either
    ...
This is pure dependency injection (quite simple), but without magical frameworks for automatically injecting the dependencies for you (i.e., autowiring), and without Inversion of Control.
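In a test, that pattern lets you hand in a stub instead of the real dependency. A sketch; Foo is a hypothetical class whose __init__ looks like the one above, and StubDep is illustrative:

class StubDep:
    def fetch(self):
        return "canned result"

foo = Foo()               # production: builds its own Dep
foo = Foo(dep=StubDep())  # test: the stub is used instead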
Though as I dealt with bigger applications this approach wasn't cutting it anymore, so I came up with injectable, a micro-framework that wouldn't feel non-Pythonic and yet would provide first-class dependency injection autowiring.
Under the motto Dependency Injection for Humans™ this is what it looks like:
# some_service.py
class SomeService:
    @autowired
    def __init__(
        self,
        database: Autowired(Database),
        message_brokers: Autowired(List[MessageBroker]),
    ):
        pending = database.retrieve_pending_messages()
        for broker in message_brokers:
            broker.send_pending(pending)

# database.py
@injectable
class Database:
    ...

# message_broker.py
class MessageBroker(ABC):
    def send_pending(self, messages):
        ...

# kafka_producer.py
@injectable
class KafkaProducer(MessageBroker):
    ...

# sqs_producer.py
@injectable
class SQSProducer(MessageBroker):
    ...
What is dependency injection?
Dependency injection is a principle that helps to decrease coupling and increase cohesion.
Coupling and cohesion describe how tightly the components are tied together.
High coupling. If the coupling is high, it's like using superglue or welding: there is no easy way to disassemble.
High cohesion. High cohesion is like using screws: very easy to disassemble and reassemble, or assemble a different way. It is the opposite of high coupling.
When cohesion is high, coupling is low.
Low coupling brings flexibility. Your code becomes easier to change and test.
How to implement the dependency injection?
Objects do not create each other anymore. They provide a way to inject the dependencies instead.
before:
import os

class ApiClient:
    def __init__(self):
        self.api_key = os.getenv('API_KEY')  # <-- dependency
        self.timeout = os.getenv('TIMEOUT')  # <-- dependency

class Service:
    def __init__(self):
        self.api_client = ApiClient()  # <-- dependency

def main() -> None:
    service = Service()  # <-- dependency
    ...

if __name__ == '__main__':
    main()
after:
import os

class ApiClient:
    def __init__(self, api_key: str, timeout: int):
        self.api_key = api_key  # <-- dependency is injected
        self.timeout = timeout  # <-- dependency is injected

class Service:
    def __init__(self, api_client: ApiClient):
        self.api_client = api_client  # <-- dependency is injected

def main(service: Service):  # <-- dependency is injected
    ...

if __name__ == '__main__':
    main(
        service=Service(
            api_client=ApiClient(
                api_key=os.getenv('API_KEY'),
                timeout=int(os.getenv('TIMEOUT')),
            ),
        ),
    )
ApiClient is decoupled from knowing where the options come from. You can read a key and a timeout from a configuration file or even get them from a database.
Service is decoupled from the ApiClient. It does not create it anymore. You can provide a stub or other compatible object.
Function main() is decoupled from Service. It receives it as an argument.
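For example, the decoupled Service can now be exercised with a stub in place of the real client. A sketch; StubApiClient is illustrative:

class StubApiClient:
    api_key = 'test-key'
    timeout = 1

service = Service(api_client=StubApiClient())  # no network or env vars needed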
Flexibility comes with a price.
Now you need to assemble and inject the objects like this:
main(
    service=Service(
        api_client=ApiClient(
            api_key=os.getenv('API_KEY'),
            timeout=int(os.getenv('TIMEOUT')),
        ),
    ),
)
The assembly code might get duplicated and it’ll become harder to change the application structure.
Conclusion
Dependency injection brings you 3 advantages:
Flexibility. The components are loosely coupled. You can easily extend or change the functionality of the system by combining the components in different ways. You can even do it on the fly.
Testability. Testing is easy because you can easily inject mocks instead of the real objects that use an API or database, etc.
Clarity and maintainability. Dependency injection helps you reveal the dependencies. Implicit becomes explicit. And "explicit is better than implicit" (PEP 20, The Zen of Python). You have all the components and dependencies defined explicitly in a container. This provides an overview of, and control over, the application structure. It is easy to understand and change it.
I believe that through the examples already presented you will understand the idea and be able to apply it to your problem, i.e. the implementation of UserService for authorization.
I recently released a DI framework for Python that might help you here. I think it's a fairly fresh take on it, but I'm not sure how 'pythonic' it is. Judge for yourself. Feedback is very welcome.
https://github.com/suned/serum

Problem using super (Python 2.5.2)

I'm writing a plugin system for my program and I can't get past one thing:
class ThingLoader(object):
    '''
    Loader class
    '''
    def loadPlugins(self):
        '''
        Get all the plugins from the plugins folder
        '''
        from diones.thingpad.plugin.IntrospectionHelper import loadClasses
        classList = loadClasses('./plugins', IPlugin)  # Gets a list of plugin classes
        self.plugins = {}  # Dictionary that should be filled with tuples of
                           # objects and their states: activated, deactivated.
        classList[0](self)  # Runs nicely
        foo = classList[1]
        print foo  # prints <class 'TestPlugin.TestPlugin'>
        foo(self)  # Raises an exception
The test plugin looks like this:
import diones.thingpad.plugin.IPlugin as plugin

class TestPlugin(plugin.IPlugin):
    '''
    classdocs
    '''
    def __init__(self, loader):
        self.name = 'Test Plugin'
        super(TestPlugin, self).__init__(loader)
Now the IPlugin looks like this:
class IPlugin(object):
    '''
    classdocs
    '''
    name = ''

    def __init__(self, loader):
        self.loader = loader

    def activate(self):
        pass
All the IPlugin classes work flawlessly by themselves, but when called by ThingLoader the program gets an exception:
File "./plugins\TestPlugin.py", line 13, in __init__
super(TestPlugin, self).__init__(loader) NameError:
global name 'super' is not defined
I looked all around and I simply don't know what is going on.
‘super’ is a builtin. Unless you went out of your way to delete builtins, you shouldn't ever see “global name 'super' is not defined”.
I'm looking at your user web link where there is a dump of IntrospectionHelper. It's very hard to read without the indentation, but it looks like you may be doing exactly that:
built_in_list = ['__builtins__', '__doc__', '__file__', '__name__']
for i in built_in_list:
    if i in module.__dict__:
        del module.__dict__[i]
That's the original module dict you're changing there, not an informational copy you are about to return! Delete these members from a live module and you can expect much more than ‘super’ to break.
It's very hard to keep track of what that module is doing, but my reaction is there is far too much magic in it. The average Python program should never need to be messing around with the import system, sys.path, and monkey-patching __magic__ module members. A little bit of magic can be a neat trick, but this is extremely fragile. Just off the top of my head from browsing it, the code could be broken by things like:
name clashes with top-level modules
any use of new-style classes
modules supplied only as compiled bytecode
zipimporter
From the incredibly round-about functions like getClassDefinitions, extractModuleNames and isFromBase, it looks to me like you still have quite a bit to learn about the basics of how Python works. (Clues: getattr, module.__name__ and issubclass, respectively.)
In this case now is not the time to be diving into import magic! It's hard. Instead, do things The Normal Python Way. It may be a little more typing to say at the bottom of a package's mypackage/__init__.py:
from mypackage import fooplugin, barplugin, bazplugin
plugins = [fooplugin.FooPlugin, barplugin.BarPlugin, bazplugin.BazPlugin]
but it'll work and be understood everywhere without relying on a nest of complex, fragile magic.
Incidentally, unless you are planning on some in-depth multiple inheritance work (and again, now may not be the time for that), you probably don't even need to use super(). The usual “IPlugin.__init__(self, ...)” method of calling a known superclass is the straightforward thing to do; super() is not always “the newer, better way of doing things” and there are things you should understand about it before you go charging into using it.
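Concretely, the plugin's __init__ can simply name the superclass; in this single-inheritance hierarchy it is equivalent to the super() call:

class TestPlugin(plugin.IPlugin):
    def __init__(self, loader):
        self.name = 'Test Plugin'
        plugin.IPlugin.__init__(self, loader)  # explicit superclass call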
Unless you're running a version of Python earlier than 2.2 (pretty unlikely), super() is definitely a built-in function (available in every scope, and without importing anything).
May be worth checking your version of Python (just start up the interactive prompt by typing python at the command line).
