Share plugin resources with implemented permission rules - python

I have multiple scripts that are exporting a same interface and they're executed using execfile() in insulated scope.
The thing is, I want them to share some resources so that each new script doesn't have to load them again from the start, thus loosing starting speed and using unnecessary amount of RAM.
The scripts are in reality much better encapsulated and guarded from malicious plug-ins than presented in example below, that's where problems for me begins.
The thing is, I want the script that creates a resource to be able to fill it with data, remove data or remove a resource, and of course access it's data.
But other scripts shouldn't be able to change another's scripts resource, just read it. I want to be sure that newly installed plug-ins cannot interfere with already loaded and running ones via abuse of shared resources.
Example:
class SharedResources:
# Here should be a shared resource manager that I tried to write
# but got stuck. That's why I ask this long and convoluted question!
# Some beginning:
def __init__ (self, owner):
self.owner = owner
def __call__ (self):
# Here we should return some object that will do
# required stuff. Read more for details.
pass
class plugin (dict):
def __init__ (self, filename):
dict.__init__(self)
# Here some checks and filling with secure versions of __builtins__ etc.
# ...
self["__name__"] = "__main__"
self["__file__"] = filename
# Add a shared resources manager to this plugin
self["SharedResources"] = SharedResources(filename)
# And then:
execfile(filename, self, self)
# Expose the plug-in interface to outside world:
def __getattr__ (self, a):
return self[a]
def __setattr__ (self, a, v):
self[a] = v
def __delattr__ (self, a):
del self[a]
# Note: I didn't use self.__dict__ because this makes encapsulation easier.
# In future I won't use object itself at all but separate dict to do it. For now let it be
----------------------------------------
# An example of two scripts that would use shared resource and be run with plugins["name"] = plugin("<filename>"):
# Presented code is same in both scripts, what comes after will be different.
def loadSomeResource ():
# Do it here...
return loadedresource
# Then Load this resource if it's not already loaded in shared resources, if it isn't then add loaded resource to shared resources:
shr = SharedResources() # This would be an instance allowing access to shared resources
if not shr.has_key("Default Resources"):
shr.create("Default Resources")
if not shr["Default Resources"].has_key("SomeResource"):
shr["Default Resources"].add("SomeResource", loadSomeResource())
resource = shr["Default Resources"]["SomeResource"]
# And then we use normally resource variable that can be any object.
# Here I Used category "Default Resources" to add and/or retrieve a resource named "SomeResource".
# I want more categories so that plugins that deal with audio aren't mixed with plug-ins that deal with video for instance. But this is not strictly needed.
# Here comes code specific for each plug-in that will use shared resource named "SomeResource" from category "Default Resources".
...
# And end of plugin script!
----------------------------------------
# And then, in main program we load plug-ins:
import os
plugins = {} # Here we store all loaded plugins
for x in os.listdir("plugins"):
plugins[x] = plugin(x)
Let say that our two scripts are stored in plugins directory and are both using some WAVE files loaded into memory.
Plugin that loads first will load the WAVE and put it into RAM.
The other plugin will be able to access already loaded WAVE but not to replace or delete it, thus messing with other plugin.
Now, I want each resource to have an owner, some id or filename of the plugin script, and that this resource is writable only by it's owner.
No tweaking or workarounds should enable the other plugin to access the first one.
I almost did it and then got stuck, and my head is spining with concepts that when implemented do the thing, but only partially.
This eats me, so I cannot concentrate any more. Any suggestion is more than welcome!
Adding:
This is what I use now without any safety included:
# Dict that will hold a category of resources (should implement some security):
class ResourceCategory (dict):
def __getattr__ (self, i): return self[i]
def __setattr__ (self, i, v): self[i] = v
def __delattr__ (self, i): del self[i]
SharedResources = {} # Resource pool
class ResourceManager:
def __init__ (self, owner):
self.owner = owner
def add (self, category, name, value):
if not SharedResources.has_key(category):
SharedResources[category] = ResourceCategory()
SharedResources[category][name] = value
def get (self, category, name):
return SharedResources[category][name]
def rem (self, category, name=None):
if name==None: del SharedResources[category]
else: del SharedResources[category][name]
def __call__ (self, category):
if not SharedResources.has_key(category):
SharedResources[category] = ResourceCategory()
return SharedResources[category]
__getattr__ = __getitem__ = __call__
# When securing, this must not be left as this, it is unsecure, can provide a way back to SharedResources pool:
has_category = has_key = SharedResources.has_key
Now a plugin capsule:
class plugin(dict):
def __init__ (self, path, owner):
dict.__init__()
self["__name__"] = "__main__"
# etc. etc.
# And when adding resource manager to the plugin, register it with this plugin as an owner
self["SharedResources"] = ResourceManager(owner)
# ...
execfile(path, self, self)
# ...
Example of a plugin script:
#-----------------------------------
# Get a category we want. (Using __call__() ) Note: If a category doesn't exist, it is created automatically.
AudioResource = SharedResources("Audio")
# Use an MP3 resource (let say a bytestring):
if not AudioResource.has_key("Beep"):
f = open("./sounds/beep.mp3", "rb")
Audio.Beep = f.read()
f.close()
# Take a reference out for fast access and nicer look:
beep = Audio.Beep # BTW, immutables doesn't propagate as references by themselves, doesn't they? A copy will be returned, so the RAM space usage will increase instead. Immutables shall be wrapped in a composed data type.
This works perfectly but, as I said, messing resources is too much easy here.
I would like an instance of ResourceManager() to be in charge to whom return what version of stored data.

So, my general approach would be this.
Have a central shared resource pool. Access through this pool would be read-only for everybody. Wrap all data in the shared pool so that no one "playing by the rules" can edit anything in it.
Each agent (plugin) maintains knowledge of what it "owns" at the time it loads it. It keeps a read/write reference for itself, and registers a reference to the resource to the centralized read-only pool.
When an plugin is loaded, it gets a reference to the central, read-only pool that it can register new resources with.
So, only addressing the issue of python native data structures (and not instances of custom classes), a fairly locked down system of read-only implementations is as follows. Note that the tricks that are used to lock them down are the same tricks that someone could use to get around the locks, so the sandboxing is very weak if someone with a little python knowledge is actively trying to break it.
import collections as _col
import sys
if sys.version_info >= (3, 0):
immutable_scalar_types = (bytes, complex, float, int, str)
else:
immutable_scalar_types = (basestring, complex, float, int, long)
# calling this will circumvent any control an object has on its own attribute lookup
getattribute = object.__getattribute__
# types that will be safe to return without wrapping them in a proxy
immutable_safe = immutable_scalar_types
def add_immutable_safe(cls):
# decorator for adding a new class to the immutable_safe collection
# Note: only ImmutableProxyContainer uses it in this initial
# implementation
global immutable_safe
immutable_safe += (cls,)
return cls
def get_proxied(proxy):
# circumvent normal object attribute lookup
return getattribute(proxy, "_proxied")
def set_proxied(proxy, proxied):
# circumvent normal object attribute setting
object.__setattr__(proxy, "_proxied", proxied)
def immutable_proxy_for(value):
# Proxy for known container types, reject all others
if isinstance(value, _col.Sequence):
return ImmutableProxySequence(value)
elif isinstance(value, _col.Mapping):
return ImmutableProxyMapping(value)
elif isinstance(value, _col.Set):
return ImmutableProxySet(value)
else:
raise NotImplementedError(
"Return type {} from an ImmutableProxyContainer not supported".format(
type(value)))
#add_immutable_safe
class ImmutableProxyContainer(object):
# the only names that are allowed to be looked up on an instance through
# normal attribute lookup
_allowed_getattr_fields = ()
def __init__(self, proxied):
set_proxied(self, proxied)
def __setattr__(self, name, value):
# never allow attribute setting through normal mechanism
raise AttributeError(
"Cannot set attributes on an ImmutableProxyContainer")
def __getattribute__(self, name):
# enforce attribute lookup policy
allowed_fields = getattribute(self, "_allowed_getattr_fields")
if name in allowed_fields:
return getattribute(self, name)
raise AttributeError(
"Cannot get attribute {} on an ImmutableProxyContainer".format(name))
def __repr__(self):
proxied = get_proxied(self)
return "{}({})".format(type(self).__name__, repr(proxied))
def __len__(self):
# works for all currently supported subclasses
return len(get_proxied(self))
def __hash__(self):
# will error out if proxied object is unhashable
proxied = getattribute(self, "_proxied")
return hash(proxied)
def __eq__(self, other):
proxied = get_proxied(self)
if isinstance(other, ImmutableProxyContainer):
other = get_proxied(other)
return proxied == other
class ImmutableProxySequence(ImmutableProxyContainer, _col.Sequence):
_allowed_getattr_fields = ("count", "index")
def __getitem__(self, index):
proxied = get_proxied(self)
value = proxied[index]
if isinstance(value, immutable_safe):
return value
return immutable_proxy_for(value)
class ImmutableProxyMapping(ImmutableProxyContainer, _col.Mapping):
_allowed_getattr_fields = ("get", "keys", "values", "items")
def __getitem__(self, key):
proxied = get_proxied(self)
value = proxied[key]
if isinstance(value, immutable_safe):
return value
return immutable_proxy_for(value)
def __iter__(self):
proxied = get_proxied(self)
for key in proxied:
if not isinstance(key, immutable_scalar_types):
# If mutable keys are used, returning them could be dangerous.
# If owner never puts a mutable key in, then integrity should
# be okay. tuples and frozensets should be okay as keys, but
# are not supported in this implementation for simplicity.
raise NotImplementedError(
"keys of type {} not supported in "
"ImmutableProxyMapping".format(type(key)))
yield key
class ImmutableProxySet(ImmutableProxyContainer, _col.Set):
_allowed_getattr_fields = ("isdisjoint", "_from_iterable")
def __contains__(self, value):
return value in get_proxied(self)
def __iter__(self):
proxied = get_proxied(self)
for value in proxied:
if isinstance(value, immutable_safe):
yield value
yield immutable_proxy_for(value)
#classmethod
def _from_iterable(cls, it):
return set(it)
NOTE: this is only tested on Python 3.4, but I tried to write it to be compatible with both Python 2 and 3.
Make the root of the shared resources a dictionary. Give a ImmutableProxyMapping of that dictionary to the plugins.
private_shared_root = {}
public_shared_root = ImmutableProxyMapping(private_shared_root)
Create an API where the plugins can register new resources to the public_shared_root, probably on a first-come-first-served basis (if it's already there, you can't register it). Pre-populate private_shared_root with any containers you know you're going to need, or any data you want to share with all plugins but you know you want to be read-only.
It might be convenient if the convention for the keys in the shared root mapping were all strings, like file-system paths (/home/dalen/local/python) or dotted paths like python library objects (os.path.expanduser). That way collision detection is immediate and trivial/obvious if plugins try to add the same resource to the pool.

Related

Is there a way to sync a serializable structure with python multiprocessing?

If you create a new Process in python, it will serialize and copy the entire available scope, as far as I understand it. If you use multiprocessing.Pipe() it also allows sending various things, not just raw bytes.
However, instead of sending, I simply want to update a variable that contains a simple POD object like this:
class MyStats:
def __init__(self):
self.bytes_read = 0
self.bytes_written = 0
So say that in a process, when I update these stats, I want to tell python to serialize it and send it to the parent process' side somehow. I don't want to have to create multiprocessing.Value for each and every one of these things, that sounds super tedious.
Is there a way to tell python to pass and overwrite a specific object property somehow?
A manager is what you need here: it will be slower but all data stored inside will be automatically synced with other processes. Here is a simple example below:
from multiprocessing.managers import BaseManager, public_methods, NamespaceProxy
from multiprocessing import Process
def make_proxy(name, cls, base=None):
"""
Args:
name : A string that should match the variable name the proxy will be assigned to
cls : The class for which you want to create a proxy for
base : If you are subclassing NamespaceProxy (or any other implementation) and want to use that subclass as the
base for this new proxy, then pass the subclass as the base using this argument
"""
exposed = public_methods(cls) + ['__getattribute__', '__setattr__', '__delattr__']
return _MakeProxyType(name, exposed, base)
def _MakeProxyType(name, exposed, base=None):
"""
Attempts to replicate multiprocessing.managers.MakeProxType properly
"""
if base is None:
base = NamespaceProxy
exposed = tuple(exposed)
dic = {}
for meth in exposed:
if hasattr(base, meth):
continue
exec('''def %s(self, *args, **kwds):
return self._callmethod(%r, args, kwds)''' % (meth, meth), dic)
ProxyType = type(name, (base,), dic)
ProxyType._exposed_ = exposed
return ProxyType
class MyStats:
def __init__(self):
self.bytes_read = 0
self.bytes_written = 0
def worker(my_stats):
my_stats.bytes_read = 100
print("Worker process read 100 bytes!")
# Remember to set the name of the variable and the "name" argument to the same value otherwise you will have trouble
# pickling this. If for some reason you cannot do this then you must change the variable's __qualname__ property to
# reflect where the object actually resides so pickle can find it.
MyStatsProxy = make_proxy('MyStatsProxy', MyStats)
if __name__ == "__main__":
# Register our proxy and start the manager process
BaseManager.register("MyStats", MyStats, MyStatsProxy)
manager = BaseManager()
manager.start()
# Create our shared instance and modify it from another process
my_stats = manager.MyStats()
p = Process(target=worker, args=(my_stats,))
p.start()
p.join()
# Check value from main process
print(f"In main process, bytes read are {my_stats.bytes_read}!")
Output
Worker process read 100 bytes!
In main process, bytes read are 100!
Check this question and its answers for more useful information about managers/registering classes and alternate methods to achieve the same result
Note: Keep in mind that managers return pickled values for all objects you access through it. So any modifications you do on mutable objects should be done from within an instance method rather than requesting the mutable object through the proxy and modifying it from outside. For example, doing below will not modify the attribute some_list in the manager at all, only the local copy (to the process) of this attribute will be modified:
my_stats.some_list[0] = "some value"
Instead, you should create an instance method for modifications and call that instead:
my_stats.modify_list(0, "some value")
Alternatively, you can also force the manager to update the mutable object by re-assigning the new value for the object:
local_copy = my_stats.some_list
local_copy[0] = "some value"
my_stats.some_list = local_copy

Add post behavior to python's assignment (=) operator

I'm working with a redis database and I'd like to integrate my prototype code to my baseline as seamlessly as possible so I'm trying to bury most of the inner workings with the interface from the python
client to the redis server code into a few base classes that I will subclass throughout my production code.
I'm wondering if the assignment operator in python = is a callable and whether if it possible to modify the callable's pre and post behavior, particularly the post behavior such that I can call object.save() so that the redis cache would be updated behind the scenes without having to explicitly call it.
For example,
# using the redis-om module
from redis_om import JsonModel
kwargs = {'attr1': 'someval1', 'attr2': 'someval2'}
jsonModel = JsonModel(**kwargs)
# as soon as assignment completes, redis database
# has the new value updated without needing to
# call jsonModel.save()
jsonModel.attr1 = 'newvalue'
You can make such things with proxy class through __getattr__ and __setattr__ methods:
class ProxySaver:
def __init__(self, model):
self.__dict__['_model'] = model
def __getattr__(self, attr):
return getattr(self._model, attr)
def __setattr__(self, attr, value):
setattr(self._model, attr, value)
self._model.save()
p = ProxySaver(jsonModel)
print(p.attr1)
p.attr1 = 'test'
But if attributes has a complex types (list, dict, objects, ... ), assignment for nested objects will not been detected and save call will be skipped (p.attr1.append('test'), p.attr1[0] = 'test2').

Extending subclass instances from superclass instances in Python

Is there any way to instantiate a subclass object as an extension of a superclass object, in such a way that the subclass retains the arguments passed to the superclass?
I'm working on a simple melody generator to get back into programming. The idea was that a Project can contain an arbitrary number of Instruments, which can have any number of Sequences.
Every subordinate object would retain all of the information of the superior objects (e.g. every instrument shares the project's port device, and so forth).
I figured I could do something like this:
import rtmidi
class Project:
def __init__(self, p_port=None):
self.port = p_port
# Getter / Setter removed for brevity
class Instrument(Project):
def __init__(self, p_channel=1)
self.channel = p_channel
# Getter / Setter removed for brevity
def port_setup():
midi_out = rtmidi.MidiOut()
selected_port = midi_out.open_port(2)
return selected_port
if __name__ == '__main__':
port = port_setup()
project = Project(p_port=port)
project.inst1 = Instrument()
print(project.port, project.inst1.port)
The expectation was that the new instrument would extend the created Project and inherit the port passed to its parent.
However, that doesn't work; the project and instrument return different objects, so there seems to be no relation between the objects at all. A quick Google search also doesn't turn up any information, which I assume means I'm really missing something.
Is there a proper way to set up nested structures like this?
Your relationship is that each Project has many Instruments. An Instrument is not a Project.
One first step could be to tell each Instrument which project is belongs to:
import rtmidi
class Project:
def __init__(self, p_port=None):
self.port = p_port
# Getter / Setter removed for brevity
class Instrument:
def __init__(self, project, p_channel=1)
self.project = project
self.channel = p_channel
# Getter / Setter removed for brevity
def port_setup():
midi_out = rtmidi.MidiOut()
selected_port = midi_out.open_port(2)
return selected_port
if __name__ == '__main__':
port = port_setup()
project = Project(p_port=port)
project.inst1 = Instrument(project)
print(project.port, project.inst1.project.port)

keep track of member variable access through instance

I am trying to write some module which keep track of member variable access
through instance.
1. is it possible to know member variable has access using instance at run time?
2. if yes, any design/pointer or idea
Purpose: I would like to write simple script which will read sample file(module) and member variable accessed by instance. So we can develop this as a part of debuging framework.
For example, if I write in main time.initial_time than my script able to detect that initial_time has been accessed by time Instance. it will be run at the run time. I mean, it will be part of existing flow
Real Purpose
The object contain 1000 value but some of them used by each module. if it's become debug framework so we can easily identify and print information of member variable access by instance. Yes each module create instance of data class.
Sample file
"""testing pylint code"""
#!/usr/bin/env py
class Sample(object):
"""create sample class"""
def __init__(self):
"""seting variable"""
self.intial_time = 0
def main():
"""main functionality"""
time = Sample()
print time.initial_time
if __name__ == " __main__":
main()
You can do it using descriptors.
Properties is a special case of descriptors but I believe they will not help you as much in this case.
Here is a descriptor that does exactly what you want:
from collections import defaultdict
class TrackedAttribute:
def __init__(self, default_value):
self.default = default_value
# Dict mapping an instance to it's value
self.instance_dict = defaultdict(lambda: default_value)
def __get__(self, inst, owner):
if inst is None:
print("Accessed from class %r" % (owner,))
return self.default
print("Accessed from instance %r" % (inst,))
return self.instance_dict[inst]
def __set__(self, inst, value):
print("Setting from instance %r" % (inst,))
self.instance_dict[inst] = value
class Simple:
time = TrackedAttribute(0)
There may be a better answer more suitable to your specific needs (trying to identify unused variables), but Python has a property decorator that you could use:
class Sample(object):
def __init__(self):
self._initial_time = 0
#property
def initial_time(self):
print('self.initial_time has been read')
return self._initial_time
>>> print(Sample().initial_time)
self.initial_time has been read
0
>>>

Is my method right for gargabe collecting circular referenced objects?

When I was playing with my newly created html module, I used weakref module to overcome the circular reference problem. Everything seems to be fine for me! but I am not sure about the way I followed and not sure about the Scope class below. I tried to have a smallest working example (Here is a link for full code). Html class is just for creating html output with python objects. The example below does not do that for simplicity, of course.
# encoding: utf-8
from __future__ import print_function, unicode_literals
import weakref
class Scope(object):
def __init__(self):
self.ref_holder = set()
def add(self, obj):
self.ref_holder.add(obj)
def __enter__(self):
return self
def __exit__(self, *args, **kwargs):
self.ref_holder = None
class Html(object):
def __init__(self, parent=None, tag="", scope=None):
self.scope = scope
if parent is None:
self.parent = None
elif type(parent) != weakref.CallableProxyType:
self.parent = weakref.proxy(parent)
if self.scope:
self.scope.add(parent)
elif parent.scope:
parent.scope.add(self)
else:
self.parent = parent
self.tag = tag
if self.scope:
self.scope.add(self)
self.children = []
def append(self, html):
if isinstance(html, basestring):
html = Html(tag=html)
return self.append(html)
elif isinstance(html, self.__class__):
self.children.append(html)
return html
else:
raise Exception("Unknown type")
def __unicode__(self):
return 'Html "{tag}" children = {children}'.format(tag=self.tag,
children=list(map(str, self.children)))
def __str__(self):
return self.__unicode__()
if __name__ == "__main__":
with Scope() as scope:
test_form = Html(tag="form", scope=scope)
test_form.append(Html(tag="label"))
test_input = Html(tag="input")
test_form.append(test_input)
print(test_form)
Here are my concerns and I will appreciate your guidance:
I call reference holder class as Scope. It just holds the references to objects even if they are not assigned to any variable so Html object is not garbage collected (note: some objects can change parent/child relation ship and therefore there is not left any strong reference to object, in the real code).
I could simply hold the object references in a list and delete it after that but using with statement seems nicer. Is the class name Scope right for this task and the way I hold references is right? Is there a good way to hold the objects' strong references created on the fly different than my method?
I believe setting the Scope.ref_holder variable to None after exiting with statement, frees all the strong references and then gc collects them. I tested this by disabling gc and calling gc.collect then no object exists as unreacable, am I right to assume this method assures there is no leakage?
EDIT
I added the link for full source code.
Code is compliant with Python 2.7
I think the point here is that you are using __exit__ matched with with in Python.
Basically it should not happen like this. In __enter__ you are returning the self, the __exit__ should remove all the references. Are you sure that you don't have any exception in the middle? Due to exception __exit__ may return false and your garbage collection may not be complete.
However to answer your question, I can tell you that the way you are making it None in the __exit__ method is forceful garbage cleaning & I think it completely depends on the OS. If self.ref_holder is somehow liked with addressed of those objects, they may be cleaned properly.
Try to do del self.ref_holder rather. Assure more powerful cleanup.
Perhaps a bit OT: why explicit class for scope?
How about something simpler instead?
#contextlib.contextmanager
def scope():
rv = set()
try:
yield rv
finally:
pass
# if you must be explicit or
# if you want side-effect in leaked scopes
rv.clear()

Categories