keep track of member variable access through instance - python

I am trying to write a module which keeps track of member variable access through an instance.
1. Is it possible to know, at run time, that a member variable has been accessed through an instance?
2. If yes, any design pointers or ideas?
Purpose: I would like to write a simple script which will read a sample file (module) and report the member variables accessed through each instance, so we can develop this as part of a debugging framework.
For example, if I write time.initial_time in main, my script should be able to detect that initial_time has been accessed through the time instance. It will run at run time, as part of the existing flow.
Real purpose
The object contains 1000 values, but only some of them are used by each module. If this becomes a debug framework, we can easily identify and print information about the member variables accessed through each instance. Yes, each module creates an instance of the data class.
Sample file
"""testing pylint code"""
#!/usr/bin/env py
class Sample(object):
"""create sample class"""
def __init__(self):
"""seting variable"""
self.intial_time = 0
def main():
"""main functionality"""
time = Sample()
print time.initial_time
if __name__ == " __main__":
main()

You can do it using descriptors.
Properties are a special case of descriptors, but I believe they will not help you as much in this case.
Here is a descriptor that does exactly what you want:
from collections import defaultdict

class TrackedAttribute:
    def __init__(self, default_value):
        self.default = default_value
        # Dict mapping an instance to its value
        self.instance_dict = defaultdict(lambda: default_value)

    def __get__(self, inst, owner):
        if inst is None:
            print("Accessed from class %r" % (owner,))
            return self.default
        print("Accessed from instance %r" % (inst,))
        return self.instance_dict[inst]

    def __set__(self, inst, value):
        print("Setting from instance %r" % (inst,))
        self.instance_dict[inst] = value

class Simple:
    time = TrackedAttribute(0)
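For example, accessing the attribute then prints the tracking messages (the variable names here are just illustrative):

s = Simple()
Simple.time        # prints: Accessed from class <class '__main__.Simple'>
s.time             # prints: Accessed from instance <__main__.Simple object at ...>
s.time = 42        # prints: Setting from instance <__main__.Simple object at ...>
print(s.time)      # prints the access message, then 42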

There may be a better answer more suitable to your specific needs (trying to identify unused variables), but Python has a property decorator that you could use:
class Sample(object):
    def __init__(self):
        self._initial_time = 0

    @property
    def initial_time(self):
        print('self.initial_time has been read')
        return self._initial_time
>>> print(Sample().initial_time)
self.initial_time has been read
0
>>>
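If you also want to detect writes, the same approach extends naturally; you could add a setter as well, something like:

class Sample(object):
    def __init__(self):
        self._initial_time = 0

    @property
    def initial_time(self):
        print('self.initial_time has been read')
        return self._initial_time

    @initial_time.setter
    def initial_time(self, value):
        print('self.initial_time has been written')
        self._initial_time = value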


Custom destructor in Python

Let's say I have two classes:
class Container():
    def __init__(self, name):
        self.name = name

class Data():
    def __init__(self):
        self._containers = []

    def add_container(self, name):
        self._containers.append(name)
        setattr(self, name, Container(name))
Now let's say
myData = Data()
myData.add_container('contA')
Now, if I do del myData.contA it of course doesn't remove name from myData._containers.
So how would I write a destructor in Container so it deletes the attribute but also removes name from the _containers list?
You seem to be used to a language with deterministic object destruction and dedicated methods for performing that destruction. Python doesn't work that way. Python has no destructors, and even if it had destructors, there is no guarantee that del myData.contA would render the Container object eligible for destruction, let alone actually destroy it.
Probably the simplest way is to just define a remove_container paralleling your add_container:
def remove_container(self, name):
    self._containers.remove(name)
    delattr(self, name)
If you really want the syntax for this operation to be del myData.contA, then hook into attribute deletion, by implementing a __delattr__ on Data:
def __delattr__(self, name):
    self._containers.remove(name)
    super().__delattr__(name)
You want to overload the __delattr__ special method: https://docs.python.org/3/reference/datamodel.html#object.delattr
class Data:
    [...]
    def __delattr__(self, name):
        # find and remove the Container from _containers
        self._containers.remove(name)
        super().__delattr__(name)
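With the __delattr__ hook in place, deleting the attribute keeps the bookkeeping consistent, for example:

myData = Data()
myData.add_container('contA')
del myData.contA           # __delattr__ also removes 'contA' from _containers
print(myData._containers)  # []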

Call a class method only once

I created the following class:
import loader
import pandas as pd

class SavTool(pd.DataFrame):
    def __init__(self, path):
        pd.DataFrame.__init__(self, data=loader.Loader(path).data)

    @property
    def path(self):
        return path

    @property
    def meta_dict(self):
        return loader.Loader(path).dict
When the class is instantiated, the instance is a pandas DataFrame, which I wanted to extend with other attributes like the path to the file and a dictionary containing meta information (called 'meta_dict').
What I want is the following: the dictionary 'meta_dict' shall be mutable. Namely, the following should work:
df = SavTool("somepath")
df.meta_dict["new_key"] = "new_value"
print df.meta_dict["new_key"]
But what happens is that every time I use the syntax df.meta_dict, the method meta_dict is called and the original meta_dict from loader.Loader is returned, so df.meta_dict cannot be changed. Therefore the code above leads to "KeyError: 'new_key'". meta_dict should be computed only once; on every later access it should just behave like an ordinary attribute, in this case a dictionary.
How can I fix this? Maybe the whole design of the class is bad and should be changed (I'm new to using classes)? Thanks for your answers!
Every time you call loader.Loader you create a new instance, and hence a new dictionary. The @property doesn't cache anything for you; it just provides a convenient way to wrap a complicated getter behind a clean interface for the caller.
Something like this should work. I also updated the path variable so it is bound correctly on the instance and returned correctly from the path property.
import loader
import pandas as pd

class SavTool(pd.DataFrame):
    def __init__(self, path):
        pd.DataFrame.__init__(self, data=loader.Loader(path).data)
        self._path = path
        self._meta_dict = loader.Loader(path).dict

    @property
    def path(self):
        return self._path

    @property
    def meta_dict(self):
        return self._meta_dict

    def update_meta_dict(self, **kwargs):
        self._meta_dict.update(kwargs)
Another way to just cache the variable is by using hasattr:
@property
def meta_dict(self):
    if not hasattr(self, "_meta_dict"):
        self._meta_dict = loader.Loader(self._path).dict
    return self._meta_dict
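On Python 3.8+ you could also get the same compute-once-then-attribute behaviour with functools.cached_property instead of the hasattr check; a minimal sketch with a plain class (not subclassing DataFrame), just to illustrate the mechanism:

from functools import cached_property
import loader

class SavTool(object):
    def __init__(self, path):
        self._path = path

    @cached_property
    def meta_dict(self):
        # runs once on first access; the result is then stored on the instance
        return loader.Loader(self._path).dict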

Alternatives to decorators for saving metadata about classes

I'm writing a GUI library, and I'd like to let the programmer provide meta-information about their program which I can use to fine-tune the GUI. I was planning to use function decorators for this purpose, for example like this:
class App:
    @Useraction(description='close the program', hotkey='ctrl+q')
    def quit(self):
        sys.exit()
The problem is that this information needs to be bound to the respective class. For example, if the program is an image editor, it might have an Image class which provides some more Useractions:
class Image:
    @Useraction(description='invert the colors')
    def invert_colors(self):
        ...
However, since the concept of unbound methods has been removed in python 3, there doesn't seem to be a way to find a function's defining class. (I found this old answer, but that doesn't work in a decorator.)
So, since it looks like decorators aren't going to work, what would be the best way to do this? I'd like to avoid having code like
class App:
    def quit(self):
        sys.exit()

Useraction(App.quit, description='close the program', hotkey='ctrl+q')
if at all possible.
For completeness' sake, the #Useraction decorator would look somewhat like this:
from collections import defaultdict

class_metadata = defaultdict(dict)

def Useraction(**meta):
    def wrap(f):
        cls = get_defining_class(f)
        class_metadata[cls][f] = meta
        return f
    return wrap
You are using decorators to add meta data to methods. That is fine. It can be done e.g. this way:
def user_action(description):
    def decorate(func):
        func.user_action = {'description': description}
        return func
    return decorate
Now, you want to collect that data and store it in a global dictionary in the form class_metadata[cls][f] = meta. For that, you need to find all decorated methods and their classes.
The simplest way to do that is probably using metaclasses. In a metaclass, you can define what happens when a class is created. In this case, go through all the methods of the class, find the decorated ones and store them in the dictionary:
import collections

class UserActionMeta(type):
    user_action_meta_data = collections.defaultdict(dict)

    def __new__(cls, name, bases, attrs):
        rtn = type.__new__(cls, name, bases, attrs)
        for attr in attrs.values():
            if hasattr(attr, 'user_action'):
                UserActionMeta.user_action_meta_data[rtn][attr] = attr.user_action
        return rtn
I have put the global dictionary user_action_meta_data in the meta class just because it felt logical. It can be anywhere.
Now, just use that in any class:
class X(metaclass=UserActionMeta):
    @user_action('Exit the application')
    def exit(self):
        pass
Static UserActionMeta.user_action_meta_data now contains the data you want:
defaultdict(<class 'dict'>, {<class '__main__.X'>: {<function exit at 0x00000000029F36C8>: {'description': 'Exit the application'}}})
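To consume the collected metadata, you would simply iterate over the nested dictionary, for example:

for cls, methods in UserActionMeta.user_action_meta_data.items():
    for func, meta in methods.items():
        print(cls.__name__, func.__name__, meta['description'])
# prints: X exit Exit the application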
I've found a way to make decorators work with the inspect module, but it's not a great solution, so I'm still open to better suggestions.
Basically what I'm doing is to traverse the interpreter stack until I find the current class. Since no class object exists at this time, I extract the class's qualname and module instead.
import inspect

def get_current_class():
    """
    Returns the name of the current module and the name of the class that is currently being created.
    Has to be called in class-level code, for example:

    def deco(f):
        print(get_current_class())
        return f

    def deco2(arg):
        def wrap(f):
            print(get_current_class())
            return f
        return wrap

    class Foo:
        print(get_current_class())

        @deco
        def f(self):
            pass

        @deco2('foobar')
        def f2(self):
            pass
    """
    frame = inspect.currentframe()
    while True:
        frame = frame.f_back
        if '__module__' in frame.f_locals:
            break
    dict_ = frame.f_locals
    cls = (dict_['__module__'], dict_['__qualname__'])
    return cls
Then in a sort of post-processing step, I use the module and class names to find the actual class object.
import sys

def postprocess():
    global class_metadata

    def findclass(module, qualname):
        scope = sys.modules[module]
        for name in qualname.split('.'):
            scope = getattr(scope, name)
        return scope

    class_metadata = {findclass(cls[0], cls[1]): meta for cls, meta in class_metadata.items()}
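Putting the pieces together, the Useraction decorator from the question would then use get_current_class instead of get_defining_class; a rough sketch of the intended workflow:

import sys
from collections import defaultdict

class_metadata = defaultdict(dict)

def Useraction(**meta):
    def wrap(f):
        cls = get_current_class()   # (module, qualname) tuple until postprocess() runs
        class_metadata[cls][f] = meta
        return f
    return wrap

class App:
    @Useraction(description='close the program', hotkey='ctrl+q')
    def quit(self):
        sys.exit()

postprocess()
# class_metadata is now keyed by the real App class object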
The problem with this solution is the delayed class lookup. If classes are overwritten or deleted, the post-processing step will find the wrong class or fail altogether. Example:
class C:
    @Useraction(hotkey='ctrl+f')
    def f(self):
        print('f')

class C:
    pass

postprocess()

Share plugin resources with implemented permission rules

I have multiple scripts that export the same interface, and they are executed using execfile() in insulated scopes.
The thing is, I want them to share some resources so that each new script doesn't have to load them again from the start, losing startup speed and using an unnecessary amount of RAM.
The scripts are in reality much better encapsulated and guarded from malicious plug-ins than presented in the example below; that is where my problems begin.
I want the script that creates a resource to be able to fill it with data, remove data or remove the resource, and of course access its data.
But other scripts shouldn't be able to change another script's resource, just read it. I want to be sure that newly installed plug-ins cannot interfere with already loaded and running ones by abusing shared resources.
Example:
class SharedResources:
    # Here should be a shared resource manager that I tried to write
    # but got stuck. That's why I ask this long and convoluted question!
    # Some beginning:
    def __init__ (self, owner):
        self.owner = owner

    def __call__ (self):
        # Here we should return some object that will do
        # required stuff. Read more for details.
        pass

class plugin (dict):
    def __init__ (self, filename):
        dict.__init__(self)
        # Here some checks and filling with secure versions of __builtins__ etc.
        # ...
        self["__name__"] = "__main__"
        self["__file__"] = filename
        # Add a shared resources manager to this plugin
        self["SharedResources"] = SharedResources(filename)
        # And then:
        execfile(filename, self, self)

    # Expose the plug-in interface to outside world:
    def __getattr__ (self, a):
        return self[a]
    def __setattr__ (self, a, v):
        self[a] = v
    def __delattr__ (self, a):
        del self[a]
    # Note: I didn't use self.__dict__ because this makes encapsulation easier.
    # In future I won't use the object itself at all but a separate dict. For now let it be.
----------------------------------------
# An example of two scripts that would use the shared resource and be run with plugins["name"] = plugin("<filename>"):
# The presented code is the same in both scripts; what comes after differs.

def loadSomeResource ():
    # Do it here...
    return loadedresource

# Load this resource if it's not already in shared resources; if it isn't, add it:
shr = SharedResources() # This would be an instance allowing access to shared resources
if not shr.has_key("Default Resources"):
    shr.create("Default Resources")
if not shr["Default Resources"].has_key("SomeResource"):
    shr["Default Resources"].add("SomeResource", loadSomeResource())
resource = shr["Default Resources"]["SomeResource"]
# And then we use the resource variable normally; it can be any object.
# Here I used the category "Default Resources" to add and/or retrieve a resource named "SomeResource".
# I want more categories so that plugins that deal with audio aren't mixed with plug-ins that deal with video, for instance. But this is not strictly needed.

# Here comes code specific to each plug-in that will use the shared resource named "SomeResource" from the category "Default Resources".
...
# And end of plugin script!
----------------------------------------
# And then, in the main program we load plug-ins:
import os
plugins = {} # Here we store all loaded plugins
for x in os.listdir("plugins"):
    plugins[x] = plugin(x)
Let's say that our two scripts are stored in the plugins directory and both use some WAVE files loaded into memory.
The plugin that loads first will load the WAVE and put it into RAM.
The other plugin will be able to access the already loaded WAVE, but not replace or delete it and thus mess with the first plugin.
Now, I want each resource to have an owner, some id or filename of the plugin script, and that resource should be writable only by its owner.
No tweaking or workarounds should enable another plugin to write to the first one's resources.
I almost did it and then got stuck, and my head is spinning with concepts that, when implemented, do the job, but only partially.
This is eating at me, so I cannot concentrate any more. Any suggestion is more than welcome!
Adding:
This is what I use now without any safety included:
# Dict that will hold a category of resources (should implement some security):
class ResourceCategory (dict):
    def __getattr__ (self, i): return self[i]
    def __setattr__ (self, i, v): self[i] = v
    def __delattr__ (self, i): del self[i]

SharedResources = {} # Resource pool

class ResourceManager:
    def __init__ (self, owner):
        self.owner = owner

    def add (self, category, name, value):
        if not SharedResources.has_key(category):
            SharedResources[category] = ResourceCategory()
        SharedResources[category][name] = value

    def get (self, category, name):
        return SharedResources[category][name]

    def rem (self, category, name=None):
        if name==None: del SharedResources[category]
        else: del SharedResources[category][name]

    def __call__ (self, category):
        if not SharedResources.has_key(category):
            SharedResources[category] = ResourceCategory()
        return SharedResources[category]

    __getattr__ = __getitem__ = __call__
    # When securing, this must not be left like this; it is insecure and can provide a way back to the SharedResources pool:
    has_category = has_key = SharedResources.has_key
Now a plugin capsule:
class plugin(dict):
    def __init__ (self, path, owner):
        dict.__init__(self)
        self["__name__"] = "__main__"
        # etc. etc.
        # And when adding the resource manager to the plugin, register this plugin as the owner
        self["SharedResources"] = ResourceManager(owner)
        # ...
        execfile(path, self, self)
        # ...
Example of a plugin script:
#-----------------------------------
# Get the category we want (using __call__()). Note: if a category doesn't exist, it is created automatically.
AudioResource = SharedResources("Audio")
# Use an MP3 resource (let's say a bytestring):
if not AudioResource.has_key("Beep"):
    f = open("./sounds/beep.mp3", "rb")
    AudioResource.Beep = f.read()
    f.close()
# Take a reference out for fast access and a nicer look:
beep = AudioResource.Beep # BTW, immutables don't propagate as references by themselves, do they? A copy will be returned, so RAM usage will increase instead. Immutables should be wrapped in a composite data type.
This works perfectly but, as I said, messing with resources is far too easy here.
I would like an instance of ResourceManager() to be in charge of deciding to whom to return which version of the stored data.
So, my general approach would be this.
Have a central shared resource pool. Access through this pool would be read-only for everybody. Wrap all data in the shared pool so that no one "playing by the rules" can edit anything in it.
Each agent (plugin) maintains knowledge of what it "owns" at the time it loads it. It keeps a read/write reference for itself, and registers a reference to the resource to the centralized read-only pool.
When a plugin is loaded, it gets a reference to the central, read-only pool that it can register new resources with.
So, only addressing the issue of python native data structures (and not instances of custom classes), a fairly locked down system of read-only implementations is as follows. Note that the tricks that are used to lock them down are the same tricks that someone could use to get around the locks, so the sandboxing is very weak if someone with a little python knowledge is actively trying to break it.
import collections as _col
import sys

if sys.version_info >= (3, 0):
    immutable_scalar_types = (bytes, complex, float, int, str)
else:
    immutable_scalar_types = (basestring, complex, float, int, long)

# calling this will circumvent any control an object has on its own attribute lookup
getattribute = object.__getattribute__

# types that will be safe to return without wrapping them in a proxy
immutable_safe = immutable_scalar_types

def add_immutable_safe(cls):
    # decorator for adding a new class to the immutable_safe collection
    # Note: only ImmutableProxyContainer uses it in this initial
    # implementation
    global immutable_safe
    immutable_safe += (cls,)
    return cls

def get_proxied(proxy):
    # circumvent normal object attribute lookup
    return getattribute(proxy, "_proxied")

def set_proxied(proxy, proxied):
    # circumvent normal object attribute setting
    object.__setattr__(proxy, "_proxied", proxied)

def immutable_proxy_for(value):
    # Proxy for known container types, reject all others
    if isinstance(value, _col.Sequence):
        return ImmutableProxySequence(value)
    elif isinstance(value, _col.Mapping):
        return ImmutableProxyMapping(value)
    elif isinstance(value, _col.Set):
        return ImmutableProxySet(value)
    else:
        raise NotImplementedError(
            "Return type {} from an ImmutableProxyContainer not supported".format(
                type(value)))

@add_immutable_safe
class ImmutableProxyContainer(object):
    # the only names that are allowed to be looked up on an instance through
    # normal attribute lookup
    _allowed_getattr_fields = ()

    def __init__(self, proxied):
        set_proxied(self, proxied)

    def __setattr__(self, name, value):
        # never allow attribute setting through normal mechanism
        raise AttributeError(
            "Cannot set attributes on an ImmutableProxyContainer")

    def __getattribute__(self, name):
        # enforce attribute lookup policy
        allowed_fields = getattribute(self, "_allowed_getattr_fields")
        if name in allowed_fields:
            return getattribute(self, name)
        raise AttributeError(
            "Cannot get attribute {} on an ImmutableProxyContainer".format(name))

    def __repr__(self):
        proxied = get_proxied(self)
        return "{}({})".format(type(self).__name__, repr(proxied))

    def __len__(self):
        # works for all currently supported subclasses
        return len(get_proxied(self))

    def __hash__(self):
        # will error out if proxied object is unhashable
        proxied = getattribute(self, "_proxied")
        return hash(proxied)

    def __eq__(self, other):
        proxied = get_proxied(self)
        if isinstance(other, ImmutableProxyContainer):
            other = get_proxied(other)
        return proxied == other

class ImmutableProxySequence(ImmutableProxyContainer, _col.Sequence):
    _allowed_getattr_fields = ("count", "index")

    def __getitem__(self, index):
        proxied = get_proxied(self)
        value = proxied[index]
        if isinstance(value, immutable_safe):
            return value
        return immutable_proxy_for(value)

class ImmutableProxyMapping(ImmutableProxyContainer, _col.Mapping):
    _allowed_getattr_fields = ("get", "keys", "values", "items")

    def __getitem__(self, key):
        proxied = get_proxied(self)
        value = proxied[key]
        if isinstance(value, immutable_safe):
            return value
        return immutable_proxy_for(value)

    def __iter__(self):
        proxied = get_proxied(self)
        for key in proxied:
            if not isinstance(key, immutable_scalar_types):
                # If mutable keys are used, returning them could be dangerous.
                # If owner never puts a mutable key in, then integrity should
                # be okay. tuples and frozensets should be okay as keys, but
                # are not supported in this implementation for simplicity.
                raise NotImplementedError(
                    "keys of type {} not supported in "
                    "ImmutableProxyMapping".format(type(key)))
            yield key

class ImmutableProxySet(ImmutableProxyContainer, _col.Set):
    _allowed_getattr_fields = ("isdisjoint", "_from_iterable")

    def __contains__(self, value):
        return value in get_proxied(self)

    def __iter__(self):
        proxied = get_proxied(self)
        for value in proxied:
            if isinstance(value, immutable_safe):
                yield value
            else:
                yield immutable_proxy_for(value)

    @classmethod
    def _from_iterable(cls, it):
        return set(it)
NOTE: this is only tested on Python 3.4, but I tried to write it to be compatible with both Python 2 and 3.
Make the root of the shared resources a dictionary. Give an ImmutableProxyMapping of that dictionary to the plugins.
private_shared_root = {}
public_shared_root = ImmutableProxyMapping(private_shared_root)
Create an API where the plugins can register new resources to the public_shared_root, probably on a first-come-first-served basis (if it's already there, you can't register it). Pre-populate private_shared_root with any containers you know you're going to need, or any data you want to share with all plugins but you know you want to be read-only.
It might be convenient if the convention for the keys in the shared root mapping were all strings, like file-system paths (/home/dalen/local/python) or dotted paths like python library objects (os.path.expanduser). That way collision detection is immediate and trivial/obvious if plugins try to add the same resource to the pool.
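A minimal sketch of such a registration API, building on the private_shared_root and public_shared_root above (the names register_resource and owner are just illustrative):

def register_resource(owner, path, resource):
    # first-come-first-served: once a path is taken it cannot be replaced
    if path in private_shared_root:
        raise KeyError("Resource %r is already registered" % (path,))
    private_shared_root[path] = resource
    return resource

# Example: a plugin registers a sound it loaded; everyone else reads it
# back through public_shared_root, which only hands out read-only proxies.
register_resource("pluginA", "audio/beep", {"data": b"...", "rate": 44100})
beep = public_shared_root["audio/beep"]   # an ImmutableProxyMapping, read-only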

Decorator to register Python methods in PyCLIPS

I make use of PyCLIPS to integrate CLIPS into Python. Python methods are registered in CLIPS using clips.RegisterPythonFunction(method, optional-name). Since I have to register several functions and want to keep the code clear, I am looking for a decorator to do the registration.
This is how it is done now:
class CLIPS(object):
    ...
    def __init__(self, data):
        self.data = data
        clips.RegisterPythonFunction(self.pyprint, "pyprint")

    def pyprint(self, value):
        print self.data, "".join(map(str, value))
and this is how I would like to do it:
class CLIPS(object):
    ...
    def __init__(self, data):
        self.data = data
        #clips.RegisterPythonFunction(self.pyprint, "pyprint")

    @clips_callable
    def pyprint(self, value):
        print self.data, "".join(map(str, value))
It keeps the coding of the methods and registering them in one place.
NB: I use this in a multiprocessor set-up in which the CLIPS process runs in a separate process like this:
import clips
import multiprocessing

class CLIPS(object):
    def __init__(self, data):
        self.environment = clips.Environment()
        self.data = data
        clips.RegisterPythonFunction(self.pyprint, "pyprint")
        self.environment.Load("test.clp")

    def Run(self, cycles=None):
        self.environment.Reset()
        self.environment.Run()

    def pyprint(self, value):
        print self.data, "".join(map(str, value))

class CLIPSProcess(multiprocessing.Process):
    def run(self):
        p = multiprocessing.current_process()
        self.c = CLIPS("%s %s" % (p.name, p.pid))
        self.c.Run()

if __name__ == "__main__":
    p = multiprocessing.current_process()
    c = CLIPS("%s %s" % (p.name, p.pid))
    c.Run()
    # Now run CLIPS from another process
    cp = CLIPSProcess()
    cp.start()
It should be fairly simple to do it like this:
# mock clips for testing
class clips:
    @staticmethod
    def RegisterPythonFunction(func, name):
        print "register: ", func, name

def clips_callable(fnc):
    clips.RegisterPythonFunction(fnc, fnc.__name__)
    return fnc

@clips_callable
def test():
    print "test"

test()
Edit: if used on a class method, it will register the unbound method only, so it won't work if the function is called without an instance of the class as the first argument. Therefore this is usable for registering module-level functions, but not methods. To do that, you'll have to register them in __init__.
It seems that the elegant solution proposed by mata wouldn't work because the CLIPS environment should be initialized before registering methods to it.
I'm not a Python expert, but from some searching it seems that a combination of inspect.getmembers() and hasattr() will do the trick for you - you could loop over all members of your class and register the ones that have the clips_callable attribute with CLIPS.
Got it working now by using a decorator to set an attribute on the methods to be registered in CLIPS, and using inspect in __init__ to fetch those methods and register them. I could have used a naming strategy as well, but I prefer a decorator because it makes the registering more explicit. Python functions can be registered before initializing a CLIPS environment. This is what I have done.
import inspect

def clips_callable(func):
    from functools import wraps
    @wraps(func)
    def wrapper(*__args, **__kw):
        return func(*__args, **__kw)
    setattr(wrapper, "clips_callable", True)
    return wrapper

class CLIPS(object):
    def __init__(self, data):
        members = inspect.getmembers(self, inspect.ismethod)
        for name, method in members:
            try:
                if method.clips_callable:
                    clips.RegisterPythonFunction(method, name)
            except AttributeError:
                pass
    ...

    @clips_callable
    def pyprint(self, value):
        print self.data, "".join(map(str, value))
For completeness, the CLIPS code in test.clp is included below.
(defrule MAIN::start-me-up
=>
(python-call pyprint "Hello world")
)
If somebody knows a more elegant approach, please let me know.
