I'm struggling to pickle a wrapped function when I use a custom callable class as a wrapper.
I have a callable class "Dependee" that keeps track of dependencies for a wrapped function with a member variable "depends_on". I'd like to use a decorator to wrap functions and also be able to pickle the resulting wrapped function.
So I define my dependee class. Note the use of functools.update_wrapper.
>>> import functools
>>> import pickle
>>> class Dependee:
...
...     def __init__(self, func, depends_on=None):
...         self.func = func
...         self.depends_on = depends_on or []
...         functools.update_wrapper(self, func)
...
...     def __call__(self, *args, **kwargs):
...         return self.func(*args, **kwargs)
...
Then I define my decorator such that it will return an instance of the Dependee wrapper class.
>>> class depends:
...
...     def __init__(self, on=None):
...         self.depends_on = on or []
...
...     def __call__(self, func):
...         return Dependee(func, self.depends_on)
...
Here's an example of a wrapped function.
>>> @depends(on=["foo", "bar"])
... def sum(x, y): return x+y
...
The member variable seems to be accessible.
>>> print(sum.depends_on)
['foo', 'bar']
I can call the function as expected.
>>> print(sum(1,2))
3
But I can't pickle the wrapped instance.
>>> print(pickle.dumps(sum))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <function sum at 0x7f543863fbf8>: it's not the same object as __main__.sum
What am I missing? How can I give pickle a more appropriately qualified name so that it can find the instance rather than the original function? Note that manual wrapping works just fine.
>>> def sum2_func(x,y): return x+y
...
>>> sum2 = Dependee(sum2_func, depends_on=["foo", "bar"])
>>> print(sum2.depends_on)
['foo', 'bar']
>>> print(sum2(1,2))
3
>>> print(pickle.loads(pickle.dumps(sum2)).depends_on)
['foo', 'bar']
You just need a better serializer, like dill. As for how it works, dill registers a lot of Python types with the equivalent of copy_reg; it also treats __main__ much like a module, and it can serialize either by reference or by object. That last point matters if you want to serialize a function or class and take its definition along with the pickle. The resulting pickle is a little bigger than one serialized by reference, but it's more robust.
Here's your code exactly:
>>> import dill
>>> import functools
>>> class Dependee:
...     def __init__(self, func, depends_on=None):
...         self.func = func
...         self.depends_on = depends_on or []
...         functools.update_wrapper(self, func)
...     def __call__(self, *args, **kwargs):
...         return self.func(*args, **kwargs)
...
>>>
>>> class depends:
...     def __init__(self, on=None):
...         self.depends_on = on or []
...     def __call__(self, func):
...         return Dependee(func, self.depends_on)
...
>>> @depends(on=['foo','bar'])
... def sum(x,y): return x+y
...
>>> print(sum.depends_on)
['foo', 'bar']
>>> print(sum(1,2))
3
>>> _sum = dill.dumps(sum)
>>> sum_ = dill.loads(_sum)
>>> print(sum_(1,2))
3
>>> print(sum_.depends_on)
['foo', 'bar']
>>>
Get dill here: https://github.com/uqfoundation
Yep, well-known pickle problem -- can't pickle functions or classes that can't just be retrieved by their name in the module. See e.g https://code.google.com/p/modwsgi/wiki/IssuesWithPickleModule for clear examples (specifically on how this affects modwsgi, but also of the issue in general).
In this case since all you're doing is adding attributes to the function, you can get away with a simplified approach:
class depends:
    def __init__(self, on=None):
        self.depends_on = on or []
    def __call__(self, func):
        func.func = func
        func.depends_on = self.depends_on or []
        return func
The return func is the key idea: return the same object that's being decorated (possibly after decorating it, as here, with additional attributes), not a different object, or else the name-vs-identity issue crops up.
Now this will work (just your original code, only changing depends as above):
$ python d.py
['foo', 'bar']
3
c__main__
sum
p0
.
Of course, this isn't a general-purpose solution (it only works if it's feasible for the decorator to return the same object it's decorating), just one that works in your example.
I am not aware of any serialization approach able to serialize and de-serialize Python objects without this limitation, alas.
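For the specific case in the question (a decorated module-level function), one possible workaround is to have the wrapper pickle itself by reference, the same way pickle treats plain functions. The sketch below is not taken from the answers above; it assumes that adding a __reduce__ method to Dependee is acceptable, and it rewrites depends as a plain function and renames the example to total purely for brevity. When __reduce__ returns a string, pickle interprets it as the name of a module-level global. This only works when that name is actually bound to the Dependee instance (so it would fail for the manually wrapped sum2), and the unpickling process must be able to import the same module.

```python
import functools
import pickle


class Dependee:
    def __init__(self, func, depends_on=None):
        self.func = func
        self.depends_on = depends_on or []
        # Copies __name__, __qualname__, __module__, ... onto the instance.
        functools.update_wrapper(self, func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __reduce__(self):
        # Returning a string asks pickle to store a reference to the
        # module-level global of that name instead of the object's state.
        return self.__qualname__


def depends(on=None):
    def decorator(func):
        return Dependee(func, on)
    return decorator


@depends(on=["foo", "bar"])
def total(x, y):
    return x + y


# Round-trip: unpickling looks the name up again in the defining module.
restored = pickle.loads(pickle.dumps(total))
```

Because only a name is stored, restored is the very same object as total, and depends_on is not part of the pickle at all; it is recreated when the module is imported.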
In Python a function is a first-class object. A class can be called, so you can replace a function with a class. But can you make a function behave like a class? Can you add and remove attributes, or call inner functions (then called methods), in a function?
I found a way to do this via code inspection.
import inspect

class AddOne(object):
    """class definition"""
    def __init__(self, num):
        self.num = num

    def getResult(self):
        """
        class method
        """
        def addOneFunc(num):
            "inner function"
            return num + 1
        return addOneFunc(self.num)

if __name__ == '__main__':
    two = AddOne(1)
    two_src = '\n'.join([line[4:] for line in inspect.getsource(AddOne.getResult).split('\n')])
    one_src = '\n'.join([line[4:] for line in two_src.split('\n')
                         if line[:4] == '    ' and line[4:8] == '    ' or line[4:8] == 'def '])
    one_co = compile(one_src, '<string>', 'exec')
    exec one_co
    print addOneFunc(5)
    print addOneFunc.__doc__
But is there a way to access the local variables and functions defined in a class in a more direct way?
EDIT
The question is about how to access the inner structure of Python to get a better understanding. Of course I wouldn't do this in normal programming. The question arose when we had a discussion about private variables in Python. My opinion was that they are against the philosophy of the language, so someone came up with the example above. At the moment it seems he is right: you cannot access the function inside a function without the inspect module, which effectively makes that function private. With co_varnames we are awfully close, because we already have the name of the function. But where is the namespace dictionary that holds the name? If you try to use
getResult.__dict__
it is empty. What I would like to get is an answer from Python like
function addOneFunc at <0xXXXXXXXXX>
You can consider a function to be an instance of a class that only implements __call__, i.e.
def foo(bar):
    return bar
is roughly equivalent to
class Foo(object):
    def __call__(self, bar):
        return bar
foo = Foo()
Function instances have a __dict__ attribute, so you can freely add new attributes to them.
Adding an attribute to a function can be used, for example, to implement a memoization decorator, which caches previous calls to a function:
import functools

def memo(f):
    @functools.wraps(f)
    def func(*args):
        if args not in func.cache:  # access attribute
            func.cache[args] = f(*args)
        return func.cache[args]
    func.cache = {}  # add attribute
    return func
Note that this attribute can also be accessed inside the function, although it can't be defined until after the function.
You could therefore do something like:
>>> def foo(baz):
        def multiply(x, n):
            return x * n
        return multiply(foo.bar(baz), foo.n)

>>> def bar(baz):
        return baz

>>> foo.bar = bar
>>> foo.n = 2
>>> foo('baz')
'bazbaz'
>>> foo.bar = len
>>> foo('baz')
6
(although it's possible that nobody would thank you for it!)
Note, however, that multiply, which was not made an attribute of foo, is not accessible from outside the function:
>>> foo.multiply(1, 2)
Traceback (most recent call last):
  File "<pyshell#20>", line 1, in <module>
    foo.multiply(1, 2)
AttributeError: 'function' object has no attribute 'multiply'
The other question addresses exactly what you're trying to do:
>>> import inspect
>>> import new
>>> class AddOne(object):
        """Class definition."""
        def __init__(self, num):
            self.num = num
        def getResult(self):
            """Class method."""
            def addOneFunc(num):
                "inner function"
                return num + 1
            return addOneFunc(self.num)

>>> two = AddOne(1)
>>> for c in two.getResult.func_code.co_consts:
        if inspect.iscode(c):
            print new.function(c, globals())

<function addOneFunc at 0x0321E930>
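The session above relies on Python 2 only names (the new module and func_code). Under Python 3, roughly the same trick might be written with types.FunctionType and __code__. This is a minimal sketch: the snake_case names are illustrative renames, and it only works this simply because the inner function has no closure variables.

```python
import inspect
import types


class AddOne:
    def __init__(self, num):
        self.num = num

    def get_result(self):
        def add_one_func(num):
            "inner function"
            return num + 1
        return add_one_func(self.num)


# The compiled body of the inner function is stored as a constant of the
# outer function's code object; wrap each such code object in a function.
inner_functions = [
    types.FunctionType(const, globals())
    for const in AddOne.get_result.__code__.co_consts
    if inspect.iscode(const)
]

add_one_func = inner_functions[0]
result = add_one_func(5)
```

The rebuilt function is a fresh object created from the stored code; it is not the one that get_result creates on each call.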
Not sure if the following is what you're thinking about, but you can do this:
>>> def f(x):
...     print(x)
...
>>> f.a = 1
>>> f.a
1
>>> f(54)
54
>>> f.a = f
>>> f
<function f at 0x7fb03579b320>
>>> f.a
<function f at 0x7fb03579b320>
>>> f.a(2)
2
So you can assign attributes to a function, and those attributes can be variables or functions (note that f.a = f was chosen for simplicity; you can assign f.a to any function of course).
If you want to access the local variables inside the function, I think it's more difficult, and you may indeed need to resort to introspection. The example below uses the func_code attribute:
>>> def f(x):
...     a = 1
...     return x * a
...
>>> f.func_code.co_nlocals
2
>>> f.func_code.co_varnames
('x', 'a')
>>> f.func_code.co_consts
(None, 1)
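Note that func_code is the Python 2 spelling. As a small sketch, assuming Python 3, the same information is available through the __code__ attribute:

```python
def f(x):
    a = 1
    return x * a


code = f.__code__                # Python 3 name for f.func_code
local_names = code.co_varnames   # arguments first, then other locals
local_count = code.co_nlocals
```

This exposes only the names of the locals; their values exist only while a call is in progress.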
>>> def hehe():
...     return "spam"
...
>>> repr(hehe)
'<function hehe at 0x7fe5624e29b0>'
I want to have:
>>> repr(hehe)
'hehe function created by awesome programmer'
How do I do that? Putting __repr__ inside hehe function does not work.
EDIT:
In case you guys are wondering why I want to do this:
>>> defaultdict(hehe)
defaultdict(<function hehe at 0x7f0e0e252280>, {})
I just don't like the way it shows here.
No, you cannot change the representation of a function object; if you wanted to add documentation, you'd add a docstring:
def foo():
    """Frob the bar baz"""
and access that with help(foo) or print foo.__doc__.
You can create a callable object with a custom __repr__, which acts just like a function:
class MyCallable(object):
    def __call__(self):
        return "spam"
    def __repr__(self):
        return 'hehe function created by awesome programmer'
Demo:
>>> class MyCallable(object):
...     def __call__(self):
...         return "spam"
...     def __repr__(self):
...         return 'hehe function created by awesome programmer'
...
>>> hehe = MyCallable()
>>> hehe
hehe function created by awesome programmer
>>> hehe()
'spam'
Usually, when you want to change something about the function, say function signature, function behavior or function attributes, you should consider using a decorator. So here is how you might implement what you want:
class change_repr(object):
    def __init__(self, functor):
        self.functor = functor
        # let's copy some key attributes from the original function
        self.__name__ = functor.__name__
        self.__doc__ = functor.__doc__

    def __call__(self, *args, **kwargs):
        return self.functor(*args, **kwargs)

    def __repr__(self):
        return '<function %s created by ...>' % self.functor.__name__

@change_repr
def f():
    return 'spam'

print f()  # spam
print repr(f)  # <function f created by ...>
Note that you can only use a class-based decorator here, since you need to override the __repr__ method, which you can't do with a function object.
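For Python 3, the same idea might be sketched with functools.update_wrapper doing the attribute copying. The names with_repr and _ReprWrapper are hypothetical, and the {name} placeholder is just one possible convention for the repr text.

```python
import functools
from collections import defaultdict


class _ReprWrapper:
    """Callable that forwards to func but reports a custom repr."""

    def __init__(self, func, text):
        self.func = func
        self.text = text
        # Copy __name__, __doc__, etc. from the wrapped function.
        functools.update_wrapper(self, func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __repr__(self):
        return self.text.format(name=self.func.__name__)


def with_repr(text):
    """Decorator factory: text may contain a {name} placeholder."""
    def decorator(func):
        return _ReprWrapper(func, text)
    return decorator


@with_repr("<function {name} created by awesome programmer>")
def hehe():
    return "spam"


shown = repr(defaultdict(hehe))
```

Since defaultdict builds its own repr from the repr of its factory, the custom text shows up there as well.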
Not directly the answer to your question, but perhaps you really want a docstring?
>>> def hehe():
...     '''hehe function created by awesome programmer'''
...     return 'spam'
...
>>> help(hehe)
Help on function hehe in module __main__:

hehe()
    hehe function created by awesome programmer
Here's a slightly more flexible version of what's in Alexander Zhukov's answer:
def representation(repr_text):
    class Decorator(object):
        def __init__(self, functor):
            self.functor = functor
        def __call__(self, *args, **kwargs):
            return self.functor(*args, **kwargs)
        def __repr__(self):
            return (repr_text % self.functor.__name__ if '%' in repr_text
                    else repr_text)
    return Decorator

from collections import defaultdict

@representation('<function %s created by awesome programmer>')
def f():
    return list

dd = defaultdict(f)
print repr(dd)
Output:
defaultdict(<function f created by awesome programmer>, {})
Since representation() returns a decorator, if you wanted the same boilerplate on several functions you could do something like this:
myrepr = representation('<function %s created by awesome programmer>')

@myrepr
def f():
    ...

@myrepr
def g():
    ...
etc
I have classes whose attributes are defined with the @property decorator. They function as getter and setter using try and except clauses inside them. If an attribute is not set, the property gets data from the database and uses it to instantiate objects from other classes. I tried to keep the example short, but the code used to instantiate attribute objects is a little different for each attribute. What they have in common is the try-except at the beginning.
class SubClass(TopClass):

    @property
    def thing(self):
        try:
            return self._thing
        except AttributeError:
            # We don't have any thing yet
            pass
        thing = get_some_thing_from_db('thing')
        if not thing:
            raise AttributeError()
        self._thing = TheThing(thing)
        return self._thing

    @property
    def another_thing(self):
        try:
            return self._another_thing
        except AttributeError:
            # We don't have things like this yet
            pass
        another_thing = get_some_thing_from_db('another')
        if not another_thing:
            raise AttributeError()
        self._another_thing = AnotherThing(another_thing)
        return self._another_thing

    ...etc...

    @property
    def one_more_thing(self):
        try:
            return self._one_more_thing
        except AttributeError:
            # We don't have this thing yet
            pass
        one_thing = get_some_thing_from_db('one')
        if not one_thing:
            raise AttributeError()
        self._one_more_thing = OneThing(one_thing)
        return self._one_more_thing
My question: is this a proper (i.e. Pythonic) way of doing this? To me it seems a bit awkward to add the try-except segment on top of everything. On the other hand, it keeps the code short. Or is there a better way of defining attributes?
So long as you are using at least Python 3.2, use the functools.lru_cache() decorator.
import functools

class SubClass(TopClass):

    @property
    @functools.lru_cache()
    def thing(self):
        thing = get_some_thing_from_db('thing')
        if not thing:
            raise AttributeError()
        return TheThing(thing)
A quick runnable example:
>>> import functools
>>> class C:
        @property
        @functools.lru_cache()
        def foo(self):
            print("Called foo")
            return 42

>>> c = C()
>>> c.foo
Called foo
42
>>> c.foo
42
If you have a lot of these you can combine the decorators:
>>> def lazy_property(f):
        return property(functools.lru_cache()(f))

>>> class C:
        @lazy_property
        def foo(self):
            print("Called foo")
            return 42

>>> c = C()
>>> c.foo
Called foo
42
>>> c.foo
42
If you are still on an older version of Python, there's a fully featured backport of lru_cache on ActiveState. In this case, though, you're not passing any parameters when you call it, so you could probably replace it with something much simpler.
@YAmikep asks how to access the cache_info() method of lru_cache. It's a little bit messy, but you can still access it through the property object:
>>> C.foo.fget.cache_info()
CacheInfo(hits=0, misses=1, maxsize=128, currsize=1)
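The "something much simpler" mentioned above might look like the following hand-rolled descriptor. This is only a sketch under the assumption that each instance has a normal __dict__; the name lazy_property is reused here for a different implementation than the property/lru_cache combination shown earlier.

```python
class lazy_property:
    """Non-data descriptor: compute on first access, then store on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        value = self.func(instance)
        # The instance attribute now shadows this descriptor, so later
        # lookups never reach __get__ again.
        instance.__dict__[self.name] = value
        return value


calls = []


class C:
    @lazy_property
    def foo(self):
        calls.append("foo")
        return 42


c = C()
first, second = c.foo, c.foo
```

Unlike the lru_cache version, the cached value lives on the instance itself, so it goes away together with the instance and does not require the instance to be hashable.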
I would like to know the Pythonic way of initializing a class member only when it is accessed, if it is accessed at all.
I tried the code below and it is working but is there something simpler than that?
class MyClass(object):
    _MY_DATA = None

    @staticmethod
    def _retrieve_my_data():
        my_data = ...  # costly database call
        return my_data

    @classmethod
    def get_my_data(cls):
        if cls._MY_DATA is None:
            cls._MY_DATA = MyClass._retrieve_my_data()
        return cls._MY_DATA
You could use a @property on the metaclass instead:
class MyMetaClass(type):
    @property
    def my_data(cls):
        if getattr(cls, '_MY_DATA', None) is None:
            my_data = ...  # costly database call
            cls._MY_DATA = my_data
        return cls._MY_DATA

class MyClass(metaclass=MyMetaClass):
    # ...
This makes my_data an attribute on the class, so the expensive database call is postponed until you try to access MyClass.my_data. The result of the database call is cached by storing it in MyClass._MY_DATA, so the call is only made once for the class.
For Python 2, use class MyClass(object): and add a __metaclass__ = MyMetaClass attribute in the class definition body to attach the metaclass.
Demo:
>>> class MyMetaClass(type):
...     @property
...     def my_data(cls):
...         if getattr(cls, '_MY_DATA', None) is None:
...             print("costly database call executing")
...             my_data = 'bar'
...             cls._MY_DATA = my_data
...         return cls._MY_DATA
...
>>> class MyClass(metaclass=MyMetaClass):
...     pass
...
>>> MyClass.my_data
costly database call executing
'bar'
>>> MyClass.my_data
'bar'
This works because a data descriptor like property is looked up on the parent type of an object; for classes that's type, and type can be extended by using metaclasses.
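One consequence worth illustrating is that a property defined on the metaclass is visible on the class but not on its instances. The sketch below is a Python 3 variation of the demo above; LazyMeta and the calls list are illustrative names only.

```python
calls = []


class LazyMeta(type):
    @property
    def my_data(cls):
        if getattr(cls, "_MY_DATA", None) is None:
            calls.append(cls.__name__)   # stands in for the costly call
            cls._MY_DATA = "bar"
        return cls._MY_DATA


class MyClass(metaclass=LazyMeta):
    pass


value = MyClass.my_data   # triggers the "costly" branch
MyClass.my_data           # served from MyClass._MY_DATA

# Attribute lookup on an instance searches the class and its bases,
# not the metaclass, so the property is not found there.
try:
    MyClass().my_data
except AttributeError:
    instance_access_failed = True
else:
    instance_access_failed = False
```

If instances also need the value, they can read it through type(self).my_data.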
This answer is for a typical instance attribute/method only, not for a class attribute/classmethod, or staticmethod.
For Python 3.8+, how about using the cached_property decorator? It memoizes.
from functools import cached_property

class MyClass:
    @cached_property
    def my_lazy_attr(self):
        print("Initializing and caching attribute, once per class instance.")
        return 7**7**8
For Python 3.2+, how about using both property and lru_cache decorators? The latter memoizes.
from functools import lru_cache

class MyClass:
    @property
    @lru_cache()
    def my_lazy_attr(self):
        print("Initializing and caching attribute, once per class instance.")
        return 7**7**8
Credit: answer by Maxime R.
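To make the per-instance caching of cached_property concrete, here is a small sketch (Python 3.8+). The Report class and its fields are made up for illustration; deleting the attribute to drop the cached value is the documented way to force a recomputation.

```python
from functools import cached_property


class Report:
    def __init__(self, rows):
        self.rows = rows
        self.computations = 0

    @cached_property
    def total(self):
        self.computations += 1
        return sum(self.rows)


r = Report([1, 2, 3])
first, second = r.total, r.total   # computed once, then read from r.__dict__

del r.total                        # drop the cached value
r.rows.append(4)
third = r.total                    # recomputed on the next access
```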
Another approach to make the code cleaner is to write a wrapper function that does the desired logic:
def memoize(f):
    def wrapped(*args, **kwargs):
        if hasattr(wrapped, '_cached_val'):
            return wrapped._cached_val
        result = f(*args, **kwargs)
        wrapped._cached_val = result
        return result
    return wrapped
You can use it as follows:
@memoize
def expensive_function():
    print "Computing expensive function..."
    import time
    time.sleep(1)
    return 400

print expensive_function()
print expensive_function()
print expensive_function()
Which outputs:
Computing expensive function...
400
400
400
Now your classmethod would look as follows, for example:
class MyClass(object):
    @classmethod
    @memoize
    def retrieve_data(cls):
        print "Computing data"
        import time
        time.sleep(1)  # costly DB call
        my_data = 40
        return my_data

print MyClass.retrieve_data()
print MyClass.retrieve_data()
print MyClass.retrieve_data()
Output:
Computing data
40
40
40
Note that this will cache just one value for any set of arguments to the function, so if you want to compute different values depending on input values, you'll have to make memoize a bit more complicated.
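A "bit more complicated" memoize that keeps one cached value per set of arguments could be sketched as follows (Python 3). It assumes all arguments are hashable; the square function and the calls list exist only for the demonstration. The standard library's functools.lru_cache provides the same behaviour with more features.

```python
import functools


def memoize(f):
    cache = {}

    @functools.wraps(f)
    def wrapped(*args, **kwargs):
        # Build a hashable key from positional and keyword arguments.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = f(*args, **kwargs)
        return cache[key]

    wrapped.cache = cache  # exposed for inspection or manual clearing
    return wrapped


calls = []


@memoize
def square(x):
    calls.append(x)
    return x * x


results = [square(3), square(3), square(4)]
```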
Consider the pip-installable Dickens package which is available for Python 3.5+. It has a descriptors package which provides the relevant cachedproperty and cachedclassproperty decorators, the usage of which is shown in the example below. It seems to work as expected.
from descriptors import cachedproperty, classproperty, cachedclassproperty

class MyClass:
    FOO = 'A'

    def __init__(self):
        self.bar = 'B'

    @cachedproperty
    def my_cached_instance_attr(self):
        print('Initializing and caching attribute, once per class instance.')
        return self.bar * 2

    @cachedclassproperty
    def my_cached_class_attr(cls):
        print('Initializing and caching attribute, once per class.')
        return cls.FOO * 3

    @classproperty
    def my_class_property(cls):
        print('Calculating attribute without caching.')
        return cls.FOO + 'C'
Ring provides an lru_cache-like interface that also works with any kind of descriptor: https://ring-cache.readthedocs.io/en/latest/quickstart.html#method-classmethod-staticmethod
class Page(object):
    (...)

    @ring.lru()
    @classmethod
    def class_content(cls):
        return cls.base_content

    @ring.lru()
    @staticmethod
    def example_dot_com():
        return requests.get('http://example.com').content
See the link for more details.
I want to count how many times each function or variable is invoked in existing code written in Python.
What I thought of is overriding the object's __getattribute__ method, as below:
acc = {}

class object(object):
    def __getattribute__(self, p):
        acc.update({str(self) + p: acc.get(str(self) + p, 0) + 1})
        return super(object, self).__getattribute__(p)

class A(object):
    def a(self):
        pass

class B(A):
    def b(self):
        pass

def main():
    a = A()
    a.a()
    b = B()
    b.b()
    b.a = 'a'
    b.a
    print acc

if __name__ == '__main__':
    main()
But this can only count the functions and variables of an object. How can I count ordinary functions or variables, such as:
def fun1():
    pass

fun1()
fun1()
I want to get the result 2. Is there any tool or method to do it?
Sorry for my poor English. What I really need is the number of invocations, not the run time.
In the example above, fun1() is invoked two times.
Use a decorator.
>>> def timestamp(container, get_timestamp):
...     def timestamp_decorator(func):
...         def decorated(*args, **kwargs):
...             container[func.func_name] = get_timestamp()
...             return func(*args, **kwargs)
...         return decorated
...     return timestamp_decorator
...
And you use it like this:
>>> import datetime
>>> def get_timestamp():
...     return datetime.datetime.now()
...
>>> timestamps = {}
>>> @timestamp(timestamps, get_timestamp)
... def foo(a):
...     return a * 2
...
>>> x = foo(2)
>>> print x, timestamps
4 {'foo': datetime.datetime(2012, 2, 14, 9, 55, 15, 789893)}
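Since the question asks for call counts rather than timestamps, the same decorator pattern can be adapted. The following is a minimal Python 3 sketch; call_counts and count_calls are illustrative names, not part of any library.

```python
import collections
import functools

call_counts = collections.Counter()


def count_calls(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Record the call before delegating to the original function.
        call_counts[func.__name__] += 1
        return func(*args, **kwargs)
    return wrapper


@count_calls
def fun1():
    pass


fun1()
fun1()
```

After the two calls, call_counts["fun1"] holds 2, which is the result asked for in the question.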
You can create a counter decorator for a function (not a timestamp decorator) and automatically wrap all functions in a given module with it.
So, if the module whose function calls you want to count is named "mymodule", you can write:
class function_counter(object):
    def __init__(self, func):
        self.counter = 0
        self.func = func

    def __call__(self, *args, **kw):
        self.counter += 1
        return self.func(*args, **kw)
And:
>>> @function_counter
... def bla():
...     pass
...
>>>
>>> bla()
>>> bla()
>>> bla()
>>> bla.counter
3
To apply this to all the functions in a module, you can write something like:
import mymodule
from types import FunctionType, BuiltinFunctionType

# define the "function_counter" class as above here (or import it)

for key, value in mymodule.__dict__.items():
    if isinstance(value, (FunctionType, BuiltinFunctionType)):
        mymodule.__dict__[key] = function_counter(value)
That would do for counting function usage.
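To try the module-wide wrapping without a real module on disk, here is a self-contained Python 3 sketch. The module is built in memory with types.ModuleType as a stand-in for "import mymodule", and the helper and api functions are invented for the demonstration.

```python
import types


class function_counter:
    def __init__(self, func):
        self.counter = 0
        self.func = func

    def __call__(self, *args, **kw):
        self.counter += 1
        return self.func(*args, **kw)


# Stand-in for "import mymodule": build a throwaway module in memory.
mymodule = types.ModuleType("mymodule")
source = (
    "def helper():\n"
    "    return 1\n"
    "\n"
    "def api():\n"
    "    return helper() + 1\n"
)
exec(source, mymodule.__dict__)

# Wrap every plain Python function found in the module namespace.
for key, value in list(vars(mymodule).items()):
    if isinstance(value, types.FunctionType):
        setattr(mymodule, key, function_counter(value))

mymodule.api()
mymodule.api()
```

Calls made inside the module are counted too, because api looks up helper in the module's namespace at call time. Names that other code imported earlier with "from mymodule import api" would still refer to the unwrapped function.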
If you want to count module-level variable usage, though, it is not that easy, as you can't just override the attribute retrieval mechanism of a module object as you did for a class in your example.
The way to go there is to substitute your module with an instance of a class that implements the attribute counting scheme from your example, after you import your module, and have all module attributes assigned to instance attributes of this class.
This is not a tested example (unlike the above), but try something along these lines:
import mymodule
from types import FunctionType

class Counter(object):
    # counter __getattribute__ just as you did above

c = Counter()
for key, value in mymodule.__dict__.items():
    setattr(c, key, staticmethod(value) if isinstance(value, FunctionType) else value)
mymodule = c
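On Python 3.5 and later there is another possible route, sketched here as an assumption rather than as part of the answer above: a module object's __class__ may be reassigned to a subclass of types.ModuleType, which allows overriding attribute lookup on the module itself. The module and its setting attribute are built in memory for illustration, and access_counts and CountingModule are made-up names.

```python
import collections
import types

access_counts = collections.Counter()


class CountingModule(types.ModuleType):
    def __getattribute__(self, name):
        # Skip dunder names so internal lookups are not counted.
        if not name.startswith("__"):
            access_counts[name] += 1
        return super().__getattribute__(name)


# Stand-in for "import mymodule".
mymodule = types.ModuleType("mymodule")
mymodule.setting = 10

# Swap the module's class so attribute access goes through the override.
mymodule.__class__ = CountingModule

mymodule.setting
mymodule.setting
```

This counts only accesses of the form mymodule.name; code inside the module that refers to its own globals by bare name bypasses __getattribute__ and is not counted.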