Decorator to alter function behavior - python

I've found that I have two unrelated functions that implement identical behavior in different ways. I'm now wondering if there's a way, via decorators probably, to deal with this efficiently, to avoid writing the same logic over and over if the behavior is added elsewhere.
Essentially I have two functions in two different classes that have a flag called exact_match. Both functions check for some type of equivalence in the objects that they are members of. The exact_match flag forces the function to compare floats exactly instead of with a tolerance. You can see how I do this below.
def is_close(a, b, rel_tol=1e-09, abs_tol=0.0):
    return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
def _equal(val_a, val_b):
    """Wrapper for equality test to send in place of is_close."""
    return val_a == val_b
@staticmethod
def get_equivalence(obj_a, obj_b, check_name=True, exact_match=False):
    equivalence_func = is_close
    if exact_match:
        # If we're looking for an exact match, changing the function we use to the equality tester.
        equivalence_func = _equal
    if check_name:
        return obj_a.name == obj_b.name
    # Check minimum resolutions if they are specified
    if 'min_res' in obj_a and 'min_res' in obj_b and not equivalence_func(obj_a['min_res'], obj_b['min_res']):
        return False
    return False
As you can see, standard procedure has us use the function is_close when we don't need an exact match, but we swap out the function call when we do. Now another function needs this same logic, swapping out the function. Is there a way to use decorators or something similar to handle this type of logic when I know a specific function call may need to be swapped out?

No decorator needed; just pass the desired function as an argument to get_equivalence (which is now little more than a wrapper that applies the argument).
def make_eq_with_tolerance(rel_tol=1e-09, abs_tol=0.0):
    def _(a, b):
        return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
    return _
# This is just operator.eq, by the way
def _equal(val_a, val_b):
    return val_a == val_b
def same_name(a, b):
    return a.name == b.name
Now get_equivalence takes three arguments: the two objects to compare
and a function that gets called on those two arguments.
@staticmethod
def get_equivalence(obj_a, obj_b, equivalence_func):
    return equivalence_func(obj_a, obj_b)
Some example calls:
get_equivalence(a, b, make_eq_with_tolerance())
get_equivalence(a, b, make_eq_with_tolerance(rel_tol=1e-12)) # Really tight tolerance
get_equivalence(a, b, _equal)
get_equivalence(a, b, same_name)
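As a rough illustration of those calls on plain floats (a sketch for Python 3; the sample values are assumptions, and operator.eq stands in for _equal, as the comment in the answer suggests):

```python
import operator

def make_eq_with_tolerance(rel_tol=1e-09, abs_tol=0.0):
    """Build a comparison function with the tolerances baked in."""
    def _(a, b):
        return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
    return _

def get_equivalence(obj_a, obj_b, equivalence_func):
    """Apply whichever comparison the caller chose."""
    return equivalence_func(obj_a, obj_b)

# Illustrative values only: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
x, y = 0.1 + 0.2, 0.3
close_result = get_equivalence(x, y, make_eq_with_tolerance())
exact_result = get_equivalence(x, y, operator.eq)
print(close_result, exact_result)
```

On Python 3.5 and later, math.isclose uses the same formula and the same default tolerances, so it could probably replace the hand-written closure.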

I came up with an alternative solution that is perhaps less correct but lets me solve the problem as I originally wanted to.
My solution uses a utility class that can be used as a member of a class or as a mixin for the class to provide the utility functions in a convenient way. Below, the functions _equals and is_close are defined elsewhere, as their implementations are beside the point.
class EquivalenceUtil(object):
    def __init__(self, equal_comparator=_equals, inexact_comparator=is_close):
        self.equals = equal_comparator
        self.default_comparator = inexact_comparator
    def check_equivalence(self, obj_a, obj_b, exact_match=False, **kwargs):
        return self.equals(obj_a, obj_b, **kwargs) if exact_match else self.default_comparator(obj_a, obj_b, **kwargs)
It's a simple class that can be used like so:
class BBOX(object):
    _equivalence = EquivalenceUtil()
    def __init__(self, **kwargs):
        ...
    @classmethod
    def are_equivalent(cls, bbox_a, bbox_b, exact_match=False):
        """Test for equivalence between two BBOX's."""
        bbox_list = bbox_a.as_list
        other_list = bbox_b.as_list
        for _index in range(0, 3):
            if not cls._equivalence.check_equivalence(bbox_list[_index],
                                                      other_list[_index],
                                                      exact_match=exact_match):
                return False
        return True
This solution is more opaque to the user about how things are checked behind the scenes, which is important for my project. Additionally it is pretty flexible and can be reused within a class in multiple places and ways, and easily added to a new class.
In my original example the code can turn into this:
class TileGrid(object):
    _equivalence = EquivalenceUtil()
    def __init__(self, **kwargs):
        ...
    # A classmethod, because the body needs cls._equivalence
    @classmethod
    def are_equivalent(cls, grid1, grid2, check_name=False, exact_match=False):
        if check_name:
            return grid1.name == grid2.name
        # Check minimum resolutions if they are specified
        if 'min_res' in grid1 and 'min_res' in grid2 and not cls._equivalence.check_equivalence(grid1['min_res'], grid2['min_res'], exact_match=exact_match):
            return False
        # Compare the bounding boxes of the two grids if they exist in the grid
        if 'bbox' in grid1 and 'bbox' in grid2:
            return BBOX.are_equivalent(grid1.bbox, grid2.bbox, exact_match=exact_match)
        return False
I can't recommend this approach in the general case, because I can't help but feel there's some code smell to it, but it does exactly what I need it to and will solve a great many problems for my current codebase. We have specific requirements, this is a specific solution. The solution by chepner is probably best for the general case of letting the user decide how a function should test equivalence.
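For completeness, the decorator the question originally asked about could look roughly like the sketch below (Python 3). The names exact_match_switch, compare and same_resolution are hypothetical, and math.isclose is assumed to be an acceptable stand-in for is_close:

```python
import functools
import math
import operator

def exact_match_switch(func):
    """Translate an exact_match flag into a `compare` callable (hypothetical helper)."""
    @functools.wraps(func)
    def wrapper(*args, exact_match=False, **kwargs):
        # Pick the comparison once, so the wrapped function never branches on the flag.
        compare = operator.eq if exact_match else math.isclose
        return func(*args, compare=compare, **kwargs)
    return wrapper

@exact_match_switch
def same_resolution(res_a, res_b, compare):
    """Hypothetical check that only ever calls compare()."""
    return compare(res_a, res_b)

tolerant = same_resolution(0.1 + 0.2, 0.3)
strict = same_resolution(0.1 + 0.2, 0.3, exact_match=True)
print(tolerant, strict)
```

Whether this reads better than passing the function in explicitly is a matter of taste; it mostly hides the swap from the caller.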

Related

python set() membership and hashable objects

I wanted to store instances of a class in a set, so I could use the set methods to find intersections, etc. My class has a __hash__() function, along with an __eq__ and a __lt__, and is decorated with functools.total_ordering
When I create two sets, each containing the same two objects, and do a set_a.difference(set_b), I get a result with a single object, and I have no idea why. I was expecting none, or at the least, 2, indicating a complete failure in my understanding of how sets work. But one?
for a in set_a:
    print(a, a.__hash__())
for b in set_b:
    print(b, b.__hash__(), b in set_a)
(<foo>, -5267863171333807568)
(<bar>, -8020339072063373731)
(<foo>, -5267863171333807568, False)
(<bar>, -8020339072063373731, True)
Why is the <foo> object in set_b not considered to be in set_a? What other properties does an object require in order to be considered a member of a set? And why is bar considered to be a part of set_a, but not foo?
edit: updating with some more info. I figured that simply showing that the two objects' hash() results were the same meant that they were indeed the same, so I guess that's where my mistake probably comes from.
from functools import total_ordering
@total_ordering
class Thing(object):
    def __init__(self, i):
        self.i = i
    def __eq__(self, other):
        return self.i == other.i
    def __lt__(self, other):
        return self.i < other.i
    def __repr__(self):
        return "<Thing {}>".format(self.i)
    def __hash__(self):
        return hash(self.i)
I figured it out thanks to some of the questions in the comments- the problem was due to the fact that I had believed that ultimately, the hash function decides if two objects are the same, or not. The __eq__ also needs to match, which it always did in my tests and attempts to create a minimal example here.
However, when pulling data from a DB in prod, a certain float was being rounded down, and thus, the x == y was failing in prod. Argh.
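A small sketch of that failure mode (Python 3). The Reading class and its values are made up for illustration and are not the original code: it hashes on the name only but compares the float as well, so two objects can share a hash and still not be equal.

```python
class Reading:
    """Hypothetical class: hashes on name only, but compares name and value."""
    def __init__(self, name, value):
        self.name = name
        self.value = value
    def __eq__(self, other):
        if not isinstance(other, Reading):
            return NotImplemented
        return (self.name, self.value) == (other.name, other.value)
    def __hash__(self):
        return hash(self.name)

in_memory = Reading("foo", 0.1 + 0.2)   # 0.30000000000000004
from_db = Reading("foo", 0.3)           # the "rounded" value

same_hash = hash(in_memory) == hash(from_db)
is_member = from_db in {in_memory}
leftover = {from_db} - {in_memory}
print(same_hash, is_member, len(leftover))
```

Matching hashes only get the lookup to the right slot; membership still needs __eq__ to return True.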

Python Exercise involving functions, recursion and classes

I'm doing an exercise where I'm to create a class representing functions (written as lambda expressions) and several methods involving them.
The ones I've written so far are:
class Func():
    def __init__(self, func, domain):
        self.func = func
        self.domain = domain
    def __call__(self, x):
        if self.domain(x):
            return self.func(x)
        return None
    def compose(self, other):
        comp_func= lambda x: self.func(other(x))
        comp_dom= lambda x: other.domain(x) and self.domain(other(x))
        return Func(comp_func, comp_dom)
    def exp(self, k):
        exp_func= self
        for i in range(k-1):
            exp_func = Func.compose(exp_func, self)
        return exp_func
As you can see above, the function exp composes a function with itself k-1 times. Now I'm to write a recursive version of said function, taking the same arguments "self" and "k".
However I'm having difficulty figuring out how it would work. In the original exp I wrote I had access to the original function "self" throughout all iterations, however when making a recursive function I lose access to the original function and with each iteration only have access to the most recent composed function. So for example, if I try composing self with self a certain number of times I will get:
f= x+3
f^2= x+6
(f^2)^2= x+12
So we skipped the function x+9.
How do I get around this? Is there a way to still retain access to the original function?
Update:
def exp_rec(self, k):
    if k==1:
        return self
    return Func.compose(Func.exp_rec(self, k-1), self)
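A quick check of that recursive version, as a sketch: it assumes a domain that accepts every input (not part of the exercise text) and reuses the x+3 example, so three compositions should add 9.

```python
class Func:
    def __init__(self, func, domain):
        self.func = func
        self.domain = domain
    def __call__(self, x):
        if self.domain(x):
            return self.func(x)
        return None
    def compose(self, other):
        comp_func = lambda x: self.func(other(x))
        comp_dom = lambda x: other.domain(x) and self.domain(other(x))
        return Func(comp_func, comp_dom)
    def exp_rec(self, k):
        # self is still the original function at every level of the recursion.
        if k == 1:
            return self
        return self.exp_rec(k - 1).compose(self)

# Assumed domain: every x is allowed.
f = Func(lambda x: x + 3, lambda x: True)
results = [f.exp_rec(k)(0) for k in (1, 2, 3, 4)]
print(results)
```

Because each recursive call is made on self, the original function is never lost, so x+9 is no longer skipped.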
This is an exercise, so I won't provide the answer.
In recursion, you want to do two things:
Determine and check a "guard condition" that tells you when to stop; and
Determine and compute the "recurrence relation" that tells you the next value.
Consider a simple factorial function:
def fact(n):
    if n == 1:
        return 1
    return n * fact(n - 1)
In this example, the guard condition is fairly obvious- it's the only conditional statement! And the recurrence relation is in the return statement.
For your exercise, things are slightly less obvious, since the intent is to define a function composition, rather than a straight integer computation. But consider:
f = Func(lambda x: x + 3, lambda x: True)  # second argument: a domain that accepts every x
(This is your example.) You want f.exp(1) to be the same as f, and f.exp(2) to be f(f(x)). That right there tells you the guard condition and the recurrence relation:
The guard condition is that exp() only works for positive numbers. This is because exp(0) might have to return different things for different input types (what does exp(0) return when f = Func(lambda s: s + '!') ?).
So test for exp(1), and let that condition be the original lambda.
Then, when recursively defining exp(n+1), let that be the composition of your original lambda with exp(n).
You have several things to consider: First, your class instance has data associated with it. That data will "travel along" with you in your recursion, so you don't have to pass so many parameters recursively. Second, you need to decide whether Func.exp() should create a new Func(), or whether it should modify the existing Func object. Finally, consider how you would write a hard-coded function, Func.exp2() that just constructed what we would call Func.exp(2). That should give you an idea of your recurrence relation.
Update
Based on some comments, I feel like I should show this code. If you are going to have your recursive function modify the self object, instead of returning a new object, then you will need to "cache" the values from self before they get modified, like so:
func = self.func
domain = self.domain
# ... recursive function modifies self.func and self.domain

Implementing **<class>?

Edit:
This question has been marked duplicate but I don't think that it is. Implementing the suggested answer, that is to use the Mapping abc, does not have the behavior I would like:
from collections import Mapping  # Python 2; on Python 3 this lives in collections.abc
class data(Mapping):
    def __init__(self,params):
        self.params = params
    def __getitem__(self,k):
        print "in __getitem__",k
        return self.params[k]
    def __len__(self):
        return len(self.params)
    def __iter__(self):
        return ( k for k in self.params.keys() )
def func(*args,**kwargs):
    print "In func"
    return None
ps = data({"p1":1.,"p2":2.})
print "\ncalling...."
func(ps)
print "\ncalling...."
func(**ps)
Output:
calling....
In func
calling....
in __getitem__ p2
in __getitem__ p1
In func
Which, as mentioned in the question, is not what I want.
The other solution, given in the comments, is to modify the routines that are causing problems. That will certainly work, however I was looking for a quick (lazy?) fix!
Question:
How can I implement the ** operator for a class, other than via __getitem__? For example I would like to be able to do this:
def func(**kwargs):
    pass  # do some clever stuff
x = some_generic_class()
func( **x )
without an implicit call to some_generic_class.__getitem__(). In my application I have already implemented __getitem__ with some data logging which I do not want to perform when the class is referenced as above.
If it's not possible to overload the ** operator, is it possible to detect when __getitem__ is being called as a result of the class being passed to a function, rather than explicitly?
Background:
I am working on a physics model that is built out of a set of packages which are chosen according to user input at runtime. The flexible structure of the model means that I rarely know the required parameters, so I pass a dict of parameter names and values between the models. In order to make this more user friendly I am now trying to develop a class paramlist that overloads the dict functionality with a set of routines that do some consistency checking, set default values, etc. The idea is that I pass an instance of paramlist rather than a dict. One of the more important aims is to keep a log of which members of paramlist have been referenced by the physics packages and which ones have not. A stripped-down version is below, which aims to maintain a second dict that logs whether a parameter has been referenced:
from copy import copy
class paramlist(object):
    def __init__( self, params ):
        self.params = copy(params)
        self.used = { k:False for k in self.params }
    def __getitem__(self, k):
        try:
            v = self.params[k]
        except KeyError:
            raise KeyError("Parameter {} not in parameter list".format(k))
        else:
            self.used[k] = True
            return v
    def __setitem__(self,k,v):
        self.params[k] = v
        self.used[k] = False
Which does not have the behaviour I want:
ps = paramlist( {"p1":1.} )
def donothing( *args, **kwargs ):
    return None
donothing(ps)
print ps.used["p1"]
donothing(**ps)
print ps.used["p1"]
Output:
False
True
I would like the used dict to remain False in both cases, so that I can tell the user that one of their parameters was not used (implying that they screwed up and a default value has been used instead). I presume that the ** case has the effect of calling __getitem__ on every entry in the paramlist.
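That presumption can be checked with a sketch like the one below (Python 3). LoggedParams and its as_dict method are hypothetical stand-ins, not part of the original paramlist; as_dict is one possible workaround under the assumption that callers can be changed to unpack a plain copy.

```python
class LoggedParams:
    """Hypothetical stand-in for paramlist that records every lookup."""
    def __init__(self, params):
        self.params = dict(params)
        self.used = {k: False for k in self.params}
    def keys(self):
        # ** unpacking asks for keys() first ...
        return self.params.keys()
    def __getitem__(self, k):
        # ... and then looks up each key through __getitem__.
        self.used[k] = True
        return self.params[k]
    def as_dict(self):
        """Return a plain copy without touching the usage log."""
        return dict(self.params)

def donothing(*args, **kwargs):
    return None

ps = LoggedParams({"p1": 1.0})
donothing(**ps.as_dict())
used_after_copy = ps.used["p1"]
donothing(**ps)
used_after_unpack = ps.used["p1"]
print(used_after_copy, used_after_unpack)
```

Unpacking the copy leaves the log untouched, while unpacking the object itself marks every parameter as used.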

Python - __eq__ method not being called

I have a set of objects, and am interested in getting a specific object from the set. After some research, I decided to use the solution provided here: http://code.activestate.com/recipes/499299/
The problem is that it doesn't appear to be working.
I have two classes defined as such:
class Foo(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def __key(self):
        return (self.a, self.b, self.c)
    def __eq__(self, other):
        return self.__key() == other.__key()
    def __hash__(self):
        return hash(self.__key())
class Bar(Foo):
    def __init__(self, a, b, c, d, e):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
        self.e = e
Note: equality of these two classes should only be defined on the attributes a, b, c.
The wrapper _CaptureEq in http://code.activestate.com/recipes/499299/ also defines its own __eq__ method. The problem is that this method never gets called (I think). Consider,
bar_1 = Bar(1,2,3,4,5)
bar_2 = Bar(1,2,3,10,11)
summary = set((bar_1,))
assert(bar_1 == bar_2)
bar_equiv = get_equivalent(summary, bar_2)
bar_equiv.d should equal 4 and likewise bar_equiv.e should equal 5, but they do not. Like I mentioned, it looks like the _CaptureEq.__eq__ method does not get called when the statement bar_2 in summary is executed.
Is there some reason why the _CaptureEq.__eq__ method is not being called? Hopefully this is not too obscure of a question.
Brandon's answer is informative, but incorrect. There are actually two problems, one with the recipe relying on _CaptureEq being written as an old-style class (so it won't work properly if you try it on Python 3 with a hash-based container), and one with your own Foo.__eq__ definition claiming definitively that the two objects are not equal when it should be saying "I don't know, ask the other object if we're equal".
The recipe problem is trivial to fix: just define __hash__ on the comparison wrapper class:
class _CaptureEq:
    'Object wrapper that remembers "other" for successful equality tests.'
    def __init__(self, obj):
        self.obj = obj
        self.match = obj
    # If running on Python 3, this will be a new-style class, and
    # new-style classes must delegate hash explicitly in order to populate
    # the underlying special method slot correctly.
    # On Python 2, it will be an old-style class, so the explicit delegation
    # isn't needed (__getattr__ will cover it), but it also won't do any harm.
    def __hash__(self):
        return hash(self.obj)
    def __eq__(self, other):
        result = (self.obj == other)
        if result:
            self.match = other
        return result
    def __getattr__(self, name): # support anything else needed by __contains__
        return getattr(self.obj, name)
The problem with your own __eq__ definition is also easy to fix: return NotImplemented when appropriate so you aren't claiming to provide a definitive answer for comparisons with unknown objects:
class Foo(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def __key(self):
        return (self.a, self.b, self.c)
    def __eq__(self, other):
        if not isinstance(other, Foo):
            # Don't recognise "other", so let *it* decide if we're equal
            return NotImplemented
        return self.__key() == other.__key()
    def __hash__(self):
        return hash(self.__key())
With those two fixes, you will find that Raymond's get_equivalent recipe works exactly as it should:
>>> from capture_eq import *
>>> bar_1 = Bar(1,2,3,4,5)
>>> bar_2 = Bar(1,2,3,10,11)
>>> summary = set((bar_1,))
>>> assert(bar_1 == bar_2)
>>> bar_equiv = get_equivalent(summary, bar_2)
>>> bar_equiv.d
4
>>> bar_equiv.e
5
Update: Clarified that the explicit __hash__ override is only needed in order to correctly handle the Python 3 case.
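Put together, the two fixes can be tried in one file. This is a sketch for Python 3: get_equivalent is reconstructed from the recipe fragment quoted in this thread, and _key replaces the name-mangled __key purely for brevity.

```python
class _CaptureEq:
    """Wrapper that remembers the object it compared equal to."""
    def __init__(self, obj):
        self.obj = obj
        self.match = obj
    def __hash__(self):
        return hash(self.obj)
    def __eq__(self, other):
        result = (self.obj == other)
        if result:
            self.match = other
        return result

def get_equivalent(container, item, default=None):
    t = _CaptureEq(item)
    if t in container:
        return t.match
    return default

class Foo:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c
    def _key(self):
        return (self.a, self.b, self.c)
    def __eq__(self, other):
        if not isinstance(other, Foo):
            return NotImplemented  # let the wrapper's __eq__ decide
        return self._key() == other._key()
    def __hash__(self):
        return hash(self._key())

class Bar(Foo):
    def __init__(self, a, b, c, d, e):
        super().__init__(a, b, c)
        self.d, self.e = d, e

bar_1 = Bar(1, 2, 3, 4, 5)
bar_2 = Bar(1, 2, 3, 10, 11)
summary = {bar_1}
bar_equiv = get_equivalent(summary, bar_2)
print(bar_equiv.d, bar_equiv.e)
```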
The problem is that the set compares two objects the “wrong way around” for this pattern to intercept the call to __eq__(). The recipe from 2006 evidently was written against containers that, when asked if x was present, went through the candidate y values already present in the container doing:
x == y
comparisons, in which case an __eq__() on x could do special actions during the search. But the set object is doing the comparison the other way around:
y == x
for each y in the set. Therefore this pattern might simply not be usable in this form when your data type is a set. You can confirm this by instrumenting Foo.__eq__() like this:
def __eq__(self, other):
    print '__eq__: I am', self.d, self.e, 'and he is', other.d, other.e
    return self.__key() == other.__key()
You will then see a message like:
__eq__: I am 4 5 and he is 10 11
confirming that the equality comparison is posing the equality question to the object already in the set — which is, alas, not the object wrapped with Hettinger's _CaptureEq object.
Update:
And I forgot to suggest a way forward: have you thought about using a dictionary? Since you have an idea here of a key that is a subset of the data inside the object, you might find that splitting out the idea of the key from the idea of the object itself might alleviate the need to attempt this kind of convoluted object interception. Just write a new function that, given an object and your dictionary, computes the key and looks in the dictionary and returns the object already in the dictionary if the key is present else inserts the new object at the key.
Update 2: well, look at that — Nick's answer uses a NotImplemented in one direction to force the set to do the comparison in the other direction. Give the guy a few +1's!
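The dictionary idea from the first update could be sketched like this (Python 3). The function name canonical, the key_of helper and the SimpleNamespace objects are illustrative assumptions, not code from the question:

```python
from types import SimpleNamespace

def canonical(obj, table, key_func):
    """Return the object already stored under obj's key, inserting obj if absent."""
    return table.setdefault(key_func(obj), obj)

def key_of(o):
    # The key is the subset of attributes that defines equality.
    return (o.a, o.b, o.c)

bar_1 = SimpleNamespace(a=1, b=2, c=3, d=4, e=5)
bar_2 = SimpleNamespace(a=1, b=2, c=3, d=10, e=11)

table = {}
first = canonical(bar_1, table, key_of)    # inserted, returned as-is
second = canonical(bar_2, table, key_of)   # same key, so bar_1 comes back
print(second.d, second.e)
```

This keeps the key separate from the object, so no __eq__ interception is needed at all.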
There are two issues here. The first is that:
t = _CaptureEq(item)
if t in container:
    return t.match
return default
Doesn't do what you think. In particular, t will never be in container, since _CaptureEq doesn't define __hash__. This becomes more obvious in Python 3, since it will point this out to you rather than providing a default __hash__. The code for _CaptureEq seems to believe that providing an __getattr__ will solve this - it won't, since Python's special method lookups are not guaranteed to go through all the same steps as normal attribute lookups - this is the same reason __hash__ (and various others) need to be defined on a class and can't be monkeypatched onto an instance. So, the most direct way around this is to define _CaptureEq.__hash__ like so:
def __hash__(self):
    return hash(self.obj)
But that still isn't guaranteed to work, because of the second issue: set lookup is not guaranteed to test equality. sets are based on hashtables, and only do an equality test if there's more than one item in a hash bucket. You can't (and don't want to) force items that hash differently into the same bucket, since that's all an implementation detail of set. The easiest way around this issue, and to neatly sidestep the first one, is to use a list instead:
summary = [bar_1]
assert(bar_1 == bar_2)
bar_equiv = get_equivalent(summary, bar_2)
assert(bar_equiv is bar_1)

Python lazy evaluator

Is there a Pythonic way to encapsulate a lazy function call, whereby on first use of the function f(), it calls a previously bound function g(Z) and on the successive calls f() returns a cached value?
Please note that memoization might not be a perfect fit.
I have:
f = g(Z)
if x:
    return 5
elif y:
    return f
elif z:
    return h(f)
The code works, but I want to restructure it so that g(Z) is only called if the value is used. I don't want to change the definition of g(...), and Z is a bit big to cache.
EDIT: I assumed that f would have to be a function, but that may not be the case.
I'm a bit confused whether you seek caching or lazy evaluation. For the latter, check out the module lazy.py by Alberto Bertogli.
Try using this decorator:
class Memoize:
    def __init__ (self, f):
        self.f = f
        self.mem = {}
    def __call__ (self, *args, **kwargs):
        if (args, str(kwargs)) in self.mem:
            return self.mem[args, str(kwargs)]
        else:
            tmp = self.f(*args, **kwargs)
            self.mem[args, str(kwargs)] = tmp
            return tmp
(extracted from dead link: http://snippets.dzone.com/posts/show/4840 / https://web.archive.org/web/20081026130601/http://snippets.dzone.com/posts/show/4840)
(Found here: Is there a decorator to simply cache function return values? by Alex Martelli)
EDIT: Here's another in form of properties (using __get__) http://code.activestate.com/recipes/363602/
You can employ a cache decorator; let's see an example:
from functools import wraps
class FuncCache(object):
    def __init__(self):
        self.cache = {}
    def __call__(self, func):
        @wraps(func)
        def callee(*args, **kwargs):
            key = (args, str(kwargs))
            # see is there already result in cache
            if key in self.cache:
                result = self.cache.get(key)
            else:
                result = func(*args, **kwargs)
                self.cache[key] = result
            return result
        return callee
With the cache decorator, here you can write
my_cache = FuncCache()
@my_cache
def foo(n):
    """Expensive calculation
    """
    sum = 0
    for i in xrange(n):
        sum += i
    print 'called foo with result', sum
    return sum
print foo(10000)
print foo(10000)
print foo(1234)
As you can see from the output
called foo with result 49995000
49995000
49995000
The foo will be called only once. You don't have to change any line of your function foo. That's the power of decorators.
There are quite a few decorators out there for memoization:
http://wiki.python.org/moin/PythonDecoratorLibrary#Memoize
http://code.activestate.com/recipes/498110-memoize-decorator-with-o1-length-limited-lru-cache/
http://code.activestate.com/recipes/496879-memoize-decorator-function-with-cache-size-limit/
Coming up with a completely general solution is harder than you might think. For instance, you need to watch out for non-hashable function arguments and you need to make sure the cache doesn't grow too large.
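On Python 3.2 and later the standard library covers both of those concerns to some extent. A minimal sketch using functools.lru_cache follows; slow_square and the maxsize value are made up for illustration:

```python
import functools

@functools.lru_cache(maxsize=128)   # bounded, so the cache cannot grow without limit
def slow_square(n):
    return n * n

slow_square(12)
slow_square(12)                      # served from the cache
info = slow_square.cache_info()

# Unhashable arguments are rejected rather than silently mis-cached.
try:
    slow_square([1, 2])
    accepted_list = True
except TypeError:
    accepted_list = False
print(info.hits, info.misses, accepted_list)
```

So the size limit is handled for you, but non-hashable arguments still have to be converted (for example to tuples) by the caller.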
If you're really looking for a lazy function call (one where the function is only actually evaluated if and when the value is needed), you could probably use generators for that.
EDIT: So I guess what you want really is lazy evaluation after all. Here's a library that's probably what you're looking for:
http://pypi.python.org/pypi/lazypy/0.5
Just for completeness, here is a link for my lazy-evaluator decorator recipe:
https://bitbucket.org/jsbueno/metapython/src/f48d6bd388fd/lazy_decorator.py
Here's a pretty brief lazy-decorator, though it lacks using @functools.wraps (and actually returns an instance of Lazy plus some other potential pitfalls):
class Lazy(object):
    def __init__(self, calculate_function):
        self._calculate = calculate_function
    def __get__(self, obj, _=None):
        if obj is None:
            return self
        value = self._calculate(obj)
        # Replace the descriptor on the instance with the computed value
        setattr(obj, self._calculate.__name__, value)
        return value
# Sample use:
class SomeClass(object):
    @Lazy
    def someprop(self):
        print 'Actually calculating value'
        return 13
o = SomeClass()
o.someprop
o.someprop
Curious why you don't just use a lambda in this scenario?
f = lambda: g(Z)
if x:
    return 5
if y:
    return f()
if z:
    return h(f())
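If the value might be needed more than once, the lambda can be wrapped so that g runs at most one time. The following is a sketch for Python 3; LazyValue is a hypothetical helper, and g here is a toy stand-in for the real function:

```python
class LazyValue:
    """Call func(*args, **kwargs) on first use, then reuse the result."""
    _UNSET = object()   # sentinel, so that None is a valid cached result

    def __init__(self, func, *args, **kwargs):
        self._func = func
        self._args = args
        self._kwargs = kwargs
        self._value = self._UNSET

    def __call__(self):
        if self._value is self._UNSET:
            self._value = self._func(*self._args, **self._kwargs)
        return self._value

calls = []

def g(z):
    # Toy stand-in: records each call so the laziness can be observed.
    calls.append(z)
    return z * 2

f = LazyValue(g, 21)
calls_before_use = len(calls)   # g has not run yet
first = f()
second = f()                    # cached, g is not called again
print(calls_before_use, first, second, len(calls))
```

Only the single result is stored, so nothing is cached per value of Z.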
Even after your edit, and the series of comments with detly, I still don't really understand. In your first sentence, you say the first call to f() is supposed to call g(), but subsequently return cached values. But then in your comments, you say "g() doesn't get called no matter what" (emphasis mine). I'm not sure what you're negating: Are you saying g() should never be called (doesn't make much sense; why does g() exist?); or that g() might be called, but might not (well, that still contradicts that g() is called on the first call to f()). You then give a snippet that doesn't involve g() at all, and really doesn't relate to either the first sentence of your question, or to the comment thread with detly.
In case you go editing it again, here is the snippet I am responding to:
I have:
a = f(Z)
if x:
    return 5
elif y:
    return a
elif z:
    return h(a)
The code works, but I want to
restructure it so that f(Z) is only
called if the value is used. I don't
want to change the definition of
f(...), and Z is a bit big to cache.
If that is really your question, then the answer is simply
if x:
    return 5
elif y:
    return f(Z)
elif z:
    return h(f(Z))
That is how to achieve "f(Z) is only called if the value is used".
I don't fully understand "Z is a bit big to cache". If you mean there will be too many different values of Z over the course of program execution that memoization is useless, then maybe you have to resort to precalculating all the values of f(Z) and just looking them up at run time. If you can't do this (because you can't know the values of Z that your program will encounter) then you are back to memoization. If that's still too slow, then your only real option is to use something faster than Python (try Psyco, Cython, ShedSkin, or hand-coded C module).
