A recurring pattern in my Python programming on GAE is getting some entity from the data store, then possibly changing that entity based on various conditions. In the end I need to .put() the entity back to the data store to ensure that any changes that might have been made to it get saved.
However, often no changes were actually made, and the final .put() is just a waste of money. How can I easily make sure that I only put an entity if it has really changed?
The code might look something like
def handle_get_request():
    entity = Entity.get_by_key_name("foobar")
    if phase_of_moon() == "full":
        entity.werewolf = True
    if random.choice([True, False]):
        entity.lucky = True
    if some_complicated_condition:
        entity.answer = 42
    entity.put()
I could maintain a "changed" flag which I set if any condition changed the entity, but that seems very brittle. If I forget to set it somewhere, then changes would be lost.
What I ended up using
def handle_get_request():
    entity = Entity.get_by_key_name("foobar")
    original_xml = entity.to_xml()
    if phase_of_moon() == "full":
        entity.werewolf = True
    if random.choice([True, False]):
        entity.lucky = True
    if some_complicated_condition:
        entity.answer = 42
    if entity.to_xml() != original_xml:
        entity.put()
I would not call this "elegant". Elegant would be if the object just saved itself automatically in the end, but I felt this was simple and readable enough to do for now.
Why not check whether the result equals (==) the original, and decide whether to save it based on that? This depends on a correctly implemented __eq__, but by default a field-by-field comparison based on __dict__ should do it:
def __eq__(self, other):
    return self.__dict__ == other.__dict__
(Be sure that the other rich comparison and hash operators work correctly if you do this.)
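For instance, a minimal sketch of keeping those operators consistent, using a hypothetical plain class (not the GAE model API):

class Record(object):
    def __init__(self, **fields):
        self.__dict__.update(fields)

    def __eq__(self, other):
        # Field-by-field comparison, guarding against unrelated types.
        if not isinstance(other, Record):
            return NotImplemented
        return self.__dict__ == other.__dict__

    def __ne__(self, other):
        # Keep != consistent with == (needed explicitly in Python 2).
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Hash an immutable snapshot of the fields; hashing mutable objects
        # is only safe if you don't mutate them while they are dict keys.
        return hash(tuple(sorted(self.__dict__.items())))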
One possible solution is using a wrapper that tracks any attribute change:
class Wrapper(object):
    def __init__(self, x):
        self._x = x
        self._changed = False

    def __setattr__(self, name, value):
        if name[:1] == "_":
            object.__setattr__(self, name, value)
        else:
            if getattr(self._x, name) != value:
                setattr(self._x, name, value)
                self._changed = True

    def __getattribute__(self, name):
        if name[:1] == "_":
            return object.__getattribute__(self, name)
        return getattr(self._x, name)
class Contact:
    def __init__(self, name, address):
        self.name = name
        self.address = address
c = Contact("Me", "Here")
w = Wrapper(c)
print w.name # --> Me
w.name = w.name
print w.name, w._changed # --> Me False
w.name = "6502"
print w.name, w._changed # --> 6502 True
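Applied to the original datastore pattern, a minimal sketch (reusing the hypothetical helpers from the question) could look like:

def handle_get_request():
    w = Wrapper(Entity.get_by_key_name("foobar"))
    if phase_of_moon() == "full":
        w.werewolf = True
    if random.choice([True, False]):
        w.lucky = True
    if some_complicated_condition:
        w.answer = 42
    if w._changed:
        w._x.put()  # only pay for the write if an attribute actually changed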
This answer is part of a question I posted about a Python checksum of a dict.
With the answers to that question I developed a method to generate a checksum
from a db.Model.
This is an example:
>>> class Actor(db.Model):
...     name = db.StringProperty()
...     age = db.IntegerProperty()
...
>>> u = Actor(name="John Doe", age=26)
>>> util.checksum_from_model(u, Actor)
'-42156217'
>>> u.age = 47
>>> checksum_from_model(u, Actor)
'-63393076'
I defined these methods:
def checksum_from_model(ref, model, exclude_keys=[], exclude_properties=[]):
    """Returns the checksum of a db.Model.

    Attributes:
        ref: The reference of the db.Model.
        model: The model type, an instance of db.Model.
        exclude_keys: A list of property names to exclude, like 'updated'.
        exclude_properties: A list of property types to exclude, like db.DateTimeProperty.

    Returns:
        The checksum as a signed integer.
    """
    l = []
    for key, prop in model.properties().iteritems():
        if not (key in exclude_keys) and \
           not any([True for x in exclude_properties if isinstance(prop, x)]):
            l.append(getattr(ref, key))
    return checksum_from_list(l)
def checksum_from_list(l):
    """Returns a checksum from a list of data as an int."""
    return reduce(lambda x, y: x ^ y, [hash(repr(x)) for x in l])
Note:
For the base36 implementation: http://en.wikipedia.org/wiki/Base_36#Python_implementation
Edit:
I removed the base36 conversion of the return value, so these functions now run without dependencies (advice from @Skirmantas).
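Tying this back to the original question, a minimal sketch of using the checksum to decide whether to write (assuming the Entity model and helpers from the question):

def handle_get_request():
    entity = Entity.get_by_key_name("foobar")
    original = checksum_from_model(entity, Entity)
    if phase_of_moon() == "full":
        entity.werewolf = True
    # ... other conditional mutations ...
    if checksum_from_model(entity, Entity) != original:
        entity.put()  # only write when the checksum actually changed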
I haven't worked with GAE, but in the same situation I'd use something like:
from copy import deepcopy

entity = Entity.get_by_key_name("foobar")
prev_entity_state = deepcopy(entity.__dict__)
if phase_of_moon() == "full":
    entity.werewolf = True
if random.choice([True, False]):
    entity.lucky = True
if some_complicated_condition:
    entity.answer = 42
if entity.__dict__ != prev_entity_state:
    entity.put()
Related
I have a Python app with a Firebase-database backend.
When I retrieve the data from my database, I want to check whether those values
are available (if not, that means the database is somehow corrupted, as mandatory fields are missing).
My current implementation is the following:
self.foo = myDbRef.get('foo')
self.bar = myDbRef.get('bar')
self.bip = myDbRef.get('bip')
self.plop = myDbRef.get('plop')
if self.foo is None or self.bar is None or self.bip is None or self.plop is None:
    self.isValid = False
    return ErrorCode.CORRUPTED_DATABASE
This works fine and is compact, but has a major issue: I will get the information that the database is corrupted,
but not which fields are missing (it could be just one of them, or several, or all!).
The idiomatic approach should be
if self.foo is None:
    self.isValid = False
    return ErrorCode.CORRUPTED_DATABASE, "FOO IS MISSING"  # could be a string, an enum value, whatever: I have the information
if self.bar is None:
    self.isValid = False
    return ErrorCode.CORRUPTED_DATABASE, "BAR IS MISSING"
if self.bip is None:
    self.isValid = False
    return ErrorCode.CORRUPTED_DATABASE, "BIP IS MISSING"
But this is not pretty and not factorized (all my 'init from db' functions use the same pattern, and I don't want to multiply my number of lines by a factor of 10 for such a case).
This is not a '100% Python' question, but I hope the language has something for me to handle this like a boss (it's Python: it usually does!).
You could extract the checks into a generator and leave the flag and return statements outside.
def invalid_fields():
    if self.foo is None: yield "FOO"
    if self.bar is None: yield "BAR"
    if self.bip is None: yield "BIP"

invalid = list(invalid_fields())
if invalid:
    self.isValid = False
    return ErrorCode.CORRUPTED_DATABASE, "MISSING {}".format(", ".join(invalid))
This has the advantage of telling you about all the missing fields if there are more than one.
I made a class to contain some of your functionality that I can't access. I also made ErrorCode a string as a hack, since it's not defined in my environment, and I'm not sure how you want the None names returned with/beside the ErrorCode.
Build a dict of names and values, check that the dict contains no None values, and if it does, return which keys:
myDbRef = {'foo': None,
           'bar': 1,
           'bip': 2,
           'plop': 3}

class Foo():
    def __init__(self):
        self.foo = myDbRef.get('foo')
        self.bar = myDbRef.get('bar')
        self.bip = myDbRef.get('bip')
        self.plop = myDbRef.get('plop')

    def check(self):
        temp_dict = {}
        for key in ['foo', 'bar', 'bip', 'plop']:
            temp_dict[key] = myDbRef.get(key)
        vals = {k: v for k, v in temp_dict.items() if v is None}
        if vals:
            self.isValid = False
            return ("ErrorCode.CORRUPTED_DATABASE", [k for k in vals.keys()])
f = Foo()
print(f.check())
Result: ('ErrorCode.CORRUPTED_DATABASE', ['foo'])
Use a function and a loop:
def checknone(**things_with_names):
    for name, thing in things_with_names.items():
        if thing is None:
            return ErrorCode.CORRUPTED_DATABASE, name + " IS MISSING"
    return True
And use as such:
result = checknone(foo=self.foo, bar=self.bar, bip=self.bip, plop=self.plop)
if result is not True:
    self.isValid = False
    return result
For maximum gains, put it as a method of a class that you mix into all your classes that use this. That way it can also set isValid.
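A minimal sketch of that mixin idea (the class names here are illustrative, not from the original code):

class NoneCheckMixin(object):
    def checknone(self, **things_with_names):
        # Return an error tuple if any named value is None, else True.
        for name, thing in things_with_names.items():
            if thing is None:
                self.isValid = False  # the mixin can set the flag itself
                return ErrorCode.CORRUPTED_DATABASE, name + " IS MISSING"
        return True

class Record(NoneCheckMixin):
    def __init__(self, db_ref):
        self.foo = db_ref.get('foo')
        self.bar = db_ref.get('bar')
        self.status = self.checknone(foo=self.foo, bar=self.bar)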
You can dynamically create and search your instance attributes like so:
class Foo():
    def __init__(self):
        # First, define the list of attributes you want to look for and an empty list of errors
        self.attrbs = ['foo', 'bar', 'bip', 'plop']
        self.errors = []
        # Iterate through the attributes list
        for attrb in self.attrbs:
            # Create and assign self.foo from myDbRef.get('foo'), etc.
            self.__dict__[attrb] = myDbRef.get(attrb)
            # Check if the attribute is empty; if so, add it to the errors
            if not self.__dict__[attrb]:
                self.errors.append(attrb.upper())

    def check(self):
        # __init__ must return None, so report the errors from a separate method
        if self.errors:
            self.is_valid = False
            return (ErrorCode.CORRUPTED_DATABASE, "MISSING {errs}".format(errs='/'.join(self.errors)))
        else:
            self.is_valid = True
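Hypothetical usage, assuming the same myDbRef mapping as in the earlier answer:

f = Foo()
print(f.check())  # e.g. (ErrorCode.CORRUPTED_DATABASE, 'MISSING FOO') when 'foo' is None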
def pnamedtuple(type_name, field_names, mutable=False):
    pass

class type_name:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self._fields = ['x', 'y']
        self._mutable = False

    def get_x(self):
        return self.x

    def get_y(self):
        return self.y

    def _getitem_(self, i):
        if i > 1 or i < 0:
            raise IndexError
        if i == 0 or i == 'x':
            return self.get_x()
        if i == 1 or i == 'y':
            return self.get_y()
The __getitem__ method overloads the [] (indexing) operator for this class: an index of 0 returns the value of the first field name in the field_names list; an index of 1 returns the value of the second field name in the field_names list, etc. The index can also be a string naming a field. So, for p = Point(1, 2), writing p.get_x(), p[0], or p['x'] returns 1. Raise an IndexError with an appropriate message if the index is an out-of-bounds int or a string that does not name a field.
I am not sure how to fix the _getitem_ function. Below is the bsc.txt:
c-->t1 = Triple1(1,2,3)
c-->t2 = Triple2(1,2,3)
c-->t3 = Triple3(1,2,3)
# Test __getitem__ functions
e-->t1[0]-->1
e-->t1[1]-->2
e-->t1[2]-->3
e-->t1['a']-->1
e-->t1['b']-->2
e-->t1['c']-->3
^-->t1[4]-->IndexError
^-->t1['d']-->IndexError
^-->t1[3.2]-->IndexError
Can someone tell me how to fix my _getitem_ function to get the output in bsc.txt? Many thanks.
You've spelled __getitem__ incorrectly. Magic methods require two __ underscores before and after them.
So you haven't overloaded the original __getitem__ method, you've simply created a new method named _getitem_.
Python 3 does not allow strings and integers to be compared with > or <; it's best to stick with == if you don't yet know the type of i. You could use isinstance, but here you can easily convert the only two valid integer values to strings (or vice versa), then work only on strings.
def __getitem__(self, i):
    if i == 0:
        i = "x"
    elif i == 1:
        i = "y"
    if i == "x":
        return self.get_x()
    elif i == "y":
        return self.get_y()
    else:
        raise IndexError("Invalid key: {}".format(i))
Your function is interesting, but there are some issues with it:
In Python 3 you can't compare strings with numbers, so you should first check with == against known values and/or types. For example:
def __getitem__(self, i):
    if i in {0, "x"}:
        return self.x
    elif i in {1, "y"}:
        return self.y
    else:
        raise IndexError(repr(i))
But defined like that (in your code or in the example above), for an instance t1, t1[X] will always fail for any string X other than "x" or "y", because you don't adjust it for any other value. And that is because
pnamedtuple looks like it is meant to be a factory like collections.namedtuple, but it fails to be general enough: you don't use any of the arguments of your function at all. And no, type_name is not used either; whatever value it has is thrown away when you make the class declaration.
How to fix it?
You need another way to store the values of the fields and their respective names, for example a dictionary; let's call it self._data.
To remember what you called your fields, use the argument of your function, for instance self._fields = field_names.
To accept an unknown number of arguments, use * as in __init__(self, *values), then verify that you have the same number of values and fields, and build the data structure from point 1 (the dictionary); a sketch of such an __init__ follows the __getitem__ example below.
Once those are ready, __getitem__ becomes something like:
def __getitem__(self, key):
    if key in self._data:
        return self._data[key]
    elif isinstance(key, int) and 0 <= key < len(self._fields):
        return self._data[self._fields[key]]
    else:
        raise IndexError(repr(key))
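A minimal sketch of an __init__ implementing the three points above (the hard-coded field list and error wording are mine, for illustration; in the real factory _fields would come from field_names):

class Point:
    def __init__(self, *values):
        self._fields = ['x', 'y']
        if len(values) != len(self._fields):
            raise TypeError('expected {} arguments, got {}'.format(
                len(self._fields), len(values)))
        # point 1: store the field values keyed by their names
        self._data = dict(zip(self._fields, values))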
Or you can simply inherit from an appropriate namedtuple, and the only thing you need to do is overwrite its __getitem__, like:
def __getitem__(self, key):
    if key in self._fields:
        return getattr(self, key)
    return super().__getitem__(key)
I have a Flask-SQLAlchemy model that contains several relationships to tables for holding quantities in decimal, fraction, and integer:
class Packet(db.Model):
    # ...
    qty_decimal_id = db.Column(db.Integer, db.ForeignKey('qty_decimals.id'))
    _qty_decimal = db.relationship('QtyDecimal', backref='_packets')
    qty_fraction_id = db.Column(db.Integer, db.ForeignKey('qty_fractions.id'))
    _qty_fraction = db.relationship('QtyFraction', backref='_packets')
    qty_integer_id = db.Column(db.Integer, db.ForeignKey('qty_integers.id'))
    _qty_integer = db.relationship('QtyInteger', backref='_packets')
    # ...
The tables each contain a column named 'value' that holds the actual value, so if I want to store an integer quantity of 100, I store it in ._qty_integer.value. I have created a hybrid_property that gets whichever of these relationships is not null, and sets the relevant relationship depending on what kind of data is detected by the setter:
@hybrid_property
def quantity(self):
    retqty = None
    for qty in [self._qty_decimal, self._qty_fraction, self._qty_integer]:
        if qty is not None:
            if retqty is None:
                retqty = qty
            else:
                raise RuntimeError('More than one type of quantity'
                                   ' was detected for this packet. Only'
                                   ' one type of quantity may be set!')
    return retqty.value
@quantity.setter
def quantity(self, value):
    if is_decimal(value):
        self.clear_quantity()
        self._qty_decimal = QtyDecimal.query.filter_by(value=value).\
            first() or QtyDecimal(value)
    elif is_fraction(value):
        if is_480th(value):
            self.clear_quantity()
            self._qty_fraction = QtyFraction.query.filter_by(value=value).\
                first() or QtyFraction(value)
        else:
            raise ValueError('Fractions must have denominators'
                             ' 480 is divisible by!')
    elif is_int(value):
        self.clear_quantity()
        self._qty_integer = QtyInteger.query.filter_by(value=value).\
            first() or QtyInteger(value)
    else:
        raise ValueError('Could not determine appropriate type for '
                         'quantity! Please make sure it is an integer, '
                         'decimal number, or fraction.')
The getter and setter for .quantity work perfectly as far as I can tell (it's always possible I missed some edge cases in my test suite), but I cannot, for the life of me, figure out how to implement a comparator or expression such that one could, for example, get all packets that have a quantity of 100, or all packets with a quantity of 1/4. (Units are implemented too, but are irrelevant to this question.) As far as I can tell it's not doable as an expression, but it should be doable as a comparator; I just can't figure out how to pull it off. So far this is my best attempt at a comparator:
class QuantityComparator(Comparator):
    def __init__(self, qty_decimal, qty_fraction, qty_integer):
        self.qty_decimal = qty_decimal
        self.qty_fraction = qty_fraction
        self.qty_integer = qty_integer

    def __eq__(self, other):
        if is_decimal(other):
            return self.qty_decimal.value == other
        elif is_fraction(other):
            if is_480th(other):
                return self.qty_fraction.value == other
            else:
                raise ValueError('Cannot query using a fraction with a '
                                 'denominator 480 is not divisible by!')
        elif is_int(other):
            return self.qty_fraction.value == other
        else:
            raise ValueError('Could not parse query value'
                             ' as a valid quantity!')

@quantity.comparator
def quantity(self):
    return self.QuantityComparator(self._qty_decimal,
                                   self._qty_fraction,
                                   self._qty_integer)
Unsurprisingly, this does not work; when I try to run Packet.query.filter_by(quantity=...) it raises an exception:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with Packet._qty_fraction has an attribute 'value'
If the solution to getting this use case to work is in the SQLAlchemy docs, I've either missed it or (more likely) haven't wrapped my head around enough of SQLAlchemy yet to figure it out.
I have come up with a stopgap solution that at least lets users get Packets based on quantity:
@staticmethod
def quantity_equals(value):
    if is_decimal(value):
        return Packet.query.join(Packet._qty_decimal).\
            filter(QtyDecimal.value == value)
    elif is_fraction(value):
        if is_480th(value):
            return Packet.query.join(Packet._qty_fraction).\
                filter(QtyFraction.value == value)
        else:
            raise ValueError('Fraction could not be converted to 480ths!')
    elif is_int(value):
        return Packet.query.join(Packet._qty_integer).\
            filter(QtyInteger.value == value)
This works and gets me what looks to be the correct Query object, as shown by this test:
def test_quantity_equals(self):
    pkt1 = Packet()
    pkt2 = Packet()
    pkt3 = Packet()
    db.session.add_all([pkt1, pkt2, pkt3])
    pkt1.quantity = Decimal('3.14')
    pkt2.quantity = Fraction(1, 4)
    pkt3.quantity = 100
    db.session.commit()
    qty_dec_query = Packet.quantity_equals(Decimal('3.14'))
    self.assertIs(pkt1, qty_dec_query.first())
    self.assertEqual(qty_dec_query.count(), 1)
    qty_frac_query = Packet.quantity_equals(Fraction(1, 4))
    self.assertIs(pkt2, qty_frac_query.first())
    self.assertEqual(qty_frac_query.count(), 1)
    qty_int_query = Packet.quantity_equals(100)
    self.assertIs(pkt3, qty_int_query.first())
    self.assertEqual(qty_int_query.count(), 1)
I can easily make similar dirty methods to substitute for other comparison operators, but I would think it's possible to do it in a custom comparator such as the aforementioned QuantityComparator, or to otherwise achieve the desired ability to use the .quantity property in query filters.
Can anybody help me get the QuantityComparator working, or point me in the right direction for figuring it out myself?
Edit: Solution
While I haven't actually solved the explicit question of making a hybrid_property with multiple relationships queryable, I have solved the core issue: representing a quantity that could be an int, a float (I use float instead of Decimal here because no arithmetic is done on the quantity), or a Fraction in the database, and making it possible to use values in queries the same way they are used by the value setter. The solution was to create a quantities table with columns for a floating point value, an integer numerator, an integer denominator, and a boolean representing whether the stored value should be interpreted as a decimal number (as opposed to a fraction), and to create an sqlalchemy.ext.hybrid.Comparator for the hybrid property which compares against Quantity._float and Quantity.is_decimal:
class Quantity(db.Model):
    # ...
    _denominator = db.Column(db.Integer)
    _float = db.Column(db.Float)
    is_decimal = db.Column(db.Boolean, default=False)
    _numerator = db.Column(db.Integer)
    # ...

    @staticmethod
    def for_cmp(val):
        """Convert val to float so it can be used to query against _float."""
        if Quantity.dec_check(val):  # True if val looks like a decimal number
            return float(val)
        elif isinstance(val, str):
            frac = Quantity.str_to_fraction(val)
        else:
            frac = Fraction(val)
        return float(frac)

    @hybrid_property
    def value(self):
        if self._float is not None:
            if self.is_decimal:
                return self._float
            elif self._denominator == 1:
                return self._numerator
            else:
                return Fraction(self._numerator, self._denominator)
        else:
            return None

    class ValueComparator(Comparator):
        def operate(self, op, other):
            return and_(op(Quantity._float, Quantity.for_cmp(other)),
                        Quantity.is_decimal == Quantity.dec_check(other))

    @value.comparator
    def value(cls):
        return Quantity.ValueComparator(cls)

    @value.setter
    def value(self, val):
        if val is not None:
            if Quantity.dec_check(val):
                self.is_decimal = True
                self._float = float(val)
                self._numerator = None
                self._denominator = None
            else:
                self.is_decimal = False
                if isinstance(val, str):
                    frac = Quantity.str_to_fraction(val)
                else:
                    frac = Fraction(val)
                self._numerator = frac.numerator
                self._denominator = frac.denominator
                self._float = float(frac)
        else:
            self.is_decimal = None
            self._numerator = None
            self._denominator = None
            self._float = None
Since I no longer have the original model I asked this question about, I can't readily go back and answer the question properly, but I'd imagine it could be done using join or select.
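With the comparator in place, queries can use the same kinds of values the setter accepts; a hypothetical example:

# Both comparisons go through ValueComparator.operate, which matches
# against Quantity._float and Quantity.is_decimal under the hood.
quarter = Quantity.query.filter(Quantity.value == Fraction(1, 4)).all()
hundred = Quantity.query.filter(Quantity.value == 100).all()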
I'm implementing a caching service in Python. I'm using a simple dictionary so far. What I'd like to do is to count the number of hits (the number of times a stored value was retrieved by key). The Python builtin dict has no such capability (as far as I know). I searched for 'python dictionary count' and found Counter (also on Stack Overflow), but this doesn't satisfy my requirements, I guess: I don't need to count what already exists; I need to increment something that comes from the outside. And I think that storing another dictionary with just the hit counts is not the best data structure I can get :)
Do you have any ideas how to do it efficiently?
For an alternative method, if you're using Python 3 (or are willing to add this module to your Python 2 project, where it has a slightly different interface), I strongly recommend the lru_cache decorator.
See the functools docs. For example, this code:
from functools import lru_cache

@lru_cache(maxsize=32)
def meth(a, b):
    print("Taking some time", a, b)
    return a + b

print(meth(2, 3))
print(meth(2, 4))
print(meth(2, 3))
...will output:
Taking some time 2 3
5
Taking some time 2 4
6
5 <--- Notice that this function result is cached
As per the documentation, you can get the number of hits and misses with meth.cache_info(), and clear the cache with meth.cache_clear().
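For instance, continuing the example above, the hit/miss counters would reflect the three calls just shown:

print(meth.cache_info())  # CacheInfo(hits=1, misses=2, maxsize=32, currsize=2)
meth.cache_clear()        # empties the cache and resets the statistics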
You can subclass a built-in dict class:
class CustomDict(dict):
    def __init__(self, *args, **kwargs):
        self.hits = {}
        super(CustomDict, self).__init__(*args, **kwargs)

    def __getitem__(self, key):
        if key not in self.hits:
            self.hits[key] = 0
        self.hits[key] += 1
        return super(CustomDict, self).__getitem__(key)
usage:
>>> d = CustomDict()
>>> d["test"] = "test"
>>> d["test"]
'test'
>>> d["test"]
'test'
>>> d.hits["test"]
2
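If you don't mind one extra import, a small variation of the same idea uses collections.Counter for the hits, which removes the manual zero-initialization (this variant is my own sketch, not from the answer above):

from collections import Counter

class CountingDict(dict):
    def __init__(self, *args, **kwargs):
        self.hits = Counter()  # missing keys default to 0
        super(CountingDict, self).__init__(*args, **kwargs)

    def __getitem__(self, key):
        self.hits[key] += 1
        return super(CountingDict, self).__getitem__(key)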
Having another dictionary to store the hit counts is probably not a bad option, but you could also do something like:
class CacheService(object):
    def __init__(self):
        self.data = {}

    def __setitem__(self, key, item):
        self.data[key] = [item, 0]

    def __getitem__(self, key):
        value = self.data[key]
        value[1] += 1
        return value[0]

    def getcount(self, key):
        return self.data[key][1]
You can use it something like this:
>>> cs = CacheService()
>>> cs[1] = 'one'
>>> cs[2] = 'two'
>>> print cs.getcount(1)
0
>>> cs[1]
'one'
>>> print cs.getcount(1)
1
It will be much easier to just subclass the built-in dict data type. This will solve your problem.
class CountDict(dict):
    count = {}

    def __getitem__(self, key):
        CountDict.count[key] = CountDict.count.get(key, 0) + 1
        return super(CountDict, self).__getitem__(key)

    def __setitem__(self, key, value):
        return super(CountDict, self).__setitem__(key, value)

    def get_count(self, key):
        return CountDict.count.get(key, 0)
This will give you a lot more flexibility. For example, you could keep two counts, one for the number of reads and another for the number of writes, without much added complexity. To learn more about super, see the Python docs.
Edited to meet the OP's need of keeping a count for reads of a key. The output can be obtained by calling the get_count method.
>>> my_dict = CountDict()
>>> my_dict["a"] = 1
>>> my_dict["a"]
1
>>> my_dict["a"]
1
>>> my_dict.get_count("a")
2
You could try this approach.
class AccessCounter(object):
    '''A class that contains a value and implements an access counter.
    The counter increments each time the value is changed.'''

    def __init__(self, val):
        super(AccessCounter, self).__setattr__('counter', 0)
        super(AccessCounter, self).__setattr__('value', val)

    def __setattr__(self, name, value):
        if name == 'value':
            super(AccessCounter, self).__setattr__('counter', self.counter + 1)
        # Make this unconditional.
        # If you want to prevent other attributes to be set, raise AttributeError(name)
        super(AccessCounter, self).__setattr__(name, value)

    def __delattr__(self, name):
        if name == 'value':
            super(AccessCounter, self).__setattr__('counter', self.counter + 1)
        super(AccessCounter, self).__delattr__(name)
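A quick usage example (my own, to show the counting behavior):

ac = AccessCounter(10)  # counter starts at 0
ac.value = 20           # each assignment to .value bumps the counter
ac.value = 30
print(ac.counter)       # -> 2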
My problem is the following: I have some Python classes with properties that are derived from other properties; those should be cached once they are calculated, and the cached results should be invalidated each time the base properties are changed.
I could do it manually, but that seems quite difficult to maintain as the number of properties grows. So I would like to have something like Makefile rules inside my objects to automatically keep track of what needs to be recalculated.
The desired syntax and behaviour should be something like that:
# this does dirty magic, like generating the reverse dependency graph,
# and preparing the setters that invalidate the cached values
@dataflow_class
class Test(object):

    def calc_a(self):
        return self.b + self.c

    def calc_c(self):
        return self.d * 2

    a = managed_property(calculate=calc_a, depends_on=('b', 'c'))
    b = managed_property(default=0)
    c = managed_property(calculate=calc_c, depends_on=('d',))
    d = managed_property(default=0)

t = Test()
print t.a
# a has not been initialized, so it calls calc_a
# gets b value
# c has not been initialized, so it calls calc_c
# c value is calculated and stored in t.__c
# a value is calculated and stored in t.__a
t.b = 1
# invalidates the calculated value stored in t.__a
print t.a
# a has been invalidated, so it calls calc_a
# gets b value
# gets c value, from t.__c
# a value is calculated and stored in t.__a
print t.a
# gets value from t.__a
t.d = 2
# invalidates the calculated values stored in t.__a and t.__c
So, is there something like this already available or should I start implementing my own? In the second case, suggestions are welcome :-)
Here, this should do the trick.
The descriptor mechanism (through which the language implements "property") is
more than enough for what you want.
If the code below does not work in some corner cases, just write me.
class DependentProperty(object):
    def __init__(self, calculate=None, default=None, depends_on=()):
        # "name" and "dependence_tree" properties are attributes
        # set up by the metaclass of the owner class
        if calculate:
            self.calculate = calculate
        else:
            self.default = default
        self.depends_on = set(depends_on)

    def __get__(self, instance, owner):
        if hasattr(self, "default"):
            return self.default
        if not hasattr(instance, "_" + self.name):
            setattr(instance, "_" + self.name,
                    self.calculate(instance, getattr(instance, "_" + self.name + "_last_value")))
        return getattr(instance, "_" + self.name)

    def __set__(self, instance, value):
        setattr(instance, "_" + self.name + "_last_value", value)
        setattr(instance, "_" + self.name, self.calculate(instance, value))
        for attr in self.dependence_tree[self.name]:
            delattr(instance, attr)

    def __delete__(self, instance):
        try:
            delattr(instance, "_" + self.name)
        except AttributeError:
            pass
def assemble_tree(name, dict_, all_deps=None):
    if all_deps is None:
        all_deps = set()
    for dependance in dict_[name].depends_on:
        all_deps.add(dependance)
        assemble_tree(dependance, dict_, all_deps)
    return all_deps

def invert_tree(tree):
    new_tree = {}
    for key, val in tree.items():
        for dependence in val:
            if dependence not in new_tree:
                new_tree[dependence] = set()
            new_tree[dependence].add(key)
    return new_tree
class DependenceMeta(type):
    def __new__(cls, name, bases, dict_):
        dependence_tree = {}
        properties = []
        for key, val in dict_.items():
            if not isinstance(val, DependentProperty):
                continue
            val.name = key
            val.dependence_tree = dependence_tree
            dependence_tree[key] = set()
            properties.append(val)
        inverted_tree = {}
        for property in properties:
            inverted_tree[property.name] = assemble_tree(property.name, dict_)
        dependence_tree.update(invert_tree(inverted_tree))
        return type.__new__(cls, name, bases, dict_)
if __name__ == "__main__":
    # Example and visual test:
    class Bla:
        __metaclass__ = DependenceMeta

        def calc_b(self, x):
            print "Calculating b"
            return x + self.a

        def calc_c(self, x):
            print "Calculating c"
            return x + self.b

        a = DependentProperty(default=10)
        b = DependentProperty(depends_on=("a",), calculate=calc_b)
        c = DependentProperty(depends_on=("b",), calculate=calc_c)

    bla = Bla()
    bla.b = 5
    bla.c = 10
    print bla.a, bla.b, bla.c
    bla.b = 10
    print bla.b
    print bla.c
I would like to have something like Makefile rules
then use one! You may consider this model:
one rule = one python file
one result = one *.data file
the pipe is implemented as a makefile or with another dependency analysis tool (cmake, scons)
The hardware test team in our company uses such a framework for intensive exploratory tests:
you can integrate other languages and tools easily
you get a stable and proven solution
computations may be distributed on multiple CPUs/computers
you track dependencies on values and rules
debugging of intermediate values is easy
The (big) downside to this method is that you have to give up Python's import keyword, because it creates an implicit (and untracked) dependency (there are workarounds for this).
import collections

sentinel = object()

class ManagedProperty(object):
    '''
    If deptree = {'a': set(['b', 'c'])}, then ManagedProperties `b` and
    `c` will be reset whenever `a` is modified.
    '''
    def __init__(self, property_name, calculate=None, depends_on=tuple(),
                 default=sentinel):
        self.property_name = property_name
        self.private_name = '_' + property_name
        self.calculate = calculate
        self.depends_on = depends_on
        self.default = default

    def __get__(self, obj, objtype):
        if obj is None:
            # Allows getattr(cls, mprop) to return the ManagedProperty instance
            return self
        try:
            return getattr(obj, self.private_name)
        except AttributeError:
            result = (getattr(obj, self.calculate)()
                      if self.default is sentinel else self.default)
            setattr(obj, self.private_name, result)
            return result

    def __set__(self, obj, value):
        # obj._dependencies is defined by @register
        map(obj.__delattr__, getattr(obj, '_dependencies').get(self.property_name, tuple()))
        setattr(obj, self.private_name, value)

    def __delete__(self, obj):
        if hasattr(obj, self.private_name):
            delattr(obj, self.private_name)
def register(*mproperties):
    def flatten_dependencies(name, deptree, all_deps=None):
        '''
        A deptree such as {'c': set(['a']), 'd': set(['c'])} means
        'a' depends on 'c', and 'c' depends on 'd'.

        Given such a deptree, flatten_dependencies('d', deptree) returns the set
        of all property_names that depend on 'd' (i.e. set(['a', 'c']) in the
        above case).
        '''
        if all_deps is None:
            all_deps = set()
        for dep in deptree.get(name, tuple()):
            all_deps.add(dep)
            flatten_dependencies(dep, deptree, all_deps)
        return all_deps

    def classdecorator(cls):
        deptree = collections.defaultdict(set)
        for mprop in mproperties:
            setattr(cls, mprop.property_name, mprop)
        # Find all ManagedProperties in dir(cls). Note that some of these may be
        # inherited from bases of cls; they may not be listed in mproperties.
        # Doing it this way allows ManagedProperties to be overridden by subclasses.
        for propname in dir(cls):
            mprop = getattr(cls, propname)
            if not isinstance(mprop, ManagedProperty):
                continue
            for underlying_prop in mprop.depends_on:
                deptree[underlying_prop].add(mprop.property_name)
        # Flatten the dependency tree so no recursion is necessary. If one were
        # to use recursion instead, then a naive algorithm would make duplicate
        # calls to __delete__. By flattening the tree, there are no duplicate
        # calls to __delete__.
        dependencies = {key: flatten_dependencies(key, deptree)
                        for key in deptree.keys()}
        setattr(cls, '_dependencies', dependencies)
        return cls
    return classdecorator
These are the unit tests I used to verify its behavior.
if __name__ == "__main__":
    import unittest
    import sys

    def count(meth):
        def wrapper(self, *args):
            countname = meth.func_name + '_count'
            setattr(self, countname, getattr(self, countname, 0) + 1)
            return meth(self, *args)
        return wrapper

    class Test(unittest.TestCase):
        def setUp(self):
            @register(
                ManagedProperty('d', default=0),
                ManagedProperty('b', default=0),
                ManagedProperty('c', calculate='calc_c', depends_on=('d',)),
                ManagedProperty('a', calculate='calc_a', depends_on=('b', 'c')))
            class Foo(object):
                @count
                def calc_a(self):
                    return self.b + self.c

                @count
                def calc_c(self):
                    return self.d * 2

            @register(ManagedProperty('c', calculate='calc_c', depends_on=('b',)),
                      ManagedProperty('a', calculate='calc_a', depends_on=('b', 'c')))
            class Bar(Foo):
                @count
                def calc_c(self):
                    return self.b * 3

            self.Foo = Foo
            self.Bar = Bar
            self.foo = Foo()
            self.foo2 = Foo()
            self.bar = Bar()

        def test_two_instances(self):
            self.foo.b = 1
            self.assertEqual(self.foo.a, 1)
            self.assertEqual(self.foo.b, 1)
            self.assertEqual(self.foo.c, 0)
            self.assertEqual(self.foo.d, 0)
            self.assertEqual(self.foo2.a, 0)
            self.assertEqual(self.foo2.b, 0)
            self.assertEqual(self.foo2.c, 0)
            self.assertEqual(self.foo2.d, 0)

        def test_initialization(self):
            self.assertEqual(self.foo.a, 0)
            self.assertEqual(self.foo.calc_a_count, 1)
            self.assertEqual(self.foo.a, 0)
            self.assertEqual(self.foo.calc_a_count, 1)
            self.assertEqual(self.foo.b, 0)
            self.assertEqual(self.foo.c, 0)
            self.assertEqual(self.foo.d, 0)
            self.assertEqual(self.bar.a, 0)
            self.assertEqual(self.bar.b, 0)
            self.assertEqual(self.bar.c, 0)
            self.assertEqual(self.bar.d, 0)

        def test_dependence(self):
            self.assertEqual(self.Foo._dependencies,
                             {'c': set(['a']), 'b': set(['a']), 'd': set(['a', 'c'])})
            self.assertEqual(self.Bar._dependencies,
                             {'c': set(['a']), 'b': set(['a', 'c'])})

        def test_setting_property_updates_dependent(self):
            self.assertEqual(self.foo.a, 0)
            self.assertEqual(self.foo.calc_a_count, 1)
            self.foo.b = 1
            # invalidates the calculated value stored in foo.a
            self.assertEqual(self.foo.a, 1)
            self.assertEqual(self.foo.calc_a_count, 2)
            self.assertEqual(self.foo.b, 1)
            self.assertEqual(self.foo.c, 0)
            self.assertEqual(self.foo.d, 0)
            self.foo.d = 2
            # invalidates the calculated values stored in foo.a and foo.c
            self.assertEqual(self.foo.a, 5)
            self.assertEqual(self.foo.calc_a_count, 3)
            self.assertEqual(self.foo.b, 1)
            self.assertEqual(self.foo.c, 4)
            self.assertEqual(self.foo.d, 2)
            self.assertEqual(self.bar.a, 0)
            self.assertEqual(self.bar.calc_a_count, 1)
            self.assertEqual(self.bar.b, 0)
            self.assertEqual(self.bar.c, 0)
            self.assertEqual(self.bar.calc_c_count, 1)
            self.assertEqual(self.bar.d, 0)
            self.bar.b = 2
            self.assertEqual(self.bar.a, 8)
            self.assertEqual(self.bar.calc_a_count, 2)
            self.assertEqual(self.bar.b, 2)
            self.assertEqual(self.bar.c, 6)
            self.assertEqual(self.bar.calc_c_count, 2)
            self.assertEqual(self.bar.d, 0)
            self.bar.d = 2
            self.assertEqual(self.bar.a, 8)
            self.assertEqual(self.bar.calc_a_count, 2)
            self.assertEqual(self.bar.b, 2)
            self.assertEqual(self.bar.c, 6)
            self.assertEqual(self.bar.calc_c_count, 2)
            self.assertEqual(self.bar.d, 2)

    sys.argv.insert(1, '--verbose')
    unittest.main(argv=sys.argv)