I have a complex data structure, essentially a dict where the keys are hashed instances. If I only know the hashed value that equals the key, how can I get back the instance? I can do it by brute force, but it seems I should be able to get the key/instance in O(1) time.
class Test:
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar
        self.arr = ["extra"]

    def __str__(self):
        return self.foo + self.bar

    def __hash__(self):
        return hash(str(self))

    def __eq__(self, other):
        return hash(self) == hash(other)

my_thing = Test("FOO", "BAR")
my_dict = dict()
my_dict[my_thing] = 1

for k, v in my_dict.iteritems():
    if k == "FOOBAR":
        print k.arr
Edit: I want to be able to get at the mutable data in the instance (in this case the array), so that if I only know the hash of "FOOBAR" I can get back ["extra"] without traversing the entire dictionary matching keys (the for loop at the bottom).
The dict should map a key (foo, bar) to the data you want to retrieve. Here is one way you could implement this:
class Test(object):
    ...
    def key(self):
        return (self.foo, self.bar)

my_thing = Test("FOO", "BAR")
my_dict = {}
my_dict[my_thing.key()] = my_thing
print my_dict[("FOO", "BAR")].arr
Note that I modified your key function to avoid collisions like:
Test("FOO", "BAR") == Test("FOOB", "AR")
It sounds like you have the string "FOOBAR" (for example) and you want to retrieve a key k in your dict for which str(k) == "FOOBAR".
One way to do this is to just reconstruct a new Test object that will have the same string representation and use that for your lookup:
value = my_dict[Test("FOO", "BAR")]
But this is inefficient and creates a dependency between object creation and your string representation.
If this string representation has its own intrinsic value as a key, you can just maintain your dict (or another dict) keyed by string instead:
my_index = dict()
my_index[str(my_thing)] = my_thing
That way, you can lookup your value given the string only:
print my_index["FOOBAR"].arr
My problem can be divided into two parts. The first is to not allow two equal values in the dictionary. For example, I have this class:
class MyClass():
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __key(self):
        return tuple(self.__dict__[key] for key in self.__dict__)

    def __eq__(self, other):
        if isinstance(other, type(self)):
            return self.__key() == other.__key()
        return NotImplemented
And I want to create and store many objects in a dictionary like this:
if __name__ == '__main__':
    obj1 = MyClass(1, 2, 3)
    obj2 = MyClass(3, 4, 5)
    obj3 = MyClass(1, 2, 3)

    myDict = {}          # empty dictionary
    myDict['a'] = obj1   # one key-value item
    myDict['b'] = obj2   # two key-value items
    myDict['c'] = obj3   # not allowed, value already stored
How can I be sure that obj3 can't be stored in the dictionary?
The second part of my problem is to track when a mutable object changes, to prevent it from becoming equal to another value already in the dictionary, i.e.:
obj2.a = 1; obj2.b = 2; obj2.c = 3 # not allowed
I coded a class MyContainer that inherits from dict to store the values (with unique keys), and I added a set to track the values in the dictionary, i.e.:
class MyContainer(dict):
    def __init__(self):
        super(MyContainer, self).__init__()
        self.unique_object_values = set()

    def __setitem__(self, key, value):
        if key not in self:  # overwriting is not allowed
            if value not in self.unique_object_values:  # duplicate object values are not allowed
                super(MyContainer, self).__setitem__(key, value)
                self.unique_object_values.add(value)
            else:
                print("Object already exists. Object not stored")
        else:
            print("Key already exists. Object not stored")
And I added a parent member to MyClass to check whether the values are already stored, but I'm not sure whether a data structure already exists that solves my problem.
Make another dictionary keyed by the values, and check it before adding to the original dictionary; if the value is already present, don't add it.
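A rough sketch of that idea (my own illustration, not a complete solution). It assumes MyClass also defines __hash__ from its __key(), which the membership test needs:

values_index = {}   # hypothetical reverse index: object -> key it was stored under

def add_if_new(d, key, obj):
    # only insert when neither the key nor an equal object is already present
    if key in d:
        print("Key already exists. Object not stored")
    elif obj in values_index:
        print("Object already exists. Object not stored")
    else:
        d[key] = obj
        values_index[obj] = key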
I want to create a subclass of dict that includes a custom comparison function that applies to all nested dicts. This example class ignores all dict values with the key 'j' at the top level, but doesn't replace lower level dicts when a copy is made:
import copy

p = {'a': 1, 'j': 2, 'c': [{'j': 'cat', 'k': 'dog'}]}

class udict(dict):
    def __init__(self, x):
        dict.__init__(self, copy.deepcopy(x))

    def __eq__(self, other):
        return all([self[k] == other[k] for k in set(self.keys()) - set('j')])

a = udict(p)
b = udict(p)
a == b  # True
b['j'] = 5
a == b  # True - 'j' keys are imaginary and invisible
b['a'] = 5
a == b  # False
b = udict(p)
b['c'][0]['j'] = 'bird'
a == b  # False (should be True, but the list contains dicts, not udicts)
I could manually tree-walk arbitrarily deep data structures replacing each dict with a udict, but if I have to walk the data structure anyway, I'll just do the comparison in the recursion without defining a custom class.
So is there a way to define a custom subclass that automatically replaces all embedded instances of the base class?
You can implement the __deepcopy__ method on your custom class (see https://docs.python.org/2/library/copy.html). You will have to "use recursion", but it still seems easier than anything else you'd have to do in there:
from copy import deepcopy

def custom_deepcopier(dct, memo=None):
    result = MD()
    for key, value in dct.items():
        if isinstance(value, dict):
            result[key] = MD(value)
        else:
            result[key] = deepcopy(value, memo)
    return result

class MD(dict):
    def __init__(self, x=None):
        if x:
            dict.__init__(self, custom_deepcopier(x))

    def __eq__(self, other):
        ...

    __deepcopy__ = custom_deepcopier
By declaring things this way, custom_deepcopier is used both as the __deepcopy__ method called automatically when deep-copying one of your custom dicts, and as a stand-alone function that can be "bootstrapped" with a plain dictionary.
And finally, not directly related to the answer you need: in your real code, consider inheriting from collections.UserDict instead of dict - there are some shortcuts in the native code for dicts that can bring bad surprises for your subclasses (including in the recursion used for __eq__).
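For illustration, a minimal sketch of what that swap might look like (my own, assuming Python 3, where UserDict lives in collections); __eq__ is left as a placeholder just as above:

from collections import UserDict

class MD(UserDict):
    # UserDict routes all item access through __setitem__/__getitem__,
    # so nested plain dicts can be converted as they are stored
    def __setitem__(self, key, value):
        if isinstance(value, dict) and not isinstance(value, MD):
            value = MD(value)
        super().__setitem__(key, value)

    def __eq__(self, other):
        ...  # custom comparison, as above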
A simpler approach requires no copying of data, and the recursion that replaces selected dicts with a subclass is short, explicit, and easily understandable. The subclass overrides only the equality test, it doesn't need __init__ or __copy__ methods:
class MyDict(dict):
    def __eq__(self, other):
        return <custom equality test result>

def replaceable(var):
    if <dict instance should be replaced by subclass instance>:
        return <dict of instances to be replaced>
    return {}

def replacedict(var):
    if isinstance(var, list):
        for i, v in enumerate(var):
            var[i] = replacedict(v)
    elif isinstance(var, dict):
        for k, v in var.items():
            var[k] = replacedict(v)
        rep = replaceable(var)
        for k, v in rep.items():
            rep[k] = MyDict(v)
    return var
For the specific case of checking JSON Schemas to see whether multiple properties can be merged into a patternProperties:
def replaceable(var):
    if 'type' in var and var['type'] == 'object' and \
       'properties' in var and isinstance(var['properties'], dict):
        return var['properties']
    return {}
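To make this concrete for the original p from the question, here is one way the placeholders might be filled in (my own illustration, simplified so that every nested dict gets wrapped and new objects are built rather than replaced in place):

p = {'a': 1, 'j': 2, 'c': [{'j': 'cat', 'k': 'dog'}]}

class MyDict(dict):
    def __eq__(self, other):
        # ignore 'j' keys on either side, compare everything else
        keys = (set(self) | set(other)) - {'j'}
        return all(k in self and k in other and self[k] == other[k] for k in keys)

def replacedict(var):
    # rebuild the structure, wrapping every nested dict in MyDict
    if isinstance(var, list):
        return [replacedict(v) for v in var]
    if isinstance(var, dict):
        return MyDict((k, replacedict(v)) for k, v in var.items())
    return var

a = replacedict(p)
b = replacedict(p)
b['c'][0]['j'] = 'bird'
print(a == b)   # True: the nested 'j' difference is ignored
b['c'][0]['k'] = 'bird'
print(a == b)   # False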
I want to dynamically query which objects from a class I would like to retrieve. getattr seems like what I want, and it performs fine for top-level objects in the class. However, I'd like to also specify sub-elements.
class MyObj(object):
    def __init__(self):
        self.d = {'a': 1, 'b': 2}
        self.c = 3

myobj = MyObj()

val = getattr(myobj, "c")
print val  # Correctly prints 3

val = getattr(myobj, "d['a']")  # Seemingly incorrectly formatted query
print val  # Throws an AttributeError
How can I get the object's dictionary elements via a string?
The reason you're getting an error is that getattr(myobj, "d['a']") looks for an attribute named d['a'] on the object, and there isn't one. Your attribute is named d and it's a dictionary. Once you have a reference to the dictionary, then you can access items in it.
mydict = getattr(myobj, "d")
val = mydict["a"]
Or as others have shown, you can combine this in one step (I showed it as two to better illustrate what is actually happening):
val = getattr(myobj, "d")["a"]
Your question implies that you think that items of a dictionary in an object are "sub-elements" of the object. An item in a dictionary, however, is a different thing from an attribute of an object. (getattr() wouldn't work with something like o.a either, though; it just gets one attribute of one object. If that's an object too and you want to get one of its attributes, that's another getattr().)
You can pretty easily write a function that walks an attribute path (given in a string) and attempts to resolve each name either as a dictionary key or an attribute:
def resolve(obj, attrspec):
    for attr in attrspec.split("."):
        try:
            obj = obj[attr]
        except (TypeError, KeyError):
            obj = getattr(obj, attr)
    return obj
The basic idea here is that you take a path and for each component of the path, try to find either an item in a dictionary-like container or an attribute on an object. When you get to the end of the path, return what you've got. Your example would be resolve(myobj, "d.a")
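For instance, a quick check of the function above, using the MyObj class from the question:

myobj = MyObj()
print resolve(myobj, "c")    # 3
print resolve(myobj, "d.a")  # 1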
You simply use square brackets to get the dictionary's element:
val = getattr(myobj, "d")["a"]
That'll set val to 1.
If you need the dictionary item to be dynamic as well, you'll need to call get on the result of getattr:
value = getattr(myobj, 'd').get('a')
Thanks to Kindall's answer, I found the following works well for dict keys that are strings.
class Obj2(object):
    def __init__(self):
        self.d = {'a': 'A', 'b': 'B', 'c': {'three': 3, 'twothree': (2, 3)}}
        self.c = 4

class MyObj(object):
    def __init__(self):
        self.d = {'a': 1, 'b': 2, 'c': {'two': 2, 'onetwo': (1, 2)}}
        self.c = 3
        self.obj2 = Obj2()

    def resolve(self, obj, attrspec):
        attrssplit = attrspec.split(".")
        attr = attrssplit[0]
        try:
            obj = obj[attr]
        except (TypeError, KeyError):
            obj = getattr(obj, attr)
        if len(attrssplit) > 1:
            attrspec = attrspec.partition(".")[2]  # right part of the string
            return self.resolve(obj, attrspec)  # recurse
        return obj

    def __getattr__(self, name):
        return self.resolve(self, name)

# Test
myobj = MyObj()
print getattr(myobj, "c")
print getattr(myobj, "d.a")
print getattr(myobj, "d.c.two")
print getattr(myobj, "obj2.d.a")
print getattr(myobj, "obj2.d.c.twothree")
I need a collection of objects which can be looked up by a certain (unique) attribute common to each of the objects. Right now I am using a dictionary, assigning the attribute as the dictionary key.
Here is an example of what I have now:
class Item():
    def __init__(self, uniq_key, title=None):
        self.key = uniq_key
        self.title = title

item_instance_1 = Item("unique_key1", title="foo")
item_instance_2 = Item("unique_key3", title="foo")
item_instance_3 = Item("unique_key2", title="foo")

item_collection = {
    item_instance_1.key: item_instance_1,
    item_instance_2.key: item_instance_2,
    item_instance_3.key: item_instance_3
}

item_instance_1.key = "new_key"
Now this seems a rather cumbersome solution, as the key is not a reference to the attribute but takes the value of the key-attribute on assignment, meaning that:
the keys of the dictionary duplicate information already present in form of the object attribute and
when the object attribute is changed the dictionary key is not updated.
Using a list and iterating through the object seems even more inefficient.
So, is there more fitting data structure than dict for this particular case, a collection of objects giving me random access based on a certain object attribute?
This would need to work with Python 2.4 as that's what I am stuck with (at work).
If it hasn't been obvious, I'm new to Python.
There is actually no duplication of information as you fear: the dict's key, and the object's .key attribute, are just two references to exactly the same object.
The only real problem is "what if the .key gets reassigned". Well then, clearly you must use a property that updates all the relevant dicts as well as the instance's attribute; so each object must know all the dicts in which it may be enregistered. Ideally one would want to use weak references for the purpose, to avoid circular dependencies, but, alas, you can't take a weakref.ref (or proxy) to a dict. So, I'm using normal references here, instead (the alternative is not to use dict instances but e.g. some special subclass -- not handy).
def enregister(d, obj):
    obj.ds.append(d)
    d[obj.key] = obj

class Item(object):
    def __init__(self, uniq_key, title=None):
        self._key = uniq_key
        self.title = title
        self.ds = []

    def adjust_key(self, newkey):
        newds = [d for d in self.ds if self._key in d]
        for d in newds:
            del d[self._key]
            d[newkey] = self
        self.ds = newds
        self._key = newkey

    def get_key(self):
        return self._key

    key = property(get_key, adjust_key)
Edit: if you want a single collection with ALL the instances of Item, that's even easier, as you can make the collection a class-level attribute; indeed it can be a WeakValueDictionary to avoid erroneously keeping items alive, if that's what you need. I.e.:
import weakref

class Item(object):
    all = weakref.WeakValueDictionary()

    def __init__(self, uniq_key, title=None):
        self._key = uniq_key
        self.title = title
        # here, if needed, you could check that the key
        # is not ALREADY present in self.all
        self.all[self._key] = self

    def adjust_key(self, newkey):
        # "key non-uniqueness" could be checked here too
        del self.all[self._key]
        self.all[newkey] = self
        self._key = newkey

    def get_key(self):
        return self._key

    key = property(get_key, adjust_key)
Now you can use Item.all['akey'], Item.all.get('akey'), for akey in Item.all:, and so forth -- all the rich functionality of dicts.
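A quick usage sketch of that version (my own, relying on the WeakValueDictionary class above):

it = Item("unique_key1", title="foo")
print Item.all["unique_key1"].title   # foo
it.key = "new_key"                    # goes through the property setter
print "unique_key1" in Item.all       # False
print Item.all["new_key"].title       # foo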
There are a number of great things you can do here. One example would be to let the class keep track of everything:
class Item():
    _member_dict = {}

    @classmethod
    def get_by_key(cls, key):
        return cls._member_dict[key]

    def __init__(self, uniq_key, title=None):
        self.key = uniq_key
        self.__class__._member_dict[uniq_key] = self
        self.title = title
>>> i = Item('foo')
>>> i == Item.get_by_key('foo')
True
Note you will retain the update problem: if key changes, the _member_dict falls out of sync. This is where encapsulation will come in handy: make it (practically) impossible to change key without updating the dictionary. For a good tutorial on how to do that, see this tutorial.
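One hedged sketch of what that encapsulation might look like (my own addition, not the linked tutorial's code): make key a property whose setter keeps _member_dict in sync:

class Item(object):
    _member_dict = {}

    @classmethod
    def get_by_key(cls, key):
        return cls._member_dict[key]

    def __init__(self, uniq_key, title=None):
        self._key = uniq_key
        self._member_dict[uniq_key] = self
        self.title = title

    @property
    def key(self):
        return self._key

    @key.setter
    def key(self, new_key):
        # keep the lookup dict in sync whenever the key changes
        del self._member_dict[self._key]
        self._member_dict[new_key] = self
        self._key = new_key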
Well, dict really is what you want. What may be cumbersome is not the dict itself, but the way you are building it. Here is a slight enhancement to your example, showing how to use a list comprehension and the dict constructor to easily create your lookup dict. This also shows how to create a multimap kind of dict, to look up matching items given a field value that might be duplicated across items:
class Item(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __str__(self):
        return str(self.__dict__)

    def __repr__(self):
        return str(self)

allitems = [
    Item(key="red", title="foo"),
    Item(key="green", title="foo"),
    Item(key="blue", title="foofoo"),
]

# if fields are unique
itemByKey = dict([(i.key, i) for i in allitems])

# if a field value can be duplicated across items
# (for Python 2.5 and higher, you could use a defaultdict from
# the collections module)
itemsByTitle = {}
for i in allitems:
    if i.title in itemsByTitle:
        itemsByTitle[i.title].append(i)
    else:
        itemsByTitle[i.title] = [i]

print itemByKey["red"]
print itemsByTitle["foo"]
Prints:
{'key': 'red', 'title': 'foo'}
[{'key': 'red', 'title': 'foo'}, {'key': 'green', 'title': 'foo'}]
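For completeness, the defaultdict variant mentioned in the comment above (Python 2.5+) would look something like this:

from collections import defaultdict

itemsByTitle = defaultdict(list)
for i in allitems:
    itemsByTitle[i.title].append(i)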
Editing to correct the problem I had, which was due to my "collection = dict()" default parameter (*bonk*). Now each call to the function returns a class with its own collection, as intended; this is for convenience in case more than one such collection is needed. I'm also putting the collection in the class and just returning the class, instead of the two separately in a tuple as before. (I'm leaving the default container here as dict(), but it could be changed to Alex's WeakValueDictionary, which is of course very cool.)
def make_item_collection(container=None):
    ''' Create a class designed to be collected in a specific collection. '''
    container = dict() if container is None else container

    class CollectedItem(object):
        collection = container

        def __init__(self, key, title=None):
            self.key = key
            CollectedItem.collection[key] = self
            self.title = title

        def update_key(self, new_key):
            CollectedItem.collection[new_key] = CollectedItem.collection.pop(self.key)
            self.key = new_key

    return CollectedItem

# Usage demo...
Item = make_item_collection()
my_collection = Item.collection

item_instance_1 = Item("unique_key1", title="foo1")
item_instance_2 = Item("unique_key2", title="foo2")
item_instance_3 = Item("unique_key3", title="foo3")

for k, v in my_collection.iteritems():
    print k, v.title

item_instance_1.update_key("new_unique_key")
print '****'

for k, v in my_collection.iteritems():
    print k, v.title
And here's the output in Python 2.5.2:
unique_key1 foo1
unique_key2 foo2
unique_key3 foo3
****
new_unique_key foo1
unique_key2 foo2
unique_key3 foo3
I'm writing an "environment" where each variable is composed of a value and a description:
class my_var:
    def __init__(self, value, description):
        self.value = value
        self.description = description
Variables are created and put inside a dictionary:
my_dict["foo"] = my_var(0.5, "A foo var")
This is cool, but 99% of operations on a variable involve the "value" member, so I have to write things like this:
print my_dict["foo"].value + 15 # Prints 15.5
or
my_dict["foo"].value = 17
I'd like all operations on the object my_dict["foo"] to default to the "value" member. In other words, I'd like to write:
print my_dict["foo"] + 15 # Prints 15.5
and stuff like that.
The only way I found is to reimplement all the double-underscore methods (__eq__, __add__, __str__, etc.), but I feel like this is the wrong way somehow. Is there a magic method I could use?
A workaround would be to have more dictionaries, like this:
my_dict_value["foo"] = 0.5
my_dict_description["foo"] = "A foo var"
but I don't like this solution. Do you have any suggestions?
Two general notes.
Please use Upper Case for Class Names.
Please (unless using Python 3.0) subclass object. class My_Var(object):, for example.
Now to your question.
Let's say you do
x= My_Var(0.5, "A foo var")
How does python distinguish between x, the composite object and x's value (x.value)?
Do you want the following behavior?
Sometimes x means the whole composite object.
Sometimes x means x.value.
How do you distinguish between the two? How will you tell Python which you mean?
I would personally just use two dictionaries, one for values and one for descriptions. Your desire for magic behavior is not very Pythonic.
With that being said, you could implement your own dict class:
class DescDict(dict):
    def __init__(self, *args, **kwargs):
        self.descs = {}
        dict.__init__(self)

    def __getitem__(self, name):
        return dict.__getitem__(self, name)

    def __setitem__(self, name, tup):
        value, description = tup
        self.descs[name] = description
        dict.__setitem__(self, name, value)

    def get_desc(self, name):
        return self.descs[name]
You'd use this class as follows:
my_dict = DescDict()
my_dict["foo"] = (0.5, "A foo var") # just use a tuple if you only have 2 vals
print my_dict["foo"] + 15 # prints 15.5
print my_dict.get_desc("foo") # prints 'A foo var'
If you decide to go the magic behavior route, then this should be a good starting point.
You could create an object that mostly acts like "value" but has an additional attribute "description", by implementing the operators in the section "Emulating numeric types" of http://docs.python.org/reference/datamodel.html
class Fooness(object):
    def __init__(self, val, description):
        self._val = val
        self.description = description

    def __add__(self, other):
        return self._val + other

    def __sub__(self, other):
        return self._val - other

    def __mul__(self, other):
        return self._val * other

    # etc

    def __str__(self):
        return str(self._val)

f = Fooness(10, "my f'd up fooness")

b = f + 10
print 'b=', b

d = f - 7
print 'd=', d

print 'f.description=', f.description
Produces:
b= 20
d= 3
f.description= my f'd up fooness
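One detail the "# etc" leaves out (my addition, not the answerer's): with only __add__ defined, the reversed form 10 + f raises a TypeError, so the reflected methods are worth adding to Fooness as well:

    def __radd__(self, other):
        return other + self._val

    def __rsub__(self, other):
        return other - self._val

    def __rmul__(self, other):
        return other * self._val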
I think the fact that you have to do so much work to make a fancy shortcut is an indication that you're going against the grain. What you're doing violates LSP; it's counter-intuitive.
my_dict[k] = v
print my_dict[k] == v  # should be True
Even two separate dicts would be preferable to changing the meaning of dict.