How to create a Python dictionary that only accepts unique mutable objects

My problem can be divided into two parts. The first is to prevent two equal values from being stored in the dictionary. For example, I have this class:
class MyClass():
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def __key(self):
        return tuple(self.__dict__[key] for key in self.__dict__)
    def __eq__(self, other):
        if isinstance(other, type(self)):
            return self.__key() == other.__key()
        return NotImplemented
And I want to create and store many objects in a dictionary like this:
if __name__ == '__main__':
    obj1 = MyClass(1, 2, 3)
    obj2 = MyClass(3, 4, 5)
    obj3 = MyClass(1, 2, 3)
    myDict = {} # empty dictionary
    myDict['a'] = obj1 # one key-value item
    myDict['b'] = obj2 # two key-value items
    myDict['c'] = obj3 # not allowed, value already stored
How can I make sure that obj3 can't be stored in the dictionary?
The second part of my problem is tracking when a mutable object changes, to prevent it from becoming equal to another value in the dictionary, i.e.:
obj2.a = 1; obj2.b = 2; obj2.c = 3 # not allowed
I coded a class MyContainer that inherits from dict to store the values (with unique keys), and I added a set to track the values in the dictionary, i.e.:
class MyContainer(dict):
    def __init__(self):
        self.unique_object_values = set()
    def __setitem__(self, key, value):
        if key not in self: # overwriting is not allowed
            if value not in self.unique_object_values: # duplicate object values are not allowed
                super(MyContainer, self).__setitem__(key, value)
                self.unique_object_values.add(value)
            else:
                print("Object already exists. Object was not stored")
        else:
            print("Key already exists. Object was not stored")
I also added a parent member to MyClass to check whether the values are already stored, but I'm not sure whether a data structure already exists that solves my problem.

Maintain a second dictionary of the stored values and check it before adding a value to the original dictionary: if the value is already present, don't add it.
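A minimal Python 3 sketch of that idea follows. The names Record, key(), UniqueValueDict and _seen are hypothetical and not from the question; Record stands in for MyClass with a public key() method that returns a hashable snapshot of its state. The sketch only checks at insertion time: it does not notice an object being mutated afterwards, it does not handle deletion, and dict methods such as update() and setdefault() bypass __setitem__.

```python
class Record:
    """Stand-in for MyClass; key() returns a hashable snapshot of its state."""

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def key(self):
        return (self.a, self.b, self.c)


class UniqueValueDict(dict):
    """Dict that refuses a value whose key() is already stored under another key."""

    def __init__(self):
        super().__init__()
        self._seen = {}  # second dictionary: value snapshot -> dictionary key

    def __setitem__(self, key, value):
        snapshot = value.key()
        if snapshot in self._seen and self._seen[snapshot] != key:
            raise ValueError("equal value already stored under %r" % (self._seen[snapshot],))
        if key in self:
            # Overwriting: forget the snapshot of the value being replaced.
            self._seen.pop(super().__getitem__(key).key(), None)
        super().__setitem__(key, value)
        self._seen[snapshot] = key


d = UniqueValueDict()
d['a'] = Record(1, 2, 3)
d['b'] = Record(3, 4, 5)
try:
    d['c'] = Record(1, 2, 3)
except ValueError as exc:
    print("rejected:", exc)
print(sorted(d))  # ['a', 'b']
```

Raising an exception instead of printing lets the caller decide how to react to a rejected value.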

Related

Overwrite value in a different instance variable? [duplicate]

Why am I getting the previous value into the different instance variables?
For example:
d1 = myDict()
d2 = myDict()
d1.assign(2,3)
d2.assign(2,2)
print(d1.getval(2))
print(d2.getval(2))
class myDict(object):
    """ Implements a dictionary without using a dictionary """
    aDict = {}
    def __init__(self):
        """ initialization of your representation """
        #FILL THIS IN
    def assign(self, k, v):
        """ k (the key) and v (the value), immutable objects """
        #FILL THIS IN
        self.k = k
        self.v = v
        if self.k not in myDict.aDict:
            self.aDict[self.k] = self.v
        else:
            self.aDict[self.k] = self.v
    def getval(self, k):
        """ k, immutable object """
        #FILL THIS IN
        if k in myDict.aDict:
            return myDict.aDict[k]
        else:
            KeyError ('KeyError successfully raised')
            # return myDict.aDict[k]
    def delete(self, k):
        """ k, immutable object """
        #FILL THIS IN
        if k in myDict.aDict.keys():
            del myDict.aDict[k]
        else:
            raise KeyError('KeyError successfully raised')
    def __str__(self):
        return str(myDict.aDict)
d1 = myDict()
d2 = myDict()
d1.assign(2,3)
d2.assign(2,2)
print(d1.getval(2))
print(d2.getval(2))
My output:
2
2
4
1
Correct output:
3
2
4
1
aDict has been defined as a class-level attribute, which means that all instances of myDict share the same aDict.
To give each instance its own aDict, create it in the __init__ method:
def __init__(self):
    self.aDict = {}
It looks like aDict is not an instance variable, but a static variable. This means there exists only one aDict instance on a class-level which is shared by all instances of your myDict class. As a result, any changes you make to aDict in any given myDict instance will be reflected in all myDict instances.
In addition, I'd like to point out that you're not following the instructions of the assignment. The docstring of your class says you must implement this class without using a dictionary.
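The difference between a class-level and an instance-level attribute can be seen in a small Python 3 sketch. SharedStore and OwnStore are hypothetical names used only for illustration; they mirror the asker's assign method with the dictionary defined in the two different places.

```python
class SharedStore:
    data = {}  # class attribute: one dict shared by every instance

    def assign(self, k, v):
        self.data[k] = v


class OwnStore:
    def __init__(self):
        self.data = {}  # instance attribute: a fresh dict per instance

    def assign(self, k, v):
        self.data[k] = v


s1, s2 = SharedStore(), SharedStore()
s1.assign(2, 3)
s2.assign(2, 2)
print(s1.data[2], s2.data[2])  # 2 2 (the second assign overwrote the shared dict)

o1, o2 = OwnStore(), OwnStore()
o1.assign(2, 3)
o2.assign(2, 2)
print(o1.data[2], o2.data[2])  # 3 2
```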

Getting a dictionary key that is an instance

I have a complex data structure, essentially a dict where the keys are hashed instances. If I only know the hashed value that equals the key, how can I get back the instance? I can do it by brute force, but it seems I should be able to get the key/instance in O(1) time.
class Test:
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar
        self.arr = ["extra"]
    def __str__(self):
        return self.foo + self.bar
    def __hash__(self):
        return hash(str(self))
    def __eq__(self, other):
        return hash(self) == hash(other)
my_thing = Test("FOO", "BAR")
my_dict = dict()
my_dict[my_thing] = 1
for k, v in my_dict.iteritems():
    if k == "FOOBAR":
        print k.arr
Edit: I want to be able to get the mutable data in the instance (in this case the array). So that if I only know the hash of "FOOBAR" I would like to be able to get ["extra"], without having to traverse the entire dictionary matching keys (the for loop at the bottom)
The dict should map a key (foo, bar) to the data you want to retrieve. Here is one way you could implement this:
class Test(object):
    ...
    def key(self):
        return (self.foo, self.bar)
my_thing = Test("FOO", "BAR")
my_dict = {}
my_dict[my_thing.key()] = my_thing
print my_dict[("FOO", "BAR")].arr
Note that I modified your key function to avoid collisions like:
Test("FOO", "BAR") == Test("FOOB", "AR")
It sounds like you have the string "FOOBAR" (for example) and you want to retrieve a key k in your dict for which str(k) == "FOOBAR".
One way to do this is to just reconstruct a new Test object that will have the same string representation and use that for your lookup:
my_thing = my_dict[Test("FOO", "BAR")]
But this is inefficient and creates a dependency between object creation and your string representation.
If this string representation has its own intrinsic value as a key, you can just maintain your dict (or another dict) keyed by string instead:
my_index = dict()
my_index[str(my_thing)] = my_thing
That way, you can lookup your value given the string only:
print my_index["FOOBAR"].arr

How can I make a custom conversion from an object to a dict?

I have a Python class that stores some fields and has some properties, like
class A(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    @property
    def z(self):
        return self.x+1
What changes do I need to make to the class so that I can do
>>> a = A(1,5)
>>> dict(a)
{'y':5, 'z':2}
where I specify that I want to return y and z? I can't just use a.__dict__ because it would contain x but not z. I would like to be able to specify anything that can be accessed with __getattribute__.
Add an __iter__() method to your class that returns an iterator of the object's items as key-value pairs. Then you can pass your object instance directly to the dict() constructor, as it accepts a sequence of key-value pairs.
def __iter__(self):
    for key in "y", "z":
        yield key, getattr(self, key)
If you want this to be a little more flexible, and allow the list of attributes to be easily overridden on subclasses (or programmatically) you could store the key list as an attribute of your class:
_dictkeys = "y", "z"
def __iter__(self):
    for key in self._dictkeys:
        yield key, getattr(self, key)
If you want the dictionary to contain all attributes (including those inherited from parent classes) try:
def __iter__(self):
    for key in dir(self):
        if not key.startswith("_"):
            value = getattr(self, key)
            if not callable(value):
                yield key, value
This excludes members that start with "_" and also callable objects (e.g. classes and functions).
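Put together with the class from the question, the __iter__ approach might look like the following Python 3 sketch. It uses the _dictkeys variant described above; nothing else is assumed beyond the question's own class A.

```python
class A:
    _dictkeys = ("y", "z")  # attributes exposed when converting to a dict

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def z(self):
        return self.x + 1

    def __iter__(self):
        # Yield (name, value) pairs; dict() accepts any iterable of pairs.
        for key in self._dictkeys:
            yield key, getattr(self, key)


a = A(1, 5)
print(dict(a))  # {'y': 5, 'z': 2}
```

One trade-off of this design is that iterating over an instance now yields pairs everywhere, for example in list(a) or a for loop, which may surprise readers who expect iteration to mean something else for the class.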
I think a reasonable approach to this problem would be to create an asdict method. When you say you want to specify which keys you want the dict to contain, I assume you're happy with that information being passed when the method is called. If that's not what you mean, let me know. (This incorporates kindall's excellent suggestion.)
class A(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    @property
    def z(self):
        return self.x+1
    def asdict(self, *keys):
        if not keys:
            keys = ['y', 'z']
        return dict((key, getattr(self, key)) for key in keys)
Tested:
>>> A(1, 2).asdict('x', 'y')
{'y': 2, 'x': 1}
>>> A(1, 2).asdict()
{'y': 2, 'z': 2}
A naive way to solve this would be the following. It might not be enough for your needs but I am just pointing it out in case you've missed it.
>>> dict(y=a.y, z=a.z)
{'y': 5, 'z': 2}

Immutable dictionary, only use as a key for another dictionary

I had the need to implement a hashable dict so I could use a dictionary as a key for another dictionary.
A few months ago I used this implementation: Python hashable dicts
However I got a notice from a colleague saying 'it is not really immutable, thus it is not safe. You can use it, but it does make me feel like a sad Panda'.
So I started looking around to create one that is immutable. I have no need to compare the 'key-dict' to another 'key-dict'. Its only use is as a key for another dictionary.
I have come up with the following:
class HashableDict(dict):
    """Hashable dict that can be used as a key in other dictionaries"""
    def __new__(self, *args, **kwargs):
        # create a new local dict, that will be used by the HashableDictBase closure class
        immutableDict = dict(*args, **kwargs)
        class HashableDictBase(object):
            """Hashable dict that can be used as a key in other dictionaries. This is now immutable"""
            def __key(self):
                """Return a tuple of the current keys"""
                return tuple((k, immutableDict[k]) for k in sorted(immutableDict))
            def __hash__(self):
                """Return a hash of __key"""
                return hash(self.__key())
            def __eq__(self, other):
                """Compare two __keys"""
                return self.__key() == other.__key() # pylint: disable-msg=W0212
            def __repr__(self):
                """@see: dict.__repr__"""
                return immutableDict.__repr__()
            def __str__(self):
                """@see: dict.__str__"""
                return immutableDict.__str__()
            def __setattr__(self, *args):
                raise TypeError("can't modify immutable instance")
            __delattr__ = __setattr__
        return HashableDictBase()
I used the following to test the functionality:
d = {"a" : 1}
a = HashableDict(d)
b = HashableDict({"b" : 2})
print a
d["b"] = 2
print a
c = HashableDict({"a" : 1})
test = {a : "value with a dict as key (key a)",
b : "value with a dict as key (key b)"}
print test[a]
print test[b]
print test[c]
which gives:
{'a': 1}
{'a': 1}
value with a dict as key (key a)
value with a dict as key (key b)
value with a dict as key (key a)
as output
Is this the 'best possible' immutable dictionary that I can use that satisfies my requirements? If not, what would be a better solution?
If you are only using it as a key for another dict, you could go for frozenset(mutabledict.items()). If you need to access the underlying mappings, you could then use that as the parameter to dict.
mutabledict = dict(zip('abc', range(3)))
immutable = frozenset(mutabledict.items())
read_frozen = dict(immutable)
read_frozen['a'] # => 1
Note that you could also combine this with a class derived from dict, and use the frozenset as the source of the hash, while disabling __setitem__, as suggested in another answer. (See @RaymondHettinger's answer for code which does just that.)
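As a rough illustration of the frozenset approach in Python 3, the sketch below uses a hypothetical helper named freeze to turn a flat dict into a key. It assumes every value in the dict is itself hashable; nested lists or dicts would raise TypeError.

```python
def freeze(mapping):
    """Return a hashable, order-independent snapshot of a flat dict."""
    return frozenset(mapping.items())


lookup = {freeze({"a": 1, "b": 2}): "first"}

# Key order in the source dict does not matter for a frozenset.
print(lookup[freeze({"b": 2, "a": 1})])  # first

# The snapshot can be turned back into a regular dict when needed.
print(dict(freeze({"a": 1}))["a"])  # 1
```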
The Mapping abstract base class makes this easy to implement:
import collections
# On Python 3.3+, use collections.abc.Mapping instead
# (the collections.Mapping alias was removed in Python 3.10).
class ImmutableDict(collections.Mapping):
    def __init__(self, somedict):
        self._dict = dict(somedict) # make a copy
        self._hash = None
    def __getitem__(self, key):
        return self._dict[key]
    def __len__(self):
        return len(self._dict)
    def __iter__(self):
        return iter(self._dict)
    def __hash__(self):
        if self._hash is None:
            self._hash = hash(frozenset(self._dict.items()))
        return self._hash
    def __eq__(self, other):
        return self._dict == other._dict
I realize this has already been answered, but types.MappingProxyType is an analogous implementation for Python 3.3. Regarding the original question of safety, there is a discussion in PEP 416 -- Add a frozendict builtin type on why the idea of a frozendict was rejected.
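A short Python 3 sketch of what types.MappingProxyType provides: a read-only view over an existing dict. The variable names backing and view are illustrative only. Note that the proxy is a view and not a copy, so changes made through the original dict remain visible.

```python
from types import MappingProxyType

backing = {"a": 1}
view = MappingProxyType(backing)

print(view["a"])  # 1

try:
    view["a"] = 2  # the proxy does not support item assignment
except TypeError as exc:
    print("read-only:", exc)

# Whoever still holds the backing dict can change what the view shows.
backing["b"] = 2
print(dict(view))  # {'a': 1, 'b': 2}
```

For the proxy to behave as an immutable mapping in practice, the backing dict has to be kept private, for example by creating it from a copy that nothing else references.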
In order for your immutable dictionary to be safe, all it needs to do is never change its hash. Why don't you just disable __setitem__ as follows:
class ImmutableDict(dict):
    def __setitem__(self, key, value):
        raise Exception("Can't touch this")
    def __hash__(self):
        return hash(tuple(sorted(self.items())))
a = ImmutableDict({'a':1})
b = {a:1}
print b
print b[a]
a['a'] = 0
The output of the script is:
{{'a': 1}: 1}
1
Traceback (most recent call last):
File "ex.py", line 11, in <module>
a['a'] = 0
File "ex.py", line 3, in __setitem__
raise Exception("Can't touch this")
Exception: Can't touch this
Here is a link to a pip-installable implementation of @RaymondHettinger's answer: https://github.com/pcattori/icicle
Simply pip install icicle and you can from icicle import FrozenDict!
Update: icicle has been deprecated in favor of maps: https://github.com/pcattori/maps (documentation, PyPI).
It appears I am late to post. Not sure if anyone else has come up with these ideas, but here is my take on it. The dict is immutable and hashable. I made it immutable by overriding all the mutating methods, magic and otherwise, with a custom '_readonly' function that raises an exception. To get around the problem of not being able to set attributes, I compute the hash in '__new__'. I then override the '__hash__' function. That's it!
class ImmutableDict(dict):
    def __new__(cls, *args, **kwargs):
        self = super(ImmutableDict, cls).__new__(cls)
        # Store the hash per instance; a class-level _HASH would be shared,
        # so every instance would report the hash of the last one created.
        # object.__setattr__ bypasses the read-only __setattr__ below.
        object.__setattr__(self, '_HASH', hash(frozenset(dict(*args, **kwargs).items())))
        return self
    def __hash__(self):
        return self._HASH
    def _readonly(self, *args, **kwards):
        raise TypeError("Cannot modify Immutable Instance")
    __delattr__ = __setattr__ = __setitem__ = pop = update = setdefault = clear = popitem = _readonly
Test:
immutabled1 = ImmutableDict({"This": "That", "Cheese": "Blarg"})
dict1 = {immutabled1: "Yay"}
dict1[immutabled1]
"Yay"
dict1
{{'Cheese': 'Blarg', 'This': 'That'}: 'Yay'}
Variation of Raymond Hettinger's answer, wrapping self._dict with types.MappingProxyType:
import collections.abc
from types import MappingProxyType
class ImmutableDict(collections.abc.Mapping):
    """
    Copies a dict and proxies it via types.MappingProxyType to make it immutable.
    """
    def __init__(self, somedict):
        dictcopy = dict(somedict) # make a copy
        self._dict = MappingProxyType(dictcopy) # lock it
        self._hash = None
    def __getitem__(self, key):
        return self._dict[key]
    def __len__(self):
        return len(self._dict)
    def __iter__(self):
        return iter(self._dict)
    def __hash__(self):
        if self._hash is None:
            self._hash = hash(frozenset(self._dict.items()))
        return self._hash
    def __eq__(self, other):
        return self._dict == other._dict
    def __repr__(self):
        return str(self._dict)
You can use an enum:
import enum
KeyDict1 = enum.Enum('KeyDict1', {'InnerDictKey1':'bla', 'InnerDictKey2':2})
d = { KeyDict1: 'whatever', KeyDict2: 1, ...}
You can access the enums like you would a dictionary:
KeyDict1['InnerDictKey2'].value # This is 2
You can iterate over the names, and get their values... It does everything you'd expect.
You can try using https://github.com/Lightricks/freeze
It provides recursively immutable and hashable dictionaries
from freeze import FDict
a_mutable_dict = {
"list": [1, 2],
"set": {3, 4},
}
a_frozen_dict = FDict(a_mutable_dict)
print(a_frozen_dict)
print(hash(a_frozen_dict))
# FDict: {'list': FList: (1, 2), 'set': FSet: {3, 4}}
# -4855611361973338606

Python data structure for a collection of objects with random access based on an attribute

I need a collection of objects which can be looked up by a certain (unique) attribute common to each of the objects. Right now I am using a dictionary, with the attribute value as the dictionary key.
Here is an example of what I have now:
class Item():
    def __init__(self, uniq_key, title=None):
        self.key = uniq_key
        self.title = title
item_instance_1 = Item("unique_key1", title="foo")
item_instance_2 = Item("unique_key3", title="foo")
item_instance_3 = Item("unique_key2", title="foo")
item_collection = {
    item_instance_1.key: item_instance_1,
    item_instance_2.key: item_instance_2,
    item_instance_3.key: item_instance_3
}
item_instance_1.key = "new_key"
Now this seems a rather cumbersome solution, as the key is not a reference to the attribute but takes the value of the key-attribute on assignment, meaning that:
the keys of the dictionary duplicate information already present in form of the object attribute and
when the object attribute is changed the dictionary key is not updated.
Using a list and iterating through the object seems even more inefficient.
So, is there more fitting data structure than dict for this particular case, a collection of objects giving me random access based on a certain object attribute?
This would need to work with Python 2.4 as that's what I am stuck with (at work).
If it hasn't been obvious, I'm new to Python.
There is actually no duplication of information as you fear: the dict's key, and the object's .key attribute, are just two references to exactly the same object.
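This can be checked directly with the is operator. The sketch below is written for Python 3 (the question targets Python 2.4, where the same identity relationship holds but next() is not available) and uses a trimmed-down version of the question's Item class.

```python
class Item:
    def __init__(self, uniq_key):
        self.key = uniq_key


item = Item("unique_key1")
collection = {item.key: item}

# The key stored in the dict is the very same string object as item.key.
stored_key = next(iter(collection))
print(stored_key is item.key)  # True

# Rebinding the attribute does not touch the dict: the old key stays.
item.key = "new_key"
print("new_key" in collection)      # False
print("unique_key1" in collection)  # True
```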
The only real problem is "what if the .key gets reassigned". Well then, clearly you must use a property that updates all the relevant dicts as well as the instance's attribute; so each object must know all the dicts in which it may be enregistered. Ideally one would want to use weak references for the purpose, to avoid circular dependencies, but, alas, you can't take a weakref.ref (or proxy) to a dict. So, I'm using normal references here, instead (the alternative is not to use dict instances but e.g. some special subclass -- not handy).
def enregister(d, obj):
    obj.ds.append(d)
    d[obj.key] = obj
class Item(object):
    def __init__(self, uniq_key, title=None):
        self._key = uniq_key
        self.title = title
        self.ds = []
    def adjust_key(self, newkey):
        newds = [d for d in self.ds if self._key in d]
        for d in newds:
            del d[self._key]
            d[newkey] = self
        self.ds = newds
        self._key = newkey
    def get_key(self):
        return self._key
    key = property(get_key, adjust_key)
Edit: if you want a single collection with ALL the instances of Item, that's even easier, as you can make the collection a class-level attribute; indeed it can be a WeakValueDictionary to avoid erroneously keeping items alive, if that's what you need. I.e.:
import weakref
class Item(object):
    all = weakref.WeakValueDictionary()
    def __init__(self, uniq_key, title=None):
        self._key = uniq_key
        self.title = title
        # here, if needed, you could check that the key
        # is not ALREADY present in self.all
        self.all[self._key] = self
    def adjust_key(self, newkey):
        # "key non-uniqueness" could be checked here too
        del self.all[self._key]
        self.all[newkey] = self
        self._key = newkey
    def get_key(self):
        return self._key
    key = property(get_key, adjust_key)
Now you can use Item.all['akey'], Item.all.get('akey'), for akey in Item.all:, and so forth -- all the rich functionality of dicts.
There are a number of great things you can do here. One example would be to let the class keep track of everything:
class Item():
    _member_dict = {}
    @classmethod
    def get_by_key(cls,key):
        return cls._member_dict[key]
    def __init__(self, uniq_key, title=None):
        self.key = uniq_key
        self.__class__._member_dict[uniq_key] = self
        self.title = title
>>> i = Item('foo')
>>> i == Item.get_by_key('foo')
True
Note you will retain the update problem: if key changes, the _member_dict falls out of sync. This is where encapsulation will come in handy: make it (practically) impossible to change key without updating the dictionary. For a good tutorial on how to do that, see this tutorial.
Well, dict really is what you want. What may be cumbersome is not the dict itself, but the way you are building it. Here is a slight enhancement to your example, showing how to use a list expression and the dict constructor to easily create your lookup dict. This also shows how to create a multimap kind of dict, to look up matching items given a field value that might be duplicated across items:
class Item(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
    def __str__(self):
        return str(self.__dict__)
    def __repr__(self):
        return str(self)
allitems = [
    Item(key="red", title="foo"),
    Item(key="green", title="foo"),
    Item(key="blue", title="foofoo"),
]
# if fields are unique
itemByKey = dict([(i.key,i) for i in allitems])
# if field value can be duplicated across items
# (for Python 2.5 and higher, you could use a defaultdict from
# the collections module)
itemsByTitle = {}
for i in allitems:
    if i.title in itemsByTitle:
        itemsByTitle[i.title].append(i)
    else:
        itemsByTitle[i.title] = [i]
print itemByKey["red"]
print itemsByTitle["foo"]
Prints:
{'key': 'red', 'title': 'foo'}
[{'key': 'red', 'title': 'foo'}, {'key': 'green', 'title': 'foo'}]
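The defaultdict variant mentioned in the code comment above could look like this in Python 3. The Item class here is a simplified, hypothetical stand-in with explicit key and title parameters, and items_by_title is an illustrative name.

```python
from collections import defaultdict


class Item:
    def __init__(self, key, title):
        self.key = key
        self.title = title


allitems = [Item("red", "foo"), Item("green", "foo"), Item("blue", "foofoo")]

# defaultdict(list) creates the empty list on first access, so the
# explicit "if title in dict ... else ..." branch is no longer needed.
items_by_title = defaultdict(list)
for item in allitems:
    items_by_title[item.title].append(item)

print([i.key for i in items_by_title["foo"]])  # ['red', 'green']
```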
Editing to correct the problem I had - which was due to my "collection = dict()" default parameter (*bonk*). Now, each call to the function will return a class with its own collection as intended - this for convenience in case more than one such collection should be needed. Also am putting the collection in the class and just returning the class instead of the two separately in a tuple as before. (Leaving the default container here as dict(), but that could be changed to Alex's WeakValueDictionary, which is of course very cool.)
def make_item_collection(container = None):
    ''' Create a class designed to be collected in a specific collection. '''
    container = dict() if container is None else container
    class CollectedItem(object):
        collection = container
        def __init__(self, key, title=None):
            self.key = key
            CollectedItem.collection[key] = self
            self.title = title
        def update_key(self, new_key):
            CollectedItem.collection[
                new_key] = CollectedItem.collection.pop(self.key)
            self.key = new_key
    return CollectedItem
# Usage Demo...
Item = make_item_collection()
my_collection = Item.collection
item_instance_1 = Item("unique_key1", title="foo1")
item_instance_2 = Item("unique_key2", title="foo2")
item_instance_3 = Item("unique_key3", title="foo3")
for k,v in my_collection.iteritems():
    print k, v.title
item_instance_1.update_key("new_unique_key")
print '****'
for k,v in my_collection.iteritems():
    print k, v.title
And here's the output in Python 2.5.2:
unique_key1 foo1
unique_key2 foo2
unique_key3 foo3
****
new_unique_key foo1
unique_key2 foo2
unique_key3 foo3
