Symmetric dictionary where d[a][b] == d[b][a] - python

I have an algorithm in python which creates measures for pairs of values, where m(v1, v2) == m(v2, v1) (i.e. it is symmetric). I had the idea to write a dictionary of dictionaries where these values are stored in a memory-efficient way, so that they can easily be retrieved with keys in any order. I like to inherit from things, and ideally, I'd love to write a symmetric_dict where s_d[v1][v2] always equals s_d[v2][v1], probably by checking which of the v's is larger according to some kind of ordering relation and then switching them around so that the smaller element is always mentioned first. i.e. when calling s_d[5][2] = 4, the dict of dicts will turn them around so that they are in fact stored as s_d[2][5] = 4, and the same for retrieval of the data.
I'm also very open to a better data structure, but I'd prefer an implementation with an "is-a" relationship over something which just uses a dict and preprocesses some function arguments.

You could use a frozenset as the key for your dict:
>>> s_d = {}
>>> s_d[frozenset([5,2])] = 4
>>> s_d[frozenset([2,5])]
4
It would be fairly straightforward to write a subclass of dict that took iterables as key arguments and then turned them into a frozenset when storing values:
class SymDict(dict):
    def __getitem__(self, key):
        return dict.__getitem__(self, frozenset(key))
    def __setitem__(self, key, value):
        dict.__setitem__(self, frozenset(key), value)
Which gives you:
>>> s_d = SymDict()
>>> s_d[5,2] = 4
>>> s_d[2,5]
4

Doing it with nested indexing as shown will be extremely difficult. It's better to use a tuple as the key instead. That way the tuple can be sorted and an encapsulated dict can be accessed for the value.
d[2, 5] = 4
print d[5, 2]
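For concreteness, here is a minimal sketch of that sorted-tuple approach (the class name is just for illustration); it assumes the two key parts are comparable with each other:
class SymDict(dict):
    def __setitem__(self, key, value):
        # sort the pair so both orderings map to the same stored key
        dict.__setitem__(self, tuple(sorted(key)), value)
    def __getitem__(self, key):
        return dict.__getitem__(self, tuple(sorted(key)))

d = SymDict()
d[2, 5] = 4
print(d[5, 2])  # 4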

Here's a slightly different approach that looks promising. Although the SymDict class isn't a dict subclass, it mostly behaves like one, and there's only a single private dictionary involved. I think one interesting feature is the fact that it preserves the natural [][] lookup syntax you seemed to want.
class SymDict(object):
    def __init__(self, *args, **kwrds):
        self._mapping = _SubSymDict(*args, **kwrds)
    def __getitem__(self, key1):
        self._mapping.set_key1(key1)
        return self._mapping
    def __setitem__(self, key1, value):
        raise NotImplementedError
    def __str__(self):
        return '_mapping: ' + self._mapping.__str__()
    def __getattr__(self, name):
        return getattr(self._mapping, name)

class _SubSymDict(dict):
    def __init__(self, *args, **kwrds):
        dict.__init__(self, *args, **kwrds)
    def set_key1(self, key1):
        self.key1 = key1
    def __getitem__(self, key2):
        return dict.__getitem__(self, frozenset((self.key1, key2)))
    def __setitem__(self, key2, value):
        dict.__setitem__(self, frozenset((self.key1, key2)), value)
symdict = SymDict()
symdict[2][4] = 24
symdict[4][2] = 42
print 'symdict[2][4]:', symdict[2][4]
# symdict[2][4]: 42
print 'symdict[4][2]:', symdict[4][2]
# symdict[4][2]: 42
print 'symdict:', symdict
# symdict: _mapping: {frozenset([2, 4]): 42}
print symdict.keys()
# [frozenset([2, 4])]

Just as an alternative to Dave Webb's frozenset, why not do a SymDict like the following:
class SymDict(dict):
    def __getitem__(self, key):
        return dict.__getitem__(self, key if key[0] < key[1] else (key[1], key[0]))
    def __setitem__(self, key, value):
        dict.__setitem__(self, key if key[0] < key[1] else (key[1], key[0]), value)
From a quick test, this is more than 10% faster for getting and setting items than using a frozenset. Anyway, just another idea. However, it is less adaptable than the frozenset as it is really only set up to be used with tuples of length 2. As far as I can tell from the OP, that doesn't seem to be an issue here.
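If you want to reproduce that kind of comparison yourself, a rough timeit sketch along these lines should do; the class names here are illustrative, and the exact numbers will vary by machine and Python version:
from timeit import timeit

setup = """
class FrozenSymDict(dict):
    def __getitem__(self, key):
        return dict.__getitem__(self, frozenset(key))
    def __setitem__(self, key, value):
        dict.__setitem__(self, frozenset(key), value)

class TupleSymDict(dict):
    def __getitem__(self, key):
        return dict.__getitem__(self, key if key[0] < key[1] else (key[1], key[0]))
    def __setitem__(self, key, value):
        dict.__setitem__(self, key if key[0] < key[1] else (key[1], key[0]), value)

f = FrozenSymDict()
t = TupleSymDict()
"""

# one set plus one get per loop iteration for each implementation
print(timeit("f[5, 2] = 4; f[2, 5]", setup=setup, number=100000))
print(timeit("t[5, 2] = 4; t[2, 5]", setup=setup, number=100000))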

Improving on Justin Peel's solution, you need to add __delitem__ and __contains__ methods for a few more dictionary operations to work. So, for completeness,
class SymDict(dict):
    def __getitem__(self, key):
        return dict.__getitem__(self, key if key[0] < key[1] else (key[1], key[0]))
    def __setitem__(self, key, value):
        dict.__setitem__(self, key if key[0] < key[1] else (key[1], key[0]), value)
    def __delitem__(self, key):
        return dict.__delitem__(self, key if key[0] < key[1] else (key[1], key[0]))
    def __contains__(self, key):
        return dict.__contains__(self, key if key[0] < key[1] else (key[1], key[0]))
So then
>>> s_d = SymDict()
>>> s_d[2,5] = 4
>>> s_d[5,2]
4
>>> (5,2) in s_d
True
>>> del s_d[5,2]
>>> s_d
{}
I'm not sure, though, whether that covers all the bases, but it was good enough for my own code.

An obvious alternative is to use a (v1,v2) tuple as the key into a single standard dict, and insert both (v1,v2) and (v2,v1) into the dictionary, making them refer to the same object on the right-hand side.
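A minimal sketch of that double-insertion idea (the helper name is just for illustration):
measures = {}

def set_measure(d, v1, v2, value):
    # store the value under both orderings so lookups work either way
    d[v1, v2] = value
    d[v2, v1] = value

set_measure(measures, 5, 2, 4)
print(measures[2, 5])  # 4
print(measures[5, 2])  # 4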

I'd extract the key normalization into a function for readability (building on patvarilly's answer):
class SymDict(dict):
    def __getitem__(self, key):
        return dict.__getitem__(self, self.symm(key))
    def __setitem__(self, key, value):
        dict.__setitem__(self, self.symm(key), value)
    def __delitem__(self, key):
        return dict.__delitem__(self, self.symm(key))
    def __contains__(self, key):
        return dict.__contains__(self, self.symm(key))
    @staticmethod
    def symm(key):
        return key if key[0] < key[1] else (key[1], key[0])


Why does a Python dict have attribute access but the dot operator does not work? [duplicate]

How do I make Python dictionary members accessible via a dot "."?
For example, instead of writing mydict['val'], I'd like to write mydict.val.
Also I'd like to access nested dicts this way. For example
mydict.mydict2.val
would refer to
mydict = { 'mydict2': { 'val': ... } }
I've always kept this around in a util file. You can use it as a mixin on your own classes too.
class dotdict(dict):
    """dot.notation access to dictionary attributes"""
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
mydict = {'val':'it works'}
nested_dict = {'val':'nested works too'}
mydict = dotdict(mydict)
mydict.val
# 'it works'
mydict.nested = dotdict(nested_dict)
mydict.nested.val
# 'nested works too'
You can do it using this class I just made. With this class you can use the Map object like another dictionary (including JSON serialization) or with dot notation. I hope it helps:
class Map(dict):
    """
    Example:
    m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
    """
    def __init__(self, *args, **kwargs):
        super(Map, self).__init__(*args, **kwargs)
        for arg in args:
            if isinstance(arg, dict):
                for k, v in arg.iteritems():
                    self[k] = v
        if kwargs:
            for k, v in kwargs.iteritems():
                self[k] = v
    def __getattr__(self, attr):
        return self.get(attr)
    def __setattr__(self, key, value):
        self.__setitem__(key, value)
    def __setitem__(self, key, value):
        super(Map, self).__setitem__(key, value)
        self.__dict__.update({key: value})
    def __delattr__(self, item):
        self.__delitem__(item)
    def __delitem__(self, key):
        super(Map, self).__delitem__(key)
        del self.__dict__[key]
Usage examples:
m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
# Add new key
m.new_key = 'Hello world!'
# Or
m['new_key'] = 'Hello world!'
print m.new_key
print m['new_key']
# Update values
m.new_key = 'Yay!'
# Or
m['new_key'] = 'Yay!'
# Delete key
del m.new_key
# Or
del m['new_key']
Install dotmap via pip
pip install dotmap
It does everything you want it to do and subclasses dict, so it operates like a normal dictionary:
from dotmap import DotMap
m = DotMap()
m.hello = 'world'
m.hello
m.hello += '!'
# m.hello and m['hello'] now both return 'world!'
m.val = 5
m.val2 = 'Sam'
On top of that, you can convert it to and from dict objects:
d = m.toDict()
m = DotMap(d) # automatic conversion in constructor
This means that if something you want to access is already in dict form, you can turn it into a DotMap for easy access:
import json
jsonDict = json.loads(text)
data = DotMap(jsonDict)
print data.location.city
Finally, it automatically creates new child DotMap instances so you can do things like this:
m = DotMap()
m.people.steve.age = 31
Comparison to Bunch
Full disclosure: I am the creator of the DotMap. I created it because Bunch was missing these features:
remembering the order items are added and iterating in that order
automatic child DotMap creation, which saves time and makes for cleaner code when you have a lot of hierarchy
constructing from a dict and recursively converting all child dict instances to DotMap
Derive from dict and implement __getattr__ and __setattr__.
Or you can use Bunch which is very similar.
I don't think it's possible to monkeypatch the built-in dict class.
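A minimal sketch of that derive-from-dict suggestion (the class name AttrDict is just for illustration):
class AttrDict(dict):
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)
    def __setattr__(self, name, value):
        self[name] = value

d = AttrDict({'val': 'it works'})
print(d.val)       # 'it works'
d.other = 2
print(d['other'])  # 2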
Use SimpleNamespace:
>>> from types import SimpleNamespace
>>> d = dict(x=[1, 2], y=['a', 'b'])
>>> ns = SimpleNamespace(**d)
>>> ns.x
[1, 2]
>>> ns
namespace(x=[1, 2], y=['a', 'b'])
Fabric has a really nice, minimal implementation. Extending that to allow for nested access, we can use a defaultdict, and the result looks something like this:
from collections import defaultdict
class AttributeDict(defaultdict):
    def __init__(self):
        super(AttributeDict, self).__init__(AttributeDict)
    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)
    def __setattr__(self, key, value):
        self[key] = value
Make use of it as follows:
keys = AttributeDict()
keys.abc.xyz.x = 123
keys.abc.xyz.a.b.c = 234
That elaborates a bit on Kugel's answer of "Derive from dict and implement __getattr__ and __setattr__". Now you know how!
I tried this:
class dotdict(dict):
    def __getattr__(self, name):
        return self[name]
You can try __getattribute__ too.
Making every dict a dotdict would be good enough; if you want to initialize this from a multi-layer dict, try implementing __init__ too (a sketch follows).
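A sketch of what that __init__ could look like; it only converts values that are plain dicts and only covers read access:
class dotdict(dict):
    def __init__(self, data=()):
        super(dotdict, self).__init__(data)
        # recursively convert nested plain dicts so chained dot access works
        for key, value in self.items():
            if isinstance(value, dict):
                self[key] = dotdict(value)
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

d = dotdict({'mydict2': {'val': 3}})
print(d.mydict2.val)  # 3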
I recently came across the 'Box' library which does the same thing.
Installation command : pip install python-box
Example:
from box import Box
mydict = {"key1":{"v1":0.375,
"v2":0.625},
"key2":0.125,
}
mydict = Box(mydict)
print(mydict.key1.v1)
I found it to be more effective than other existing libraries like dotmap, which generates a Python recursion error when you have large nested dicts.
link to library and details: https://pypi.org/project/python-box/
If you want to pickle your modified dictionary, you need to add a few state methods to the above answers:
class DotDict(dict):
    """dot.notation access to dictionary attributes"""
    def __getattr__(self, attr):
        return self.get(attr)
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
    def __getstate__(self):
        return self
    def __setstate__(self, state):
        self.update(state)
        self.__dict__ = self
You can achieve this using SimpleNamespace
from types import SimpleNamespace
# Assign values
args = SimpleNamespace()
args.username = 'admin'
# Retrieve values
print(args.username) # output: admin
Don't. Attribute access and indexing are separate things in Python, and you shouldn't want them to behave the same. Make a class (possibly one made by namedtuple) if you have something that should have accessible attributes, and use [] notation to get an item from a dict.
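A small sketch of the namedtuple suggestion (the Point name and fields are just for illustration):
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)
print(p.x, p.y)     # 1 2
print(p._asdict())  # back to a dict-like mapping when you need one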
To build upon epool's answer, this version allows you to access any dict inside via the dot operator:
foo = {
    "bar": {
        "baz": [{"boo": "hoo"}, {"baba": "loo"}]
    }
}
For instance, foo.bar.baz[1].baba returns "loo".
class Map(dict):
    def __init__(self, *args, **kwargs):
        super(Map, self).__init__(*args, **kwargs)
        for arg in args:
            if isinstance(arg, dict):
                for k, v in arg.items():
                    if isinstance(v, dict):
                        v = Map(v)
                    if isinstance(v, list):
                        self.__convert(v)
                    self[k] = v
        if kwargs:
            for k, v in kwargs.items():
                if isinstance(v, dict):
                    v = Map(v)
                elif isinstance(v, list):
                    self.__convert(v)
                self[k] = v
    def __convert(self, v):
        for elem in range(0, len(v)):
            if isinstance(v[elem], dict):
                v[elem] = Map(v[elem])
            elif isinstance(v[elem], list):
                self.__convert(v[elem])
    def __getattr__(self, attr):
        return self.get(attr)
    def __setattr__(self, key, value):
        self.__setitem__(key, value)
    def __setitem__(self, key, value):
        super(Map, self).__setitem__(key, value)
        self.__dict__.update({key: value})
    def __delattr__(self, item):
        self.__delitem__(item)
    def __delitem__(self, key):
        super(Map, self).__delitem__(key)
        del self.__dict__[key]
Building on Kugel's answer and taking Mike Graham's words of caution into consideration, what if we make a wrapper?
class DictWrap(object):
    """ Wrap an existing dict, or create a new one, and access with either dot
    notation or key lookup.

    The attribute _data is reserved and stores the underlying dictionary.
    When using the += operator with create=True, the empty nested dict is
    replaced with the operand, effectively creating a default dictionary
    of mixed types.

    args:
      d({}): Existing dict to wrap, an empty dict is created by default
      create(True): Create an empty, nested dict instead of raising a KeyError

    example:
      >>>dw = DictWrap({'pp':3})
      >>>dw.a.b += 2
      >>>dw.a.b += 2
      >>>dw.a['c'] += 'Hello'
      >>>dw.a['c'] += ' World'
      >>>dw.a.d
      >>>print dw._data
      {'a': {'c': 'Hello World', 'b': 4, 'd': {}}, 'pp': 3}
    """
    def __init__(self, d=None, create=True):
        if d is None:
            d = {}
        supr = super(DictWrap, self)
        supr.__setattr__('_data', d)
        supr.__setattr__('__create', create)
    def __getattr__(self, name):
        try:
            value = self._data[name]
        except KeyError:
            if not super(DictWrap, self).__getattribute__('__create'):
                raise
            value = {}
            self._data[name] = value
        if hasattr(value, 'items'):
            create = super(DictWrap, self).__getattribute__('__create')
            return DictWrap(value, create)
        return value
    def __setattr__(self, name, value):
        self._data[name] = value
    def __getitem__(self, key):
        try:
            value = self._data[key]
        except KeyError:
            if not super(DictWrap, self).__getattribute__('__create'):
                raise
            value = {}
            self._data[key] = value
        if hasattr(value, 'items'):
            create = super(DictWrap, self).__getattribute__('__create')
            return DictWrap(value, create)
        return value
    def __setitem__(self, key, value):
        self._data[key] = value
    def __iadd__(self, other):
        if self._data:
            raise TypeError("A Nested dict will only be replaced if it's empty")
        else:
            return other
Use __getattr__. Very simple, and it works in Python 3.4.3:
class myDict(dict):
    def __getattr__(self, val):
        return self[val]
blockBody=myDict()
blockBody['item1']=10000
blockBody['item2']="StackOverflow"
print(blockBody.item1)
print(blockBody.item2)
Output:
10000
StackOverflow
I like Munch, and it gives a lot of handy options on top of dot access.
import munch
temp_1 = {'person': { 'fname': 'senthil', 'lname': 'ramalingam'}}
dict_munch = munch.munchify(temp_1)
dict_munch.person.fname
The language itself doesn't support this, but sometimes this is still a useful requirement. Besides the Bunch recipe, you can also write a little method which can access a dictionary using a dotted string:
def get_var(input_dict, accessor_string):
    """Gets data from a dictionary using a dotted accessor-string"""
    current_data = input_dict
    for chunk in accessor_string.split('.'):
        current_data = current_data.get(chunk, {})
    return current_data
which would support something like this:
>> test_dict = {'thing': {'spam': 12, 'foo': {'cheeze': 'bar'}}}
>> output = get_var(test_dict, 'thing.foo.cheeze')
>> print output
'bar'
>>
I ended up trying BOTH the AttrDict and the Bunch libraries and found them to be way too slow for my uses. After a friend and I looked into it, we found that the main method for writing these libraries results in the library aggressively recursing through a nested object and making copies of the dictionary object throughout. With this in mind, we made two key changes: 1) we made attributes lazy-loaded, and 2) instead of creating copies of a dictionary object, we create copies of a lightweight proxy object. This is the final implementation. The performance increase from using this code is incredible. When using AttrDict or Bunch, these two libraries alone consumed 1/2 and 1/3 respectively of my request time (what!?). This code reduced that time to almost nothing (somewhere in the range of 0.5 ms). This of course depends on your needs, but if you are using this functionality quite a bit in your code, definitely go with something simple like this.
class DictProxy(object):
    def __init__(self, obj):
        self.obj = obj
    def __getitem__(self, key):
        return wrap(self.obj[key])
    def __getattr__(self, key):
        try:
            return wrap(getattr(self.obj, key))
        except AttributeError:
            try:
                return self[key]
            except KeyError:
                raise AttributeError(key)
    # you probably also want to proxy important dict properties along like
    # items(), iteritems() and __len__

class ListProxy(object):
    def __init__(self, obj):
        self.obj = obj
    def __getitem__(self, key):
        return wrap(self.obj[key])
    # you probably also want to proxy important list properties along like
    # __iter__ and __len__

def wrap(value):
    if isinstance(value, dict):
        return DictProxy(value)
    if isinstance(value, (tuple, list)):
        return ListProxy(value)
    return value
See the original implementation here by https://stackoverflow.com/users/704327/michael-merickel.
The other thing to note, is that this implementation is pretty simple and doesn't implement all of the methods you might need. You'll need to write those as required on the DictProxy or ListProxy objects.
def dict_to_object(dct):
    # http://stackoverflow.com/a/1305663/968442
    class Struct:
        def __init__(self, **entries):
            self.__dict__.update(entries)
    return Struct(**dct)
If you decide to permanently convert that dict to an object, this should do. You can also create a throwaway object just before accessing it.
d = dict_to_object(d)
This solution is a refinement upon the one offered by epool to address the requirement of the OP to access nested dicts in a consistent manner. The solution by epool did not allow for accessing nested dicts.
class YAMLobj(dict):
    def __init__(self, args):
        super(YAMLobj, self).__init__(args)
        if isinstance(args, dict):
            for k, v in args.iteritems():
                if not isinstance(v, dict):
                    self[k] = v
                else:
                    self.__setattr__(k, YAMLobj(v))
    def __getattr__(self, attr):
        return self.get(attr)
    def __setattr__(self, key, value):
        self.__setitem__(key, value)
    def __setitem__(self, key, value):
        super(YAMLobj, self).__setitem__(key, value)
        self.__dict__.update({key: value})
    def __delattr__(self, item):
        self.__delitem__(item)
    def __delitem__(self, key):
        super(YAMLobj, self).__delitem__(key)
        del self.__dict__[key]
With this class, one can now do something like: A.B.C.D.
For infinite levels of nesting of dicts, lists, lists of dicts, and dicts of lists.
It also supports pickling.
This is an extension of this answer.
from collections.abc import Iterable
from typing import Any, List

class DotDict(dict):
    # https://stackoverflow.com/a/70665030/913098
    """
    Example:
    m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])

    Iterables are assumed to have a constructor taking a list as input.
    """
    def __init__(self, *args, **kwargs):
        super(DotDict, self).__init__(*args, **kwargs)
        args_with_kwargs = []
        for arg in args:
            args_with_kwargs.append(arg)
        args_with_kwargs.append(kwargs)
        args = args_with_kwargs
        for arg in args:
            if isinstance(arg, dict):
                for k, v in arg.items():
                    self[k] = v
                    if isinstance(v, dict):
                        self[k] = DotDict(v)
                    elif isinstance(v, str) or isinstance(v, bytes):
                        self[k] = v
                    elif isinstance(v, Iterable):
                        klass = type(v)
                        map_value: List[Any] = []
                        for e in v:
                            map_e = DotDict(e) if isinstance(e, dict) else e
                            map_value.append(map_e)
                        self[k] = klass(map_value)
    def __getattr__(self, attr):
        return self.get(attr)
    def __setattr__(self, key, value):
        self.__setitem__(key, value)
    def __setitem__(self, key, value):
        super(DotDict, self).__setitem__(key, value)
        self.__dict__.update({key: value})
    def __delattr__(self, item):
        self.__delitem__(item)
    def __delitem__(self, key):
        super(DotDict, self).__delitem__(key)
        del self.__dict__[key]
    def __getstate__(self):
        return self.__dict__
    def __setstate__(self, d):
        self.__dict__.update(d)
if __name__ == "__main__":
import pickle
def test_map():
d = {
"a": 1,
"b": {
"c": "d",
"e": 2,
"f": None
},
"g": [],
"h": [1, "i"],
"j": [1, "k", {}],
"l":
[
1,
"m",
{
"n": [3],
"o": "p",
"q": {
"r": "s",
"t": ["u", 5, {"v": "w"}, ],
"x": ("z", 1)
}
}
],
}
map_d = DotDict(d)
w = map_d.l[2].q.t[2].v
assert w == "w"
pickled = pickle.dumps(map_d)
unpickled = pickle.loads(pickled)
assert unpickled == map_d
kwargs_check = DotDict(a=1, b=[dict(c=2, d="3"), 5])
assert kwargs_check.b[0].d == "3"
kwargs_and_args_check = DotDict(d, a=1, b=[dict(c=2, d="3"), 5])
assert kwargs_and_args_check.l[2].q.t[2].v == "w"
assert kwargs_and_args_check.b[0].d == "3"
test_map()
I dislike adding another log to a (more than) 10-year old fire, but I'd also check out the dotwiz library, which I've recently released - just this year actually.
It's a relatively tiny library, which also performs really well for get (access) and set (create) times in benchmarks, at least as compared to other alternatives.
Install dotwiz via pip
pip install dotwiz
It does everything you want it to do and subclasses dict, so it operates like a normal dictionary:
from dotwiz import DotWiz
dw = DotWiz()
dw.hello = 'world'
dw.hello
dw.hello += '!'
# dw.hello and dw['hello'] now both return 'world!'
dw.val = 5
dw.val2 = 'Sam'
On top of that, you can convert it to and from dict objects:
d = dw.to_dict()
dw = DotWiz(d) # automatic conversion in constructor
This means that if something you want to access is already in dict form, you can turn it into a DotWiz for easy access:
import json
json_dict = json.loads(text)
data = DotWiz(json_dict)
print(data.location.city)
Finally, something exciting I am working on is an existing feature request so that it automatically creates new child DotWiz instances so you can do things like this:
dw = DotWiz()
dw['people.steve.age'] = 31
dw
# ✫(people=✫(steve=✫(age=31)))
Comparison with dotmap
I've added a quick and dirty performance comparison with dotmap below.
First, install both libraries with pip:
pip install dotwiz dotmap
I came up with the following code for benchmark purposes:
from timeit import timeit
from dotwiz import DotWiz
from dotmap import DotMap
d = {'hey': {'so': [{'this': {'is': {'pretty': {'cool': True}}}}]}}
dw = DotWiz(d)
# ✫(hey=✫(so=[✫(this=✫(is=✫(pretty={'cool'})))]))
dm = DotMap(d)
# DotMap(hey=DotMap(so=[DotMap(this=DotMap(is=DotMap(pretty={'cool'})))]))
assert dw.hey.so[0].this['is'].pretty.cool == dm.hey.so[0].this['is'].pretty.cool
n = 100_000
print('dotwiz (create): ', round(timeit('DotWiz(d)', number=n, globals=globals()), 3))
print('dotmap (create): ', round(timeit('DotMap(d)', number=n, globals=globals()), 3))
print('dotwiz (get): ', round(timeit("dw.hey.so[0].this['is'].pretty.cool", number=n, globals=globals()), 3))
print('dotmap (get): ', round(timeit("dm.hey.so[0].this['is'].pretty.cool", number=n, globals=globals()), 3))
Results, on my M1 Mac, running Python 3.10:
dotwiz (create): 0.189
dotmap (create): 1.085
dotwiz (get): 0.014
dotmap (get): 0.335
This also works with nested dicts and makes sure that dicts which are appended later behave the same:
class DotDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Recursively turn nested dicts into DotDicts
        for key, value in self.items():
            if type(value) is dict:
                self[key] = DotDict(value)
    def __setitem__(self, key, item):
        if type(item) is dict:
            item = DotDict(item)
        super().__setitem__(key, item)
    __setattr__ = __setitem__
    __getattr__ = dict.__getitem__
Using namedtuple allows dot access.
It is like a lightweight object which also has the properties of a tuple.
It allows you to define properties and access them using the dot operator.
from collections import namedtuple
Data = namedtuple('Data', ['key1', 'key2'])
dataObj = Data(val1, key2=val2) # can instantiate using keyword arguments and positional arguments
Access using dot operator
dataObj.key1 # Gives val1
dataObj.key2 # Gives val2
Access using tuple indices
dataObj[0] # Gives val1
dataObj[1] # Gives val2
But remember this is a tuple; not a dict. So the below code will give error
dataObj['key1'] # Gives TypeError: tuple indices must be integers or slices, not str
Refer: namedtuple
It is an old question, but I recently found that sklearn ships an implementation of a dict with attribute access, namely Bunch:
https://scikit-learn.org/stable/modules/generated/sklearn.utils.Bunch.html#sklearn.utils.Bunch
Simplest solution: define a class with only a pass statement in it, create an object of this class, and use dot notation.
class my_dict:
    pass
person = my_dict()
person.id = 1 # create using dot notation
person.phone = 9999
del person.phone # Remove a property using dot notation
name_data = my_dict()
name_data.first_name = 'Arnold'
name_data.last_name = 'Schwarzenegger'
person.name = name_data
person.name.first_name # dot notation access for nested properties - gives Arnold
One simple way to get dot access (but not array access) is to use a plain object in Python, like this:
class YourObject:
    def __init__(self, *args, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)
...and use it like this:
>>> obj = YourObject(key="value")
>>> print(obj.key)
"value"
... to convert it to a dict:
>>> print(obj.__dict__)
{"key": "value"}
The answer of @derek73 is very neat, but it cannot be pickled nor (deep)copied, and it returns None for missing keys. The code below fixes this.
Edit: I did not see the answer above that addresses the exact same point (upvoted). I'm leaving the answer here for reference.
class dotdict(dict):
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)
I just needed to access a dictionary using a dotted path string, so I came up with:
def get_value_from_path(dictionary, parts):
    """ extracts a value from a dictionary using a dotted path string """
    if type(parts) is str:
        parts = parts.split('.')
    if len(parts) > 1:
        return get_value_from_path(dictionary[parts[0]], parts[1:])
    return dictionary[parts[0]]
a = {'a':{'b':'c'}}
print(get_value_from_path(a, 'a.b')) # c
The implementation used by kaggle_environments is a function called structify.
class Struct(dict):
    def __init__(self, **entries):
        entries = {k: v for k, v in entries.items() if k != "items"}
        dict.__init__(self, entries)
        self.__dict__.update(entries)
    def __setattr__(self, attr, value):
        self.__dict__[attr] = value
        self[attr] = value

# Added benefit of cloning lists and dicts.
def structify(o):
    if isinstance(o, list):
        return [structify(o[i]) for i in range(len(o))]
    elif isinstance(o, dict):
        return Struct(**{k: structify(v) for k, v in o.items()})
    return o
https://github.com/Kaggle/kaggle-environments/blob/master/kaggle_environments/utils.py
This may be useful for testing AI simulation agents in games like ConnectX
from kaggle_environments import structify
obs = structify({ 'remainingOverageTime': 60, 'step': 0, 'mark': 1, 'board': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]})
conf = structify({ 'timeout': 2, 'actTimeout': 2, 'agentTimeout': 60, 'episodeSteps': 1000, 'runTimeout': 1200, 'columns': 7, 'rows': 6, 'inarow': 4, '__raw_path__': '/kaggle_simulations/agent/main.py' })
def agent(obs, conf):
    action = obs.step % conf.columns
    return action
Not a direct answer to the OP's question, but inspired by, and perhaps useful for, some. I've created an object-based solution using the internal __dict__ (in no way optimized code):
payload = {
    "name": "John",
    "location": {
        "lat": 53.12312312,
        "long": 43.21345112
    },
    "numbers": [
        {
            "role": "home",
            "number": "070-12345678"
        },
        {
            "role": "office",
            "number": "070-12345679"
        }
    ]
}
class Map(object):
    """
    Dot style access to object members, access raw values
    with an underscore e.g.

    class Foo(Map):
        def foo(self):
            return self.get('foo') + 'bar'

    obj = Foo(**{'foo': 'foo'})
    obj.foo => 'foobar'
    obj._foo => 'foo'
    """
    def __init__(self, *args, **kwargs):
        for arg in args:
            if isinstance(arg, dict):
                for k, v in arg.iteritems():
                    self.__dict__[k] = v
                    self.__dict__['_' + k] = v
        if kwargs:
            for k, v in kwargs.iteritems():
                self.__dict__[k] = v
                self.__dict__['_' + k] = v
    def __getattribute__(self, attr):
        if hasattr(self, 'get_' + attr):
            return object.__getattribute__(self, 'get_' + attr)()
        else:
            return object.__getattribute__(self, attr)
    def get(self, key):
        try:
            return self.__dict__.get('get_' + key)()
        except (AttributeError, TypeError):
            return self.__dict__.get(key)
    def __repr__(self):
        return u"<{name} object>".format(
            name=self.__class__.__name__
        )

class Number(Map):
    def get_role(self):
        return self.get('role')
    def get_number(self):
        return self.get('number')

class Location(Map):
    def get_latitude(self):
        return self.get('lat') + 1
    def get_longitude(self):
        return self.get('long') + 1

class Item(Map):
    def get_name(self):
        return self.get('name') + " Doe"
    def get_location(self):
        return Location(**self.get('location'))
    def get_numbers(self):
        return [Number(**n) for n in self.get('numbers')]

# Tests
obj = Item({'foo': 'bar'}, **payload)
assert type(obj) == Item
assert obj._name == "John"
assert obj.name == "John Doe"
assert type(obj.location) == Location
assert obj.location._lat == 53.12312312
assert obj.location._long == 43.21345112
assert obj.location.latitude == 54.12312312
assert obj.location.longitude == 44.21345112
for n in obj.numbers:
    assert type(n) == Number
    if n.role == 'home':
        assert n.number == "070-12345678"
    if n.role == 'office':
        assert n.number == "070-12345679"

How to allow for extra characters in dictionary keys

Let's say I have a dictionary like this:
my_dict = {
    hey: onevalue,
    hat: twovalue,
    how: threevalue
}
Is there any way to account for a reference to a key while having a variable, say a number in it?
such as my_dict['hey1'] = onevalue
or my_dict['hey2'] = onevalue
I'm trying to allow for such a variable instead of stripping the numbers from the reference to the key. Thank you in advance.
There is not. Due to how dicts work (as a hash table), they have no way to accommodate what you want. You will have to create your own type and define the __*item__() methods to implement your logic.
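A minimal sketch of that idea, kept deliberately smaller than the fuller MutableMapping version in the next answer; the class and method names here are illustrative, and it only covers item get and set:
class DigitStrippingDict(dict):
    @staticmethod
    def _clean(key):
        # drop any trailing digit characters from the key
        return key.rstrip('0123456789')
    def __getitem__(self, key):
        return dict.__getitem__(self, self._clean(key))
    def __setitem__(self, key, value):
        dict.__setitem__(self, self._clean(key), value)

d = DigitStrippingDict()
d['hey1'] = 'onevalue'
print(d['hey2'])  # onevalue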
Building on this answer, you can define your own dict-compatible type that strips trailing digits:
import collections
class TransformedDict(collections.MutableMapping):
    """A dictionary which applies an arbitrary key-altering function before accessing the keys"""
    def __init__(self, *args, **kwargs):
        self.store = dict()
        self.update(dict(*args, **kwargs))  # use the free update to set keys
    def __getitem__(self, key):
        return self.store[self.__keytransform__(key)]
    def __setitem__(self, key, value):
        self.store[self.__keytransform__(key)] = value
    def __delitem__(self, key):
        del self.store[self.__keytransform__(key)]
    def __iter__(self):
        return iter(self.store)
    def __len__(self):
        return len(self.store)
    def __keytransform__(self, key):
        # walk back from the end of the key to the last non-digit character
        for i in xrange(len(key) - 1, -1, -1):
            if not key[i].isdigit():
                break
        return key[:i + 1]
a = TransformedDict()
a['hello'] = 5
print a['hello123']
# 5

Define a python dictionary with immutable keys but mutable values

Well, the question is in the title: how do I define a python dictionary with immutable keys but mutable values? I came up with this (in python 2.x):
class FixedDict(dict):
    """
    A dictionary with a fixed set of keys
    """
    def __init__(self, dictionary):
        dict.__init__(self)
        for key in dictionary.keys():
            dict.__setitem__(self, key, dictionary[key])
    def __setitem__(self, key, item):
        if key not in self:
            raise KeyError("The key '" + key + "' is not defined")
        dict.__setitem__(self, key, item)
but it looks to me (unsurprisingly) rather sloppy. In particular, is this safe or is there the risk of actually changing/adding some keys, since I'm inheriting from dict?
Thanks.
Consider proxying dict instead of subclassing it. That means that only the methods that you define will be allowed, instead of falling back to dict's implementations.
class FixedDict(object):
    def __init__(self, dictionary):
        self._dictionary = dictionary
    def __setitem__(self, key, item):
        if key not in self._dictionary:
            raise KeyError("The key {} is not defined.".format(key))
        self._dictionary[key] = item
    def __getitem__(self, key):
        return self._dictionary[key]
Also, you should use string formatting instead of + to generate the error message, since otherwise it will crash for any value that's not a string.
The problem with direct inheritance from dict is that it's quite hard to comply with the full dict's contract (e.g. in your case, update method won't behave in a consistent way).
What you want is to extend collections.MutableMapping:
import collections
class FixedDict(collections.MutableMapping):
    def __init__(self, data):
        self.__data = data
    def __len__(self):
        return len(self.__data)
    def __iter__(self):
        return iter(self.__data)
    def __setitem__(self, k, v):
        if k not in self.__data:
            raise KeyError(k)
        self.__data[k] = v
    def __delitem__(self, k):
        raise NotImplementedError
    def __getitem__(self, k):
        return self.__data[k]
    def __contains__(self, k):
        return k in self.__data
Note that the original (wrapped) dict will be modified; if you don't want that to happen, use copy or deepcopy.
How you prevent someone from adding new keys depends entirely on why someone might try to add new keys. As the comments state, most dictionary methods that modify the keys don't go through __setitem__, so a .update() call will add new keys just fine.
If you only expect someone to use d[new_key] = v, then your __setitem__ is fine. If they might use other ways to add keys, then you have to put in more work. And of course, they can always use this to do it anyway:
dict.__setitem__(d, new_key, v)
You can't make things truly immutable in Python, you can only stop particular changes.
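A quick sketch demonstrating that point (the class name is illustrative):
class NoNewKeys(dict):
    def __setitem__(self, key, value):
        if key not in self:
            raise KeyError(key)
        dict.__setitem__(self, key, value)

d = NoNewKeys(a=1)
try:
    d['b'] = 2       # blocked: goes through the overridden __setitem__
except KeyError:
    print('blocked')
d.update(b=2)        # not blocked: dict.update() does not call __setitem__
print(d)             # {'a': 1, 'b': 2}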

Immutable dictionary, only use as a key for another dictionary

I had the need to implement a hashable dict so I could use a dictionary as a key for another dictionary.
A few months ago I used this implementation: Python hashable dicts
However I got a notice from a colleague saying 'it is not really immutable, thus it is not safe. You can use it, but it does make me feel like a sad Panda'.
So I started looking around to create one that is immutable. I have no need to compare the 'key-dict' to another 'key-dict'. Its only use is as a key for another dictionary.
I have come up with the following:
class HashableDict(dict):
    """Hashable dict that can be used as a key in other dictionaries"""
    def __new__(self, *args, **kwargs):
        # create a new local dict, that will be used by the HashableDictBase closure class
        immutableDict = dict(*args, **kwargs)

        class HashableDictBase(object):
            """Hashable dict that can be used as a key in other dictionaries. This is now immutable"""
            def __key(self):
                """Return a tuple of the current keys"""
                return tuple((k, immutableDict[k]) for k in sorted(immutableDict))
            def __hash__(self):
                """Return a hash of __key"""
                return hash(self.__key())
            def __eq__(self, other):
                """Compare two __keys"""
                return self.__key() == other.__key()  # pylint: disable-msg=W0212
            def __repr__(self):
                """#see: dict.__repr__"""
                return immutableDict.__repr__()
            def __str__(self):
                """#see: dict.__str__"""
                return immutableDict.__str__()
            def __setattr__(self, *args):
                raise TypeError("can't modify immutable instance")
            __delattr__ = __setattr__

        return HashableDictBase()
I used the following to test the functionality:
d = {"a" : 1}
a = HashableDict(d)
b = HashableDict({"b" : 2})
print a
d["b"] = 2
print a
c = HashableDict({"a" : 1})
test = {a : "value with a dict as key (key a)",
b : "value with a dict as key (key b)"}
print test[a]
print test[b]
print test[c]
which gives:
{'a': 1}
{'a': 1}
value with a dict as key (key a)
value with a dict as key (key b)
value with a dict as key (key a)
as output
Is this the 'best possible' immutable dictionary that I can use that satisfies my requirements? If not, what would be a better solution?
If you are only using it as a key for another dict, you could go for frozenset(mutabledict.items()). If you need to access the underlying mappings, you could then use that as the parameter to dict.
mutabledict = dict(zip('abc', range(3)))
immutable = frozenset(mutabledict.items())
read_frozen = dict(immutable)
read_frozen['a'] # => 1
Note that you could also combine this with a class derived from dict, and use the frozenset as the source of the hash, while disabling __setitem__, as suggested in another answer (see @RaymondHettinger's answer for code which does just that).
The Mapping abstract base class makes this easy to implement:
import collections
class ImmutableDict(collections.Mapping):
    def __init__(self, somedict):
        self._dict = dict(somedict)  # make a copy
        self._hash = None
    def __getitem__(self, key):
        return self._dict[key]
    def __len__(self):
        return len(self._dict)
    def __iter__(self):
        return iter(self._dict)
    def __hash__(self):
        if self._hash is None:
            self._hash = hash(frozenset(self._dict.items()))
        return self._hash
    def __eq__(self, other):
        return self._dict == other._dict
I realize this has already been answered, but types.MappingProxyType is an analogous implementation for Python 3.3. Regarding the original question of safety, there is a discussion in PEP 416 -- Add a frozendict builtin type on why the idea of a frozendict was rejected.
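For reference, a short sketch of what types.MappingProxyType gives you: a read-only view of an existing dict. Note that the proxy itself is not hashable, which is why the variants in this thread still compute their own hash from a frozenset of the items:
from types import MappingProxyType

d = {'a': 1}
proxy = MappingProxyType(d)
print(proxy['a'])   # 1
try:
    proxy['b'] = 2  # the proxy is read-only
except TypeError as e:
    print(e)
d['b'] = 2          # changes to the underlying dict are still visible
print(proxy['b'])   # 2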
In order for your immutable dictionary to be safe, all it needs to do is never change its hash. Why don't you just disable __setitem__ as follows:
class ImmutableDict(dict):
    def __setitem__(self, key, value):
        raise Exception("Can't touch this")
    def __hash__(self):
        return hash(tuple(sorted(self.items())))
a = ImmutableDict({'a':1})
b = {a:1}
print b
print b[a]
a['a'] = 0
The output of the script is:
{{'a': 1}: 1}
1
Traceback (most recent call last):
File "ex.py", line 11, in <module>
a['a'] = 0
File "ex.py", line 3, in __setitem__
raise Exception("Can't touch this")
Exception: Can't touch this
Here is a link to pip install-able implementation of #RaymondHettinger's answer: https://github.com/pcattori/icicle
Simply pip install icicle and you can from icicle import FrozenDict!
Update: icicle has been deprecated in favor of maps: https://github.com/pcattori/maps (documentation, PyPI).
It appears I am late to post. Not sure if anyone else has come up with ideas, but here is my take on it. The Dict is immutable and hashable. I made it immutable by overriding all the methods, magic and otherwise, with a custom '_readonly' function that raises an Exception. This is done when the object is instantiated. To get around the problem of not being able to apply the values, I set the 'hash' under '__new__'. I then override the '__hash__' function. That's it!
class ImmutableDict(dict):
    _HASH = None
    def __new__(cls, *args, **kwargs):
        ImmutableDict._HASH = hash(frozenset(args[0].items()))
        return super(ImmutableDict, cls).__new__(cls, args)
    def __hash__(self):
        return self._HASH
    def _readonly(self, *args, **kwards):
        raise TypeError("Cannot modify Immutable Instance")
    __delattr__ = __setattr__ = __setitem__ = pop = update = setdefault = clear = popitem = _readonly
Test:
immutabled1 = ImmutableDict({"This": "That", "Cheese": "Blarg"})
dict1 = {immutabled1: "Yay"}
dict1[immutabled1]
"Yay"
dict1
{{'Cheese': 'Blarg', 'This': 'That'}: 'Yay'}
Variation of Raymond Hettinger's answer by wrapping the self._dict with types.MappingProxyType.
import collections
from types import MappingProxyType

class ImmutableDict(collections.Mapping):
    """
    Copies a dict and proxies it via types.MappingProxyType to make it immutable.
    """
    def __init__(self, somedict):
        dictcopy = dict(somedict)  # make a copy
        self._dict = MappingProxyType(dictcopy)  # lock it
        self._hash = None
    def __getitem__(self, key):
        return self._dict[key]
    def __len__(self):
        return len(self._dict)
    def __iter__(self):
        return iter(self._dict)
    def __hash__(self):
        if self._hash is None:
            self._hash = hash(frozenset(self._dict.items()))
        return self._hash
    def __eq__(self, other):
        return self._dict == other._dict
    def __repr__(self):
        return str(self._dict)
You can use an enum:
import enum
KeyDict1 = enum.Enum('KeyDict1', {'InnerDictKey1': 'bla', 'InnerDictKey2': 2})
d = { KeyDict1: 'whatever', KeyDict2: 1, ...}
You can access the enums like you would a dictionary:
KeyDict1['InnerDictKey2'].value # This is 2
You can iterate over the names, and get their values... It does everything you'd expect.
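For example, a small sketch of iterating the members:
import enum

KeyDict1 = enum.Enum('KeyDict1', {'InnerDictKey1': 'bla', 'InnerDictKey2': 2})
for member in KeyDict1:
    print(member.name, member.value)
# InnerDictKey1 bla
# InnerDictKey2 2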
You can try using https://github.com/Lightricks/freeze
It provides recursively immutable and hashable dictionaries
from freeze import FDict
a_mutable_dict = {
    "list": [1, 2],
    "set": {3, 4},
}
a_frozen_dict = FDict(a_mutable_dict)
print(a_frozen_dict)
print(hash(a_frozen_dict))
# FDict: {'list': FList: (1, 2), 'set': FSet: {3, 4}}
# -4855611361973338606

Indexable weak ordered set in Python

I was wondering if there is an easy way to build an indexable weak ordered set in Python. I tried to build one myself. Here's what I came up with:
"""
An indexable, ordered set of objects, which are held by weak reference.
"""
from nose.tools import *
import blist
import weakref
class WeakOrderedSet(blist.weaksortedset):
    """
    A blist.weaksortedset whose key is the insertion order.
    """
    def __init__(self, iterable=()):
        self.insertion_order = weakref.WeakKeyDictionary()  # value_type to int
        self.last_key = 0
        super().__init__(key=self.insertion_order.__getitem__)
        for item in iterable:
            self.add(item)
    def __delitem__(self, index):
        values = super().__getitem__(index)
        super().__delitem__(index)
        if not isinstance(index, slice):
            # values is just one element
            values = [values]
        for value in values:
            if value not in self:
                del self.insertion_order[value]
    def add(self, value):
        # Choose a key so that value is on the end.
        if value not in self.insertion_order:
            key = self.last_key
            self.last_key += 1
            self.insertion_order[value] = key
        super().add(value)
    def discard(self, value):
        super().discard(value)
        if value not in self:
            del self.insertion_order[value]
    def remove(self, value):
        super().remove(value)
        if value not in self:
            del self.insertion_order[value]
    def pop(self, *args, **kwargs):
        value = super().pop(*args, **kwargs)
        if value not in self:
            del self.insertion_order[value]
    def clear(self):
        super().clear()
        self.insertion_order.clear()
    def update(self, *args):
        for arg in args:
            for item in arg:
                self.add(item)
if __name__ == '__main__':
    class Dummy:
        def __init__(self, value):
            self.value = value

    x = [Dummy(i) for i in range(10)]
    w = WeakOrderedSet(reversed(x))
    del w[2:8]
    assert_equals([9, 8, 1, 0], [i.value for i in w])
    del w[0]
    assert_equals([8, 1, 0], [i.value for i in w])
    del x
    assert_equals([], [i.value for i in w])
Is there an easier way to do this?
The easiest way is to take advantage of existing components in the standard library.
OrderedDict and the MutableSet ABC make it easy to write an OrderedSet.
Likewise, you can reuse the existing weakref.WeakSet and replace its underlying set() with an OrderedSet.
Indexing is more difficult to achieve -- the easiest way is to convert it to a list when needed. That is necessary because sets and dicts are intrinsically sparse.
import collections.abc
import weakref
class OrderedSet(collections.abc.MutableSet):
    def __init__(self, values=()):
        self._od = collections.OrderedDict().fromkeys(values)
    def __len__(self):
        return len(self._od)
    def __iter__(self):
        return iter(self._od)
    def __contains__(self, value):
        return value in self._od
    def add(self, value):
        self._od[value] = None
    def discard(self, value):
        self._od.pop(value, None)

class OrderedWeakrefSet(weakref.WeakSet):
    def __init__(self, values=()):
        super(OrderedWeakrefSet, self).__init__()
        self.data = OrderedSet()
        for elem in values:
            self.add(elem)
Use it like this:
>>> names = OrderedSet(['Alice', 'Bob', 'Carol', 'Bob', 'Dave', 'Edna'])
>>> len(names)
5
>>> 'Bob' in names
True
>>> s = list(names)
>>> s[2]
'Carol'
>>> s[4]
'Edna'
Note as of Python 3.7, regular dicts are guaranteed to be ordered, so you can substitute dict for OrderedDict in this recipe and it will all work fine :-)
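A sketch of the same recipe on Python 3.7+ with a plain dict standing in for OrderedDict:
import collections.abc

class OrderedSet(collections.abc.MutableSet):
    def __init__(self, values=()):
        self._d = dict.fromkeys(values)  # plain dicts preserve insertion order on 3.7+
    def __len__(self):
        return len(self._d)
    def __iter__(self):
        return iter(self._d)
    def __contains__(self, value):
        return value in self._d
    def add(self, value):
        self._d[value] = None
    def discard(self, value):
        self._d.pop(value, None)

names = OrderedSet(['Alice', 'Bob', 'Carol', 'Bob'])
print(list(names))  # ['Alice', 'Bob', 'Carol']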
Raymond has a great and succinct answer, as usual, but I actually came here a while back interested in the indexable part, more than the weakref part. I eventually built my own answer, which became the IndexedSet type in the boltons utility library. Basically, it's all the best parts of the list and set APIs, combined.
>>> x = IndexedSet(list(range(4)) + list(range(8)))
>>> x
IndexedSet([0, 1, 2, 3, 4, 5, 6, 7])
>>> x - set(range(2))
IndexedSet([2, 3, 4, 5, 6, 7])
>>> x[-1]
7
>>> fcr = IndexedSet('freecreditreport.com')
>>> ''.join(fcr[:fcr.index('.')])
'frecditpo'
If the weakref part is critical you can likely add it via inheritance or direct modification of a copy of the code (the module is standalone, pure-Python, and 2/3 compatible).
