I want to create a data structure for storing various possible paths through a plane with polygons scattered across it. I decided on using nested, multi-level dictionaries to save the various possible paths splitting at fixed points.
A possible instance of such a dictionary would be:
path_dictionary = {starting_coordinates:{new_fixpoint1:{new_fixpoint1_1:...}, new_fixpoint2:{new_fixpoint2_1:...}}}
Now I want to continue building up that structure with new paths from the last fixpoints, so I would have to edit the dictionary at various nesting levels. My plan was to provide a sorted keylist containing all the fixpoints of the given path, and to have a function that adds to the last provided key.
To achieve this I would have to be able to access the dictionary with the keylist like this:
keylist = [starting_coordinates, new_fixpoint1, new_fixpoint1_1, new_fixpoint1_1_3, ...]
path_dictionary = {starting_coordinates:{new_fixpoint1:{new_fixpoint1_1:...}, new_fixpoint2:{new_fixpoint2_1:...}}}
path_dictionary [keylist [0]] [keylist [1]] [keylist [2]] [...] = additional_fixpoint
Question: How can I write to a variable nesting/depth level in the multi-level dictionary when I have a keylist of some length?
Any help is very much appreciated.
I was playing around with the idea of using multiple indexes, and a defaultdict. And this came out:
from collections import defaultdict
class LayeredDict(defaultdict):
    def __getitem__(self, key):
        if isinstance(key, (tuple, list)):
            if len(key) == 1:
                return self[key[0]]
            return self[key[0]][key[1:]]
        return super(LayeredDict, self).__getitem__(key)
    def __setitem__(self, key, value):
        if isinstance(key, (tuple, list)):
            if len(key) == 1:
                self[key[0]] = value
            else:
                self[key[0]][key[1:]] = value
        else:
            super(LayeredDict, self).__setitem__(key, value)
    def __init__(self, *args, **kwargs):
        super(LayeredDict, self).__init__(*args, **kwargs)
        self.default_factory = type(self) # override default
I haven't fully tested it, but it should allow you to create any level of nested dictionaries, and index them with a tuple.
>>> x = LayeredDict()
>>> x['abc'] = 'blah'
>>> x['abc']
'blah'
>>> x[0, 8, 2] = 1.2345
>>> x[0, 8, 1] = 8.9
>>> x[0, 8, 'xyz'] = 10.1
>>> x[0, 8].keys()
[1, 2, 'xyz']
>>> x['abc', 1] = 5
*** TypeError: 'str' object does not support item assignment
Unfortunately unpacking notation in an index (x[*keylist]) isn't supported on the interpreter used here, but you can just pass a list or tuple in as an index.
>>> keylist = (0, 8, 2)
>>> x[*keylist]
*** SyntaxError: invalid syntax (<stdin>, line 1)
>>> x[keylist]
1.2345
Also, the isinstance(key, (tuple, list)) condition means a tuple can't be used as a key.
You can certainly write accessors for such a nested dictionary:
def get(d,l):
    return get(d[l[0]],l[1:]) if l else d
def set(d,l,v):
    while len(l)>1:
        d=d[l.pop(0)]
    l,=l # verify list length of 1
    d[l]=v
(Neither of these is efficient for long lists; faster versions would use a variable index rather than [1:] or pop(0).)
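As a rough sketch of the faster variant mentioned above, the accessors can walk the key list by iteration instead of slicing or popping at every level. The names get_path and set_path are hypothetical, and creating missing intermediate levels with setdefault is an assumption about the desired behaviour, not something the original accessors do:

```python
from functools import reduce
from operator import getitem

def get_path(d, keys):
    """Return the value reached by following keys through nested dicts."""
    # reduce applies d[key] once per key; an empty key list returns d itself
    return reduce(getitem, keys, d)

def set_path(d, keys, value):
    """Store value under the last key, descending through the earlier ones."""
    if not keys:
        raise ValueError("keys must contain at least one key")
    for key in keys[:-1]:
        # assumption: missing intermediate levels are created as plain dicts
        d = d.setdefault(key, {})
    d[keys[-1]] = value

# Usage with coordinate tuples standing in for the fixpoints
path_dictionary = {}
set_path(path_dictionary, [(0, 0), (1, 2), (3, 4)], "additional_fixpoint")
```

Unlike the pop(0) version, this does not modify the caller's key list, so the same keylist can be reused for the next write.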
As for other approaches, there’s not nearly enough here to go on for picking one.
Related
I'm working with a complex structure of nested dictionaries and I want to use some functions that take a list as an argument.
Is there any way of getting a list of values from a dictionary but keeping both linked in a way that if I modify one the other also gets modified?
Let me illustrate with an example:
# I have this dictionary
d = {'first': 1, 'second': 2}
# I want to get a list like this
print(l) # shows [1, 2]
# I want to modify the list and see the changes reflected in d
l[0] = 23
print(l) # shows [23, 2]
print(d) # shows {'first': 23, 'second': 2}
Is there any way to achieve something similar?
You'd have to create a custom sequence object that wraps the dictionary, mapping indices back to keys to get or set values:
from collections.abc import Sequence
class ValueSequence(Sequence):
    def __init__(self, d):
        self._d = d
    def __len__(self):
        return len(self._d)
    def _key_for_index(self, index):
        # support negative indices and reject out-of-range ones up front
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError(index)
        # stop iterating as soon as the key at that position is found
        return next(k for i, k in enumerate(self._d) if i == index)
    def __getitem__(self, index):
        key = self._key_for_index(index)
        return self._d[key]
    def __setitem__(self, index, value):
        key = self._key_for_index(index)
        self._d[key] = value
    def __repr__(self):
        return repr(list(self._d.values()))
The object doesn't support deletions, insertions, appending or extending. Only manipulation of existing dictionary values is supported. The object is also live; if you alter the dictionary, the object will reflect those changes directly.
Demo:
>>> d = {'first': 1, 'second': 2}
>>> l = ValueSequence(d)
>>> print(l)
[1, 2]
>>> l[0] = 23
>>> print(l)
[23, 2]
>>> print(d)
{'first': 23, 'second': 2}
>>> d['second'] = 42
>>> l
[23, 42]
Index lookups are not necessarily efficient, however: each one iterates over the dictionary's keys until it reaches the requested position.
Inheriting from the Sequence ABC gives you a few bonus methods:
>>> l.index(42)
1
>>> l.count(42)
1
>>> 23 in l
True
>>> list(reversed(l))
[42, 23]
Take into account that dictionaries were unordered before Python 3.7 (from 3.7 on they preserve insertion order); the above object reflects changes to the dictionary directly, and if such changes result in a different ordering, the order of the values changes too. The order of a dictionary does remain stable if you don't add or remove keys, however.
Given a collection that is ordered and keyed (like OrderedDict or SortedContainers SortedDict) I want to do the following:
d['first'] = 'hi'
d['second'] = 'there'
d['third'] = 'world'
(ix, value) = d.get_index_and_value('second')
assert d.iloc[ix + 1] == 'third'
# or list(d.keys())[ix + 1] with OrderedDict
however I cannot see an efficient way to get both the index and the value given a key ((ix, value) = d.get_index_and_value('second')).
Is this possible with SortedDict, or another container?
In practice, my keys are a sortable collection (dates) if that means there is a better container I could use.
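You can use the index method of keys (this works where keys() returns a list, as in Python 2; with a Python 3 OrderedDict, use list(d).index('second') instead):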
ix, value = d.keys().index('second'), d['second']
this will return
(1, 'there')
If you don't want to repeat yourself, you can make this a function, or extend OrderedDict to include this as a method:
def get_index_and_value(d, key):
    return d.keys().index(key), d[key]
print(get_index_and_value(d, 'second'))
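On Python 3, OrderedDict.keys() returns a view that has no index method, so the helper could be sketched as below. This is only an illustration under that assumption; it finds the position with a linear scan over the keys, so it is O(n) per lookup and not the efficient container method the question hopes for:

```python
from collections import OrderedDict

def get_index_and_value(d, key):
    """Return (position, value) for key in an insertion-ordered mapping."""
    value = d[key]  # raises KeyError right away if the key is absent
    for index, candidate in enumerate(d):
        if candidate == key:
            return index, value
    raise KeyError(key)  # only reachable if d changes during the scan

d = OrderedDict()
d['first'] = 'hi'
d['second'] = 'there'
d['third'] = 'world'
ix, value = get_index_and_value(d, 'second')
next_key = list(d)[ix + 1]  # key that follows 'second'
```

If the keys are sortable dates and lookups are frequent, a container that keeps its keys in a sorted sequence may avoid the scan, but that depends on the library chosen.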
I need a dictionary data structure that store dictionaries as seen below:
custom = {1: {'a': np.zeros(10), 'b': np.zeros(100)},
          2: {'c': np.zeros(20), 'd': np.zeros(200)}}
But the problem is that I iterate over this data structure many times in my code. Every time I iterate over it, I need the order of iteration to be respected because all the elements in this complex data structure are mapped to a 1D array (serialized if you will), and thus the order is important. I thought about writing an ordered dict of ordered dicts for that matter, but I'm not sure this is the right solution as it seems I may be choosing the wrong data structure. What would be the most adequate solution for my case?
UPDATE
So this is what I came up with so far:
class Test(list):
    def __init__(self, *args, **kwargs):
        super(Test, self).__init__(*args, **kwargs)
        for k,v in args[0].items():
            self[k] = OrderedDict(v)
        self.d = -1
        self.iterator = iter(self[-1].keys())
        self.etype = next(self.iterator)
        self.idx = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            self.idx += 1
            return self[self.d][self.etype][self.idx-1]
        except IndexError:
            self.etype = next(self.iterator)
            self.idx = 0
            return self[self.d][self.etype][self.idx-1]
    def __call__(self, d):
        self.d = -1 - d
        self.iterator = iter(self[self.d].keys())
        self.etype = next(self.iterator)
        self.idx = 0
        return self
def main(argv=()):
    tst = Test(elements)
    for el in tst:
        print(el)
    # loop over a lower dimension
    for el in tst(-2):
        print(el)
    print(tst)
    return 0
if __name__ == "__main__":
    sys.exit(main())
I can iterate as many times as I want over this ordered structure, and I implemented __call__ so I can iterate over the lower dimensions. I don't like the fact that if a lower dimension isn't present in the list, it doesn't give me any errors. I also have the feeling that evaluating self[self.d][self.etype][self.idx-1] on every step is less efficient than the original iteration over the dictionary. Is this true? How can I improve this?
Here's another alternative that uses an OrderedDefaultdict to define the tree-like data structure you want. I'm reusing the definition of it from another answer of mine.
To make use of it, you have to ensure the entries are defined in the order you want to access them in later on.
from collections import OrderedDict
class OrderedDefaultdict(OrderedDict):
    def __init__(self, *args, **kwargs):
        if not args:
            self.default_factory = None
        else:
            if not (args[0] is None or callable(args[0])):
                raise TypeError('first argument must be callable or None')
            self.default_factory = args[0]
            args = args[1:]
        super(OrderedDefaultdict, self).__init__(*args, **kwargs)
    def __missing__ (self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = default = self.default_factory()
        return default
    def __reduce__(self): # optional, for pickle support
        args = (self.default_factory,) if self.default_factory else ()
        # iter(self.items()) works on Python 3; Python 2 would use self.iteritems()
        return self.__class__, args, None, None, iter(self.items())
Tree = lambda: OrderedDefaultdict(Tree)
custom = Tree()
custom[1]['a'] = np.zeros(10)
custom[1]['b'] = np.zeros(100)
custom[2]['c'] = np.zeros(20)
custom[2]['d'] = np.zeros(200)
I'm not sure I understand your follow-on question. If the data structure is limited to two levels, you could use nested for loops to iterate over its elements in the order they were defined. For example:
for key1, subtree in custom.items():
    for key2, elem in subtree.items():
        print('custom[{!r}][{!r}]: {}'.format(key1, key2, elem))
(In Python 2 you'd want to use iteritems() instead of items().)
I think using OrderedDicts is the best way. They're part of the standard library (collections) and relatively fast:
custom = OrderedDict([(1, OrderedDict([('a', np.zeros(10)),
                                       ('b', np.zeros(100))])),
                      (2, OrderedDict([('c', np.zeros(20)),
                                       ('d', np.zeros(200))]))])
If you want to make it easy to iterate over the contents of your data structure, you can always provide a utility function to do so:
def iter_over_contents(data_structure):
    for delem in data_structure.values():
        for v in delem.values():
            for row in v:
                yield row
Note that in Python 3.3+, which allows yield from <expression>, the last for loop can be eliminated:
def iter_over_contents(data_structure):
    for delem in data_structure.values():
        for v in delem.values():
            yield from v
With one of those you'll then be able to write something like:
for elem in iter_over_contents(custom):
    print(elem)
and hide the complexity.
While you could define your own class in an attempt to encapsulate this data structure and use something like the iter_over_contents() generator function as its __iter__() method, that approach would likely be slower and wouldn't allow expressions using two levels of indexing such as the following:
custom[1]['b']
which using nested dictionaries (or OrderedDefaultdicts as shown in my other answer) would.
Could you just use a list of dictionaries?
custom = [{'a': np.zeros(10), 'b': np.zeros(100)},
{'c': np.zeros(20), 'd': np.zeros(200)}]
This could work if the outer dictionary is the only one you need in the right order. You could still access the inner dictionaries with custom[0] or custom[1] (careful, indexing now starts at 0).
If not all of the indices are used, you could do the following:
custom = [None] * maxLength # maximum dict size you expect
custom[1] = {'a': np.zeros(10), 'b': np.zeros(100)}
custom[2] = {'c': np.zeros(20), 'd': np.zeros(200)}
You can fix the order of your keys while iterating when you sort them first:
for key in sorted(custom.keys()):
    print(key, custom[key])
If you want to reduce sorted()-calls, you may want to store the keys in an extra list which will then serve as your iteration order:
ordered_keys = sorted(custom.keys())
for key in ordered_keys:
    print(key, custom[key])
You should then be ready for as many iterations over your data structure as you need.
Basically, if I'm trying to access a dict value which I expect to be an iterable, is there an easy one-liner to account for that value not being present, aside from using something like defaultdict? There's this:
for el in (myDict.get('myIterable') or []):
    pass
Doesn't feel particularly pythonic though...
for item in a_dict.get("some_key",[]):
    pass # do whatever
if the item is guaranteed to be a list if present ... if it might be things other than a list you will need a different solution
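For the case where the stored value might not be a list, one possible sketch is a small generator that checks the value before iterating. The name iter_values is hypothetical, and treating strings and bytes as non-iterable is an assumption made here because iterating a string character by character is rarely what is wanted:

```python
from collections.abc import Iterable

def iter_values(mapping, key):
    """Yield the items stored under key, or nothing if they are unusable."""
    value = mapping.get(key)  # None when the key is missing
    # assumption: str and bytes count as scalars, not as iterables of items
    if isinstance(value, Iterable) and not isinstance(value, (str, bytes)):
        yield from value

a_dict = {"some_key": [1, 2, 3], "number": 5, "text": "abc"}
found = list(iter_values(a_dict, "some_key"))
missing = list(iter_values(a_dict, "no_such_key"))
```

A missing key, a None value and a non-iterable value all produce an empty iteration, so the calling loop needs no special cases.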
You can make a subclass of dict that provides a default value with the __missing__(self, key) method:
class EmptyIterableDict(dict):
    def __missing__(self, key):
        return []
Example usage:
test = EmptyIterableDict()
test['a'] = [3,2,1]
test['b'] = [2,1]
test['c'] = [1]
for v in test['a']:
    print(v)
3
2
1
for v in test['d']:
    print(v)
If you already have a vanilla dict that you want to iterate over like that, you can make a temporary copy:
original = {'a': [1], 'b': [2,3]}
temp = EmptyIterableDict(original)
for v in temp['d']:
    print(v)
An explicit, multi-line approach to this would be:
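if 'my_iterable' in my_dict:
    for item in my_dict['my_iterable']:
        print(item)
which could also be written as a one-line comprehension (the membership test has to guard the lookup itself, otherwise a missing key still raises KeyError):
[print(item) for item in (my_dict['my_iterable'] if 'my_iterable' in my_dict else [])]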
This isn't a one-liner but it accounts for both possible failures.
try:
    for item in dictionary[key]:
        print(item)
except KeyError:
    pass # Key wasn't present in the dictionary.
except TypeError:
    pass # Key was present but corresponding item not iterable.
I have a list of dictionaries and strings like so:
listDict = [{'id':1,'other':2}, {'id':3,'other':4},
{'name':'Some name','other':6}, 'some string']
I want a list of all the ids (or other attributes) from the dictionaries via dot operator. So, from the given list I would get the list:
listDict.id
[1,3]
listDict.other
[2,4,6]
listDict.name
['Some name']
Thanks
python doesn't work this way. you'd have to redefine your listDict. the built-in list type doesn't support such access. the simpler way is just to get the new lists like this:
>>> ids = [d['id'] for d in listDict if isinstance(d, dict) and 'id' in d]
>>> ids
[1, 3]
P.S. your data structure seems to be awfully heterogeneous. if you explain what you're trying to do, a better solution can be found.
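The comprehension above can be wrapped in a small helper so it works for any key, not just 'id'. This is only a sketch; the name pluck is hypothetical and not part of the standard library:

```python
def pluck(items, key):
    """Collect the value stored under key from every dict in items."""
    # non-dict entries (such as plain strings) and dicts without key are skipped
    return [d[key] for d in items if isinstance(d, dict) and key in d]

listDict = [{'id': 1, 'other': 2}, {'id': 3, 'other': 4},
            {'name': 'Some name', 'other': 6}, 'some string']
ids = pluck(listDict, 'id')
others = pluck(listDict, 'other')
names = pluck(listDict, 'name')
```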
To do that, you'd need to create a class based on list:
class ListDict(list):
    def __init__(self, listofD=None):
        if listofD is not None:
            for d in listofD:
                self.append(d)
    def __getattr__(self, attr):
        res = []
        for d in self:
            # skip non-dict entries such as plain strings
            if isinstance(d, dict) and attr in d:
                res.append(d[attr])
        return res
if __name__ == "__main__":
    z = ListDict([{'id':1, 'other':2}, {'id':3,'other':4},
                  {'name':"some name", 'other':6}, 'some string'])
    print(z.id)
    print(z.other)
    print(z.name)