Just wrote some nasty code that iterates over a dict or a list in Python. I have a feeling this was not the best way to go about it.
The problem is that in order to iterate over a dict, this is the convention:
for key in dict_object:
    dict_object[key] = 1
But modifying the object properties by key does not work if the same thing is done on a list:
# Throws an error because the value of key is the property value, not
# the list index:
for key in list_object:
    list_object[key] = 1
The way I solved this problem was to write this nasty code:
if isinstance(obj, dict):
    for key in obj:
        do_loop_contents(obj, key)
elif isinstance(obj, list):
    for i in xrange(0, len(obj)):
        do_loop_contents(obj, i)

def do_loop_contents(obj, key):
    obj[key] = 1
Is there a better way to do this?
Thanks!
I've never needed to do this, ever. But if I did, I'd probably do something like this:
seq_iter = x if isinstance(x, dict) else xrange(len(x))
For example, in function form:
>>> def seq_iter(obj):
... return obj if isinstance(obj, dict) else xrange(len(obj))
...
>>> x = [1,2,3]
>>> for i in seq_iter(x):
... x[i] = 99
...
>>> x
[99, 99, 99]
>>>
>>> x = {1: 2, 2:3, 3:4}
>>> for i in seq_iter(x):
... x[i] = 99
...
>>> x
{1: 99, 2: 99, 3: 99}
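(Side note: this answer targets Python 2. On Python 3, xrange no longer exists; a minimal sketch of the same helper there, as my adaptation rather than part of the original answer:)
def seq_iter(obj):
    # dicts iterate over their keys; anything else is assumed to be index-addressable
    return obj if isinstance(obj, dict) else range(len(obj))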
This is the correct way, but if for some reason you need to treat these two objects the same way, you can create an iterable that will return indexes / keys no matter what:
def common_iterable(obj):
    if isinstance(obj, dict):
        return obj
    else:
        return (index for index, value in enumerate(obj))
Which will behave in the way you wanted:
>>> d = {'a': 10, 'b': 20}
>>> l = [1,2,3,4]
>>> for index in common_iterable(d):
        d[index] = 0
>>> d
{'a': 0, 'b': 0}
>>> for index in common_iterable(l):
        l[index] = 0
>>> l
[0, 0, 0, 0]
Or probably more efficiently, using a generator:
def common_iterable(obj):
    if isinstance(obj, dict):
        for key in obj:
            yield key
    else:
        for index, value in enumerate(obj):
            yield index
To be Pythonic and duck-type-friendly, and also to follow "ask forgiveness, not permission", you could do something like:
try:
    iterator = obj.iteritems()
except AttributeError:
    iterator = enumerate(obj)

for reference, value in iterator:
    do_loop_contents(obj, reference)
Though if all you need is the key/index:
try:
    references = obj.keys()
except AttributeError:
    references = range(len(obj))

for reference in references:
    do_loop_contents(obj, reference)
Or as a function:
def reference_and_value_iterator(iterable):
    try:
        return iterable.iteritems()
    except AttributeError:
        return enumerate(iterable)

for reference, value in reference_and_value_iterator(obj):
    do_loop_contents(obj, reference)
Or for just the references:
def references(iterable):
    try:
        return iterable.keys()
    except AttributeError:
        return range(len(iterable))

for reference in references(obj):
    do_loop_contents(obj, reference)
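These snippets use Python 2's dict.iteritems(). On Python 3, the same EAFP idea might look like this (my adaptation, not part of the original answer):
def reference_and_value_iterator(iterable):
    try:
        return iter(iterable.items())   # dict-like: (key, value) pairs
    except AttributeError:
        return enumerate(iterable)      # sequence: (index, value) pairs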
test_list = [2, 3, 4]
for i, entry in enumerate(test_list):
    test_list[i] = entry * 2
print(test_list) # Gives: [4, 6, 8]
But you probably want a list comprehension:
test_list = [2, 3, 4]
test_list = [entry * 2 for entry in test_list]
print(test_list) # Gives: [4, 6, 8]
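For a dict, the analogous rewrite is a dict comprehension (a small sketch along the same lines, with made-up sample data):
test_dict = {'a': 2, 'b': 3, 'c': 4}
test_dict = {key: value * 2 for key, value in test_dict.items()}
print(test_dict)  # Gives: {'a': 4, 'b': 6, 'c': 8}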
You probably just want different code depending on whether the object you are trying to change is a dict or a list.
if type(object) == type([]):
    for key in range(len(object)):
        object[key] = 1
elif type(object) == type({}):  # use 'else' if you know that object will be a dict if not a list
    for key in object:
        object[key] = 1
I stumbled upon this post while searching for a better way; here's how I did it.
for row in [dict_or_list] if not type(dict_or_list) is list else dict_or_list:
    for i, v in row.items():
        print(i, v)
I want to create a data structure for storing various possible paths through a plane with polygons scattered across it. I decided on using nested, multi-level dictionaries to save the various possible paths splitting at fixed points.
A possible instance of such a dictionary would be:
path_dictionary = {starting_coordinates:{new_fixpoint1:{new_fixpoint1_1:...}, new_fixpoint2:{new_fixpoint2_1:...}}}
Now I want to continue building up that structure with new paths from the last fixpoints, so I would have to edit the dictionary at various nesting levels. My plan was to provide a sorted keylist containing all the fixpoints of the given path, and to have a function that adds a new fixpoint under the last provided key.
To achieve this I would have to be able to access the dictionary with the keylist like this:
keylist = [starting_coordinates, new_fixpoint1, new_fixpoint1_1, new_fixpoint1_1_3, ...]
path_dictionary = {starting_coordinates:{new_fixpoint1:{new_fixpoint1_1:...}, new_fixpoint2:{new_fixpoint2_1:...}}}
path_dictionary[keylist[0]][keylist[1]][keylist[2]][...] = additional_fixpoint
Question: How can I write to a variable nesting/depth level in the multi-level dictionary when I have a keylist of some length?
Any help is very much appreciated.
I was playing around with the idea of using multiple indexes, and a defaultdict. And this came out:
from collections import defaultdict
class LayeredDict(defaultdict):
    def __getitem__(self, key):
        if isinstance(key, (tuple, list)):
            if len(key) == 1:
                return self[key[0]]
            return self[key[0]][key[1:]]
        return super(LayeredDict, self).__getitem__(key)

    def __setitem__(self, key, value):
        if isinstance(key, (tuple, list)):
            if len(key) == 1:
                self[key[0]] = value
            else:
                self[key[0]][key[1:]] = value
        else:
            super(LayeredDict, self).__setitem__(key, value)

    def __init__(self, *args, **kwargs):
        super(LayeredDict, self).__init__(*args, **kwargs)
        self.default_factory = type(self)  # override default
I haven't fully tested it, but it should allow you to create any level of nested dictionaries, and index them with a tuple.
>>> x = LayeredDict()
>>> x['abc'] = 'blah'
>>> x['abc']
'blah'
>>> x[0, 8, 2] = 1.2345
>>> x[0, 8, 1] = 8.9
>>> x[0, 8, 'xyz'] = 10.1
>>> x[0, 8].keys()
[1, 2, 'xyz']
>>> x['abc', 1] = 5
*** TypeError: 'str' object does not support item assignment
Unfortunately expansion notation (or whatever it's called) isn't supported, but
you can just pass a list or tuple in as an index.
>>> keylist = (0, 8, 2)
>>> x[*keylist]
*** SyntaxError: invalid syntax (<stdin>, line 1)
>>> x[keylist]
1.2345
Also, the isinstance(key, (tuple, list)) condition means a tuple can't be used as a key.
You can certainly write accessors for such a nested dictionary:
def get(d, l):
    return get(d[l[0]], l[1:]) if l else d

def set(d, l, v):
    while len(l) > 1:
        d = d[l.pop(0)]
    l, = l  # verify list length of 1
    d[l] = v
(Neither of these is efficient for long lists; faster versions would use a variable index rather than [1:] or pop(0).)
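A rough sketch of such faster versions (plain loops with an index instead of slicing or popping; the names are mine):
def get_fast(d, l):
    for k in l:
        d = d[k]
    return d

def set_fast(d, l, v):
    # walk all but the last key, then assign at the final level
    for i in range(len(l) - 1):
        d = d[l[i]]
    d[l[-1]] = v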
As for other approaches, there’s not nearly enough here to go on for picking one.
I have the following problem: I create a dictionary in which the keys are IDs (0 to N) and the values are lists of one or more numbers.
D = dict()
D[0] = [1]
D[1] = [2]
D[2] = [0]
OR:
D = dict()
D[0] = [1, 2]
D[1] = [1, 2]
D[2] = [0]
When a list stored in the dictionary has more than one value, it always means that this list is present under several different keys. What I now want is to convert both dicts into this:
D = dict()
D[0] = 1
D[1] = 2
D[2] = 0
For the first one, it's simple, the function will simply replace the values of the dict by the first value in the list:
def transform_dict(D):
    for key, value in D.items():
        D[key] = value[0]
    return D
However, in the second case, the function must assign one of the values to one key and the other value to the second key. For instance, the key "0" can be assigned the value "1" or "2", and the key "1" will be assigned the other one.
I am struggling with this simple problem, and I don't see a way to do this efficiently. Do you have any idea?
EDIT: Explanation n°2
The initial dict can have the following format:
D[key1] = [val1]
D[key2] = [val2]
D[key3] = [val3, val4]
D[key4] = [val3, val4]
If a list of values is composed of more than one element, it means that a second key exists within the dictionary with the same list of values (key3 and key4).
The goal is to transform this dict into:
D[key1] = val1
D[key2] = val2
D[key3] = val3
D[key4] = val4
Where val3 and val4 are assigned to key3 and key4 in either order (I don't care which one goes with which key).
EDIT2: Examples:
# Input dict
D[0] = [7]
D[1] = [5]
D[2] = [4]
D[3] = [1, 2, 3]
D[4] = [6, 8]
D[5] = [1, 2, 3]
D[6] = [1, 2, 3]
D[7] = [6, 8]
#Output
D[0] = 7
D[1] = 5
D[2] = 4
D[3] = 1
D[4] = 6
D[5] = 2
D[6] = 3
D[7] = 8
You can also create a class which behaves like a dictionary. That way you don't need any additional functions to "clean" the dictionary afterwards but rather solve it on the fly :)
How it works:
We extend collections.abc.Mapping and override the standard dictionary methods __getitem__, __setitem__ and __iter__. We use self._storage to hold the actual dictionary.
We use a second dictionary _unresolved to keep track of the keys which haven't been resolved yet. In the example above it has, for instance, the entry (1, 2, 3): [4, 5].
We use a helper function _resolve() that checks whether len((1, 2, 3)) == len([4, 5]). At the moment you assign D[6], these lengths become equal and the items are assigned to self._storage.
I tried to add comments in the code.
from collections.abc import Mapping
from collections import defaultdict
class WeirdDict(Mapping):
    def __init__(self, *args, **kw):
        self._storage = dict()  # the actual dictionary returned
        self._unresolved = defaultdict(list)  # a reversed mapping of the unresolved items
        for key, value in dict(*args, **kw).items():
            self._unresolved[tuple(value)].append(key)
        self._resolve()

    def __getitem__(self, key):
        return self._storage[key]

    def __setitem__(self, key, val):
        """ Setter. """
        if type(val) == int:
            self._storage[key] = val
        elif len(val) == 1:
            self._storage[key] = val[0]
        elif key not in self._storage:
            self._unresolved[tuple(val)].append(key)
        self._resolve()

    def _resolve(self):
        """ Helper function - checks if any keys can be resolved """
        resolved = set()
        for val, keys in self._unresolved.items():  # left to resolve
            if len(val) == len(keys):  # if we can resolve (count exhausted)
                for i, k in enumerate(keys):
                    self._storage[k] = val[i]
                resolved.add(val)
        # Remove from todo list
        for val in resolved:
            del self._unresolved[val]

    def __iter__(self):
        return iter(self._storage)

    def __len__(self):
        return len(self._storage)
And then start with:
D = WeirdDict()
D[0] = [7]
D[1] = 5
D[2] = (4)
D[3] = (1, 2, 3)
D[4] = (6, 8)
D[5] = (1, 2, 3)
D[6] = (1, 2, 3)
D[7] = [6, 8]
# Try this for different output
D[7] # gives 8
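Because WeirdDict implements the Mapping interface, you can also take a plain-dict snapshot once everything has resolved (a short usage sketch based on the assignments above):
plain = dict(D)
print(plain[4], plain[7])                    # 6 8
print(sorted(plain[k] for k in (3, 5, 6)))   # [1, 2, 3]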
I am not sure this is the most efficient, but it seems a way of doing it:
in_dict = dict()
in_dict[0] = [7]
in_dict[1] = [5]
in_dict[2] = [4]
in_dict[3] = [1, 2, 3]
in_dict[4] = [6, 8]
in_dict[5] = [1, 2, 3]
in_dict[6] = [1, 2, 3]
in_dict[7] = [6, 8]
out_dict = dict()
out_dict[0] = 7
out_dict[1] = 5
out_dict[2] = 4
out_dict[3] = 1
out_dict[4] = 6
out_dict[5] = 2
out_dict[6] = 3
out_dict[7] = 8
def weird_process(mapping):
    result = dict()
    for key, val in mapping.items():
        if len(val) == 1:
            result[key] = val[0]
        elif key not in result:  # was: `else:`
            # find other keys having the same value
            matching_keys = [k for k, v in mapping.items() if v == val]
            for i, k in enumerate(matching_keys):
                result[k] = val[i]
    return result
weird_process(in_dict) == out_dict
# True
EDIT: I have simplified the code a little bit.
EDIT2: I have improved the efficiency by skipping elements that have been already processed
EDIT3
An even faster approach would be to use a temporary copy of the input keys to reduce the inner looping by consuming the input as soon as it gets used:
def weird_process(mapping):
    unseen = set(mapping.keys())
    result = dict()
    for key, val in mapping.items():
        if len(val) == 1:
            result[key] = val[0]
        elif key not in result:
            # find other keys having the same value
            matching_keys = [k for k in unseen if mapping[k] == val]
            for i, k in enumerate(matching_keys):
                result[k] = val[i]
                unseen.remove(k)
    return result
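Because unseen is a set, which of the duplicate-list keys receives which value is not guaranteed, but for the example data above the keys and the multiset of values still match out_dict (a quick check; only the pairing may differ):
result = weird_process(in_dict)
print(sorted(result) == sorted(out_dict))                    # True
print(sorted(result.values()) == sorted(out_dict.values()))  # True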
I have a list like this:
a = [['cat1.subcat1.item1', 0], ['cat1.subcat1.item2', 'hello'], ['cat1.subcat2.item1', 1337], ['cat2.item1', 'test']]
So there may be several subcategories with items, separated by dots. But the number of categories and the depth of nesting isn't fixed, and it isn't equal among the categories.
I want the list to look like this:
a = [['cat1', [
         ['subcat1', [
             ['item1', 0],
             ['item2', 'hello']
         ]],
         ['subcat2', [
             ['item1', 1337]
         ]],
     ]],
     ['cat2', [
         ['item1', 'test']
     ]]
]
I hope this makes sense.
In the end I need a JSON string out of this. If it is somehow easier, it could also be converted directly to the JSON string.
Any idea how to achieve this? Thanks!
You should use a nested dictionary structure. This can be processed efficiently using collections.defaultdict and functools.reduce.
Conversion to a regular dictionary is possible, though usually not necessary.
Solution
from collections import defaultdict
from functools import reduce
from operator import getitem
def getFromDict(dataDict, mapList):
"""Iterate nested dictionary"""
return reduce(getitem, mapList, dataDict)
tree = lambda: defaultdict(tree)
d = tree()
for i, j in a:
path = i.split('.')
getFromDict(d, path[:-1])[path[-1]] = j
Result
def default_to_regular_dict(d):
"""Convert nested defaultdict to regular dict of dicts."""
if isinstance(d, defaultdict):
d = {k: default_to_regular_dict(v) for k, v in d.items()}
return d
res = default_to_regular_dict(d)
{'cat1': {'subcat1': {'item1': 0,
'item2': 'hello'},
'subcat2': {'item1': 1337}},
'cat2': {'item1': 'test'}}
Explanation
getFromDict(d, path[:-1]) takes a list path[:-1] and recursively accesses dictionary values corresponding to the list items from dictionary d. I've implemented this bit functionally via functools.reduce and operator.getitem.
We then access the key path[-1], the last element of the list, from the resulting dictionary tree. This will be a dictionary since d is a defaultdict of dictionaries. We can then assign value j to this dictionary.
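Since the end goal was a JSON string, the regular-dict result can be serialized directly (a short follow-up using res from above):
import json

json_string = json.dumps(res)
print(json_string)
# {"cat1": {"subcat1": {"item1": 0, "item2": "hello"}, "subcat2": {"item1": 1337}}, "cat2": {"item1": "test"}}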
Not as pretty as jpp's solution, but hey, at least I tried. Using the merge function to merge deep dicts, as seen in this answer.
def merge(a, b, path=None):
"merges b into a"
if path is None: path = []
for key in b:
if key in a:
if isinstance(a[key], dict) and isinstance(b[key], dict):
merge(a[key], b[key], path + [str(key)])
elif a[key] == b[key]:
pass # same leaf value
else:
raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
else:
a[key] = b[key]
return a
a = [['cat1.subcat1.item1', 0], ['cat1.subcat1.item2', 'hello'], ['cat1.subcat2.item1', 1337], ['cat2.item1', 'test']]
# convert to dict
b = {x[0]:x[1] for x in a}
res = {}
# iterate over dict
for k, v in list(b.items()):
    s = k.split('.')
    temp = {}
    # iterate over reverse indices,
    # build temp dict from the ground up
    for i in reversed(range(len(s))):
        if i == len(s)-1:
            temp = {s[i]: v}
        else:
            temp = {s[i]: temp}
        # merge temp dict into the result dict
        if i == 0:
            res = merge(res, temp)
            temp = {}
print(res)
# {'cat1': {'subcat1': {'item1': 0, 'item2': 'hello'}, 'subcat2': {'item1': 1337}}, 'cat2': {'item1': 'test'}}
I'm working with a complex structure of nested dictionaries and I want to use some functions that take a list as an argument.
Is there any way of getting a list of values from a dictionary but keeping both linked in a way that if I modify one the other also gets modified?
Let me illustrate with an example:
# I have this dictionary
d = {'first': 1, 'second': 2}
# I want to get a list like this
print(l) # shows [1, 2]
# I want to modify the list and see the changes reflected in d
l[0] = 23
print(l) # shows [23, 2]
print(d) # shows {'first': 23, 'second': 2}
Is there any way to achieve something similar?
You'd have to create a custom sequence object that wraps the dictionary, mapping indices back to keys to get or set values:
from collections.abc import Sequence
class ValueSequence(Sequence):
    def __init__(self, d):
        self._d = d

    def __len__(self):
        return len(self._d)

    def _key_for_index(self, index):
        # try to avoid iteration over the whole dictionary
        if index >= len(self):
            raise IndexError(index)
        return next(v for i, v in enumerate(self._d) if i == index)

    def __getitem__(self, index):
        key = self._key_for_index(index)
        return self._d[key]

    def __setitem__(self, index, value):
        key = self._key_for_index(index)
        self._d[key] = value

    def __repr__(self):
        return repr(list(self._d.values()))
The object doesn't support deletions, insertions, appending or extending. Only manipulation of existing dictionary values is supported. The object is also live; if you alter the dictionary, the object will reflect those changes directly.
Demo:
>>> d = {'first': 1, 'second': 2}
>>> l = ValueSequence(d)
>>> print(l)
[1, 2]
>>> l[0] = 23
>>> print(l)
[23, 2]
>>> print(d)
{'first': 23, 'second': 2}
>>> d['second'] = 42
>>> l
[23, 42]
These are not necessarily efficient, however.
Inheriting from the Sequence ABC gives you a few bonus methods:
>>> l.index(42)
1
>>> l.count(42)
1
>>> 23 in l
True
>>> list(reversed(l))
[42, 23]
Take into account that dictionaries are unordered; the above object will reflect changes to the dictionary directly, and if such changes result in a different ordering then that will result in a different order of the values too. The order of a dictionary does remain stable if you don't add or remove keys, however.
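For contrast, a plain list(d.values()) copy is just a snapshot and is not linked to the dictionary at all, which is why a wrapper like the one above is needed (a quick check):
d = {'first': 1, 'second': 2}
l = list(d.values())  # a copy, not a view
l[0] = 23
print(d)              # {'first': 1, 'second': 2} -- unchanged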
I have a data structure:
my_list = [0] {key1: [1, 2, 3],
               key2: [4, 5, 6],
               key3: .....,
               key4: .....}
          [1] {key1: [.......],
               key2: [... etc.
That is, a list of 4 dictionaries, each dictionary having 4 keys and each key a list of 3 values. Nice and consistent.
I want to loop through each value in each list and update it with an external function. This external function takes 2 arguments (the value I'm updating and the float value contained in its respective key). It's a basic bit of math, but the problem is in iterating through the structure; it is getting complex and I'm getting lost.
What I have done so far:
def Efunction_conversion(my_list):
    converted_dict_list = []
    for i in range(0, 4):
        new_dict = {key: [external_function(float(key), value) for key, value in my_list[i].iteritems()]}  ## problems occur here
        converted_dict_list.append(new_dict)
    return converted_dict_list
The code is not working and it may be obvious to others why.
The external function:
def external_function(key, value):
    E = ((value - key)/key)**2
    return E
And the error, TypeError: unsupported operand type(s) for -: 'list' and 'float'
So the main iteration line is passing a list instead of each element, it seems.
Inside your for loop, you are creating the dict with only a single key, and the looping over the keys happens only inside the list comprehension -
[external_function(float(key), value) for key, value in my_list[i].iteritems()]
It is not happening for the dict as such. Also, if I am not wrong, value is a list, so you are passing the whole list as a parameter to the external function, which may not be what you want.
A simple way to do this would be (for Python 2.7+ with dictionary comprehension) -
def Efunction_conversion(my_list):
    converted_dict_list = []
    for x in my_list:
        converted_dict_list.append({key: [external_function(float(key), y) for y in value] for key, value in x.iteritems()})
    return converted_dict_list
A one liner for this would be -
def Efunction_conversion(my_list):
    return [{key: [external_function(float(key), y) for y in value] for key, value in x.iteritems()} for x in my_list]
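A quick usage sketch of the one-liner, with .items() substituted for .iteritems() so it runs on Python 3, and with the scalar external_function from the question (the sample data and numeric-string keys are mine):
def external_function(key, value):
    return ((value - key) / key) ** 2

def Efunction_conversion(my_list):
    return [{key: [external_function(float(key), y) for y in value]
             for key, value in x.items()}
            for x in my_list]

my_list = [{'2.0': [1, 2, 3], '4.0': [4, 5, 6]}]
print(Efunction_conversion(my_list))
# [{'2.0': [0.25, 0.0, 0.25], '4.0': [0.0, 0.0625, 0.25]}]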
If you are having trouble writing or understanding a list/dict comprehension, indent it first before trying to write a one-liner:
my_list = [
    {'key1': [1, 2, 3],
     'key2': [4, 5, 6],}
]

def external_function(key, value):
    return key + str(value)

other_list = [
    {
        k: [external_function(k, el) for el in v]
        for k, v in d.iteritems()
    }
    for d in my_list
]
[{'key1': ['key11', 'key12', 'key13'], 'key2': ['key24', 'key25', 'key26']}]
The problem is with your external function; you could change it to this:
def external_function(key, value):
    E = 0.0
    for v in value:
        E += ((v - key)/key)**2
    return E
If you want it as a list:
def external_function(key, value):
    E = []
    for v in value:
        E.append(((v - key)/key)**2)
    return E
Your external_function expected a float/int but got a list, so the error was thrown.
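A quick check of the list version above with made-up numbers:
print(external_function(4.0, [4, 5, 6]))  # [0.0, 0.0625, 0.25]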