Cast dict to defaultdict - python

The following code uses dict unpacking ({**aa, **bb}) to combine two defaultdicts.
from collections import defaultdict
aa=defaultdict(str)
bb=defaultdict(str)
aa['foo']+= '1'
bb['bar']+= '2'
cc = {**aa,**bb}
type(cc)
But, as we see if we run this, the {**aa, **bb} expression returns a plain dict, not a defaultdict.
Is there a way to cast a dict back to a defaultdict?

You can use unpacking directly in a call to defaultdict. defaultdict is a subclass of dict, and it passes the unpacked items as keyword arguments to its parent, which builds the dictionary as though they had been passed to dict.
cc = defaultdict(str, **aa, **bb)
# defaultdict(<class 'str'>, {'bar': '2', 'foo': '1'})
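One caveat with the keyword-unpacking form: because the entries are passed as keyword arguments, every key has to be a string. A minimal sketch of the difference, where the integer key 2 is made up for illustration; passing the merged mapping as the second positional argument avoids the restriction:

```python
from collections import defaultdict

aa = defaultdict(str)
bb = defaultdict(str)
aa['foo'] += '1'
bb[2] += 'two'  # hypothetical non-string key, for illustration only

# Keyword unpacking in a call requires string keys.
try:
    defaultdict(str, **aa, **bb)
    keyword_form_ok = True
except TypeError:
    keyword_form_ok = False

# Merging into a plain dict first works for any hashable keys.
cc = defaultdict(str, {**aa, **bb})
missing = cc['not there']  # default factory still applies to the result
```

With only string keys, both forms should give the same result.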

You can do it the long way. The benefit of this method is you don't need to re-specify the type of defaultdict:
def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z
cc = merge_two_dicts(aa, bb)
Unpacking in a single expression works but is inefficient:
n = 500000
d1 = defaultdict(int)
d1.update({i: i for i in range(n)})
d2 = defaultdict(int)
d2.update({i+n:i+n for i in range(n)})
%timeit defaultdict(int, {**d1, **d2}) # 150 ms per loop
%timeit merge_two_dicts(d1, d2) # 90.9 ms per loop

The defaultdict constructor can take two arguments, where the first is the function to use for the default, and the second a mapping (dict). It copies the keys/values from the dict passed in.
>>> d = defaultdict(list, {'a': [1,2,3]})
>>> d['a']
[1, 2, 3]
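Applied to the original question, this two-argument form is effectively the "cast": pass the plain dict as the second argument. A small sketch (the variable names are illustrative) that also shows the copy is shallow, so mutable values are shared with the source dict:

```python
from collections import defaultdict

plain = {'a': [1, 2, 3]}
d = defaultdict(list, plain)

d['b'].append(4)  # missing key: list() is called to create the value
d['a'].append(9)  # existing key: the list object is shared with `plain`
```

If the shared values are a problem, copying them explicitly (for example with copy.deepcopy) before building the defaultdict is one option.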

Related

Python array of tuples group by first, store second

So I have an array of tuples something like this
query_results = [("foo", "bar"), ("foo", "qux"), ("baz", "foo")]
I would like to achieve something like:
{
"foo": ["bar", "qux"],
"baz": ["foo"]
}
So I have tried using this
from itertools import groupby
grouped_results = {}
for key, y in groupby(query_results, lambda x: x[0]):
    grouped_results[key] = [y[1] for u in list(y)]
The issue I have is although the number of keys are correct, the number of values in each array is dramatically lower than it should be. Can anyone explain why this happens and what I should be doing?
You would be better off using a defaultdict for this:
from collections import defaultdict
result = defaultdict(list)
for k,v in query_results:
    result[k].append(v)
Which yields:
>>> result
defaultdict(<class 'list'>, {'baz': ['foo'], 'foo': ['bar', 'qux']})
If you wish to turn it into a vanilla dictionary again, you can - after the for loop - use:
result = dict(result)
this then results in:
>>> dict(result)
{'baz': ['foo'], 'foo': ['bar', 'qux']}
A defaultdict is constructed with a factory, here list. In case the key cannot be found in the dictionary, the factory is called (list() constructs a new empty list). The result is then associated with the key.
So for each key k that is not yet in the dictionary, we will construct a new list first. We then call .append(v) on that list to append values to it.
Well why not use a simple for loop?
grouped_results = {}
for key, value in query_results:
    grouped_results.setdefault(key, []).append(value)
Output:
{'foo': ['bar', 'qux'], 'baz': ['foo']}
How about using a defaultdict?
d = defaultdict(list)
for pair in query_results:
    d[pair[0]].append(pair[1])
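As for why the groupby attempt comes up short: itertools.groupby only groups consecutive items with the same key, so if equal keys are not adjacent, each run becomes its own group and later runs overwrite earlier ones in the dict. A rough sketch, using a reordered version of the sample data (the reordering is an assumption made here so that the two "foo" rows are not adjacent):

```python
from itertools import groupby
from operator import itemgetter

# Reordered sample: the two "foo" rows are deliberately not adjacent.
query_results = [("foo", "bar"), ("baz", "foo"), ("foo", "qux")]
first = itemgetter(0)

# Without sorting, "foo" yields two separate groups; the second overwrites the first.
unsorted_groups = {k: [v for _, v in g] for k, g in groupby(query_results, key=first)}

# Sorting by the same key first makes equal keys adjacent.
sorted_groups = {
    k: [v for _, v in g]
    for k, g in groupby(sorted(query_results, key=first), key=first)
}
```

Sorting costs O(n log n), which is one reason the defaultdict and setdefault loops above are usually the simpler choice.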

Usage of setdefault instead of defaultdict

I need to create a structure like this:
D = {i:{j:{k:0,l:1,m:2}},a:{b:{c:0,d:4}}}
So this can be done using defaultdict:
D = defaultdict(lambda: defaultdict(Counter))
How do I use setdefault here?
EDIT:
Is it possible to combine setdefault and defaultdict?
To build a multi-level dictionary with setdefault() you'd need to repeatedly access the keys like this:
>>> from collections import Counter
>>> d = {}
>>> d.setdefault("i", {}).setdefault("j", Counter())
Counter()
>>> d
{'i': {'j': Counter()}}
To generalize the usage for new keys you could use a function:
def get_counter(d, i, j):
    return d.setdefault(i, {}).setdefault(j, Counter())
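On the EDIT question, the two can be mixed, since setdefault is an ordinary dict method and a defaultdict is an ordinary value. A minimal sketch comparing the three variants (the names D, plain and mixed are just for illustration):

```python
from collections import Counter, defaultdict

# Pure defaultdict: the outer factory must be a callable, hence the lambda.
D = defaultdict(lambda: defaultdict(Counter))
D['i']['j']['k'] += 1

# Pure setdefault on plain dicts.
plain = {}
plain.setdefault('i', {}).setdefault('j', Counter())['k'] += 1

# Mixed: setdefault for the outer level, defaultdict(Counter) for the inner one.
mixed = {}
mixed.setdefault('i', defaultdict(Counter))['j']['k'] += 1
```

One difference worth noting: setdefault evaluates its default argument on every call, even when the key already exists, whereas a defaultdict only calls its factory for missing keys.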

How can I populate a dictionary with an enumerated list?

I have the following dictionary, where keys are integers and values are floats:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
This dictionary has keys 1, 2, 3 and 4.
Now, I remove the key '2':
foo = {1:0.001,3:1.093,4:5.246}
I only have the keys 1, 3 and 4 left. But I want these keys to be called 1, 2 and 3.
The function 'enumerate' allows me to get the list [1,2,3]:
some_list = []
for k,v in foo.items():
    some_list.append(k)
num_list = list(enumerate(some_list, start=1))
Next, I try to populate the dictionary with these new keys and the old values:
new_foo = {}
for i in num_list:
    for value in foo.values():
        new_foo[i[0]] = value
However, new_foo now contains the following values:
{1: 5.246, 2: 5.246, 3: 5.246}
So every value was replaced by the last value of 'foo'. I think the problem comes from the design of my for loop, but I don't know how to solve this. Any tips?
Using the list-comprehension-like style:
bar = dict( (k,v) for k,v in enumerate(foo.values(), start=1) )
But, as mentioned in the comments, the ordering may be arbitrary: before Python 3.7, dicts did not guarantee any iteration order (from 3.7 on they preserve insertion order). To number the values in sorted-key order regardless of version, the following can be used:
bar = dict( ( i,foo[k] ) for i, k in enumerate(sorted(foo), start=1) )
here sorted(foo) returns the list of sorted keys of foo. i is the new enumeration of the sorted keys as well as the new enumeration for the new dict.
Like others have said, it would be best to use a list instead of dict. However, in case you prefer to stick with a dict, you can do
foo = {j+1:foo[k] for j,k in enumerate(sorted(foo))}
I agree with the other responses that a list implements the behavior you describe, and so is probably more appropriate, but I will suggest an answer anyway.
The problem with your code is the way you are using the data structures. Simply enumerate the items left in the dictionary:
new_foo = {}
for key, (old_key, value) in enumerate( sorted( foo.items() ) ):
    key = key+1 # adjust for 1-based
    new_foo[key] = value
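To spell out what goes wrong in the question's version: the inner loop runs over all values for each new key, so every key ends up holding whichever value is visited last. A small sketch contrasting that nested-loop pattern with pairing keys and values in lockstep (the names broken and fixed are illustrative, and the result for broken assumes Python 3.7+ insertion ordering):

```python
foo = {1: 0.001, 3: 1.093, 4: 5.246}

# Nested loops: for each new key, the inner loop walks ALL values,
# so the last value visited overwrites the earlier ones.
broken = {}
for new_key in range(1, len(foo) + 1):
    for value in foo.values():
        broken[new_key] = value

# Lockstep: one new key per old value, taken in sorted-key order.
fixed = dict(enumerate((foo[k] for k in sorted(foo)), start=1))
```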
A dictionary is the wrong structure here. Use a list; lists map contiguous integers to values, after all.
Either adjust your code to start at 0 rather than 1, or include a padding value at index 0:
foo = [None, 0.001, 2.097, 1.093, 5.246]
Deleting the 2 'key' is then as simple as:
del foo[2]
giving you automatic renumbering of the rest of your 'keys'.
This looks suspiciously like Something You Should Not Do, but I'll assume for a moment that you're simplifying the process for an MCVE rather than actually trying to name your dict keys 1, 2, 3, 4, 5, ....
d = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
del d[2]
# d == {1:0.001, 3:1.093, 4:5.246}
new_d = {idx:val for idx,val in zip(range(1,len(d)+1),
(v for _,v in sorted(d.items())))}
# new_d == {1: 0.001, 2: 1.093, 3: 5.246}
You can convert dict to list, remove specific element, then convert list to dict. Sorry, it is not a one liner.
In [1]: foo = {1:0.001,2:2.097,3:1.093,4:5.246}
In [2]: l=list(foo.values()) #[0.001, 2.097, 1.093, 5.246]
In [3]: l.pop(1) #returns 2.097, not the list
In [4]: dict(enumerate(l,1))
Out[4]: {1: 0.001, 2: 1.093, 3: 5.246}
Try:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
foo.pop(2)
new_foo = {i: value for i, (_, value) in enumerate(sorted(foo.items()), start=1)}
print(new_foo)
However, I'd advise you to use a normal list instead, which is designed exactly for fast lookup of gapless, numeric keys:
foo = [0.001, 2.097, 1.093, 5.246]
foo.pop(1) # list indices start at 0
print(foo)
One liner that filters a sequence, then re-enumerates and constructs a dict.
In [1]: foo = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
In [2]: selected=1
In [3]: { k:v for k,v in enumerate((foo[i] for i in foo if i != selected), 1) }
Out[3]: {1: 2.097, 2: 1.093, 3: 5.246}
I have a more compact method.
I think it's more readable and easier to understand:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
del foo[2]
foo.update({k:foo[4] for k in foo})
print(foo)
Note, however, that this sets every value to foo[4] and leaves the keys as 1, 3 and 4, so it reproduces the problem from the question rather than the renumbered dict:
{1: 5.246, 3: 5.246, 4: 5.246}

recursive dictionary creation python

Is there any way to dynamically create missing keys when I want to set a value in a sub-dictionary?
Essentially, I want to create any missing keys and set my value.
self.portdict[switchname][str(neighbor['name'])]['local']['ports'] = []
Currently I'm doing it like this, but it's messy:
if not switchname in self.portdict:
    self.portdict[switchname] = {}
if not str(neighbor['name']) in self.portdict[switchname]:
    self.portdict[switchname][str(neighbor['name'])] = {}
if not 'local' in self.portdict[switchname][str(neighbor['name'])]:
    self.portdict[switchname][str(neighbor['name'])]['local'] = {}
if not 'ports' in self.portdict[switchname][str(neighbor['name'])]['local']:
    self.portdict[switchname][str(neighbor['name'])]['local']['ports'] = []
Is there any way to do this in one or two lines instead?
This is easier to do without recursion:
def set_by_path(dct, path, value):
    ipath = iter(path)
    p_last = next(ipath)
    try:
        while True:
            p_next = next(ipath)
            dct = dct.setdefault(p_last, {})
            p_last = p_next
    except StopIteration:
        dct[p_last] = value
And a test case:
d = {}
set_by_path(d, ['foo', 'bar', 'baz'], 'qux')
print(d) # {'foo': {'bar': {'baz': 'qux'}}}
If you want to have it so you don't need a function, you can use the following defaultdict factory which allows you to nest things arbitrarily deeply:
from collections import defaultdict
defaultdict_factory = lambda : defaultdict(defaultdict_factory)
d = defaultdict_factory()
d['foo']['bar']['baz'] = 'qux'
print(d)
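One thing to keep in mind with this kind of self-nesting defaultdict is that merely reading a missing key creates it. A hedged sketch of that side effect, plus one possible way to convert the result back to plain dicts (the helper names tree and to_plain are made up for illustration):

```python
from collections import defaultdict

def tree():
    # Same idea as the lambda factory above, written as a named function.
    return defaultdict(tree)

d = tree()
d['foo']['bar']['baz'] = 'qux'

_ = d['typo']['x']            # a read, but it still creates both keys
typo_was_created = 'typo' in d

def to_plain(node):
    # Recursively turn nested defaultdicts into ordinary dicts.
    if isinstance(node, defaultdict):
        return {key: to_plain(value) for key, value in node.items()}
    return node

plain = to_plain(d)
```

Using `key in d` or `d.get(key)` for lookups avoids creating entries by accident.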
Use collections.defaultdict
self.portdict = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: []))))
I've run into a similar problem in the past. I found that defaultdict was the right answer for me, but writing the super long definitions (like the one in @o11c's answer or @Apero's answer) was no good. Here's what I came up with instead:
from collections import defaultdict
from functools import partial
def NestedDefaultDict(levels, baseFn):
    def NDD(lvl):
        return partial(defaultdict, NDD(lvl-1)) if lvl > 0 else baseFn
    return defaultdict(NDD(levels-1))
This creates a dictionary with levels of nested dictionaries. So if you have levels=3, then you need 3 keys to access the bottom-level value. The second argument is a function which is used to create the bottom-level values. Something like list or lambda: 0 or even dict would work well.
Here's an example of using the "automatic" keys with 4 levels, and list as the default function:
>>> x = NestedDefaultDict(4, list)
>>> x[1][2][3][4].append('hello')
>>> x
defaultdict(<functools.partial object at 0x10b5c22b8>, {1: defaultdict(<functools.partial object at 0x10b5c2260>, {2: defaultdict(<functools.partial object at 0x10b5c2208>, {3: defaultdict(<type 'list'>, {4: ['hello']})})})})
I think that's basically what you'd want for the case in your question. Your 4 "levels" are switch-name, neighbor-name, local, & ports—and it looks like you want a list at the bottom-level to store your ports.
Another example using 2 levels and lambda: 0 as the default:
>>> y = NestedDefaultDict(2, lambda: 0)
>>> y['foo']['bar'] += 7
>>> y['foo']['baz'] += 10
>>> y['foo']['bar'] += 1
>>> y
defaultdict(<functools.partial object at 0x1021f1310>, {'foo': defaultdict(<function <lambda> at 0x1021f3938>, {'baz': 10, 'bar': 8})})
Have a close look at collections.defaultdict:
from collections import defaultdict
foo = defaultdict(dict)
foo['bar'] = defaultdict(dict)
foo['bar']['baz'] = defaultdict(dict)
foo['bar']['baz']['aaa'] = 1
foo['bor'] = 0
foo['bir'] = defaultdict(list)
foo['bir']['biz'].append(1)
foo['bir']['biz'].append(2)
print(foo)
defaultdict(<type 'dict'>, {'bir': defaultdict(<type 'list'>, {'biz': [1, 2]}), 'bor': 0, 'bar': defaultdict(<type 'dict'>, {'baz': defaultdict(<type 'dict'>, {'aaa': 1})})})

Python - tuple unpacking in dict comprehension

I'm trying to write a function that turns strings of the form 'A=5, b=7' into a dict {'A': 5, 'b': 7}. The following code snippets are what happen inside the main for loop - they turn a single part of the string into a single dict element.
This is fine:
s = 'A=5'
name, value = s.split('=')
d = {name: int(value)}
This is not:
s = 'A=5'
d = {name: int(value) for name, value in s.split('=')}
ValueError: need more than 1 value to unpack
Why can't I unpack the tuple when it's in a dict comprehension? If I get this working then I can easily make the whole function into a single compact dict comprehension.
In your code, s.split('=') will return the list: ['A', '5']. When iterating over that list, a single string gets returned each time (the first time it is 'A', the second time it is '5') so you can't unpack that single string into 2 variables.
You could try: for name,value in [s.split('=')]
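To make that concrete, here is a short sketch of both forms, reusing s from the question (the error message text differs between Python versions, so only the exception type is checked here):

```python
s = 'A=5'
parts = s.split('=')  # ['A', '5']

# Iterating over `parts` yields 'A' and then '5'; a one-character string
# cannot be unpacked into two names.
try:
    {name: int(value) for name, value in parts}
    unpack_failed = False
except ValueError:
    unpack_failed = True

# Wrapping the list in another list makes the loop yield the whole pair once.
d = {name: int(value) for name, value in [parts]}
```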
More likely, you have an iterable of strings that you want to split -- then your dict comprehension becomes simple (2 lines):
splitstrs = (s.split('=') for s in list_of_strings)
d = {name: int(value) for name,value in splitstrs }
Of course, if you're obsessed with 1-liners, you can combine it, but I wouldn't.
Sure you could do this:
>>> s = 'A=5, b=7'
>>> {k: int(v) for k, v in (item.split('=') for item in s.split(','))}
{'A': 5, ' b': 7}
But in this case I would just use this more imperative code:
>>> d = {}
>>> for item in s.split(','):
...     k, v = item.split('=')
...     d[k] = int(v)
...
>>> d
{'A': 5, ' b': 7}
Some people tend to believe you'll go to hell for using eval, but...
s = 'A=5, b=7'
eval('dict(%s)' % s)
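Or somewhat better, restricting the names available to the expression (thanks to mgilson for pointing it out); note that this still does not make eval safe for untrusted input: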
s = 'A=5, b=7'
eval('dict(%s)' % s, {'__builtins__': None, 'dict': dict})
See mgilson's answer for why the error is happening. To achieve what you want, you could use:
d = {name: int(value) for name,value in (x.split('=',1) for x in s.split(','))}
To account for spaces, use .strip() as needed (ex.: x.strip().split('=',1)).
How about this code:
import re
a="A=5, b=9"
b=dict((x, int(y)) for x, y in re.findall(r"([a-zA-Z]+)=(\d+)", a))
print(b)
Output:
{'A': 5, 'b': 9}
This version will work with other forms of input as well, for example
a="A=5 b=9 blabla: yyy=100"
will give you
{'A': 5, 'b': 9, 'yyy': 100}
>>> strs='A=5, b=7'
>>> {x.split('=')[0].strip():int(x.split('=')[1]) for x in strs.split(",")}
{'A': 5, 'b': 7}
For readability, you should use a normal for loop instead of a comprehension:
strs='A=5, b=7'
dic={}
for x in strs.split(','):
    name,val=x.split('=')
    dic[name.strip()]=int(val)
How about this?
>>> s
'a=5, b=3, c=4'
>>> {z.split('=')[0].strip(): int(z.split('=')[1]) for z in s.split(',')}
{'a': 5, 'c': 4, 'b': 3}
Since Python 3.8, you can use the walrus operator (:=) for this kind of operation. It lets you assign variables in the middle of an expression (in this case, assigning the list created by .split('=') to kv).
s = 'A=5, b=7'
{(kv := item.split('='))[0]: int(kv[1]) for item in s.split(', ')}
# {'A': 5, 'b': 7}
One feature is that it leaks the assigned variable, kv, outside the scope it was defined in. If you want to avoid that, you can use a nested for-loop where the inner loop is over a singleton list (as suggested in mgilson's answer).
{k: int(v) for item in s.split(', ') for k,v in [item.split('=')]}
Since Python 3.9, loops over singleton lists are optimized to be as fast as simple assignments, i.e. y in [expr] is as fast as y = expr.
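A small sketch of the scoping difference described above, wrapped in functions so the effect is easy to see (the function names parse_walrus and parse_nested are made up for illustration; requires Python 3.8+):

```python
def parse_walrus(s):
    result = {(kv := item.split('='))[0]: int(kv[1]) for item in s.split(', ')}
    # kv is bound in the enclosing function scope and still holds the last pair.
    return result, kv

def parse_nested(s):
    result = {k: int(v) for item in s.split(', ') for k, v in [item.split('=')]}
    # k and v stay local to the comprehension and are not visible here.
    return result, ('k' in locals() or 'v' in locals())

walrus_result, leaked = parse_walrus('A=5, b=7')
nested_result, nested_leaked = parse_nested('A=5, b=7')
```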
