Is there a way to get the original, consistent list of keys from a defaultdict even after nonexistent keys have been requested?
>>> from collections import defaultdict
>>> d = defaultdict(lambda: 'default', {'key1': 'value1', 'key2' :'value2'})
>>>
>>> d.keys()
['key2', 'key1']
>>> d['bla']
'default'
>>> d.keys() # how to get the same: ['key2', 'key1']
['key2', 'key1', 'bla']
You have to exclude the keys that have the default value:
>>> [i for i in d if d[i]!=d.default_factory()]
['key2', 'key1']
Time comparison with the method suggested by Jean:
>>> import time
>>> def funct(a=None,b=None,c=None):
...     s=time.time()
...     eval(a)
...     print(time.time()-s)
...
>>> funct("[i for i in d if d[i]!=d.default_factory()]")
9.29832458496e-05
>>> funct("[k for k,v in d.items() if v!=d.default_factory()]")
0.000100135803223
>>> # Storing the default value in a variable and reusing it in the list comprehension reduces the time somewhat.
>>> defa=d.default_factory()
>>> funct("[i for i in d if d[i]!=defa]")
8.82148742676e-05
>>> funct("[k for k,v in d.items() if v!=defa]")
9.79900360107e-05
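Single time.time() measurements like the ones above are noisy, so the differences may not be meaningful. As a rough sketch (assuming the factory returns a constant, and using a hypothetical namespace dict `ns` to hand the names to timeit), the same comparison can be repeated many times with the timeit module:

```python
import timeit
from collections import defaultdict

d = defaultdict(lambda: 'default', {'key1': 'value1', 'key2': 'value2'})
d['bla']                      # inserts the stray key, as in the question
defa = d.default_factory()    # assumes the factory returns a constant

# Namespace handed to timeit so the timed statements can see d and defa.
ns = {'d': d, 'defa': defa}

t_index = timeit.timeit("[i for i in d if d[i] != defa]", globals=ns, number=10000)
t_items = timeit.timeit("[k for k, v in d.items() if v != defa]", globals=ns, number=10000)

print('indexing: %.6f s, items(): %.6f s' % (t_index, t_items))
```

The absolute numbers depend on the machine and Python version, so treat them only as a relative comparison.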
[key for key in d.keys() if d[key] != 'default']
default_factory is a callable and need not return the same value each time, so comparing values against default_factory() is unreliable:
>>> from collections import defaultdict
>>> from random import random
>>> d = defaultdict(lambda: random())
>>> d[1]
0.7411252345322932
>>> d[2]
0.09672701444816645
>>> d.keys()
dict_keys([1, 2])
>>> d.default_factory()
0.06277993247659297
>>> d.default_factory()
0.4388136209046052
>>> d.keys()
dict_keys([1, 2])
>>> [k for k in d.keys() if d[k] != d.default_factory()]
[1, 2]
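Since the factory's return value cannot be relied on for filtering, one alternative is to avoid creating the stray keys in the first place. A minimal sketch, assuming you control the lookup code: dict.get() and the in operator do not call default_factory, so they leave the key set untouched, whereas indexing inserts the missing key.

```python
from collections import defaultdict

d = defaultdict(lambda: 'default', {'key1': 'value1', 'key2': 'value2'})

# dict.get() and `in` never trigger default_factory, so no key is added.
value = d.get('bla', d.default_factory())
has_bla = 'bla' in d
keys_after_get = sorted(d)

# Indexing, by contrast, stores the default under the missing key.
_ = d['bla']
keys_after_index = sorted(d)

print(value, has_bla, keys_after_get, keys_after_index)
```

If the original keys are needed after indexing has already happened, taking a copy of the keys (for example list(d)) before any lookups is another option.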
I have a dictionary looking like this:
d ={'key1':{'key2':{'key11':{'key12':'value 13'}}},'key3':[{'key4':'value2', 'key5': 'value3'}]}
I want to get the value for 'key12' so I can do this:
d.get('key1').get('key2').get('key11').get('key12')
and it will return this:
'value 13'
if I had a list like this:
['key1', 'key2', 'key11', 'key12']
how could I call the get recursively over the above list to return the same result?
You can use functools.reduce:
>>> from functools import reduce
>>> keys = ['key1', 'key2', 'key11', 'key12']
>>> reduce(dict.get, keys, d)
'value 13'
>>> # or: reduce(lambda x, y: x.get(y), keys, d)
In Python 3.8+ you can use the initial keyword argument of itertools.accumulate:
>>> from itertools import accumulate
>>> list(accumulate(keys, dict.get, initial=d))[-1]
'value 13'
I would still prefer functools.reduce, even though Guido doesn't
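Note that reduce(dict.get, keys, d) raises a TypeError as soon as an intermediate key is missing, because dict.get is then called on None. A possible sketch of a more forgiving variant follows; get_nested is a hypothetical helper name, not something from the standard library:

```python
from functools import reduce

def get_nested(mapping, keys, default=None):
    """Follow keys through nested dicts; return default if any step fails."""
    sentinel = object()  # unique marker meaning "lookup failed"

    def step(current, key):
        # Only dicts can be descended into; anything else is a miss.
        if isinstance(current, dict):
            return current.get(key, sentinel)
        return sentinel

    result = reduce(step, keys, mapping)
    return default if result is sentinel else result

d = {'key1': {'key2': {'key11': {'key12': 'value 13'}}},
     'key3': [{'key4': 'value2', 'key5': 'value3'}]}

found = get_nested(d, ['key1', 'key2', 'key11', 'key12'])
missing = get_nested(d, ['key1', 'nope', 'key11'], default='n/a')
print(found, missing)
```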
d ={'key1':{'key2':{'key11':{'key12':'value 13'}}},'key3':[{'key4':'value2', 'key5': 'value3'}]}
ans = d.get('key1').get('key2').get('key11').get('key12')
print(ans)
key_list = ['key1', 'key2', 'key11', 'key12']
for key in key_list:
    d = d[key]
print(d)
Output:
value 13
value 13
You can use the following function which gets the dictionary and the list and returns the requested value:
def func(a, b):
    for key in b:
        a = a.get(key)
        if a is None:
            raise ValueError('There is no such key as ' + str(key))
    return a
Usage:
a = {'key1':{'key2':{'key11':{'key12':'value 13'}}},'key3':[{'key4':'value2', 'key5': 'value3'}]}
b = ['key1', 'key2', 'key11', 'key12']
print(func(a, b))
Output:
value 13
I have two dictionaries. I need to find the keys missing from each one and, for the keys they share, check whether the values are the same or different.
dict1 = {..}
dict2 = {..}
#key values in a list that are missing in each
missing_in_dict1_but_in_dict2 = []
missing_in_dict2_but_in_dict1 = []
#key values in a list that are mismatched between the 2 dictionaries
mismatch = []
What's the most efficient way to do this?
You can use dictionary view objects, which act as sets. Subtract sets to get the difference:
missing_in_dict1_but_in_dict2 = dict2.keys() - dict1
missing_in_dict2_but_in_dict1 = dict1.keys() - dict2
For the keys that are the same, use the intersection, with the & operator:
mismatch = {key for key in dict1.keys() & dict2 if dict1[key] != dict2[key]}
If you are still using Python 2, use dict.viewkeys().
Using dictionary views to produce intersections and differences is very efficient. The view objects themselves are very lightweight, and the algorithms that create the new sets from the set operations can make direct use of the O(1) lookup behaviour of the underlying dictionaries.
Demo:
>>> dict1 = {'foo': 42, 'bar': 81}
>>> dict2 = {'bar': 117, 'spam': 'ham'}
>>> dict2.keys() - dict1
{'spam'}
>>> dict1.keys() - dict2
{'foo'}
>>> {key for key in dict1.keys() & dict2 if dict1[key] != dict2[key]}
{'bar'}
and a performance comparison with creating separate set() objects:
>>> import timeit
>>> import random
>>> def difference_views(d1, d2):
...     missing1 = d2.keys() - d1
...     missing2 = d1.keys() - d2
...     mismatch = {k for k in d1.keys() & d2 if d1[k] != d2[k]}
...     return missing1, missing2, mismatch
...
>>> def difference_sets(d1, d2):
...     missing1 = set(d2) - set(d1)
...     missing2 = set(d1) - set(d2)
...     mismatch = {k for k in set(d1) & set(d2) if d1[k] != d2[k]}
...     return missing1, missing2, mismatch
...
>>> testd1 = {random.randrange(1000000): random.randrange(1000000) for _ in range(10000)}
>>> testd2 = {random.randrange(1000000): random.randrange(1000000) for _ in range(10000)}
>>> timeit.timeit('d(d1, d2)', 'from __main__ import testd1 as d1, testd2 as d2, difference_views as d', number=1000)
1.8643521590274759
>>> timeit.timeit('d(d1, d2)', 'from __main__ import testd1 as d1, testd2 as d2, difference_sets as d', number=1000)
2.811345119960606
Using set() objects is slower, especially when your input dictionaries get larger.
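To illustrate why the views are cheap, here is a small sketch (Python 3, variable names chosen only for illustration): a keys view is a live window onto the dictionary rather than a copy, while set(d) copies every key up front. Set operations on a view still return an ordinary set.

```python
d = {'foo': 42, 'bar': 81}

view = d.keys()      # no copy: reflects later changes to d
snapshot = set(d)    # copy: frozen at the moment of creation

d['spam'] = 'ham'

view_has_spam = 'spam' in view          # the view sees the new key
snapshot_has_spam = 'spam' in snapshot  # the copy does not

# Set operations on a view produce a plain set.
diff = view - {'foo'}
print(view_has_spam, snapshot_has_spam, sorted(diff))
```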
One easy way is to create sets from the dict keys and subtract them:
>>> dict1 = { 'a': 1, 'b': 1 }
>>> dict2 = { 'b': 1, 'c': 1 }
>>> missing_in_dict1_but_in_dict2 = set(dict2) - set(dict1)
>>> missing_in_dict1_but_in_dict2
set(['c'])
>>> missing_in_dict2_but_in_dict1 = set(dict1) - set(dict2)
>>> missing_in_dict2_but_in_dict1
set(['a'])
Or you can avoid casting the second dict to a set by using .difference():
>>> set(dict1).difference(dict2)
set(['a'])
>>> set(dict2).difference(dict1)
set(['c'])
I have a dictionary whose keys and values I need to split into, say, two lists (or any other type that does the job) and later, in another function, rebuild the exact same dictionary from those keys and values. What's the right way to approach this?
You can use dict.items() to get all the key-value pairs from the dictionary, then either store them directly...
>>> d = {"foo": 42, "bar": 23}
>>> items = list(d.items())
>>> dict(items)
{'bar': 23, 'foo': 42}
... or distribute them to two separate lists, using zip:
>>> keys, values = zip(*d.items())
>>> dict(zip(keys, values))
{'bar': 23, 'foo': 42}
d = {'jack': 4098, 'sape': 4139}
k, v = d.keys(), d.values()
# Do stuff with keys and values
# -
# Create new dict from keys and values
nd = dict(zip(k, v))
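One caveat with the zip(*d.items()) form is that unpacking into two names raises a ValueError when the dictionary is empty. A minimal sketch of a round trip that also handles the empty case follows; split_dict and join_dict are hypothetical helper names. It relies on keys() and values() iterating in the same order, which holds for an unmodified dict.

```python
def split_dict(d):
    """Return (keys, values) as two parallel lists."""
    return list(d.keys()), list(d.values())

def join_dict(keys, values):
    """Rebuild a dict from two parallel sequences."""
    return dict(zip(keys, values))

d = {'jack': 4098, 'sape': 4139}
keys, values = split_dict(d)
rebuilt = join_dict(keys, values)

empty_keys, empty_values = split_dict({})
rebuilt_empty = join_dict(empty_keys, empty_values)

print(keys, values, rebuilt, rebuilt_empty)
```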
Better not to deconstruct it at all. Wherever you need the keys and values as lists, you can get them as follows (d is your dictionary):
keyList = list(d.keys())
valueList = [d[key] for key in keyList]  # or simply: list(d.values())
Hope it helps.
To deconstruct a dict into two lists:
>>> test_dict={"a":1, "b":2}
>>> keyList=[]
>>> valueList =[]
>>> for key,val in test_dict.items():
...     keyList.append(key)
...     valueList.append(val)
...
>>> print(valueList)
[1, 2]
>>> print(keyList)
['a', 'b']
To construct a dict from two lists of keys and values, I would use zip along with a dict comprehension:
>>> {key:val for key,val in zip(keyList,valueList)}
{'a': 1, 'b': 2}
Can someone help me read this Python dictionary of tuples?
I am new to Python and I can't make much sense of it.
Grammar = {'AB':('S', 'B'), 'BB':'A', 'a':'A', 'b':'B'}
Note:
The grammar is the grammar of a Context Free Grammar.
To convert the dictionary into what you want you can do something like this:
>>> from collections import defaultdict
>>> grammar = {'AB':('S', 'B'), 'BB':'A', 'a':'A', 'b':'B'}
>>> tmp_result = defaultdict(list)
>>> def tuplify(val):
...     if not isinstance(val, tuple):
...         val = (val,)
...     return val
...
>>> for key, value in grammar.items():
...     values = tuplify(value)
...     for val in values:
...         tmp_result[val].append(key)
...
>>> tmp_result
defaultdict(<type 'list'>, {'A': ['a', 'BB'], 'S': ['AB'], 'B': ['AB', 'b']})
>>> result = {key:tuple(val) for key, val in tmp_result.items()}
>>> result
{'A': ('a', 'BB'), 'S': ('AB',), 'B': ('AB', 'b')}
Here collections.defaultdict is a dict-like class that uses a factory to create a default value when a key is missing. For example, writing:
>>> D = defaultdict(list)
>>> D[5].append(3)
>>> D[5]
[3]
can be written using normal dicts like:
>>> D = {}
>>> if 5 in D: # key present, use that value
...     val = D[5]
... else: # otherwise create a default value and store it
...     val = list()
...     D[5] = val
...
>>> val.append(3)
>>> D[5]
[3]
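For plain dicts, the if/else above can also be condensed with dict.setdefault, which returns the existing value or stores and returns the supplied default. A small sketch:

```python
D = {}

# setdefault returns D[5] if present; otherwise it stores [] and returns it.
D.setdefault(5, []).append(3)
D.setdefault(5, []).append(4)

print(D)
```

Unlike defaultdict, the default ([] here) is constructed on every call, even when the key already exists.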
The "factory" passed to defaultdict(factory) can be any callable that takes no arguments, for example:
>>> n = 0
>>> def factory():
...     global n
...     print('Factory called!')
...     n += 1
...     return n # returns numbers 1, 2, 3, 4, ...
...
>>> D = defaultdict(factory)
>>> D[0]
Factory called!
1
>>> D[0] # the key exists, thus the factory is not called.
1
>>> D[1]
Factory called!
2
>>> D[1]
2
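Built-in types are themselves zero-argument callables, so they can serve as factories directly. As a brief illustration (the input string is arbitrary), int() returns 0, which makes defaultdict(int) a convenient counter:

```python
from collections import defaultdict

counts = defaultdict(int)  # int() returns 0 for missing keys
for ch in 'abracadabra':
    counts[ch] += 1

print(dict(counts))
```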
Pardon me for not finding a better title.
Say I have two lists:
list1 = ["123", "123", "123", "456"]
list2 = ["0123", "a123", "1234", "null"]
which describe a mapping (see this question). I want to create a dict from those lists, knowing that list1 contains the keys and list2 the values. The dict in this case should be:
dict1 = {"123":("0123", "a123", "1234"), "456":("null",)}
because list1 informs us that "123" is associated to three values.
How could I programmatically generate such a dictionary?
from collections import defaultdict
dd = defaultdict(list)
for key, val in zip(list1, list2):
    dd[key].append(val)
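The loop above leaves lists as values. If the tuple values from the question are required, one possible final step is a dict comprehension over the result. A self-contained sketch, using the lists from the question:

```python
from collections import defaultdict

list1 = ["123", "123", "123", "456"]
list2 = ["0123", "a123", "1234", "null"]

dd = defaultdict(list)
for key, val in zip(list1, list2):
    dd[key].append(val)

# Convert each list to a tuple and drop the defaultdict behaviour.
dict1 = {key: tuple(vals) for key, vals in dd.items()}
print(dict1)
```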
defaultdict() is your friend:
>>> from collections import defaultdict
>>> result = defaultdict(tuple)
>>> for key, value in zip(list1, list2):
...     result[key] += (value,)
...
This produces tuples; if lists are fine, use Jon Clement's variation of the same technique.
>>> from collections import defaultdict
>>> list1 = ["123", "123", "123", "456"]
>>> list2 = ["0123", "a123", "1234", "null"]
>>> d = defaultdict(list)
>>> for i, key in enumerate(list1):
...     d[key].append(list2[i])
...
>>> d
defaultdict(<type 'list'>, {'123': ['0123', 'a123', '1234'], '456': ['null']})
>>>
And a non-defaultdict solution:
from itertools import groupby
from operator import itemgetter
dict( (k, tuple(map(itemgetter(1), v))) for k, v in groupby(sorted(zip(list1,list2)), itemgetter(0)))
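One thing to watch with the groupby version: sorted(zip(list1, list2)) sorts whole (key, value) pairs, so the values within each group end up sorted too, and "1234" lands before "a123". If the original value order matters, a possible variation is to sort on the key only, relying on the fact that Python's sort is stable:

```python
from itertools import groupby
from operator import itemgetter

list1 = ["123", "123", "123", "456"]
list2 = ["0123", "a123", "1234", "null"]

# Sort by key only; the stable sort keeps values in their original order.
pairs = sorted(zip(list1, list2), key=itemgetter(0))
result = {k: tuple(v for _, v in group)
          for k, group in groupby(pairs, key=itemgetter(0))}

# For comparison: sorting whole pairs also reorders the values.
fully_sorted = {k: tuple(map(itemgetter(1), group))
                for k, group in groupby(sorted(zip(list1, list2)), itemgetter(0))}

print(result)
print(fully_sorted)
```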