I have a dictionary like this:
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
and want the inverse like this:
dict2 = dict({1:['a','b','c'], 2:['a','b','c'], 3:['a','b'], 4:['b']})
Like these questions:
Inverse Dict in Python
In-place dictionary inversion in Python
But I want to do it with non-unique keys and I don't want in-place conversion. I have some code working, but I was wondering if there's a dictionary comprehension way of doing this.
from collections import defaultdict
dict2 = defaultdict(list)
for i in dict1:
    for j in dict1[i]:
        dict2[j].append(i)
I tried this, but it only works for unique mappings. By unique I mean something like "for each value, there is only one key under which the value is listed". So unique mapping: '1: [a], 2: [b], 3: [c] -> a: [1], b: [2], c: [3]' VS non-unique mapping '1: [a], 2: [a, b], 3: [b, c] -> a: [1, 2], b: [2, 3], c: [3]'
dict2 = {j: i for i in dict1 for j in dict1[i]}
I think it must be something like this:
dict2 = {j: [i for i in dict1 if j in dict1[i]] for j in dict1[i]} # I know this doesn't work
Besides it not working, it seems like a comprehension like this would be inefficient. Is there an efficient, one liner way of doing this?
Standard dict:
>>> dict2 = {}
>>> for key, values in dict1.items():
...     for value in values:
...         dict2.setdefault(value, []).append(key)
...
>>> dict2
{1: ['a', 'c', 'b'], 2: ['a', 'c', 'b'], 3: ['a', 'b'], 4: ['b']}
defaultdict:
>>> dict2 = defaultdict(list)
>>> for key, values in dict1.items():
...     for value in values:
...         dict2[value].append(key)
...
>>> dict2
{1: ['a', 'c', 'b'], 2: ['a', 'c', 'b'], 3: ['a', 'b'], 4: ['b']}
I figured out an answer based on Vroomfondel's answer:
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = {item: [key for key in dict1 if item in dict1[key]] for value in dict1.values() for item in value}
This isn't the fastest, but it is a one liner and it is not the slowest of the options presented!
from timeit import timeit
methods = [['Vroomfondel1', '''from collections import defaultdict
import itertools
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
for k,v in itertools.chain.from_iterable([itertools.product(vals,key) for key,vals in dict1.items()]):
    dict2[k].append(v)'''],
['Vroomfondel2', '''from collections import defaultdict
import itertools
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
[dict2[k].append(v) for k,v in itertools.chain.from_iterable([itertools.product(vals,key) for key,vals in dict1.items()])]'''],
['***Vroomfondel2 mod', '''dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = {item: [key for key in dict1 if item in dict1[key]] for value in dict1.values() for item in value}'''],
['mhlester1', '''dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = {}
for key, values in dict1.items():
    for value in values:
        dict2.setdefault(value, []).append(key)'''],
['mhlester1 mod', '''from collections import defaultdict
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
for key, values in dict1.items():
    for value in values:
        dict2[value].append(key)'''],
['mhlester2', '''from collections import defaultdict
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
for key, values in dict1.items():
    for value in values:
        dict2[value].append(key)'''],
['initial', '''from collections import defaultdict
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
for i in dict1:
    for j in dict1[i]:
        dict2[j].append(i)''']
]
for method in methods:
    print "% 15s" % (method[0]), '\t', timeit(method[1], number=10000)
prints out:
Vroomfondel1 0.202519893646
Vroomfondel2 0.164724111557
***Vroomfondel2 mod 0.114083051682
mhlester1 0.0599339008331
mhlester1 mod 0.091933965683
mhlester2 0.0900268554688
initial 0.0953099727631
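For anyone rerunning this on Python 3, here is a minimal sketch of the same kind of comparison using callables instead of source strings. The function names invert_loop and invert_comprehension are made up for this example, and the timings will of course differ from the numbers above:

```python
from collections import defaultdict
from timeit import timeit

dict1 = {'a': [1, 2, 3], 'b': [1, 2, 3, 4], 'c': [1, 2]}

def invert_loop(d):
    # Visits each (key, value) pair exactly once.
    inverse = defaultdict(list)
    for key, values in d.items():
        for value in values:
            inverse[value].append(key)
    return dict(inverse)

def invert_comprehension(d):
    # One-liner: rescans every key of d for each item it encounters.
    return {item: [key for key in d if item in d[key]]
            for values in d.values() for item in values}

timings = {}
for func in (invert_loop, invert_comprehension):
    # timeit accepts a zero-argument callable as well as a source string.
    timings[func.__name__] = timeit(lambda: func(dict1), number=10000)
    print("%-22s %.4f" % (func.__name__, timings[func.__name__]))
```

Both functions return the same mapping for this input; only the amount of work differs.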
As a one-liner (thanks to mhlester's input), but with so-so readability (it only works because the values in dict2 are mutable lists and setdefault returns a reference to the stored list):
import itertools
[dict2.setdefault(k,[]).append(v) for k,v in itertools.chain.from_iterable([itertools.product(vals,[key]) for key,vals in dict1.items()])]
Or with a for loop:
import collections
import itertools
dict2=collections.defaultdict(list)
for k,v in itertools.chain.from_iterable([itertools.product(vals,[key]) for key,vals in dict1.items()]):
    dict2[k].append(v)
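To illustrate why the setdefault one-liner works, here is a small sketch (the variable names are just for illustration) showing that setdefault hands back the very list stored in the dict, so appending to the return value changes the dict:

```python
inverse = {}

# setdefault inserts the default when the key is missing and returns the stored value.
first = inverse.setdefault(1, [])
first.append('a')

# On the second call the key already exists, so the existing list comes back
# and the fresh [] passed as the default is simply discarded.
second = inverse.setdefault(1, [])
second.append('b')

print(first is second)  # True
print(inverse)          # {1: ['a', 'b']}
```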
For example, in dict1 the keys 1, 2, and '4' all have the same value 'a', but the keys '3' and '5' have different values, 'b' and 'd'. What I want is:
If N keys have the same value and N >=3, then I want to remove all other elements from the dict and only keep those N key values, which means 'b' & 'd' have to be removed from the dict.
The following code works, but it seems very verbose. Is there a better way to do this?
from collections import defaultdict
dict1 = {1:'a', 2:'a', '3':'b', '4': 'a', '5':'d'}
l1 = [1, 2, 3, 4, 5]
dict2 = defaultdict(list)
for k, v in dict1.items():
    dict2[v].append(k)
to_be_removed = []
is_to_be_removed = False
for k, values in dict2.items():
    majority = len(values)
    if majority>=3:
        is_to_be_removed = True
    else:
        to_be_removed.extend(values)
if is_to_be_removed:
    for d in to_be_removed:
        del dict1[d]
print(f'New dict: {dict1}')
You can use collections.Counter to get the frequency of every value, then use a dictionary comprehension to retain only the keys that have the desired corresponding value:
from collections import Counter
dict1 = {1:'a', 2:'a', '3':'b', '4': 'a', '5':'d'}
ctr = Counter(dict1.values())
result = {key: value for key, value in dict1.items() if ctr[value] >= 3}
print(result)
This outputs:
{1: 'a', 2: 'a', '4': 'a'}
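One difference worth noting: the loop in the question leaves the dict untouched when no value occurs at least 3 times, whereas the comprehension above would return an empty dict in that case. A possible sketch that keeps the question's behaviour, using a hypothetical helper keep_frequent with an assumed min_count parameter:

```python
from collections import Counter

def keep_frequent(mapping, min_count=3):
    # Count how many keys share each value.
    counts = Counter(mapping.values())
    kept = {k: v for k, v in mapping.items() if counts[v] >= min_count}
    # Mirror the question's loop: if no value is frequent enough, keep everything.
    return kept if kept else dict(mapping)

dict1 = {1: 'a', 2: 'a', '3': 'b', '4': 'a', '5': 'd'}
print(keep_frequent(dict1))               # {1: 'a', 2: 'a', '4': 'a'}
print(keep_frequent(dict1, min_count=4))  # unchanged copy of dict1
```

Unlike the original loop, this returns a new dict rather than deleting from dict1 in place.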
While I've been improving my Python skills I have one question.
My code is below:
# def invertDictionary(dict):
#     new_dict = {}
#     for key, value in dict.items():
#         if value in new_dict:
#             new_dict[value].append(key)
#         else:
#             new_dict[value]=[key]
#     return new_dict
def invertDictionary(dict):
    new_dict = {value:([key] if value else [key]) for key, value in dict.items()}
    return new_dict
invertDictionary({'a':3, 'b':3, 'c':3})
I am trying to get output like {3:['a','b','c']}. I have achieved that using a normal for-loop; I just want to know how to get these results using a Dictionary Comprehension. I tried but in append it's getting an error. Please let me know how to achieve this.
Thanks in Advance!
You missed that you also need a list comprehension to build the list.
Iterate over the values in the dict, and build the needed list of keys for each one.
Note that this is a quadratic process, whereas the canonical (and more readable) for loop is linear.
d = {'a':3, 'b':3, 'c':3, 'e':4, 'f':4, 'g':0}
inv_dict = {v: [key for key, val in d.items() if val == v]
            for v in set(d.values())}
result:
{0: ['g'],
3: ['a', 'b', 'c'],
4: ['e', 'f']
}
Will this do?
While your original version with a regular for loop is the best solution for this, here is a variation on Prune's answer that doesn't go over the dict multiple times:
>>> import itertools
>>> d = {'a':3, 'b':3, 'c':3, 'e':4, 'f':4, 'g':0}
>>> {group_key:[k for k,_ in dict_items]
...  for group_key,dict_items in itertools.groupby(
...      sorted(d.items(),key=lambda x:x[-1]),
...      key=lambda x:x[-1]
...  )
... }
{0: ['g'], 3: ['a', 'b', 'c'], 4: ['e', 'f']}
>>>
First we sort the items of the dict by value, passing sorted a key function (a lambda) that extracts the value part of each item tuple. Then we use groupby with the same key function to group the items that share a value, and finally a list comprehension extracts just the keys.
--
As noted by Kelly, we can make it shorter by using the dict's get method as the key function and relying on the fact that iterating over a dict gives you its keys:
>>> {k: list(g) for k, g in itertools.groupby(sorted(d, key=d.get), d.get)}
{0: ['g'], 3: ['a', 'b', 'c'], 4: ['e', 'f']}
>>>
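The sorting step matters because groupby only merges adjacent items with the same key. A small sketch with a deliberately unsorted dict (my own sample data, not from the question) shows what happens without it:

```python
from itertools import groupby

d = {'a': 3, 'e': 4, 'b': 3, 'g': 0}

# Without sorting, the two keys with value 3 are not adjacent,
# so groupby yields two separate groups for 3.
unsorted_groups = [(value, list(keys)) for value, keys in groupby(d, key=d.get)]
print(unsorted_groups)
# [(3, ['a']), (4, ['e']), (3, ['b']), (0, ['g'])]

# Sorting by value first makes equal values adjacent.
sorted_groups = [(value, list(keys))
                 for value, keys in groupby(sorted(d, key=d.get), key=d.get)]
print(sorted_groups)
# [(0, ['g']), (3, ['a', 'b']), (4, ['e'])]
```

In a dict comprehension the later group for 3 would silently overwrite the earlier one, so the unsorted version would lose 'a'.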
You could use a defaultdict and the append method.
from collections import defaultdict
dict1 = {'a': 3, 'b': 3, 'c': 3}
dict2 = defaultdict(list)
{dict2[v].append(k) for k, v in dict1.items()}
dict2
>>> defaultdict(list, {3: ['a', 'b', 'c']})
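A side note on the set comprehension above: it is used only for its side effect, and since append returns None it also builds a throwaway set. A short sketch (the names leftover and dict3 are just for illustration) comparing it with the plain loop:

```python
from collections import defaultdict

dict1 = {'a': 3, 'b': 3, 'c': 3}

dict2 = defaultdict(list)
# list.append returns None, so the comprehension itself evaluates to {None}.
leftover = {dict2[v].append(k) for k, v in dict1.items()}
print(leftover)  # {None}

# The plain loop fills the same mapping without the throwaway set.
dict3 = defaultdict(list)
for k, v in dict1.items():
    dict3[v].append(k)

print(dict2 == dict3)  # True
```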
Say I have the following dictionary.
>> sample_dict = {"1": ['a','b','c'], "2": ['d','e','f'], "3": ['g','h','a']}
I would like to find a way to look at the values of each key and report whether any of the value lists share a duplicate element.
For example it would output:
>> [["1","3"] , ['a']]
I've looked at a few of the posts here and tried to use and/or change them to accomplish this, however none of what I have found has worked as intended. They would work if it was as follows:
>> sample_dict = {"1": ['a','b','c'], "2": ['d','e','f'], "3": ['a','b','c']}
but not if only a single value within the list was the same.
You could use another dictionary to map the values to the lists of corresponding keys. Then just select the values that map to more than one key, e.g.:
from collections import defaultdict
sample_dict = {'1': ['a','b','c'], '2': ['d','e','f'], '3': ['g','h','a']}
d = defaultdict(list) # automatically initialize every value to a list()
for k, v in sample_dict.items():
    for x in v:
        d[x].append(k)
for k, v in d.items():
    if len(v) > 1:
        print([v, k])
Output:
[['1', '3'], 'a']
If the list elements are hashable, you can use .setdefault to build an inverse mapping like so:
>>> sample_dict = {"1": ['a','b','c'], "2": ['d','e','f'], "3": ['g','h','a']}
>>> aux = {}
>>> for k, v in sample_dict.items():
...     for i in v:
...         aux.setdefault(i, []).append(k)
...
>>> [[v, k] for k, v in aux.items() if len(v) > 1]
[[['1', '3'], 'a']]
Dictionaries map from keys to values, not from values to keys. But you can write a function for one-off calculations. This will incur O(n) time complexity and is not recommended for larger dictionaries:
def find_keys(d, val):
    return [k for k, v in d.items() if val in v]
res = find_keys(sample_dict, 'a') # ['1', '3']
If you do this often, I recommend you "invert" your dictionary via collections.defaultdict:
from collections import defaultdict
dd = defaultdict(list)
for k, v in sample_dict.items():
    for w in v:
        dd[w].append(k)
print(dd)
defaultdict(<class 'list'>, {'a': ['1', '3'], 'b': ['1'], 'c': ['1'], 'd': ['2'],
'e': ['2'], 'f': ['2'], 'g': ['3'], 'h': ['3']})
This costs O(n) for the inversion, as well as additional memory, but now allows you to access the keys associated with an input value in O(1) time, e.g. dd['a'] will return ['1', '3'].
You can use defaultdict from the collections module to do this, for example:
from collections import defaultdict
sample_dict = {"1": ['a','b','c'], "2": ['d','e','f'], "3": ['g','h','a']}
d = defaultdict(list)
for keys, vals in sample_dict.items():
    for v in vals:
        d[v].append(keys)
print(d)
d is a dict whose keys are the individual list elements and whose values are the lists of keys under which each element appears; elements that appear under more than one key are the repeated ones.
The output of the above code is defaultdict(list,{'a': ['1', '3'],'b': ['1'],'c': ['1'],'d': ['2'],'e': ['2'],'f': ['2'],'g': ['3'],'h': ['3']})
Although it is possible to get the output in the form you asked for, it is generally not recommended: recording which character is repeated in which lists is a natural job for a dictionary.
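If the list-of-lists shape from the question is still wanted, one possible sketch is to filter the inverted dict down to the repeated elements and reshape it afterwards. The names inverse, duplicates, and pairs are made up for this example:

```python
from collections import defaultdict

sample_dict = {"1": ['a', 'b', 'c'], "2": ['d', 'e', 'f'], "3": ['g', 'h', 'a']}

# Invert: element -> list of keys whose list contains it.
inverse = defaultdict(list)
for key, values in sample_dict.items():
    for value in values:
        inverse[value].append(key)

# Keep only the elements that occur under more than one key.
duplicates = {value: keys for value, keys in inverse.items() if len(keys) > 1}
print(duplicates)  # {'a': ['1', '3']}

# Reshape into [keys, [element]] pairs, one pair per repeated element.
pairs = [[keys, [value]] for value, keys in duplicates.items()]
print(pairs)  # [[['1', '3'], ['a']]]
```

This assumes each list holds an element at most once; an element repeated inside a single list would make that key show up twice.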
I have two python lists:
keys=[1,2,2,3,2,3]
values=['apple','book','pen','soccer','paper','tennis']
The "keys" list holds the cluster ID for the corresponding word in the "values" list. I wish to print key-value pairs using
keys=[1,2,2,3,2,3]
values=['apple','book','pen','soccer','paper','tennis']
dictionary = dict(zip(keys, values))
for key, value in dictionary.items() :
    print (key, value)
But it only prints
1 apple
2 paper
3 tennis
What I actually want is to get all values for all keys like this
1 [apple]
2 [book,pen,paper]
3 [soccer,tennis]
I know that my current code should logically print the first output as keys are unique. But how can I change it so that it will print all values for all keys? Thank you in advance!
from collections import defaultdict
keys=[1,2,2,3,2,3]
values=['apple','book','pen','soccer','paper','tennis']
d = defaultdict(list)
for k, v in zip(keys, values):
    d[k].append(v)
Looks like what you want is a mapping from one key to multiple values, one way to accomplish it would be:
from collections import defaultdict
d = defaultdict(list)
keys=[1,2,2,3,2,3]
values=['apple','book','pen','soccer','paper','tennis']
for key, value in zip(keys, values):  # unpack each pair instead of shadowing the built-in tuple
    d[key].append(value)
print(d) # defaultdict(<class 'list'>, {1: ['apple'], 2: ['book', 'pen', 'paper'], 3: ['soccer', 'tennis']})
You can use itertools:
import itertools
keys=[1,2,2,3,2,3]
values=['apple','book','pen','soccer','paper','tennis']
final_data = {a:[i[0] for i in b] for a, b in [(a, list(b)) for a, b in itertools.groupby(sorted(zip(values, keys), key=lambda x:x[-1]), key=lambda x:x[-1])]}
Output:
{1: ['apple'], 2: ['book', 'pen', 'paper'], 3: ['soccer', 'tennis']}
pure python also works
keys=[1,2,2,3,2,3]
values=['apple','book','pen','soccer','paper','tennis']
d = dict(zip(keys, [[] for _ in keys])) # dict w keys, empty lists as values
for k, v in zip(keys, values):
    d[k].append(v)
d
Out[128]: {1: ['apple'], 2: ['book', 'pen', 'paper'], 3: ['soccer', 'tennis']}
Two methods:
You can use defaultdict, as many others have already suggested.
The data is:
keys=[1,2,2,3,2,3]
values=['apple','book','pen','soccer','paper','tennis']
Method: 1
import collections
d=collections.defaultdict(list)
for i in zip(keys,values):
    d[i[0]].append(i[1])
print(d)
output:
{1: ['apple'], 2: ['book', 'pen', 'paper'], 3: ['soccer', 'tennis']}
Or if you want to develop your own logic without importing any external module then you can try:
result={}
for i in zip(keys,values):
    if i[0] not in result:
        result[i[0]]=[i[1]]
    else:
        result[i[0]].append(i[1])
print(result)
output:
{1: ['apple'], 2: ['book', 'pen', 'paper'], 3: ['soccer', 'tennis']}
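To get the exact line-per-key layout from the question, the grouped dict can be formatted afterwards. A minimal sketch, assuming the bracketed, comma-separated layout shown in the question is what is wanted (the names grouped and lines are my own):

```python
keys = [1, 2, 2, 3, 2, 3]
values = ['apple', 'book', 'pen', 'soccer', 'paper', 'tennis']

# Group the words by cluster ID, preserving their original order.
grouped = {}
for key, value in zip(keys, values):
    grouped.setdefault(key, []).append(value)

# Format each group as "<key> [word,word,...]".
lines = ['%s [%s]' % (key, ','.join(words)) for key, words in grouped.items()]
for line in lines:
    print(line)
# 1 [apple]
# 2 [book,pen,paper]
# 3 [soccer,tennis]
```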
I want to count the number of occurrences of each value in a dictionary, and construct a new one with the value as the key and a list of the keys that had said value as the value.
Input :
b = {'a':3,'b':3,'c':8,'d':3,'e':8}
Output:
c = { '3':[a, b, d],
      '8':[c, e]
}
I've written the following, but it raises a KeyError and doesn't give any output. Could someone help?
def dictfreq(b):
    counter = dict()
    for k,v in b.iteritems():
        if v not in counter:
            counter[v].append(k)
        else:
            counter[v].append(k)
    return counter
print dictfreq(b)
A better way to achieve this is to use collections.defaultdict. For example:
from collections import defaultdict
b = {'a':3,'b':3,'c':8,'d':3,'e':8}
new_dict = defaultdict(list) # `list` as default value
for k, v in b.items():
    new_dict[v].append(k)
The final value held by new_dict will be:
{8: ['c', 'e'], 3: ['a', 'b', 'd']}
Change this
if v not in counter:
    counter[v].append(k)
else:
    counter[v].append(k)
to this:
if v not in counter:
    counter[v] = [] # add empty `list` if value `v` is not found as key
counter[v].append(k)
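To see where the KeyError in the question comes from, here is a small isolated sketch (the variable missing_key is just for illustration): indexing a plain dict with a key that is not there raises before append is ever called.

```python
counter = {}

try:
    counter[3].append('a')  # no list is stored under 3 yet, so the lookup fails
except KeyError as exc:
    missing_key = exc.args[0]
    print('KeyError for key:', missing_key)  # KeyError for key: 3

# Creating the list first makes the append safe.
if 3 not in counter:
    counter[3] = []
counter[3].append('a')
print(counter)  # {3: ['a']}
```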
You can use dict.setdefault method:
>>> c = {}
>>> for key, value in b.iteritems():
...     c.setdefault(value, []).append(key)
...
>>> c
{8: ['c', 'e'], 3: ['a', 'b', 'd']}
In Python3 use b.items() instead.
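Putting that together, a Python 3 version of the question's dictfreq function could look roughly like this (a sketch using setdefault; a defaultdict would work just as well):

```python
def dictfreq(b):
    counter = {}
    for key, value in b.items():
        # Create the list on first sight of a value, then append the key to it.
        counter.setdefault(value, []).append(key)
    return counter

b = {'a': 3, 'b': 3, 'c': 8, 'd': 3, 'e': 8}
result = dictfreq(b)
print(result)  # {3: ['a', 'b', 'd'], 8: ['c', 'e']}
```

On Python 3.7 and later the result keeps insertion order, which is why 3 is printed before 8 here, unlike the Python 2 output above.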