Python, list of tuples split into dictionaries

I have a list of tuples as shown:
lt = [(1,'a'),(1,'b'),(2,'a'),(3,'b'),(3,'c')]
I want to make the numbers keys of a dictionary and have them point to a list. That list then holds all associations in the list of tuples. So in the list above, it would split into a dictionary as:
dict_lt = {
    1: ['a', 'b'],
    2: ['a'],
    3: ['b', 'c']
}
Currently I use the dictionary's flexibility in automatically declaring new keys, which I then force point to an empty list. Then I fill that list accordingly.
dict_lt = {}
for tup in lt:
    dict_lt[tup[0]] = []
for tup in lt:
    dict_lt[tup[0]].append(tup[1])
This works fine, but it's a tad slow since it needs to iterate over the same list twice, and it seems redundant overall. Is there a better way?

You don't need to iterate the list twice. You can use setdefault() to set the initial value if the key is not in the dictionary:
lt = [(1,'a'),(1,'b'),(2,'a'),(3,'b'),(3,'c')]
d = {}
for k, v in lt:
    d.setdefault(k, []).append(v)
print(d)
prints
{1: ['a', 'b'], 2: ['a'], 3: ['b', 'c']}
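The reason a single pass is enough is that setdefault() returns the list stored under the key, and inserts the default first only when the key is missing. A minimal sketch of that behaviour (the names first and second are just for illustration):

```python
d = {}

# Key 1 is missing, so the empty list is stored under it and returned.
first = d.setdefault(1, [])
first.append('a')

# Key 1 now exists, so the stored list is returned and the new [] is discarded.
second = d.setdefault(1, [])
second.append('b')

print(d)                # {1: ['a', 'b']}
print(first is second)  # True
```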

You can use collections.defaultdict with list as the factory, or dict.setdefault, to create a list that you can append the values to.
collections.defaultdict:
import collections
out = collections.defaultdict(list)
for k, v in lt:
    out[k].append(v)
dict.setdefault:
out = {}
for k, v in lt:
    out.setdefault(k, []).append(v)
Example:
In [11]: lt = [(1, 'a'),(1, 'b'),(2, 'a'),(3, 'b'),(3, 'c')]
In [12]: out = {}
In [13]: for k, v in lt:
    ...:     out.setdefault(k, []).append(v)
    ...:
In [14]: out
Out[14]: {1: ['a', 'b'], 2: ['a'], 3: ['b', 'c']}
In [15]: out = collections.defaultdict(list)
In [16]: for k, v in lt:
    ...:     out[k].append(v)
    ...:
In [17]: out
Out[17]: defaultdict(list, {1: ['a', 'b'], 2: ['a'], 3: ['b', 'c']})
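One thing to keep in mind with defaultdict: merely reading a missing key with square brackets inserts it. A small sketch of that caveat, and of converting back to a plain dict once the grouping is done (the keys 99 and 100 are made up for the demonstration):

```python
from collections import defaultdict

lt = [(1, 'a'), (1, 'b'), (2, 'a'), (3, 'b'), (3, 'c')]

out = defaultdict(list)
for k, v in lt:
    out[k].append(v)

# Reading a missing key with [] calls the factory and stores the result.
_ = out[99]
print(99 in out)      # True

# .get() does not call the factory, so nothing is inserted.
print(out.get(100))   # None
print(100 in out)     # False

# Drop the stray key, then convert to a plain dict.
del out[99]
plain = dict(out)
print(plain)          # {1: ['a', 'b'], 2: ['a'], 3: ['b', 'c']}
```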

You can use defaultdict(list) in your code instead of dict, and just omit the first loop.
from collections import defaultdict
dict_lt = defaultdict(list)
for tup in lt:
    dict_lt[tup[0]].append(tup[1])

Related

Dictionary Comprehension in Python for key:[1,2,3] [duplicate]

This question already has answers here:
is it possible to reverse a dictionary in python using dictionary comprehension
(5 answers)
Closed 2 years ago.
While I've been improving my Python skills I have one question.
My code is below:
# def invertDictionary(dict):
#     new_dict = {}
#     for key, value in dict.items():
#         if value in new_dict:
#             new_dict[value].append(key)
#         else:
#             new_dict[value]=[key]
#     return new_dict
def invertDictionary(dict):
    new_dict = {value:([key] if value else [key]) for key, value in dict.items()}
    return new_dict;
invertDictionary({'a':3, 'b':3, 'c':3})
I am trying to get output like {3:['a','b','c']}. I have achieved that using a normal for-loop; I just want to know how to get these results using a Dictionary Comprehension. I tried but in append it's getting an error. Please let me know how to achieve this.
Thanks in Advance!
You missed that you also need a list comprehension to build the list.
Iterate over the values in the dict, and build the needed list of keys for each one.
Note that this is a quadratic process, whereas the canonical (and more readable) for loop is linear.
d = {'a':3, 'b':3, 'c':3, 'e':4, 'f':4, 'g':0}
inv_dict = {v: [key for key, val in d.items() if val == v]
            for v in set(d.values())}
result:
{0: ['g'],
3: ['a', 'b', 'c'],
4: ['e', 'f']
}
Will this do?
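To make the quadratic-versus-linear remark concrete, here is a rough side-by-side sketch. The names inv_quadratic and inv_linear are mine, and the loop uses setdefault rather than the if/else from the question, but both build the same mapping:

```python
d = {'a': 3, 'b': 3, 'c': 3, 'e': 4, 'f': 4, 'g': 0}

# Quadratic: for every distinct value, rescan all items of the dict.
inv_quadratic = {v: [key for key, val in d.items() if val == v]
                 for v in set(d.values())}

# Linear: a single pass over the items.
inv_linear = {}
for key, val in d.items():
    inv_linear.setdefault(val, []).append(key)

print(inv_linear == inv_quadratic)  # True
```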
While your original version with a regular for loop is the best solution for this, here is a variation on @Prune's answer that doesn't go over the dict multiple times:
>>> import itertools
>>> d = {'a':3, 'b':3, 'c':3, 'e':4, 'f':4, 'g':0}
>>> {group_key: [k for k, _ in dict_items]
...  for group_key, dict_items in itertools.groupby(
...      sorted(d.items(), key=lambda x: x[-1]),
...      key=lambda x: x[-1]
...  )
... }
{0: ['g'], 3: ['a', 'b', 'c'], 4: ['e', 'f']}
>>>
First we sort the items of the dict by value, passing sorted a lambda key function that extracts the value part of each item tuple. Then we use groupby with the same key function to group together the items that share a value, and finally a list comprehension extracts just the keys.
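The sorting step matters because groupby only merges equal keys that are next to each other. A small sketch with the dict deliberately shuffled (the ordering of d here is my own, chosen so that equal values are not adjacent; it relies on Python 3.7+ dicts keeping insertion order):

```python
import itertools

d = {'a': 3, 'e': 4, 'b': 3, 'g': 0, 'c': 3, 'f': 4}

# Without sorting, equal values that are not adjacent end up in separate groups.
unsorted_groups = [(value, [k for k, _ in items])
                   for value, items in itertools.groupby(d.items(), key=lambda x: x[1])]
print(unsorted_groups)
# [(3, ['a']), (4, ['e']), (3, ['b']), (0, ['g']), (3, ['c']), (4, ['f'])]

# Sorting by value first makes equal values adjacent, so each value forms one group.
by_value = sorted(d.items(), key=lambda x: x[1])
inverted = {value: [k for k, _ in items]
            for value, items in itertools.groupby(by_value, key=lambda x: x[1])}
print(inverted)
# {0: ['g'], 3: ['a', 'b', 'c'], 4: ['e', 'f']}
```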
--
As noted by Kelly, we can make it shorter by using the dict's get method as the key function and by relying on the fact that iterating over a dict gives you its keys:
>>> {k: list(g) for k, g in itertools.groupby(sorted(d, key=d.get), d.get)}
{0: ['g'], 3: ['a', 'b', 'c'], 4: ['e', 'f']}
>>>
You could use a defaultdict and the append method.
from collections import defaultdict
dict1 = {'a': 3, 'b': 3, 'c': 3}
dict2 = defaultdict(list)
{dict2[v].append(k) for k, v in dict1.items()}
dict2
>>> defaultdict(list, {3: ['a', 'b', 'c']})

Python : Count frequences in dictionary

I want to count the occurrences of each value in a dictionary, and construct a new one with the value as the key and a list of the keys that had that value as the value.
Input :
b = {'a':3,'b':3,'c':8,'d':3,'e':8}
Output:
c = { '3': ['a', 'b', 'd'],
      '8': ['c', 'e']
}
I've written the following, but it raises a KeyError and doesn't give any output. Could someone help?
def dictfreq(b):
    counter = dict()
    for k,v in b.iteritems():
        if v not in counter:
            counter[v].append(k)
        else:
            counter[v].append(k)
    return counter
print dictfreq(b)
A better way to achieve this is to use collections.defaultdict. For example:
from collections import defaultdict
b = {'a':3,'b':3,'c':8,'d':3,'e':8}
new_dict = defaultdict(list) # `list` as default value
for k, v in b.items():
    new_dict[v].append(k)
The final value held by new_dict will be:
{8: ['c', 'e'], 3: ['a', 'b', 'd']}
Change this
if v not in counter:
    counter[v].append(k)
else:
    counter[v].append(k)
to this:
if v not in counter:
    counter[v] = [] # add empty `list` if value `v` is not found as key
counter[v].append(k)
You can use dict.setdefault method:
>>> c = {}
>>> for key, value in b.iteritems():
...     c.setdefault(value, []).append(key)
...
>>> c
{8: ['c', 'e'], 3: ['a', 'b', 'd']}
In Python3 use b.items() instead.
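For reference, a sketch of the asker's dictfreq function rewritten for Python 3 along these lines, with items() in place of iteritems() and print as a function. The output order shown assumes Python 3.7+, where dicts preserve insertion order:

```python
def dictfreq(b):
    """Group the keys of b by their value."""
    counter = {}
    for k, v in b.items():      # items() replaces Python 2's iteritems()
        counter.setdefault(v, []).append(k)
    return counter

b = {'a': 3, 'b': 3, 'c': 8, 'd': 3, 'e': 8}
print(dictfreq(b))  # {3: ['a', 'b', 'd'], 8: ['c', 'e']}
```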

If variable is equal to any value in a list

I want to write an if statement inside a for loop that is triggered when the variable is equal to any value in a list.
Sample data:
list = [variable1, variable2, variable3]
Right now I have this sample code:
for k, v in result_dict.items():
    if k == 'variable1' or k == 'variable2' or k == 'variable3':
But the problem is that the list will grow larger, and I don't want to have to add another or clause for every variable.
How can I do it?
This is what the in operator is for. Do:
list = [variable1, variable2, variable3]
for k, v in result_dict.items():
    if k in list:
Another way to do it is with sets:
>>> l = ['a', 'b', 'c']
>>> d = {'a': 1, 'b': 2, 'c': 'three', 'd': 4, 'e': 5, 'f': 6}
>>> keys = set(l).intersection(d.keys())
>>> keys
set(['a', 'c', 'b'])
Then you can iterate over those keys:
for k in set(l).intersection(d.keys()):
    do_something(d[k])
This should be more efficient than repeatedly using in on the list. Call set() on the shorter of the list or the dictionary.
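If you still want to keep the original loop shape, a middle ground is to build the set once and test membership against it, since a set lookup takes constant time on average while a list lookup scans the list. A minimal sketch with made-up data (result_dict and wanted here are hypothetical stand-ins for the asker's variables):

```python
result_dict = {'a': 1, 'b': 2, 'c': 'three', 'd': 4}   # hypothetical data
wanted = {'a', 'c', 'x'}                               # hypothetical names to match

matched = []
for k, v in result_dict.items():
    if k in wanted:              # set membership test
        matched.append((k, v))

print(matched)  # [('a', 1), ('c', 'three')]
```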
You may need another FOR loop.
for k, v in result_dict.items():
    for i in list:
        if i==k:

Collapse a list of lists, grouping on specific element and appending other elements

How could I transform a list such as:
l=[ ['A', 'C21'], ['A','D43'],['B','D34'],['C','D45'],['C','D56'] ]
to:
[ ['A','C21 D43'], ['B','D34'],['C','D45 D56'] ]
Where the grouping is performed according to element #0 of each sub list
and elements #1 are string concatenated within each group?
Try this :
l=[ ['A', 'C21'], ['A','D43'],['B','D34'],['C','D45'],['C','D56']]
x = {}
for a in l:
    if a[0] not in x.keys():
        x[a[0]] = [a[1]]
    else:
        x[a[0]].append(a[1])
print x
array_result = []
for keys, vals in x.iteritems():
    array_result.append([keys, ' '.join(vals)])
print array_result
If equal keys are contiguous (adjacent in the list), then you can use itertools.groupby, e.g.:
from itertools import groupby
data =[ ['A', 'C21'], ['A','D43'],['B','D34'],['C','D45'],['C','D56'] ]
new_data = [[k, ' '.join(el[1] for el in g)] for k, g in groupby(data, lambda L: L[0])]
# [['A', 'C21 D43'], ['B', 'D34'], ['C', 'D45 D56']]
If not and order doesn't really matter, then:
from collections import defaultdict
dd = defaultdict(list)
for key, val in data:
    dd[key].append(val)
new_data = [[k, ' '.join(v)] for k,v in dd.items()]
# [['B', 'D34'], ['C', 'D45 D56'], ['A', 'C21 D43']]
Alternatively - make use of dict.setdefault, eg:
d = {}
for key, val in data:
    d.setdefault(key, []).append(val)
new_data = [[k, ' '.join(v)] for k,v in d.items()]
Or, if the keys aren't contiguous, but the output should maintain the order of the input, then use collections.OrderedDict, eg:
from collections import OrderedDict
d = OrderedDict()
for key, val in data:
    d.setdefault(key, []).append(val)
new_data = [[k, ' '.join(v)] for k,v in d.items()]
# [['A', 'C21 D43'], ['B', 'D34'], ['C', 'D45 D56']]
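On Python 3.7 and later, a plain dict also preserves insertion order, so the setdefault version should keep first-seen key order without OrderedDict. A small sketch, with the input deliberately reordered by me so that equal keys are not adjacent:

```python
# Hypothetical input: same pairs as above, shuffled so equal keys are not contiguous.
data = [['C', 'D45'], ['A', 'C21'], ['B', 'D34'], ['A', 'D43'], ['C', 'D56']]

grouped = {}
for key, val in data:
    grouped.setdefault(key, []).append(val)

# Keys come out in the order they were first seen in the input.
new_data = [[k, ' '.join(v)] for k, v in grouped.items()]
print(new_data)
# [['C', 'D45 D56'], ['A', 'C21 D43'], ['B', 'D34']]
```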

Converting dictionaries to list sorted by values, with multiple values per item

In Python, I have a simple problem of converting between lists and dictionaries that I have solved using an explicit type check to tell the difference between integers and lists of integers. I'm somewhat new to Python, and I'd be curious whether there is a more 'pythonic' way to solve the problem, i.e. one that avoids an explicit type check.
In short: I'm trying to sort a dictionary's keys using the values, but each key can have multiple values, in which case the key needs to appear multiple times in the list. Data comes in the form {'a':1, 'b':[0,2],...}. Everything I have come up with (using sorted( , key = )) is tripped up by the fact that a value that occurs once can be specified as a bare integer instead of a list of length 1.
I'd like to convert between dictionaries of the form {'a':3, 'b':0, 'c':[2,4], 'd':[1,5]} and lists ['b', 'd', 'c', 'a', 'c', 'd'] (the positions of the items in list being specified by the values in the dictionary).
The function list_to_dictionary should have a key for each item appearing in the list with the value giving the location in the list. In case an item appears more than once, the value should be a list storing all of those locations.
The function dictionary_to_list should create a list consisting of the keys of the dictionary, sorted by value. In case the value is not a single integer but instead a list of integers, that key should appear in the list multiple times at the corresponding sorted locations.
My solution was as follows:
def dictionary_to_list(d):
    """inputs a dictionary a:i or a:[i,j], outputs a list of a sorted by i"""
    #Converts i to [i] as value of dictionary
    for a in d:
        if type(d[a])!=type([0,1]):
            d[a] = [d[a]]
    #Reverses the dictionary from {a:[i,j]...} to {i:a, j:a,...}
    reversed_d ={i:a for a in d for i in d[a]}
    return [x[1] for x in sorted(reversed_d.items(), key=lambda x:x[0])]
def list_to_dictionary(x):
    d = {}
    for i in range(len(x)):
        a = x[i]
        if a in d:
            d[a].append(i)
        else:
            d[a]=[i]
    #Creates {a:[i], b:[j,k],...}
    for a in d:
        if len(d[a])==1:
            d[a] = d[a][0]
    #Converts to {a:i, b:[j,k],...}
    return d
I can't change the problem to have lists of length 1 in place of the single integers as the values of the dictionaries due to the interaction with the rest of my code. It seems like there should be a simple way to handle this but I can't figure it out. A better solution here would have several applications for my python scripts.
Thanks
def dictionary_to_list(data):
    result = {}
    for key, value in data.items():
        if isinstance(value, list):
            for index in value:
                result[index] = key
        else:
            result[value] = key
    return [result[key] for key in sorted(result)]
def list_to_dictionary(data):
    result = {}
    for index, char in enumerate(data):
        result.setdefault(char, [])
        result[char].append(index)
    return dict((key, value[0]) if len(value) == 1 else (key, value) for key, value in result.items())
dictData = {'a':3, 'b':0, 'c':[2,4], 'd':[1,5]}
listData = ['b', 'd', 'c', 'a', 'c', 'd']
print dictionary_to_list(dictData)
print list_to_dictionary(listData)
Output
['b', 'd', 'c', 'a', 'c', 'd']
{'a': 3, 'c': [2, 4], 'b': 0, 'd': [1, 5]}
In [17]: d = {'a':3, 'b':0, 'c':[2,4], 'd':[1,5]}
In [18]: sorted(list(itertools.chain.from_iterable([[k]*(1 if isinstance(d[k], int) else len(d[k])) for k in d])), key=lambda i:d[i] if isinstance(d[i], int) else d[i].pop(0))
Out[18]: ['b', 'd', 'c', 'a', 'c', 'd']
The call is to:
sorted(
    list(
        itertools.chain.from_iterable(
            [[k]*(1 if isinstance(d[k], int) else len(d[k]))
             for k in d
            ]
        )
    ),
    key=lambda i:d[i] if isinstance(d[i], int) else d[i].pop(0)
)
The idea is that the first part (i.e. list(itertools.chain.from_iterable([[k]*(1 if isinstance(d[k], int) else len(d[k])) for k in d]))) creates a list of the keys in d, repeating each key once per value associated with it. So if a key has a single int (or a list containing only one int) as its value, it appears once in this list; otherwise, it appears as many times as there are items in its list.
Next, we assume that the values are sorted (otherwise, that is trivial to do as a pre-processing step). We then sort the keys by their first value. If a key has only a single int as its value, that int is used; otherwise, the first element of the list containing all its values is used. This first element is also removed from the list (by the call to pop) so that subsequent occurrences of the same key won't reuse the same value.
If you'd like to do this without the explicit typecheck, then you could listify all values as a preprocessing step:
In [22]: d = {'a':3, 'b':0, 'c':[2,4], 'd':[1,5]}
In [23]: d = {k:v if isinstance(v, list) else [v] for k,v in d.iteritems()}
In [24]: d
Out[24]: {'a': [3], 'b': [0], 'c': [2, 4], 'd': [1, 5]}
In [25]: sorted(list(itertools.chain.from_iterable([[k]*len(d[k]) for k in d])), key=lambda i:d[i].pop(0))
Out[25]: ['b', 'd', 'c', 'a', 'c', 'd']
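Note that the pop(0) calls in the sort key empty the lists stored in d as a side effect. If that matters, one possible alternative is to build (position, key) pairs and sort those instead. This is only a sketch: positions_to_list is a name I made up, it assumes every position appears exactly once, and it still uses an isinstance check to normalise bare integers:

```python
def positions_to_list(d):
    """Return the keys of d ordered by their positions, without mutating d."""
    pairs = []
    for key, value in d.items():
        # Treat a bare int as a one-element list of positions.
        positions = value if isinstance(value, list) else [value]
        pairs.extend((pos, key) for pos in positions)
    # Tuples sort by position first, which places each key at its positions.
    return [key for _, key in sorted(pairs)]

d = {'a': 3, 'b': 0, 'c': [2, 4], 'd': [1, 5]}
print(positions_to_list(d))  # ['b', 'd', 'c', 'a', 'c', 'd']
print(d)                     # {'a': 3, 'b': 0, 'c': [2, 4], 'd': [1, 5]}
```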
def dictionary_to_list(d):
    return [k[0] for k in sorted(list(((key,n) for key, value in d.items() if isinstance(value, list) for n in value))+\
        [(key, value) for key, value in d.items() if not isinstance(value, list)], key=lambda k:k[1])]
def list_to_dictionary(l):
    d = {}
    for i, c in enumerate(l):
        if c in d:
            if isinstance(d[c], list):
                d[c].append(i)
            else:
                d[c] = [d[c], i]
        else:
            d[c] = i
    return d
l = dictionary_to_list({'a':3, 'b':0, 'c':[2,4], 'd':[1,5]})
print(l)
print(list_to_dictionary(l))
