Duplicate values in a Python dictionary - python

I have a dictionary in the following format:
{ 'a' : [1], 'b' : [1,2,3], 'c' : [1,1,2], 'd' : [2,3,4] }
and I want to create a list of the keys which have a '1' in their values.
So my result list should look like:
['a','b','c','c']
I cannot understand how to work with duplicate values.
Any ideas on how I can get such a list?

You can use a list comprehension:
>>> d = { 'a' : [1], 'b' : [1,2,3], 'c' : [1,1,2], 'd' : [2,3,4] }
>>> [key for key, values in d.items() for element in values if element==1]
['c', 'c', 'b', 'a']
Here we have two nested for loops in our list comprehension. The first iterates over each (key, values) pair in the dictionary, and the second iterates over each element in the values list, emitting the key each time an element equals 1. The result list is unordered because dicts were unordered before Python 3.7, so there were no guarantees about the order of the items; from Python 3.7 on, dicts preserve insertion order.

Here is one way:
>>> import itertools
>>> x = { 'a' : [1], 'b' : [1,2,3], 'c' : [1,1,2], 'd' : [2,3,4] }
>>> list(itertools.chain.from_iterable([k]*v.count(1) for k, v in x.iteritems() if 1 in v))
['a', 'c', 'c', 'b']
If using Python 3, use items instead of iteritems.
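For Python 3, the same idea can be sketched as follows. The `if 1 in v` filter is dropped because `[k] * 0` is already an empty list, and the `sorted` call is an addition here so that the order matches the list requested in the question.

```python
import itertools

x = {'a': [1], 'b': [1, 2, 3], 'c': [1, 1, 2], 'd': [2, 3, 4]}

# Repeat each key once per occurrence of 1 in its list, then flatten.
repeated = ([k] * v.count(1) for k, v in x.items())
result = sorted(itertools.chain.from_iterable(repeated))
print(result)  # ['a', 'b', 'c', 'c']
```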

This uses two loops, k,v in d.items() which gets each (key,value) pair from the dictionary, and n in v which loops through each value in v:
d = { 'a' : [1], 'b' : [1,2,3], 'c' : [1,1,2], 'd' : [2,3,4] }
l = []
for k,v in d.items():
    for n in v:
        if n == 1:
            l.append(k)
l.sort()
If you want a one-liner:
l = sorted(k for k,v in d.items() for n in v if n == 1)

The keys must be sorted to get the expected order. This should work:
result = []
for i in sorted(d.keys()):
    result += [i for x in d[i] if x == 1]
print(result)
will output:
['a', 'b', 'c', 'c']

Easy way (Python 3):
d = { 'a' : [1], 'b' : [1,2,3], 'c' : [1,1,2], 'd' : [2,3,4] }
n = 1
result = []
for key, value in d.items():
    # append the key once per occurrence of n in its list
    for _ in range(value.count(n)):
        result.append(key)
If you want the list sorted, then:
result.sort()

Related

How to efficiently remove elements from dicts that have certain value patterns?

For example, in dict1 the keys 1, 2, 4 all have the same value 'a', but the keys 3 and 5 have different values, 'b' and 'd'. What I want is:
If N keys have the same value and N >= 3, then I want to remove all other elements from the dict and keep only those N key-value pairs, which means 'b' and 'd' have to be removed from the dict.
The following code works, but it seems very verbose. Is there a better way to do this?
from collections import defaultdict
dict1 = {1:'a', 2:'a', '3':'b', '4': 'a', '5':'d'}
l1 = [1, 2, 3, 4, 5]
dict2 = defaultdict(list)
for k, v in dict1.items():
    dict2[v].append(k)
to_be_removed = []
is_to_be_removed = False
for k, values in dict2.items():
    majority = len(values)
    if majority>=3:
        is_to_be_removed = True
    else:
        to_be_removed.extend(values)
if is_to_be_removed:
    for d in to_be_removed:
        del dict1[d]
print(f'New dict: {dict1}')
You can use collections.Counter to get the frequency of every value, then use a dictionary comprehension to retain only the keys that have the desired corresponding value:
from collections import Counter
dict1 = {1:'a', 2:'a', '3':'b', '4': 'a', '5':'d'}
ctr = Counter(dict1.values())
result = {key: value for key, value in dict1.items() if ctr[value] >= 3}
print(result)
This outputs:
{1: 'a', 2: 'a', '4': 'a'}
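One difference from the question's code is worth noting: the original loop deletes nothing when no value reaches the threshold, whereas the comprehension above would return an empty dict in that case. A minimal sketch that keeps the original behaviour, using a hypothetical helper name `keep_frequent_values` and a hypothetical `min_count` parameter:

```python
from collections import Counter

def keep_frequent_values(mapping, min_count=3):
    """Keep only entries whose value occurs at least min_count times.

    If no value is that frequent, return an unchanged copy instead.
    """
    counts = Counter(mapping.values())
    if not any(c >= min_count for c in counts.values()):
        return dict(mapping)
    return {k: v for k, v in mapping.items() if counts[v] >= min_count}

dict1 = {1: 'a', 2: 'a', '3': 'b', '4': 'a', '5': 'd'}
print(keep_frequent_values(dict1))             # {1: 'a', 2: 'a', '4': 'a'}
print(keep_frequent_values({1: 'a', 2: 'b'}))  # {1: 'a', 2: 'b'}
```

Returning a new dict rather than deleting in place avoids mutating the caller's data; whether that is desirable depends on the surrounding code.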

Getting duplicates from nested dictionary

I'm fairly new to python and have the following problem. I have a nested dictionary in the form of
dict = {'a': {'1','2'}, 'b':{'5','1'}, 'c':{'3','2'}}
and would like to find all the keys that have the same values. The output should look similar to this.
1 : [a,b]
2 : [a,c]
..
Many thanks in Advance for any help!
dict = {'a': {'1','2'}, 'b':{'5','1'}, 'c':{'3','2'}}
output = {}
for key, value in dict.items():
    for v in value:
        if v in output.keys():
            output[v].append(key)
        else:
            output[v] = [ key ]
print(output)
And the output will be
{'2': ['a', 'c'], '1': ['a', 'b'], '5': ['b'], '3': ['c']}
Before we get to the solution, a note on terminology: what you've got there is not a nested dictionary but a dictionary whose values are sets.
Some Python terminology to clear that up:
List: [ 1 , 2 ]
Lists are enclosed in square brackets, with items separated by commas.
Dictionary: { "a":1 , "b":2 }
Dictionaries are enclosed in curly braces and hold comma-separated "key":value pairs. Here, "a" and "b" are keys, and 1 and 2 are their respective values.
Set: { 1 , 2 }
Sets are enclosed in curly braces, with items separated by commas.
dict = {'a': {'1','2'}, 'b':{'5','1'}, 'c':{'3','2'}}
Here, {'1', '2'} is a set stored under the key 'a'. So what you've got is sets in a dictionary, not a nested dictionary.
Solution
Moving on to the solution: sets are iterable, so you can loop over them directly. The code below converts each set to a list first, but that step is optional.
# Initialize the dictionary to be processed
data = {'a': {'1','2'}, 'b':{'5','1'}, 'c':{'3','2'}}
# Create dictionary to store solution
sol = {} # maps each element to a list of the keys whose sets contain it
# E.g., sol = { "1" : [ "a" , "b" ] }
# This shows that the value 1 is present in the sets stored under keys a & b.
# Record all elements & list every key whose set contains them
for key in data.keys(): # iterate all keys in the dictionary
    l = list(data[key]) # convert set to list (optional; a set can be iterated directly)
    for elem in l: # iterate every element in the list
        if elem in sol: # check if elem already exists in solution as a key
            sol[elem].append(key) # record that key contains elem
        else:
            sol[elem] = [key] # create a new list with elem as key & store that key contains elem
# At this time, sol would be
# {
# "1" : [ "a" , "b" ] ,
# "2" : [ "a" , "c" ] ,
# "3" : [ "c" ] ,
# "5" : [ "b" ]
# }
# Since you want only the elements present in more than one set, let's remove the rest
for key in list(sol): # iterate over a copy of the keys, because sol is modified inside the loop
    if len(sol[key]) < 2: # only elements in at least 2 sets will be retained
        del sol[key] # remove the unrequired element
# Now, you have your required output in sol
print(sol)
# Prints (key order may vary):
# {
# "1" : [ "a" , "b" ] ,
# "2" : [ "a" , "c" ]
# }
I hope that helps you...
You can use a defaultdict to build the output easily (and sort it if you want the keys in sorted order):
from collections import defaultdict
d = {'a': {'1','2'}, 'b':{'5','1'}, 'c':{'3','2'}}
out = defaultdict(list)
for key, values in d.items():
    for value in values:
        out[value].append(key)
# for a sorted output (dicts are ordered since Python 3.7):
sorted_out = dict((k, out[k]) for k in sorted(out))
print(sorted_out)
#{'1': ['a', 'b'], '2': ['a', 'c'], '3': ['c'], '5': ['b']}
You can invert the dictionary into a value-to-keys dict. If you only want duplicated values (values shared by more than one key), filter the result:
from collections import defaultdict
def get_duplicates(dict1):
    dict2 = defaultdict(list)
    for k, v in dict1.items():
        for c in v:
            dict2[c].append(k)
    # if you want all values, just return dict2
    # return dict2
    return dict(filter(lambda x: len(x[1]) > 1, dict2.items()))
Output:
{'1': ['a', 'b'], '2': ['a', 'c']}
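As a usage sketch of the same invert-then-filter idea: the helper name `shared_values` is hypothetical, and the `sorted` call is an addition that makes the key order stable, since set iteration order is not guaranteed.

```python
from collections import defaultdict

def shared_values(mapping):
    """Map each value found under more than one key to the list of those keys."""
    index = defaultdict(list)
    for key, values in mapping.items():
        for value in values:          # sets can be iterated directly
            index[value].append(key)
    # Keep only values shared by at least two keys, in sorted key order.
    return {value: keys for value, keys in sorted(index.items()) if len(keys) > 1}

data = {'a': {'1', '2'}, 'b': {'5', '1'}, 'c': {'3', '2'}}
print(shared_values(data))  # {'1': ['a', 'b'], '2': ['a', 'c']}
```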
This can be easily done using defaultdict from collections,
>>> d = {'a': {'1','2'}, 'b':{'5','1'}, 'c':{'3','2'}}
>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> for key,vals in d.items():
...     for val in vals:
...         dd[val].append(key)
...
>>> dict(dd)
{'1': ['a', 'b'], '3': ['c'], '2': ['a', 'c'], '5': ['b']}
This can be easily achieved with two nested for loops:
dict = {'a': {'1','2'}, 'b':{'5','1'}, 'c':{'3','2'}}
out = {}
for key in dict:
    for value in dict[key]:
        if value not in out:
            out[value]= [key]
        else:
            out[value]+= [key]
print(out) # {'1': ['a', 'b'], '3': ['c'], '2': ['a', 'c'], '5': ['b']}

How to get a list which is a value of a dictionary by a value from the list?

I have the following dictionary :
d = {'1' : [1, 2, 3, 4], '2' : [10, 20, 30, 40]}
How do I get the corresponding key when searching by a value from one of the lists?
Let's say I want key '1' if I'm looking for value 3 or key '2' if I'm looking for value 10.
You can reverse the dictionary into this structure to do that kind of lookup:
reverse_d = {
1: '1',
2: '1',
3: '1',
4: '1',
10: '2',
…
}
which can be built by looping over each value of each key:
reverse_d = {}
for key, values in d.items():
    for value in values:
        reverse_d[value] = key
or more concisely as a dict comprehension:
reverse_d = {value: key for key, values in d.items() for value in values}
Lookups are straightforward now!
k = reverse_d[30]
# k = '2'
This only offers better performance than searching through the whole original dictionary if you do multiple lookups, though.
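The reversed dictionary above keeps only one key per value, so if the same value appears under several keys, the last key processed wins. A minimal sketch that keeps every matching key instead, assuming that behaviour is wanted; the helper name `build_reverse_index` and the extra key '3' in the sample data are hypothetical additions for illustration:

```python
from collections import defaultdict

def build_reverse_index(d):
    """Map each value to the list of keys whose lists contain it."""
    reverse = defaultdict(list)
    for key, values in d.items():
        for value in values:
            if key not in reverse[value]:   # skip repeats within one list
                reverse[value].append(key)
    return dict(reverse)

# '3' is an added key so that some values occur under two keys.
d = {'1': [1, 2, 3, 4], '2': [10, 20, 30, 40], '3': [3, 30]}
index = build_reverse_index(d)
print(index[30])  # ['2', '3']
print(index[3])   # ['1', '3']
print(index[1])   # ['1']
```

Building the index costs one pass over all values, so as with the plain reversed dict it pays off mainly when many lookups follow.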
You can use a generator expression with a filtering condition, like this
>>> def get_key(d, search_value):
...     return next((key for key, values in d.items() if search_value in values), None)
...
>>> get_key(d, 10)
'2'
>>> get_key(d, 2)
'1'
If none of the keys contain the value being searched for, the default passed to next (None here) is returned; without that default, next would raise StopIteration.
>>> print(get_key(d, 22))
None
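The behaviour of `next` with and without a fallback can be checked with a short sketch. The name `find_key` and its `default` parameter are hypothetical; `next(iterator, default)` itself is the built-in's standard two-argument form.

```python
def find_key(d, search_value, default=None):
    """Return the first key whose list contains search_value, else default."""
    matches = (key for key, values in d.items() if search_value in values)
    return next(matches, default)

d = {'1': [1, 2, 3, 4], '2': [10, 20, 30, 40]}
print(find_key(d, 10))  # 2
print(find_key(d, 22))  # None

# Without a default, an exhausted generator raises StopIteration.
try:
    next(key for key, values in d.items() if 22 in values)
    raised = False
except StopIteration:
    raised = True
print(raised)  # True
```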
This is my first time answering a question. How about this method?
def get_key(d,search_value):
    res = []
    for v in d.items():
        if search_value in v[1]:
            res.append(v[0])
    return res
>>> D = {'a':[2,2,3,4,5],'b':[5,6,7,8,9]}
>>> getkey.get_key(D,2)
['a']
>>> getkey.get_key(D,9)
['b']
>>> getkey.get_key(D,5)
['a', 'b']

If variable is equal to any value in a list

I want to write an if statement inside a for loop that is triggered if the variable is equal to any value in a list.
Sample data:
list = [variable1, variable2, variable3]
Right now I have this sample code:
for k, v in result_dict.items():
    if k == 'variable1' or k == 'variable2' or k == 'variable3':
But the problem is that the list will grow larger, and I don't want to have to add another or clause for every variable.
How can I do it?
This is what the in operator is for. Do:
list = [variable1, variable2, variable3]
for k, v in result_dict.items():
    if k in list:
Another way to do it is with sets:
>>> l = ['a', 'b', 'c']
>>> d = {'a': 1, 'b': 2, 'c': 'three', 'd': 4, 'e': 5, 'f': 6}
>>> keys = set(l).intersection(d.keys())
>>> keys
set(['a', 'c', 'b'])
Then you can iterate over those keys:
for k in set(l).intersection(d.keys()):
    do_something(d[k])
This should be more efficient than repeatedly calling in on the list. Call set() on the shorter of the list or dictionary.
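A minimal sketch of the set-based check applied to the question's loop. The contents of `result_dict` and `wanted` are made-up sample data; membership tests on a set take roughly constant time on average, while `in` on a list scans the list.

```python
# Hypothetical sample data standing in for the question's variables.
wanted = {'variable1', 'variable2', 'variable3'}
result_dict = {'variable1': 10, 'other': 20, 'variable3': 30}

# Filter while iterating: each `k in wanted` is a hash lookup.
matches = {k: v for k, v in result_dict.items() if k in wanted}
print(matches)  # {'variable1': 10, 'variable3': 30}

# Equivalent key selection using the set-like dict keys view.
common = result_dict.keys() & wanted
print(sorted(common))  # ['variable1', 'variable3']
```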
You may need another for loop.
for k, v in result_dict.items():
    for i in list:
        if i==k:

Converting dictionaries to list sorted by values, with multiple values per item

In Python, I have a simple problem of converting between lists and dictionaries that I have solved using an explicit type check to tell the difference between integers and lists of integers. I'm somewhat new to Python, and I'm curious whether there is a more 'pythonic' way to solve the problem, i.e. one that avoids an explicit type check.
In short: I'm trying to sort a dictionary's keys using the values, but each key can have multiple values, in which case the key needs to appear multiple times in the list. Data comes in the form {'a':1, 'b':[0,2],...}. Everything I have come up with (using sorted( , key = ) ) is tripped up by the fact that values that occur once can be specified as an integer instead of a list of length 1.
I'd like to convert between dictionaries of the form {'a':3, 'b':0, 'c':[2,4], 'd':[1,5]} and lists ['b', 'd', 'c', 'a', 'c', 'd'] (the positions of the items in list being specified by the values in the dictionary).
The function list_to_dictionary should have a key for each item appearing in the list with the value giving the location in the list. In case an item appears more than once, the value should be a list storing all of those locations.
The function dictionary_to_list should create a list consisting of the keys of the dictionary, sorted by value. In case the value is not a single integer but instead a list of integers, that key should appear in the list multiple times at the corresponding sorted locations.
My solution was as follows:
def dictionary_to_list(d):
    """inputs a dictionary a:i or a:[i,j], outputs a list of a sorted by i"""
    #Converts i to [i] as value of dictionary
    for a in d:
        if type(d[a])!=type([0,1]):
            d[a] = [d[a]]
    #Reverses the dictionary from {a:[i,j]...} to {i:a, j:a,...}
    reversed_d ={i:a for a in d for i in d[a]}
    return [x[1] for x in sorted(reversed_d.items(), key=lambda x:x[0])]
def list_to_dictionary(x):
    d = {}
    for i in range(len(x)):
        a = x[i]
        if a in d:
            d[a].append(i)
        else:
            d[a]=[i]
    #Creates {a:[i], b:[j,k],...}
    for a in d:
        if len(d[a])==1:
            d[a] = d[a][0]
    #Converts to {a:i, b:[j,k],...}
    return d
I can't change the problem to have lists of length 1 in place of the single integers as the values of the dictionaries due to the interaction with the rest of my code. It seems like there should be a simple way to handle this but I can't figure it out. A better solution here would have several applications for my python scripts.
Thanks
def dictionary_to_list(data):
    result = {}
    for key, value in data.items():
        if isinstance(value, list):
            for index in value:
                result[index] = key
        else:
            result[value] = key
    return [result[key] for key in sorted(result)]
def list_to_dictionary(data):
    result = {}
    for index, char in enumerate(data):
        result.setdefault(char, [])
        result[char].append(index)
    return dict((key, value[0]) if len(value) == 1 else (key, value) for key, value in result.items())
dictData = {'a':3, 'b':0, 'c':[2,4], 'd':[1,5]}
listData = ['b', 'd', 'c', 'a', 'c', 'd']
print(dictionary_to_list(dictData))
print(list_to_dictionary(listData))
Output
['b', 'd', 'c', 'a', 'c', 'd']
{'a': 3, 'c': [2, 4], 'b': 0, 'd': [1, 5]}
In [16]: import itertools
In [17]: d = {'a':3, 'b':0, 'c':[2,4], 'd':[1,5]}
In [18]: sorted(list(itertools.chain.from_iterable([[k]*(1 if isinstance(d[k], int) else len(d[k])) for k in d])), key=lambda i:d[i] if isinstance(d[i], int) else d[i].pop(0))
Out[18]: ['b', 'd', 'c', 'a', 'c', 'd']
The call is to:
sorted(
list(
itertools.chain.from_iterable(
[[k]*(1 if isinstance(d[k], int) else len(d[k]))
for k in d
]
)
),
key=lambda i:d[i] if isinstance(d[i], int) else d[i].pop(0)
)
The idea is that the first part (i.e. list(itertools.chain.from_iterable([[k]*(1 if isinstance(d[k], int) else len(d[k])) for k in d]))) creates a list of the keys in d, repeating each key once per value associated with it. So if a key has a single int (or a list containing only one int) as its value, it appears once in this list; otherwise, it appears as many times as there are items in its list.
Next, we assume that each list of values is sorted (otherwise, sorting is trivial to do as a pre-processing step). We then sort the keys by their first value. If a key has only a single int as its value, that int is used; otherwise, the first element of its list is used. This first element is also removed from the list (by the call to pop), so that subsequent occurrences of the same key won't reuse the same value. Note that this mutates the lists stored in d.
If you'd like to do this without the explicit typecheck, then you could listify all values as a preprocessing step:
In [22]: d = {'a':3, 'b':0, 'c':[2,4], 'd':[1,5]}
In [23]: d = {k:v if isinstance(v, list) else [v] for k,v in d.iteritems()}
In [24]: d
Out[24]: {'a': [3], 'b': [0], 'c': [2, 4], 'd': [1, 5]}
In [25]: sorted(list(itertools.chain.from_iterable([[k]*len(d[k]) for k in d])), key=lambda i:d[i].pop(0))
Out[25]: ['b', 'd', 'c', 'a', 'c', 'd']
def dictionary_to_list(d):
    return [k[0] for k in sorted(list(((key,n) for key, value in d.items() if isinstance(value, list) for n in value))+\
        [(key, value) for key, value in d.items() if not isinstance(value, list)], key=lambda k:k[1])]
def list_to_dictionary(l):
    d = {}
    for i, c in enumerate(l):
        if c in d:
            if isinstance(d[c], list):
                d[c].append(i)
            else:
                d[c] = [d[c], i]
        else:
            d[c] = i
    return d
l = dictionary_to_list({'a':3, 'b':0, 'c':[2,4], 'd':[1,5]})
print(l)
print(list_to_dictionary(l))
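For the original request to avoid an explicit type check, one option is to try iterating the value and fall back to treating it as a single position. This is only a sketch: the helper name `as_positions` is hypothetical, and it assumes every value is either an int or an iterable of ints. Unlike the pop-based version, it leaves the input dictionary unchanged.

```python
def as_positions(value):
    """Return value as a list of positions, whether it is an int or a list of ints."""
    try:
        return list(value)   # lists, tuples and other iterables
    except TypeError:
        return [value]       # a bare int is not iterable

def dictionary_to_list(d):
    # Build (position, key) pairs, then order the keys by position.
    pairs = [(pos, key) for key, value in d.items() for pos in as_positions(value)]
    return [key for pos, key in sorted(pairs)]

def list_to_dictionary(items):
    positions = {}
    for index, item in enumerate(items):
        positions.setdefault(item, []).append(index)
    # Unwrap single positions so they are stored as plain ints.
    return {k: v[0] if len(v) == 1 else v for k, v in positions.items()}

d = {'a': 3, 'b': 0, 'c': [2, 4], 'd': [1, 5]}
as_list = dictionary_to_list(d)
print(as_list)                           # ['b', 'd', 'c', 'a', 'c', 'd']
print(list_to_dictionary(as_list) == d)  # True
```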
