I've looked all over the internet for how to find all the keys in a dictionary that share the same value, when that value is not known in advance. The closest thing that came up assumed the values are known.
Say I had a dictionary like this and these values are totally random, not hardcoded by me.
{'AGAA': 2, 'ATAA': 5,'AJAA':2}
How can I identify all the keys with the same value? What would be the most efficient way of doing this? For the dictionary above I would expect:
['AGAA','AJAA']
The way I would do it is "invert" the dictionary. By this I mean to group the keys for each common value. So if you start with:
{'AGAA': 2, 'ATAA': 5, 'AJAA': 2}
You would want to group it such that the keys are now values and values are now keys:
{2: ['AGAA', 'AJAA'], 5: ['ATAA']}
After grouping the values, you can use max to determine the largest grouping.
Example:
from collections import defaultdict
data = {'AGAA': 2, 'ATAA': 5, 'AJAA': 2}
grouped = defaultdict(list)
for key in data:
    grouped[data[key]].append(key)
max_group = max(grouped.values(), key=len)
print(max_group)
Outputs:
['AGAA', 'AJAA']
You could also find the max key and print it that way:
max_key = max(grouped, key=lambda k: len(grouped[k]))
print(grouped[max_key])
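Note that max returns only one group. If several values are shared, a minimal sketch that keeps every group with more than one key could look like this (the helper name keys_sharing_a_value and the extra 'AKAA' entry are made up for illustration):

```python
from collections import defaultdict


def keys_sharing_a_value(data):
    """Return {value: [keys...]} for every value held by two or more keys."""
    grouped = defaultdict(list)
    for key, value in data.items():
        grouped[value].append(key)
    # Keep only the groups that actually contain duplicates.
    return {value: keys for value, keys in grouped.items() if len(keys) > 1}


# 'AKAA' is an extra, hypothetical entry so that two values are shared.
shared = keys_sharing_a_value({'AGAA': 2, 'ATAA': 5, 'AJAA': 2, 'AKAA': 5})
print(shared)  # {2: ['AGAA', 'AJAA'], 5: ['ATAA', 'AKAA']}
```

Values that occur only once are dropped, so an empty result means no two keys share a value.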
You can try this:
from collections import Counter
d = {'AGAA': 2, 'ATAA': 5,'AJAA':2}
l = Counter(d.values())
l = [x for x,y in l.items() if y > 1]
out = [x for x,y in d.items() if y in l]
# Out[21]: ['AGAA', 'AJAA']
I have a dictionary of words, each with a certain point value. I would like to search through this dictionary for a random word with a specific point value, e.g. find a random word with a point value of 3. My dictionary is structured like this:
wordList = {"to":1,"as":1,"be":1,"see":2,"bed":2,"owl":2,"era":2,"alive":3,"debt":3,"price":4,"stain":4} #shortened list obviously
Looked around online and I couldn't find a great answer, that or I did and I just didn't quite get it.
I would use random.choice() with a list comprehension:
from random import choice
choice([word for word, count in wordList.items() if count == 3])
If you don't care about performance, this will work, but it rebuilds the list every time you call it:
random.choice([k for k,v in wordList.items() if v == 3])
Otherwise, it could be better to build a reversed dictionary once, to save time over multiple lookups:
from random import choice
from collections import defaultdict
rev = defaultdict(list)
for k, v in wordList.items():
    rev[v].append(k)
...
choice(rev[3])
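Put together, the reversed-dictionary idea might look like the sketch below. The helper random_word is a made-up name, and returning None for a missing point value is an assumption about the desired behaviour, not something the question specifies.

```python
from collections import defaultdict
from random import choice

# Sample data copied from the question.
wordList = {"to": 1, "as": 1, "be": 1, "see": 2, "bed": 2, "owl": 2,
            "era": 2, "alive": 3, "debt": 3, "price": 4, "stain": 4}

# Build the reverse index once: point value -> list of words.
rev = defaultdict(list)
for word, points in wordList.items():
    rev[points].append(word)


def random_word(points):
    """Return a random word worth `points`, or None if there is none."""
    # Use .get so a missing point value does not add an empty list to rev.
    words = rev.get(points)
    return choice(words) if words else None


word = random_word(3)
print(word)
```

Each lookup after the initial pass only has to pick from a ready-made list instead of scanning the whole dictionary.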
I think an if statement combined with random.choice solves your problem in a few lines:
from random import choice
wordList = {"to": 1, "as": 1, "be": 1, "see": 2, "bed": 2, "owl": 2, "era": 2,
            "alive": 3, "debt": 3, "price": 4, "stain": 4} # shortened list obviously
value = int(input())
lst = []
for key,val in wordList.items():
    if val == value:
        lst.append(key)
print(choice(lst))
one-liner:
choice([key for key, val in wordList.items() if val == value])
I have my program's output as a Python dictionary and I want a list of keys from dictn:
s = "cool_ice_wifi"
r = ["water_is_cool", "cold_ice_drink", "cool_wifi_speed"]
good_list=s.split("_")
dictn={}
for i in range(len(r)):
    split_review=r[i].split("_")
    counter=0
    for good_word in good_list:
        if good_word in split_review:
            counter=counter+1
    d1={i:counter}
    dictn.update(d1)
print(dictn)
The conditions for ordering the keys in the output list:
Keys with the same value keep their original relative order.
Keys with the highest values come first, followed by the keys with lower values.
Dictn={0: 1, 1: 1, 2: 2}
Expected output = [2,0,1]
You can use a list comp:
[key for key in sorted(dictn, key=dictn.get, reverse=True)]
In Python 3 you can use the built-in sorted function to order a dictionary's keys in any way you choose.
Check out the documentation, but in the simplest case you can pass the dictionary's .get method as the key function, while for more complex operations you'd define a key function yourself.
Dictionaries are insertion-ordered as of Python 3.7, so another way to do things is to sort at the moment of dictionary creation, or you could use an OrderedDict.
Here's an example of the first option in action, which I think is the easiest
>>> a = {}
>>> a[0] = 1
>>> a[1] = 1
>>> a[2] = 2
>>> print(a)
{0: 1, 1: 1, 2: 2}
>>>
>>> [(k) for k in sorted(a, key=a.get, reverse=True)]
[2, 0, 1]
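The other idea mentioned above, sorting at the moment of dictionary creation, can be sketched as follows. This assumes Python 3.7 or later, where a plain dict preserves insertion order; the name by_value_desc is just for illustration.

```python
dictn = {0: 1, 1: 1, 2: 2}

# sorted() is stable, even with reverse=True, so keys with equal
# values keep their original relative order.
by_value_desc = dict(sorted(dictn.items(), key=lambda item: item[1], reverse=True))

print(by_value_desc)        # {2: 2, 0: 1, 1: 1}
print(list(by_value_desc))  # [2, 0, 1]
```

Iterating over by_value_desc then yields the keys in the requested order without sorting again.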
Sorry the topic's title is vague, I find it hard to explain.
I have a dictionary in which each value is a list of items. I wish to remove the duplicated items, so that each item appears as few times as possible (preferably once) across the lists.
Consider the dictionary:
example_dictionary = {"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3]}
'weapon2' and 'weapon3' have the same values, so it should result in:
result_dictionary = {"weapon1":[1],"weapon2":[3],"weapon3":[2]}
since I don't mind the order, it can also result in:
result_dictionary = {"weapon1":[1],"weapon2":[2],"weapon3":[3]}
But when "there's no choice" it should leave the value. Consider this new dictionary:
example_dictionary = {"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3],"weapon4":[3]}
now, since it cannot assign either '2' or '3' only once without leaving a key empty, a possible output would be:
result_dictionary = {"weapon1":[1],"weapon2":[3],"weapon3":[2],"weapon4":[3]}
I can relax the problem to only the first part and manage, though I prefer a solution to the two parts together
#!/usr/bin/env python3
example_dictionary = {"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3]}
result = {}
used_values = []
def extract_semi_unique_value(my_list):
    for val in my_list:
        if val not in used_values:
            used_values.append(val)
            return val
    return my_list[0]
for key, value in example_dictionary.items():
    semi_unique_value = extract_semi_unique_value(value)
    result[key] = [semi_unique_value]
print(result)
This is probably not the most efficient solution possible. Because it iterates over all possible combinations, it will run quite slowly for large inputs.
It uses itertools.product() to generate every possible combination, then picks the combination with the most unique numbers (by comparing the length of a set built from each combination).
from itertools import product
def dedup(weapons):
    # get the keys and values ordered so we can join them back
    # up again at the end
    keys, vals = zip(*weapons.items())
    # because sets remove all duplicates, whichever combo has
    # the longest set is the most unique
    best = max(product(*vals), key=lambda combo: len(set(combo)))
    # combine the keys and whatever we found was the best combo
    return {k: [v] for k, v in zip(keys, best)}
From the examples:
dedup({"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3]})
#: {'weapon1': [1], 'weapon2': [2], 'weapon3': [3]}
dedup({"weapon1":[1,2,3],"weapon2":[2,3],"weapon3":[2,3],"weapon4":[3]})
#: {'weapon1': [1], 'weapon2': [2], 'weapon3': [2], 'weapon4': [3]}
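If the brute-force search gets too slow, one alternative is to treat the task as bipartite matching between keys and values. The sketch below swaps in a simple augmenting-path matching (often called Kuhn's algorithm); it is not the method used in the answer above. The name assign_distinct is made up, and the code assumes every list is non-empty and holds hashable items.

```python
def assign_distinct(weapons):
    """Give each key one value, reusing values only when unavoidable."""
    owner = {}  # value -> key currently holding that value

    def try_assign(key, seen):
        # Try to give `key` a value, evicting the current owner of a value
        # only if that owner can move to another value.
        for val in weapons[key]:
            if val in seen:
                continue
            seen.add(val)
            if val not in owner or try_assign(owner[val], seen):
                owner[val] = key
                return True
        return False

    for key in weapons:
        try_assign(key, set())

    chosen = {key: val for val, key in owner.items()}
    # Keys left without a value of their own fall back to their first value.
    return {key: [chosen.get(key, vals[0])] for key, vals in weapons.items()}


print(assign_distinct({"weapon1": [1, 2, 3], "weapon2": [2, 3], "weapon3": [2, 3]}))
# {'weapon1': [1], 'weapon2': [3], 'weapon3': [2]}
print(assign_distinct({"weapon1": [1, 2, 3], "weapon2": [2, 3],
                       "weapon3": [2, 3], "weapon4": [3]}))
# {'weapon1': [1], 'weapon2': [3], 'weapon3': [2], 'weapon4': [3]}
```

This avoids enumerating every combination, at the cost of a slightly more involved algorithm. The recursion depth grows with the length of the eviction chain, so very large inputs may need an iterative version.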
This could help:
import itertools
res = {'weapon1': [1, 2, 3], 'weapon2': [2, 3], 'weapon3': [2, 3]}
r = [[x] for x in list(set(list(itertools.chain.from_iterable(res.values()))))]
r2 = [x for x in res.keys()]
r3 = list(itertools.product(r2,r))
r4 = dict([r3[x] for x in range(0,len(r3)) if not x%4])
I am pretty new to all of this so this might be a noobie question.. but I am looking to find length of dictionary values... but I do not know how this can be done.
So for example,
d = {'key':['hello', 'brave', 'morning', 'sunset', 'metaphysics']}
I was wondering is there a way I can find the len or number of items of the dictionary value.
Thanks
Sure. In this case, you'd just do:
length_key = len(d['key']) # length of the list stored at `'key'` ...
It's hard to say why you actually want this, but, perhaps it would be useful to create another dict that maps the keys to the length of values:
length_dict = {key: len(value) for key, value in d.items()}
length_key = length_dict['key'] # length of the list stored at `'key'` ...
Let's experiment a little to see how we can get the lengths of the different list values stored in a dict.
First, create a test dict using list and dict comprehensions:
>>> my_dict = {x:[i for i in range(x)] for x in range(4)}
>>> my_dict
{0: [], 1: [0], 2: [0, 1], 3: [0, 1, 2]}
Get the length of the value of a specific key:
>>> my_dict[3]
[0, 1, 2]
>>> len(my_dict[3])
3
Get a dict of the lengths of the values of each key:
>>> key_to_value_lengths = {k:len(v) for k, v in my_dict.items()}
>>> key_to_value_lengths
{0: 0, 1: 1, 2: 2, 3: 3}
>>> key_to_value_lengths[2]
2
Get the sum of the lengths of all values in the dict:
>>> [len(x) for x in my_dict.values()]
[0, 1, 2, 3]
>>> sum([len(x) for x in my_dict.values()])
6
To find all of the lengths of the values in a dictionary you can do this:
lengths = [len(v) for v in d.values()]
A common use case I have is a dictionary of numpy arrays or lists where I know they're all the same length, and I just need to know one of them (e.g. I'm plotting timeseries data and each timeseries has the same number of timesteps). I often use this:
length = len(next(iter(d.values())))
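Since that shortcut relies on every value having the same length, it can be worth checking the assumption first. A minimal sketch, with made-up sample data:

```python
# Hypothetical sample data: two time series with the same number of steps.
d = {"temperature": [20.1, 20.4, 19.8], "pressure": [1.01, 1.02, 1.00]}

# Collect the distinct lengths; exactly one entry means all values agree.
lengths = {len(v) for v in d.values()}
if len(lengths) != 1:
    raise ValueError(f"expected one common length, got {sorted(lengths)}")

length = lengths.pop()
print(length)  # 3
```

An empty dictionary also raises here, because there is no length to report.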
Let the dictionary be:
d={'key':['value1','value2']}
If you know the key:
print(len(d['key']))
Otherwise:
val=[len(i) for i in d.values()]
print(val[0])
# prints the length of the first key's value, or the common length if all keys hold the same number of values.
d={1:'a',2:'b'}
count=0
# count the entries one by one (equivalent to len(d))
for key in d:
    count=count+1
print(count)
# OUTPUT: 2
This seems like such an obvious thing that I feel like I'm missing out on something, but how do you find out if two different keys in the same dictionary have the exact same value? For example, if you have the dictionary test with the keys a, b, and c and the keys a and b both have the value of 10, how would you figure that out? (For the point of the question, please assume a large number of keys, say 100, and you have no knowledge of how many duplicates there are, if there are multiple sets of duplicates, or if there are duplicates at all). Thanks.
len(dictionary.values()) == len(set(dictionary.values()))
This assumes the only thing you want to know is whether there are any duplicate values, not which values are duplicated, which is what I took from your question. The expression is True when there are no duplicates. Let me know if I misinterpreted the question.
Basically this just checks whether any entries were removed when the values of the dictionary were cast to a set, which by definition doesn't have any duplicates.
If the above doesn't work for your purposes, this should be a better solution:
set(k for k,v in d.items() if list(d.values()).count(v) > 1)
The second version collects every key whose value occurs more than once among the dictionary's values. Note that it calls count once per key, so it is quadratic in the number of entries.
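Both checks can also be done in a single pass with collections.Counter. This is a sketch that assumes the values are hashable, using a small made-up dictionary:

```python
from collections import Counter

d = {"a": 10, "b": 15, "c": 10}

# Count how often each value occurs (values must be hashable).
counts = Counter(d.values())

has_duplicates = any(n > 1 for n in counts.values())
keys_with_duplicates = {k for k, v in d.items() if counts[v] > 1}

print(has_duplicates)                # True
print(sorted(keys_with_duplicates))  # ['a', 'c']
```

Counting first and filtering afterwards keeps the work linear in the number of entries.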
To detect all of these cases:
>>> import collections
>>> d = {"a": 10, "b": 15, "c": 10}
>>> value_to_key = collections.defaultdict(list)
>>> for k, v in d.items():
...     value_to_key[v].append(k)
...
>>> value_to_key
defaultdict(<class 'list'>, {10: ['a', 'c'], 15: ['b']})
@hivert makes the excellent point that this only works if the values are hashable. If they are not, there is sadly no nice O(n) solution. This is the best I can come up with:
d = {"a": [10, 15], "b": [10, 20], "c": [10, 15]}
values = []
for k, v in d.items():
    must_insert = True
    # linear scan over the values seen so far, since they cannot be hashed
    for val in values:
        if val[0] == v:
            val[1].append(k)
            must_insert = False
            break
    if must_insert: values.append([v, [k]])
print([v for v in values if len(v[1]) > 1]) #prints [[[10, 15], ['a', 'c']]]
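If the unhashable values happen to be lists of hashable items, as in the example above, one workaround is to convert each list to a tuple and use that as the dictionary key. A sketch under that assumption:

```python
from collections import defaultdict

d = {"a": [10, 15], "b": [10, 20], "c": [10, 15]}

value_to_keys = defaultdict(list)
for k, v in d.items():
    # Lists are unhashable, but a tuple of hashable items can be a dict key.
    value_to_keys[tuple(v)].append(k)

duplicates = {v: ks for v, ks in value_to_keys.items() if len(ks) > 1}
print(duplicates)  # {(10, 15): ['a', 'c']}
```

This restores the single-pass behaviour, but it does not help when the list items themselves are unhashable.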
You can tell which are the duplicate values by means of a reverse index - where the key is the duplicate value and the value is the set of keys that have that value (this will work as long as the values in the input dictionary are hashable):
from collections import defaultdict
d = {'w':20, 'x':10, 'y':20, 'z':30, 'a':10}
dd = defaultdict(set)
for k, v in d.items():
    dd[v].add(k)
dd = { k : v for k, v in dd.items() if len(v) > 1 }
dd
=> {10: set(['a', 'x']), 20: set(['y', 'w'])}
From that last result it's easy to obtain the set of keys with duplicate values:
set.union(*dd.values())
=> set(['y', 'x', 'a', 'w'])
dico = {'a':0, 'b':0, 'c':1}
result = {}
for val in dico:
    if dico[val] in result:
        result[dico[val]].append(val)
    else:
        result[dico[val]] = [val]
>>> result
{0: ['a', 'b'], 1: ['c']}
Then you can filter result for the entries whose value (a list) has more than one element, i.e. the entries where a duplicate has been found.
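That filtering step could be sketched like this, starting from the result shown above (the name duplicates is just for illustration):

```python
# The grouped dictionary produced by the loop above.
result = {0: ['a', 'b'], 1: ['c']}

# Keep only the values that are held by more than one key.
duplicates = {value: keys for value, keys in result.items() if len(keys) > 1}
print(duplicates)  # {0: ['a', 'b']}
```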
Build another dict mapping the values of the first dict to all keys that hold that value:
import collections
inverse_dict = collections.defaultdict(list)
for key in original_dict:
    inverse_dict[original_dict[key]].append(key)
keys = set()
for key1 in d:
    for key2 in d:
        if key1 == key2: continue
        if d[key1] == d[key2]:
            keys |= {key1, key2}
Note that this is Θ(n²). The reason is that a dict does not provide Θ(1) lookup of a key given a value. So rethink your data structure choices if that's not good enough.
You can use list in conjunction with dictionary to find duplicate elements!
Here is a simple code demonstrating the same:
d={"val1":4,"val2":4,"val3":5,"val4":3}
l=[]
for key in d:
    l.append(d[key])
l.sort()
print(l)
# stop one short of the end so l[i+1] stays in range
for i in range(len(l)-1):
    if l[i]==l[i+1]:
        print("true, there are duplicate elements.")
        print("the keys having duplicate elements are: ")
        for key in d:
            if d[key]==l[i]:
                print(key)
        break
output:
[3, 4, 4, 5]
true, there are duplicate elements.
the keys having duplicate elements are:
val1
val2
When you sort the elements in the list, equal values always end up next to each other, so comparing neighbours is enough to find duplicates.
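The same sort-then-compare idea can be written with itertools.groupby, which groups adjacent equal items. This sketch reports every duplicated value rather than stopping at the first one; the name duplicates is made up for illustration.

```python
from itertools import groupby

d = {"val1": 4, "val2": 4, "val3": 5, "val4": 3}

# Sort the (key, value) pairs by value so equal values become adjacent.
pairs = sorted(d.items(), key=lambda item: item[1])

duplicates = {}
for value, group in groupby(pairs, key=lambda item: item[1]):
    keys = [key for key, _ in group]
    if len(keys) > 1:
        duplicates[value] = keys

print(duplicates)  # {4: ['val1', 'val2']}
```

Sorting costs O(n log n), so this sits between the quadratic pairwise comparison and the linear inverse-dictionary approach.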