Python: Match a dictionary value with another dictionary key

I have two dictionaries created this way:
tr = defaultdict(list)
tr = {'critic': '2_critic',
      'major': '3_major',
      'all': ['2_critic', '3_major']}
And the second one:
scnd_dict = defaultdict(list)
It contains values like this:
scnd_dict = {'severity': ['all']}
I want to have a third dict that will contain the key of scnd_dict and its corresponding value from tr.
This way, I will have :
third_dict = {'severity' : ['2_critic','3_major']}
I tried this, but it didn't work:
for (k, v) in scnd_dict.iteritems():
    if v in tr:
        third_dict[k].append(tr[v])
Any help would be appreciated. Thanks.

Well...
from collections import defaultdict
tr = {'critic': '2_critic',
      'major': '3_major',
      'all': ['2_critic', '3_major']}
scnd_dict = {'severity': ['all']}
third_dict = {}
for k, v in scnd_dict.iteritems():
    vals = []
    if isinstance(v, list):
        for i in v:
            vals.append(tr.get(i))
    else:
        vals.append(tr.get(v))
    if not vals:
        continue
    third_dict[k] = vals
print third_dict
Results:
>>>
{'severity': [['2_critic', '3_major']]}
This gets close to what you want, although the value ends up nested one level deeper than in your example. But I question the logic of using defaultdicts here, or of having your index be part of a list...
If you use non-lists for scnd_dict then you can do the whole thing much more easily. Assuming scnd_dict looks like this: scnd_dict = {'severity': 'all'}:
d = dict((k, tr.get(v)) for k, v in scnd_dict.items())
# {'severity': ['2_critic', '3_major']}

Your problem is that v is a list, not an item of a list, so the test if v in tr: can never succeed (in fact, a list is unhashable, so using it in a dict membership test raises TypeError). Change your code so that you iterate over the items in v.
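A minimal sketch of iterating over the items in v, assuming Python 3 (items() instead of iteritems()) and the data from the question. Because tr mixes strings and lists, the sketch wraps single strings before extending; that wrapping is an assumption about the desired output, not something stated in the answer above.

```python
tr = {'critic': '2_critic',
      'major': '3_major',
      'all': ['2_critic', '3_major']}
scnd_dict = {'severity': ['all']}

third_dict = {}
for key, names in scnd_dict.items():
    matched = []
    for name in names:          # iterate over the items of the list
        if name in tr:          # each item is a hashable string
            value = tr[name]
            # tr mixes strings and lists, so normalise before extending
            matched.extend(value if isinstance(value, list) else [value])
    if matched:
        third_dict[key] = matched

print(third_dict)  # {'severity': ['2_critic', '3_major']}
```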

third_dict = {k: [t for m in ks for t in tr[m]] for k,ks in scnd_dict.iteritems()}

The second dict's value is a list, not a str, so the code below will work:
for (k, v) in scnd_dict.iteritems():
    if v[0] in tr.keys():
        third_dict[k] = tr[v[0]]

The problem is that the third dictionary does not know that the values are lists:
for k in scnd_dict:
    for v in scnd_dict[k]:
        print v
        for k2 in tr:
            if v == k2:
                if k not in third_dict:
                    third_dict[k] = tr[k2]
                else:
                    third_dict[k] += tr[k2]

third_dict = {k: tr[v[0]] for k, v in scnd_dict.iteritems() if v[0] in tr}
This
tr = defaultdict(list)
is a waste of time if you are just rebinding tr on the next line. Likewise for scnd_dict.
It's a better idea to make all the values of tr lists, even if they only have one item. It will mean fewer special cases to worry about later on.
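As a rough illustration of that suggestion, here is a sketch in Python 3 where every value of tr is a list. The 'level' entry in scnd_dict is hypothetical and only there to show that single-item values need no special handling.

```python
# Every value is a list, even when it holds a single item.
tr = {'critic': ['2_critic'],
      'major': ['3_major'],
      'all': ['2_critic', '3_major']}

# 'level' is a hypothetical extra entry added for illustration.
scnd_dict = {'severity': ['all'], 'level': ['critic']}

# Flatten the lists found for each name; unknown names contribute nothing.
third_dict = {
    key: [item for name in names for item in tr.get(name, [])]
    for key, names in scnd_dict.items()
}

print(third_dict)
# {'severity': ['2_critic', '3_major'], 'level': ['2_critic']}
```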

Related

python dictionary of value has list mapping

I have dictionary of list as input
x={'a':[1,2,3,4,5],'b':[9,2,3,4,5]}
I want output like this
[{a:1,b:9},{a:2,b:2},{a:3,b:3},{a:4,b:4},{a:5,b:5}]
I spent two days on this but could not get it to work. Thank you.
Try this:
l = []
for i in range(len(list(x.values())[0])):
    d = {}
    for k, v in x.items():
        d[k] = v[i]
    l.append(d)
You can use:
[dict(zip(x, v)) for v in zip(*x.values())]
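To see why the one-liner works, here is the same expression run on the question's data (Python 3). zip(*x.values()) transposes the lists into per-position tuples, and dict(zip(x, values)) pairs each tuple with the keys. Note that zip stops at the shortest list, so lists of unequal length would be silently truncated.

```python
x = {'a': [1, 2, 3, 4, 5], 'b': [9, 2, 3, 4, 5]}

# zip(*x.values()) yields (1, 9), (2, 2), (3, 3), ...
rows = [dict(zip(x, values)) for values in zip(*x.values())]

print(rows)
# [{'a': 1, 'b': 9}, {'a': 2, 'b': 2}, {'a': 3, 'b': 3}, {'a': 4, 'b': 4}, {'a': 5, 'b': 5}]
```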

Read a string in dictionary values and replace a specific character

I have a dict in which each value is a string. In some values, this string has "-" that I would like to remove. I have been told that it is not possible to replace the values of a dict. Is that right?
mydict
'GCA_000010565.1_genomic Ribosomal_L10:': '-TRAEKEAIIQELKEKFKEARVAVLADYRGLNV-------AEATRLRRRLREAGCEFKVAKNTLTGLAARQAGLE-----GLDPYLEGPIAIAFG-VDPVAPAKVLSDF--',
I would wish something like
mydict
'GCA_000010565.1_genomic Ribosomal_L10:': 'TRAEKEAIIQELKEKFKEARVAVLADYRGLNVAEATRLRRRLREAGCEFKVAKNTLTGLAARQAGLEGLDPYLEGPIAIAFGVDPVAPAKVLSDF',
Absolutely you can: just iterate over the key/value pairs and replace each value with the processed one.
d = {'superkey': "foo--bar", 'superkey2': "--foo--bar",
     'GCA_000010565.1_genomic Ribosomal_L10:': '-TRAEKEAIIQELKEKFKEARVAVLADYRGLNV-------AEATRLRRRLREAGCEFKVAKNTLTGLAARQAGLE-----GLDPYLEGPIAIAFG-VDPVAPAKVLSDF--', }
# LOOP version
for k, v in d.items():
    d[k] = v.replace("-", "")
# DICT COMPREHENSION version
d = {k: v.replace("-", "") for k, v in d.items()}
print(d)  # {'superkey': 'foobar', 'superkey2': 'foobar',
          #  'GCA_000010565.1_genomic Ribosomal_L10:': 'TRAEKEAIIQELKEKFKEARVAVLADYRGLNVAEATRLRRRLREAGCEFKVAKNTLTGLAARQAGLEGLDPYLEGPIAIAFGVDPVAPAKVLSDF'}
Yes it is possible. You can simply use
mydict['GCA_000010565.1_genomic Ribosomal_L10:'] = mydict['GCA_000010565.1_genomic Ribosomal_L10:'].replace("-","")
No, you've been told BS. The solution:
for k in mydict:
    mydict[k] = mydict[k].replace('-', '')
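One detail that may matter when choosing between a loop and a dict comprehension: the loop updates the existing dict object, while a comprehension builds a new dict and rebinds the name. A small sketch with hypothetical short values (seq1, seq2, alias and alias2 are illustrative names, not from the question):

```python
# Hypothetical short values standing in for the long sequences.
mydict = {'seq1': '-AB--C-', 'seq2': 'DEF'}
alias = mydict                      # second reference to the same dict

# In-place loop: assigning to existing keys while iterating is safe,
# and every reference to the dict sees the cleaned values.
for key in mydict:
    mydict[key] = mydict[key].replace('-', '')
print(alias['seq1'])                # ABC

# Comprehension: builds a new dict; an older reference keeps the old values.
original = {'seq1': '-AB--C-'}
alias2 = original
original = {k: v.replace('-', '') for k, v in original.items()}
print(original['seq1'], alias2['seq1'])   # ABC -AB--C-
```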

Create dictionary from dict and list

I have a dictionary :
dicocategory = {}
dicocategory["a"] = ["crapow", "Killian", "pauk", "victor"]
dicocategory["b"] = ["graton", "fred"]
dicocategory["c"] = ["babar", "poca", "german", "Georges", "nowak"]
dicocategory["d"] = ["crado", "cradi", "hibou", "distopia", "fiboul"]
dicocategory["e"] = ["makenkosapo"]
and a list :
my_list = ['makenkosapo', 'Killian', 'Georges', 'poca', 'nowak']
I want to create a new dictionary with my dicocategory's keys as new keys and items of my list as values.
To get the keys of my new dict (with duplicates removed, and restricted to my list) I wrote:
def tablemain(my_list):
    tableheaders = list()
    for value in my_list:
        tableheaders.append([k for k, v in dicocategory.items() if value in v])
    convertlist = [j for i in tableheaders for j in i]
    headerstablefinal = list(set(convertlist))
    return headerstablefinal
giving me:
['e', 'a', 'c']
My problem is: I don't know how to put the items of my list in the corresponding keys.
EDIT :
Below is the output I want:
{"a" : ['Killian'], 'c' : ['Georges', 'poca', 'nowak'], 'e' : ['makenkosapo']}
The list my_list can change, so I want something that can create the new dictionary regardless of the list's contents.
If my new list is :
my_list = ['crapow', 'german', 'pauk']
My output will be :
{'a':['crapow', 'pauk'], 'c':['german']}
Do you have any idea?
Thank you
You can use a couple of dictionary comprehensions. Calculate the intersection in the first, and in the second remove instances where the intersection is empty:
my_set = set(my_list)
# calculate intersection
res = {k: set(v) & my_set for k, v in dicocategory.items()}
# remove zero intersection values
res = {k: v for k, v in res.items() if v}
print(res)
{'a': {'Killian'},
'c': {'Georges', 'nowak', 'poca'},
'e': {'makenkosapo'}}
More efficiently, you can use a generator expression to avoid an intermediary dictionary:
# generate intersection
gen = ((k, set(v) & my_set) for k, v in dicocategory.items())
# remove zero intersection values
res = {k: v for k, v in gen if v}
You can get a dictionary containing only the keys whose values overlap with your list like this (note that each kept key retains its full original list, not just the matching items):
{k:v for k,v in dicocategory.items() if set(v).intersection(set(my_list))}
You won't be able to put that directly into a DataFrame though as the lists differ in length.
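If the values should be lists in the order of my_list, as in the question's expected output, a reverse lookup is one option. This is only a sketch: group_by_category is a hypothetical helper name, and it assumes each item belongs to at most one category.

```python
dicocategory = {
    "a": ["crapow", "Killian", "pauk", "victor"],
    "b": ["graton", "fred"],
    "c": ["babar", "poca", "german", "Georges", "nowak"],
    "d": ["crado", "cradi", "hibou", "distopia", "fiboul"],
    "e": ["makenkosapo"],
}

def group_by_category(my_list, categories):
    # Reverse index: item -> category key (assumes items are unique across categories).
    item_to_key = {item: key for key, items in categories.items() for item in items}
    grouped = {}
    for item in my_list:
        key = item_to_key.get(item)
        if key is not None:         # ignore items that belong to no category
            grouped.setdefault(key, []).append(item)
    return grouped

print(group_by_category(['makenkosapo', 'Killian', 'Georges', 'poca', 'nowak'], dicocategory))
# {'e': ['makenkosapo'], 'a': ['Killian'], 'c': ['Georges', 'poca', 'nowak']}
print(group_by_category(['crapow', 'german', 'pauk'], dicocategory))
# {'a': ['crapow', 'pauk'], 'c': ['german']}
```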

Combine python dictionaries that share values and keys

I am doing some entity matching based on string edit distance and my results are a dictionary with keys (query string) and values [list of similar strings] based on some scoring criteria.
for example:
results = {
'ben' : ['benj', 'benjamin', 'benyamin'],
'benj': ['ben', 'beny', 'benjamin'],
'benjamin': ['benyamin'],
'benyamin': ['benjamin'],
'carl': ['karl'],
'karl': ['carl'],
}
Each value also has a corresponding dictionary item, for which it is the key (e.g. 'carl' and 'karl').
I need to combine the elements that have shared values, choosing one value as the new key (let's say the longest string). In the above example I would hope to get:
results = {
'benjamin': ['ben', 'benj', 'benyamin', 'beny', 'benjamin', 'benyamin'],
'carl': ['carl','karl']
}
I have tried iterating through the dictionary using the keys, but I can't wrap my head around how to iterate and compare through each dictionary item and its list of values (or single value).
This is one solution using collections.defaultdict and sets.
The desired output is very similar to what you have, and can be easily manipulated to align.
from collections import defaultdict
results = {
    'ben' : ['benj', 'benjamin', 'benyamin'],
    'benj': ['ben', 'beny', 'benjamin'],
    'benjamin': 'benyamin',
    'benyamin': 'benjamin',
    'carl': 'karl',
    'karl': 'carl',
}
d = defaultdict(set)
for i, (k, v) in enumerate(results.items()):
    w = {k} | (set(v) if isinstance(v, list) else {v})
    for m, n in d.items():
        if not n.isdisjoint(w):
            d[m].update(w)
            break
    else:
        d[i] = w
result = {max(v, key=len): v for k, v in d.items()}
# {'benjamin': {'ben', 'benj', 'benjamin', 'beny', 'benyamin'},
# 'carl': {'carl', 'karl'}}
Credit to @IMCoins for the idea of manipulating v to w in second loop.
Explanation
There are 3 main steps:
1. Convert values into a consistent set format, including keys and values from original dictionary.
2. Cycle through this dictionary and add values to a new dictionary. If there is an intersection with some key [i.e. sets are not disjoint], then use that key. Otherwise, add to new key determined via enumeration.
3. Create result dictionary in a final transformation by mapping max length key to values.
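A different way to express the same grouping is to treat the data as a graph and collect connected components. This is a swapped-in technique, not the method described in the steps above; it is sketched here in Python 3 because it also merges groups that only become linked by a later entry. The tie-break between equally long names (alphabetically first) is an assumption; the question only asks for the longest string.

```python
from collections import defaultdict

results = {
    'ben': ['benj', 'benjamin', 'benyamin'],
    'benj': ['ben', 'beny', 'benjamin'],
    'benjamin': ['benyamin'],
    'benyamin': ['benjamin'],
    'carl': ['karl'],
    'karl': ['carl'],
}

# Undirected graph: every key is linked to each of its similar strings.
graph = defaultdict(set)
for key, values in results.items():
    graph[key]                      # make sure keys with no values get a node
    for value in values:
        graph[key].add(value)
        graph[value].add(key)

merged = {}
seen = set()
for start in list(graph):
    if start in seen:
        continue
    # Depth-first traversal collects one connected group.
    group, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in group:
            group.add(node)
            stack.extend(graph[node] - group)
    seen |= group
    # Longest name wins; ties are broken alphabetically (an assumption).
    merged[min(group, key=lambda s: (-len(s), s))] = group

print(merged)
```

With the question's data this should give two groups, keyed 'benjamin' and 'carl'.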
EDIT : Even though performance was not the question here, I took the liberty to perform some tests between jpp's answer, and mine... here is the full script. My script performs the tests in 17.79 seconds, and his in 23.5 seconds.
import timeit
results = {
    'ben' : ['benj', 'benjamin', 'benyamin'],
    'benj': ['ben', 'beny', 'benjamin'],
    'benjamin': ['benyamin'],
    'benyamin': ['benjamin'],
    'carl': ['karl'],
    'karl': ['carl'],
}
def imcoins(result):
    new_dict = {}
    # .items() for python3x
    for k, v in results.iteritems():
        flag = False
        # Checking if key exists...
        if k not in new_dict.keys():
            # But then, we also need to check its values.
            for item in v:
                if item in new_dict.keys():
                    # If we update, set the flag to True, so we don't create a new value.
                    new_dict[item].update(v)
                    flag = True
            if flag == False:
                new_dict[k] = set(v)
    # Now, to sort our newly created dict...
    sorted_dict = {}
    for k, v in new_dict.iteritems():
        max_string = max(v)
        if len(max_string) > len(k):
            sorted_dict[max(v, key=len)] = set(v)
        else:
            sorted_dict[k] = v
    return sorted_dict
def jpp(result):
    from collections import defaultdict
    res = {i: {k} | (set(v) if isinstance(v, list) else {v}) \
           for i, (k, v) in enumerate(results.items())}
    d = defaultdict(set)
    for i, (k, v) in enumerate(res.items()):
        for m, n in d.items():
            if n & v:
                d[m].update(v)
                break
        else:
            d[i] = v
    result = {max(v, key=len): v for k, v in d.items()}
    return result
iterations = 1000000
time1 = timeit.timeit(stmt='imcoins(results)', setup='from __main__ import imcoins, results', number=iterations)
time2 = timeit.timeit(stmt='jpp(results)', setup='from __main__ import jpp, results', number=iterations)
print time1 # Outputs : 17.7903265883
print time2 # Outputs : 23.5605850732
If I move the import from his function to global scope, it gives...
imcoins : 13.4129249463 seconds
jpp : 21.8191823393 seconds

How to split a list inside a dictionary to create a new one?

I've been struggling with something all day.
I have a dictionary in the format
dict = {a:[element1, element2, element3], b:[element4, element5, element6]...}
I want a new dictionary of the form
newdict = {a:element1, b:element4...}
That is, keeping only the first element of the list stored under each key.
You can use a dictionary comprehension:
{k: v[0] for k, v in d.items()}
# {'a': 'element1', 'b': 'element4'}
Hopefully this helps.
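One caveat with v[0] is that it raises IndexError on an empty list. A small sketch in Python 3 that skips empty lists; the 'c' entry is hypothetical and added only to show the guard, and whether to skip or store a default is a design choice the question does not specify.

```python
d = {'a': ['element1', 'element2', 'element3'],
     'b': ['element4', 'element5', 'element6'],
     'c': []}                       # hypothetical empty list

# "if v" filters out empty lists before v[0] is evaluated.
newdict = {k: v[0] for k, v in d.items() if v}

print(newdict)  # {'a': 'element1', 'b': 'element4'}
```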
I like to check whether the dictionary already has a key before overwriting that key's value.
dict = {a:[element1, element2, element3], b:[element4, element5, element6]}
Python 2
newDict = {}
for k, v in dict.iteritems():
    if k not in newDict:
        # add the first list value to the newDict's key
        newDict[k] = v[0]
Python 3
newDict = {}
for k, v in dict.items():
    if k not in newDict:
        # add the first list value to the newDict's key
        newDict[k] = v[0]
