Renaming the key in a list of dicts - python

I have a list of dicts:
[{"id" : "2016_a",
"data_1" : 106,
"data_2" : 200},
{"id" : "2015_a",
"data_1" : 110,
"data_2" : 105}
]
I wish to take the id record and use it to create a single, unique list of dicts, such that:
[{"data_1_2016_a" : 106,
"data_1_2015_a" : 110,
"data_2_2016_a" : 200,
"data_2_2015_a" : 105}]
How do I append the string id to the key value of each other record?

It's simple to loop through and assign.
result = {}
for elem in data:
    for k, v in elem.items():
        if k != 'id':
            result['{}_{}'.format(elem['id'], k)] = v

You can simply use dictionary comprehension:
result = {dic['id']+'_'+k:v for dic in dictionaries
for k,v in dic.items() if k != 'id'}
You can (probably) improve efficiency a bit by using a generator inside:
result = {dicid+k:v for dic,dicid in ((_dic,_dic['id']+'_') for _dic in dictionaries)
for k,v in dic.items() if k != 'id'}
Here we save on lookups (dic['id']) and append the underscore only once per dictionary. This results in:
>>> {dicid+k:v for dic,dicid in ((_dic,_dic['id']+'_') for _dic in dictionaries)
... for k,v in dic.items() if k != 'id'}
{'2016_a_data_1': 106, '2015_a_data_2': 105, '2016_a_data_2': 200, '2015_a_data_1': 110}
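The desired output in the question puts the original key first and the id second (data_1_2016_a), and wraps the dict in a list. A minimal sketch producing that shape, assuming the input list is named dictionaries as in the comprehension above:

```python
dictionaries = [
    {"id": "2016_a", "data_1": 106, "data_2": 200},
    {"id": "2015_a", "data_1": 110, "data_2": 105},
]

# Build "<key>_<id>" names; the "id" entry itself is skipped.
merged = {
    f"{key}_{record['id']}": value
    for record in dictionaries
    for key, value in record.items()
    if key != "id"
}

# The question asks for a list holding the single merged dict.
result = [merged]
print(result)
```

If two records share the same id, later entries silently overwrite earlier ones, so this only works when ids are unique.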

Related

Inverting a dictionary in Python with duplicate values

I need to invert a dictionary so that each old value becomes a key and the old keys become the new values.
The trick is that several keys in the old dictionary can share the same value, so each value in the new dictionary needs to be a list: if two keys had identical values in the old dictionary, both keys end up in that list.
For example:
the dictionary {"python" : 1, "is" : 1, "cool" : 2}
would end up as: {1 : ["python", "is"], 2 : ["cool"]}
This is what I tried:
def inverse_dict(my_dict):
    new_dict = {}
    values_list = list(my_dict.values())
    new_dict = new_dict.fromkeys(values_list)
    for key in new_dict:
        new_dict[key] = []
    for old_key in my_dict:
        new_dict[my_dict[old_key]] = list(new_dict[my_dict[old_key]]).append(old_key)
    return new_dict
Would greatly appreciate any help with my approach (and better approaches to the problem) as I am very new to Python, thanks!
You can use dict.setdefault to check whether a key exists in the dictionary and, if it does not, create a new value for it (in this case an empty list []):
d = {"python" : 1, "is" : 1, "cool" : 2}
reversed_d = {}
for k, v in d.items():
    reversed_d.setdefault(v, []).append(k)
print(reversed_d)
Prints:
{1: ['python', 'is'], 2: ['cool']}
This can be more explicitly rewritten as:
d = {"python" : 1, "is" : 1, "cool" : 2}
reversed_d = {}
for k, v in d.items():
    if v not in reversed_d:
        reversed_d[v] = [k]
    else:
        reversed_d[v].append(k)
print(reversed_d)
You can use a defaultdict to avoid the pre-fill step
from collections import defaultdict
def inverse_dict(my_dict: dict):
    new_dict = defaultdict(list)
    for k, v in my_dict.items():
        new_dict[v].append(k)
    return new_dict
Though I prefer @azro's answer with the defaultdict, another solution uses dictionary and list comprehensions.
It looks like this:
{value : [key for key in my_dict if my_dict[key] == value] for value in set(my_dict.values())}
It iterates over the values of the dictionary without duplicates - set(my_dict.values()).
It builds every value as a key (because it's on the left side of the ":").
And its value is a list of the keys that point to that value - [key for key in my_dict if my_dict[key] == value].
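As for the attempt in the question: list.append modifies the list in place and returns None, so assigning its result back replaces the list with None. A minimal corrected sketch of that pre-fill approach, keeping the asker's function name:

```python
def inverse_dict(my_dict):
    # Give every distinct old value its own empty list.
    # (dict.fromkeys(values, []) would share ONE list between all keys.)
    new_dict = {value: [] for value in my_dict.values()}
    for old_key, old_value in my_dict.items():
        # append() mutates the list and returns None, so its result is not assigned.
        new_dict[old_value].append(old_key)
    return new_dict


inverted = inverse_dict({"python": 1, "is": 1, "cool": 2})
print(inverted)  # {1: ['python', 'is'], 2: ['cool']}
```

This does the same work as the setdefault and defaultdict versions; it just makes the pre-fill step explicit.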

Dict Comprehension: appending to a key value if key is present, create a new key:value pair if key is not present

My code is as follows:
for user,score in data:
    if user in res.keys():
        res[user] += [score]
    else:
        res[user] = [score]
where data is a list of lists arranged as such:
data = [["a",100],["b",200],["a",50]]
and the result I want is:
res = {"a":[100,50],"b":[200]}
Would it be possible to do this with a single dictionary comprehension?
This can be simplified using dict.setdefault or collections.defaultdict
Ex:
data = [["a",100],["b",200],["a",50]]
res = {}  # or collections.defaultdict(list)
for k, v in data:
    res.setdefault(k, []).append(v)  # with a defaultdict, use res[k].append(v)
print(res)
Output:
{'a': [100, 50], 'b': [200]}
You could use .update on your dictionary:
data = [["a",100],["b",200],["a",50]]
dictionary = dict()
for user,score in data:
    if user in dictionary.keys():
        dictionary[user] += [score]
    else:
        dictionary.update({user:[score]})
output:
{'a': [100, 50], 'b': [200]}
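To answer the literal question, a single dictionary comprehension is possible, though it rescans data once per entry, so it is quadratic where the loops above are linear. A minimal sketch:

```python
data = [["a", 100], ["b", 200], ["a", 50]]

# For each user, collect every score whose row has the same user name.
# Repeated users just rebuild the same list, so duplicates are harmless.
res = {
    user: [score for name, score in data if name == user]
    for user, _ in data
}
print(res)  # {'a': [100, 50], 'b': [200]}
```

For anything beyond small inputs, the setdefault or defaultdict loop is probably the better choice.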

Create dictionary from dict and list

I have a dictionary :
dicocategory = {}
dicocategory["a"] = ["crapow", "Killian", "pauk", "victor"]
dicocategory["b"] = ["graton", "fred"]
dicocategory["c"] = ["babar", "poca", "german", "Georges", "nowak"]
dicocategory["d"] = ["crado", "cradi", "hibou", "distopia", "fiboul"]
dicocategory["e"] = ["makenkosapo"]
and a list :
my_list = ['makenkosapo', 'Killian', 'Georges', 'poca', 'nowak']
I want to create a new dictionary with my dicocategory's keys as new keys and items of my list as values.
To get the keys of my new dict (removing duplicate content and adapted to my list) I made :
def tablemain(my_list ):
    tableheaders = list()
    for value in my_list:
        tableheaders.append([k for k, v in dicocategory.items() if value in v])
    convertlist = [j for i in tableheaders for j in i]
    headerstablefinal = list(set(convertlist))
    return headerstablefinal
giving me:
['e', 'a', 'c']
My problem is: I don't know how to put the items of my list in the corresponding keys.
EDIT :
Below is the output I want:
{"a" : ['Killian'], 'c' : ['Georges', 'poca', 'nowak'], 'e' : ['makenkosapo']}
The list my_list can change, so I want something that builds the new dictionary whatever the list contains.
If my new list is :
my_list = ['crapow', 'german', 'pauk']
My output will be :
{'a':['crapow', 'pauk'], 'c':['german']}
Do you have any idea?
Thank you
You can use a couple of dictionary comprehensions. Calculate the intersection in the first, and in the second remove instances where the intersection is empty:
my_set = set(my_list)
# calculate intersection
res = {k: set(v) & my_set for k, v in dicocategory.items()}
# remove zero intersection values
res = {k: v for k, v in res.items() if v}
print(res)
{'a': {'Killian'},
'c': {'Georges', 'nowak', 'poca'},
'e': {'makenkosapo'}}
More efficiently, you can use a generator expression to avoid an intermediary dictionary:
# generate intersection
gen = ((k, set(v) & my_set) for k, v in dicocategory.items())
# remove zero intersection values
res = {k: v for k, v in gen if v}
You can get a dictionary containing only keys with values that match your list like this:
{k:v for k,v in dicocategory.items() if set(v).intersection(set(my_list))}
You won't be able to put that directly into a DataFrame though as the lists differ in length.
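The answers above return sets or whole category lists, whereas the desired output in the question keeps lists in the order of my_list. A minimal sketch of one way to get that, assuming each name belongs to at most one category; the helper name group_by_category is made up for illustration:

```python
from collections import defaultdict

dicocategory = {
    "a": ["crapow", "Killian", "pauk", "victor"],
    "b": ["graton", "fred"],
    "c": ["babar", "poca", "german", "Georges", "nowak"],
    "d": ["crado", "cradi", "hibou", "distopia", "fiboul"],
    "e": ["makenkosapo"],
}


def group_by_category(items, categories):
    # Reverse index: item -> category key (assumes an item is in one category only).
    item_to_key = {
        item: key for key, members in categories.items() for item in members
    }
    grouped = defaultdict(list)
    for item in items:
        key = item_to_key.get(item)
        if key is not None:  # items found in no category are ignored
            grouped[key].append(item)
    return dict(grouped)


first = group_by_category(
    ["makenkosapo", "Killian", "Georges", "poca", "nowak"], dicocategory
)
second = group_by_category(["crapow", "german", "pauk"], dicocategory)
print(first)
print(second)
```

Building the reverse index once means each lookup is a dictionary access rather than a scan of every category.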

Combine python dictionaries that share values and keys

I am doing some entity matching based on string edit distance and my results are a dictionary with keys (query string) and values [list of similar strings] based on some scoring criteria.
for example:
results = {
'ben' : ['benj', 'benjamin', 'benyamin'],
'benj': ['ben', 'beny', 'benjamin'],
'benjamin': ['benyamin'],
'benyamin': ['benjamin'],
'carl': ['karl'],
'karl': ['carl'],
}
Each value also has a corresponding dictionary item, for which it is the key (e.g. 'carl' and 'karl').
I need to combine the elements that have shared values, choosing one value as the new key (let's say the longest string). In the above example I would hope to get:
results = {
'benjamin': ['ben', 'benj', 'benyamin', 'beny', 'benjamin', 'benyamin'],
'carl': ['carl','karl']
}
I have tried iterating through the dictionary using the keys, but I can't wrap my head around how to iterate and compare through each dictionary item and its list of values (or single value).
This is one solution using collections.defaultdict and sets.
The desired output is very similar to what you have, and can be easily manipulated to align.
from collections import defaultdict
results = {
'ben' : ['benj', 'benjamin', 'benyamin'],
'benj': ['ben', 'beny', 'benjamin'],
'benjamin': 'benyamin',
'benyamin': 'benjamin',
'carl': 'karl',
'karl': 'carl',
}
d = defaultdict(set)
for i, (k, v) in enumerate(results.items()):
    w = {k} | (set(v) if isinstance(v, list) else {v})
    for m, n in d.items():
        if not n.isdisjoint(w):
            d[m].update(w)
            break
    else:
        d[i] = w
result = {max(v, key=len): v for k, v in d.items()}
# {'benjamin': {'ben', 'benj', 'benjamin', 'beny', 'benyamin'},
# 'carl': {'carl', 'karl'}}
Credit to @IMCoins for the idea of manipulating v to w in second loop.
Explanation
There are 3 main steps:
Convert values into a consistent set format, including keys and values from original dictionary.
Cycle through this dictionary and add values to a new dictionary. If there is an intersection with some key [i.e. sets are not disjoint], then use that key. Otherwise, add to new key determined via enumeration.
Create result dictionary in a final transformation by mapping max length key to values.
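The loop above merges each new set into the first overlapping group only, so a set that bridges two already separate groups would leave them unmerged. A hedged alternative sketch that treats the problem as finding connected components with a small union-find. This is a different technique from the answer above; it assumes every value is a list of strings (as in the question's data), and the name merge_groups and the alphabetical tie-break between equally long names are choices made for illustration:

```python
def merge_groups(results):
    parent = {}  # maps each name to its parent in the union-find forest

    def find(name):
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Every key is linked to each of its similar strings.
    for key, values in results.items():
        find(key)
        for value in values:
            union(key, value)

    # Collect the members of each connected component.
    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), set()).add(name)

    # Longest member becomes the key; ties are broken alphabetically.
    return {
        min(members, key=lambda s: (-len(s), s)): members
        for members in groups.values()
    }


results = {
    "ben": ["benj", "benjamin", "benyamin"],
    "benj": ["ben", "beny", "benjamin"],
    "benjamin": ["benyamin"],
    "benyamin": ["benjamin"],
    "carl": ["karl"],
    "karl": ["carl"],
}
merged = merge_groups(results)
print(merged)
```

The explicit tie-break matters here because 'benjamin' and 'benyamin' have the same length, and max over a set would otherwise pick whichever one happens to be iterated first.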
EDIT : Even though performance was not the question here, I took the liberty to perform some tests between jpp's answer, and mine... here is the full script. My script performs the tests in 17.79 seconds, and his in 23.5 seconds.
import timeit
results = {
'ben' : ['benj', 'benjamin', 'benyamin'],
'benj': ['ben', 'beny', 'benjamin'],
'benjamin': ['benyamin'],
'benyamin': ['benjamin'],
'carl': ['karl'],
'karl': ['carl'],
}
def imcoins(result):
    new_dict = {}
    # .items() for python3x
    for k, v in results.iteritems():
        flag = False
        # Checking if key exists...
        if k not in new_dict.keys():
            # But then, we also need to check its values.
            for item in v:
                if item in new_dict.keys():
                    # If we update, set the flag to True, so we don't create a new value.
                    new_dict[item].update(v)
                    flag = True
            if flag == False:
                new_dict[k] = set(v)
    # Now, to sort our newly created dict...
    sorted_dict = {}
    for k, v in new_dict.iteritems():
        max_string = max(v)
        if len(max_string) > len(k):
            sorted_dict[max(v, key=len)] = set(v)
        else:
            sorted_dict[k] = v
    return sorted_dict
def jpp(result):
    from collections import defaultdict
    res = {i: {k} | (set(v) if isinstance(v, list) else {v}) \
           for i, (k, v) in enumerate(results.items())}
    d = defaultdict(set)
    for i, (k, v) in enumerate(res.items()):
        for m, n in d.items():
            if n & v:
                d[m].update(v)
                break
        else:
            d[i] = v
    result = {max(v, key=len): v for k, v in d.items()}
    return result
iterations = 1000000
time1 = timeit.timeit(stmt='imcoins(results)', setup='from __main__ import imcoins, results', number=iterations)
time2 = timeit.timeit(stmt='jpp(results)', setup='from __main__ import jpp, results', number=iterations)
print time1 # Outputs : 17.7903265883
print time2 # Outputs : 23.5605850732
If I move the import from his function to global scope, it gives...
imcoins : 13.4129249463 seconds
jpp : 21.8191823393 seconds

How to split a list inside a dictionary to create a new one?

I've been struggling with something all day.
I have a dictionary in the format
dict = {a:[element1, element2, element3], b:[element4, element5, element6]...}
I want a new dictionary of the form
newdict = {a:element1, b:element4...}
meaning I keep only the first element of the list stored under each key.
You can use a dictionary comprehension:
{k: v[0] for k, v in d.items()}
# {'a': 'element1', 'b': 'element4'}
Hopefully this helps.
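The comprehension above raises IndexError if any list is empty. A minimal sketch, assuming empty lists should simply be skipped (that policy is an assumption, the question does not say), with placeholder strings standing in for element1 and so on:

```python
d = {
    "a": ["element1", "element2", "element3"],
    "b": ["element4", "element5", "element6"],
    "c": [],  # hypothetical empty entry, added to show the guard
}

# "if values" drops keys whose list is empty instead of failing on values[0].
first_items = {key: values[0] for key, values in d.items() if values}
print(first_items)  # {'a': 'element1', 'b': 'element4'}
```

If a missing first element should map to a default instead, `values[0] if values else None` in the value position is one alternative.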
I like to check whether the dictionary already has a key before overwriting that key's value.
dict = {a:[element1, element2, element3], b:[element4, element5, element6]}
Python 2
newDict = {}
for k, v in dict.iteritems():
    if k not in newDict:
        # add the first list value to the newDict's key
        newDict[k] = v[0]
Python 3
newDict = {}
for k, v in dict.items():
    if k not in newDict:
        # add the first list value to the newDict's key
        newDict[k] = v[0]
