Python creating dictionary from list and tuple

When I iterate over a dictionary like so:
dict2 = {
    'Joe': ('Caucasian', 'Male', 35, 7.5),
    'Kevin': ('Black', 'Male', 55, 9.5),
    # more tuples here like the ones above
}
The data is bigger but it doesn't matter here.
What I am trying to accomplish is to create a new dictionary with the information from the tuples. Like so:
dict_i_want = {
    "Name": "Joe",
    "Ethiniticy": "Caucasian",
    "Gender": "Male",
    "Voter_age": 35,
    "Score": 7.5
}
Here is my code:
dict_i_want = {}
for k, v in dict2.items():
    dict_i_want["Name"] = k
    dict_i_want['Ethiniticy'] = v[0]
    dict_i_want['Gender'] = v[1]
    dict_i_want['Voter_age'] = v[2]
    dict_i_want['Score'] = v[3]
But when I do
print(dict_i_want)
{'Name': 'Kevin', 'Ethiniticy': 'Black', 'Gender': 'Male', 'Voter_age': 55, 'Score': 9.5}
The result is just the last tuple that I have in my dict2, not all the tuples.
What am I doing wrong, given that I have the loop?
PS: I don't want to use any modules or import anything here. No built-in function like zip() or something like that. I want to hard code the solution

@ForceBru answered your question: your best bet is a list of dictionaries, unless you want to create a dictionary of dictionaries with a unique key for each sub-dictionary. Going with the list approach, you could do something like this:
Example:
from pprint import pprint
dict2 = {
'Joe': ('Caucasian', 'Male', 35, 7.5),
'Kevin': ('Black', 'Male', 55, 9.5),
}
dicts_i_want = [
    {"name": name, "ethnicity": ethnicity, "gender": gender, "voter_age": voter_age, "score": score}
    for name, (ethnicity, gender, voter_age, score) in dict2.items()
]
pprint(dicts_i_want)
Output:
[{'ethnicity': 'Caucasian',
  'gender': 'Male',
  'name': 'Joe',
  'score': 7.5,
  'voter_age': 35},
 {'ethnicity': 'Black',
  'gender': 'Male',
  'name': 'Kevin',
  'score': 9.5,
  'voter_age': 55}]

Dict keys have to be unique. Your loop writes to the same five keys of the same dict on every iteration, so each pass overwrites the previous one and only the last tuple survives. That's just how dicts work.
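Since the question asks for a solution without imports, zip(), or similar helpers, here is a minimal sketch of the same idea as a plain loop, assuming a list of dictionaries is an acceptable result. A fresh dict is created inside the loop and appended to a list. The variable name `person` is made up for illustration, and the key is spelled "Ethnicity" here rather than as in the question.

```python
dict2 = {
    'Joe': ('Caucasian', 'Male', 35, 7.5),
    'Kevin': ('Black', 'Male', 55, 9.5),
}

dicts_i_want = []
for name, values in dict2.items():
    # A new dict is created on every iteration, so nothing gets overwritten.
    person = {}
    person['Name'] = name
    person['Ethnicity'] = values[0]
    person['Gender'] = values[1]
    person['Voter_age'] = values[2]
    person['Score'] = values[3]
    dicts_i_want.append(person)

print(dicts_i_want)
```

If you would rather look people up by name, the same loop can store each `person` under `name` in an outer dict instead of appending it to a list.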

Related

Merging two list of dicts with different keys effectively

I've got two lists:
lst1 = [{"name": "Hanna", "age":3},
{"name": "Kris", "age": 18},
{"name":"Dom", "age": 15},
{"name":"Tom", "age": 5}]
and the second one contains a few of above key name values under different key:
lst2 = [{"username": "Kris", "Town": "Big City"},
{"username":"Dom", "Town": "NYC"}]
I would like to merge them with result:
lst = [{"name": "Hanna", "age":3},
{"name": "Kris", "age": 18, "Town": "Big City"},
{"name":"Dom", "age": 15, "Town": "NYC"},
{"name":"Tom", "age": 5}]
The easiest way is to go one by one (for each element from lst1, check whether it exists in lst2), but for big lists this is quite inefficient (my lists have a few hundred elements each). What is the most efficient way to achieve this?
To avoid iterating over another list again and again, you can build a name index first.
lst1 = [{"name": "Hanna", "age":3},
{"name": "Kris", "age": 18},
{"name":"Dom", "age": 15},
{"name":"Tom", "age": 5}]
lst2 = [{"username": "Kris", "Town": "Big City"},
{"username":"Dom", "Town": "NYC"}]
name_index = {dic['username']: idx for idx, dic in enumerate(lst2) if dic.get('username')}
for dic in lst1:
    name = dic.get('name')
    if name in name_index:
        dic.update(lst2[name_index[name]])  # update in-place to further save time
        dic.pop('username')
print(lst1)
A much more efficient way than searching lists is to build an intermediate dictionary from lst1 with name as the key, so that you're searching a dictionary, not a list.
d1 = {elem['name']: {k: v for k, v in elem.items()} for elem in lst1}
for elem in lst2:
    d1[elem['username']].update({k: v for k, v in elem.items() if k != 'username'})
lst = list(d1.values())
Output:
[{'name': 'Hanna', 'age': 3}, {'name': 'Kris', 'age': 18, 'Town': 'Big City'}, {'name': 'Dom', 'age': 15, 'Town': 'NYC'}, {'name': 'Tom', 'age': 5}]
Use the zip function to pair both lists. Both lists have to be sorted by the same criterion first: name for lst1 and username for lst2, because those values are the condition for the update. That is why sorted is used with the key parameter; without sorting, matching entries would not line up.
Finally, lst2 needs a little extra work: I expanded it to account for the length of lst1, using lst2 * abs(len(lst1) - len(lst2)). In theory you iterate only once over the zip object, so I consider this a reasonable solution for your requirements. Note, however, that the pairing depends on matching entries landing at the same sorted position, which holds for this sample data but is not guaranteed in general.
for d1, d2 in zip(sorted(lst1, key=lambda d1: d1['name']),
                  sorted(lst2 * abs(len(lst1) - len(lst2)), key=lambda d2: d2['username'])):
    if d1['name'] == d2['username']:
        d1.update(d2)
        # Delete the now-redundant username key
        del d1['username']
print(lst1)
Output:
[{'name': 'Hanna', 'age': 3}, {'name': 'Kris', 'age': 18, 'Town': 'Big City'}, {'name': 'Dom', 'age': 15, 'Town': 'NYC'}, {'name': 'Tom', 'age': 5}]
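For comparison, here is a small sketch of the index-based idea from the earlier answers, written so that it leaves the input lists untouched and tolerates usernames that have no counterpart in lst1. The function name `merge_by_name` and the extra "Zed" record are made up for illustration.

```python
def merge_by_name(people, extras):
    """Return new dicts: each person merged with the extra record whose
    'username' equals the person's 'name', if there is one."""
    # Index the extra records by username once, so each lookup is a dict lookup.
    extras_by_name = {
        extra['username']: {k: v for k, v in extra.items() if k != 'username'}
        for extra in extras
    }
    merged = []
    for person in people:
        combined = dict(person)  # shallow copy; the input dict is not modified
        combined.update(extras_by_name.get(person['name'], {}))
        merged.append(combined)
    return merged


lst1 = [{"name": "Hanna", "age": 3},
        {"name": "Kris", "age": 18},
        {"name": "Dom", "age": 15},
        {"name": "Tom", "age": 5}]
lst2 = [{"username": "Kris", "Town": "Big City"},
        {"username": "Dom", "Town": "NYC"},
        {"username": "Zed", "Town": "Nowhere"}]  # hypothetical record with no match in lst1

lst = merge_by_name(lst1, lst2)
print(lst)
```

Building the index is one pass over lst2 and the merge is one pass over lst1, so the work grows with the sum of the list lengths rather than their product.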

Python - Create list of dictionaries from multiple lists of values

I have multiple lists of data, for example: age, name, gender, etc. All of them in order, meaning that the x record of every list belongs to the same person.
What I'm trying to create is a list of dictionaries from these lists in the best pythonic way. I was able to create it using one of the lists, but not sure how to scale it from there.
What I currently have:
ages = [20, 21, 30]
names = ["Jhon", "Daniel", "Rob"]
list_of_dicts = [{"age": value} for value in ages]
It returns:
[{'age': 20}, {'age': 21}, {'age': 30}]
What I want:
[{'age': 20, 'name': 'Jhon'}, {'age': 21, 'name': 'Daniel'}, {'age': 30, 'name': 'Rob'}]
You need to zip:
ages = [20, 21, 30]
names = ["Jhon", "Daniel", "Rob"]
list_of_dicts = [{"age": value, 'name': name}
                 for value, name in zip(ages, names)]
You can take this one step further and use a double zip (useful if you have many more keys):
keys = ['age', 'name']
lists = [ages, names]
list_of_dicts = [dict(zip(keys, x)) for x in zip(*lists)]
output:
[{'age': 20, 'name': 'Jhon'},
 {'age': 21, 'name': 'Daniel'},
 {'age': 30, 'name': 'Rob'}]
This is less obvious code than @mozway's, but IMHO it has one advantage: it relies on a single mapping dictionary, so if you need to add or remove keys you only have to change one k:v pair.
ages = [20, 21, 30]
names = ["Jhon", "Daniel", "Rob"]
d = {
    "name": names,
    "age": ages
}
list_of_dicts = [dict(zip(d,t)) for t in zip(*d.values())]
print(list_of_dicts)
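One thing to keep in mind with all the zip-based versions: zip() stops at the shortest input, so a list that is one element short silently drops a record. Here is a rough sketch of the mapping-dictionary approach with a length check added; the function name `records_from_columns` is made up for illustration.

```python
def records_from_columns(columns):
    """Turn {'key': [v0, v1, ...], ...} into [{'key': v0, ...}, {'key': v1, ...}, ...]."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        # zip() would silently truncate to the shortest list, so fail loudly instead.
        raise ValueError(f"columns have different lengths: {sorted(lengths)}")
    # zip(*values) yields one tuple per record; zip(columns, row) pairs it with the keys.
    return [dict(zip(columns, row)) for row in zip(*columns.values())]


columns = {"age": [20, 21, 30], "name": ["Jhon", "Daniel", "Rob"]}
list_of_dicts = records_from_columns(columns)
print(list_of_dicts)
```

Adding another field is then just one more entry in `columns`.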

How to ignore a single/multiple keys of all the dictionaries while looping over a list of dictionaries?

I am looping over a list of dictionaries and have to drop/ignore one or more keys of each dictionary in the list before writing it to MongoDB. What is an efficient, pythonic way of doing this?
Example:
employees = [
{'name': "Tom", 'age': 10, 'salary': 10000, 'floor': 10},
{'name': "Mark", 'age': 5, 'salary': 12000, 'floor': 11},
{'name': "Pam", 'age': 7, 'salary': 9500, 'floor': 9}
]
Let's say I want to drop key = 'floor' or keys = ['floor', 'salary'].
Currently I am using del d['floor'] inside the loop to delete the key and my_collection.insert_one() to simply write the dictionary into my MongoDB.
My code:
for d in employees:
    del d['floor']
    my_collection.insert_one(d)
The solution you proposed is the most efficient one to use, since you have no control over what happens inside the insert_one method.
If you have more keys, just loop over them:
ignored_keys = ['floor', 'salary']
for d in employees:
    for k in ignored_keys:
        del d[k]
    my_collection.insert_one(d)
Let's say you want to drop keys = ['floor', 'salary']. You can try:
exclude_keys = ['salary', 'floor']
for d in employees:
    # build a filtered copy; the original dict keeps all of its keys
    my_collection.insert_one({k: v for k, v in d.items() if k not in exclude_keys})
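The two answers differ in one respect worth noting: del changes the dictionaries in employees and raises a KeyError if a key is missing, while the comprehension builds a filtered copy. Here is a small sketch of the copying variant as a reusable helper, with the database call left out so that it runs on its own; the name `without_keys` is made up for illustration.

```python
def without_keys(record, excluded):
    """Return a copy of record without the excluded keys; missing keys are ignored."""
    excluded = set(excluded)  # set membership tests stay cheap for long key lists
    return {k: v for k, v in record.items() if k not in excluded}


employees = [
    {'name': "Tom", 'age': 10, 'salary': 10000, 'floor': 10},
    {'name': "Mark", 'age': 5, 'salary': 12000, 'floor': 11},
    {'name': "Pam", 'age': 7, 'salary': 9500, 'floor': 9},
]

documents = [without_keys(e, ['floor', 'salary']) for e in employees]
print(documents)
```

Each item of `documents` could then be passed to my_collection.insert_one() as in the question. If one round trip per document turns out to be the bottleneck, PyMongo's insert_many() accepts a list of documents, though whether that helps depends on your setup.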

How to store information from a loop function?

Could you please help me store the 'name' and 'gender' into a new pandas.DataFrame from the following loop's outcome?
Here's my loop function:
def predict_gender_combined(name_input):
    d_2 = GenderDetector()
    g_2 = d_2.get_gender(name_input)
    g_3 = Genderize().get([name_input])
    print(f'{g_2}\n{g_3}')
    print('---------------')
    return g_2, g_3
name_list = ['Anna', 'Maria']
for name in name_list:
    _ = predict_gender_combined(name)
outcome:
Person(title=None, first_name='anna', last_name=None, email=None, gender='f')
[{'name': 'Anna', 'gender': 'female', 'probability': 0.98, 'count': 383713}]
---------------
Person(title=None, first_name='maria', last_name=None, email=None, gender='f')
[{'name': 'Maria', 'gender': 'female', 'probability': 0.98, 'count': 334287}]
---------------
Goal: To create a new pandas.DataFrame, with first column "name" and second column "gender"
name   gender
Anna   f
Maria  f
Attempt:
prediction_list = list()
name_list = ['Anna', 'Maria']
for name in name_list:
    prediction = predict_gender_combined(name)
    prediction_list.append(prediction)
This is what comprehensions are for. Note that predict_gender_combined returns a tuple (g_2, g_3), and the dictionary with the name and gender keys is the first element of g_3:
# This := syntax is an "assignment expression" that is available in Python 3.8+
result = [{"name": (predicted := predict_gender_combined(name)[1][0])["name"], "gender": predicted["gender"]} for name in name_list]
It is, however, a lot. Let's write that out so it's a little easier to read:
result = []
for name in name_list:
    _, g_3 = predict_gender_combined(name)
    predicted = g_3[0]
    # append a new dict per name; assigning into one shared dict would keep only the last name
    result.append({"name": predicted["name"], "gender": predicted["gender"]})
I'm going to make some assumptions on what you're hoping to do here:
you're trying to get a list of dictionaries
each dictionary holds a name and holds a count for the number of times the names occurred in name_list.
I'm not sure what the probability key is used for, and I don't know what the g_3 is used for in your defined function, so I'll have to leave that up to you. But given these assumptions, here's what I would recommend:
If you really want a list of dictionaries, that's fine, but it would probably be easier to first make a dictionary of dictionaries and then convert it to a list, e.g.,
{
    "Tim": {'name': 'Tim', 'gender': 'M', 'probability': 0.0, 'count': 4},
    "Sam": {'name': 'Sam', 'gender': 'F', 'probability': 0.0, 'count': 5},
    ...
}
Then, you could use the following code:
name_list = list_of_users
name_dict = {}
for name in name_list:
    test_list = predict_gender_combined(name)
    if name not in name_dict:
        # first occurrence of this name: create its entry
        name_dict[name] = {'name': name, 'gender': test_list[0], 'probability': 0.0, 'count': 1}
    else:
        # name seen before: just bump the counter
        name_dict[name]['count'] += 1
final_list = list(name_dict.values())
Hope that gets you started.
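To get from the loop to the table in the question, one option is to collect one small dict per name and hand the list to pandas afterwards. The sketch below uses a hypothetical offline stand-in, `fake_predict_gender_combined`, that only mimics the shape of the printed results (an object with a gender attribute, plus a list holding one dict), so it runs without the real libraries. Whether the real result object exposes `.gender` is an assumption based on the printed output.

```python
from types import SimpleNamespace


def fake_predict_gender_combined(name_input):
    """Hypothetical stand-in that mimics the shape of the results printed in the question."""
    g_2 = SimpleNamespace(first_name=name_input.lower(), gender='f')  # assumed attribute names
    g_3 = [{'name': name_input, 'gender': 'female'}]
    return g_2, g_3


name_list = ['Anna', 'Maria']
rows = []
for name in name_list:
    g_2, g_3 = fake_predict_gender_combined(name)
    # One dict per name: the name as returned in g_3, the short gender code from g_2.
    rows.append({'name': g_3[0]['name'], 'gender': g_2.gender})

print(rows)
```

With pandas installed, pandas.DataFrame(rows) should then give a frame with the columns name and gender, one row per entry in name_list.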

Check the values in complex dict of dicts with another dict of dicts and save it a third dictionary

The inputs are two dictionaries, dict1 and dict2, each mapping a company to a list of dictionaries.
dict1 = {'company1':[{'age':27,'weight':200,'name':'john'},{'age':23,'weight':180,'name':'peter'}],
         'company2':[{'age':30,'weight':190,'name':'sam'},{'age':32,'weight':210,'name':'clove'},{'age':21,'weight':170,'name':'steve'}],
         'company3':[{'age':36,'weight':175,'name':'shaun'},{'age':40,'weight':205,'name':'dany'},{'age':25,'weight':160,'name':'mark'}],
         'company4':[{'age':36,'weight':155,'name':'lina'},{'age':40,'weight':215,'name':'sammy'},{'age':25,'weight':190,'name':'matt'}]
}
dict2 = {'company2':[{'age':30},{'age':45},{'age':52}],
         'company4':[{'age':43},{'age':67},{'age':22},{'age':34},{'age':42}]
}
I am trying to write logic that checks, for each company key in dict2, whether any inner 'age' value also occurs among the 'age' values under the same company key in dict1. If at least one age matches, the company's entry from dict1 should be saved to a third dictionary. Please see the example below.
Example:
company2:[{'age':30}]
matches with
company2:[{'age':30,'weight':190,'name':'sam'}, ...]
I also want to save to dict3 the key:value pairs of dict1 whose keys do not appear in dict2. As the example below shows, the company1 key does not appear in dict2.
Example:
company1:[{'age':27,'weight':200,'name':'john'},{'age':23,'weight':180,'name':'peter'}]
and
company3:[{'age':36,'weight':175,'name':'shaun'},{'age':40,'weight':205,'name':'dany'},{'age':25,'weight':160,'name':'mark'}]
Expected Output:
dict3 = {'company1':[{'age':27,'weight':200,'name':'john'},{'age':23,'weight':180,'name':'peter'}],
         'company2':[{'age':30,'weight':190,'name':'sam'},{'age':32,'weight':210,'name':'clove'},{'age':21,'weight':170,'name':'steve'}],
         'company3':[{'age':36,'weight':175,'name':'shaun'},{'age':40,'weight':205,'name':'dany'},{'age':25,'weight':160,'name':'mark'}]}
pardon my explanation!
There may be a more succinct way to do this, but the following accomplishes the desired result.
from pprint import pprint
dict3 = dict()
dict1 = {'company1':[{'age':27,'weight':200,'name':'john'},{'age':23,'weight':180,'name':'peter'}],
'company2':[{'age':30,'weight':190,'name':'sam'},{'age':32,'weight':210,'name':'clove'},{'age':21,'weight':170,'name':'steve'}],
'company3':[{'age':36,'weight':175,'name':'shaun'},{'age':40,'weight':205,'name':'dany'},{'age':25,'weight':160,'name':'mark'}],
'company4':[{'age':36,'weight':155,'name':'lina'},{'age':40,'weight':215,'name':'sammy'},{'age':25,'weight':190,'name':'matt'}]
}
dict2 = {'company2':[{'age':30},{'age':45},{'age':52}],
'company4':[{'age':43},{'age':67},{'age':22},{'age':34},{'age':42}]
}
for company, array in dict1.items():
    if company not in dict2:
        dict3[company] = array
    else:
        # all the ages for this company in dict1
        ages = set(map(lambda x: x['age'], array))
        for dictref in dict2[company]:
            if dictref['age'] in ages:
                dict3[company] = array
                break
pprint(dict3)
Output was
{'company1': [{'age': 27, 'name': 'john', 'weight': 200},
              {'age': 23, 'name': 'peter', 'weight': 180}],
 'company2': [{'age': 30, 'name': 'sam', 'weight': 190},
              {'age': 32, 'name': 'clove', 'weight': 210},
              {'age': 21, 'name': 'steve', 'weight': 170}],
 'company3': [{'age': 36, 'name': 'shaun', 'weight': 175},
              {'age': 40, 'name': 'dany', 'weight': 205},
              {'age': 25, 'name': 'mark', 'weight': 160}]}
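As a possible shorter variant of the same logic, the age check can be expressed with a set and any(). This is only a sketch on a trimmed-down copy of the data; the function name `select_companies` is made up for illustration.

```python
def select_companies(dict1, dict2):
    """Keep a company from dict1 if it is absent from dict2,
    or if at least one of its ages also appears for that company in dict2."""
    dict3 = {}
    for company, people in dict1.items():
        if company in dict2:
            wanted_ages = {entry['age'] for entry in dict2[company]}
            if not any(person['age'] in wanted_ages for person in people):
                continue  # company is in dict2 but no age matches: leave it out
        dict3[company] = people
    return dict3


dict1 = {
    'company1': [{'age': 27, 'name': 'john'}, {'age': 23, 'name': 'peter'}],
    'company2': [{'age': 30, 'name': 'sam'}, {'age': 32, 'name': 'clove'}],
    'company4': [{'age': 36, 'name': 'lina'}, {'age': 40, 'name': 'sammy'}],
}
dict2 = {
    'company2': [{'age': 30}, {'age': 45}],
    'company4': [{'age': 43}, {'age': 67}],
}

dict3 = select_companies(dict1, dict2)
print(sorted(dict3))
```

As in the answer above, the lists in dict3 are the same list objects as in dict1, not copies, so changing one changes the other.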
