How to store information from a loop function? - python

Could you please help me store the 'name' and 'gender' into a new pandas.DataFrame from the following loop's outcome?
Here's my loop function:
def predict_gender_combined(name_input):
    d_2 = GenderDetector()
    g_2 = d_2.get_gender(name_input)
    g_3 = Genderize().get([name_input])
    print(f'{g_2}\n{g_3}')
    print('---------------')
    return (g_2, g_3)
name_list = ['Anna', 'Maria']
for name in name_list:
    _ = predict_gender_combined(name)
outcome:
Person(title=None, first_name='anna', last_name=None, email=None, gender='f')
[{'name': 'Anna', 'gender': 'female', 'probability': 0.98, 'count': 383713}]
---------------
Person(title=None, first_name='maria', last_name=None, email=None, gender='f')
[{'name': 'Maria', 'gender': 'female', 'probability': 0.98, 'count': 334287}]
---------------
Goal: To create a new pandas.DataFrame, with first column "name" and second column "gender"
name   gender
Anna   f
Maria  f
Attempt:
prediction_list = list()
name_list= ['Anna', 'Maria']
for name in name_list:
    prediction = predict_gender_combined(name)
    prediction_list.append(prediction)

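This is what comprehensions are for. predict_gender_combined returns the tuple (g_2, g_3), and g_3 is a list holding one dictionary, so the name and gender from Genderize live at index [1][0] of the return value:
# The := syntax is an "assignment expression", available in Python 3.8+
result = [{"name": (g := predict_gender_combined(name)[1][0])["name"], "gender": g["gender"]} for name in name_list]
It is, however, a lot. Let's write that out so it's a little easier to read:
result = []
for name in name_list:
    predicted = predict_gender_combined(name)
    genderize_entry = predicted[1][0]  # the single dict returned by Genderize
    result.append({"name": genderize_entry["name"], "gender": genderize_entry["gender"]})
Passing result to pandas.DataFrame then gives one row per name with the columns "name" and "gender".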

I'm going to make some assumptions on what you're hoping to do here:
you're trying to get a list of dictionaries
each dictionary holds a name and holds a count for the number of times the names occurred in name_list.
I'm not sure what the probability key is used for, and I don't know what the g_3 is used for in your defined function, so I'll have to leave that up to you. But given these assumptions, here's what I would recommend:
If you really want a list of dictionaries, that's fine, but it would probably be easier to first make a dictionary of dictionaries and then convert it to a list, e.g.,
{
    "Tim": {'name': 'Tim', 'gender': 'M', 'probability': 0.0, 'count': 4},
    "Sam": {'name': 'Sam', 'gender': 'F', 'probability': 0.0, 'count': 5},
    ...
}
Then, you could use the following code:
name_list = list_of_users
name_dict = {}
for name in name_list:
    test_list = predict_gender_combined(name)
    if name not in name_dict:
        # First time this name appears: create its entry
        name_dict[name] = {'name': name, 'gender': test_list[0], 'probability': 0.0, 'count': 1}
    else:
        # Name seen before: just increase its count
        name_dict[name]['count'] += 1
final_list = list(name_dict.values())
Hope that gets you started.
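To connect this back to the DataFrame goal in the question, here is a minimal sketch of collecting one row per name inside the loop. fake_predict is a hypothetical stand-in for predict_gender_combined (the real function needs the gender libraries and a network call), and it assumes the Person object exposes gender as an attribute, as its printed form suggests.

```python
from types import SimpleNamespace

def fake_predict(name):
    """Hypothetical stand-in that mimics the (g_2, g_3) tuple from the question."""
    person = SimpleNamespace(first_name=name.lower(), gender="f")
    genderize = [{"name": name, "gender": "female"}]
    return person, genderize

name_list = ["Anna", "Maria"]
rows = []
for name in name_list:
    person, genderize = fake_predict(name)
    # One dict per name; the keys become the column names later
    rows.append({"name": genderize[0]["name"], "gender": person.gender})

print(rows)

# With pandas installed, the rows convert directly into the DataFrame:
# import pandas as pd
# df = pd.DataFrame(rows, columns=["name", "gender"])
```

Collecting plain dictionaries first and building the DataFrame once at the end is usually simpler than growing a DataFrame row by row inside the loop.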

Related

Python: Way to build a dictionary with a variable key and append to a list as the value inside a loop

I have a list of dictionaries. I want to loop through this list of dictionary and for each specific name (an attribute inside each dictionary), I want to create a dictionary where the key is the name and the value of this key is a list which dynamically appends to the list in accordance with a specific condition.
For example, I have
d = [{'Name': 'John', 'id': 10},
     {'Name': 'Mark', 'id': 21},
     {'Name': 'Matthew', 'id': 30},
     {'Name': 'Luke', 'id': 11},
     {'Name': 'John', 'id': 20}]
I then built a list with only the names using names=[i['Name'] for i in dic1] so I have a list of names. Notice John will appear twice in this list (at the beginning and end). Then, I want to create a for-loop (for name in names), which creates a dictionary 'ID' that for its value is a list which appends this id field as it goes along.
So in the end I'm looking for this ID dictionary to have:
John: [10,20]
Mark: [21]
Matthew: [30]
Luke: [11]
Notice that John has a list length of two because his name appears twice in the list of dictionaries.
But I can't figure out a way to dynamically append these values to a list inside the for-loop. I tried:
ID={[]} #I also tried with just {}
for name in names:
    ID[names].append([i['id'] for i in dic1 if i['Name'] == name])
Please let me know how one can accomplish this. Thanks.
Don't loop over the list of names and go searching for every one in the list; that's very inefficient, since you're scanning the whole list all over again for every name. Just loop over the original list once and update the ID dict as you go. Also, if you build the ID dict first, then you can get the list of names from it and avoid another list traversal:
names = ID.keys()
The easiest solution for ID itself is a dictionary with a default value of the empty list; that way ID[name].append will work for names that aren't in the dict yet, instead of blowing up with a KeyError.
from collections import defaultdict
ID = defaultdict(list)
for item in d:
    ID[item['Name']].append(item['id'])
You can treat a defaultdict like a normal dict for almost every purpose, but if you need to, you can turn it into a plain dict by calling dict on it:
plain_id = dict(ID)
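As a small illustration of the difference described above, the sketch below appends to a missing key in a plain dict and in a defaultdict. The variable names are only for illustration.

```python
from collections import defaultdict

plain = {}
try:
    plain["John"].append(10)   # the key does not exist yet
except KeyError as exc:
    error = repr(exc)

grouped = defaultdict(list)
grouped["John"].append(10)     # the missing key is first created as []
grouped["John"].append(20)

print(error)
print(dict(grouped))
```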
The Thonnu has a solution using get and list concatenation which works without defaultdict. Here's another take on a no-import solution:
ID = {}
for item in d:
    name, number = item['Name'], item['id']
    if name in ID:
        ID[name].append(number)
    else:
        ID[name] = [number]
Using collections.defaultdict:
from collections import defaultdict
out = defaultdict(list)
for item in dic1:
    out[item['Name']].append(item['id'])
print(dict(out))
Or, without any imports:
out = {}
for item in dic1:
    out[item['Name']] = out.get(item['Name'], []) + [item['id']]
print(out)
Or, with a list comprehension:
out = {}
[out.update({item['Name']: out.get(item['Name'], []) + [item['id']]}) for item in dic1]
print(out)
Output:
{'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
dic1 = [{'Name': 'John', 'id': 10}, {'Name': 'Mark', 'id': 21}, {'Name': 'Matthew', 'id': 30}, {'Name': 'Luke', 'id': 11}, {'Name': 'John', 'id': 20}]
id_dict = {}
for dic in dic1:
    key = dic['Name']
    if key in id_dict:
        id_dict[key].append(dic['id'])
    else:
        id_dict[key] = [dic['id']]
print(id_dict) # {'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
You can use defaultdict to initialize a dictionary with a default value. In this case the default value is an empty list.
from collections import defaultdict
d = defaultdict(list)
for item in dic1:
    d[item['Name']].append(item['id'])
Output (after an optional conversion to a plain dict with dict(d)):
{'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
You can also do it in a simple way:
dic1 = [{'Name': 'John', 'id': 10}, {'Name': 'Mark', 'id': 21}, {'Name': 'Matthew', 'id': 30}, {'Name': 'Luke', 'id': 11}, {'Name': 'John', 'id': 20}]
names = [i['Name'] for i in dic1]
ID = {}
for i, name in enumerate(names):
    if name in ID:
        ID[name].append(dic1[i]['id'])
    else:
        ID[name] = [dic1[i]['id']]
print(ID)
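One more no-import variant, sketched here as an alternative to the if/else pattern: dict.setdefault returns the existing list for a key, or stores and returns a new empty list when the key is missing. The names records and ids_by_name are chosen for this example only.

```python
records = [
    {"Name": "John", "id": 10},
    {"Name": "Mark", "id": 21},
    {"Name": "Matthew", "id": 30},
    {"Name": "Luke", "id": 11},
    {"Name": "John", "id": 20},
]

ids_by_name = {}
for record in records:
    # setdefault inserts [] the first time a name is seen, then returns the list
    ids_by_name.setdefault(record["Name"], []).append(record["id"])

print(ids_by_name)
# {'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
```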

How to count the number of similar values in a dictionary?

Is there a way to count the number of similar values in a dictionary? For example, I have a list of users with their name, gender, age, and so on. If I want to count the number of males in that list, how do I do it?
I know it might sound silly, but I am new to Python and trying to learn more about it.
This is the list if it will help:
user_list = [
{'name': 'Alizom_12',
'gender': 'f',
'age': 34,
'active_day': 170},
{'name': 'Xzt4f',
'gender': None,
'age': None,
'active_day': 1152},
{'name': 'TomZ',
'gender': 'm',
'age': 24,
'active_day': 15},
{'name': 'Zxd975',
'gender': None,
'age': 44,
'active_day': 752},
]
Here are two solutions for your example of counting genders:
n_males = sum(1 for user in user_list if user['gender'] == 'm')
print(n_males)
Or use collections.Counter, which counts every gender value at once:
from collections import Counter
counts = Counter((user['gender'] for user in user_list))
print(counts)
You can create a function like:
def count_gender(lst, gender):
    count = 0
    for dictionary in lst:
        if dictionary["gender"] == gender:
            count += 1
    return count
Then pass it the list and the desired gender:
number = count_gender(user_list, "m")
print(number)
You can use the sum function.
user_list = [
{'name': 'Alizom_12',
'gender': 'f',
'age': 34,
'active_day': 170},
{'name': 'Xzt4f',
'gender': None,
'age': None,
'active_day': 1152},
{'name': 'TomZ',
'gender': 'm',
'age': 24,
'active_day': 15},
{'name': 'Zxd975',
'gender': None,
'age': 44,
'active_day': 752},
]
print(sum(user["gender"] == "m" for user in user_list))
I passed a generator expression to sum. Each comparison yields True or False, and sum counts True as 1, so the result is the number of users whose gender is "m". You can easily wrap this in a function to make it reusable:
from typing import Any, List
def count_by_attribute(my_list: List, attribute: str, value: Any) -> int:
    return sum(o[attribute] == value for o in my_list)
n = len([i["gender"] for i in user_list if i["gender"] == "m"])
Here you're calculating the len of the list built by this syntax, called a list comprehension, which collects the gender of every user in your list whose gender is "m".
That should do the trick.
print(sum(x.get('gender') == 'm' for x in user_list))
The .get method returns the value for the key if the key is in the dictionary, else a default (None unless you pass one). So we just count the 'm' values and get 1 in your example.
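The practical difference between user['gender'] and user.get('gender') only shows up when a record lacks the key entirely. The sketch below uses a shortened, hypothetical user list with one such record to show both behaviours.

```python
users = [
    {"name": "Alizom_12", "gender": "f"},
    {"name": "Xzt4f", "gender": None},
    {"name": "TomZ", "gender": "m"},
    {"name": "NoGenderKey"},  # hypothetical record with no 'gender' key at all
]

# .get returns None for the missing key, so the comparison is simply False
males = sum(user.get("gender") == "m" for user in users)

# Indexing with [] raises KeyError as soon as it reaches the incomplete record
try:
    sum(user["gender"] == "m" for user in users)
    raised = False
except KeyError:
    raised = True

print(males, raised)
```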

Python creating dictionary from list and tuple

When I iterate over a dictionary like so:
dict2 = {
    'Joe': ('Caucasian', 'Male', 35, 7.5),
    'Kevin': ('Black', 'Male', 55, 9.5),
    # More tuples here like the ones above
}
The data is bigger but it doesn't matter here.
What I am trying to accomplish is to create a new dictionary with the information from the tuples. Like so:
dict_i_want = {
    "Name": "Joe",
    "Ethiniticy": "Caucasian",
    "Gender": "Male",
    "Voter_age": 35,
    "Score": 7.5
}
Here is my code:
dict_i_want = {}
for k, v in dict2.items():
    dict_i_want["Name"] = k
    dict_i_want['Ethiniticy'] = v[0]
    dict_i_want['Gender'] = v[1]
    dict_i_want['Voter_age'] = v[2]
    dict_i_want['Score'] = v[3]
But when I do
print(dict_i_want)
{'Name': 'Kevin', 'Ethiniticy': 'Black', 'Gender': 'Male', 'Voter_age': 55, 'Score': 9.5}
The result is just the last tuple that I have in dict2, not all the tuples.
What am I doing wrong, given that I have the loop?
PS: I don't want to use any modules or import anything here, and no built-in functions like zip() either. I want to hard-code the solution.
@ForceBru answered your question: your best bet is a list of dictionaries, unless you want to create a dictionary of dictionaries with unique keys for each sub-dictionary. Going with the list approach, you could do something like this:
Example:
from pprint import pprint
dict2 = {
'Joe': ('Caucasian', 'Male', 35, 7.5),
'Kevin': ('Black', 'Male', 55, 9.5),
}
dicts_i_want = [
    {"name": name, "ethnicity": ethnicity, "gender": gender, "voter_age": voter_age, "score": score}
    for name, (ethnicity, gender, voter_age, score) in dict2.items()
]
pprint(dicts_i_want)
Output:
[{'ethnicity': 'Caucasian',
  'gender': 'Male',
  'name': 'Joe',
  'score': 7.5,
  'voter_age': 35},
 {'ethnicity': 'Black',
  'gender': 'Male',
  'name': 'Kevin',
  'score': 9.5,
  'voter_age': 55}]
Dict keys have to be unique. Every pass through your loop assigns to the same five keys, so you're just overwriting the previous values each time. That's how dicts work.
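Since the question asks for a hand-written loop without zip() or imports, here is a minimal sketch of that version. The key point is that a new dictionary is created inside the loop on every pass and appended to a list, so nothing gets overwritten. The key names follow the question; dicts_i_want and row are names chosen for this sketch.

```python
dict2 = {
    "Joe": ("Caucasian", "Male", 35, 7.5),
    "Kevin": ("Black", "Male", 55, 9.5),
}

dicts_i_want = []
for name in dict2:
    values = dict2[name]
    row = {}  # a fresh dict per person, instead of reusing one dict
    row["Name"] = name
    row["Ethiniticy"] = values[0]
    row["Gender"] = values[1]
    row["Voter_age"] = values[2]
    row["Score"] = values[3]
    dicts_i_want.append(row)

print(dicts_i_want)
```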

Add list of dictionary to nested dictionary without it being in a list anymore

names = ['jan', 'piet', 'joris', 'corneel','jef']
ages = ['one', 'two', 'thee', 'four','five']
namesToDs = [{'name': name} for name in names]
ageToDs = [{'age': age} for age in ages]
concat = [[name, age] for name,age in zip(namesToDs,ageToDs)]
context = {'Team1': {'player1': concat[0] }}
print(context)
This will result in the following nested dictionary.
{'Team1': {'player1': [{'name': 'jan'}, {'age': 'one'}]}}
I want the result to be:
{'Team1': {'player1': {'name': 'jan'}, {'age': 'one'}}}
So without the [] from the list.
I've tried converting it to a dictionary.
I first had it in a tuple using the list and map functions, but that didn't work out.
I'm not very familiar with Python or programming; if I'm heading in the wrong direction, please let me know.
The reason I want it in this nested dictionary is to be able to easily access the data in flask front end.
The result you expected isn't a possible dictionary. The closest possible would be this:
{'Team1': {'player1': {'name': 'jan', 'age': 'one'}}}
which is achieved by replacing [name, age] with {**name, **age}. Full code:
names = ['jan', 'piet', 'joris', 'corneel','jef']
ages = ['one', 'two', 'thee', 'four','five']
namesToDs = [{'name': name} for name in names]
ageToDs = [{'age': age} for age in ages]
concat = [{**name, **age} for name, age in zip(namesToDs, ageToDs)]
context = {'Team1': {'player1': concat[0]}}
print(context)
The closest valid Python object (using only standard types) is probably
>>> from collections import ChainMap
>>> {'Team1': {'player1': dict(ChainMap(*concat[0]))}}
{'Team1': {'player1': {'age': 'one', 'name': 'jan'}}}
... assuming you have no control on the data generating process, i.e. no control on how concat is created.
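If you do control how the data is built, a possible sketch is to skip the intermediate single-key dictionaries and build one merged dictionary per player directly. The player1, player2, ... keys extend the naming from the question; putting every player under Team1 is an assumption made for illustration.

```python
names = ['jan', 'piet', 'joris', 'corneel', 'jef']
ages = ['one', 'two', 'thee', 'four', 'five']

# One merged dict per player, keyed as player1, player2, ...
players = {
    f"player{i}": {"name": name, "age": age}
    for i, (name, age) in enumerate(zip(names, ages), start=1)
}
context = {"Team1": players}

print(context["Team1"]["player1"])  # {'name': 'jan', 'age': 'one'}
```

In a template, each value can then be reached by key, for example context['Team1']['player1']['name'].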

Json data manipulation using python 2.7[Redhat6.7]

I am a bit of a newbie to Python and to manipulating its data structures (dict, list).
So I have following JSON data :
{'Namelist': {'thomas': {'gender': 'male', 'age': '23'}, 'david': {'gender': 'male'}, 'jennie': {'gender': 'female', 'age': '23'}, 'alex': {'gender': 'male'}}, 'selectors': {'naming': 'studentlist', 'code': 16}}
How can I manipulate through the data and get a result like this :
if age == 23 then return thomas and jennie as output and store it in a variable as string.
NOTE: It should iterate through the whole data and search for age. I am using a "for each" loop for this, but it is not working.
Any help is appreciated
It looks like you already have the JSON parsed into an object, so you can just iterate through it and check the person's age.
dictionary = {
    'Namelist': {
        'thomas': {'gender': 'male', 'age': '23'},
        'david': {'gender': 'male'},
        'jennie': {'gender': 'female', 'age': '23'},
        'alex': {'gender': 'male'}},
    'selectors': {'naming': 'studentlist', 'code': 16}}
# For Loop Method
name_list = []
for name, person in dictionary['Namelist'].items():
    if person.get('age') == '23':
        name_list.append(name)
print(', '.join(name_list)) # Would print 'thomas, jennie'
# List Comprehension Method
name_list = [name for name, person in dictionary['Namelist'].items() if person.get('age') == '23']
print(', '.join(name_list))
This is a quick and dirty way that I'd do it. You can get into list comprehensions as well, but I thought this was easier for you to understand as a newbie. It also works in Python 3, since I use parentheses with print().
variable = {'Namelist': {'thomas': {'gender': 'male', 'age': '23'},
    'david': {'gender': 'male'}, 'jennie': {'gender': 'female', 'age':
    '23'}, 'alex': {'gender': 'male'}}, 'selectors': {'naming':
    'studentlist', 'code': 16}}
response = list() #Create a list to use to store the iterations.
for key, value in variable.items(): #Loop through the main dictionary
    if key == 'Namelist': #Filter by the Namelist
        for theName, subValue in value.items(): #Loop through the dictionaries made for each name.
            if 'age' in subValue and subValue['age'] == '23': #the age key wasn't in every dictionary so I check if it exists, then I check if it is set to 23.
                response.append(theName + ' is 23 ') #add it to the response list.
nameString = ''.join(response) #turn the list into a string.
print (nameString) #print it
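If the data arrives as actual JSON text rather than an already-built dictionary, it has to be parsed first. The sketch below assumes a raw JSON string with the same content as the question (JSON itself requires double quotes) and wraps the filter in a hypothetical helper, names_with_age. It is written for Python 3.7+, where dictionaries keep insertion order; on Python 2.7 the order of the names is not guaranteed.

```python
import json

# Assumed raw JSON text with the same content as the question's data
raw = (
    '{"Namelist": {"thomas": {"gender": "male", "age": "23"}, '
    '"david": {"gender": "male"}, '
    '"jennie": {"gender": "female", "age": "23"}, '
    '"alex": {"gender": "male"}}, '
    '"selectors": {"naming": "studentlist", "code": 16}}'
)
data = json.loads(raw)

def names_with_age(namelist, age):
    """Return the names whose 'age' entry equals age; people without an age are skipped."""
    return [name for name, person in namelist.items() if person.get("age") == age]

result = ", ".join(names_with_age(data["Namelist"], "23"))
print(result)  # thomas, jennie
```

Note that the ages in this data are strings, so the comparison uses "23" rather than the integer 23.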
