Is there a way to count the number of matching values across a list of dictionaries? For example, I have a list of users with their name, gender, age, and so on. If I want to count the number of males in that list, how do I do it?
I know it might sound silly, but I am new to Python and I am trying to learn more about it.
This is the list, in case it helps:
user_list = [
    {'name': 'Alizom_12',
     'gender': 'f',
     'age': 34,
     'active_day': 170},
    {'name': 'Xzt4f',
     'gender': None,
     'age': None,
     'active_day': 1152},
    {'name': 'TomZ',
     'gender': 'm',
     'age': 24,
     'active_day': 15},
    {'name': 'Zxd975',
     'gender': None,
     'age': 44,
     'active_day': 752},
]
Here are two solutions for your example of counting genders:
n_males = sum(1 for user in user_list if user['gender'] == 'm')
print(n_males)
Or use collections.Counter, which counts every gender value in one pass:
from collections import Counter
counts = Counter((user['gender'] for user in user_list))
print(counts)
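As a small follow-up sketch (the shortened user_list below is an illustrative copy, not the full data): a Counter behaves like a dictionary, so individual counts can be read by key, and a key that never occurred returns 0 instead of raising an error.

```python
from collections import Counter

# Shortened, illustrative copy of the question's data.
user_list = [
    {'name': 'Alizom_12', 'gender': 'f'},
    {'name': 'Xzt4f', 'gender': None},
    {'name': 'TomZ', 'gender': 'm'},
    {'name': 'Zxd975', 'gender': None},
]

counts = Counter(user['gender'] for user in user_list)

n_males = counts['m']      # number of users with gender 'm'
n_unknown = counts[None]   # None is counted like any other value
n_other = counts['x']      # a key that never occurred gives 0, not KeyError
print(n_males, n_unknown, n_other)
```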
You can create a function like this:
def count_gender(lst, gender):
    count = 0
    for dictionary in lst:
        if dictionary["gender"] == gender:
            count += 1
    return count
Then pass it the list and the desired gender:
number = count_gender(user_list, "m")
print(number)
You can use the sum function.
user_list = [
    {'name': 'Alizom_12',
     'gender': 'f',
     'age': 34,
     'active_day': 170},
    {'name': 'Xzt4f',
     'gender': None,
     'age': None,
     'active_day': 1152},
    {'name': 'TomZ',
     'gender': 'm',
     'age': 24,
     'active_day': 15},
    {'name': 'Zxd975',
     'gender': None,
     'age': 44,
     'active_day': 752},
]
print(sum(user["gender"] == "m" for user in user_list))
I passed a generator expression to sum(). Each comparison evaluates to True or False, and True counts as 1, so the sum is the number of users whose gender is "m". You can easily wrap this in a function to make the operation reusable.
from typing import Any, List

def count_by_attribute(my_list: List, attribute: str, value: Any) -> int:
    return sum(o[attribute] == value for o in my_list)
n = len([i['gender'] for i in user_list if i['gender'] == "m"])
Here you're calculating the len of the list generated by this syntax, called a list comprehension, which collects the gender of every dictionary in your list where the gender is "m".
That should do the trick.
print(sum(x.get('gender') == 'm' for x in user_list))
The .get method returns the value for the key if the key is in the dictionary, else a default (None unless you pass one). So we just count the 'm' values and get 1 in your example case.
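To make the difference concrete, here is a minimal sketch. The second record, which has no 'gender' key at all, is a hypothetical addition and is not in the question's data.

```python
users = [
    {'name': 'TomZ', 'gender': 'm'},
    {'name': 'NoGenderKey'},  # hypothetical record without a 'gender' key
]

# .get returns None for the missing key, so the comparison is simply False.
n_males = sum(u.get('gender') == 'm' for u in users)
print(n_males)

# Indexing with [] raises KeyError for the same record.
try:
    sum(u['gender'] == 'm' for u in users)
    raised = False
except KeyError:
    raised = True
print(raised)
```

In the question's list every user has a 'gender' key (sometimes set to None), so both spellings give the same count there.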
I am new to Python and I am trying to count the number of males and females in a list. That worked, but I do not know how to compute the average of all the ages in the list.
user_list = [
    {'name': 'Alizom_12',
     'gender': 'f',
     'age': 34,
     'active_day': 170},
    {'name': 'Xzt4f',
     'gender': None,
     'age': None,
     'active_day': 1152},
    {'name': 'TomZ',
     'gender': 'm',
     'age': 24,
     'active_day': 15},
    {'name': 'Zxd975',
     'gender': None,
     'age': 44,
     'active_day': 752},
]

def user_stat():
    from collections import Counter
    counts = Counter((user['gender'] for user in user_list))
    print(counts)

user_stat()
def user_stat():
    # Here we get rid of the None elements
    user_list_filtered = list(filter(lambda x: isinstance(x['age'], int), user_list))
    # Here we sum the ages and divide them by the amount of "not-None" elements
    print(sum(i.get('age', 0) for i in user_list_filtered) / len(user_list_filtered))
    # If you want to divide by None-elements included
    print(sum(i.get('age', 0) for i in user_list_filtered) / len(user_list))

user_stat()
Not entirely sure if this is what you are looking for, but maybe try with two dicts:
cont = {'m': 0, 'f': 0}
avgs = {'m': 0, 'f': 0}
for u in user_list:
    gdr = u['gender']
    if not gdr:
        continue
    cont[gdr] += 1
    avgs[gdr] += u['age']
for g in avgs:
    avgs[g] /= cont[g]
print(avgs)  # {'m': 24.0, 'f': 34.0}
You can try something like this. Since the age can also be None, use the dict's .get method to skip those users.
total_age = 0
n_ages = 0
for user in user_list:
    if user.get('age'):
        total_age += user['age']
        n_ages += 1
avg_age = total_age / n_ages  # divide only by the users that have an age
Hope it helps!!
If you have trouble understanding lambda, this can be another option.
user_list = [
    {'name': 'Alizom_12',
     'gender': 'f',
     'age': 34,
     'active_day': 170},
    {'name': 'Xzt4f',
     'gender': None,
     'age': None,
     'active_day': 1152},
    {'name': 'TomZ',
     'gender': 'm',
     'age': 24,
     'active_day': 15},
    {'name': 'Zxd975',
     'gender': None,
     'age': 44,
     'active_day': 752},
]
ages = []

def user_stat():
    for age in user_list:
        if isinstance(age["age"], int):
            ages.append(age["age"])  # If the age is an integer, add it to a list.
    average = sum(ages) / len(ages)  # Creates average by dividing the sum with the length of the list
    print(f"The age average is {average}")

user_stat()
You can try something like this, which averages the ages for one gender:
ages = [user['age'] for user in user_list if user['gender'] == 'f']
avg = sum(ages) / len(ages)
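For the average over all users, one possible sketch uses statistics.mean from the standard library and skips the None ages first. The guard for an empty list is my own addition, since dividing by zero or calling mean on an empty list would raise an error.

```python
from statistics import mean

user_list = [
    {'name': 'Alizom_12', 'gender': 'f', 'age': 34, 'active_day': 170},
    {'name': 'Xzt4f', 'gender': None, 'age': None, 'active_day': 1152},
    {'name': 'TomZ', 'gender': 'm', 'age': 24, 'active_day': 15},
    {'name': 'Zxd975', 'gender': None, 'age': 44, 'active_day': 752},
]

# Keep only the ages that are actually set.
ages = [user['age'] for user in user_list if user['age'] is not None]

# mean() raises StatisticsError on an empty list, so guard against that case.
average_age = mean(ages) if ages else None
print(average_age)
```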
Could you please help me store the 'name' and 'gender' into a new pandas.DataFrame from the following loop's outcome?
Here's my loop function:
def predict_gender_combined(name_input):
    d_2 = GenderDetector()
    g_2 = d_2.get_gender(name_input)
    g_3 = Genderize().get([name_input])
    print(f'{g_2}\n{g_3}')
    print('---------------')
    return (g_2, g_3)

name_list = ['Anna', 'Maria']
for name in name_list:
    _ = predict_gender_combined(name)
outcome:
Person(title=None, first_name='anna', last_name=None, email=None, gender='f')
[{'name': 'Anna', 'gender': 'female', 'probability': 0.98, 'count': 383713}]
---------------
Person(title=None, first_name='maria', last_name=None, email=None, gender='f')
[{'name': 'Maria', 'gender': 'female', 'probability': 0.98, 'count': 334287}]
---------------
Goal: To create a new pandas.DataFrame, with first column "name" and second column "gender"
name   gender
Anna   f
Maria  f
Attempt:
prediction_list = list()
name_list = ['Anna', 'Maria']
for name in name_list:
    prediction = predict_gender_combined(name)
    prediction_list.append(prediction)
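This is what comprehensions are for. Since predict_gender_combined returns the tuple (g_2, g_3), and g_3 is a list holding one dictionary (see your printed outcome), the name and gender are at predicted[1][0]:
# This := syntax is an "assignment expression" that is available in Python 3.8+
result = [{"name": predicted[1][0]["name"], "gender": predicted[1][0]["gender"]} for name in name_list if (predicted := predict_gender_combined(name))]
It is, however, a lot. Let's write that out so it's a little easier to read:
result = []
for name in name_list:
    predicted = predict_gender_combined(name)
    result.append({"name": predicted[1][0]["name"], "gender": predicted[1][0]["gender"]})
The resulting list of dictionaries can be passed straight to pandas.DataFrame(result).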
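A minimal sketch of collecting one row per name. It assumes a stand-in function, fake_predict, which is hypothetical: it replaces predict_gender_combined so the example runs without the gender libraries or network access, and it only mimics the (g_2, g_3) shape visible in the question's printed outcome.

```python
def fake_predict(name):
    # Hypothetical stand-in for predict_gender_combined: g_2 is simplified
    # to a plain string and g_3 is a list holding one dictionary.
    g_2 = 'f'
    g_3 = [{'name': name, 'gender': 'female'}]
    return g_2, g_3

name_list = ['Anna', 'Maria']

rows = []
for name in name_list:
    g_2, g_3 = fake_predict(name)
    # Keep only the two fields wanted as columns.
    rows.append({'name': g_3[0]['name'], 'gender': g_3[0]['gender']})

print(rows)
```

A list of dictionaries like rows can be handed to pandas.DataFrame(rows), which uses the dictionary keys as column names.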
I'm going to make some assumptions on what you're hoping to do here:
you're trying to get a list of dictionaries
each dictionary holds a name and holds a count for the number of times the names occurred in name_list.
I'm not sure what the probability key is used for, and I don't know what the g_3 is used for in your defined function, so I'll have to leave that up to you. But given these assumptions, here's what I would recommend:
If you really want a list of dictionaries, that's fine, but it would probably be easier to first make a dictionary of dictionaries and then convert it to a list, e.g.,
{
"Tim": {'name': 'Tim', 'gender': 'M', 'probability': 0.0, 'count': 4}, "Sam": {'name': 'Sam', 'gender': 'F', 'probability': 0.0, 'count': 5},
...
}
Then, you could use the following code:
name_list = list_of_users
name_dict = {}
for name in name_list:
    test_list = predict_gender_combined(name)
    if name not in name_dict:
        # First time we see this name: create its entry.
        name_dict[name] = {'name': name, 'gender': test_list[0], 'probability': 0.0, 'count': 1}
    else:
        # Name seen before: just bump its count.
        name_dict[name]['count'] += 1
final_list = list(name_dict.values())
Hope that gets you started.
name = []
age = []
address = []
...
for line in pg:
    for key, value in line.items():
        if key == 'name':
            name.append(value)
        elif key == 'age':
            age.append(value)
        elif key == 'address':
            address.append(value)
.
.
.
Is it possible to use a list comprehension for the code above? I need to separate lots of values from the dicts, and I will use the lists to write to a text file.
Source Data:
a = [{'name': 'paul', 'age': '26.', 'address': 'AU', 'gender': 'male'},
{'name': 'mei', 'age': '26.', 'address': 'NY', 'gender': 'female'},
{'name': 'smith', 'age': '16.', 'address': 'NY', 'gender': 'male'},
{'name': 'raj', 'age': '13.', 'address': 'IND', 'gender': 'male'}]
I don't think list comprehension will be a wise choice because you have multiple lists.
Instead of making multiple lists and appending to them the value if the key matches you can use defaultdict to simplify your code.
from collections import defaultdict
result = defaultdict(list)
for line in pg:
    for key, value in line.items():
        result[key].append(value)
You can get the name list by using result.get('name')
['paul', 'mei', 'smith', 'raj']
This probably won't work the way you want: you're trying to build three different lists, so you would need three different comprehensions. If your data is large, this would roughly triple your execution time.
Something straightforward, such as
name = [value for line in pg for key, value in line.items() if key == "name"]
seems to be what you'd want ... three times.
You can proceed as :
pg=[{"name":"name1","age":"age1","address":"address1"},{"name":"name2","age":"age2","address":"address2"}]
name=[v for line in pg for k,v in line.items() if k=="name"]
age=[v for line in pg for k,v in line.items() if k=="age"]
address=[v for line in pg for k,v in line.items() if k=="address"]
In continuation of Vishal's answer, please don't use defaultdict here. A defaultdict silently creates an entry whenever a missing key is looked up with [], so you never get a KeyError for a mistyped key. Use setdefault instead.
result = dict()
for line in pg:
    for key, value in line.items():
        result.setdefault(key, []).append(value)
Output
{
'name': ['paul', 'mei', 'smith', 'raj'],
'age': ['26.', '26.', '16.', '13.'],
...
}
However, note that if the dicts in pg don't all have the same keys, the resulting lists will have different lengths and you will lose the correspondence between positions in those lists.
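If keeping the positions aligned matters, one option is to pad missing values with None. This is only a sketch: the second record, which lacks an 'age' key, is a hypothetical example and not part of the question's data.

```python
pg = [
    {'name': 'paul', 'age': '26.', 'address': 'AU'},
    {'name': 'mei', 'address': 'NY'},  # hypothetical record with no 'age'
]

# Collect every key that appears in any record, preserving first-seen order.
keys = list(dict.fromkeys(key for line in pg for key in line))

# Use .get so a missing key contributes None instead of being skipped.
result = {key: [line.get(key) for line in pg] for key in keys}
print(result)
```

Every list in result now has one entry per record, so index i always refers to the same person.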
Here is a really simple solution if you want to use pandas:
import pandas as pd
df = pd.DataFrame(a)
name = df['name'].tolist()
age = df['age'].tolist()
address = df['address'].tolist()
print(name)
print(age)
print(address)
Output:
['paul', 'mei', 'smith', 'raj']
['26.', '26.', '16.', '13.']
['AU', 'NY', 'NY', 'IND']
Additionally, if your end result is a text file, you can skip the list creation and write the DataFrame (or parts thereof) directly to a CSV with something as simple as:
df.to_csv('/path/to/output.csv')
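If you would rather not pull in pandas just to write a file, the standard library's csv.DictWriter can write the list of dicts directly. A rough sketch, where the choice of columns is mine and an in-memory buffer stands in for a real file so the example is self-contained:

```python
import csv
import io

a = [{'name': 'paul', 'age': '26.', 'address': 'AU', 'gender': 'male'},
     {'name': 'mei', 'age': '26.', 'address': 'NY', 'gender': 'female'}]

# The buffer stands in for a file opened with open(path, 'w', newline='').
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['name', 'age', 'address'],
                        extrasaction='ignore')  # ignore keys not listed, e.g. 'gender'
writer.writeheader()
writer.writerows(a)

text = buffer.getvalue()
print(text)
```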
I have a list of dictionaries in which the dictionaries also contain a list.
I want to generate a set of the values of the respective nested lists so that I end up with a set of all of the unique items (in this case, hobbies).
I feel a set is perfect for this since it will automatically drop any duplicates, leaving me with a set of all of the unique hobbies.
people = [{'name': 'John', 'age': 47, 'hobbies': ['Python', 'cooking', 'reading']},
{'name': 'Mary', 'age': 16, 'hobbies': ['horses', 'cooking', 'art']},
{'name': 'Bob', 'age': 14, 'hobbies': ['Python', 'piano', 'cooking']},
{'name': 'Sally', 'age': 11, 'hobbies': ['biking', 'cooking']},
{'name': 'Mark', 'age': 54, 'hobbies': ['hiking', 'camping', 'Python', 'chess']},
{'name': 'Alisa', 'age': 52, 'hobbies': ['camping', 'reading']},
{'name': 'Megan', 'age': 21, 'hobbies': ['lizards', 'reading']},
{'name': 'Amanda', 'age': 19, 'hobbies': ['turtles']},
]
unique_hobbies = (item for item in people['hobbies'] for hobby in people['hobbies'].items())
print(unique_hobbies)
This generates an error:
TypeError: list indices must be integers or slices, not str
My comprehension is wrong, but I am not sure where. I want to iterate through each dictionary, then iterate through each nested list and update the items into the set, which will drop all duplicates, leaving me with a set of all of the unique hobbies.
You could also use a set-comprehension:
>>> unique_hobbies = {hobby for persondct in people for hobby in persondct['hobbies']}
>>> unique_hobbies
{'horses', 'lizards', 'cooking', 'art', 'biking', 'camping', 'reading', 'piano', 'hiking', 'turtles', 'Python', 'chess'}
The problem with your comprehension is that you try to access people['hobbies'], but people is a list, and lists can only be indexed with integers or slices. To make it work you need to iterate over your list and then access the 'hobbies' of each of the dicts (like I did inside the set comprehension above).
I got it:
unique_hobbies = set()
for d in people:
    unique_hobbies.update(d['hobbies'])
print(unique_hobbies)
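Another way to write the same flattening, as a sketch using itertools.chain.from_iterable. The shortened people list and the record without a 'hobbies' key are illustrative assumptions of mine; .get with an empty list as the default lets such a record be skipped quietly.

```python
from itertools import chain

people = [
    {'name': 'John', 'hobbies': ['Python', 'cooking', 'reading']},
    {'name': 'Mary', 'hobbies': ['horses', 'cooking', 'art']},
    {'name': 'NoHobbies'},  # hypothetical record without a 'hobbies' key
]

# chain.from_iterable joins all the hobby lists into one stream; set() drops duplicates.
unique_hobbies = set(chain.from_iterable(p.get('hobbies', []) for p in people))

# Sets are unordered, so sort when a stable display order is wanted.
print(sorted(unique_hobbies))
```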
I am a bit of a newbie to Python and its data structures (dict, list).
So I have the following JSON data:
{'Namelist': {'thomas': {'gender': 'male', 'age': '23'}, 'david': {'gender': 'male'}, 'jennie': {'gender': 'female', 'age': '23'}, 'alex': {'gender': 'male'}}, 'selectors': {'naming': 'studentlist', 'code': 16}}
How can I iterate through the data and get a result like this:
if age == 23, return thomas and jennie as output and store it in a variable as a string.
NOTE: It should iterate through the whole data and search for age. I am using a "for each" loop for this, but it is not working.
Any help is appreciated.
It looks like you already have the JSON parsed into an object, so you can just iterate through it and check the person's age.
dictionary = {
'Namelist': {
'thomas': {'gender': 'male', 'age': '23'},
'david': {'gender': 'male'},
'jennie': {'gender': 'female', 'age': '23'},
'alex': {'gender': 'male'}},
'selectors': {'naming': 'studentlist', 'code': 16}}
# For Loop Method
name_list = []
for name, person in dictionary['Namelist'].items():
    if person.get('age') == '23':
        name_list.append(name)
print(', '.join(name_list)) # Would print 'thomas, jennie'
# List Comprehension Method
name_list = [name for name, person in dictionary['Namelist'].items() if person.get('age') == '23']
print(', '.join(name_list))
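If you need this for more than one age, the same loop can be wrapped in a small helper. This is only a sketch: the function name names_with_age is made up for illustration, and it assumes every stored age is a string that int() can parse, as in the data shown.

```python
def names_with_age(data, age):
    """Return the names in data['Namelist'] whose 'age' equals the given integer."""
    matches = []
    for name, person in data.get('Namelist', {}).items():
        raw = person.get('age')  # None when the person has no 'age' key
        # Ages are stored as strings in the data, so convert before comparing.
        if raw is not None and int(raw) == age:
            matches.append(name)
    return matches

dictionary = {
    'Namelist': {
        'thomas': {'gender': 'male', 'age': '23'},
        'david': {'gender': 'male'},
        'jennie': {'gender': 'female', 'age': '23'},
        'alex': {'gender': 'male'}},
    'selectors': {'naming': 'studentlist', 'code': 16}}

result = ', '.join(names_with_age(dictionary, 23))
print(result)
```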
This is a quick and dirty way that I'd do it. You can get into list comprehensions as well, but I thought this was easier for you to understand as a newbie. It works in Python 3 as well, since I call print() with parentheses.
variable = {'Namelist': {'thomas': {'gender': 'male', 'age': '23'},
                         'david': {'gender': 'male'},
                         'jennie': {'gender': 'female', 'age': '23'},
                         'alex': {'gender': 'male'}},
            'selectors': {'naming': 'studentlist', 'code': 16}}
response = list()  # Create a list to use to store the iterations.
for key, value in variable.items():  # Loop through the main dictionary
    if key == 'Namelist':  # Filter by the Namelist
        for theName, subValue in value.items():  # Loop through the dictionaries made for each name.
            # The age key wasn't in every dictionary, so I check if it exists, then I check if it is set to 23.
            if 'age' in subValue and subValue['age'] == '23':
                response.append(theName + ' is 23 ')  # Add it to the response list.
nameString = ''.join(response)  # Turn the list into a string.
print(nameString)  # Print it