ZIP function on Dictionaries - python

I am struggling with an assignment for a course I have enrolled in.
Create a function which returns a list of countries where the number of cases is equal to one.
Hint: you can use the zip() function in Python to iterate over two lists at the same time.
The prior question was to get the number of countries which had a single case of corona.
There were 7 countries in the output, and the following worked for that:
# Add your code below
def single_case_country_count(data):
    item = data['Total Cases']
    count = item.count(1)
    if count == 0:
        print('None found')
    return count
    # pass
However, I am struggling with the second part: returning the names of those 7 countries.
type(latest) shows dict.
I wrote the code below, assuming I would get only the cases equal to 1 alongside the original list of names, group them with zip(), and finally return only the list of countries.
def single_case_countries(data):
    cases = data['Total Cases'] == 1
    names = data['Country']
    zipped = zip(names,cases)
    final = list(zipped)
    return final['Country']
    # pass
TypeError: 'bool' object is not iterable
The clear issue is that I cannot filter the dictionary with "cases = data['Total Cases'] == 1", because comparing the whole list to 1 returns a single boolean.
I was wondering if there is any advice, especially on filtering a dictionary for a specific value.
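A minimal sketch of why the TypeError appears, assuming data['Total Cases'] is a plain Python list (the earlier call to .count(1) suggests it is). The country names and case numbers below are made up for illustration.

```python
# Hypothetical data in the same shape as the assignment's dictionary
data = {
    'Country': ['A', 'B', 'C'],
    'Total Cases': [1, 25, 1],
}

# Comparing the whole list to 1 yields a single bool, not a filtered list
whole_list_comparison = data['Total Cases'] == 1

# Comparing element by element yields one bool per country
per_item_comparison = [cases == 1 for cases in data['Total Cases']]

print(whole_list_comparison)
print(per_item_comparison)
```

Passing the single bool to zip() is what raises the error, since zip() needs something it can iterate over.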

I managed to solve this with the following code:
def single_case_countries(data):
    countries = []
    for country,cases in zip(data["Country"],data["Total Cases"]):
        if cases == 1:
            countries.append(country)
    return countries
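The same loop can also be written as a list comprehension. This is only a sketch of an equivalent form; the sample dictionary is hypothetical and stands in for the course data.

```python
def single_case_countries(data):
    """Return the countries whose case count is exactly 1."""
    return [
        country
        for country, cases in zip(data["Country"], data["Total Cases"])
        if cases == 1
    ]

# Hypothetical sample in the same shape as the assignment's dictionary
sample = {"Country": ["A", "B", "C", "D"], "Total Cases": [1, 40, 1, 7]}
single_case = single_case_countries(sample)
print(single_case)
```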

Related

Confused about how a particular code works for extracting values from list of dictionaries

I'm working on a pandas dataframe that has a column consisting of lists of dictionaries. Here is what one of the elements in the column looks like:
df['crew'][0] = [{'job': 'Director', 'name':'ABC'}, {'job':'Producer', 'name':'XYZ'}]
This is how I iterate through it to get the name of the director:
list1 = []
for elements in df['crew'][0]:
    for dicts in elements:
        if dicts['job'] == 'Director':
            list1.append(dicts['name'])
And this works. But when applied to the whole dataframe, it gives me an error: 'float' object is not iterable
However, when I apply the following function I get the desired result, but I don't understand how the code works:
def get_directors(dataframe):
    try:
        director_list = []
        for elements in dataframe['crew']:
            if elements['job'] == 'Director':
                director_list.append(elements['name'])
        dataframe['directors'] = director_list
        return dataframe
    except:
        return None
Why is one more loop not needed after going inside dataframe['crew']? Also, when I try this code on its own on a particular element, it fails at the 6th line, demanding integer indices rather than strings.
Link for data: https://drive.google.com/file/d/1dNVpcxB8M1J7dcV4E58Bg1fqwi48UiF2/view?usp=sharing
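One possible reading, not confirmed by the thread: the function is applied one row at a time, so dataframe['crew'] is already a single row's list of dictionaries and one loop is enough. Rows with no crew would then hold NaN, which is a float, and that would explain the 'float' object is not iterable error. The sketch below uses plain lists in place of the dataframe column, and the helper name director_names is hypothetical.

```python
def director_names(crew):
    """Return the names of all directors in one row's crew entry."""
    # Assumption: a missing crew entry is NaN (a float), not a list
    if not isinstance(crew, list):
        return []
    return [member['name'] for member in crew if member.get('job') == 'Director']

# Hypothetical rows standing in for the values of df['crew']
rows = [
    [{'job': 'Director', 'name': 'ABC'}, {'job': 'Producer', 'name': 'XYZ'}],
    float('nan'),
]

directors_per_row = [director_names(crew) for crew in rows]
print(directors_per_row)
```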

Python List comprehension indicating undefined variables

df_store_index_list = df_store.index.tolist()
df_store_column_list = df_store[column].tolist()
list_to_be_returned = []
for i in range(len(df_store_index_list)):
    list_to_be_returned.append([df_store_index_list[i], df_store_column_list[i]])
# return list_to_be_returned
return [[df_store_index_list[i], df_store_column_list[i]] for i in range(len(df_store_index_list))]  # not working!!!!
I have a function that returns a two-dimensional list.
Problem: the list comprehension on the last line gives me an error saying "df_store_index_list is not defined".
Workaround: I built my own list (list_to_be_returned) with an explicit for loop, and that works fine. But I was wondering, why is the list comprehension not working?
Here is the complete code:
@classmethod
def store_specific_info_string(cls, store_name, column, ascending=False):
    """
    Brief
    - filter for specific store
    Description
    - obtain sum of column based on specific `Store_Name`
    Parameter
    - store_name : inside the `Store_Name` column
    - ascending : True or False
    - column : sum of what column? (Total_Sales, Total_Profit)
    Return Value(s)
    - tuple of name(Item_Description) and sum of column passed based on name.
    """
    # filter the store by store name
    df_store = cls.dataframe[cls.dataframe["Store_Name"] == store_name]
    df_store = df_store.groupby("Item_Description").sum()[[column]]
    # sort them by the column(integer)
    df_store.sort_values(column, ascending=ascending, inplace=True)
    df_store_index_list = df_store.index.tolist()
    df_store_column_list = df_store[column].tolist()
    list_to_be_returned = []
    for i in range(len(df_store_index_list)):
        list_to_be_returned.append([df_store_index_list[i], df_store_column_list[i]])
    return list_to_be_returned
    # return [[df_store_index_list[i], df_store_column_list[i]] for i in range(len(df_store_index_list)) ] not working!!!!
here is a pdb initiated
inside pdb
Based on your comment, it seems that you really want the list-of-lists format built with a list comprehension. Here is another way to write it (though this doesn't explain why yours didn't work):
column_serie = df_store[column]
# Series.iteritems() was removed in pandas 2.0; items() is the equivalent call
[[idx, value] for idx, value in column_serie.items()]
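If the two lists from the question are already available, zip() pairs them without any index arithmetic. This is a sketch only: the item names and totals below are hypothetical stand-ins for df_store.index.tolist() and df_store[column].tolist(), and it does not explain the original error either.

```python
# Hypothetical stand-ins for df_store.index.tolist() and df_store[column].tolist()
df_store_index_list = ['Apples', 'Bread', 'Milk']
df_store_column_list = [120.0, 75.5, 42.0]

# Pair each item name with its total, producing a list of two-element lists
pairs = [[name, total] for name, total in zip(df_store_index_list, df_store_column_list)]
print(pairs)
```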

How do I use a dictionary as a rubric to assign a value to dataframe using pd.apply()

def create_rubric(number, df, col):
    """
    First finds all the unique fields then segments them in quintiles.
    Uses the quintiles to give ratings to the original data
    """
    sorted_col = df[col].sort_values()
    unique_val = sorted_col.unique()
    unique_cut = pd.qcut(unique_val,number,labels=False)
    unique_dict = {"Items" : unique_val, "Labels" : unique_cut}
    df = pd.DataFrame(unique_dict)
    rubric = {}
    rubric[1] = df[df.Labels == 0]
    rubric[2] = df[df.Labels == 1]
    rubric[3] = df[df.Labels == 2]
    rubric[4] = df[df.Labels == 3]
    rubric[5] = df[df.Labels == 4]
    return rubric
def frequency_star_rating(x, rubric):
    """
    Uses rubric to score the rows in the dataframe
    """
    for rate, key in rubric.items():
        if x in key:
            return rate
rubric = create_rubric(5,rfm_report,"ordersCount")
rfm_report["Frequency Rating"] = rfm_report["ordersCount"].apply(frequency_star_rating, rubric)
I’ve written two functions that should interact with each other. One creates a scoring rubric that ends up in a dictionary and the other should use that dictionary to score rows in a dataframe of about 700,000 rows. For some reason I keep getting the “Series objects are mutable and cannot be hashed” error. I really can’t figure out the best way to do this. Did I write the functions wrong?
It would be nice if you could provide a toy dataset so we could run your code quickly and see where the error happens.
The error you are getting is trying to tell you that a pd.Series object cannot be used as the key of a dictionary. The reason is that Python dictionaries are hash tables. So, they only accept hashable data types as the key. For example, strings and integers are hashable, but lists are not. So the following works fine:
fine_dict = {'John': 1, 'Lilly': 2}
While this one will throw a TypeError:
wrong_dict = {['John']: 1, ['Lilly']: 2}
The error will look like this: TypeError: unhashable type: 'list'.
So my hunch is that somewhere in your code, you're trying to use a Series object as the key of a dictionary, which you should not because it's not hashable.
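To make the hashability point concrete, here is a small sketch that tries different objects as dictionary keys. The helper name try_as_key is hypothetical and exists only for this illustration.

```python
def try_as_key(candidate):
    """Return the error message if `candidate` cannot be a dict key, else None."""
    try:
        {candidate: 1}
    except TypeError as exc:
        return str(exc)
    return None

# A list is mutable and unhashable, so it is rejected as a key
list_error = try_as_key(['John'])

# A tuple of hashable items is itself hashable, so it is accepted
tuple_error = try_as_key(('John',))

print(list_error)
print(tuple_error)
```

Converting a list to a tuple is a common way to get a usable key when the contents themselves are hashable.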

Python: Running function to append values to an empty list returns no values

This is probably a very basic question but I haven't been able to figure this out.
I'm currently using the following to append values to an empty list
shoes = {'groups':['running','walking']}
df_shoes_group_names = pd.DataFrame(shoes)
shoes_group_name=[]
for type in df_shoes_group_names['groups']:
    shoes_group_name.append(type)
shoes_group_name
['running', 'walking']
I'm trying to accomplish the same using a function; however, when I call it, the list comes back empty:
shoes_group_name=[]
def list_builder(dataframe_name):
    if 'shoes' in dataframe_name:
        for type in df_shoes_group_names['groups']:
            shoes_group_name.append(type)
list_builder(df_shoes_group_names)
shoes_group_name
[]
The reason for the function is that eventually I'll have multiple DataFrames for different products, so I'd like to have if statements within the function to handle the creation of each list.
For example, future code could look like this:
df_shoes_group_names
df_boots_group_names
df_sandals_group_names
shoes_group_name=[]
boots_group_name=[]
sandals_group_name=[]
def list_builder(dataframe_name):
    if 'shoes' in dataframe_name:
        for type in df_shoes_group_names['groups']:
            shoes_group_name.append(type)
    elif 'boots' in dataframe_name:
        for type in df_boots_group_names['groups']:
            boots_group_name.append(type)
    elif 'sandals' in dataframe_name:
        for type in df_sandals_group_names['groups']:
            sandals_group_name.append(type)
list_builder(df_shoes_group_names)
list_builder(df_boots_group_names)
list_builder(df_sandals_group_names)
Not sure if I'm approaching this the right way so any advice would be appreciated.
You should never call or search a variable name as if it were a string.
Instead, use a dictionary to store a variable number of variables.
Bad practice
# dataframes
df_shoes_group_names = pd.DataFrame(...)
df_boots_group_names = pd.DataFrame(...)
df_sandals_group_names = pd.DataFrame(...)
def foo(x):
    if shoes in df_shoes_group_names: # <-- THIS WILL NOT WORK
        # do something with x
        pass
Good practice
# dataframes
df_shoes_group_names = pd.DataFrame(...)
df_boots_group_names = pd.DataFrame(...)
df_sandals_group_names = pd.DataFrame(...)
dfs = {'shoes': df_shoes_group_names,
       'boots': df_boots_group_names,
       'sandals': df_sandals_group_names}
def foo(key):
    if 'shoes' in key: # <-- THIS WILL WORK
        # do something with dfs[key]
        pass
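For context on the empty list in the question: the in operator on a DataFrame tests its column labels, so 'shoes' in df_shoes_group_names is False when the only column is 'groups'. The dictionary approach could be sketched as follows, with plain lists standing in for each DataFrame's 'groups' column. The boots and sandals values are made up, and list_builder here takes a product key rather than a DataFrame.

```python
# Plain lists stand in for the 'groups' column of each DataFrame (hypothetical data)
groups_by_product = {
    'shoes': ['running', 'walking'],
    'boots': ['hiking', 'work'],
    'sandals': ['beach'],
}

def list_builder(product, source=groups_by_product):
    """Return a new list with the group names stored under `product`."""
    return list(source[product])

# One result list per product, keyed by name instead of separate variables
group_names = {product: list_builder(product) for product in groups_by_product}
print(group_names['shoes'])
```

Returning a new list, rather than appending to a global one, keeps the function free of side effects and removes the need for one if branch per product.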

Dataquest: I've just learned how to define a function in python. Now I want to run it in a loop.

I am a Python beginner learning with Dataquest.
I want to use a self-defined function in a loop to check, for every item in a list, whether it is a color movie or not, and add the results (True, False) to a list. Right now the function returns only False, and far too many times. Any hints on what I did wrong?
wonder_woman = ['Wonder Woman','Patty Jenkins','Color',141,'Gal Gadot','English','USA',2017]
def is_usa(input_lst):
    if input_lst[6] == "USA":
        return True
    else:
        return False
def index_equals_str(input_lst, index, input_str):
    if input_lst[index] == input_str:
        return True
    else:
        return False
wonder_woman_in_color = index_equals_str(input_str="Color", index=2, input_lst=wonder_woman)
# End of dataquest challenge
# My own try to use the function in a loop and add the results to a list
f = open("movie_metadata.csv", "r")
data = f.read()
rows = data.split("\n")
aufbereitet = []
for row in rows:
    einmalig = row.split(",")
    aufbereitet.append(einmalig)
# print(aufbereitet)
finale_liste = []
for item in aufbereitet:
    test = index_equals_str(input_str="Color", index=2, input_lst=aufbereitet)
    finale_liste.append(test)
print(finale_liste)
Also at pastebin: https://pastebin.com/AESjdirL
I appreciate your help!
The problem is in this line
test = index_equals_str(input_str="Color", index=2, input_lst=aufbereitet)
The input_lst argument should be input_lst=item. Right now you are passing the whole list of lists to your function every time.
The .csv file is not provided, but I assume the reading is correct and produces rows like the list in the first line of your code. In particular, I assume you are packing the data into a list of lists: einmalig is the list obtained from one row of the csv file, and each einmalig is appended to another list, aufbereitet.
The problem is not in the function itself but in the arguments you pass to it: when you do
test = index_equals_str(input_str="Color", index=2, input_lst=aufbereitet)
you should see that the third argument is not the list for a single movie but the whole list of movies. This means that on every pass of the loop (n passes, where n is the length of aufbereitet) the function effectively evaluates:
if aufbereitet[2] == "Color":
    return True
else:
    return False
Even if the movie is in color, this comparison between a list (aufbereitet[2] is one whole row) and a string is always False, because a list never compares equal to a string.
To correct the issue, replace the line
test = index_equals_str(input_str="Color", index=2, input_lst=aufbereitet)
with
test = index_equals_str(input_str="Color", index=2, input_lst=item)
since, when you use the for loop in that way, the variable item changes at each iteration with the elements in aufbereitet.
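Put together, the corrected loop might look like the sketch below. The CSV text is a hypothetical stand-in for movie_metadata.csv, and the length check is an added assumption: splitting on "\n" leaves an empty final row when the file ends with a newline, and indexing that row at position 2 would raise an IndexError.

```python
def index_equals_str(input_lst, index, input_str):
    return input_lst[index] == input_str

# Hypothetical stand-in for the contents of movie_metadata.csv
data = "Wonder Woman,Patty Jenkins,Color,141\nSome Old Film,Jane Doe,Black and White,95\n"

aufbereitet = [row.split(",") for row in data.split("\n")]

finale_liste = []
for item in aufbereitet:
    # Skip rows that are too short, such as the empty row after a trailing newline
    if len(item) <= 2:
        continue
    # Pass the single row (item), not the whole list of rows
    finale_liste.append(index_equals_str(input_lst=item, index=2, input_str="Color"))

print(finale_liste)
```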
Note that while you're learning it's perfectly fine to use functions, but Python also lets you write the same check inline. Using
finale_liste = [item[2] == "Color" for item in aufbereitet]
you obtain the list without defining a function and without writing an explicit for loop. That's called a list comprehension.
Another thing you can do to make the code more Pythonic, if you want to keep the function anyway, is something like
def index_equals_str(input_lst, index, input_str):
    return input_lst[index] == input_str
which gives the same result in fewer lines.
Functional programming is sometimes more readable and adaptable for such tasks:
from functools import partial
def index_equals_str(input_lst, index=1, input_str='Null'):
    return input_lst[index] == input_str
input_data = [['Name1', 'Category1', 'Color', 'Language1'],
              ['Name2', 'Category2', 'BW', 'Language2']]
result = list(map(partial(index_equals_str, input_str='Color', index=2), input_data))
# output
# [True, False]
