dataframe to list of dictionary - python

I have the following df:
df = pd.DataFrame({"year":[2020,2020,2020,2021,2021,2021,2022,2022, 2022],"region":['europe','USA','africa','europe','USA','africa','europe','USA','africa'],'volume':[1,6,5,3,8,7,6,3,5]})
I wish to convert it to a list of dictionaries such that the year is mentioned only once in each item. Example:
[{'year':2020,'europe':1,'USA':6,'africa':5,}...]
When I do:
df.set_index('year').to_dict('records')
I lose the years, and the list has one item per row instead of one per year.

Another approach that uses pivot before to_dict(orient='records'):
df.pivot(
    index='year',
    columns='region',
    values='volume'
).reset_index().to_dict(orient='records')
#Output:
#[{'year': 2020, 'USA': 6, 'africa': 5, 'europe': 1},
# {'year': 2021, 'USA': 8, 'africa': 7, 'europe': 3},
# {'year': 2022, 'USA': 3, 'africa': 5, 'europe': 6}]

Try:
d = [
    {"year": y, **dict(zip(x["region"], x["volume"]))}
    for y, x in df.groupby("year")
]
print(d)
Prints:
[
{"year": 2020, "europe": 1, "USA": 6, "africa": 5},
{"year": 2021, "europe": 3, "USA": 8, "africa": 7},
{"year": 2022, "europe": 6, "USA": 3, "africa": 5},
]

You can use groupby on year and then zip region and volume:
import pandas as pd
df = pd.DataFrame({"year":[2020,2020,2020,2021,2021,2021,2022,2022, 2022],"region":['europe','USA','africa','europe','USA','africa','europe','USA','africa'],'volume':[1,6,5,3,8,7,6,3,5]})
year_dfs = df.groupby("year")
records = []
for year, year_df in year_dfs:
    # Map each region to its volume for this year, then add the year itself.
    year_dict = dict(zip(year_df["region"], year_df["volume"]))
    year_dict["year"] = year
    records.append(year_dict)
""" Answer
[{'europe': 1, 'USA': 6, 'africa': 5, 'year': 2020},
{'europe': 3, 'USA': 8, 'africa': 7, 'year': 2021},
{'europe': 6, 'USA': 3, 'africa': 5, 'year': 2022}]
"""

To break down each step: you can use pivot to reshape your df so that the years become the index, the regions become the columns, and volume supplies the values.
# Keyword arguments: recent pandas versions no longer accept these positionally.
df.pivot(index='year', columns='region', values='volume')
region  USA  africa  europe
year
2020      6       5       1
2021      8       7       3
2022      3       5       6
To get this into dictionary format you can use .to_dict('index') (in one line):
x = df.pivot(index='year', columns='region', values='volume').to_dict('index')
{2020: {'USA': 6, 'africa': 5, 'europe': 1}, 2021: {'USA': 8, 'africa': 7, 'europe': 3}, 2022: {'USA': 3, 'africa': 5, 'europe': 6}}
Finally, you can use a list comprehension to get it into your desired format:
output = [dict(x[y], **{'year':y}) for y in x]
[{'USA': 6, 'africa': 5, 'europe': 1, 'year': 2020}, {'USA': 8, 'africa': 7, 'europe': 3, 'year': 2021}, {'USA': 3, 'africa': 5, 'europe': 6, 'year': 2022}]
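If the rows are already available as plain dictionaries (for instance, the result of calling to_dict('records') on the original frame), the same reshaping can be sketched without pandas. This is only an illustration of the grouping idea: the `rows` list is a shortened copy of the question's data and `group_by_year` is a hypothetical helper name, not something taken from the answers above.

```python
def group_by_year(rows):
    """Collapse long-format rows into one dict per year (hypothetical helper)."""
    by_year = {}
    for row in rows:
        # Create the per-year record the first time a year is seen,
        # then store region -> volume on that record.
        record = by_year.setdefault(row["year"], {"year": row["year"]})
        record[row["region"]] = row["volume"]
    # Dicts keep insertion order, so years come out in first-seen order.
    return list(by_year.values())


rows = [
    {"year": 2020, "region": "europe", "volume": 1},
    {"year": 2020, "region": "USA", "volume": 6},
    {"year": 2020, "region": "africa", "volume": 5},
    {"year": 2021, "region": "europe", "volume": 3},
    {"year": 2021, "region": "USA", "volume": 8},
    {"year": 2021, "region": "africa", "volume": 7},
]
result = group_by_year(rows)
print(result)
```

Note that if a (year, region) pair appears more than once, this sketch silently keeps the last volume; whether that is acceptable depends on the data.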

Related

How to find a total year sales from a dictionary?

I have this list of dictionaries, and my code only produces totals for June, May and September. How would I handle the months that do not appear in the data? Obviously, their totals should be zero.
{'account': 'Amazon', 'amount': 300, 'day': 3, 'month': 'June'}
{'account': 'Facebook', 'amount': 550, 'day': 5, 'month': 'May'}
{'account': 'Google', 'amount': -200, 'day': 21, 'month': 'June'}
{'account': 'Amazon', 'amount': -300, 'day': 12, 'month': 'June'}
{'account': 'Facebook', 'amount': 130, 'day': 7, 'month': 'September'}
{'account': 'Google', 'amount': 250, 'day': 27, 'month': 'September'}
{'account': 'Amazon', 'amount': 200, 'day': 5, 'month': 'May'}
The method I used for the months mentioned in the dictionaries:
year_balance = sum(d["amount"] for d in my_dict)
print(f"The total year balance is {year_balance} $.")
import calendar
months = calendar.month_name[1:]
results = dict(zip(months, [0]*len(months)))
for d in data:
    results[d["month"]] += d["amount"]
# then you have results dict with monthly amounts
# sum everything to get yearly total
total = sum(results.values())
This might help:
from collections import defaultdict
mydict = defaultdict(lambda: 0)
print(mydict["January"])
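As a rough sketch of how that defaultdict idea could be applied to the records from the question: the `records` list below is copied from the question, while the names `monthly` and `year_balance` are just illustrative choices. `defaultdict(int)` behaves like the lambda version above, since `int()` returns 0.

```python
from collections import defaultdict

# Sample records copied from the question.
records = [
    {'account': 'Amazon', 'amount': 300, 'day': 3, 'month': 'June'},
    {'account': 'Facebook', 'amount': 550, 'day': 5, 'month': 'May'},
    {'account': 'Google', 'amount': -200, 'day': 21, 'month': 'June'},
    {'account': 'Amazon', 'amount': -300, 'day': 12, 'month': 'June'},
    {'account': 'Facebook', 'amount': 130, 'day': 7, 'month': 'September'},
    {'account': 'Google', 'amount': 250, 'day': 27, 'month': 'September'},
    {'account': 'Amazon', 'amount': 200, 'day': 5, 'month': 'May'},
]

monthly = defaultdict(int)  # months that were never seen default to 0
for rec in records:
    monthly[rec['month']] += rec['amount']

year_balance = sum(monthly.values())

print(monthly['June'])     # -200
print(monthly['January'])  # 0
print(year_balance)        # 930
```

One thing to keep in mind: looking up a missing key such as `monthly['January']` also stores that key with the value 0, so use `monthly.get('January', 0)` if the dictionary should stay unchanged.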
Also, given the comments you have written, is this what you are looking for?
your_list_of_dicts = [
{"January": 3, "March": 5},
{"January": 3, "April": 5}
]
import calendar
months = calendar.month_name[1:]
month_totals = dict()
for month in months:
    month_totals[month] = 0
    for d in your_list_of_dicts:
        month_totals[month] += d[month] if month in d else 0
print(month_totals)
{'January': 6, 'February': 0, 'March': 5, 'April': 5, 'May': 0, 'June': 0, 'July': 0, 'August': 0, 'September': 0, 'October': 0, 'November': 0, 'December': 0}
You can read the following blog post about using dictionaries and performing calculations on them:
5 best ways to sum dictionary values in python
This is one of the examples given in the blog:
wages = {'01': 910.56, '02': 1298.68, '03': 1433.99, '04': 1050.14, '05': 877.67}
total = sum(wages.values())
print('Total Wages: ${0:,.2f}'.format(total))
Here is the result with 100,000 records.
[Image: result with 100,000 records]

previous key to current key

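I am new to the concept of dictionaries and am trying to learn them. What I have is a dictionary like this:
{'cars': [{'values': [1, 534], 'sedan': 1, 'count': 2},
          {'values': [25, 32, 164], 'sedan': 1, 'count': 10}],
 'bikes': [{'values': [23, 12, 1], 'road': 0, 'count': 9},
           {'values': [2, 4], 'road': 1, 'count': 24},
           {'values': [68, 69], 'sedan': 0, 'count': 28},
           {'values': [4, 93], 'sedan': 0, 'count': 6}]}
What I am trying to achieve is to add IDs to all the inner value groups, starting from 1.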
If you want the ID as part of the value group, like this:
{'cars': [{'values': [1, 534], 'sedan': 1, 'count': 2, 'ID': 1},
{'values': [25, 32, 164], 'sedan': 1, 'count': 10, 'ID': 2}],
'bikes': [{'values': [23, 12, 1], 'road': 0, 'count': 9},
...
You can do:
for i in range(len(try_dict['cars'])):
    try_dict['cars'][i]['ID'] = i+1
If you want what Phydeaux suggests, you can do:
new_dict = {'cars': {}}
for i in range(len(try_dict['cars'])):
    new_dict['cars'][i+1] = try_dict['cars'][i]
Which will give you:
{'cars': {1: {'values': [1, 534], 'sedan': 1, 'count': 2},
2: {'values': [25, 32, 164], 'sedan': 1, 'count': 10}}}
If you want not just cars but also bikes (and maybe trucks, trains, whatever...). Use:
new_dict = {}
for key in try_dict.keys():
    new_dict[key] = {}
    for i in range(len(try_dict[key])):
        new_dict[key][i+1] = try_dict[key][i]
This will give you:
{'cars': {1: {'values': [1, 534], 'sedan': 1, 'count': 2},
2: {'values': [25, 32, 164], 'sedan': 1, 'count': 10}},
'bikes': {1: {'values': [23, 12, 1], 'road': 0, 'count': 9},
2: {'values': [2, 4], 'road': 1, 'count': 24},
3: {'values': [68, 69], 'sedan': 0, 'count': 28},
4: {'values': [4, 93], 'sedan': 0, 'count': 6}}}
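The index-based loops above can also be written with enumerate(..., start=1). The following is a minimal sketch, assuming `try_dict` has the shape shown in the question (shortened here); `new_dict` matches the name used above, and `with_ids` is a name introduced only for this example.

```python
# Shortened copy of the question's data, for illustration only.
try_dict = {
    'cars': [{'values': [1, 534], 'sedan': 1, 'count': 2},
             {'values': [25, 32, 164], 'sedan': 1, 'count': 10}],
    'bikes': [{'values': [23, 12, 1], 'road': 0, 'count': 9},
              {'values': [2, 4], 'road': 1, 'count': 24}],
}

# Variant 1: key each group by its 1-based position.
new_dict = {
    key: dict(enumerate(groups, start=1))
    for key, groups in try_dict.items()
}

# Variant 2: copy each group and add an 'ID' field, leaving try_dict untouched.
with_ids = {
    key: [{**group, 'ID': i} for i, group in enumerate(groups, start=1)]
    for key, groups in try_dict.items()
}

print(new_dict['bikes'][2])  # {'values': [2, 4], 'road': 1, 'count': 24}
print(with_ids['cars'][0])   # {'values': [1, 534], 'sedan': 1, 'count': 2, 'ID': 1}
```

Variant 1 reuses the original inner dictionaries rather than copying them, so later changes to `try_dict` would show up in `new_dict` as well.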
You can do this using a simple function:
def idx(data, key):
    # Insert a placeholder 0 at position 0 so the real items start at index 1.
    data[key].insert(0, 0)
    return data
Full Code:
def idx(data, key):
    data[key].insert(0, 0)
    return data
data = {'cars': [{'values': [1, 534],
                  'sedan': 1,
                  'count': 2},
                 {'values': [25,32,164],
                  'sedan': 1,
                  'count': 10}],
        'bikes': [{'values': [23,12,1],
                   'road': 0,
                   'count': 9},
                  {'values': [2,4],
                   'road': 1,
                   'count': 24},
                  {'values': [68,69],
                   'sedan': 0,
                   'count': 28},
                  {'values': [4,93],
                   'sedan': 0,
                   'count': 6}]}
data = idx(data, "cars")
print(data["cars"][1])
Explanation:
Replace the dictionary with the edited dictionary (the name data is used instead of dict so the built-in dict type is not shadowed):
data = {key: [...,...,...]}
data = idx(data, key)
The function uses the list .insert method to put 0 at index 0 of the list stored under the given key, so the original items start at index 1.
Learn more about Python .insert() method at:
https://www.w3schools.com/python/ref_list_insert.asp
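To make the effect of that call concrete, here is a small sketch of what list.insert(0, 0) does. The `groups` list holds made-up string stand-ins for the inner dictionaries; it is not data from the question.

```python
groups = ['first', 'second', 'third']  # hypothetical stand-ins for the inner dicts

# Insert the placeholder 0 at index 0; every existing element shifts right by one.
groups.insert(0, 0)

print(groups)       # [0, 'first', 'second', 'third']
print(groups[1])    # first
print(len(groups))  # 4
```

The trade-off is that the placeholder is a real element: it is counted by len() and visited when looping over the list, so code that iterates would need to skip index 0.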

Get value if key exists in list of dictionary

I have a python list like
[{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
and want to generate it like
[
{'month': 1, 'total': 0},
{'month': 2, 'total': 0},
...
{'month': 8, 'total': 31600},
{'month': 9, 'total': 2000},
...
{'month': 12, 'total': 0}
]
For that, I'm iterating over range(1, 13):
new_list = []
for i in range(1, 13):
    # if i exists in month, append to new_list
    # else add total: 0 and append to new_list
How can I check if i exists in month and get the dictionary?
You can convert your list of dict into direct month: total mapping with
monthly_totals = {item['month']: item['total'] for item in data_list}
and use a simple list comprehension with dict.get to handle missing values:
new_list = [{'month': i, 'total': monthly_totals.get(i, 0)} for i in range(1, 13)]
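Putting those two steps together, a runnable sketch might look like the following. The `fill_months` helper and its `default` parameter are hypothetical names introduced for this example. Unlike the dict comprehension above, which keeps only the last entry when a month appears twice, this version adds duplicate months together; that is an assumption about the desired behaviour, not something the question states.

```python
def fill_months(items, default=0):
    """Return 12 month records, using `default` for months missing from items."""
    totals = {}
    for item in items:
        # Sum totals if the same month occurs more than once (assumed behaviour).
        totals[item['month']] = totals.get(item['month'], 0) + item['total']
    return [{'month': m, 'total': totals.get(m, default)} for m in range(1, 13)]


data_list = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
new_list = fill_months(data_list)

print(len(new_list))  # 12
print(new_list[0])    # {'month': 1, 'total': 0}
print(new_list[7])    # {'month': 8, 'total': 31600.0}
```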
Create a new list containing the default values and then update the needed values from the original list
>>> from pprint import pprint
>>> lst = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
>>> new_lst = [dict(month=i, total=0) for i in range(1,13)]
>>> for d in lst:
...     new_lst[d['month']-1] = d
...
>>> pprint(new_lst)
[{'month': 1, 'total': 0},
 {'month': 2, 'total': 0},
 {'month': 3, 'total': 0},
 {'month': 4, 'total': 0},
 {'month': 5, 'total': 0},
 {'month': 6, 'total': 0},
 {'month': 7, 'total': 0},
 {'month': 8, 'total': 31600.0},
 {'month': 9, 'total': 2000.0},
 {'month': 10, 'total': 0},
 {'month': 11, 'total': 0},
 {'month': 12, 'total': 0}]
exist_lst = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
new_lst = []
for i in range(1,13):
    found = False
    for dict_item in exist_lst:
        if dict_item['month'] == i:
            new_lst.append(dict_item)
            found = True
    if not found:
        new_lst.append({'month': i, 'total': 0}) # default_dict_item
print(new_lst)

Aggregate values on lists of dicts based on key in python

I'm trying to get the aggregation of 2 different lists, where each element is a dictionary with 2 entries, month and value.
So the first list looks like this:
[{
'patient_notes': 5,
'month': datetime.date(2017, 1, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 2, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 5, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 7, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 8, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 12, 1)
}]
Second list is:
[{
'employee_notes': 4,
'month': datetime.date(2017, 2, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 3, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 4, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 8, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 9, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 10, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 12, 1)
}]
So I need to build a new list that contains the sum of both lists per month, something like this:
[{
'total_messages': 14,
'month': '2017-01-01'
}, {
'total_messages': 14,
'month': '2017-02-01'
}, {
'total_messages': 14,
'month': '2017-03-01'
}, {
'total_messages': 14,
'month': '2017-04-01'
}, {
'total_messages': 14,
'month': '2017-05-01'
}, {
'total_messages': 14,
'month': '2017-06-01'
}, {
'total_messages': 14,
'month': '2017-07-01'
}, {
'total_messages': 14,
'month': '2017-08-01'
}, {
'total_messages': 14,
'month': '2017-09-01'
}, {
'total_messages': 14,
'month': '2017-10-01'
}, {
'total_messages': 14,
'month': '2017-11-01'
}, {
'total_messages': 14,
'month': '2017-12-01'
}]
I first tried zip, but that only works if the two lists are the same size. Then I tried itertools.izip_longest, but that has problems if the lists are the same size but contain different months. I cannot simply aggregate by position; I need to aggregate matching months only.
Counter would also be great for this, but I cannot change the key names of the original lists. Any ideas?
You can use defaultdict to create a counter. Go through each item in the first list and add the patient_notes value to the dictionary. Then go through the second list and add the employee_notes values.
Now you need to encode your new defaultdict back into a list in your desired format. You can use a list comprehension for that. I've sorted the list by month.
from collections import defaultdict
dd = defaultdict(int)
for d in my_list_1:
    dd[d['month']] += d['patient_notes']
for d in my_list_2:
    dd[d['month']] += d['employee_notes']
result = [{'total_messages': dd[k], 'month': k} for k in sorted(dd.keys())]
>>> result
[{'month': datetime.date(2017, 1, 1), 'total_messages': 5},
{'month': datetime.date(2017, 2, 1), 'total_messages': 9},
{'month': datetime.date(2017, 3, 1), 'total_messages': 4},
{'month': datetime.date(2017, 4, 1), 'total_messages': 4},
{'month': datetime.date(2017, 5, 1), 'total_messages': 5},
{'month': datetime.date(2017, 7, 1), 'total_messages': 5},
{'month': datetime.date(2017, 8, 1), 'total_messages': 9},
{'month': datetime.date(2017, 9, 1), 'total_messages': 4},
{'month': datetime.date(2017, 10, 1), 'total_messages': 4},
{'month': datetime.date(2017, 12, 1), 'total_messages': 9}]
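from collections import defaultdict
d_dict = defaultdict(int)
for d in l1 + l2:
    # Look the count up by key name; unpacking d.values() would depend on
    # the key order inside each dict and can swap month and count.
    count = d['patient_notes'] if 'patient_notes' in d else d['employee_notes']
    d_dict[d['month']] += count
[{'month': i.strftime("%Y-%m-%d"), 'total_messages': j} for i, j in sorted(d_dict.items())]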
Output:
[{'month': '2017-01-01', 'total_messages': 5},
{'month': '2017-02-01', 'total_messages': 9},
{'month': '2017-03-01', 'total_messages': 4},
{'month': '2017-04-01', 'total_messages': 4},
{'month': '2017-05-01', 'total_messages': 5},
{'month': '2017-07-01', 'total_messages': 5},
{'month': '2017-08-01', 'total_messages': 9},
{'month': '2017-09-01', 'total_messages': 4},
{'month': '2017-10-01', 'total_messages': 4},
{'month': '2017-12-01', 'total_messages': 9}]
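Both outputs above skip the months that appear in neither list (June and November), whereas the desired output in the question lists all twelve months as strings. A possible sketch that fills those gaps with 0 is shown below. It assumes that all records fall in a single known year (the `year` variable) and uses shortened, illustrative copies of the two lists; `patient`, `employee` and `totals` are names chosen for this example.

```python
import datetime
from collections import defaultdict

# Shortened copies of the two lists from the question.
patient = [
    {'patient_notes': 5, 'month': datetime.date(2017, 1, 1)},
    {'patient_notes': 5, 'month': datetime.date(2017, 2, 1)},
]
employee = [
    {'employee_notes': 4, 'month': datetime.date(2017, 2, 1)},
    {'employee_notes': 4, 'month': datetime.date(2017, 3, 1)},
]

totals = defaultdict(int)
for d in patient:
    totals[d['month']] += d['patient_notes']
for d in employee:
    totals[d['month']] += d['employee_notes']

year = 2017  # assumption: every record belongs to this one year
result = [
    {
        # .get() returns 0 for absent months without adding them to totals.
        'total_messages': totals.get(datetime.date(year, m, 1), 0),
        'month': datetime.date(year, m, 1).isoformat(),
    }
    for m in range(1, 13)
]

print(result[:4])
```

With the shortened data, January totals 5, February 9, March 4, and every other month 0.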

Summing over an array and then multiply by a dictionary [closed]

Fruits = ['apple', 'orange', 'banana', 'kiwi']
A = [4, 3, 10, 8]
B = {'apple': {'Bill': 4, 'Jan': 3, 'Frank': 5},
'orange': {'Bill': 0, 'Jan': 1, 'Frank': 5},
'banana': {'Bill': 8, 'Jan': 6, 'Frank': 2},
'kiwi': {'Bill': 4, 'Jan': 2, 'Frank': 7}}
I am trying to sum over all the fruits of A and multiply that by B. I am having trouble because A is an array of just numbers and B is a dictionary, and this is where I get confused. I am a new Python user. The numbers in A are in the same positions as the names in Fruits (the first number in A is the number of apples). Would this involve using sum(A)?
Sorry, folks, for the lack of detail. Here is some clarification. I have fruits, and I have the number of fruits that each person has, by type. I want to sum all of the values for each fruit type in B so that I get:
apple = 12
orange = 6
banana = 16
kiwi = 13
Now I want to multiply these numbers by A, keeping in mind that the first number in A is for apple, the next for orange, and so on, to get a new array:
Solution = [48,18,160,104] #solution order is apple, orange, banana, kiwi
Assuming that you want to multiply the sum of the fruits for each person (in B) by the cost in A, you can use the following list comprehension:
>>> [cost * sum(B[fruit].values()) for cost, fruit in zip(A, Fruits)]
[48, 18, 160, 104]
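For readers who want to run this end to end, here is a self-contained sketch using the data from the question. It pairs each fruit name with its number from A first, so the result can be looked up by name; `cost_by_fruit` and `solution_by_fruit` are names introduced for this example, and treating A as a "cost" follows the answer's assumption rather than anything the question states.

```python
Fruits = ['apple', 'orange', 'banana', 'kiwi']
A = [4, 3, 10, 8]
B = {'apple': {'Bill': 4, 'Jan': 3, 'Frank': 5},
     'orange': {'Bill': 0, 'Jan': 1, 'Frank': 5},
     'banana': {'Bill': 8, 'Jan': 6, 'Frank': 2},
     'kiwi': {'Bill': 4, 'Jan': 2, 'Frank': 7}}

# Pair each fruit name with its multiplier so the positions cannot drift apart.
cost_by_fruit = dict(zip(Fruits, A))

# Per fruit: multiplier times the sum over all people.
solution_by_fruit = {
    fruit: cost_by_fruit[fruit] * sum(B[fruit].values())
    for fruit in Fruits
}
solution = list(solution_by_fruit.values())

print(solution_by_fruit)  # {'apple': 48, 'orange': 18, 'banana': 160, 'kiwi': 104}
print(solution)           # [48, 18, 160, 104]
print(sum(solution))      # 330
```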
fruit_costs = dict(zip(Fruits, A))
for fruit in Fruits:
    print("Fruit:", fruit, "=", sum(B[fruit].values()) * fruit_costs[fruit])
I guess?
Merge everything into one big dictionary; everything here is just properties of fruits:
>>> for i, fruit in enumerate(Fruits):
...     B[fruit]['cost'] = A[i]
...
>>> B
{'banana': {'Frank': 2, 'Jan': 6, 'Bill': 8, 'cost': 10}, 'apple': {'Frank': 5, 'Jan': 3, 'Bill': 4, 'cost': 4}, 'orange': {'Frank': 5, 'Jan': 1, 'Bill': 0, 'cost': 3}, 'kiwi': {'Frank': 7, 'Jan': 2, 'Bill': 4, 'cost': 8}}
Rename "B" to "fruits" (losing the old value of "fruits"):
>>> fruits = B
Calculate fruit cost for each fruit:
>>> for fruitname in fruits:
...     fruit = fruits[fruitname]
...     fruit['total'] = fruit['Frank'] + fruit['Bill'] + fruit['Jan']
...     fruit['total cost'] = fruit['cost'] * fruit['total']
...
>>> fruits
{'banana': {'total': 16, 'Frank': 2, 'Jan': 6, 'total cost': 160, 'Bill': 8, 'cost': 10}, 'apple': {'total': 12, 'Frank': 5, 'Jan': 3, 'total cost': 48, 'Bill': 4, 'cost': 4}, 'orange': {'total': 6, 'Frank': 5, 'Jan': 1, 'total cost': 18, 'Bill': 0, 'cost': 3}, 'kiwi': {'total': 13, 'Frank': 7, 'Jan': 2, 'total cost': 104, 'Bill': 4, 'cost': 8}}
Calculate total cost:
>>> total = sum(fruits[fruit]['total cost'] for fruit in fruits)
Or if that last line is awkward since you're new to Python, you can expand it out into:
>>> total = 0
>>> for fruitname in fruits:
...     fruit = fruits[fruitname]
...     total += fruit['total cost']
...
Either way:
>>> total
330
