I have these dictionaries, and my code only produces totals for June, May, and September. How would I also report the months that do not appear in the data? Obviously, their total should be zero.
{'account': 'Amazon', 'amount': 300, 'day': 3, 'month': 'June'}
{'account': 'Facebook', 'amount': 550, 'day': 5, 'month': 'May'}
{'account': 'Google', 'amount': -200, 'day': 21, 'month': 'June'}
{'account': 'Amazon', 'amount': -300, 'day': 12, 'month': 'June'}
{'account': 'Facebook', 'amount': 130, 'day': 7, 'month': 'September'}
{'account': 'Google', 'amount': 250, 'day': 27, 'month': 'September'}
{'account': 'Amazon', 'amount': 200, 'day': 5, 'month': 'May'}
The method I used for months mentioned in the dictionary:
year_balance = sum(d["amount"] for d in my_dict)
print(f"The total year balance is {year_balance} $.")
import calendar
months = calendar.month_name[1:]
results = dict(zip(months, [0]*len(months)))
for d in data:
    results[d["month"]] += d["amount"]
# then you have results dict with monthly amounts
# sum everything to get yearly total
total = sum(results.values())
This might help:
from collections import defaultdict
mydict = defaultdict(lambda: 0)
print(mydict["January"])
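Applied to the data in the question, the defaultdict idea could look like the sketch below. The list name `records` is an assumption made for illustration, and `calendar.month_name` is assumed to return English month names (the default locale). Months that never appear in the data simply read back as 0.

```python
import calendar
from collections import defaultdict

# Records copied from the question; the name `records` is hypothetical.
records = [
    {'account': 'Amazon', 'amount': 300, 'day': 3, 'month': 'June'},
    {'account': 'Facebook', 'amount': 550, 'day': 5, 'month': 'May'},
    {'account': 'Google', 'amount': -200, 'day': 21, 'month': 'June'},
    {'account': 'Amazon', 'amount': -300, 'day': 12, 'month': 'June'},
    {'account': 'Facebook', 'amount': 130, 'day': 7, 'month': 'September'},
    {'account': 'Google', 'amount': 250, 'day': 27, 'month': 'September'},
    {'account': 'Amazon', 'amount': 200, 'day': 5, 'month': 'May'},
]

# defaultdict(int) returns 0 for any month that has not been seen yet.
totals = defaultdict(int)
for record in records:
    totals[record['month']] += record['amount']

# Report all twelve months, including those missing from the data.
for month in calendar.month_name[1:]:
    print(f"{month}: {totals[month]}")

year_balance = sum(totals.values())
print(f"The total year balance is {year_balance} $.")
```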
Also, given the comments you have written, is this what you are looking for?
your_list_of_dicts = [
{"January": 3, "March": 5},
{"January": 3, "April": 5}
]
import calendar
months = calendar.month_name[1:]
month_totals = dict()
for month in months:
    month_totals[month] = 0
    for d in your_list_of_dicts:
        month_totals[month] += d[month] if month in d else 0
print(month_totals)
{'January': 6, 'February': 0, 'March': 5, 'April': 5, 'May': 0, 'June': 0, 'July': 0, 'August': 0, 'September': 0, 'October': 0, 'November': 0, 'December': 0}
You can read the following blog regarding the usage of dictionaries and how to perform calculations.
5 best ways to sum dictionary values in python
This is one of the examples given in the blog.
wages = {'01': 910.56, '02': 1298.68, '03': 1433.99, '04': 1050.14, '05': 877.67}
total = sum(wages.values())
print('Total Wages: ${0:,.2f}'.format(total))
Here is the result with 100,000 records.
I have a dictionary:
{
"account": "x*", 'amount': 300, 'day': 3, 'month': 'June',
"account": "y*", 'amount': 550, 'day': 9, 'month': 'May',
"account": 'z*', 'amount': -200, 'day': 21, 'month': 'June',
"account": "g", "amount": 80, "day": 10, "month": "May"
}
How do I find the total amount for each month June and May separately?
dictionary= sum(d["amount"] for d in my_dict)
You can filter which elements are summed by adding an if clause at the end of the generator expression:
sum(d['amount'] for d in my_dict if d['month'] == month)
Then, we can wrap this line of code inside a small function to compute the results for May and June:
my_dict = [{'account': 'x*', 'amount': 300, 'day': 3, 'month': 'June'},
           {'account': 'y*', 'amount': 550, 'day': 9, 'month': 'May' },
           {'account': 'z*', 'amount': -200, 'day': 21, 'month': 'June'},
           {'account': 'g' , 'amount': 80, 'day': 10, 'month': 'May' }]
get_sum = lambda my_dict, month: sum(d['amount'] for d in my_dict if d['month'] == month)
sum_June = get_sum(my_dict, 'June')
sum_May = get_sum(my_dict, 'May' )
print('sum_June:', sum_June)
# sum_June: 100
print('sum_May :', sum_May)
# sum_May : 630
PS. Initially, the dictionary my_dict was overwriting data, because every entry was stored in the same object under the same keys. In the code above, my_dict is split into a list of dicts, one per row, to avoid this issue. Consider storing data this way in your project; it is a very common pattern.
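To see the overwriting concretely, here is a small sketch. The names `overwritten` and `rows` are made up for illustration; the point is only that a dict literal with repeated keys keeps the last value for each key, while a list of dicts keeps every row.

```python
# Repeated keys in one dict literal: later values replace earlier ones.
overwritten = {
    'account': 'x*', 'amount': 300, 'month': 'June',
    'account': 'y*', 'amount': 550, 'month': 'May',
}
print(overwritten)  # only the 'y*' entry survives

# One dict per row inside a list: nothing is lost.
rows = [
    {'account': 'x*', 'amount': 300, 'month': 'June'},
    {'account': 'y*', 'amount': 550, 'month': 'May'},
]
print(len(rows))
```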
I have the following df:
df = pd.DataFrame({"year":[2020,2020,2020,2021,2021,2021,2022,2022, 2022],"region":['europe','USA','africa','europe','USA','africa','europe','USA','africa'],'volume':[1,6,5,3,8,7,6,3,5]})
I wish to convert it to a list of dictionaries such that the year is mentioned only once in each item. Example:
[{'year':2020,'europe':1,'USA':6,'africa':5,}...]
when I do:
df.set_index('year').to_dict('records')
I lost the years and the list
Another approach that uses pivot before to_dict(orient='records')
df.pivot(
    index='year',
    columns='region',
    values='volume'
).reset_index().to_dict(orient='records')
#Output:
#[{'year': 2020, 'USA': 6, 'africa': 5, 'europe': 1},
# {'year': 2021, 'USA': 8, 'africa': 7, 'europe': 3},
# {'year': 2022, 'USA': 3, 'africa': 5, 'europe': 6}]
Try:
d = [
    {"year": y, **dict(zip(x["region"], x["volume"]))}
    for y, x in df.groupby("year")
]
print(d)
Prints:
[
{"year": 2020, "europe": 1, "USA": 6, "africa": 5},
{"year": 2021, "europe": 3, "USA": 8, "africa": 7},
{"year": 2022, "europe": 6, "USA": 3, "africa": 5},
]
You can use groupby on year and then zip region and volume:
import pandas as pd
df = pd.DataFrame({"year":[2020,2020,2020,2021,2021,2021,2022,2022, 2022],"region":['europe','USA','africa','europe','USA','africa','europe','USA','africa'],'volume':[1,6,5,3,8,7,6,3,5]})
year_dfs = df.groupby("year")
records = []
for year, year_df in year_dfs:
    year_dict = {key: value for key, value in zip(year_df["region"], year_df["volume"])}
    year_dict["year"] = year
    records.append(year_dict)
""" Answer
[{'europe': 1, 'USA': 6, 'africa': 5, 'year': 2020},
{'europe': 3, 'USA': 8, 'africa': 7, 'year': 2021},
{'europe': 6, 'USA': 3, 'africa': 5, 'year': 2022}]
"""
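To break down each step: you can use pivot to reshape your df so that each year becomes a row, the regions become the columns, and volume supplies the values. (Keyword arguments are used here because recent pandas versions no longer accept these positionally.)
df.pivot(index='year', columns='region', values='volume')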
region  USA  africa  europe
year
2020      6       5       1
2021      8       7       3
2022      3       5       6
To get this into dictionary format, you can chain .to_dict('index') in one line:
x = df.pivot(index='year', columns='region', values='volume').to_dict('index')
{2020: {'USA': 6, 'africa': 5, 'europe': 1}, 2021: {'USA': 8, 'africa': 7, 'europe': 3}, 2022: {'USA': 3, 'africa': 5, 'europe': 6}}
Finally, you could use a list comprehension to get it into your desired format:
output = [dict(x[y], **{'year':y}) for y in x]
[{'USA': 6, 'africa': 5, 'europe': 1, 'year': 2020}, {'USA': 8, 'africa': 7, 'europe': 3, 'year': 2021}, {'USA': 3, 'africa': 5, 'europe': 6, 'year': 2022}]
I am getting my time in day and month format like this:
final =[{'day': 29, 'month': 5},{'day': 30, 'month': 5},{'day': 1, 'month': 6},{'day': 2, 'month': 6},{'day': 3, 'month': 6},{'day': 4, 'month': 6},{'day': 5, 'month': 6},{'day': 6, 'month': 6}, {'day': 7, 'month': 6}, {'day': 8, 'month': 6}, {'day': 9, 'month': 6}]
I want to count the consecutive days in this array, going back from today, to track how many days in a row the user was last online. Whenever the previous day exists, I add 1 to the total count. For example, in {'day': 5, 'month': 6},{'day': 8, 'month': 6}, {'day': 9, 'month': 6}
the days between 5 and 8 are missing, so my count will be 2.
The issue is what happens when the streak runs back into the previous month, for example when the previous month is 5 and its last recorded day is 30: how do I add those days to my count?
For now, I am doing this:
import datetime
import time
from calendar import monthrange

# getting today's day, month and year
today_time = int(time.time())
today_time_day = datetime.datetime.fromtimestamp(today_time)
# to check where the previous month's days end
month = monthrange(today_time_day.year, today_time_day.month)
print(month)
i = 0
streak = 0
for x in reversed(final):
    if today_time_day.day - i == x['day']:
        streak += 1
    else:
        streak = 1
        break
    i += 1
print(streak)
I am trying to calculate the streak, but the answer is wrong, and I am not sure how to include the previous month's days.
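So the answer is that we need to keep track of the previous month's length and reset the loop counter when the streak crosses the month boundary:
import datetime
import time
from calendar import monthrange

today_time = int(time.time())
today_time_day = datetime.datetime.fromtimestamp(today_time)
final =[{'day': 29, 'month': 5},{'day': 28, 'month': 5},{'day': 1, 'month': 6},{'day': 2, 'month': 6},{'day': 3, 'month': 6},{'day': 4, 'month': 6},{'day': 5, 'month': 6},{'day': 6, 'month': 6}, {'day': 7, 'month': 6}, {'day': 8, 'month': 6}, {'day': 9, 'month': 6}]
# Number of days in the previous month (December of last year when today is in January).
prev_month = today_time_day.month - 1 or 12
prev_year = today_time_day.year if today_time_day.month > 1 else today_time_day.year - 1
days_in_prev_month = monthrange(prev_year, prev_month)[1]
i = 0
current_day = today_time_day.day
streak = 0
for x in reversed(final):
    # Day 1 of this month has been counted: continue from the last day of the previous month.
    if current_day - i == 0 and prev_month == x["month"]:
        current_day = days_in_prev_month
        i = 0
    if current_day - i == x['day']:
        streak += 1
    else:
        break
    i += 1
print(streak)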
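An alternative that avoids tracking month lengths by hand is to let `datetime.date` do the calendar arithmetic. The following is only a sketch: the function name `count_streak` is made up, and because the records carry no year, it assumes all of them fall within the year leading up to `today`.

```python
from datetime import date, timedelta

def count_streak(records, today):
    """Count consecutive recorded days ending at `today` (inclusive)."""
    # Assumption: records have no year, so (month, day) pairs are matched as-is.
    seen = {(r['month'], r['day']) for r in records}
    streak = 0
    current = today
    # The length check stops the loop even if every day of the year is recorded.
    while streak < len(seen) and (current.month, current.day) in seen:
        streak += 1
        current -= timedelta(days=1)  # steps across month boundaries correctly
    return streak

# The year 2023 is arbitrary and used only for illustration.
gap_example = [{'day': 5, 'month': 6}, {'day': 8, 'month': 6}, {'day': 9, 'month': 6}]
print(count_streak(gap_example, date(2023, 6, 9)))

month_end_example = [{'day': 30, 'month': 5}, {'day': 31, 'month': 5},
                     {'day': 1, 'month': 6}, {'day': 2, 'month': 6}]
print(count_streak(month_end_example, date(2023, 6, 2)))
```

Because the subtraction is done on real dates, the step from 1 June to 31 May needs no special case.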
I have to parse the following file in python:
20100322;232400;1.355800;1.355900;1.355800;1.355900;0
20100322;232500;1.355800;1.355900;1.355800;1.355900;0
20100322;232600;1.355800;1.355800;1.355800;1.355800;0
I need to end up with the following variables (the first line is parsed as an example):
year = 2010
month = 03
day = 22
hour = 23
minute = 24
p1 = Decimal('1.355800')
p2 = Decimal('1.355900')
p3 = Decimal('1.355800')
p4 = Decimal('1.355900')
I have tried:
line = '20100322;232400;1.355800;1.355900;1.355800;1.355900;0'
year = line[:4]
month = line[4:6]
day = line[6:8]
hour = line[9:11]
minute = line[11:13]
p1 = Decimal(line[16:24])
p2 = Decimal(line[25:33])
p3 = Decimal(line[34:42])
p4 = Decimal(line[43:51])
print(year)
print(month)
print(day)
print(hour)
print(minute)
print(p1)
print(p2)
print(p3)
print(p4)
Which works fine, but I am wondering if there is an easier way to parse this (maybe using struct) to avoid having to count each position manually.
from decimal import Decimal
from datetime import datetime
line = "20100322;232400;1.355800;1.355900;1.355800;1.355900;0"
tokens = line.split(";")
dt = datetime.strptime(tokens[0] + tokens[1], "%Y%m%d%H%M%S")
decimals = [Decimal(string) for string in tokens[2:6]]
# datetime objects also have some useful attributes: dt.year, dt.month, etc.
print(dt, *decimals, sep="\n")
Output:
2010-03-22 23:24:00
1.355800
1.355900
1.355800
1.355900
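To parse the whole file rather than a single line, the same split-and-convert idea can be combined with the standard `csv` module using `;` as the delimiter. This is a sketch only: `parse_rows` is a hypothetical helper name, and an in-memory `io.StringIO` stands in for the real file, whose path is not given in the question.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Sample lines from the question; in practice this would be an open file object.
raw = """20100322;232400;1.355800;1.355900;1.355800;1.355900;0
20100322;232500;1.355800;1.355900;1.355800;1.355900;0
20100322;232600;1.355800;1.355800;1.355800;1.355800;0
"""

def parse_rows(text_file):
    """Yield (timestamp, [p1, p2, p3, p4]) for each non-empty line."""
    for row in csv.reader(text_file, delimiter=';'):
        if not row:  # skip blank lines
            continue
        stamp = datetime.strptime(row[0] + row[1], "%Y%m%d%H%M%S")
        prices = [Decimal(value) for value in row[2:6]]
        yield stamp, prices

rows = list(parse_rows(io.StringIO(raw)))
for stamp, prices in rows:
    print(stamp.year, stamp.month, stamp.day, stamp.hour, stamp.minute, *prices)
```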
You could use regex:
import re
to_parse = """
20100322;232400;1.355800;1.355900;1.355800;1.355900;0
20100322;232500;1.355800;1.355900;1.355800;1.355900;0
20100322;232600;1.355800;1.355800;1.355800;1.355800;0
"""
stx = re.compile(
    r'(?P<date>(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2}));'
    r'(?P<time>(?P<hour>\d{2})(?P<minute>\d{2})(?P<second>\d{2}));'
    r'(?P<p1>[\.\-\d]*);(?P<p2>[\.\-\d]*);(?P<p3>[\.\-\d]*);(?P<p4>[\.\-\d]*)'
)
f = [{k:float(v) if 'p' in k else int(v) for k,v in a.groupdict().items()} for a in stx.finditer(to_parse)]
print(f)
Output:
[{'date': 20100322,
'day': 22,
'hour': 23,
'minute': 24,
'month': 3,
'p1': 1.3558,
'p2': 1.3559,
'p3': 1.3558,
'p4': 1.3559,
'second': 0,
'time': 232400,
'year': 2010},
{'date': 20100322,
'day': 22,
'hour': 23,
'minute': 25,
'month': 3,
'p1': 1.3558,
'p2': 1.3559,
'p3': 1.3558,
'p4': 1.3559,
'second': 0,
'time': 232500,
'year': 2010},
{'date': 20100322,
'day': 22,
'hour': 23,
'minute': 26,
'month': 3,
'p1': 1.3558,
'p2': 1.3558,
'p3': 1.3558,
'p4': 1.3558,
'second': 0,
'time': 232600,
'year': 2010}]
Here I stored everything in a list, but you could go through the results of finditer one match at a time if you don't want to hold everything in memory.
You can also replace float and/or int with Decimal if needed.
I have a python list like
[{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
and want to generate it like
[
{'month': 1, 'total': 0},
{'month': 2, 'total': 0},
...
{'month': 8, 'total': 31600},
{'month': 9, 'total': 2000},
...
{'month': 12, 'total': 0}
]
For that, I'm iterating over range(1, 13):
new_list = []
for i in range(1, 13):
    # if i exists in month, append to new_list
    # else add total: 0 and append to new_list
How can I check if i exists in month and get the dictionary?
You can convert your list of dicts into a direct month-to-total mapping with
monthly_totals = {item['month']: item['total'] for item in data_list}
and use a simple list comprehension with dict.get to handle missing values:
new_list = [{'month': i, 'total': monthly_totals.get(i, 0)} for i in range(1, 13)]
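Put together as a runnable sketch, with the sample list from the question (the function name `fill_months` is made up for illustration; if a month appears more than once in the input, the last entry wins):

```python
def fill_months(data_list):
    """Return twelve month entries, using 0 for months missing from data_list."""
    monthly_totals = {item['month']: item['total'] for item in data_list}
    return [{'month': m, 'total': monthly_totals.get(m, 0)} for m in range(1, 13)]

data_list = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
new_list = fill_months(data_list)
for entry in new_list:
    print(entry)
```

Building the lookup dict first means each month is found in constant time instead of rescanning the original list on every iteration.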
Create a new list containing the default values and then update the needed values from the original list:
>>> from pprint import pprint
>>> lst = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
>>> new_lst = [dict(month=i, total=0) for i in range(1,13)]
>>> for d in lst:
...     new_lst[d['month']-1] = d
...
>>> pprint(new_lst)
[{'month': 1, 'total': 0},
 {'month': 2, 'total': 0},
 {'month': 3, 'total': 0},
 {'month': 4, 'total': 0},
 {'month': 5, 'total': 0},
 {'month': 6, 'total': 0},
 {'month': 7, 'total': 0},
 {'month': 8, 'total': 31600.0},
 {'month': 9, 'total': 2000.0},
 {'month': 10, 'total': 0},
 {'month': 11, 'total': 0},
 {'month': 12, 'total': 0}]
exist_lst = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
new_lst = []
for i in range(1,13):
    found = False
    for dict_item in exist_lst:
        if dict_item['month'] == i:
            new_lst.append(dict_item)
            found = True
    if not found:
        new_lst.append({'month': i, 'total': 0}) # default_dict_item
print(new_lst)