Average the values of a list of dictionaries - python

I have the following list of dictionaries. Each dictionary has a "Point" and a "Value", and within each series the "Point" runs from 1 to 10.
My_list = [{"Point": 1, "Value": 40}, {"Point": 2, "Value": 40}, {"Point": 3, "Value": 40}, \
{"Point": 4, "Value": 40}, {"Point": 5, "Value": 40}, {"Point": 6, "Value": 40}, \
{"Point": 7, "Value": 40}, {"Point": 8, "Value": 40}, {"Point": 9, "Value": 0},{"Point": 10, "Value": 250},\
{"Point": 1, "Value": 40}, {"Point": 2, "Value": 40}, {"Point": 3, "Value": 40}, \
{"Point": 4, "Value": 40}, {"Point": 5, "Value": 40}, {"Point": 6, "Value": 40}, \
{"Point": 7, "Value": 40}, {"Point": 8, "Value": 40}, {"Point": 9, "Value": 0},{"Point": 10, "Value": 250},\
{"Point": 1, "Value": 40}, {"Point": 2, "Value": 40}, {"Point": 3, "Value": 40}, \
{"Point": 4, "Value": 40}, {"Point": 5, "Value": 40}, {"Point": 6, "Value": 40}, \
{"Point": 7, "Value": 40}, {"Point": 8, "Value": 40}, {"Point": 9, "Value": 0},{"Point": 10, "Value": 250}]
I would like to average the 'Value' of every two consecutive 'Point' entries, without mixing in the 'Value' of the next series. This is what I have tried:
every2 = []
counter = 2
temp = []
for point in My_list:
    if counter > 0:
        temp.append(point["Value"])
    else:
        p = point
        p["Value"] = sum(temp)/len(temp)
        every2.append(point)
        # reset the counter after every 2 point
        counter = 2
        temp = []
        # temp.append(point["Value"])
    counter -= 1
print(every2)
The result I am getting is:
[{'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0}, {'Point': 7, 'Value': 40.0}, {'Point': 9, 'Value': 40.0},
{'Point': 1, 'Value': 250.0}, {'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0}, {'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 40.0}, {'Point': 1, 'Value': 250.0}, {'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0}, {'Point': 9, 'Value': 40.0}]
However, the first 'Point' is missing: the first series starts at 3 instead of 1, and as a consequence 'Point' 9 has a value of 40 instead of 125.
So what I want should look like this:
[{'Point': 1, 'Value': 40.0}, {'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0}, {'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0}, {'Point': 1, 'Value': 40.0}, {'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0}, {'Point': 9, 'Value': 125.0}, {'Point': 1, 'Value': 40.0}, {'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0}, {'Point': 7, 'Value': 40.0}, {'Point': 9, 'Value': 125.0}]

You can add a step argument to range() that will allow you to iterate over the list in steps of 2. Then, get both elements you want to use, create a new element using the values, and append that to your result list.
result_list = []
n_step = 2  # chunk size is 2
for i in range(0, len(My_list), n_step):
    # Get all elements in this chunk
    elems = My_list[i:i+n_step]
    # Find the average of the Value key in elems
    avg = sum(item['Value'] for item in elems) / len(elems)
    # Point key from the first element; Value key from average
    new_item = {"Point": elems[0]["Point"], "Value": avg}
    result_list.append(new_item)
Which gives:
[{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0},
{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0},
{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0}]

You can also use a list comprehension (data is the list from the question; dividing by 2 assumes every chunk holds exactly two items):
res = [{**data[n], **{'Value': sum(v['Value']/2 for v in data[n: n+2])}} for n in range(0, len(data), 2)]
print(res)
Output:
[{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0},
{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0},
{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0}]
For Python 3.9+, the dict merge operator | does the same thing:
[data[n] | {'Value': sum(v['Value']/2 for v in data[n: n+2])} for n in range(0, len(data), 2)]

Here's another option that's using zip to "parallel loop" over My_list with an offset:
result = [
    {"Point": p1["Point"], "Value": (p1["Value"] + p2["Value"]) / 2}
    for p1, p2 in zip(My_list[::2], My_list[1::2])
]
This requires every series to have an even length.
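If a series can have an odd length, the chunks have to be built per series. The following is only a sketch under one assumption that is not stated in the question: a new series starts whenever "Point" stops increasing. The function name average_in_chunks and the sample data are made up for illustration.

```python
def average_in_chunks(points, size=2):
    """Average 'Value' over chunks of `size`, never mixing two series."""
    # Assumption: a new series starts whenever "Point" stops increasing.
    series = []
    for item in points:
        if not series or item["Point"] <= series[-1][-1]["Point"]:
            series.append([])
        series[-1].append(item)

    result = []
    for s in series:
        for i in range(0, len(s), size):
            chunk = s[i:i + size]
            # Divide by the real chunk length so a short last chunk still works.
            avg = sum(d["Value"] for d in chunk) / len(chunk)
            result.append({"Point": chunk[0]["Point"], "Value": avg})
    return result


# Two illustrative series with an odd number of points (3 each).
sample = [
    {"Point": 1, "Value": 40}, {"Point": 2, "Value": 60}, {"Point": 3, "Value": 10},
    {"Point": 1, "Value": 20}, {"Point": 2, "Value": 40}, {"Point": 3, "Value": 30},
]
print(average_in_chunks(sample))
# [{'Point': 1, 'Value': 50.0}, {'Point': 3, 'Value': 10.0},
#  {'Point': 1, 'Value': 30.0}, {'Point': 3, 'Value': 30.0}]
```

The last point of each series is averaged on its own here instead of being paired with the first point of the next series.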

Related

How to sort, group, and aggregate values in a list of nested dictionaries?

Given the list of dictionaries below, I want to do the following:
1: Sort the data by the top-level key 'name'
2: Sort by the nested key "name" under the key "items"
3: Group the values under items by an aggregation interval, for example "1d"
4: Recompute min, max and avg from the result of step 3
At the moment, I solve this by iterating down to the values, grouping them with pandas, and then aggregating min, max and avg again from the result.
This feels convoluted, and the performance is poor.
Can someone help me out?
[
{
'_id': 2,
'name': 'b',
'device': 'b',
'items': [
{
'item_id': 'item_id_2', 'name': 'item_2', 'unit': 'b/s',
'values': [
{'time': datetime.datetime(2022, 9, 5, 15, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 16, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 17, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 18, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 19, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 20, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
]
}
]
},
{
'_id': 1,
'name': 'a',
'device': 'a',
'items': [
{
'item_id': 'item_id_1', 'name': 'item_1', 'unit': 'b/s',
'values': [
{'time': datetime.datetime(2022, 9, 5, 15, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 16, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 17, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 18, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 19, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
{'time': datetime.datetime(2022, 9, 5, 20, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
]
}
]
}
]
As for the result, I would expect something like this:
[
{
'_id': 1,
'name': 'a',
'device': 'a',
'items': [
{
'item_id': 'item_id_1', 'name': 'item_1', 'unit': 'b/s',
'values': [
{'time': datetime.datetime(2022, 9, 5, 0, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
]
}
]
},
{
'_id': 2,
'name': 'b',
'device': 'b',
'items': [
{
'item_id': 'item_id_2', 'name': 'item_2', 'unit': 'b/s',
'values': [
{'time': datetime.datetime(2022, 9, 5, 0, 0), 'min': 0.0, 'max': 1.0, 'avg': 0.5},
]
}
]
}
]
With the initial list of dicts that you provided and that I choose to call data, here is one way to do it:
import pandas as pd

df = pd.DataFrame(data)
# First, sort values by the nested item name
df = df.assign(temp=df["items"].apply(lambda x: x[0]["name"])).pipe(
    lambda df_: df_.sort_values(by="temp").drop(columns="temp").reset_index(drop=True)
)
# Get aggregated as new column 'temp'
# (pop removes "values" from the original dicts in data)
dfs = df["items"].apply(lambda x: pd.DataFrame(x[0].pop("values", None)))
df["temp"] = pd.Series(
    [
        {
            k: v[0]
            for k, v in frame.set_index("time")
            .resample("D")
            .mean()
            .reset_index()
            .to_dict(orient="list")
            .items()
        }
        for frame in dfs
    ]
)
df["items"] = df["items"].apply(lambda x: x[0])
# Merge intermediate dictionaries (the | operator needs Python 3.9+)
df["items"] = df.apply(lambda x: x["items"] | {"values": [x["temp"]]}, axis=1)
df = df.drop(columns="temp")
And so:
print(df.to_json(orient="records"))
# Output
[
{
"_id": 1,
"name": "a",
"device": "a",
"items": {
"item_id": "item_id_1",
"name": "item_1",
"unit": "b\\/s",
"values": [{"time": 1662336000000, "min": 0.0, "max": 1.0, "avg": 0.5}],
},
},
{
"_id": 2,
"name": "b",
"device": "b",
"items": {
"item_id": "item_id_2",
"name": "item_2",
"unit": "b\\/s",
"values": [{"time": 1662336000000, "min": 0.0, "max": 1.0, "avg": 0.5}],
},
},
]
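Note that .mean() averages every column, so the daily 'min' and 'max' above are means of the hourly minima and maxima. If the daily row should instead carry the smallest 'min' and the largest 'max', one possible plain-Python sketch is shown below. The function name aggregate_daily and the sample rows are illustrative, and averaging the hourly 'avg' values only matches the true daily average if every hour covers the same number of samples.

```python
import datetime
from collections import defaultdict
from statistics import mean


def aggregate_daily(values):
    """Collapse hourly {'time', 'min', 'max', 'avg'} rows into one row per day."""
    by_day = defaultdict(list)
    for row in values:
        # Truncate the timestamp to midnight to get the daily bucket.
        day = datetime.datetime.combine(row["time"].date(), datetime.time())
        by_day[day].append(row)
    return [
        {
            "time": day,
            "min": min(r["min"] for r in by_day[day]),
            "max": max(r["max"] for r in by_day[day]),
            "avg": mean(r["avg"] for r in by_day[day]),
        }
        for day in sorted(by_day)
    ]


# Illustrative hourly rows, not the data from the question.
hourly = [
    {"time": datetime.datetime(2022, 9, 5, 15, 0), "min": 0.0, "max": 1.0, "avg": 0.5},
    {"time": datetime.datetime(2022, 9, 5, 16, 0), "min": 0.2, "max": 3.0, "avg": 1.5},
]
daily = aggregate_daily(hourly)
print(daily)
```

The same function could be applied to each item's 'values' list after sorting the outer list with sorted(data, key=lambda d: d['name']).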

Python: Descending order and just 3 objects has a high value [duplicate]

I have an unsorted list of dictionaries like the one below. I want it in descending order by value, keeping only the objects that hold the 3 highest values:
[{'id': 1, 'value': 3},
{'id': 2, 'value': 6},
{'id': 3, 'value': 8},
{'id': 4, 'value': 8},
{'id': 5, 'value': 10},
{'id': 6, 'value': 9},
{'id': 7, 'value': 8},
{'id': 8, 'value': 4},
{'id': 9, 'value': 5}]
The result should be in descending order and contain only the objects with the 3 highest values, like this:
[{'id': 5, 'value': 10},
{'id': 6, 'value': 9},
{'id': 7, 'value': 8},
{'id': 3, 'value': 8},
{'id': 4, 'value': 8},]
Please help me, thanks
t = [{'id': 1, 'value': 3},
{'id': 2, 'value': 6},
{'id': 3, 'value': 8},
{'id': 4, 'value': 8},
{'id': 5, 'value': 10},
{'id': 6, 'value': 9},
{'id': 7, 'value': 8}]
newlist = sorted(t, key=lambda d: d['value'])
newlist.reverse()
print(newlist[:3])
# [{'id': 5, 'value': 10}, {'id': 6, 'value': 9}, {'id': 7, 'value': 8}]
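The slice above returns exactly three objects, while the expected result in the question lists five, because three objects tie on the value 8. A minimal sketch that keeps ties is shown below; it assumes "3 highest values" means the three largest distinct values, and the helper name top_n_with_ties is made up for this example.

```python
def top_n_with_ties(items, n=3, key="value"):
    """Return items holding one of the n largest distinct values, largest first."""
    top_values = sorted({d[key] for d in items}, reverse=True)[:n]
    if not top_values:
        return []
    cutoff = top_values[-1]
    kept = [d for d in items if d[key] >= cutoff]
    # sorted() is stable, so tied items keep their original relative order.
    return sorted(kept, key=lambda d: d[key], reverse=True)


t = [{'id': 1, 'value': 3}, {'id': 2, 'value': 6}, {'id': 3, 'value': 8},
     {'id': 4, 'value': 8}, {'id': 5, 'value': 10}, {'id': 6, 'value': 9},
     {'id': 7, 'value': 8}, {'id': 8, 'value': 4}, {'id': 9, 'value': 5}]
print(top_n_with_ties(t))
# [{'id': 5, 'value': 10}, {'id': 6, 'value': 9}, {'id': 3, 'value': 8},
#  {'id': 4, 'value': 8}, {'id': 7, 'value': 8}]
```

The tied objects come out in their input order (ids 3, 4, 7), which differs slightly from the order shown in the question.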

Separate list elements by their property in Python

I have list p1:
p1 = [
{'id': 1, 'area': 5},
{'id': 2, 'area': 6},
{'id': 3, 'area': 10},
{'id': 4, 'area': 6},
{'id': 5, 'area': 6},
{'id': 6, 'area': 6},
{'id': 7, 'area': 4},
{'id': 8, 'area': 4}
]
And I need to separate this list by area value, like this (p2):
p2 = {
4: [
{'id': 7, 'area': 4},
{'id': 8, 'area': 4}
],
5: [
{'id': 1, 'area': 5}
],
6: [
{'id': 2, 'area': 6},
{'id': 4, 'area': 6},
{'id': 5, 'area': 6},
{'id': 6, 'area': 6}
],
10: [
{'id': 3, 'area': 10}
]
}
My solution is:
areas = {x['area'] for x in p1}
p2 = {}
for area in areas:
    p2[area] = [x for x in p1 if x['area'] == area]
It seems to work, but is there any better and more "pythonic" solution?
Using groupby you get
>>> import itertools
>>> f = lambda t: t['area']
>>> {i: list(b) for i, b in itertools.groupby(sorted(p1, key=f), key=f)}
Gives
{4: [{'area': 4, 'id': 7},
{'area': 4, 'id': 8}],
5: [{'area': 5, 'id': 1}],
6: [{'area': 6, 'id': 2},
{'area': 6, 'id': 4},
{'area': 6, 'id': 5},
{'area': 6, 'id': 6}],
10: [{'area': 10, 'id': 3}]}
edit: If you don't like using lambdas you can also do, as suggested by bro-grammer
>>> import operator
>>> f = operator.itemgetter('area')
You can simply use defaultdict:
from collections import defaultdict
result = defaultdict(list)
for i in p1:
    result[i['area']].append(i)
Yes, use one of the grouping idioms. Using a vanilla dict:
In [15]: p1 = [
...: {'id': 1, 'area': 5},
...: {'id': 2, 'area': 6},
...: {'id': 3, 'area': 10},
...: {'id': 4, 'area': 6},
...: {'id': 5, 'area': 6},
...: {'id': 6, 'area': 6},
...: {'id': 7, 'area': 4},
...: {'id': 8, 'area': 4}
...: ]
In [16]: p2 = {}
In [17]: for d in p1:
...:     p2.setdefault(d['area'], []).append(d)
...:
In [18]: p2
Out[18]:
{4: [{'area': 4, 'id': 7}, {'area': 4, 'id': 8}],
5: [{'area': 5, 'id': 1}],
6: [{'area': 6, 'id': 2},
{'area': 6, 'id': 4},
{'area': 6, 'id': 5},
{'area': 6, 'id': 6}],
10: [{'area': 10, 'id': 3}]}
Or more neatly, using a defaultdict:
In [23]: from collections import defaultdict
In [24]: p2 = defaultdict(list)
In [25]: for d in p1:
...:     p2[d['area']].append(d)
...:
In [26]: p2
Out[26]:
defaultdict(list,
{4: [{'area': 4, 'id': 7}, {'area': 4, 'id': 8}],
5: [{'area': 5, 'id': 1}],
6: [{'area': 6, 'id': 2},
{'area': 6, 'id': 4},
{'area': 6, 'id': 5},
{'area': 6, 'id': 6}],
10: [{'area': 10, 'id': 3}]})
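The grouping loops keep keys in first-seen order (5, 6, 10, 4 for this input), whereas the p2 in the question lists areas in ascending order. If that ordering matters, a small sketch like the following could convert the result to a plain dict with sorted keys. It relies on dicts preserving insertion order, which is guaranteed from Python 3.7 on; the name groups is only illustrative.

```python
from collections import defaultdict

p1 = [
    {'id': 1, 'area': 5}, {'id': 2, 'area': 6}, {'id': 3, 'area': 10},
    {'id': 4, 'area': 6}, {'id': 5, 'area': 6}, {'id': 6, 'area': 6},
    {'id': 7, 'area': 4}, {'id': 8, 'area': 4},
]

groups = defaultdict(list)
for d in p1:
    groups[d['area']].append(d)

# Rebuild as a plain dict whose keys are inserted in ascending order.
p2 = {area: groups[area] for area in sorted(groups)}
print(list(p2))  # [4, 5, 6, 10]
```

Unlike the groupby version, this sorts only the distinct keys, not the whole list.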

Aggregate values on lists of dicts based on key in python

I'm trying to get the aggregation of 2 different lists, where each element is a dictionary with 2 entries, month and value.
So the first list looks like this:
[{
'patient_notes': 5,
'month': datetime.date(2017, 1, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 2, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 5, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 7, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 8, 1)
}, {
'patient_notes': 5,
'month': datetime.date(2017, 12, 1)
}]
Second list is:
[{
'employee_notes': 4,
'month': datetime.date(2017, 2, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 3, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 4, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 8, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 9, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 10, 1)
}, {
'employee_notes': 4,
'month': datetime.date(2017, 12, 1)
}]
So I need to build a new list that contains the sum of both list per month, something like this:
[{
'total_messages': 14,
'month': '2017-01-01'
}, {
'total_messages': 14,
'month': '2017-02-01'
}, {
'total_messages': 14,
'month': '2017-03-01'
}, {
'total_messages': 14,
'month': '2017-04-01'
}, {
'total_messages': 14,
'month': '2017-05-01'
}, {
'total_messages': 14,
'month': '2017-06-01'
}, {
'total_messages': 14,
'month': '2017-07-01'
}, {
'total_messages': 14,
'month': '2017-08-01'
}, {
'total_messages': 14,
'month': '2017-09-01'
}, {
'total_messages': 14,
'month': '2017-10-01'
}, {
'total_messages': 14,
'month': '2017-11-01'
}, {
'total_messages': 14,
'month': '2017-12-01'
}]
I first tried zip, but that only works if both lists are the same size. Then I tried itertools.izip_longest, but that fails when the lists are the same size yet cover different months: I cannot simply add them pairwise, I need to aggregate matching months only.
Counter would also be great for this, but I cannot change the key names of the original lists. Any ideas?
You can use defaultdict to create a counter. Go through each item in the first list and add the patient_notes value to the dictionary. Then go through the second list and add the employee_notes values.
Now you need to encode your new defaultdict back into a list in your desired format. You can use a list comprehension for that. I've sorted the list by month.
from collections import defaultdict
dd = defaultdict(int)
for d in my_list_1:
    dd[d['month']] += d['patient_notes']
for d in my_list_2:
    dd[d['month']] += d['employee_notes']
result = [{'total_messages': dd[k], 'month': k} for k in sorted(dd.keys())]
>>> result
[{'month': datetime.date(2017, 1, 1), 'total_messages': 5},
{'month': datetime.date(2017, 2, 1), 'total_messages': 9},
{'month': datetime.date(2017, 3, 1), 'total_messages': 4},
{'month': datetime.date(2017, 4, 1), 'total_messages': 4},
{'month': datetime.date(2017, 5, 1), 'total_messages': 5},
{'month': datetime.date(2017, 7, 1), 'total_messages': 5},
{'month': datetime.date(2017, 8, 1), 'total_messages': 9},
{'month': datetime.date(2017, 9, 1), 'total_messages': 4},
{'month': datetime.date(2017, 10, 1), 'total_messages': 4},
{'month': datetime.date(2017, 12, 1), 'total_messages': 9}]
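from collections import defaultdict
d_dict = defaultdict(int)
for d in l1 + l2:
    # Look up 'month' by name instead of unpacking d.values(): dicts keep
    # insertion order, so the count would otherwise be unpacked first.
    d_dict[d['month']] += sum(v for k, v in d.items() if k != 'month')
[ {'month':i.strftime("%Y-%m-%d"),'total_messages':j} for i, j in sorted(d_dict.items()) ]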
Output:
[{'month': '2017-01-01', 'total_messages': 5},
{'month': '2017-02-01', 'total_messages': 9},
{'month': '2017-03-01', 'total_messages': 4},
{'month': '2017-04-01', 'total_messages': 4},
{'month': '2017-05-01', 'total_messages': 5},
{'month': '2017-07-01', 'total_messages': 5},
{'month': '2017-08-01', 'total_messages': 9},
{'month': '2017-09-01', 'total_messages': 4},
{'month': '2017-10-01', 'total_messages': 4},
{'month': '2017-12-01', 'total_messages': 9}]
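Both results skip months that appear in neither list (June and November), while the desired output in the question lists all twelve months. A possible sketch that fills the gaps with 0 follows; the function name monthly_totals, the year parameter and the shortened sample lists are assumptions made for illustration.

```python
import datetime
from collections import defaultdict


def monthly_totals(year, *sources):
    """Sum every non-'month' value per month and report all 12 months of `year`."""
    totals = defaultdict(int)
    for source in sources:
        for d in source:
            totals[d["month"]] += sum(v for k, v in d.items() if k != "month")
    return [
        # .get() avoids creating entries for months that had no data.
        {"month": m.strftime("%Y-%m-%d"), "total_messages": totals.get(m, 0)}
        for m in (datetime.date(year, i, 1) for i in range(1, 13))
    ]


# Shortened, illustrative versions of the two lists from the question.
patients = [{"patient_notes": 5, "month": datetime.date(2017, 1, 1)},
            {"patient_notes": 5, "month": datetime.date(2017, 2, 1)}]
employees = [{"employee_notes": 4, "month": datetime.date(2017, 2, 1)}]

result = monthly_totals(2017, patients, employees)
print(result[:3])
# [{'month': '2017-01-01', 'total_messages': 5},
#  {'month': '2017-02-01', 'total_messages': 9},
#  {'month': '2017-03-01', 'total_messages': 0}]
```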

Sort list of lists that each contain a dictionary

I have this list:
list_users= [[{'points': 9, 'values': 1, 'division': 1, 'user_id': 3}], [{'points': 3, 'values': 0, 'division': 1, 'user_id': 1}], [{'points': 2, 'values': 0, 'division': 1, 'user_id': 4}], [{'points': 9, 'values': 0, 'division': 1, 'user_id': 11}], [{'points': 3, 'values': 0, 'division': 1, 'user_id': 10}], [{'points': 100, 'values': 4, 'division': 1, 'user_id': 2}], [{'points': 77, 'values': 2, 'division': 1, 'user_id': 5}], [{'points': 88, 'values': 3, 'division': 1, 'user_id': 6}], [{'points': 66, 'values': 1, 'division': 1, 'user_id': 7}], [{'points': 2, 'values': 0, 'division': 1, 'user_id': 8}]]
I need to sort the list by points and values.
How can I sort it if dict is inside a list inside the main list?
I generated this list from a query and then just appended each result to list_users.
Access the dictionary containing points and values by indexing on the inner list:
list_users_sorted = sorted(list_users, key=lambda x: (x[0]['points'], x[0]['values']))
# x[0] picks the dict out of each inner list
Sort using a key function for sorted that builds a tuple of points and values for each dict in each list.
def kf(x):
    return (x[0]["points"], x[0]["values"])
s = sorted(list_users, key=kf)
print(s)
Output:
[[{'division': 1, 'points': 2, 'user_id': 4, 'values': 0}],
[{'division': 1, 'points': 2, 'user_id': 8, 'values': 0}],
[{'division': 1, 'points': 3, 'user_id': 1, 'values': 0}],
[{'division': 1, 'points': 3, 'user_id': 10, 'values': 0}],
[{'division': 1, 'points': 9, 'user_id': 11, 'values': 0}],
[{'division': 1, 'points': 9, 'user_id': 3, 'values': 1}],
[{'division': 1, 'points': 66, 'user_id': 7, 'values': 1}],
[{'division': 1, 'points': 77, 'user_id': 5, 'values': 2}],
[{'division': 1, 'points': 88, 'user_id': 6, 'values': 3}],
[{'division': 1, 'points': 100, 'user_id': 2, 'values': 4}]]
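If the highest score should come first instead, the same key can be combined with reverse=True, and the one-element inner lists can be flattened afterwards. This is only a sketch on a shortened copy of the list; the names ranked and flat are illustrative, and whether descending order is wanted is an assumption.

```python
list_users = [
    [{'points': 9, 'values': 1, 'division': 1, 'user_id': 3}],
    [{'points': 3, 'values': 0, 'division': 1, 'user_id': 1}],
    [{'points': 9, 'values': 0, 'division': 1, 'user_id': 11}],
    [{'points': 100, 'values': 4, 'division': 1, 'user_id': 2}],
]

# Sort by (points, values), largest first.
ranked = sorted(list_users, key=lambda x: (x[0]['points'], x[0]['values']), reverse=True)

# Drop the wrapper lists, assuming each holds exactly one dict.
flat = [inner[0] for inner in ranked]
print([u['user_id'] for u in flat])  # [2, 3, 11, 1]
```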
