I have the following kind of list:
myList = [[500, 5], [500, 10], [500, 3], [504, 9], [505, 10], [505, 20]]
I don't want values with the same first element. If two or more elements share the same first value, I want to sum their second values and remove the duplicates, so in my example the new output would be:
myList = [[500, 18], [504, 9], [505, 30]]
How can I do this? I was thinking of using lambda functions, but I don't know how to write one. The other solutions I can think of require a lot of for loops, so I was wondering whether there is an easier way. Any kind of help is appreciated!
Use a defaultdict:
import collections
# by default, non-existing keys will be initialized to zero
myDict = collections.defaultdict(int)
for key, value in myList:
    myDict[key] += value
# transform back to list of lists
myResult = sorted(list(kv) for kv in myDict.items())
Using the pandas library:
import pandas as pd
[[k, v] for k, v in pd.DataFrame(myList).groupby(0)[1].sum().items()]
Breaking it down:
pd.DataFrame(myList) creates a DataFrame where each row is one of the short lists in myList:
     0   1
0  500   5
1  500  10
2  500   3
3  504   9
4  505  10
5  505  20
(...).groupby(0)[1].sum() groups by the first column, takes the values from the second one (to create a series instead of a dataframe) and sums them
[[k, v] for k, v in (...).items()] is a simple list comprehension (treating the Series like a dictionary) that turns the result back into a list of lists, as you wanted.
Output:
[[500, 18], [504, 9], [505, 30]]
The list comprehension can be made even shorter by converting each pair from .items() to a list with map:
list(map(list, pd.DataFrame(myList).groupby(0)[1].sum().items()))
An easier-to-read implementation (less Pythonic, though :-) ):
myList = [[500, 5], [500, 10], [500, 3], [504, 9], [505, 10], [505, 20]]
sums = dict()
for a, b in myList:
    if a in sums:
        sums[a] += b
    else:
        sums[a] = b
res = []
for key, val in sums.items():
    res.append([key, val])
print(sorted(res))
You can use itertools.groupby to group the sublists by their first item, sum the last entries in each group, and build a new list that pairs each group key with its sum:
from itertools import groupby
from operator import itemgetter
# sort the data by the grouping key
# groupby only groups consecutive items, so sorting is required in general;
# the sample data already looks sorted, so here the sort changes nothing
myList = sorted(myList, key=itemgetter(0))
Our grouper will be the first item in each sublist (500, 504, 505)
#iterate through the groups
#sum the ends of each group
#pair the sum with the grouper
#return a new list
result = [[key, sum(last for first, last in grp)]
          for key, grp
          in groupby(myList, itemgetter(0))]
print(result)
[[500, 18], [504, 9], [505, 30]]
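To illustrate why the sort step matters, here is a minimal sketch that runs the same groupby expression on unsorted input. The helper name sum_groups and the sample list unsorted_rows are made up for this illustration and are not part of the answer above.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted input: the two 500 rows are not adjacent.
unsorted_rows = [[500, 5], [504, 9], [500, 10]]


def sum_groups(rows):
    """Sum the second items of consecutive rows that share a first item."""
    return [[key, sum(last for _, last in grp)]
            for key, grp in groupby(rows, key=itemgetter(0))]


# groupby starts a new group every time the key changes,
# so 500 shows up twice when the input is not sorted.
without_sort = sum_groups(unsorted_rows)
with_sort = sum_groups(sorted(unsorted_rows, key=itemgetter(0)))

print(without_sort)  # [[500, 5], [504, 9], [500, 10]]
print(with_sort)     # [[500, 15], [504, 9]]
```

If the input is not guaranteed to be sorted by key, a dictionary-based approach avoids the sort entirely.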
myList = [[500, 5], [500, 10], [500, 3], [504, 9], [505, 10], [505, 20]]
temp = {}
for first, second in myList:
    if first in temp:
        temp[first] += second
    else:
        temp[first] = second
result = [[k, v] for k, v in temp.items()]
print(result)
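One detail worth noting about the plain-dict version: since Python 3.7, dictionaries keep insertion order, so the result comes out in the order each key was first seen, not in sorted order. The sketch below shows the difference; the function name sum_by_first and the list shuffled are invented for this example.

```python
def sum_by_first(pairs):
    """Sum second items per first item, keeping first-seen key order."""
    totals = {}
    for first, second in pairs:
        # dict.get supplies 0 for keys that have not been seen yet
        totals[first] = totals.get(first, 0) + second
    return [[k, v] for k, v in totals.items()]


# Hypothetical input where the keys do not arrive in sorted order.
shuffled = [[505, 10], [500, 5], [505, 20], [500, 10]]

in_seen_order = sum_by_first(shuffled)
in_key_order = sorted(in_seen_order)

print(in_seen_order)  # [[505, 30], [500, 15]]
print(in_key_order)   # [[500, 15], [505, 30]]
```

Wrap the result in sorted(...) only if the output order matters to you.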
a = [[1,3,45,6,78,9],[2,6,5,88,3,4],[44,66,2,4,77,12]]
b = [4,6,3]
These are two lists in Python. Each element of the first list a corresponds to the element of list b at the same index, i.e. a[0]: 4, a[1]: 6, a[2]: 3, and so on.
Now I want to sort list b and then print the corresponding values from list a.
I cannot use a dictionary because I get an error saying that a list is not hashable. My desired output is:
x = [[44,66,2,4,77,12], [1,3,45,6,78,9], [2,6,5,88,3,4]]
You can do:
a = [[1,3,45,6,78,9],[2,6,5,88,3,4],[44,66,2,4,77,12]]
b = [4,6,3]
c = sorted([[b[i],a[i]] for i in range(len(a))])
x = [i[1] for i in c]
It can be done on a single line like:
a = [[1,3,45,6,78,9],[2,6,5,88,3,4],[44,66,2,4,77,12]]
b = [4,6,3]
c = [el for _, el in sorted(zip(b, a))]
print(c)
Output:
[[44, 66, 2, 4, 77, 12], [1, 3, 45, 6, 78, 9], [2, 6, 5, 88, 3, 4]]
(If the values in b are not unique, ties are broken by comparing the corresponding lists from a.)
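If that tie-breaking is not wanted, one option is to sort on the b value only; because Python's sort is stable, tied entries then keep their original relative order. The lists a_dup and b_dup below are invented sample data with a repeated key, used only to show the difference.

```python
from operator import itemgetter

# Hypothetical data: b_dup contains the value 2 twice.
a_dup = [[9, 9], [1, 1], [5, 5]]
b_dup = [2, 2, 1]

# Sorting whole (b, a) pairs: ties on b fall back to comparing the lists.
by_pair = [el for _, el in sorted(zip(b_dup, a_dup))]

# Sorting on the b value only: ties keep their original order.
by_key_only = [el for _, el in sorted(zip(b_dup, a_dup), key=itemgetter(0))]

print(by_pair)      # [[5, 5], [1, 1], [9, 9]]
print(by_key_only)  # [[5, 5], [9, 9], [1, 1]]
```

Passing a key also avoids comparing the elements of a at all, which matters if they are not comparable with each other.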
There is no issue using a dictionary as long as you avoid using any list as a key. In this case, you could use b for keys and a for values. For example,
d = dict(zip(b,a))
print([d[k] for k in sorted(d)])
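Keep in mind that the dictionary approach assumes the values in b are unique: a repeated key overwrites the earlier entry. The following sketch, using made-up lists a_rows and b_keys, shows the lost row and one possible alternative that sorts indices instead.

```python
# Hypothetical data: the key 4 appears twice in b_keys.
a_rows = [[1, 3], [2, 6], [44, 66]]
b_keys = [4, 6, 4]

# The second 4 overwrites the first, so [1, 3] is silently dropped.
d = dict(zip(b_keys, a_rows))
from_dict = [d[k] for k in sorted(d)]

# Alternative: sort the positions by their b value, then look up a.
order = sorted(range(len(b_keys)), key=b_keys.__getitem__)
from_indices = [a_rows[i] for i in order]

print(from_dict)     # [[44, 66], [2, 6]]
print(from_indices)  # [[1, 3], [44, 66], [2, 6]]
```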
I have a list of n numbers. I want to split them into groups of g numbers and reverse the elements in every odd group. Finally, I want to combine the elements at the same position in all groups into new sublists. First I show the answer I am expecting, then where I went wrong:
Expected answer:
num = 14
grp = 4
# A list of num natural numbers arranged in group (row) of 4 numbers
lst = [0, 1, 2, 3,
       4, 5, 6, 7,
       8, 9, 10, 11,
       12, 13]
lst = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13]]
# Reverse elements in odd rows
newlst = [[0, 1, 2, 3],
          [7, 6, 5, 4],    # reversed here
          [8, 9, 10, 11],
          [13, 12]]        # reversed here
# combine elements in all sublists by their position
# combine first element in all sublists into a new sublist
sollst = [[0, 7, 8, 13], [1, 6, 9, 12], [2, 5, 10], [3, 4, 11]]
My solution:
num = 14
grp = 4
#### print
lst= list(range(0,num,1))
newlst= [lst[i:i+grp:1] for i in range(0,num,grp)]
evnlst = newlst[0::2]
oddlst = newlst[1::2]
newoddlst = [oddlst[i][::-1] for i in range(len(oddlst))]
sollst= evnlst + newoddlst
# This gives [[0, 1, 2, 3], [8, 9, 10, 11], [7, 6, 5, 4], [13, 12]]
from itertools import zip_longest
print([[x for x in t if x is not None] for t in zip_longest(fevngps)])
Present answer:
I got to one step before the final answer. Now I have to combine lists of different lengths, and I am running into this error:
TypeError: 'int' object is not subscriptable
One approach:
from itertools import zip_longest
num = 14
grp = 4
lst = list(range(0, num, 1))
newlst = [lst[i:i + grp:1] for i in range(0, num, grp)]
# build a new list where the sub-lists at odd indices are reversed
revlst = [sub[::-1] if i % 2 == 1 else sub for i, sub in enumerate(newlst)]
# zip using zip_longest and filter out the None values (the default fill value of zip_longest)
result = [[v for v in vs if v is not None] for vs in zip_longest(*revlst)]
print(result)
Output
[[0, 7, 8, 13], [1, 6, 9, 12], [2, 5, 10], [3, 4, 11]]
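Filtering on None works for this data, but it would also drop any real None values in the input. A possible variation, sketched here under the assumption that the input is a plain list, uses a private sentinel object as the fill value. The function name snake_columns is made up for this example.

```python
from itertools import zip_longest


def snake_columns(values, width):
    """Chunk values into rows of `width`, reverse odd rows, read by column."""
    missing = object()  # unique fill marker, so None in the data survives
    rows = [values[i:i + width] for i in range(0, len(values), width)]
    rows = [row[::-1] if i % 2 else row for i, row in enumerate(rows)]
    return [[v for v in col if v is not missing]
            for col in zip_longest(*rows, fillvalue=missing)]


print(snake_columns(list(range(14)), 4))
# [[0, 7, 8, 13], [1, 6, 9, 12], [2, 5, 10], [3, 4, 11]]
```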
I have a dataset that contains the history of a specific tag between a start and an end date. I am trying to compare rows based on a date column: if rows share the same year, month and day, I add the value from the next column to a temporary list. Once I have all the values for one date, I find the min and max of that list, subtract them, append the result to another list, and empty the temporary list to start over.
For the sake of time and simplicity, I am presenting an example as a 2D list. Here's my example data:
dataset = [[1,5],[1,6],[1,10],[1,23],[2,4],[2,8],[2,12],[3,10],[3,20],[3,40],[4,50],[4,500]]
The first column acts as the date and the second column as the value.
The issue I am having is:
I can't seem to compare rows by their first column so that the second-column values for the same date end up in the temp list for the min/max operations.
Based on the above 2D list I would expect to get [18, 8, 30, 450], but the result is [5, 4, 10].
dataset = [[1,5],[1,6],[1,10],[1,23],[2,4],[2,8],[2,12],[3,10],[3,20],[3,40],[4,50],[4,500]]
temp_list = []
daily_total = []
for i in range(len(dataset)-1):
    if dataset[i][0] == dataset[i+1][0]:
        temp_list.append(dataset[i][1])
    else:
        max_ = max(temp_list)
        min_ = min(temp_list)
        total = max_ - min_
        daily_total.append(total)
        temp_list = []
print([x for x in daily_total])
Try:
tmp = {}
for d, v in dataset:
    tmp.setdefault(d, []).append(v)
out = [max(v) - min(v) for v in tmp.values()]
print(out)
Prints:
[18, 8, 30, 450]
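As for why the loop in the question falls short: it only appends a value when the next row has the same date, so the last value of every date is skipped, and the final date is never flushed because the loop stops one row early. Below is a sketch of a corrected loop, assuming (as in the sample data) that rows for the same date are adjacent. The name is_last_row is introduced here for readability.

```python
dataset = [[1, 5], [1, 6], [1, 10], [1, 23], [2, 4], [2, 8], [2, 12],
           [3, 10], [3, 20], [3, 40], [4, 50], [4, 500]]

daily_total = []
temp_list = []
for i, (day, value) in enumerate(dataset):
    temp_list.append(value)  # always record the current row first
    is_last_row = i == len(dataset) - 1
    # close the group when the date changes or the data ends
    if is_last_row or dataset[i + 1][0] != day:
        daily_total.append(max(temp_list) - min(temp_list))
        temp_list = []

print(daily_total)  # [18, 8, 30, 450]
```

The dictionary version above does not need the adjacency assumption, which makes it the safer choice for unsorted data.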
Here is a solution using pandas:
import pandas as pd
dataset = [
    [1, 5],
    [1, 6],
    [1, 10],
    [1, 23],
    [2, 4],
    [2, 8],
    [2, 12],
    [3, 10],
    [3, 20],
    [3, 40],
    [4, 50],
    [4, 500],
]
df = pd.DataFrame(dataset)
df.columns = ["date", "value"]
df = df.groupby("date").agg(min_value=("value", "min"), max_value=("value", "max"))
df["res"] = df["max_value"] - df["min_value"]
df["res"].to_list()
Output:
[18, 8, 30, 450]
I have a list below:
a = [ [8, 12], [13, 9], [2, 5], [1, 10], [13, 13] ]
How do I find the 5 at a[2][1]? I want the minimum value among the second elements of the sublists.
This is my code:
min = a[0][1]
for i in range(len(a)):
    temp = a[i][1]
    if temp < min:
        min = temp
What is a good way (with fewer lines of code) to implement this code?
You could use the built-in function min with the key parameter:
a = [ [8, 12], [13, 9], [2, 5], [1, 10], [13, 13] ]
min(a, key=lambda x: x[1])[1]
output:
5
You can make a generator that yields the second element of each sublist (here data is your list of sublists):
sub[1] for sub in data
So we can pass this to the min(..) function:
min(sub[1] for sub in data)
If not all sublists have at least two elements, we can add a filter condition:
min(sub[1] for sub in data if len(sub) > 1)
Another way to look at it:
First extract only the second elements. Here are two ways:
from operator import itemgetter
seconds = [v[1] for v in a]
seconds = map(itemgetter(1), a)
Then take the minimum:
min_val = min(seconds)
Shortened:
min_val = min(map(itemgetter(1), a))
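If you also need to know which sublist holds the minimum, or where it sits, the same built-in can be pointed at the sublists or at enumerate. This is only a sketch; the names smallest_sub, index and empty_safe are invented for the example.

```python
from operator import itemgetter

a = [[8, 12], [13, 9], [2, 5], [1, 10], [13, 13]]

# The sublist whose second element is smallest.
smallest_sub = min(a, key=itemgetter(1))

# The position of that sublist: compare (index, sublist) pairs by sublist[1].
index, _ = min(enumerate(a), key=lambda pair: pair[1][1])

# min raises ValueError on an empty iterable unless a default is given.
empty_safe = min((sub[1] for sub in []), default=None)

print(smallest_sub)  # [2, 5]
print(index)         # 2
print(empty_safe)    # None
```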
I have n lists of equal length representing values of database rows. The data is pretty complicated, so I'll present simplified values in the example.
Essentially, I want to map the values of these lists (a, b, c) to a dictionary whose keys are the unique values of the id list.
Example lists:
id = [1,1,1,2,2,2,3,3,3]
a = [1,2,3,4,5,6,7,8,9]
b = [10,11,12,13,14,15,16,17,18]
c = [20,21,22,23,24,25,26,27,28]
Needed dictionary output:
{id:[[a],[b],[c]],...}
{'1':[[1,2,3],[10,11,12],[20,21,22]],'2':[[4,5,6],[13,14,15],[23,24,25]],'3':[[7,8,9],[16,17,18],[26,27,28]]}
The dictionary now has a list of lists for the values in the original a,b,c subsetted by the unique values in the id list which is now the dictionary key.
I hope this is clear enough.
Try this:
id = ['1','1','1','2','2','2','3','3','3']
a = [1,2,3,4,5,6,7,8,9]
b = [10,11,12,13,14,15,16,17,18]
c = [20,21,22,23,24,25,26,27,28]
from collections import defaultdict
d = defaultdict(list)
# add as many lists as needed, here n == 3
lsts = [a, b, c]
for ki, kl in zip(id, zip(*lsts)):
    d[ki] += [kl]
for k, v in d.items():
    # if you don't mind using tuples, simply do this: d[k] = list(zip(*v))
    # (on Python 3, map and zip return iterators, hence the list(...) calls)
    d[k] = list(map(list, zip(*v)))
The result is exactly as expected according to the question:
d == {'1':[[1,2,3],[10,11,12],[20,21,22]],
'2':[[4,5,6],[13,14,15],[23,24,25]],
'3':[[7,8,9],[16,17,18],[26,27,28]]}
=> True
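The same idea can be wrapped in a function that accepts any number of value lists. The sketch below is one possible shape for it; the name group_columns and its signature are assumptions made for this example, not something from the answer above.

```python
from collections import defaultdict


def group_columns(ids, *columns):
    """Map each id to one list per column, holding that id's values."""
    # every new key starts with one empty bucket per input column
    grouped = defaultdict(lambda: [[] for _ in columns])
    for key, row in zip(ids, zip(*columns)):
        for bucket, value in zip(grouped[key], row):
            bucket.append(value)
    return dict(grouped)


ids = ['1', '1', '1', '2', '2', '2', '3', '3', '3']
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [10, 11, 12, 13, 14, 15, 16, 17, 18]
c = [20, 21, 22, 23, 24, 25, 26, 27, 28]

result = group_columns(ids, a, b, c)
print(result['2'])  # [[4, 5, 6], [13, 14, 15], [23, 24, 25]]
```

Because values are appended directly into per-column buckets, no second pass over the dictionary is needed, and the ids do not have to be sorted or adjacent.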
IDs = [1,1,1,2,2,2,3,3,3]
a = [1,2,3,4,5,6,7,8,9]
b = [10,11,12,13,14,15,16,17,18]
c = [20,21,22,23,24,25,26,27,28]
import itertools
d = {}
for key, group in itertools.groupby(sorted(zip(IDs, a, b, c)), key=lambda x: x[0]):
    # zip returns an iterator on Python 3, so materialize it before slicing
    d[key] = [list(col) for col in list(zip(*group))[1:]]  # [1:] to get rid of the ID
print(d)
OUTPUT:
{1: [[1, 2, 3], [10, 11, 12], [20, 21, 22]],
2: [[4, 5, 6], [13, 14, 15], [23, 24, 25]],
3: [[7, 8, 9], [16, 17, 18], [26, 27, 28]]}