I have a list of lists, where each inner list represents a row in a spreadsheet. With my current data structure, how can I perform an operation on the elements that share the same index across the inner lists (which basically amounts to performing operations down a column in a spreadsheet)?
Here is an example of what I am looking for (in terms of addition):
>>> lisolis = [[1,2,3], [4,5,6], [7,8,9]]
>>> sumindex = [1+4+7, 2+5+8, 3+6+9]
>>> sumindex = [12, 15, 18]
This problem can probably be solved with slicing, but I'm unable to see how to do that cleanly. Is there a nifty tool/library out there that can accomplish this for me?
Just use zip:
sumindex = [sum(elts) for elts in zip(*lisolis)]
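The same zip(*rows) transposition works for column operations other than addition. Here is a minimal sketch, assuming every row has the same length (zip stops at the shortest row). The column_apply helper is a made-up name for illustration, not something from a library:

```python
from math import prod  # available in Python 3.8+


def column_apply(rows, func):
    """Apply func to each column of a list of equal-length rows (hypothetical helper)."""
    # zip(*rows) yields one tuple per column
    return [func(col) for col in zip(*rows)]


lisolis = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
col_sums = column_apply(lisolis, sum)
col_maxes = column_apply(lisolis, max)
col_products = column_apply(lisolis, prod)
print(col_sums, col_maxes, col_products)
```

Any function that accepts an iterable of values can be passed as func.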
@tzaman has a good solution for lists, but since you have also put numpy in the tags, there's an even simpler solution if you have a numpy 2D array:
>>> import numpy
>>> a = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a.sum(axis=0)
array([12, 15, 18])
This should be faster if you have large arrays.
>>> sumindex = numpy.array(lisolis).sum(axis=0).tolist()
>>> import pandas as pd
>>> df = pd.DataFrame([[1,2,3], [4,5,6], [7,8,9]], columns=['A','B','C'])
>>> df.sum()
A    12
B    15
C    18
The list(), map(), zip(), and sum() functions make short work of this problem:
>>> list(map(sum, zip(*lisolis)))
[12, 15, 18]
First I'll fill out the lists of what we want to add up:
>>> lisolis = [[1,2,3], [4,5,6], [7,8,9]]
Then create an empty list for my sums:
>>> sumindex = []
Now I'll start a loop inside a loop. I'm going to add the numbers that are in the same positions of each little list (that's the "y" loop) and I'm going to do that for each position (that's the "x" loop).
>>> for x in range(len(lisolis[0])):
...     z = 0
...     for y in range(len(lisolis)):
...         z += lisolis[y][x]
...     sumindex.append(z)
range(len(lisolis[0])) gives me the length of the little lists so I know how many positions there are to add up, and range(len(lisolis)) gives me the number of little lists so I know how many numbers need to be added up for any particular position in the list.
"z" is a placeholder. Each number in a particular list position is going to be added to "z" until they're summed up. After that, I'll put the value of "z" into the next slot in sumindex.
To see the results, I would then type:
>>> sumindex
which would give me:
[12, 15, 18]
This is a continuation of another question I asked before (Dataframe add element from a column based on values contiguity from another columns). I got a solution that works with a pandas DataFrame, but not one for two plain lists, and this is where I am stuck.
I have 2 lists:
a=[2,3,4,1]
b=[5,6,7,2,8,9,1,2,3,4]
I would like to sum consecutive elements of b, using the values of a as the group sizes.
The first element of a = 2 so I would like to add from b the first 2 elements (5+6)
The second element of a = 3 so I would like to add from b the next 3 elements (7+2+8)
and so on.
I tried a for loop, but the sum always starts from the first element of b. Is there a way to get the result I want without changing b or creating another list?
Is this what you're looking for?
a=[2,3,4,1]
b=[5,6,7,2,8,9,1,2,3,4]
c = []
index = 0
for item in a:
    c.append(sum(b[index: index + item]))
    index += item
print(c)
Output
[11, 17, 15, 4]
numpy:
import numpy as np
np.add.reduceat(b,np.cumsum(np.concatenate([[0],a[:-1]])))
# array([11, 17, 15, 4])
python:
import itertools as it
bi = iter(b)
[sum(it.islice(bi,x)) for x in a]
# [11, 17, 15, 4]
I would use numpy.cumsum to get a running sum of the starting index for the next series of sums. Then you can zip that index list against itself offset by 1 to determine the slice to sum for each iteration.
>>> from numpy import cumsum
>>> starts = cumsum([0] + a)
>>> [sum(b[i:j]) for i,j in zip(starts, starts[1:])]
[11, 17, 15, 4]
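If numpy is not otherwise needed, the same running-sum idea can be sketched with the standard library. This assumes Python 3.8+ (for the initial argument of itertools.accumulate); the variable names starts and chunk_sums are just illustrative:

```python
from itertools import accumulate

a = [2, 3, 4, 1]
b = [5, 6, 7, 2, 8, 9, 1, 2, 3, 4]

# Running total of the chunk sizes with a leading 0: [0, 2, 5, 9, 10]
starts = list(accumulate(a, initial=0))

# Pair each start with the next one to get the slice bounds of every chunk
chunk_sums = [sum(b[i:j]) for i, j in zip(starts, starts[1:])]
print(chunk_sums)
```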
a=[2,3,4,1]
b=[5,6,7,2,8,9,1,2,3,4]
new = []
i=0
for x in range(len(a)):
    el = a[x]
    new.append(sum(b[i:i+el]))
    i=i+el
print(new)
#[11, 17, 15, 4]
Without creating a new intermediary list, you could do something like this:
[ sum( b[ sum(a[:i]): ][ :a[i] ] ) for i in range(len(a)) ]
It is somewhat computation-heavy, though, because sum(a[:i]) is recomputed for every i. Using a for loop which builds the list c would be a much more efficient approach, as in @Balaji Ambresh's answer.
I am working through some code, trying to understand some Python mechanics that I just do not get. I guess it is pretty simple, and I know what it does, but I do not know how it works. I understand the normal use of for loops, but this one I do not understand.
Remark: I know some Python, but I am not an expert.
np.array([[[S[i,j]] for i in range(order+1)] for j in range(order+1)])
The second piece of code I have problems with is this one:
for i in range(len(u)):
    for j in range(len(v)):
        tmp+=[rm[i,j][k]*someFunction(name,u[i],v[j])[k] for k in range(len(rm[i,j])) if rm[i,j][k]]
How does the innermost for-loop work? And also what does the if do here?
Thank you for your help.
EDIT: Sorry that the code is so unreadable, I am just trying to understand it myself. S and rm are numpy matrices, someFunction returns an array with scalar entries, and tmp is just a helper variable.
There are quite a few different concepts inside your code. Let's start with the most basic ones. Python lists and numpy arrays are indexed differently. You can also build a numpy array from a list:
import numpy as np
S_list = [[1,2,3], [4,5,6], [7,8,9]]
S_array = np.array(S_list)
print(S_list)
print(S_array)
print(S_list[0][2]) # indexing element 2 from list 0
print(S_array[0,2]) # indexing element at position 0,2 of 2-dimensional array
This results in:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[[1 2 3]
 [4 5 6]
 [7 8 9]]
3
3
So for your first line of code:
np.array([[[S[i,j]] for i in range(order+1)] for j in range(order+1)])
You are building a numpy array from a list, and that list is built with a list comprehension. So the code inside the np.array(...) call:
[[[S[i,j]] for i in range(order+1)] for j in range(order+1)]
... is equivalent to:
order = 2
full_list = []
for j in range(order+1):
    local_list = []
    for i in range(order+1):
        # Note: the original comprehension wraps this value in a list, [S[i,j]]
        local_list.append(S_array[i, j])
    full_list.append(local_list)
print(full_list)
This results in:
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
As for your second snippet, it's important to notice that although numpy arrays typically have one fixed cell type for the whole array, you can give a numpy array the object data type. So creating a 2-dimensional array of lists is possible. It is also possible to create a 3-dimensional array. Both are compatible with the indexing rm[i,j][k]. You can check this in the following example:
rm = np.array(["A", 3, [1,2,3]], dtype="object")
print(rm, rm[2][0]) # Accessing element 0 of list at position 2 of the array
rm2 = np.zeros((3, 3, 3))
print(rm2[0, 1][2]) # This is also valid
The following code:
[rm[i,j][k]*someFunction(name,u[i],v[j])[k] for k in range(len(rm[i,j])) if rm[i,j][k]]
... could be written as such:
some_list = []
for k in range(len(rm[i,j])):
    if rm[i, j][k]:  # Truthiness test: skips falsy entries such as 0
        a_list = rm[i,j][k]*someFunction(name,u[i],v[j])
        some_list.append(a_list[k])
The final detail is the tmp+=some_list. When you add two lists they are concatenated, as can be seen in this simple example:
tmp = []
tmp += [1, 2, 3]
print(tmp)
tmp += [4, 5, 6]
print(tmp)
Which results in this:
[1, 2, 3]
[1, 2, 3, 4, 5, 6]
Also notice that multiplying a Python list by a number is effectively the same as concatenating the list with itself several times. So 2*[1,2] will result in [1,2,1,2]. (Numpy arrays, by contrast, multiply element-wise.)
It's a list comprehension, albeit a pretty unreadable one. That was someone doing something very 'pythonic' at the expense of readability. Just look up list comprehensions and try to rewrite it yourself as a traditional for loop. List comprehensions are very useful, but I'm not sure I would have gone that route here.
The syntax for a list comprehension is
[expression for var in iterable if condition]  (the "if condition" part is optional)
So this bottom line can be rewritten like so:
for k in range(len(rm[i,j])):
    if rm[i,j][k]:
        tmp.append(rm[i,j][k]*someFunction(name,u[i],v[j])[k])
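To see the filtering if at work, here is a small self-contained sketch. The lists rm_cell and values are made-up stand-ins for rm[i,j] and the array returned by someFunction(name, u[i], v[j]); they are assumptions for illustration only:

```python
rm_cell = [2, 0, 3]     # stand-in for rm[i, j]; the 0 is falsy and gets skipped
values = [10, 20, 30]   # stand-in for someFunction(name, u[i], v[j])

# Comprehension form: keep only the positions k where rm_cell[k] is truthy
tmp = []
tmp += [rm_cell[k] * values[k] for k in range(len(rm_cell)) if rm_cell[k]]

# Equivalent explicit loop
tmp_loop = []
for k in range(len(rm_cell)):
    if rm_cell[k]:
        tmp_loop.append(rm_cell[k] * values[k])

print(tmp, tmp_loop)
```

Both forms skip k = 1 because rm_cell[1] is 0, so only two products end up in the list.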
I have a list (a), which contains references to another list (b). I am trying to create a third list (c), which contains the summed-up values of (b) referenced at the corresponding index of (a). Below is an example to hopefully make it clearer.
I have a rather large data set and this needs to be done frequently as part of an optimization process. Is there a way besides nested for-loops to do this efficiently and automated, without having to define every entry of c?
a = [[0],[0,3],[1,2],[3],[1,2,3]]
b = [10,20,30,40]
c = [b[0], b[0]+b[3], b[1]+b[2], b[3], b[1]+b[2]+b[3]]
Thanks in advance for any help, and sorry for potential mistakes in the post. It's my first and I'm trying to learn.
Not sure if you need it to be pure Python only, but if not, you can use the numpy library:
>>> import numpy as np
>>> c = np.array(b)
>>> [sum(c[i]) for i in a]
[10, 50, 50, 40, 90]
You can do this in a list comprehension: It takes the indices from each inner list in a, builds a list of the b values corresponding, sums them up, and stores them in a list that is assigned to d.
a = [[0], [0, 3], [1, 2], [3], [1, 2, 3]]
b = [10, 20, 30, 40]
d = [sum(b[idx] for idx in indices) for indices in a]
print(d)
output:
[10, 50, 50, 40, 90]
I have the following problem: I have index arrays with repeating indices and would like to add values to an array like this:
grid_array[xidx[:],yidx[:],zidx[:]] += data[:]
However, as I have repeated indices this does not work as it should because numpy will create a temporary array which results in the data for the repeated indices being assigned several times instead of being added to each other (see http://docs.scipy.org/doc/numpy/user/basics.indexing.html).
A for loop like
for i in range(0,n):
    grid_array[xidx[i],yidx[i],zidx[i]] += data[i]
will be way too slow. Is there a way I can still use the vectorization of numpy? Or is there another way to make this assignment faster?
Thanks for your help
How about using bincount?
import numpy as np
flat_index = np.ravel_multi_index([xidx, yidx, zidx], grid_array.shape)
datasum = np.bincount(flat_index, data, minlength=grid_array.size)
grid_array += datasum.reshape(grid_array.shape)
This is a buffering issue. The ufunc .at method performs the operation unbuffered, so values for repeated indices accumulate.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.at.html#numpy.ufunc.at
np.add.at(grid_array, (xidx,yidx,zidx),data)
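The difference is easiest to see on a small 1-D case. This is only a sketch with made-up data (idx, data, and the result names are illustrative), comparing the buffered += with np.add.at and with the np.bincount approach:

```python
import numpy as np

idx = np.array([0, 0, 2])            # index 0 appears twice
data = np.array([1.0, 2.0, 5.0])

buffered = np.zeros(3)
buffered[idx] += data                # repeated index: only one write survives

unbuffered = np.zeros(3)
np.add.at(unbuffered, idx, data)     # repeated index: contributions accumulate

# bincount with weights sums the data per index as well
counted = np.bincount(idx, weights=data, minlength=3)

print(buffered)
print(unbuffered)
print(counted)
```

With np.add.at and np.bincount, slot 0 receives both 1.0 and 2.0; with the plain += it does not.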
To add an array to each row of a nested array, you can just do grid_array[::] += data:
>>> grid_array=np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> data=np.array([3,3,3])
>>> grid_array[::]+=data
>>> grid_array
array([[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
I think I found a possible solution:
def assign(xidx,yidx,zidx,data):
    grid_array[xidx,yidx,zidx] += data
    return
map(assign,xidx,yidx,zidx,sn.part0['mass'])