Making for loop with index arrays faster - python

I have the following problem: I have index arrays with repeating indices and would like to add values to an array like this:
grid_array[xidx[:],yidx[:],zidx[:]] += data[:]
However, because the indices repeat, this does not work as intended: numpy evaluates the indexed expression into a temporary array first, so for repeated indices the values are assigned several times instead of being added to each other (see http://docs.scipy.org/doc/numpy/user/basics.indexing.html).
A for loop like
for i in range(n):
    grid_array[xidx[i], yidx[i], zidx[i]] += data[i]
will be way too slow. Is there a way I can still use the vectorization of numpy? Or is there another way to make this assignment faster?
Thanks for your help

How about using bincount?
import numpy as np
flat_index = np.ravel_multi_index([xidx, yidx, zidx], grid_array.shape)
datasum = np.bincount(flat_index, data, minlength=grid_array.size)
grid_array += datasum.reshape(grid_array.shape)
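A quick way to convince yourself that this matches the explicit loop is to run both on a tiny case. This is only a sketch: the 2x2x2 grid and the index/data values below are made up for illustration, with the index (0, 1, 1) deliberately repeated.

```python
import numpy as np

# Hypothetical small grid and index arrays; (0, 1, 1) appears twice.
grid_array = np.zeros((2, 2, 2))
xidx = np.array([0, 0, 1])
yidx = np.array([1, 1, 0])
zidx = np.array([1, 1, 0])
data = np.array([1.0, 2.0, 5.0])

# Reference result from the explicit (slow) loop.
expected = np.zeros_like(grid_array)
for i in range(len(data)):
    expected[xidx[i], yidx[i], zidx[i]] += data[i]

# Flatten the 3-D indices, sum the weights per flat index, reshape back.
flat_index = np.ravel_multi_index((xidx, yidx, zidx), grid_array.shape)
datasum = np.bincount(flat_index, weights=data, minlength=grid_array.size)
grid_array += datasum.reshape(grid_array.shape)
```

Note that bincount with weights returns a float array, so this assumes grid_array has a float dtype.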

This is a buffering issue. The ufunc method .at performs the operation unbuffered, so contributions for repeated indices accumulate:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.at.html#numpy.ufunc.at
np.add.at(grid_array, (xidx,yidx,zidx),data)
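To see the difference between the two, here is a minimal 1-D sketch. The arrays are made up for illustration; index 0 is repeated on purpose.

```python
import numpy as np

idx = np.array([0, 0, 2])          # index 0 is repeated
data = np.array([1.0, 2.0, 5.0])

# Plain fancy-index +=: for the repeated index only one write survives.
buffered = np.zeros(3)
buffered[idx] += data

# ufunc.at: every contribution for a repeated index is accumulated.
unbuffered = np.zeros(3)
np.add.at(unbuffered, idx, data)
```

After running this, buffered[0] holds 2.0 (only the last write), while unbuffered[0] holds 3.0 (1.0 + 2.0).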

To add an array to the elements of a nested array, you can just do grid_array[::] += data:
>>> grid_array=np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> data=np.array([3,3,3])
>>> grid_array[::]+=data
>>> grid_array
array([[ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

I think I found a possible solution:
def assign(xidx, yidx, zidx, data):
    grid_array[xidx, yidx, zidx] += data
    return

# In Python 3, map() is lazy; wrapping it in list() forces the calls to run.
list(map(assign, xidx, yidx, zidx, sn.part0['mass']))

Related

Python 3: Add two lists together at the same indexes [duplicate]

I have a list of lists, where each inner list represents a row in a spreadsheet. With my current data structure, how can I perform an operation on each element on an inner list with the same index ( which amounts to basically performing operations down a column in a spreadsheet.)
Here is an example of what I am looking for (in terms of addition)
>>> lisolis = [[1,2,3], [4,5,6], [7,8,9]]
>>> sumindex = [1+4+7, 2+5+8, 3+6+9]
>>> sumindex = [12, 15, 18]
This problem can probably be solved with slicing, but I'm unable to see how to do that cleanly. Is there a nifty tool/library out there that can accomplish this for me?
Just use zip:
sumindex = [sum(elts) for elts in zip(*lisolis)]
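In case the zip(*...) idiom is unfamiliar, here is the same thing spelled out step by step, as a small sketch using the list from the question (the name columns is only for illustration):

```python
lisolis = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# zip(*lisolis) is the same as zip([1, 2, 3], [4, 5, 6], [7, 8, 9]):
# it yields one tuple per column, i.e. it transposes the rows.
columns = list(zip(*lisolis))

# Summing each column tuple gives the per-index totals.
sumindex = [sum(col) for col in columns]
```

Here columns is [(1, 4, 7), (2, 5, 8), (3, 6, 9)] and sumindex is [12, 15, 18].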
@tzaman has a good solution for lists, but since you have also put numpy in the tags, there's an even simpler solution if you have a numpy 2D array:
>>> import numpy
>>> a = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a.sum(axis=0)
array([12, 15, 18])
This should be faster if you have large arrays.
>>> sumindex = numpy.array(lisolis).sum(axis=0).tolist()
>>> import pandas as pd
>>> df = pd.DataFrame([[1,2,3], [4,5,6], [7,8,9]], columns=['A','B','C'])
>>> df.sum()
A    12
B    15
C    18
The list(), map(), zip(), and sum() functions make short work of this problem:
>>> list(map(sum, zip(*lisolis)))
[12, 15, 18]
First I'll fill out the lists of what we want to add up:
>>> lisolis = [[1,2,3], [4,5,6], [7,8,9]]
Then create an empty list for my sums:
>>> sumindex = []
Now I'll start a loop inside a loop. I'm going to add the numbers that are in same positions of each little list (that's the "y" loop) and I'm going to do that for each position (that's the "x" loop).
>>> for x in range(len(lisolis[0])):
...     z = 0
...     for y in range(len(lisolis)):
...         z += lisolis[y][x]
...     sumindex.append(z)
...
range(len(lisolis[0])) iterates over the positions in the little lists, so I know how many positions there are to add up, and range(len(lisolis)) iterates over the little lists themselves, so I know how many numbers need to be added up for any particular position.
"z" is a placeholder. Each number in a particular list position is added to "z" until they're all summed up. After that, I append the value of "z" to sumindex.
To see the results, I would then type:
>>> sumindex
which would give me:
[12, 15, 18]

In NumPy, how to extract a range from a 1d array without using np.s_?

Assume an array like so:
a = np.arange(10)
I'd like to delete the numbers from index 2 to 5.
I can do it like this:
a = np.delete(a, np.s_[2:6])
Now a contains [0, 1, 6, 7, 8, 9]. However this function is not supported by Numba, and I need to compile this code using Numba.
I would need to accomplish the same using only "basic" NumPy functions (anything here is OK: https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html). Unfortunately the s_ object is not supported.
How can I accomplish this? It's OK if I need to make more than one call or use temporary arrays.
The temporary arrays will be created no matter how you do it. You can use some very simple indexing to get what you want:
a = np.arange(10)
a = np.delete(a, slice(2, 6))
The documentation for s_ pretty much tells you how to do this in the notes. A 1D call to s_ is mostly just shorthand for slice.
Using delete is probably the right choice here because it will allocate the output more efficiently than manually slicing the beginning and end and concatenating.
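For comparison, the manual alternative (slice the beginning and the end, then join them) could look like the sketch below. delete_range is a hypothetical helper name, and it is shown as plain NumPy; whether it compiles under a given Numba version is an assumption that has not been tested here.

```python
import numpy as np

def delete_range(arr, i, j):
    # Keep everything before index i and everything from index j onward.
    return np.concatenate((arr[:i], arr[j:]))

a = np.arange(10)
result = delete_range(a, 2, 6)
```

result then contains [0, 1, 6, 7, 8, 9], and a itself is left unchanged.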
I would advise against deleting from an array in numpy, as it can be slow (especially for longer arrays, since it copies). Using masks is another way (not sure if numba supports it, but it's worth a try). If you can, try to do your operations on a masked array:
b = a[2:6]
#[2 3 4 5]
#Try to do operations on masked array
a = np.ma.array(a, mask=False)
a.mask[2:6] = True
#[0 1 -- -- -- -- 6 7 8 9]
#if you insist on deleting masked elements
a = a.compressed()
#[0 1 6 7 8 9]
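A plain boolean mask gives the same result without going through np.ma. This is just a sketch with an assumed variable name (keep); I have not checked it under Numba.

```python
import numpy as np

a = np.arange(10)

# Start with "keep everything", then switch off the range to drop.
keep = np.ones(a.shape[0], dtype=np.bool_)
keep[2:6] = False

# Boolean indexing copies only the elements where keep is True.
result = a[keep]
```

result is [0 1 6 7 8 9], the same as a.compressed() above.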
One way using numpy.arange:
import numpy as np
from numba import njit

@njit
def nb_delete(arr, i, j):
    # Build the indices explicitly instead of passing a slice object.
    return np.delete(arr, np.arange(i, j))

nb_delete(np.arange(10), 2, 6)
Output:
array([0, 1, 6, 7, 8, 9])

Need to grab the entire first element (array) in a multi-dimensional list of arrays python3

Apologies if this has already been asked, but I searched quite a bit and couldn't find quite the right solution. I'm new to python, but I'll try to be as clear as possible. In short, I have a list of arrays in the following format, resulting from joining a multiprocessing pool:
array = [[[1,2,3], 5, 47, 2515],..... [[4,5,6], 3, 35, 2096]]
and I want to get all values from the first array element to form a new array in the following form:
print(new_array)
[1,2,3,4,5,6]
In my code, I was trying to get the first value through this function:
new_array = array[0][0]
but this only returns the first value as such:
print(new_array)
[1,2,3]
I also tried np.take after converting the array into a np array:
array = np.array(array)
new_array = np.take(results,0)
print(new_array)
[1,2,3]
I have tried a number of np functions (concatenate, take, etc.) to try and iterate this over the list, but get back the following error (presumably because the size of the array changes):
ValueError: autodetected range of [[], [1445.0, 1445.0, -248.0, 638.0, -108.0, 649.0]] is not finite
Thanks for any help!
You can achieve it without numpy using reduce:
from functools import reduce
l = [[[1,2,3], 5, 47, 2515], [[4,5,6], 3, 35, 2096]]
res = reduce(lambda a, b: [*a, *b], [x[0] for x in l])
Output
[1, 2, 3, 4, 5, 6]
Maybe it is worth mentioning that [*a, *b] is a way to concatenate lists in python, for example:
[*[1, 2, 3], *[4, 5, 6]] # [1, 2, 3, 4, 5, 6]
You could also use itertools' chain() function to flatten an extraction of the first subArray in each element of the list:
from itertools import chain
result = list(chain(*[sub[0] for sub in array]))
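A variant of the same idea uses chain.from_iterable with a generator, which avoids building the intermediate list and unpacking it. A small sketch, using the two sample rows from the question as the input:

```python
from itertools import chain

# Sample data in the shape described in the question.
array = [[[1, 2, 3], 5, 47, 2515], [[4, 5, 6], 3, 35, 2096]]

# Take the first element (a list) of each row and flatten them in order.
result = list(chain.from_iterable(sub[0] for sub in array))
```

result is [1, 2, 3, 4, 5, 6].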

Deleting Elements from an array

I have a numpy array and I want to delete the first 3 elements of the array. I tried this solution:
a = np.arange(0,10)
i = 0
while i < 3:
    del a[0]
    i = i + 1
This gives me the error "ValueError: cannot delete array elements". I do not understand why this is the case. I'd appreciate the help, thanks!
Numpy arrays have a fixed size, hence you cannot simply delete an element from them. The simplest way to achieve what you want is to use slicing:
a = a[3:]
This rebinds a to a new array object (a view onto the original data) that starts with the 4th element of the original array.
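One thing worth knowing about the slice: it shares memory with the original array rather than copying it. A short sketch (the names tail and independent are only for illustration):

```python
import numpy as np

a = np.arange(10)

# Basic slicing returns a view: no data is copied.
tail = a[3:]
is_view = np.shares_memory(a, tail)

# Call .copy() if the result must be independent of the original.
independent = a[3:].copy()
is_copy = not np.shares_memory(a, independent)
```

Both is_view and is_copy are True here. If you only reassign a = a[3:] and never touch the old array again, the distinction usually does not matter.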
For certain scenarios, slicing is just not enough. If you want to create a subarray consisting of specific elements from the original array, you can use another array to select the indices:
>>> a = np.arange(10, 20)
>>> a[[1, 4, 5]]
array([11, 14, 15])
So basically, a[[1,4,5]] will return an array that consists of the elements at indices 1, 4 and 5 of the original array.
It works for me:
import numpy as np
a = np.delete(a, k)
where "a" is your numpy array and k is the index position you want to delete.
Hope it helps.
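For the case in the question (dropping the first 3 elements), np.delete also accepts a list of indices. A minimal sketch:

```python
import numpy as np

a = np.arange(0, 10)

# np.delete returns a new array; the input array is not modified.
b = np.delete(a, [0, 1, 2])
```

b is [3 4 5 6 7 8 9], while a still has all 10 elements, which is why the result has to be assigned back.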
numpy arrays don't support element deletion. Why don't you just use slicing to achieve what you want?
a = a[3:]
You can convert it into a list and then use the regular list operations such as pop or del, e.g.
a = np.array([1,2,3,4,5])
l = list(a)
l.pop(3)
l
# [1, 2, 3, 5]
