Two-array counting in NumPy [duplicate]

I'm searching for an efficient way to create a matrix of occurrences from two arrays that contain indices: one holds the row indices in this matrix, the other the column ones.
eg. I have:
#matrix will be size 4x3 in this example
#array of rows idxs, with values from 0 to 3
[0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3]
#array of columns idxs, with values from 0 to 2
[0, 1, 1, 1, 2, 2, 0, 1, 2, 0, 2, 2, 2, 2]
And need to create a matrix of occurrences like:
[[1 0 0]
 [0 2 0]
 [0 1 2]
 [2 1 5]]
I can create an array of one-hot vectors in a simple form, but can't get it to work when there is more than one occurrence:
n_rows = 4
n_columns = 3
#data
rows = np.array([0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3])
columns = np.array([0, 1, 1, 1, 2, 2, 0, 1, 2, 0, 2, 2, 2, 2])
#empty matrix
new_matrix = np.zeros([n_rows, n_columns])
#adding 1 for each [row, column] occurrence:
new_matrix[rows, columns] += 1
print(new_matrix)
Which returns:
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  1.  1.]
 [ 1.  1.  1.]]
It seems like indexing and adding a value like this doesn't work when an index occurs more than once; printing with the same fancy index, on the other hand, works just fine:
print(new_matrix[rows, :])
Which returns:
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  1.  0.]
 [ 0.  1.  1.]
 [ 0.  1.  1.]
 [ 0.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]]
So maybe I'm missing something there? Or can't this be done, and I need to search for another way to do it?

Use np.add.at, specifying a tuple of indices:
>>> np.add.at(new_matrix, (rows, columns), 1)
>>> new_matrix
array([[ 1.,  0.,  0.],
       [ 0.,  2.,  0.],
       [ 0.,  1.,  2.],
       [ 2.,  1.,  5.]])
np.add.at operates on the array in place, adding 1 as many times at each index as that index appears in the (rows, columns) tuple.
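For context, here is a minimal sketch of the buffering behaviour that makes the original += attempt fail (the names here are illustrative, not from the question):
import numpy as np

a = np.zeros(3)
a[[1, 1, 2]] += 1           # index 1 repeats, but the add is buffered: it only lands once
print(a)                    # [0. 1. 1.]

b = np.zeros(3)
np.add.at(b, [1, 1, 2], 1)  # unbuffered: repeated indices accumulate
print(b)                    # [0. 2. 1.]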

Approach #1
We can convert those pairs to linear indices and then use np.bincount -
def bincount_app(rows, columns, n_rows, n_columns):
    # Get linear index equivalent
    lidx = (columns.max()+1)*rows + columns
    # Use binned count on the linear indices
    return np.bincount(lidx, minlength=n_rows*n_columns).reshape(n_rows, n_columns)
Sample run -
In [242]: n_rows = 4
...: n_columns = 3
...:
...: rows = np.array([0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3])
...: columns = np.array([0, 1, 1, 1, 2, 2, 0, 1, 2, 0, 2, 2, 2, 2])
In [243]: bincount_app(rows, columns, n_rows, n_columns)
Out[243]:
array([[1, 0, 0],
       [0, 2, 0],
       [0, 1, 2],
       [2, 1, 5]])
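As an aside, the linear index in bincount_app is computed with columns.max()+1, which only matches the output shape when the column indices happen to reach n_columns-1. A hedged variant (the function name here is illustrative) using np.ravel_multi_index, which takes the declared shape directly:
def bincount_rmi_app(rows, columns, n_rows, n_columns):
    # Same idea, but the linear indices come from the declared shape
    lidx = np.ravel_multi_index((rows, columns), (n_rows, n_columns))
    return np.bincount(lidx, minlength=n_rows*n_columns).reshape(n_rows, n_columns)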
Approach #2
Alternatively, we can sort the linear indices and get the counts using slicing to have our second approach, like so -
def mask_diff_app(rows, columns, n_rows, n_columns):
    lidx = (columns.max()+1)*rows + columns
    lidx.sort()
    mask = np.concatenate(([True], lidx[1:] != lidx[:-1], [True]))
    count = np.diff(np.flatnonzero(mask))
    new_matrix = np.zeros([n_rows, n_columns], dtype=int)
    new_matrix.flat[lidx[mask[:-1]]] = count
    return new_matrix
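For comparison, the sort-and-slice steps above can also be expressed with np.unique, which sorts and counts in one call; a minimal sketch, not part of the original answer:
def unique_count_app(rows, columns, n_rows, n_columns):
    lidx = n_columns*rows + columns
    uidx, counts = np.unique(lidx, return_counts=True)  # sorted unique linear indices + counts
    new_matrix = np.zeros(n_rows*n_columns, dtype=int)
    new_matrix[uidx] = counts
    return new_matrix.reshape(n_rows, n_columns)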
Approach #3
This seems like a straightforward one with a sparse csr_matrix as well, since it accumulates repeated indices on its own. The benefit is memory efficiency, which would be noticeable if you are filling a small number of places in the output and a sparse matrix output is okay.
The implementation would look something like this -
from scipy.sparse import csr_matrix

def sparse_matrix_app(rows, columns, n_rows, n_columns):
    out_shp = (n_rows, n_columns)
    data = np.ones(len(rows), dtype=int)
    return csr_matrix((data, (rows, columns)), shape=out_shp)
If you need a regular/dense array, simply do -
sparse_matrix_app(rows, columns, n_rows, n_columns).toarray()
Sample output -
In [319]: sparse_matrix_app(rows, columns, n_rows, n_columns).toarray()
Out[319]:
array([[1, 0, 0],
       [0, 2, 0],
       [0, 1, 2],
       [2, 1, 5]])
Benchmarking
Other approach(es) -
# @cᴏʟᴅsᴘᴇᴇᴅ's soln
def add_at_app(rows, columns, n_rows, n_columns):
    new_matrix = np.zeros([n_rows, n_columns], dtype=int)
    np.add.at(new_matrix, (rows, columns), 1)
    return new_matrix
Timings
Case #1 : Output array of shape (1000, 1000) and no. of indices = 10k
In [307]: # Setup
...: n_rows = 1000
...: n_columns = 1000
...: rows = np.random.randint(0,1000,(10000))
...: columns = np.random.randint(0,1000,(10000))
In [308]: %timeit add_at_app(rows, columns, n_rows, n_columns)
...: %timeit bincount_app(rows, columns, n_rows, n_columns)
...: %timeit mask_diff_app(rows, columns, n_rows, n_columns)
...: %timeit sparse_matrix_app(rows, columns, n_rows, n_columns)
1000 loops, best of 3: 1.05 ms per loop
1000 loops, best of 3: 424 µs per loop
1000 loops, best of 3: 1.05 ms per loop
1000 loops, best of 3: 1.41 ms per loop
Case #2 : Output array of shape (1000, 1000) and no. of indices = 100k
In [309]: # Setup
...: n_rows = 1000
...: n_columns = 1000
...: rows = np.random.randint(0,1000,(100000))
...: columns = np.random.randint(0,1000,(100000))
In [310]: %timeit add_at_app(rows, columns, n_rows, n_columns)
...: %timeit bincount_app(rows, columns, n_rows, n_columns)
...: %timeit mask_diff_app(rows, columns, n_rows, n_columns)
...: %timeit sparse_matrix_app(rows, columns, n_rows, n_columns)
100 loops, best of 3: 11.4 ms per loop
1000 loops, best of 3: 1.27 ms per loop
100 loops, best of 3: 7.44 ms per loop
10 loops, best of 3: 20.4 ms per loop
Case #3 : Sparse-ness in output
As stated earlier, for the sparse method to work better, we need sparseness in the output. Such a case would look like this -
In [314]: # Setup
...: n_rows = 5000
...: n_columns = 5000
...: rows = np.random.randint(0,5000,(1000))
...: columns = np.random.randint(0,5000,(1000))
In [315]: %timeit add_at_app(rows, columns, n_rows, n_columns)
...: %timeit bincount_app(rows, columns, n_rows, n_columns)
...: %timeit mask_diff_app(rows, columns, n_rows, n_columns)
...: %timeit sparse_matrix_app(rows, columns, n_rows, n_columns)
100 loops, best of 3: 11.7 ms per loop
100 loops, best of 3: 11.1 ms per loop
100 loops, best of 3: 11.1 ms per loop
1000 loops, best of 3: 269 µs per loop
If you need a dense array, we lose the memory efficiency and hence the performance benefit as well -
In [317]: %timeit sparse_matrix_app(rows, columns, n_rows, n_columns).toarray()
100 loops, best of 3: 11.7 ms per loop

Related

How do I vstack or concatenate matricies of different shapes?

In a situation like the one below, how do I vstack the two matrices?
import numpy as np
a = np.array([[3,3,3],[3,3,3],[3,3,3]])
b = np.array([[2,2],[2,2],[2,2]])
a = np.vstack([a, b])
Output:
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 3 and the array at index 1 has size 2
The output I would like would look like this:
a = array([[[3, 3, 3],
            [3, 3, 3],
            [3, 3, 3]],
           [[2, 2],
            [2, 2],
            [2, 2]]])
My goal is then to loop over the contents of the stacked matrices, index each matrix, and call a function on a specific row.
for matrix in a:
    row = matrix[1]
    print(row)
Output:
[3, 3, 3]
[2, 2]
Be careful with those "NumPy is faster" claims. If you already have arrays, and make full use of array methods, NumPy is indeed faster. But if you start with lists, or have to use Python-level iteration (as you do in Pack_Matrices_with_NaN), the NumPy version might well be slower.
Just doing a time test on the Pack step:
In [12]: timeit Pack_Matrices_with_NaN([a,b,c],5)
221 µs ± 9.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Compare that with fetching the first row of each array with a simple list comprehension:
In [13]: [row[1] for row in [a,b,c]]
Out[13]: [array([3., 3., 3.]), array([2., 2.]), array([4., 4., 4., 4.])]
In [14]: timeit [row[1] for row in [a,b,c]]
808 ns ± 2.17 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
200 µs compared to less than 1 µs!
And timing your Unpack:
In [21]: [Unpack_Matrix_with_NaN(packed_matrices.reshape(3,3,5),i)[1,:] for i in range(3)]
...:
Out[21]: [array([3., 3., 3.]), array([2., 2.]), array([4., 4., 4., 4.])]
In [22]: timeit [Unpack_Matrix_with_NaN(packed_matrices.reshape(3,3,5),i)[1,:] for i in range(3)]
199 µs ± 10.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
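To make the comparison concrete, here is the list-based version of the original goal, a minimal sketch assuming the a and b arrays from the question: a plain Python list of differently-shaped arrays supports the same loop with no packing step at all.
import numpy as np

a = np.array([[3., 3., 3.], [3., 3., 3.], [3., 3., 3.]])
b = np.array([[2., 2.], [2., 2.], [2., 2.]])
for matrix in [a, b]:      # no vstack, no NaN padding
    print(matrix[1])       # [3. 3. 3.] then [2. 2.]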
I was able to solve this using only NumPy. As NumPy is significantly faster than Python lists (https://towardsdatascience.com/how-fast-numpy-really-is-e9111df44347), I wanted to share my answer as it might be useful to others.
I started by adding np.NaN to make the two arrays the same shape.
import numpy as np
a = np.array([[3,3,3],[3,3,3],[3,3,3]]).astype(float)
b = np.array([[2,2],[2,2],[2,2]]).astype(float)
# Extend each vector in array with Nan to reach same shape
b = np.insert(b, 2, np.nan, axis=1)
# Now vstack the arrays
a = np.vstack([[a], [b]])
print(a)
Output:
[[[ 3.  3.  3.]
  [ 3.  3.  3.]
  [ 3.  3.  3.]]

 [[ 2.  2. nan]
  [ 2.  2. nan]
  [ 2.  2. nan]]]
Then I wrote a function to unpack each array in a, and remove the nan.
def Unpack_Matrix_with_NaN(Matrix_with_nan, matrix_of_interest):
    for first_row in Matrix_with_nan[matrix_of_interest,:1]:
        # find shape of matrix row without nan
        first_row_without_nan = first_row[~np.isnan(first_row)]
        shape = first_row_without_nan.shape[0]
        matrix_without_nan = np.arange(shape)
    for row in Matrix_with_nan[matrix_of_interest]:
        row_without_nan = row[~np.isnan(row)]
        matrix_without_nan = np.vstack([matrix_without_nan, row_without_nan])
    # Remove vector specifying shape
    matrix_without_nan = matrix_without_nan[1:]
    return matrix_without_nan
I could then loop through the matrices, find my desired row, and print it.
Matrix_with_nan = a
for matrix in range(len(Matrix_with_nan)):
    matrix_of_interest = Unpack_Matrix_with_NaN(a, matrix)
    row = matrix_of_interest[1]
    print(row)
Output:
[3. 3. 3.]
[2. 2.]
I also made a function to pack matrices when more than one nan needs to be added per row:
import numpy as np
a = np.array([[3,3,3],[3,3,3],[3,3,3]]).astype(float)
b = np.array([[2,2],[2,2],[2,2]]).astype(float)
c = np.array([[4,4,4,4],[4,4,4,4],[4,4,4,4]]).astype(float)
# Extend each vector in array with Nan to reach same shape
def Pack_Matrices_with_NaN(List_of_matrices, Matrix_size):
    Matrix_with_nan = np.arange(Matrix_size)
    for array in List_of_matrices:
        start_position = len(array[0])
        for x in range(start_position, Matrix_size):
            array = np.insert(array, (x), np.nan, axis=1)
        Matrix_with_nan = np.vstack([Matrix_with_nan, array])
    Matrix_with_nan = Matrix_with_nan[1:]
    return Matrix_with_nan
arrays = [a,b,c]
packed_matrices = Pack_Matrices_with_NaN(arrays, 5)
print(packed_matrices)
Output:
[[ 3.  3.  3. nan nan]
 [ 3.  3.  3. nan nan]
 [ 3.  3.  3. nan nan]
 [ 2.  2. nan nan nan]
 [ 2.  2. nan nan nan]
 [ 2.  2. nan nan nan]
 [ 4.  4.  4.  4. nan]
 [ 4.  4.  4.  4. nan]
 [ 4.  4.  4.  4. nan]]

Indexing other values in a numpy array during array vectorization (1D) without loops

I have been tasked with applying a "1-2-1" filter to a numpy array and returning an array of the filtered data, without using for loops, while loops, or list comprehensions.
The "1-2-1" filter maps each data point to the average of itself (counted twice) and its two neighbors. For example, if at some point the data contained ...1, 4, 3... then after applying the "1-2-1" filter the 4 would be replaced with (1 + 4 + 4 + 3) / 4 = 12 / 4 = 3.
For example, the numpy array [1, 1, 4, 3, 2] would, after the filter is applied, produce the numpy array [1. 1.75 3. 3. 2. ].
Since the end points of the data do not have two neighbors, the 1-2-1 filter is only applied to the internal len(data) - 2 points, leaving the end points unchanged.
Essentially I need to access the values before and after a given point during numpy array vectorization, for an array that could be of any length, which, as much as I have googled, I cannot work out.
Pandas solution
import pandas as pd

l = [1, 1, 4, 3, 2]
s = pd.Series(l)
>>> s.rolling(3, center=True).sum().add(s).div(4).fillna(s).values
array([1. , 1.75, 3. , 3. , 2. ])
Step by step:
>>> s.rolling(3, center=True).sum().values
array([nan, 6., 8., 9., nan])
>>> s.rolling(3, center=True).sum().add(s).values
array([nan, 7., 12., 12., nan])
>>> s.rolling(3, center=True).sum().add(s).div(4).values
array([ nan, 1.75, 3. , 3. , nan])
>>> s.rolling(3, center=True).sum().add(s).div(4).fillna(s).values
array([1. , 1.75, 3. , 3. , 2. ])
Numpy solution
a = np.array(l)
>>> np.concatenate([a[:1], np.convolve(a, [1, 2, 1], mode="valid") / 4, a[-1:]])
Step by step:
>>> np.convolve(a, [1, 2, 1], mode="valid")
array([ 7, 12, 12])
>>> np.convolve(a, [1, 2, 1], mode="valid") / 4
array([1.75, 3. , 3. ])
>>> np.concatenate([a[:1], np.convolve(a, [1, 2, 1], mode="valid") / 4, a[-1:]])
array([1. , 1.75, 3. , 3. , 2. ])
Performance
%timeit s.rolling(3, center=True).sum().add(s).div(4).fillna(s).values
706 µs ± 4.62 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit np.concatenate([a[:1], np.convolve(a, [1, 2, 1], mode="valid") / 4, a[-1:]])
10.8 µs ± 22.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
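For completeness, the same filter can also be written with shifted slices, which is one way to "access the values before and after a given point" without a loop; a minimal sketch:
import numpy as np

a = np.array([1, 1, 4, 3, 2])
out = a.astype(float)                           # endpoints stay unchanged
out[1:-1] = (a[:-2] + 2*a[1:-1] + a[2:]) / 4    # left + 2*self + right, averaged
print(out)                                      # [1.   1.75 3.   3.   2.  ]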
You can try something like this:
import numpy as np
from scipy.linalg import circulant
x = np.array([1, 1, 4, 3, 2])
val = np.array([1, 2, 1])
offsets = np.array([0, 1, 2])
col0 = np.zeros(len(x))
col0[offsets] = val
C = circulant(col0).T[:-(len(val) - 1)]
print(C)
C is essentially a circulant matrix which looks like this:
array([[1., 2., 1., 0., 0.],
       [0., 1., 2., 1., 0.],
       [0., 0., 1., 2., 1.]])
Now you can simply compute the filtered output as follows:
y = (C @ x) / 4
print(y)
# array([1.75, 3. , 3. ])
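Note that the circulant product only covers the interior points. To match the expected 5-element output, the unchanged endpoints can be reattached, e.g.:
# Reattach the untouched endpoints around the filtered interior
y_full = np.concatenate([x[:1], y, x[-1:]])
print(y_full)  # [1.   1.75 3.   3.   2.  ]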

Iterate over an array and apply it to a function

In Python, I iterate over every column in P, which is a 4x4 array, with the function "q", like:
P = np.array([[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 0.5, 0], [0, 0, 0.5, 0]])
def q(q_u):
    q = np.array(
        [
            [np.dot(0, 2, q_u)],
            [np.zeros((4, 1), dtype=int)],
            [np.zeros((2, 1), dtype=int)],
        ],
        dtype=object,
    )
    return q
np.apply_along_axis(q, axis=0, arr=P)
I get a (3,1,4) array when applying the q function to the P array. This is correct. But how is it possible to save the 4 (3,1) arrays, and later fetch them from a dictionary, to apply each to another function printR, which needs a (3,1) array?
printR(60, res, q)
Should I add the 4 arrays to a dictionary in order to iterate with printR, or is there another method?
Use transpose and zip to create the dictionary.
To create 4 arrays of shape (1,3), simply pass them to dict:
arr = np.apply_along_axis(q, axis=0, arr=P)
d = dict(zip(range(arr.size), arr.T))
Out[259]:
{0: array([[0, array([[0],
[0],
[0],
[0]]),
array([[0],
[0]])]], dtype=object), 1: array([[0, array([[0],
[0],
[0],
[0]]),
array([[0],
[0]])]], dtype=object), 2: array([[0, array([[0],
[0],
[0],
[0]]),
array([[0],
[0]])]], dtype=object), 3: array([[0, array([[0],
[0],
[0],
[0]]),
array([[0],
[0]])]], dtype=object)}
In [260]: d[0].shape
Out[260]: (1, 3)
To create 4 of (3,1), use dict comprehension
d = {k: v.T for k, v in zip(range(arr.size), arr.T)}
Out[269]:
{0: array([[0],
[array([[0],
[0],
[0],
[0]])],
[array([[0],
[0]])]], dtype=object), 1: array([[0],
[array([[0],
[0],
[0],
[0]])],
[array([[0],
[0]])]], dtype=object), 2: array([[0],
[array([[0],
[0],
[0],
[0]])],
[array([[0],
[0]])]], dtype=object), 3: array([[0],
[array([[0],
[0],
[0],
[0]])],
[array([[0],
[0]])]], dtype=object)}
In [270]: d[0].shape
Out[270]: (3, 1)
Note: I intentionally use arr.size to let zip trim the pairs based solely on the length of arr.T
Correcting the puzzling dot to
[np.dot(0.2, q_u)],
produces the output shown in your other question.
I still wonder why you insist on using apply_along_axis. It doesn't have any speed benefits. Compare these timings:
In [36]: timeit np.apply_along_axis(q, axis=0, arr=P)
141 µs ± 112 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [37]: timeit np.stack([q(P[:,i]) for i in range(P.shape[1])], axis=2)
72.1 µs ± 500 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [38]: timeit [q(P[:,i]) for i in range(P.shape[1])]
53 µs ± 42.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
That dot(0.2, q_u) line just does 0.2*q_u, which, applied to P, can be 0.2*P or 0.2*P.T.
Let's change q to omit the size 1 dimensions, to make a more compact display:
In [49]: def q1(q_u):
...: q = np.array(
...: [
...: np.dot(0.2, q_u),
...: np.zeros((4,), dtype=int),
...: np.zeros((2,), dtype=int),
...: ],
...: dtype=object,
...: )
...: return q
...:
In [50]: np.apply_along_axis(q1, axis=0, arr=P)
Out[50]:
array([[array([0.2, 0. , 0. , 0. ]), array([0. , 0.2, 0. , 0. ]),
array([0. , 0. , 0.1, 0.1]), array([0.2, 0. , 0. , 0. ])],
[array([0, 0, 0, 0]), array([0, 0, 0, 0]), array([0, 0, 0, 0]),
array([0, 0, 0, 0])],
[array([0, 0]), array([0, 0]), array([0, 0]), array([0, 0])]],
dtype=object)
In [51]: _.shape
Out[51]: (3, 4)
We can generate the same numbers, arranged slightly differently with:
In [52]: [0.2 * P.T, np.zeros((4,4),int), np.zeros((4,2),int)]
Out[52]:
[array([[0.2, 0. , 0. , 0. ],
[0. , 0.2, 0. , 0. ],
[0. , 0. , 0.1, 0.1],
[0.2, 0. , 0. , 0. ]]),
array([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]]),
array([[0, 0],
[0, 0],
[0, 0],
[0, 0]])]
You are making 3 2d arrays, each with one row per column of P.
The list comprehension that I timed in [38] produces 4 arrays of size (3,), that is, one array per column of P. apply_along_axis obscures that, joining them on a last dimension (as my stack with axis=2 does).
In [53]: [q1(P[:,i]) for i in range(P.shape[1])]
Out[53]:
[array([array([0.2, 0. , 0. , 0. ]), array([0, 0, 0, 0]), array([0, 0])],
dtype=object),
array([array([0. , 0.2, 0. , 0. ]), array([0, 0, 0, 0]), array([0, 0])],
dtype=object),
array([array([0. , 0. , 0.1, 0.1]), array([0, 0, 0, 0]), array([0, 0])],
dtype=object),
array([array([0.2, 0. , 0. , 0. ]), array([0, 0, 0, 0]), array([0, 0])],
dtype=object)]
The list comprehension is not only fast, but it also keeps the q output 'intact', making it easier to pass on to another function.
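For instance, the dictionary the question asks for falls out of it directly; a minimal sketch assuming q1 and P as defined above:
# One object array of shape (3,) per column of P, keyed by column index
d = {i: q1(P[:, i]) for i in range(P.shape[1])}
print(d[0].shape)  # (3,)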

Numpy multiply 3d matrix by 2d matrix

For example, I have matrix A of shape (3,2,2), e.g.
[
[[1,1],[1,1]],
[[2,2],[2,2]],
[[3,3],[3,3]]
]
and matrix B of shape (2,2), e.g.
[[1, 1], [0,1]]
I would like to achieve c of shape (3,2,2) like:
c = np.zeros((3,2,2))
for i in range(len(A)):
    c[i] = np.dot(B, A[i,:,:])
which gives
[[[2. 2.]
  [1. 1.]]

 [[4. 4.]
  [2. 2.]]

 [[6. 6.]
  [3. 3.]]]
What is the most efficient way to achieve this?
Thanks.
Use np.tensordot and then swap axes. So, use one of these -
np.tensordot(B,A,axes=((1),(1))).swapaxes(0,1)
np.tensordot(A,B,axes=((1),(1))).swapaxes(1,2)
We can reshape A to 2D after swapping axes, use 2D matrix multiplication with np.dot, and then reshape and swap axes back, to maybe gain a marginal performance boost.
Timings -
# Original approach
def orgapp(A,B):
    m = A.shape[0]
    n = B.shape[0]
    r = A.shape[2]
    c = np.zeros((m,n,r))
    for i in range(len(A)):
        c[i] = np.dot(B, A[i,:,:])
    return c
In [91]: n = 10000
...: A = np.random.rand(n,2,2)
...: B = np.random.rand(2,2)
In [92]: %timeit orgapp(A,B)
100 loops, best of 3: 12.2 ms per loop
In [93]: %timeit np.tensordot(B,A,axes=((1),(1))).swapaxes(0,1)
1000 loops, best of 3: 191 µs per loop
In [94]: %timeit np.tensordot(A,B,axes=((1),(1))).swapaxes(1,2)
1000 loops, best of 3: 208 µs per loop
# @Bitwise's solution
In [95]: %timeit np.flip(np.dot(A,B).transpose((0,2,1)),1)
1000 loops, best of 3: 697 µs per loop
Another solution:
np.flip(np.dot(A,B).transpose((0,2,1)),1)
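On NumPy 1.10+, matmul broadcasting offers yet another route: B @ A broadcasts B across the leading axis of A, so each A[i] is left-multiplied by B, exactly like the original loop. A minimal sketch using the sample data from the question:
import numpy as np

A = np.array([[[1, 1], [1, 1]],
              [[2, 2], [2, 2]],
              [[3, 3], [3, 3]]])
B = np.array([[1, 1], [0, 1]])
c = B @ A                                # same as np.matmul(B, A)
# equivalently: c = np.einsum('ij,njk->nik', B, A)
print(c[0])                              # [[2 2]
                                         #  [1 1]]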

Assign to array, adding multiple copies of index

So I have this array, right?
a=np.zeros(5)
I want to add values to it at the given indices, where indices can be duplicates.
e.g.
a[[1, 2, 2]] += [1, 2, 3]
I want this to produce array([ 0., 1., 5., 0., 0.]), but the answer I get is array([ 0., 1., 3., 0., 0.]).
I'd like this to work with multidimensional arrays and broadcastable indices and all that. Any ideas?
You need to use np.add.at to get around the buffering issue that you encounter with += (values are not accumulated at repeated indices). Specify the array, the indices, and the values to add in place at those indices:
>>> a = np.zeros(5)
>>> np.add.at(a, [1, 2, 2], [1, 2, 3])
>>> a
array([ 0., 1., 5., 0., 0.])
at is part of other ufuncs too (multiply, divide, and so on). This method will also work for multidimensional arrays.
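A quick 2D sketch of that, with illustrative values: a tuple of index arrays addresses (row, column) pairs, and repeated pairs accumulate just as in the 1D case.
import numpy as np

m = np.zeros((2, 3))
np.add.at(m, ([0, 0, 1], [2, 2, 0]), [1, 1, 5])  # (0,2) appears twice
print(m)
# [[0. 0. 2.]
#  [5. 0. 0.]]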
The operation you are performing can be viewed as binning; more specifically, you are doing weighted binning, with those values being the weights and the indices being the bins. For such a binning operation, you can use np.bincount.
Here's the implementation -
import numpy as np
a=np.zeros(5) # initialize output array
idx = [1, 2, 2] # indices
vals = [1, 2, 3] # values
a[:max(idx)+1] = np.bincount(idx,vals) # finally store the bincounts
Runtime tests
Here are some runtime tests for two input data sizes, comparing the proposed bincount-based approach and the add.at-based approach listed in the other answer:
Datasize #1 -
In [251]: a=np.zeros(1000)
...: idx = np.sort(np.random.randint(1,1000,(500))).tolist()
...: vals = np.random.rand(500).tolist()
...:
In [252]: %timeit np.add.at(a, idx, vals)
10000 loops, best of 3: 63.4 µs per loop
In [253]: %timeit a[:max(idx)+1] = np.bincount(idx,vals)
10000 loops, best of 3: 42.4 µs per loop
Datasize #2 -
In [254]: a=np.zeros(10000)
...: idx = np.sort(np.random.randint(1,10000,(5000))).tolist()
...: vals = np.random.rand(5000).tolist()
...:
In [255]: %timeit np.add.at(a, idx, vals)
1000 loops, best of 3: 597 µs per loop
In [256]: %timeit a[:max(idx)+1] = np.bincount(idx,vals)
1000 loops, best of 3: 404 µs per loop
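One caveat worth noting about the bincount assignment above: = overwrites the slice, which is only correct because a starts as zeros. If a may already hold values, accumulate instead; a hedged variant:
a = np.ones(5)
idx = [1, 2, 2]
vals = [1, 2, 3]
a[:max(idx)+1] += np.bincount(idx, vals)  # add the binned sums instead of assigning
print(a)  # [1. 2. 6. 1. 1.]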
