Add values in numpy array successively, without looping [duplicate] - python

This may have been asked before, but I can't find it.
Sometimes I have an index array I, and I want to add the values of another array into a NumPy array at the positions given by that index, accumulating when a position repeats. For example:
A = np.array([1,2,3])
B = np.array([10,20,30])
I = np.array([0,1,1])
for i in range(len(I)):
    A[I[i]] += B[i]
print(A)
prints the expected (correct) value:
[11 52 3]
while
A[I] += B
print(A)
results in the expected (wrong) answer
[11 32 3].
Is there any way to do what I want in a vectorized way, without the loop?
If not, which is the fastest way to do this?

Use numpy.add.at:
>>> import numpy as np
>>> A = np.array([1,2,3])
>>> B = np.array([10,20,30])
>>> I = np.array([0,1,1])
>>>
>>> np.add.at(A, I, B)
>>> A
array([11, 52, 3])
Alternatively, np.bincount:
>>> A = np.array([1,2,3])
>>> B = np.array([10,20,30])
>>> I = np.array([0,1,1])
>>>
>>> A += np.bincount(I, B, minlength=A.size).astype(int)
>>> A
array([11, 52, 3])
Which is faster?
Depends. In this concrete example add.at seems marginally faster, presumably because we need to convert types in the bincount solution.
If OTOH A and B were float dtype then bincount would be faster.
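As a rough sketch of the float case mentioned above, here are both approaches side by side. The helper names accumulate_add_at and accumulate_bincount are made up for illustration, and no timings are shown because they depend on array sizes and the NumPy version. The bincount variant assumes I holds non-negative integers smaller than A.size.

```python
import numpy as np

def accumulate_add_at(A, I, B):
    """Return a copy of A with B[k] added at position I[k]; repeats accumulate."""
    out = A.copy()
    np.add.at(out, I, B)  # unbuffered, so repeated indices are summed
    return out

def accumulate_bincount(A, I, B):
    """Same result via bincount, which sums the weights B per index value.

    Assumes every entry of I is a non-negative integer below A.size.
    """
    return A + np.bincount(I, weights=B, minlength=A.size)

A = np.array([1.0, 2.0, 3.0])
B = np.array([10.0, 20.0, 30.0])
I = np.array([0, 1, 1])

result_add_at = accumulate_add_at(A, I, B)
result_bincount = accumulate_bincount(A, I, B)
print(result_add_at)
print(result_bincount)
```

With float inputs, bincount already returns a float array, so the astype(int) conversion from the integer example is not needed.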

You need to use np.add.at:
A = np.array([1,2,3])
B = np.array([10,20,30])
I = np.array([0,1,1])
np.add.at(A, I, B)
print(A)
prints
[11 52  3]
This is noted in the doc:
ufunc.at(a, indices, b=None)
Performs unbuffered in place operation on operand ‘a’ for elements specified by ‘indices’. For addition ufunc, this method is equivalent to a[indices] += b, except that results are accumulated for elements that are indexed more than once. For example, a[[0,0]] += 1 will only increment the first element once because of buffering, whereas add.at(a, [0,0], 1) will increment the first element twice.
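The example from the doc can be checked directly. This is a small sketch; the variable names buffered and unbuffered are only for illustration.

```python
import numpy as np

# Fancy-index assignment: index 0 appears twice but is incremented only once.
a = np.zeros(3, dtype=int)
a[[0, 0]] += 1
buffered = a.copy()

# ufunc.at: every occurrence of index 0 is applied, so it is incremented twice.
a = np.zeros(3, dtype=int)
np.add.at(a, [0, 0], 1)
unbuffered = a.copy()

print(buffered)
print(unbuffered)
```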

Related

Create a numpy array of arrays of similar shapes (from list) [duplicate]

I want to create an array with dtype=object (np.object in older NumPy versions), where each element is an array with a numerical type, e.g. int or float. For example:
>>> a = np.array([1,2,3])
>>> b = np.empty(3,dtype=object)
>>> b[0] = a
>>> b[1] = a
>>> b[2] = a
Creates what I want:
>>> print(b.dtype)
object
>>> print(b.shape)
(3,)
>>> print(b[0].dtype)
int64
but I am wondering whether there isn't a way to write lines 3 to 6 in one line (especially since I might want to concatenate 100 arrays). I tried
>>> b = np.array([a,a,a],dtype=object)
but this builds a 2-D array and converts all the elements to object:
>>> print(b.dtype)
object
>>> print(b.shape)
(3, 3)
>>> print(b[0].dtype)
object
Does anyone have any ideas how to avoid this?
It's not exactly pretty, but...
import numpy as np
a = np.array([1,2,3])
# dtype=object is required on recent NumPy versions for a ragged input like this
b = np.array([None, a, a, a], dtype=object)[1:]
print(b.dtype, b[0].dtype, b[1].dtype)
# object int32 int32
a = np.array([1,2,3])
b = np.empty(3, dtype='O')
b[:] = [a] * 3
should suffice.
I can't find an elegant solution, but a more general alternative to doing everything by hand is to declare a function of the form:
def object_array(*args):
    # Preallocate a 1-D object array, then store each argument as one element.
    array = np.empty(len(args), dtype=object)
    for i in range(len(args)):
        array[i] = args[i]
    return array
I can then do:
a = np.array([1,2,3])
b = object_array(a,a,a)
I then get:
>>> a = np.array([1,2,3])
>>> b = object_array(a,a,a)
>>> print(b.dtype)
object
>>> print(b.shape)
(3,)
>>> print(b[0].dtype)
int64
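To make the difference concrete, here is a small sketch contrasting the direct constructor call with the preallocate-and-fill approach. The names stacked and nested are only illustrative, and the behavior described is what I would expect from the NumPy versions I know of.

```python
import numpy as np

a = np.array([1, 2, 3])

# Equal-length arrays passed to np.array are merged into one 2-D object array.
stacked = np.array([a, a, a], dtype=object)

# Preallocating and filling keeps each element as its own ndarray.
nested = np.empty(3, dtype=object)
for k, arr in enumerate([a, a, a]):
    nested[k] = arr

print(stacked.shape)
print(nested.shape)
print(type(nested[0]), nested[0].dtype)
```

Note that nested stores references, so all three elements here point at the same array a; copy the inputs first if they need to be independent.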
I think anyarray is what you need here:
b = np.asanyarray([a,a,a])
>>> b[0].dtype
dtype('int32')
not sure what happened to the other 32bits of the ints though.
Not sure if it helps but if you add another array of a different shape, it converts back to the types you want:
import numpy as np
a = np.array([1,2,3])
b = np.array([1,2,3,4])
b = np.asarray([a,b,a], dtype=object)
print(b.dtype)
>>> object
print(b[0].dtype)
>>> int32

Why does it print array([ ]) when in a list when using an array to iterate and pull elements from?

Looking to see if I could use comprehensions or array operators, instead of for loops.
import numpy as np
a=[[1,2],[3,4]]
b=np.array(a)
c=[[x*z for x in z] for z in b[0:1]]
print(c)
OUTPUT = [[array([1, 2]), array([2, 4])]]
I want a list or array equal to [2,12], i.e. the first element times the second element for each row of the array.
I can convert the list to a 1D array afterwards.
I want it to work in the general case, for any 2-dimensional array.
Look at the action - step by step:
In [170]: b.shape
Out[170]: (2, 2)
In [171]: b[0:1]
Out[171]: array([[1, 2]]) # (1,2) array
In [172]: [z for z in b[0:1]]
Out[172]: [array([1, 2])] # iteration on 1st, size 1 dimension
In [173]: [[x for x in z] for z in b[0:1]]
Out[173]: [[1, 2]]
In [174]: [[x*z for x in z] for z in b[0:1]]
Out[174]: [[array([1, 2]), array([2, 4])]]
So you are doing [1*np.array([1,2]), 2*np.array([1,2])]
With the b[0:1] slicing you aren't even touching the 2nd row of b.
But a simpler list comprehension does:
In [175]: [i*j for i,j in b] # this iterates on the rows of b
Out[175]: [2, 12]
or
In [176]: b[:,0]*b[:,1]
Out[176]: array([ 2, 12])
The easiest way is to use the prod function in numpy.
from numpy import prod
a = [[1,2],[3,4]]
b = [prod(x) for x in a]
print(b)
Output:
[2,12]
To answer the question in parts.
Why does it print array([ ])... ?
The NumPy array you use is a class with a __repr__ or a __str__ method, which decides what you see when you pass it to print.
In NumPy's array case, something along the lines of:
def __repr__(self):
    return self.__class__.__name__ + "(" + repr(self.array) + ")"
... when in a list using an array...
The list or dict calls the child elements' __repr__ methods in its own __repr__ method.
...and pull elements from?
In your inner x*z you are multiplying x, a number (1 or 2), by z, the row array([1, 2]) taken from b[0:1] = array([[1, 2]]). The result of each product is an array, so you get the row multiplied by each of its own elements as a scalar: [[1*1, 1*2], [2*1, 2*2]].
Another solution to your problem (though the already mentioned prod will surely be much faster ;)):
import numpy as np
a=[[1,2],[3,4]]
b=np.array(a)
c=[[z1*z2] for z1, z2 in b]
print(c)
You can just multiply the first column with the second one:
c = b[:, 0] * b[:, 1]
Or you can use np.multiply:
c = np.multiply.reduce(b[:, :2], axis=1)
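For the general case asked about in the question (any number of columns), a minimal sketch using np.prod along axis 1; the array named wide is just an extra illustration.

```python
import numpy as np

b = np.array([[1, 2], [3, 4]])

# Multiply all elements within each row, whatever the number of columns.
row_products = np.prod(b, axis=1)
print(row_products)  # [ 2 12]

wide = np.array([[1, 2, 3], [4, 5, 6]])
wide_products = np.prod(wide, axis=1)
print(wide_products)
```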

Why numpy returns true index while comparing integer and float value from two different list

How can I avoid getting an index back for 10 when the array only contains 10.5?
A = np.array([1,2,3,4,5,6,7,8,9,10.5])
B = np.array([1,7,10])
i = np.searchsorted(A,B)
print(i) # [0 6 9]
I want to get the places of the exact matches: [0 6]
You could call np.searchsorted with both 'left' and 'right' and keep only the entries where the two calls return different indices:
>>> import numpy as np
>>> A = np.array([1,2,3,4,5,6,7,8,9,10.5])
>>> B = np.array([1,7,10])
>>> i = np.searchsorted(A, B, 'left')
>>> j = np.searchsorted(A, B, 'right')
>>> i[i!=j]
array([0, 6], dtype=int64)
That works because searchsorted returns the index at which each element would have to be inserted to keep the array sorted. When the value is present in the array, 'left' returns the index of the first match and 'right' returns the index just after the last match. So if the two indices differ there is an exact match, and if they are the same there is none.
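The left/right comparison can be wrapped in a small helper. This is a sketch; the function name exact_match_indices is made up here, and it assumes the first argument is already sorted.

```python
import numpy as np

def exact_match_indices(sorted_values, queries):
    """Return positions in sorted_values of the queries that occur exactly.

    Assumes sorted_values is sorted in ascending order.
    """
    left = np.searchsorted(sorted_values, queries, side="left")
    right = np.searchsorted(sorted_values, queries, side="right")
    # left == right means the query would be inserted without touching
    # an equal element, i.e. it is not present.
    return left[left != right]

A = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10.5])
B = np.array([1, 7, 10])
matches = exact_match_indices(A, B)
print(matches)
```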
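Since A is already sorted, searchsorted can be applied to it directly; the sorter argument is only needed when A is not sorted:
A = np.array([1,2,3,4,5,6,7,8,9,10.5])
B = np.array([1,7,10])
i = np.searchsorted(A,B)
then in1d can find the exact correspondence (this assumes every value in B is no larger than the last element of A, otherwise A[i] is out of bounds):
j = i[np.in1d(A[i],B,True)]
# [0 6]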

Using searchsorted for 2D array

I would like an index-like function: np.searchsorted([1,2,3,4,5], 3) returns 2, and in the 2-D case np.searchsorted([[1,2],[3,4],[5,6]], [5,6]) should also return 2. How can I do that?
a = np.array([[1,2],[3,4],[5,6]])
b = np.array([5,6])
np.where(np.all(a==b,axis=1))
The documentation for searchsorted indicates that it only works on 1-D arrays.
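The np.where expression above can be wrapped in a helper that returns plain row indices. This is a sketch with a made-up name, find_row; note that it is a linear scan over all rows rather than a binary search, so it does not require the rows to be sorted.

```python
import numpy as np

def find_row(table, row):
    """Return the indices of the rows of 2-D `table` that equal 1-D `row`."""
    table = np.asarray(table)
    row = np.asarray(row)
    # Compare every row elementwise, then keep rows where all columns match.
    return np.flatnonzero(np.all(table == row, axis=1))

a = np.array([[1, 2], [3, 4], [5, 6]])
hits = find_row(a, [5, 6])
misses = find_row(a, [7, 8])
print(hits)
print(misses)
```

An empty result means the row is not present, which is different from searchsorted, where an insertion index is always returned.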
You can find the index location in a list using built-in methods:
>>> a = [[1,2],[3,4],[5,6]]
>>> a.index([5,6])
2
>>> a = [[1,2],[5,6],[3,4]]
>>> print(a.index([5,6]))
1
>>> a.sort()
>>> print(a.index([5,6]))
2

adding rows in python to table

I know in R there is a rbind function:
list = c(1,2,3)
blah = NULL
blah = rbind(blah,list)
How would I replicate this in python? I know you could possibly write:
a = NULL
b = np.array([1,2,3])
for i in range(100):
    a = np.vstack((a,b))
but I am not sure what to write in the a=NULL spot. I am essentially looping and adding rows to a table. What's the most efficient way to do this?
In numpy, things will be more efficient if you first pre-allocate space and then loop to fill that space than if you dynamically create successively larger arrays. If, for instance, the size is 500, you would:
a = np.empty((500, b.shape[0]))
Then, loop and enter values as needed:
for i in range(500):
    a[i,:] = ...
Note that if you really just want to repeat b 500 times, you can just do:
In [1]: import numpy as np
In [2]: b = np.array([1,2,3])
In [3]: a = np.empty((500, b.shape[0]))
In [4]: a[:] = b
In [5]: a[0,:] == b
Out[5]: array([ True, True, True], dtype=bool)
To answer your question exactly
import numpy as np
a = []
b = np.array([1,2,3])
for i in range(100):
    a.append(b)
a = np.vstack( tuple(a) )
The tuple function casts an iterable (in this case a list) as a tuple object, and np.vstack takes a tuple of numpy arrays as argument.
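For comparison, here is a sketch of both suggestions from this thread producing the same table. The per-row computation b * i is a hypothetical stand-in for whatever each loop iteration would really produce.

```python
import numpy as np

b = np.array([1, 2, 3])
n_rows = 100

# Option 1: preallocate the full table, then fill it row by row.
prealloc = np.empty((n_rows, b.shape[0]), dtype=b.dtype)
for i in range(n_rows):
    prealloc[i, :] = b * i  # hypothetical per-row result

# Option 2: collect rows in a Python list and stack them once at the end.
rows = [b * i for i in range(n_rows)]
stacked = np.vstack(rows)

print(prealloc.shape, stacked.shape)
```

Both avoid calling np.vstack inside the loop, which would copy the growing array on every iteration.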
