how to broadcast certain processing in numpy arrays - python

Let's say I have a NumPy array:
import numpy as np
a = np.array([[1,2,3],[3,4,5]])
and I have defined a function to process the data, e.g., to get the product of a vector:
def vp(v):
    p = 1
    for i in v:
        p = p * i
    return p
How can I easily broadcast this function to all the vectors in a, with something like the map function for lists, so that e.g. vp(a) gives me [6, 60]?
What if a is a 3D or even 4D array? Is there a good way to broadcast such a custom function?

I think the core of your question is how to apply a custom function over multidimensional datasets. For that you would use numpy.apply_along_axis().
a = np.array([[1, 2, 3],
              [3, 4, 5]])
np.apply_along_axis(arr = a, func1d=vp, axis=1)
> array([ 6, 60])
Yes, this also works with N-dimensional datasets.
c = np.array([[[ 1,  2],
               [ 3,  3],
               [ 4,  5]],
              [[16, 17],
               [18, 18],
               [19, 20]]])
np.apply_along_axis(vp, axis=1, arr=c)
> array([[ 12, 30],
[5472, 6120]])
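As a minimal sketch of the 3D/4D case, assuming the vectors of interest lie along the last axis: the 4D array d below is made up purely for illustration. Note that np.apply_along_axis still calls the Python function once per vector, so for a plain product the built-in np.prod with an axis argument should give the same result without that per-vector overhead.

```python
import numpy as np

def vp(v):
    # Product of the elements of a 1-D vector (same function as in the question).
    p = 1
    for i in v:
        p = p * i
    return p

a = np.array([[1, 2, 3], [3, 4, 5]])
# Hypothetical 4D input, only for illustration.
d = np.arange(1, 25).reshape(2, 3, 2, 2)

# axis=-1 applies vp to every vector along the last axis.
res_a = np.apply_along_axis(vp, -1, a)
res_d = np.apply_along_axis(vp, -1, d)

print(res_a)        # [ 6 60]
print(res_d.shape)  # (2, 3, 2)

# np.prod computes the same reduction along the same axis.
assert np.array_equal(res_d, np.prod(d, axis=-1))
```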

Related

Numpy: Replacing part of array by another entire array based on selected index

I have two NumPy arrays A and B and one index list, where:
A = np.arange(12).reshape(3,2,2)
>>> A
array([[[ 0, 1],
[ 2, 3]],
[[ 4, 5],
[ 6, 7]],
[[ 8, 9],
[10, 11]]])
>>> B:
array([[[10, 10],
[10, 10]],
[[20, 20],
[20, 20]]])
index_list = [0,2]
I would like to replace the sub-arrays of A at the positions given by index_list (here 0 and 2, indexing A along axis=0) with the corresponding sub-arrays of B:
# desired output:
array([[[ 10, 10],
[ 10, 10]],
[[ 4, 5],
[ 6, 7]],
[[ 20, 20],
[20, 20]]])
I have been thinking about implementing this process in a more efficient way, but only come up with two loops:
row_count = 0
for i,e in enumerate(A):
    for idx in index_list:
        if i == idx:
            A[i,:,:] = B[row_count,:,:]
            row_count += 1
which is computationally expensive because I have two huge np arrays to process. I would appreciate any ideas on how to implement this in a vectorized or otherwise more efficient way. Thanks!
Have you tried just assigning like this:
A[index_list[0]] = B[0]
A[index_list[1]] = B[1]
I guess if you had a bigger number of indexes in the index_list, and more to replace, you could then make a loop.
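The per-index assignments above can also be written as one indexed assignment. This is a minimal sketch, assuming index_list has as many entries as B has sub-arrays and that its entries are sorted and unique, so that B[k] goes to A[index_list[k]] exactly as in the loop from the question. B is built here to match the values displayed in the question.

```python
import numpy as np

A = np.arange(12).reshape(3, 2, 2)
B = np.array([[[10, 10], [10, 10]],
              [[20, 20], [20, 20]]])
index_list = [0, 2]

# Integer-array indexing on axis 0: A[0] <- B[0], A[2] <- B[1].
A[index_list] = B
print(A)
```

This works for any number of indices without an explicit Python loop.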

Is there an array method for testing multiple equality values?

I want to know where array a is equal to any of the values in array b.
For example,
a = np.random.randint(0,16, size=(3,4))
b = np.array([2,3,9])
# like this, but for any size b:
locations = np.nonzero((a==b[0]) | (a==b[1]) | (a==b[2]))
The reason is so I can change the values in a from (any of b) to another value:
a[locations] = 99
Or-ing the equality checks is not a great solution, because I would like to do this without knowing the size of b ahead of time. Is there an array solution?
[edit]
There are now 2 good answers to this question, one using broadcasting with extra dimensions, and another using np.in1d. Both work for the specific case in this question. I ended up using np.isin instead, since it seems like it is more agnostic to the shapes of both a and b.
I accepted the answer that taught me about in1d since that led me to my preferred solution.
You can use np.in1d then reshape back to a's shape so you can set the values in a to your special flag.
import numpy as np
np.random.seed(410012)
a = np.random.randint(0, 16, size=(3, 4))
#array([[ 8, 5, 5, 15],
# [ 3, 13, 8, 10],
# [ 3, 11, 0, 10]])
b = np.array([[2,3,9], [4,5,6]])
a[np.in1d(a, b).reshape(a.shape)] = 999
#array([[ 8, 999, 999, 15],
# [999, 13, 8, 10],
# [999, 11, 0, 10]])
Quoting the question: "Or-ing the equality checks is not a great solution, because I would like to do this without knowing the size of b ahead of time."
EDIT:
Vectorized equivalent to the code you have written above -
a = np.random.randint(0,16, size=(3,4))
b = np.array([2,3,9])
locations = np.nonzero((a==b[0]) | (a==b[1]) | (a==b[2]))
locations2 = np.nonzero((a[None,:,:]==b[:,None,None]).any(0))
np.allclose(locations, locations2)
True
This shows that your output is exactly the same as this output, without the need of explicitly mentioning b[0], b[1]... or using a for loop.
Explanation -
Broadcasting an operation can help you in this case. What you are trying to do is compare each element of the (3,4) matrix to each value in b, which has shape (3,). This means that the resulting boolean array is a stack of three (3,4) matrices, i.e. it has shape (3,3,4).
Once you have done that, you want to take an ANY (logical OR) across the three (3,4) matrices element-wise. That reduces the (3,3,4) array to a (3,4) array.
Finally, you want to use np.nonzero to identify the locations where the values are True.
The above 3 steps can be done as follows -
Broadcasting comparison operation:
a[None,:,:]==b[:,None,None] #(1,3,4) == (3,1,1) -> (3,3,4)
Reduction using OR logic:
(a[None,:,:]==b[:,None,None]).any(0) #(3,3,4) -> (3,4)
Get non-zero locations:
np.nonzero((a[None,:,:]==b[:,None,None]).any(0))
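The three steps above can be sketched as one small helper that does not depend on the number of dimensions of a. The name any_equal is hypothetical, and flattening b with ravel() is an assumption made so that b may have any shape. The intermediate boolean array has a.size * b.size elements, so this is only reasonable when b is small.

```python
import numpy as np

def any_equal(a, b):
    # Hypothetical helper: boolean mask with the shape of a,
    # True where an element of a equals any value in b.
    a = np.asarray(a)
    b = np.asarray(b).ravel()
    # a[..., None] has shape a.shape + (1,); comparing with b of shape (k,)
    # broadcasts to a.shape + (k,). any(axis=-1) ORs over the values of b.
    return (a[..., None] == b).any(axis=-1)

a = np.array([[8, 5, 5, 15],
              [3, 13, 8, 10],
              [3, 11, 0, 10]])
b = np.array([2, 3, 9])

mask = any_equal(a, b)
a[mask] = 99
print(a)
```

Putting the new axis last instead of first keeps the mask aligned with a regardless of how many dimensions a has.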
numpy.isin works on multi-dimensional a and b.
In [1]: import numpy as np
In [2]: a = np.random.randint(0, 16, size=(3, 4)); a
Out[2]:
array([[12, 2, 15, 11],
[12, 15, 5, 10],
[ 4, 2, 14, 7]])
In [3]: b = [2, 4, 5, 12]
In [4]: c = [[2, 4], [5, 12]]
In [5]: np.isin(a, b).astype(int)
Out[5]:
array([[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 1, 0, 0]])
In [6]: np.isin(a, c).astype(int)
Out[6]:
array([[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 1, 0, 0]])
In [7]: a[np.isin(a, b)] = 99; a
Out[7]:
array([[99, 99, 15, 11],
[99, 15, 99, 10],
[99, 99, 14, 7]])

How to replace values in one numpy array with values from another array using indices

I have two arrays and one list with indices
Array source (array 01):
x = np.array([[1,2,3], [4,5,6], [7,8,9]])
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Array with new values (array 02):
y = np.array([11, 22, 33])
array([11, 22, 33])
and a list with indices
a = [1,2,1]
[1, 2, 1]
I need to write the values of array 02 into array 01, one per row, at the column positions given by the indices.
The return expected:
array([[1, 11, 3],
[4, 5, 22],
[7, 33, 9]])
How to do this without loop?
Thanks in advance.
Try advanced indexing:
x[np.arange(len(x)), a] = y
I found also this tutorial easier for me than official documentation.
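To spell out what the one-liner does, here is a short sketch using the arrays from the question. The variable name rows is introduced only for illustration. The two index arrays are paired element by element, so each pair (rows[k], a[k]) selects exactly one element of x.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
y = np.array([11, 22, 33])
a = [1, 2, 1]

rows = np.arange(len(x))  # [0, 1, 2], one row index per entry of a
# The pairs (0, 1), (1, 2), (2, 1) are the positions that receive y[0], y[1], y[2].
x[rows, a] = y
print(x)
```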

How to create a symmetric matrix from a numpy 1D array the most efficient way

I have a 1d np.array. Its length may vary according to user input, but it will always stay single-dimensional.
Please advise if there is an efficient way to create a symmetric 2d np.array from it? By 'symmetric' I mean that its elements will be according to the rule k[i, j] = k[j, i].
I realise it is possible to do with a python for loop and lists, but that is very inefficient.
Many thanks in advance!
EXAMPLE:
For example, we have x = np.array([1, 2, 3]). The desired result should be
M = np.array([[1, 2, 3],
              [2, 1, 2],
              [3, 2, 1]])
Interpretation #1
Seems like you are reusing elements at each row. So, with that sort of idea, an implementation using broadcasting would be -
def symmetricize(arr1D):
    ID = np.arange(arr1D.size)
    return arr1D[np.abs(ID - ID[:,None])]
Sample run -
In [170]: arr1D
Out[170]: array([59, 21, 70, 10, 42])
In [171]: symmetricize(arr1D)
Out[171]:
array([[59, 21, 70, 10, 42],
[21, 59, 21, 70, 10],
[70, 21, 59, 21, 70],
[10, 70, 21, 59, 21],
[42, 10, 70, 21, 59]])
Interpretation #2
Another interpretation is that you want to place the elements of the input 1D array into a symmetric 2D array without re-use: fill the upper triangular part (including the diagonal) once, then mirror those values into the lower triangular part by swapping row and column indices. This only works when the input length is a triangular number n(n+1)/2, so we check that first as a pre-processing step. After the check, we initialize an output array and use the row and column indices of the upper triangle to assign the values once as they are and once with swapped indices, which gives the symmetry.
It seemed like SciPy's squareform should be able to do this task, but from the docs, it doesn't look like it supports filling the diagonal elements with the input array elements. So, let's give our solution a closely related name.
Thus, we would have an implementation like so -
def squareform_diagfill(arr1D):
    n = int(np.sqrt(arr1D.size*2))
    if (n*(n+1))//2 != arr1D.size:
        print("Size of 1D array not suitable for creating a symmetric 2D array!")
        return None
    else:
        R,C = np.triu_indices(n)
        out = np.zeros((n,n),dtype=arr1D.dtype)
        out[R,C] = arr1D
        out[C,R] = arr1D
        return out
Sample run -
In [179]: arr1D = np.random.randint(0,9,(12))
In [180]: squareform_diagfill(arr1D)
Size of 1D array not suitable for creating a symmetric 2D array!
In [181]: arr1D = np.random.randint(0,9,(10))
In [182]: arr1D
Out[182]: array([0, 4, 3, 6, 4, 1, 8, 6, 0, 5])
In [183]: squareform_diagfill(arr1D)
Out[183]:
array([[0, 4, 3, 6],
[4, 4, 1, 8],
[3, 1, 6, 0],
[6, 8, 0, 5]])
What you're looking for is a special Toeplitz matrix, which is easy to generate with scipy:
from numpy import concatenate, zeros
from scipy.linalg import toeplitz
toeplitz([1,2,3])
array([[1, 2, 3],
[2, 1, 2],
[3, 2, 1]])
Another special-matrix interpretation uses a Hankel matrix, which gives you the square matrix of minimum dimension for a given array.
from scipy.linalg import hankel
a=[1,2,3]
t=int(len(a)/2)+1
s=t-2+len(a)%2
hankel(a[:t],a[s:])
array([[1, 2],
[2, 3]])

Create array of outer products in numpy

I have an array of n vectors of length m. For example, with n = 3, m = 2:
x = np.array([[1, 2], [3, 4], [5, 6]])
I want to take the outer product of each vector with itself, then concatenate them into an array of square matrices of shape (n, m, m). So for the x above I would get
array([[[ 1, 2],
[ 2, 4]],
[[ 9, 12],
[12, 16]],
[[25, 30],
[30, 36]]])
I can do this with a for loop like so
np.concatenate([np.outer(v, v) for v in x]).reshape(3, 2, 2)
Is there a numpy expression that does this without the Python for loop?
Bonus question: since the outer products are symmetric, I don't need m x m multiplication operations to calculate them. Can I get this symmetry optimization from numpy?
Maybe use einsum?
>>> x = np.array([[1, 2], [3, 4], [5,6]])
>>> np.einsum('ij...,i...->ij...',x,x)
array([[[ 1, 2],
[ 2, 4]],
[[ 9, 12],
[12, 16]],
[[25, 30],
[30, 36]]])
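The same result can be written with explicit index letters instead of ellipses, which may be easier to read. This is a sketch of an equivalent spelling, checked against the loop from the question: i runs over the n vectors, while j and k index the two copies of each vector.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])

# out[i, j, k] = x[i, j] * x[i, k]: one outer product per row of x.
outer = np.einsum('ij,ik->ijk', x, x)

# Reference result using the Python loop from the question.
expected = np.array([np.outer(v, v) for v in x])
assert np.array_equal(outer, expected)
print(outer.shape)  # (3, 2, 2)
```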
I used the following snippet when I was trying to do the same in Theano:
def multiouter(A,B):
    '''Provided NxK (Theano) matrices A and B it returns a NxKxK tensor C with C[i,:,:]=A[i,:]*B[i,:].T'''
    return A.dimshuffle(0,1,'x')*B.dimshuffle(0,'x',1)
Doing a straightforward conversion to Numpy yields
def multiouter(A,B):
    '''Provided NxK (Numpy) arrays A and B it returns a NxKxK tensor C with C[i,:,:]=A[i,:]*B[i,:].T'''
    return A[:,:,None]*B[:,None,:]
I think I got the inspiration for it from another StackOverflow posting, so I am not sure I can take all the credit.
Note: indexing with None is equivalent to indexing with np.newaxis and instantiates a new axis with dimension 1.
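Regarding the bonus question about symmetry, one possible sketch is to multiply only the upper-triangular pairs and mirror them, using np.triu_indices. This is an illustration of the idea, not a measured optimization: the extra indexing may well cost more than the multiplications it saves, especially for small m.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])
n, m = x.shape

# Row/column indices of the upper triangle (diagonal included) of an m x m matrix.
r, c = np.triu_indices(m)

# Only m*(m+1)/2 products per vector instead of m*m.
vals = x[:, r] * x[:, c]          # shape (n, m*(m+1)//2)

out = np.empty((n, m, m), dtype=x.dtype)
out[:, r, c] = vals               # fill the upper triangle
out[:, c, r] = vals               # mirror into the lower triangle
print(out)
```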
