Numpy linalg.norm with ufunc.reduceat functionality - python

Solution: @QuangHoang's first comment, namely np.linalg.norm(arr, axis=1).
I would like to apply NumPy's linalg.norm function column-wise to sub-arrays of a 3D array by using ranges (or indices?), similar in functionality to what ufunc.reduceat does.
Given the following array:
import numpy as np
In []: arr = np.array([[0,1,2,3], [2,2,3,4], [3,2,5,6],
[1,7,1,9], [1,4,8,6], [2,3,5,8],
[2,5,7,3], [2,3,4,6], [2,5,3,2]]).reshape(3,3,4)
In []: arr
Out []: array([[[0, 1, 2, 3],
[2, 2, 3, 4],
[3, 2, 5, 6]],
[[1, 7, 1, 9],
[1, 4, 8, 6],
[2, 3, 5, 8]],
[[2, 5, 7, 3],
[2, 3, 4, 6],
[2, 5, 3, 2]]])
I would like to apply linalg.norm column-wise to the three sub-arrays separately, i.e. for the first column it would be linalg.norm([0, 2, 3]), linalg.norm([1, 1, 2]) and linalg.norm([2, 2, 2]), for the second linalg.norm([1, 2, 2]), linalg.norm([7, 4, 3]) and linalg.norm([5, 3, 5]), etc., resulting in a 2D array with shape (3, 4) containing the results of the linalg.norm calls.
Doing this with a 2D array is straightforward by specifying the axis:
import numpy.linalg as npla
In []: npla.norm(np.array([[0,1,2,3], [2,2,3,4], [3,2,5,6]]), axis=0)
Out []: array([3.60555128, 3. , 6.164414 , 7.81024968])
But I don't understand how to do that for each sub-array separately. I believe that reduceat with a ufunc like add allows one to specify indices and ranges. Would something similar be possible here, but with linalg.norm?
Edit 1:
I followed @hpaulj's advice to look at the code used for add.reduce. With a better understanding of the method I was able to search more precisely, and I found np.apply_along_axis, which is exactly what I was looking for:
In []: np.apply_along_axis(npla.norm, 1, arr)
Out []: array([[ 3.60555128, 3. , 6.164414 , 7.81024968],
[ 2.44948974, 8.60232527, 9.48683298, 13.45362405],
[ 3.46410162, 7.68114575, 8.60232527, 7. ]])
However, this method is very slow. Is there a way to use linalg.norm in a vectorized manner instead?
Edit 2:
@QuangHoang's first comment is actually the correct answer I was looking for. I misunderstood the method, which is why I misunderstood their comment. Specifying the axis in the linalg.norm call is what is required here:
np.linalg.norm(arr,axis=1)
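As a quick sanity check (reusing the example array from the question, nothing new assumed), the vectorized call collapses axis 1 of each (3, 4) sub-array, i.e. it takes the column-wise norm per sub-array, and reproduces the apply_along_axis result from Edit 1:

import numpy as np

arr = np.array([[0,1,2,3], [2,2,3,4], [3,2,5,6],
                [1,7,1,9], [1,4,8,6], [2,3,5,8],
                [2,5,7,3], [2,3,4,6], [2,5,3,2]]).reshape(3, 3, 4)

vectorized = np.linalg.norm(arr, axis=1)                # shape (3, 4)
looped = np.apply_along_axis(np.linalg.norm, 1, arr)    # the slow version from Edit 1

assert np.allclose(vectorized, looped)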

Related

NumPy using the reshape function to reshape an array [duplicate]

This question already has an answer here:
how to reshape an N length vector to a 3x(N/3) matrix in numpy using reshape
I have an array: [1, 2, 3, 4, 5, 6]. I would like to use the numpy.reshape() function so that I end up with this array:
[[1, 4],
[2, 5],
[3, 6]
]
I'm not sure how to do this. I keep ending up with this, which is not what I want:
[[1, 2],
[3, 4],
[5, 6]
]
These do the same thing:
In [57]: np.reshape([1,2,3,4,5,6], (3,2), order='F')
Out[57]:
array([[1, 4],
[2, 5],
[3, 6]])
In [58]: np.reshape([1,2,3,4,5,6], (2,3)).T
Out[58]:
array([[1, 4],
[2, 5],
[3, 6]])
Normally values are 'read' across the rows in Python/numpy. This is called row-major or 'C' order. Reading down the columns is 'F' order, for Fortran, and is common in MATLAB, which has Fortran roots.
If you take the 'F'-order result, make a new copy and ravel it, you'll get the elements in a different order:
In [59]: np.reshape([1,2,3,4,5,6], (3,2), order='F').copy().ravel()
Out[59]: array([1, 4, 2, 5, 3, 6])
You can set the order in np.reshape; in your case you can use 'F'. See the docs for details:
>>> arr
array([1, 2, 3, 4, 5, 6])
>>> arr.reshape(-1, 2, order = 'F')
array([[1, 4],
[2, 5],
[3, 6]])
The reason that you are getting that particular result is that arrays are normally allocated in C order. That means that reshaping by itself is not sufficient: you have to tell numpy to change the order of the axes when it steps along the array. Any number of operations will allow you to do that:
Set the axis order to F. F is for Fortran, which, like MATLAB, conventionally uses column-major order:
a.reshape(3, 2, order='F')
Swap the axes after reshaping:
np.swapaxes(a.reshape(2, 3), 0, 1)
Transpose the result:
a.reshape(2, 3).T
Roll the second axis forward:
np.rollaxis(a.reshape(2, 3), 1)
Notice that all but the first case require you to reshape to the transpose.
You can even manually arrange the data
np.stack((a[:3], a[3:]), axis=1)
Note that, unlike the reshape-based options above, this copies the data rather than returning a view of it. If you want an explicit copy anyway, just do
a.reshape(3, 2, order='F').copy()
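For reference, here is a small self-contained check, using the array from the question, that the options listed above all produce the same (3, 2) result:

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
target = np.array([[1, 4], [2, 5], [3, 6]])

candidates = [
    a.reshape(3, 2, order='F'),           # direct Fortran-order reshape
    np.swapaxes(a.reshape(2, 3), 0, 1),   # reshape to the transpose, then swap axes
    a.reshape(2, 3).T,                    # reshape to the transpose, then transpose
    np.rollaxis(a.reshape(2, 3), 1),      # reshape to the transpose, then roll axis 1 forward
    np.stack((a[:3], a[3:]), axis=1),     # manual arrangement (copies the data)
]

for c in candidates:
    assert np.array_equal(c, target)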

Minimum value in 3D NumPy array along specified axis

Say you have a 3D array as follows:
a = np.random.uniform(0,10,(3,4,4))
a
Out[167]:
array([[[6.11382489, 5.33572952, 2.6994938 , 5.32924568],
[0.02494179, 9.5813176 , 3.78090323, 7.73698908],
[0.4559432 , 3.14531716, 4.18929635, 9.44256735],
[7.05641989, 0.51355523, 6.61806454, 1.3124488 ]],
[[9.79806021, 6.9343234 , 3.96018673, 8.97424501],
[3.25146771, 5.06744849, 6.05870707, 2.27286515],
[4.66656429, 6.92791142, 7.1623226 , 5.34108811],
[6.09831564, 9.52367529, 8.27257007, 8.01510805]],
[[5.62545596, 9.01048599, 6.76713644, 7.71836144],
[5.59842752, 0.34003062, 8.07114444, 8.5382837 ],
[0.20420194, 6.39088367, 4.97895935, 4.26247875],
[1.2701483 , 8.35244104, 2.69965027, 8.39305974]]])
Is there a way to get the minimum values in the slices along axis=0 as one array efficiently?
So in this case I would specify axis=0 (i.e. the axis with dimension length=3) and return the minimum values: (0.02494179, 2.27286515, 0.20420194).
I feel like this is a simple problem but I can't seem to get it to work, so any help on the matter would be greatly appreciated!
If I got it right, you just have to apply "min" twice, reducing over the last two axes,
for instance:
>>> np.random.seed(1) #reproduce the same results
>>> a = np.random.randint(0,10,(3,2,4)) #using int is easier to understand
>>> a
Out[4]:
array([[[5, 8, 9, 5],
[0, 0, 1, 7]],
[[6, 9, 2, 4],
[5, 2, 4, 2]],
[[4, 7, 7, 9],
[1, 7, 0, 6]]])
>>> a.min(axis=1).min(axis=1)
Out[5]: array([0, 2, 0])
It is the first time I've posted an answer; I hope I did okay.
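The same per-slice reduction can also be written as a single call by passing a tuple of axes to min; a minimal sketch using the same seeded example, yielding one minimum per slice along axis 0 as asked for in the question:

import numpy as np

np.random.seed(1)
a = np.random.randint(0, 10, (3, 2, 4))

per_slice_min = a.min(axis=(1, 2))   # reduce over the last two axes at once
print(per_slice_min)                 # [0 2 0]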

Numpy.where used with list of values

I have a 2D and a 1D array. For each value in the 1D array, I want to find the rows of the 2D array that contain it at least once, as follows:
import numpy as np
A = np.array([[0, 3, 1],
[9, 4, 6],
[2, 7, 3],
[1, 8, 9],
[6, 2, 7],
[4, 8, 0]])
B = np.array([0,1,2,3])
results = []
for elem in B:
    results.append(np.where(A==elem)[0])
This works and results in the following list of arrays:
[array([0, 5], dtype=int64),
array([0, 3], dtype=int64),
array([2, 4], dtype=int64),
array([0, 2], dtype=int64)]
But this is probably not the best way of proceeding. Following the answers given in this question (Search Numpy array with multiple values) I tried the following solutions:
out1 = np.where(np.in1d(A, B))
num_arr = np.sort(B)
idx = np.searchsorted(B, A)
idx[idx==len(num_arr)] = 0
out2 = A[A == num_arr[idx]]
But these give me incorrect values:
In [36]: out1
Out[36]: (array([ 0, 1, 2, 6, 8, 9, 13, 17], dtype=int64),)
In [37]: out2
Out[37]: array([0, 3, 1, 2, 3, 1, 2, 0])
Thanks for your help
If you need to know whether each row of A contains ANY element of array B, without caring which particular element of B it is, the following can be used:
input:
np.isin(A,B).sum(axis=1)>0
output:
array([ True, False, True, True, True, True])
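A note on style rather than correctness: the same boolean row mask can be written with any instead of summing and comparing, which reads a little more directly. A small equivalent sketch using the arrays from the question:

import numpy as np

A = np.array([[0, 3, 1], [9, 4, 6], [2, 7, 3],
              [1, 8, 9], [6, 2, 7], [4, 8, 0]])
B = np.array([0, 1, 2, 3])

mask = np.isin(A, B).any(axis=1)   # True where a row contains at least one value from B
print(mask)                        # [ True False  True  True  True  True]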
Since you're dealing with a 2D array* you can use broadcasting to compare B with a raveled version of A. This will give you the respective indices in a raveled shape. Then you can reverse the result and get the corresponding indices in the original array using np.unravel_index.
In [50]: d = np.where(B[:, None] == A.ravel())[1]
In [51]: np.unravel_index(d, A.shape)
Out[51]: (array([0, 5, 0, 3, 2, 4, 0, 2]), array([0, 2, 2, 0, 0, 1, 1, 2]))
The first array here (the row indices) is the expected result.
* From documentation: For 3-dimensional arrays this is certainly efficient in terms of lines of code, and, for small data sets, it can also be computationally efficient. For large data sets, however, the creation of the large 3-d array may result in sluggish performance.
Also, Broadcasting is a powerful tool for writing short and usually intuitive code that does its computations very efficiently in C. However, there are cases when broadcasting uses unnecessarily large amounts of memory for a particular algorithm. In these cases, it is better to write the algorithm's outer loop in Python. This may also produce more readable code, as algorithms that use broadcasting tend to become more difficult to interpret as the number of dimensions in the broadcast increases.
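If the goal is to recover the per-value grouping from the question (one array of row indices for each element of B), one possible sketch, built on the same broadcast comparison plus a plain list comprehension, is:

import numpy as np

A = np.array([[0, 3, 1], [9, 4, 6], [2, 7, 3],
              [1, 8, 9], [6, 2, 7], [4, 8, 0]])
B = np.array([0, 1, 2, 3])

# Boolean array of shape (len(B), rows, cols): matches[v] marks where B[v] occurs in A.
matches = A[None, :, :] == B[:, None, None]

# For each value of B, the rows of A that contain it at least once.
rows_per_value = [np.flatnonzero(m.any(axis=1)) for m in matches]
print(rows_per_value)   # [array([0, 5]), array([0, 3]), array([2, 4]), array([0, 2])]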
Is something like this what you are looking for?
import numpy as np
from itertools import combinations
A = np.array([[0, 3, 1],
[9, 4, 6],
[2, 7, 3],
[1, 8, 9],
[6, 2, 7],
[4, 8, 0]])
B = np.array([0,1,2,3])
for i in combinations(A, 2):
    if np.all(np.isin(B, np.hstack(i))):
        print(i[0], ' ', i[1])
which prints the following:
[0 3 1] [2 7 3]
[0 3 1] [6 2 7]
Note: this solution does NOT require the rows to be consecutive. Please let me know if that is required.

numpy einsum: nested dot products

I have two n-by-k-by-3 arrays a and b, e.g.,
import numpy as np
a = np.array([
[
[1, 2, 3],
[3, 4, 5]
],
[
[4, 2, 4],
[1, 4, 5]
]
])
b = np.array([
[
[3, 1, 5],
[0, 2, 3]
],
[
[2, 4, 5],
[1, 2, 4]
]
])
and I'd like to compute the dot product of all pairs of "triplets", i.e.,
np.sum(a*b, axis=2)
A better way to do that is perhaps einsum, but I can't seem to get the indices straight.
Any hints here?
You are losing the third axis of those two 3D input arrays with that sum-reduction, while keeping the first two axes aligned. Thus, with np.einsum, the first two subscripts would be identical for both inputs, and the third subscript would be identical too but skipped in the output string, signalling that we are reducing along that axis for both inputs. Thus, the solution would be -
np.einsum('ijk,ijk->ij',a,b)
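As a quick check that the subscripts are right, the einsum call can be compared against the original sum formulation; the array sizes below are arbitrary, chosen only for illustration:

import numpy as np

n, k = 4, 5
a = np.random.rand(n, k, 3)
b = np.random.rand(n, k, 3)

out_einsum = np.einsum('ijk,ijk->ij', a, b)   # pairwise dot products over the last axis
out_sum = np.sum(a * b, axis=2)               # the original formulation

assert np.allclose(out_einsum, out_sum)       # both give an (n, k) result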

Apply same permutation for every row in a 2D numpy array

To permute a 1D array A I know that you can run the following code:
import numpy as np
A = np.random.permutation(A)
I have a 2D array and want to apply exactly the same permutation to every row of the array. Is there any way to tell numpy to do that for you?
Generate a random permutation of the column indices of A and use it to index into the columns of A, like so -
A[:,np.random.permutation(A.shape[1])]
Sample run -
In [100]: A
Out[100]:
array([[3, 5, 7, 4, 7],
[2, 5, 2, 0, 3],
[1, 4, 3, 8, 8]])
In [101]: A[:,np.random.permutation(A.shape[1])]
Out[101]:
array([[7, 5, 7, 4, 3],
[3, 5, 2, 0, 2],
[8, 4, 3, 8, 1]])
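If the same column permutation needs to be reused, e.g. to permute several arrays consistently or to undo it later, it can be stored first. A small sketch along those lines (argsort of a permutation gives its inverse):

import numpy as np

A = np.array([[3, 5, 7, 4, 7],
              [2, 5, 2, 0, 3],
              [1, 4, 3, 8, 8]])

perm = np.random.permutation(A.shape[1])   # one shared column permutation
permuted = A[:, perm]

inverse = np.argsort(perm)                 # undoes the permutation
assert np.array_equal(permuted[:, inverse], A)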
Actually you do not need to do this, from the documentation:
If x is a multi-dimensional array, it is only shuffled along its first
index.
So, taking Divakar's array:
a = np.array([
[3, 5, 7, 4, 7],
[2, 5, 2, 0, 3],
[1, 4, 3, 8, 8]
])
you can just do: np.random.permutation(a) and get something like:
array([[2, 5, 2, 0, 3],
[3, 5, 7, 4, 7],
[1, 4, 3, 8, 8]])
P.S. If you need to perform column permutations, just do np.random.permutation(a.T).T. Similar things apply to multi-dimensional arrays.
It depends what you mean by "every row".
If you want to permute all values (regardless of row and column), reshape your array to 1d, permute, reshape back to 2d.
If you want to permute each row independently, without mixing elements between different rows, you need to loop through the first axis and call permutation for each row:
for i in range(len(A)):
    A[i] = np.random.permutation(A[i])
It can probably be done more concisely somehow, but that is how it can be done.
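If a reasonably recent NumPy (1.20 or later) is available, the per-row loop can also be avoided with the Generator API, whose permuted method shuffles each slice along a given axis independently; a brief sketch:

import numpy as np

A = np.array([[3, 5, 7, 4, 7],
              [2, 5, 2, 0, 3],
              [1, 4, 3, 8, 8]])

rng = np.random.default_rng()
shuffled = rng.permuted(A, axis=1)   # each row shuffled independently; A is left untouched
print(shuffled)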
