How to get a value from every column in a Numpy matrix - python

I'd like to get the index of a value for every column in a matrix M. For example:
M = matrix([[0, 1, 0],
[4, 2, 4],
[3, 4, 1],
[1, 3, 2],
[2, 0, 3]])
In pseudocode, I'd like to do something like this:
for col in M:
idx = numpy.where(M[col]==0) # Only for columns!
and have idx be 0, 4, 0 for each column.
I have tried to use where, but I don't understand the return value, which is a tuple of matrices.

The tuple of matrices is a collection of items suited for indexing. The output will have the shape of the indexing matrices (or arrays), and each item in the output will be selected from the original array using the first array as the index of the first dimension, the second as the index of the second dimension, and so on. In other words, this:
>>> numpy.where(M == 0)
(matrix([[0, 0, 4]]), matrix([[0, 2, 1]]))
>>> row, col = numpy.where(M == 0)
>>> M[row, col]
matrix([[0, 0, 0]])
>>> M[numpy.where(M == 0)] = 1000
>>> M
matrix([[1000, 1, 1000],
[ 4, 2, 4],
[ 3, 4, 1],
[ 1, 3, 2],
[ 2, 1000, 3]])
The sequence may be what's confusing you. It proceeds in flattened order -- so M[0,2] appears second, not third. If you need to reorder them, you could do this:
>>> row[0,col.argsort()]
matrix([[0, 4, 0]])
You also might be better off using arrays instead of matrices. That way you can manipulate the shape of the arrays, which is often useful! Also note ajcr's transpose-based trick, which is probably preferable to using argsort.
Finally, there is also a nonzero method that does the same thing as where in this case. Using the transpose trick now:
>>> (M == 0).T.nonzero()
(matrix([[0, 1, 2]]), matrix([[0, 4, 0]]))
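As a minimal sketch of the same idea with a plain ndarray instead of np.matrix (the variable names below are illustrative, not from the original answer), where and nonzero return one 1-D index array per dimension, which makes the ordering easier to see:

```python
import numpy as np

# Same data as in the question, but as a plain ndarray (an assumption;
# the question uses np.matrix).
M = np.array([[0, 1, 0],
              [4, 2, 4],
              [3, 4, 1],
              [1, 3, 2],
              [2, 0, 3]])

# With only a condition, where() returns one index array per dimension.
rows, cols = np.where(M == 0)

# Pair them up: (row, column) positions in row-major (flattened) order.
pairs = list(zip(rows.tolist(), cols.tolist()))

# Transposing the mask first makes the scan go column by column instead,
# so the second array holds the row index of the zero in each column.
cols_sorted, rows_by_col = np.nonzero((M == 0).T)

print(pairs)
print(rows_by_col)
```

Here rows_by_col is the sequence the question asks for, provided each column contains exactly one zero.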

As an alternative to np.where, you could perhaps use np.argwhere to return an array of indexes where the array meets the condition:
>>> np.argwhere(M == 0)
array([[[0, 0]],
[[0, 2]],
[[4, 1]]])
This gives you each index where the condition was met, in the format [row, column].
If you'd prefer the output grouped by column rather than by row (that is, in the format [column, row]), just apply the function to the transpose of the array:
>>> np.argwhere(M.T == 0).squeeze()
array([[0, 0],
[1, 4],
[2, 0]])
I also used np.squeeze here to get rid of axis 1, so that we are left with a 2D array. The sequence you want is the second column, i.e. np.argwhere(M.T == 0).squeeze()[:, 1].
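For comparison, here is a small sketch of the same argwhere approach on a plain ndarray (an assumption; the answer above works on np.matrix). In that case the result is already 2D, so the squeeze step should not be needed:

```python
import numpy as np

# The question's data as a plain ndarray rather than np.matrix.
M = np.array([[0, 1, 0],
              [4, 2, 4],
              [3, 4, 1],
              [1, 3, 2],
              [2, 0, 3]])

# Each row of the result is [column, row] because we search the transpose.
col_row = np.argwhere(M.T == 0)

# The second column holds the row index of the zero in each column of M.
row_per_col = col_row[:, 1]

print(col_row)
print(row_per_col)
```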

The result of where(M == 0) would look something like this:
(matrix([[0, 0, 4]]), matrix([[0, 2, 1]]))
The first matrix tells you the rows where the 0s are, and the second matrix tells you the columns where the 0s are.
Out[4]:
matrix([[0, 1, 0],
[4, 2, 4],
[3, 4, 1],
[1, 3, 2],
[2, 0, 3]])
In [5]: np.where(M == 0)
Out[5]: (matrix([[0, 0, 4]]), matrix([[0, 2, 1]]))
In [6]: M[0,0]
Out[6]: 0
In [7]: M[0,2] #0th row 2nd column
Out[7]: 0
In [8]: M[4,1] #4th row 1st column
Out[8]: 0

This isn't anything new on what's been already suggested, but a one-line solution is:
>>> np.where(np.array(M.T)==0)[-1]
array([0, 4, 0])
(I agree that NumPy matrix objects are more trouble than they're worth).

>>> M = np.array([[0, 1, 0],
... [4, 2, 4],
... [3, 4, 1],
... [1, 3, 2],
... [2, 0, 3]])
>>> [np.where(M[:,i]==0)[0][0] for i in range(M.shape[1])]
[0, 4, 0]
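A loop-free variant of the list comprehension above can be sketched with argmax on the boolean mask. This is a different technique from the ones in the answers, and it assumes every column contains at least one zero; the has_zero check (a name introduced here for illustration) flags columns where that assumption fails:

```python
import numpy as np

M = np.array([[0, 1, 0],
              [4, 2, 4],
              [3, 4, 1],
              [1, 3, 2],
              [2, 0, 3]])

mask = (M == 0)

# argmax on a boolean array returns the position of the first True
# along the given axis (here: the first zero in each column).
first_zero = mask.argmax(axis=0)

# argmax also returns 0 for a column with no True at all, so check
# separately which columns really contain a zero.
has_zero = mask.any(axis=0)

print(first_zero)
print(has_zero)
```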

Related

Remove all zero rows and columns in one go in Python

I want to remove all zero rows and columns in one line from the array A1. I present the current and expected outputs.
import numpy as np
A1=np.array([[0, 0, 0],
[0, 1, 2],
[0, 3, 4]])
A1 = A1[~np.all(A1 == 0, axis=0)]
print([A1])
The current output is
[array([[0, 1, 2],
[0, 3, 4]])]
The expected output is
[array([[1, 2],
[3, 4]])]
Not really sure your example works, but given the description in the title, for an array named matrix you can use
mask = matrix != 0
new_matrix = matrix[np.ix_(mask.any(1), mask.any(0))]
you can check out this post about np.ix_
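Applied to the A1 array from the question, the np.ix_ approach could look like the sketch below (the name trimmed is illustrative only):

```python
import numpy as np

A1 = np.array([[0, 0, 0],
               [0, 1, 2],
               [0, 3, 4]])

mask = A1 != 0

# mask.any(axis=1) marks rows with at least one nonzero entry,
# mask.any(axis=0) does the same for columns. np.ix_ combines the two
# boolean vectors into an open mesh that selects the row/column cross product.
trimmed = A1[np.ix_(mask.any(axis=1), mask.any(axis=0))]

print(trimmed)
```

Indexing with the two boolean vectors directly (A1[rows, cols]) would pair them element by element instead of taking the cross product, which is why np.ix_ is used here.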

How would I achieve this "row in A * all rows in B by col in A" multiplication in NumPy without a loop?

Say I have two matrices, A and B:
A = np.array([[1, 3, 2],
[2, 2, 3],
[3, 1, 1]])
B = np.array([[0, 1, 0],
[1, 1, 0],
[1, 1, 1]])
I want to take one column in A and multiply it by each column in B element-wise, then proceed to the next column in A. So, using just one column as an example, I will use A[:,0] (values 1,2,3), and multiply it by each column in B to get this:
array([[0, 1, 0],
[2, 2, 0],
[3, 3, 3]])
I've implemented this using np.einsum like so:
np.einsum('i,ij->ij',A[:,0],B)
I then want to generate a 3D matrix with the depth dimension corresponding to the multiplication by each column in A, which I implemented using a for loop:
np.stack([np.einsum('i,ij->ij',A[:,i],B) for i in range(0,A.shape[1])])
This returns my desired array:
array([[[0, 1, 0],
[2, 2, 0],
[3, 3, 3]],
[[0, 3, 0],
[2, 2, 0],
[1, 1, 1]],
[[0, 2, 0],
[3, 3, 0],
[1, 1, 1]]])
How would I go about doing this without the loop? Can this be done purely with np.einsum? Is there another function in NumPy that will do this more simply?
Here's a simple way:
A.T[:,:,None]*B
Adding None as the last index creates a new axis, which is then used for broadcasting the element-wise multiplication.
How about this code?
A.T.reshape(3, 3, 1) * B
Reshaping an ndarray adds the trailing axis explicitly, so broadcasting does the rest.
Keeping with your usage of einsum:
np.einsum('ij,ik->jik', A, B)
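As a quick check, the sketch below compares the loop from the question with the broadcasting and einsum one-liners suggested in the answers, using the A and B from the question (the result names are illustrative):

```python
import numpy as np

A = np.array([[1, 3, 2],
              [2, 2, 3],
              [3, 1, 1]])
B = np.array([[0, 1, 0],
              [1, 1, 0],
              [1, 1, 1]])

# Reference: the loop from the question, one slice per column of A.
looped = np.stack([np.einsum('i,ij->ij', A[:, i], B)
                   for i in range(A.shape[1])])

# Broadcasting: A.T has shape (3, 3); the new trailing axis makes it
# (3, 3, 1), which broadcasts against B's shape (3, 3).
broadcast = A.T[:, :, None] * B

# einsum: result[j, i, k] = A[i, j] * B[i, k]
einsum_only = np.einsum('ij,ik->jik', A, B)

all_equal = (np.array_equal(looped, broadcast)
             and np.array_equal(looped, einsum_only))
print(all_equal)
```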

Replace numpy subarray when element matches a condition

I have an n x m x 3 numpy array. This represents a middle-step towards an RGB representation of a complex-function plotter. When the function being plotted takes infinite values or has singularities, parts of the RGB data become NaNs.
I'm looking for an efficient way to replace a row containing a NaN with a row of my choice, perhaps [0, 0, 0] or [1, 1, 1]. In terms of the RGB values, this has the effect of replacing poorly-behaving pixels with white or black pixels. By efficient, I mean some way that takes advantage of numpy's vectorization and speed.
Please note that I am not looking to merely replace the NaN values with 0 (which I know how to do with numpy.where); if a row contains a NaN, I want to replace the whole row. I suspect this can be done nicely in numpy, but I'm not sure how.
Concrete Question
Suppose we are given a 2 x 2 x 3 array arr. If a row contains a 5, I want to replace the row with [0, 0, 0]. Trivial code that does this slowly is as follows.
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 3, 5], [2, 4, 6]]])
# so arr is
# array([[[1, 2, 3],
# [4, 5, 6]],
#
# [[1, 3, 5],
# [2, 4, 6]]])
# Trivial and slow version to replace rows containing 5 with [0,0,0]
for i in range(len(arr)):
for j in range(len(arr[i])):
if 5 in arr[i][j]:
arr[i][j] = np.array([0, 0, 0])
# Now arr is
#
# array([[[1, 2, 3],
# [0, 0, 0]],
#
# [[0, 0, 0],
# [2, 4, 6]]])
How can we accomplish this taking advantage of numpy?
A simpler way would be -
arr[np.isin(arr,5).any(-1)] = 0
If it's just a single value that you are looking for, then we could simplify to -
arr[(arr==5).any(-1)] = 0
If you are looking to match against NaN, we need to do the comparison differently and use np.isnan instead -
arr[np.isnan(arr).any(-1)] = 0
If you are looking to assign array values, instead of just 0, the solutions stay the same. Hence it would be -
arr[(arr==5).any(-1)] = new_array
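For the NaN case described in the question, a minimal sketch might look like this. The rgb array and its values are made up for illustration; the replacement colour [1, 1, 1] is one of the options the question mentions:

```python
import numpy as np

# A hypothetical 2 x 2 x 3 float "image" with NaNs in two of its pixels.
rgb = np.array([[[0.1, 0.2, 0.3], [np.nan, 0.5, 0.6]],
                [[0.7, 0.8, np.nan], [0.2, 0.4, 0.6]]])

# True for every pixel whose RGB triple contains at least one NaN.
bad = np.isnan(rgb).any(axis=-1)

# Boolean indexing with an (n, m) mask selects whole RGB triples, and the
# length-3 replacement is broadcast over every selected pixel.
rgb[bad] = [1.0, 1.0, 1.0]

print(bad)
print(rgb)
```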
Using np.broadcast_to
arr[np.broadcast_to((arr == 5).any(-1)[..., None], arr.shape)] = 0
array([[[1, 2, 3],
[0, 0, 0]],
[[0, 0, 0],
[2, 4, 6]]])
Just as FYI, based on your description, if you want to find np.nans instead of integers like 5, you shouldn't use ==, but rather np.isnan
arr[np.broadcast_to((np.isnan(arr)).any(-1)[..., None], arr.shape)] = 0
You can also do it using the in1d function, as below (newer NumPy releases deprecate np.in1d in favor of np.isin):
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 3, 5], [2, 4, 6]]])
arr[np.in1d(arr,5).reshape(arr.shape).any(axis=2)] = [0,0,0]
arr

I have a NumPy array and an array of indexes; how can I access these positions at the same time?

For example, I have a NumPy array like this
a =
array([[1, 2, 3],
[4, 3, 2]])
and an index array like this, which selects the max values
max_idx =
array([[0, 2],
[1, 0]])
How can I access these positions at the same time, in order to modify them?
Something like "a[max_idx] = 0", giving the following:
array([[1, 2, 0],
[0, 3, 2]])
Simply use subscripted-indexing -
a[max_idx[:,0],max_idx[:,1]] = 0
If you are working with higher dimensional arrays and don't want to type out slices of max_idx for each axis, you can use linear-indexing to assign zeros, like so -
a.ravel()[np.ravel_multi_index(max_idx.T,a.shape)] = 0
Sample run -
In [28]: a
Out[28]:
array([[1, 2, 3],
[4, 3, 2]])
In [29]: max_idx
Out[29]:
array([[0, 2],
[1, 0]])
In [30]: a[max_idx[:,0],max_idx[:,1]] = 0
In [31]: a
Out[31]:
array([[1, 2, 0],
[0, 3, 2]])
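NumPy supports advanced (integer-array) indexing like this:
a[b[:, 0], b[:, 1]] = 0
The code above fits your requirement.
If b has more than two columns (that is, a has more than two dimensions), a more general way is:
a[tuple(np.split(b, b.shape[1], axis=1))] = 0
np.split splits b into its columns, and wrapping the resulting list in a tuple makes NumPy treat each column as the index for one axis.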
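The per-axis index arrays used above can also be built by transposing the index array and unpacking it into a tuple. This is a sketch on the question's data, not code from either answer, and it should work for any number of dimensions as long as max_idx has one column per axis of a:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 3, 2]])
max_idx = np.array([[0, 2],
                    [1, 0]])

# max_idx.T has one row per axis of `a`; turning it into a tuple gives
# one index array per axis, i.e. the same as a[max_idx[:, 0], max_idx[:, 1]].
a[tuple(max_idx.T)] = 0

print(a)
```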

Using numpy.where() to return the indexes of a full array where the tested condition is on a sliced one

I have the following 3 x 3 x 3 numpy array called a (the comments will make sense after you read the rest of the question):
array([[[8, 1, 0], # irrelevant 1 (is at position 1 rather than 0)
[1, 7, 5], # the 1 on this line is what I am after!
[1, 4, 9]], # irrelevant 1 (out of the "cross")
[[4, 0, 1], # irrelevant 1 (is at position 2 rather than 0)
[1, 0, 1], # I'm only after the first 1 on this line!
[6, 2, 1]], # irrelevant 1 (is at position 2 rather than 0)
[[0, 2, 2],
[0, 6, 7],
[3, 4, 9]]])
Furthermore, I have this list of indexes, called idx, that refers to the "central cross" of said matrix:
[array([0, 1, 1, 1, 2]), array([1, 0, 1, 2, 1])]
EDIT: I call it "cross" as it marks the central column and row in the following:
>>> a[..., 0]
array([[8, 1, 1],
[4, 1, 6],
[0, 0, 3]])
What I would like to obtain is the indexes of all those arrays located at idx whose first value is 1, but I'm struggling to understand how to use numpy.where() in the right way. Since...
>>> a[..., 0][idx]
array([1, 4, 1, 6, 0])
...I tried...
>>> np.where(a[..., 0][idx] == 1)
(array([0, 2]),)
...but as you can see it returns the index of the sliced array, not of a, while I would like to get:
[array([0, 1]), array([1, 1])] #as a[0, 1, 0] and a [1, 1, 0] are equal to 1.
Thank you in advance for your help!
PS: It was suggested in the comments that I describe a broader scenario where this applies. Although it is not what I am using it for, I suppose this could be used to process images as many 2D libraries do, with a source layer, a destination layer and a mask (see for example cairo). In this case the mask would be the idx array, and one might imagine working with the R channel of RGB colors (a[..., 0]).
You can translate the indices back using idx:
>>> w = np.where(a[..., 0][idx] == 1)[0]
>>> np.array(idx).T[w]
array([[0, 1],
[1, 1]])
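Put together as a self-contained sketch on the question's data, the translation back to indexes of a could look like this. One assumption: idx is stored as a tuple of arrays rather than a list, since recent NumPy versions do not treat a list of index arrays as one index per axis. The names matches and as_index_arrays are illustrative:

```python
import numpy as np

a = np.array([[[8, 1, 0], [1, 7, 5], [1, 4, 9]],
              [[4, 0, 1], [1, 0, 1], [6, 2, 1]],
              [[0, 2, 2], [0, 6, 7], [3, 4, 9]]])

# The "central cross", as a tuple of per-axis index arrays.
idx = (np.array([0, 1, 1, 1, 2]), np.array([1, 0, 1, 2, 1]))

# Positions within the cross where the first value equals 1.
w = np.where(a[..., 0][idx] == 1)[0]

# Map those positions back to (axis 0, axis 1) indexes of `a`.
matches = np.array(idx).T[w]

# The same result in the "one array per axis" layout from the question.
as_index_arrays = [axis_idx[w] for axis_idx in idx]

print(matches)
print(as_index_arrays)
```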
