Getting first n non-zero elements of a numpy matrix - python

I have a matrix M like this:
>>> M
array([[1, 0, 3, 4],
[0, 3, 4, 5],
[5, 4, 0, 7]])
What I want to do is to extract is the first N (let's say N = 2) non-zero elements of each row in M and put them in a new matrix M2 of the same shape, while setting the matching values in M to zero. So the output should be:
>>> M
array([[0, 0, 0, 4],
[0, 0, 0, 5],
[0, 0, 0, 7]])
>>> M2
array([[1, 0, 3, 0],
[0, 3, 4, 0],
[5, 4, 0, 0]])

One approach with cumsum -
M2 = M*((M!=0).cumsum(1)<=2)
M_new = M - M2
Sample run -
In [42]: M
Out[42]:
array([[1, 0, 3, 4],
[0, 3, 4, 5],
[5, 4, 0, 7]])
In [43]: M2 = M*((M!=0).cumsum(1)<=2)
...: M_new = M - M2
...:
In [44]: M_new
Out[44]:
array([[0, 0, 0, 4],
[0, 0, 0, 5],
[0, 0, 0, 7]])
In [45]: M2
Out[45]:
array([[1, 0, 3, 0],
[0, 3, 4, 0],
[5, 4, 0, 0]])

Related

Numpy add array of shape NxD to NxK into the first D in array NxK

I have two arrays
a = np.array([[0, 0, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 2, 2, 3, 4]])
and
b = np.array([[1, 1],
[2, 2],
[3, 3]])
I want to one array where I am adding the values of b to the first two columns in a like this:
c = np.array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4]])
if it helps you can think of the first two columns in a as the x,y coordinates and b as dx, dy.
My current method is as follows:
c = np.concatenate([a[:, 0:2] + b, a[:, 2:]],1)
but I am looking for a better method
Thank you
You can use np.pad to add zeros to b to make its shape the same as a's, then add them:
>>> a + np.pad(b, ((0, 0), (0, 3)))
array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4]])
In general (for 2-D):
>>> a = np.array([[0, 0, 2, 3, 4],
... [0, 1, 2, 3, 4],
... [0, 2, 2, 3, 4]])
>>> b = np.array([[1, 1],
... [2, 2],
... [3, 3],
... [4, 4],
... [5, 5]])
>>> a_shape, b_shape = a.shape, b.shape
>>> max_w = max(a_shape[0], b_shape[0])
>>> max_h = max(a_shape[1], b_shape[1])
>>> padded_a = np.pad(a,
((0, np.abs(a_shape[0] - max_w)),
(0, np.abs(a_shape[1] - max_h))))
>>> padded_a
array([[0, 0, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 2, 2, 3, 4],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
>>> padded_b = np.pad(b,
((0, np.abs(b_shape[0] - max_w)),
(0, np.abs(b_shape[1] - max_h))))
>>> padded_b
array([[1, 1, 0, 0, 0],
[2, 2, 0, 0, 0],
[3, 3, 0, 0, 0],
[4, 4, 0, 0, 0],
[5, 5, 0, 0, 0]])
>>> padded_a + padded_b
array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4],
[4, 4, 0, 0, 0],
[5, 5, 0, 0, 0]])
In general (2-D, using a zeros array and adding to it):
>>> c = np.zeros((max_h, max_w), dtype=a.dtype)
>>> c[:a_shape[0], :a_shape[1]] += a
>>> c[:b_shape[0], :b_shape[1]] += b
>>> c
array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4],
[4, 4, 0, 0, 0],
[5, 5, 0, 0, 0]])

How can I store store index pairs using True values from a boolean-like square symmetric numpy array?

I have a Numpy Array that with integer values 1 or 0 (can be cast as booleans if necessary). The array is square and symmetric (see note below) and I want a list of the indices where a 1 appears:
Note that array[i][j] == array[j][i] and array[i][i] == 0 by design. Also I cannot have any duplicates.
import numpy as np
array = np.array([
[0, 0, 1, 0, 1, 0, 1],
[0, 0, 1, 1, 0, 1, 0],
[1, 1, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 1, 1, 0],
[1, 0, 0, 1, 0, 0, 1],
[0, 1, 0, 1, 0, 0, 0],
[1, 0, 1, 0, 1, 0, 0]
])
I would like a result that is like this (order of each sub-list is not important, nor is the order of each element within the sub-list):
[
[0, 2],
[0, 4],
[0, 6],
[1, 2],
[1, 3],
[1, 5],
[2, 6],
[3, 4],
[3, 5],
[4, 6]
]
Another point to make is that I would prefer not to loop over all indices twice using the condition j<i because the size of my array can be large but I am aware that this is a possibility - I have written an example of this using two for loops:
result = []
for i in range(array.shape[0]):
for j in range(i):
if array[i][j]:
result.append([i, j])
print(pd.DataFrame(result).sort_values(1).values)
# using dataframes and arrays for formatting but looking for
# 'result' which is a list
# Returns (same as above but columns are the opposite way round):
[[2 0]
[4 0]
[6 0]
[2 1]
[3 1]
[5 1]
[6 2]
[4 3]
[5 3]
[6 4]]
idx = np.argwhere(array)
idx = idx[idx[:,0]<idx[:,1]]
Another way:
idx = np.argwhere(np.triu(array))
output:
[[0 2]
[0 4]
[0 6]
[1 2]
[1 3]
[1 5]
[2 6]
[3 4]
[3 5]
[4 6]]
Comparison:
##bousof solution
def method1(array):
return np.vstack(np.where(np.logical_and(array, np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]>=0))).transpose()[:,::-1]
#Also mentioned by #hpaulj
def method2(array):
return np.argwhere(np.triu(array))
def method3(array):
idx = np.argwhere(array)
return idx[idx[:,0]<idx[:,1]]
#The original method in question by OP(d-man)
def method4(array):
result = []
for i in range(array.shape[0]):
for j in range(i):
if array[i][j]:
result.append([i, j])
return result
#suggestd by #bousof in comments
def method5(array):
return np.vstack(np.where(np.triu(array))).transpose()
inputs = [np.random.randint(0,2,(n,n)) for n in [10,100,1000,10000]]
Seems like method1, method2 and method5 are slightly faster for large arrays while method3 is faster for smaller cases:
In [249]: arr = np.array([
...: [0, 0, 1, 0, 1, 0, 1],
...: [0, 0, 1, 1, 0, 1, 0],
...: [1, 1, 0, 0, 0, 0, 1],
...: [0, 1, 0, 0, 1, 1, 0],
...: [1, 0, 0, 1, 0, 0, 1],
...: [0, 1, 0, 1, 0, 0, 0],
...: [1, 0, 1, 0, 1, 0, 0]
...: ])
The most common way of getting indices on non-zeros (True) is with np.nonzero (aka np.where):
In [250]: idx = np.nonzero(arr)
In [251]: idx
Out[251]:
(array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6]),
array([2, 4, 6, 2, 3, 5, 0, 1, 6, 1, 4, 5, 0, 3, 6, 1, 3, 0, 2, 4]))
This is a tuple - 2 arrays for a 2d array. It can be used directly to index the array (or anything like it): arr[idx] will give all 1s.
Apply np.transpose to that and get an array of 'pairs':
In [252]: np.argwhere(arr)
Out[252]:
array([[0, 2],
[0, 4],
[0, 6],
[1, 2],
[1, 3],
[1, 5],
[2, 0],
[2, 1],
[2, 6],
[3, 1],
[3, 4],
[3, 5],
[4, 0],
[4, 3],
[4, 6],
[5, 1],
[5, 3],
[6, 0],
[6, 2],
[6, 4]])
Using such an array to index arr is harder - requiring a loop and conversion to tuple.
To weed out the symmetric duplicates we could make a tri-lower array:
In [253]: np.tril(arr)
Out[253]:
array([[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 1, 0, 0, 0],
[0, 1, 0, 1, 0, 0, 0],
[1, 0, 1, 0, 1, 0, 0]])
In [254]: np.argwhere(np.tril(arr))
Out[254]:
array([[2, 0],
[2, 1],
[3, 1],
[4, 0],
[4, 3],
[5, 1],
[5, 3],
[6, 0],
[6, 2],
[6, 4]])
You can use numpy.where:
>>> np.vstack(np.where(np.logical_and(array, np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]<=0))).transpose()
array([[2, 0],
[2, 1],
[3, 1],
[4, 0],
[4, 3],
[5, 1],
[5, 3],
[6, 0],
[6, 2],
[6, 4]])
np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]<=0 is true only on the lower part of the matrix. If the order is important, you can get the same order as in the question using:
>>> np.vstack(np.where(np.logical_and(array, np.diff(np.ogrid[:array.shape[0],:array.shape[0]])[0]>=0))).transpose()[:,::-1]
array([[2, 0],
[4, 0],
[6, 0],
[2, 1],
[3, 1],
[5, 1],
[6, 2],
[4, 3],
[5, 3],
[6, 4]])

Most efficient way to get sorted indices based on two numpy arrays

How can i get the sorted indices of a numpy array (distance), only considering certain indices from another numpy array (val).
For example, consider the two numpy arrays val and distance below:
val = np.array([[10, 0, 0, 0, 0],
[0, 0, 10, 0, 10],
[0, 10, 10, 0, 0],
[0, 0, 0, 10, 0],
[0, 0, 0, 0, 0]])
distance = np.array([[4, 3, 2, 3, 4],
[3, 2, 1, 2, 3],
[2, 1, 0, 1, 2],
[3, 2, 1, 2, 3],
[4, 3, 2, 3, 4]])
the distances where val == 10 are 4, 1, 3, 1, 0, 2. I would like to get these sorted to be 0, 1, 1, 2, 3, 4 and return the respective indices from distance array.
Returning something like:
(array([2, 1, 2, 3, 1, 0], dtype=int64), array([2, 2, 1, 3, 4, 0], dtype=int64))
or:
(array([2, 2, 1, 3, 1, 0], dtype=int64), array([2, 1, 2, 3, 4, 0], dtype=int64))
since the second and third element both have distance '1', so i guess the indices can be interchangable.
Tried using combinations of np.where, np.argsort, np.argpartition, np.unravel_index but cant seem to get it working right
Here's one way with masking -
In [20]: mask = val==10
In [21]: np.argwhere(mask)[distance[mask].argsort()]
Out[21]:
array([[2, 2],
[1, 2],
[2, 1],
[3, 3],
[1, 4],
[0, 0]])

numpy matrix. move all 0's to the end of each row

Given a matrix in python numpy which has for some of its rows, leading zeros. I need to shift all zeros to the end of the line.
E.g.
0 2 3 4
0 0 1 5
2 3 1 1
should be transformed to
2 3 4 0
1 5 0 0
2 3 1 1
Is there any nice way to do this in python numpy?
To fix for leading zeros rows -
def fix_leading_zeros(a):
mask = a!=0
flipped_mask = mask[:,::-1]
a[flipped_mask] = a[mask]
a[~flipped_mask] = 0
return a
To push all zeros rows to the back -
def push_all_zeros_back(a):
# Based on http://stackoverflow.com/a/42859463/3293881
valid_mask = a!=0
flipped_mask = valid_mask.sum(1,keepdims=1) > np.arange(a.shape[1]-1,-1,-1)
flipped_mask = flipped_mask[:,::-1]
a[flipped_mask] = a[valid_mask]
a[~flipped_mask] = 0
return a
Sample runs -
In [220]: a
Out[220]:
array([[0, 2, 3, 4],
[0, 0, 1, 5],
[2, 3, 1, 1]])
In [221]: fix_leading_zero_rows(a)
Out[221]:
array([[2, 3, 4, 0],
[1, 5, 0, 0],
[2, 3, 1, 1]])
In [266]: a
Out[266]:
array([[0, 2, 3, 4, 0],
[0, 0, 1, 5, 6],
[2, 3, 0, 1, 0]])
In [267]: push_all_zeros_back(a)
Out[267]:
array([[2, 3, 4, 0, 0],
[1, 5, 6, 0, 0],
[2, 3, 1, 0, 0]])
leading zeros, simple loop
ar = np.array([[0, 2, 3, 4],
[0, 0, 1, 5],
[2, 3, 1, 1]])
for i in range(ar.shape[0]):
for j in range(ar.shape[1]): # prevent infinite loop if row all zero
if ar[i,0] == 0:
ar[i]=np.roll(ar[i], -1)
ar
Out[31]:
array([[2, 3, 4, 0],
[1, 5, 0, 0],
[2, 3, 1, 1]])

How do I add a guard ring to a matrix in NumPy?

Using NumPy, a matrix A has n rows and m columns, and I want add a guard ring to matrix A. That guard ring is all zero.
What should I do? Use Reshape? But the element is not enough to make a n+1 m+1 matrix.
Or etc.?
Thanks in advance
I mean an extra ring of cells that always contain 0 surround matrix A.Basically there is a Matrix B has n+2rows m+2columns where the first row and columns and the last row and columns are all zero,and the rest of it are same as matrix A.
Following up on your comment:
>>> import numpy
>>> a = numpy.array(range(9)).reshape((3,3))
>>> b = numpy.zeros(tuple(s+2 for s in a.shape), a.dtype)
>>> b[tuple(slice(1,-1) for s in a.shape)] = a
>>> b
array([[0, 0, 0, 0, 0],
[0, 0, 1, 2, 0],
[0, 3, 4, 5, 0],
[0, 6, 7, 8, 0],
[0, 0, 0, 0, 0]])
This is a less general but easier to understand version of Alex's answer:
>>> a = numpy.array(range(9)).reshape((3,3))
>>> a
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> b = numpy.zeros(a.shape + numpy.array(2), a.dtype)
>>> b
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
>>> b[1:-1,1:-1] = a
>>> b
array([[0, 0, 0, 0, 0],
[0, 0, 1, 2, 0],
[0, 3, 4, 5, 0],
[0, 6, 7, 8, 0],
[0, 0, 0, 0, 0]])
This question is ancient now, but I just want to alert people finding it that numpy has a function pad that very easily accomplishes this now.
import numpy as np
a = np.array(range(9)).reshape((3, 3))
a
Out[15]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
a = np.pad(a, pad_width=((1,1),(1,1)), mode='constant', constant_values=0)
a
Out[16]:
array([[0, 0, 0, 0, 0],
[0, 0, 1, 2, 0],
[0, 3, 4, 5, 0],
[0, 6, 7, 8, 0],
[0, 0, 0, 0, 0]])

Categories