Delete diagonals of zero elements - python

I'm trying to reshape an array from its original shape, to make the elements of each row descend along a diagonal:
np.random.seed(0)
my_array = np.random.randint(1, 50, size=(5, 3))
array([[45, 48, 1],
[ 4, 4, 40],
[10, 20, 22],
[37, 24, 7],
[25, 25, 13]])
I would like the result to look like this:
my_array_2 = np.array([[45, 0, 0],
[ 4, 48, 0],
[10, 4, 1],
[37, 20, 40],
[25, 24, 22],
[ 0, 25, 7],
[ 0, 0, 13]])
This is the closest solution I've been able to get:
my_diag = []
for i in range(len(my_array)):
    my_diag_ = np.diag(my_array[i], k=0)
    my_diag.append(my_diag_)
my_array1 = np.vstack(my_diag)
array([[45, 0, 0],
[ 0, 48, 0],
[ 0, 0, 1],
[ 4, 0, 0],
[ 0, 4, 0],
[ 0, 0, 40],
[10, 0, 0],
[ 0, 20, 0],
[ 0, 0, 22],
[37, 0, 0],
[ 0, 24, 0],
[ 0, 0, 7],
[25, 0, 0],
[ 0, 25, 0],
[ 0, 0, 13]])
From here I think it might be possible to remove all zero diagonals, but I'm not sure how to do that.

One way using numpy.pad:
n = my_array.shape[1] - 1
np.dstack([np.pad(a, (i, n-i), "constant")
for i, a in enumerate(my_array.T)])
Output:
array([[[45, 0, 0],
[ 4, 48, 0],
[10, 4, 1],
[37, 20, 40],
[25, 24, 22],
[ 0, 25, 7],
[ 0, 0, 13]]])
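Note that np.dstack returns a 3-D array of shape (1, 7, 3) here. If a plain 2-D result is preferred, the same padding idea can be sketched with np.column_stack. This is a minimal variant, not the code from the answer above; the array literal is copied from the question.

```python
import numpy as np

my_array = np.array([[45, 48,  1],
                     [ 4,  4, 40],
                     [10, 20, 22],
                     [37, 24,  7],
                     [25, 25, 13]])

n = my_array.shape[1] - 1
# Pad column i with i zeros on top and n - i zeros below, then
# place the padded columns side by side as a 2-D array.
result = np.column_stack([np.pad(col, (i, n - i), "constant")
                          for i, col in enumerate(my_array.T)])
print(result.shape)  # (7, 3)
```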

In [134]: arr = np.array([[45, 48, 1],
...: [ 4, 4, 40],
...: [10, 20, 22],
...: [37, 24, 7],
...: [25, 25, 13]])
In [135]: res= np.zeros((arr.shape[0]+arr.shape[1]-1, arr.shape[1]), arr.dtype)
Taking a hint from how np.diag indexes a diagonal, iterate on the rows of arr:
In [136]: for i in range(arr.shape[0]):
     ...:     n = i*arr.shape[1]
     ...:     m = arr.shape[1]
     ...:     res.flat[n:n+m**2:m+1] = arr[i,:]
     ...:
In [137]: res
Out[137]:
array([[45, 0, 0],
[ 4, 48, 0],
[10, 4, 1],
[37, 20, 40],
[25, 24, 22],
[ 0, 25, 7],
[ 0, 0, 13]])

There's probably a shift capability in NumPy, but I'm not familiar with it, so here's a solution using pandas. Concatenate a block of zeros with ncols - 1 rows onto the bottom of the original array, then iterate over the columns and shift each one down by its column number. Note that the result is a float array, because np.zeros defaults to float64 and shift introduces NaN values.
import numpy as np
import pandas as pd
np.random.seed(0)
my_array = np.random.randint(1,50, size=(5,3))
df = pd.DataFrame(np.concatenate((my_array,np.zeros((my_array.shape[1]-1,
my_array.shape[1])))))
for col in df.columns:
    df[col] = df[col].shift(int(col))
df.fillna(0).values
Output
array([[45., 0., 0.],
[ 4., 48., 0.],
[10., 4., 1.],
[37., 20., 40.],
[25., 24., 22.],
[ 0., 25., 7.],
[ 0., 0., 13.]])
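The same column-by-column shift can also be written without pandas, which keeps the integer dtype. A minimal sketch, assuming the goal is simply to write column j starting at row j of a zero-filled output (the names out, n_rows and n_cols are illustrative, not from the answer above):

```python
import numpy as np

my_array = np.array([[45, 48,  1],
                     [ 4,  4, 40],
                     [10, 20, 22],
                     [37, 24,  7],
                     [25, 25, 13]])

n_rows, n_cols = my_array.shape
out = np.zeros((n_rows + n_cols - 1, n_cols), dtype=my_array.dtype)
for j in range(n_cols):
    # Column j starts j rows lower than column 0.
    out[j:j + n_rows, j] = my_array[:, j]
print(out)
```

The loop runs once per column, so it stays short even when the array has many rows.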

You can create a fancy index for the output using simple broadcasting and padding. First pad the end of your data:
a = np.concatenate((a, np.zeros((a.shape[1] - 1, a.shape[1]), a.dtype)), axis=0)
Now make an index that gets the elements using their negative index. This will make it trivial to roll around the end:
cols = np.arange(a.shape[1])
rows = np.arange(a.shape[0]).reshape(-1, 1) - cols
Now just simply index:
result = a[rows, cols]
For large arrays, this may not be as efficient as running a small loop. At the same time, this avoids actual looping, and allows you to write a one-liner (but please don't):
result = np.concatenate((a, np.zeros((a.shape[1] - 1, a.shape[1]), a.dtype)), axis=0)[np.arange(a.shape[0] + a.shape[1] - 1).reshape(-1, 1) - np.arange(a.shape[1]), np.arange(a.shape[1])]
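Put together on the array from the question, the steps above can be sketched as follows. The name padded is introduced here only to keep the original array intact; the snippets above reuse a.

```python
import numpy as np

a = np.array([[45, 48,  1],
              [ 4,  4, 40],
              [10, 20, 22],
              [37, 24,  7],
              [25, 25, 13]])

n_cols = a.shape[1]
# Append n_cols - 1 rows of zeros; negative row indices will land on them.
padded = np.concatenate((a, np.zeros((n_cols - 1, n_cols), a.dtype)), axis=0)

cols = np.arange(n_cols)                                  # shape (3,)
rows = np.arange(padded.shape[0]).reshape(-1, 1) - cols   # shape (7, 3)
# rows[r, c] == r - c; values of -1 and -2 wrap to the zero rows at the end.
result = padded[rows, cols]
print(result)
```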

Related

How to use numpy.argsort on a 2D array to sort another 2D array

I use numpy.argsort all the time for 1D data, but it seems to behave differently in 2D.
For example, let's say I want to argsort this array along axis 1 so the items in each row are in ascending order:
>>> import numpy as np
>>> arr = np.eye(4)
>>> arr
array([[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.]])
>>> idx = np.argsort(arr, axis=1)
>>> idx
array([[1, 2, 3, 0],
[0, 2, 3, 1],
[0, 1, 3, 2],
[0, 1, 2, 3]])
All fine so far.
Each row of idx gives the order in which the columns of the corresponding row of the second array should be rearranged.
Let's say we want to sort the array below with the above idx.
>>> arr2 = np.arange(16).reshape((4, 4))
>>> arr2
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> sorted = arr2[idx]
>>> sorted
array([[[ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [ 0,  1,  2,  3]],
....
       [[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]]])
>>> sorted.shape
(4, 4, 4)
The result now has an extra dimension.
I was expecting to get.
array([[ 1, 2, 3, 0],
[ 4, 6, 7, 5],
[ 8, 9, 11, 10],
[12, 13, 14, 15]])
I can do this by iterating over the rows, which is bad!
>>> rows = []
>>> for i, row in enumerate(arr2):
...     rows.append(row[idx[i]])
...
>>> np.array(rows)
array([[ 1, 2, 3, 0],
[ 4, 6, 7, 5],
[ 8, 9, 11, 10],
[12, 13, 14, 15]])
np.take_along_axis has an example using argsort:
>>> a = np.array([[10, 30, 20], [60, 40, 50]])
We can sort either by using sort directly, or argsort and this function
>>> np.sort(a, axis=1)
array([[10, 20, 30],
[40, 50, 60]])
>>> ai = np.argsort(a, axis=1); ai
array([[0, 2, 1],
[1, 2, 0]])
>>> np.take_along_axis(a, ai, axis=1)
array([[10, 20, 30],
[40, 50, 60]])
This streamlines the process of applying ai to the array itself. We can do that directly, but it requires a bit more thought about what the index actually represents.
In this example, ai holds index values along axis 1 (values like 0, 1, or 2). This (2,3) index array has to broadcast against a (2,1) array of row indices for axis 0:
In [247]: a[np.arange(2)[:,None], ai]
Out[247]:
array([[10, 20, 30],
[40, 50, 60]])
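Applied to the arrays from the question, both forms can be sketched as below. One assumption is added here: kind="stable" is passed to argsort so that the order of the tied zeros in np.eye(4) is deterministic.

```python
import numpy as np

arr = np.eye(4)
arr2 = np.arange(16).reshape(4, 4)

# Stable sort so ties between the zeros keep their original order.
idx = np.argsort(arr, axis=1, kind="stable")

# Option 1: let NumPy pair each row of idx with the matching row of arr2.
via_take = np.take_along_axis(arr2, idx, axis=1)

# Option 2: explicit advanced indexing; the (4, 1) row index broadcasts
# against the (4, 4) column index.
via_index = arr2[np.arange(arr2.shape[0])[:, None], idx]

print(via_take)
```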

Replace values from (m,n,3) array with conditions from (m,n,1) array

Let's say I have the following array:
a = np.random.randint(5, size=(2000, 2000, 1))
a = np.repeat(a, 3, axis=2) # Using this method to have a (m,n,3) array with the same values
and the next arrays:
val_old = np.array([[0, 0, 0], [3, 3, 3]])
val_new = np.array([[12, 125, 13], [78, 78, 0]])
What I want to do is to replace the values from the array a with the values specified in the array val_new. So, all [0,0,0] arrays would become [12,125,13] and all [3,3,3] would become [78, 78, 0].
I can't find an efficient way to do this... I tried to adapt this solution but it's only for 1-d arrays...
Does anyone know a fast way/method to replace these values ?
Assuming you have a "map" for each integer, you can index a lookup table of shape (5, 3) (one row per integer value) with a (2000, 2000) integer array; the result is a (2000, 2000, 3) array. Example:
val_new = np.array([[12, 125, 13], [0,0,0], [1,3,3], [78, 78, 0]]) #0, 1, 2, 3
a = np.random.randint(4,size=(4,5))
val_new[a] # (4,5,3) shaped array
>>array([[[ 0, 0, 0],
[ 78, 78, 0],
[ 78, 78, 0],
[ 12, 125, 13],
[ 0, 0, 0]],
....
[[ 12, 125, 13],
[ 12, 125, 13],
[ 0, 0, 0],
[ 12, 125, 13],
[ 0, 0, 0]]])
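For the val_old and val_new arrays from the question, such a lookup table could be built as in the sketch below. It assumes, as in the question, that the three channels of a are identical and that the values lie in 0..4; the names labels and lut, and the small (6, 4) stand-in shape, are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(5, size=(6, 4))        # small stand-in for the (2000, 2000) data
a = np.repeat(labels[..., None], 3, axis=2)  # (6, 4, 3), all channels equal

val_old = np.array([[0, 0, 0], [3, 3, 3]])
val_new = np.array([[12, 125, 13], [78, 78, 0]])

# Lookup table: row v holds the replacement triple for value v.
# Start from the identity mapping v -> [v, v, v] so unmapped values stay unchanged.
lut = np.repeat(np.arange(5)[:, None], 3, axis=1)
lut[val_old[:, 0]] = val_new

# Index the table with one channel of a; the result has shape (6, 4, 3).
result = lut[a[..., 0]]
print(result.shape)
```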

Set 3D numpy array value to 0 if last axis index is smaller than value in another 2D array

I have a 3D array a with shape (m, n, p) and a 2D array idx with shape (m, n). I want all elements in a where the last axis index is smaller than the corresponding element in idx to be set to 0.
The following code works. My question is: is there a more efficient approach?
a = np.array([[[1, 2, 3],
[4, 5, 6]],
[[7, 8, 9],
[10, 11, 12]],
[[21, 22, 23],
[25, 26, 27]]])
idx = np.array([[2, 1],
[0, 1],
[1, 1]])
for (i, j), val in np.ndenumerate(idx):
    a[i, j, :val] = 0
The result is
array([[[ 0, 0, 3],
[ 0, 5, 6]],
[[ 7, 8, 9],
[ 0, 11, 12]],
[[ 0, 22, 23],
[ 0, 26, 27]]])
Use broadcasting to create the 3D mask and then assign zeros with boolean-indexing -
mask = idx[...,None] > np.arange(a.shape[2])
a[mask] = 0
Alternatively, we can also use NumPy builtin for outer-greater comparison to get that mask -
mask = np.greater.outer(idx, np.arange(a.shape[2]))
Run on given sample -
In [34]: mask = idx[...,None] > np.arange(a.shape[2])
In [35]: a[mask] = 0
In [36]: a
Out[36]:
array([[[ 0, 0, 3],
[ 0, 5, 6]],
[[ 7, 8, 9],
[ 0, 11, 12]],
[[ 0, 22, 23],
[ 0, 26, 27]]])
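If the original array should stay untouched, the same mask can be combined with np.where, which returns a new array. The sketch below also compares the result against the loop from the question; the names zeroed and expected are illustrative.

```python
import numpy as np

a = np.array([[[1, 2, 3],
               [4, 5, 6]],
              [[7, 8, 9],
               [10, 11, 12]],
              [[21, 22, 23],
               [25, 26, 27]]])
idx = np.array([[2, 1],
                [0, 1],
                [1, 1]])

# (3, 2, 1) compared with (3,) broadcasts to a (3, 2, 3) boolean mask
# that is True wherever the last-axis index is smaller than idx[i, j].
mask = idx[..., None] > np.arange(a.shape[2])
zeroed = np.where(mask, 0, a)   # new array; a itself is not modified

# Reference result using the loop from the question.
expected = a.copy()
for (i, j), val in np.ndenumerate(idx):
    expected[i, j, :val] = 0

print(np.array_equal(zeroed, expected))  # True
```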

How to write numpy where condition based on indices and not values?

I have a 2d numpy array and I need to extract all elements array[i][j] if the conditions
x1range < i < x2range and y1range < j < y2range are satisfied.
How do I write such conditions? Do I need to use mgrid/ogrid?
Edit: I should have mentioned my additional requirement. I was looking for a where condition rather than slicing, because I want to change the values of all the elements that satisfy the above condition to (0,0,0). I assumed that if I had a where condition, I could do that.
Edit2: Also, is it possible to get the 'not' of the above condition?
As in,
if i > x1range and i < x2range and j > y1range and j < y2range: # the above condition
    do nothing # keep original value
else:
    val = (0,0,0)
Problem #1: Getting indices within the range
You could use np.meshgrid to get those indices -
In [145]: x1range,x2range = 2,5
...: y1range,y2range = 1,4
...:
In [146]: np.meshgrid(np.arange(x1range,x2range),np.arange(y1range,y2range))
Out[146]:
[array([[2, 3, 4],
[2, 3, 4],
[2, 3, 4]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
Problem #2 : Extracting or setting input array elements within those ranges
You could use np.ix_ to directly index into the input array arr -
In [148]: arr
Out[148]:
array([[97, 69, 0, 60, 28, 97],
[98, 85, 24, 75, 97, 23],
[70, 25, 77, 86, 93, 66],
[ 0, 85, 51, 17, 40, 92],
[66, 28, 28, 22, 79, 52]])
In [149]: arr[np.ix_(np.arange(x1range,x2range),np.arange(y1range,y2range))]
Out[149]:
array([[25, 77, 86],
[85, 51, 17],
[28, 28, 22]])
With this indexing, one can also set all those elements directly.
Problem #3 : Extracting or setting input array elements NOT within those ranges
To set the elements that do not satisfy the condition to 0 while keeping the rest unchanged, you can use NumPy broadcasting along with boolean indexing like so -
In [150]: Imask = np.in1d(np.arange(arr.shape[0]),np.arange(x1range,x2range))
...: Jmask = np.in1d(np.arange(arr.shape[1]),np.arange(y1range,y2range))
...: arr[~(Imask[:,None] & Jmask)] = 0
...:
In [151]: arr
Out[151]:
array([[ 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0],
[ 0, 25, 77, 86, 0, 0],
[ 0, 85, 51, 17, 0, 0],
[ 0, 28, 28, 22, 0, 0]])
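A mask based purely on the indices can also be built by comparing open index grids, which is close to the "where condition" the question asks for. This is a sketch with np.ogrid rather than the code from the answer above; it uses the same half-open ranges as that answer (lower bound included), so replace <= with < if the strict bounds from the question are wanted.

```python
import numpy as np

arr = np.array([[97, 69,  0, 60, 28, 97],
                [98, 85, 24, 75, 97, 23],
                [70, 25, 77, 86, 93, 66],
                [ 0, 85, 51, 17, 40, 92],
                [66, 28, 28, 22, 79, 52]])
x1range, x2range = 2, 5
y1range, y2range = 1, 4

# i has shape (5, 1) and j has shape (1, 6); comparisons broadcast to (5, 6).
i, j = np.ogrid[:arr.shape[0], :arr.shape[1]]
inside = (x1range <= i) & (i < x2range) & (y1range <= j) & (j < y2range)

# Keep values inside the window, zero everything else (the "not" case).
out = np.where(inside, arr, 0)
print(out)
```

Using ~inside instead of inside in np.where zeroes the window and keeps the rest.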
just a guess.
x=array[x1range:x2range,y1range:y2range]
What about slicing?
array[x1range:x2range,y1range:y2range]
Example:
numpy.array([[1,2,3],[4,5,6],[7,8,9]])[0:2,0:2]
array([[1, 2],
[4, 5]])
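Slices also work on the left-hand side of an assignment, which covers the requirement of setting elements to (0,0,0). A minimal sketch, assuming a hypothetical (rows, cols, 3) array named img and the strict bounds from the question (x1range < i < x2range becomes the slice x1range+1:x2range):

```python
import numpy as np

img = np.arange(5 * 6 * 3).reshape(5, 6, 3)  # hypothetical (rows, cols, 3) data
x1range, x2range = 1, 4
y1range, y2range = 2, 5

# Zero the elements inside the window.
inside = img.copy()
inside[x1range + 1:x2range, y1range + 1:y2range] = (0, 0, 0)

# The "not" case: start from zeros and copy only the window back.
outside = np.zeros_like(img)
outside[x1range + 1:x2range, y1range + 1:y2range] = \
    img[x1range + 1:x2range, y1range + 1:y2range]
```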

How to get a padded slice of a multidimensional array?

I am stuck on a little issue in the project I am currently working on.
Getting straight to the point, let's assume I have a 2-dimensional numpy.array - I will call it arr.
I need to slice arr, but this slice must contain some padding depending on the selected interval.
Example:
arr = numpy.array([
[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[ 11, 12, 13, 14, 15],
[ 16, 17, 18, 19, 20],
[ 21, 22, 23, 24, 25]
])
Currently, NumPy's response for arr[3:7, 3:7] is:
array([[19, 20],
[24, 25]])
But I need it to be padded as if arr were bigger than it really is.
Here is what I need as response for arr[3:7, 3:7]:
array([[19, 20, 0, 0],
[24, 25, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0]])
This padding should also occur in case of negative indices. If the requested slice is bigger than the whole image, padding must occur in all sides, if needed.
Another example, negative indices. This is the expected result for arr[-2:2, -1:3]:
array([[ 0, 0, 0, 0],
[ 0, 0, 1, 2],
[ 0, 0, 6, 7],
[ 0, 0, 11, 12]])
Is there any native numpy function for this? If not, any idea of how can I implement this?
For the first part of your question, you can use simple indexing: create a zero-filled array of the same shape with numpy.zeros_like, then assign the slice into its top-left corner (note that the result keeps the full 5x5 shape of arr):
>>> new=numpy.zeros_like(arr)
>>> part=arr[3:7, 3:7]
>>> i,j=part.shape
>>> new[:i,:j]=part
>>> new
array([[19, 20, 0, 0, 0],
[24, 25, 0, 0, 0],
[ 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0]])
For the second case, however, you cannot use negative indices like this with NumPy arrays. Negative indices are interpreted as counting from the end of the array, so in a 5x5 array the slice -2:2 means 3:2, which contains no rows, and the result is an empty array:
>>> arr[-2:2]
array([], shape=(0, 5), dtype=int64)
You can do something like:
print(np.pad(arr[3:7, 3:7], ((0, 2), (0, 2)), 'constant', constant_values=(0, 0)))
[[19 20 0 0]
[24 25 0 0]
[ 0 0 0 0]
[ 0 0 0 0]]
For the negative indexing:
print(np.pad(arr[max(0, -1):3, 0:2], ((1, 0), (2, 0)), 'constant', constant_values=(0, 0)))
[[ 0 0 0 0]
[ 0 0 1 2]
[ 0 0 6 7]
[ 0 0 11 12]]
Check here for reference
import numpy as np
def convert(inarr, x1, x2, y1, y2):
    xd = x2 - x1
    yd = y2 - y1
    outarr = np.zeros(xd * yd).reshape(xd, yd)
    x1fr = max(0, x1)
    x2fr = min(x2, inarr.shape[0])
    y1fr = max(0, y1)
    y2fr = min(y2, inarr.shape[1])
    x1to = max(0, xd - x2)
    x2to = x1to + x2fr - x1fr
    y1to = max(0, yd - y2)
    y2to = y1to + y2fr - y1fr
    outarr[x1to:x2to, y1to:y2to] = inarr[x1fr:x2fr, y1fr:y2fr]
    return outarr
arr = np.array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
print(convert(arr, -2, 2, -1, 3))
This works, but returns
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 1. 2. 3.]
[ 0. 6. 7. 8.]]
for your negative-index example. You can adjust the arguments to get the result you expect.
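The same idea can be sketched a little more compactly. The helper name padded_slice and its fill parameter are hypothetical, and the sketch assumes r1 < r2 and c1 < c2 with rows given first. With rows first, the matrix the question expects for its negative-index example corresponds to rows -1:3 and columns -2:2.

```python
import numpy as np

def padded_slice(arr, r1, r2, c1, c2, fill=0):
    """Return arr[r1:r2, c1:c2] as if arr were surrounded by `fill`."""
    out = np.full((r2 - r1, c2 - c1), fill, dtype=arr.dtype)
    # Part of the requested window that overlaps the real array.
    rs, re = max(r1, 0), min(r2, arr.shape[0])
    cs, ce = max(c1, 0), min(c2, arr.shape[1])
    if rs < re and cs < ce:
        # Shift by (r1, c1) to get positions inside the output window.
        out[rs - r1:re - r1, cs - c1:ce - c1] = arr[rs:re, cs:ce]
    return out

arr = np.arange(1, 26).reshape(5, 5)
print(padded_slice(arr, 3, 7, 3, 7))
print(padded_slice(arr, -1, 3, -2, 2))
```

Using np.full with arr.dtype keeps the integer dtype, whereas np.zeros without a dtype argument produces floats.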
