Python Numpy array indexing and masking - python

I need help indexing a 3 dimensional array (an RGB/BGR image) using a 2 dimensional array of indices. All values are either 0, 1, or 2 for different color channels. The result should be a 2D array of color values. If anyone can tell me the syntax for this in python that would be great!
For a context of what I am trying to do (also read TLDR below):
I am essentially trying to convert the following code from normal for loop syntax which is very very slow to a more efficient python/numpy syntax:
colorIndices = np.zeros((height, width))  # an array which has the index of the outstanding color
colorIndices -= 1  # all -1's
for x in range(0, width):
    for y in range(0, height):
        pix = img[y, x]  # get the pixel, a 1D array of length 3
        colorID = np.argmax(pix)  # get which index has max value (candidate for outstanding color)
        if pix[colorID] > np.average(pix) + np.std(pix):  # if that value is more than one std dev away from the overall pixel's value, the pixel has an outstanding color
            colorIndices[y, x] = colorID
I then would like to access the outstanding color channels in each pixel using something like:
img[:, :, :] = 0
img[colorIndices] = 255
TLDR: I want to set a pixel to pure blue, green, or red if it is a shade of that color. The way I define whether a pixel is a shade of red is that the R value of the pixel is more than one std above the average of the overall distribution of {R, G, B}.
The broken code I have so far:
colorIDs = np.argmax(img, axis=2)
averages = np.average(img, axis=2)
stds = np.std(img, axis=2)
cutoffs = averages + stds
print(img[colorIDs])

I think you want to apply the 2d index array from argmax along axis 2 (the last axis):
In [38]: img=np.random.randint(0,10,(16,16,3))
In [39]: ids = np.argmax(img, axis=2)
In [40]: ids
Out[40]:
array([[0, 1, 2, 1, 2, 0, 0, 0, 2, 2, 1, 0, 1, 2, 1, 0],
[0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1],
[1, 1, 0, 0, 0, 0, 1, 2, 0, 2, 2, 1, 2, 1, 1, 0],
[2, 0, 1, 2, 0, 0, 1, 1, 0, 2, 2, 1, 1, 1, 1, 2],
[2, 2, 1, 1, 0, 1, 0, 2, 1, 0, 0, 2, 2, 0, 1, 2],
[1, 0, 2, 1, 0, 2, 0, 1, 0, 1, 1, 2, 1, 1, 0, 2],
[1, 0, 0, 0, 1, 2, 1, 0, 1, 2, 1, 1, 1, 2, 0, 0],
[1, 2, 2, 2, 0, 0, 1, 1, 0, 1, 0, 2, 2, 1, 1, 0],
[0, 2, 2, 1, 0, 0, 1, 0, 2, 1, 1, 0, 2, 1, 1, 0],
[1, 0, 2, 1, 2, 0, 1, 1, 0, 2, 2, 2, 1, 1, 0, 1],
[1, 1, 1, 1, 1, 2, 1, 1, 0, 2, 1, 0, 0, 1, 0, 0],
[1, 2, 1, 0, 2, 2, 2, 1, 0, 1, 2, 1, 2, 0, 2, 1],
[2, 0, 2, 1, 2, 0, 1, 1, 2, 2, 2, 2, 1, 0, 2, 1],
[0, 1, 0, 0, 2, 0, 1, 0, 0, 0, 0, 2, 0, 2, 0, 1],
[0, 1, 2, 1, 1, 0, 1, 2, 0, 1, 0, 0, 2, 1, 0, 2],
[0, 0, 2, 2, 2, 2, 2, 1, 0, 0, 0, 2, 0, 0, 1, 1]])
In [41]: I,J = np.ix_(np.arange(16), np.arange(16))
In [42]: img[I,J,ids]
Out[42]:
array([[5, 9, 9, 8, 8, 8, 5, 7, 1, 9, 9, 5, 5, 9, 6, 8],
[6, 7, 5, 8, 5, 6, 9, 6, 7, 7, 7, 8, 3, 7, 9, 5],
[7, 6, 8, 7, 6, 9, 6, 8, 9, 5, 8, 8, 9, 7, 9, 6],
[8, 9, 3, 4, 7, 5, 8, 4, 4, 9, 1, 4, 9, 9, 9, 7],
[9, 8, 9, 7, 9, 8, 7, 5, 8, 9, 9, 6, 9, 5, 8, 8],
[7, 9, 8, 8, 9, 3, 6, 9, 8, 6, 8, 7, 7, 7, 7, 7],
[8, 8, 5, 8, 9, 8, 8, 2, 8, 7, 8, 9, 5, 5, 6, 7],
[9, 6, 6, 9, 5, 3, 6, 4, 7, 6, 8, 8, 6, 3, 9, 9],
[7, 8, 9, 7, 5, 7, 5, 9, 6, 4, 7, 7, 8, 5, 7, 8],
[9, 7, 6, 4, 8, 9, 3, 8, 9, 2, 6, 9, 6, 7, 9, 7],
[9, 8, 6, 6, 5, 9, 3, 9, 2, 4, 9, 5, 9, 9, 6, 9],
[8, 7, 8, 3, 8, 8, 9, 7, 9, 5, 9, 8, 6, 9, 7, 8],
[8, 2, 7, 7, 4, 5, 9, 8, 8, 8, 6, 5, 3, 9, 9, 6],
[6, 8, 8, 5, 8, 8, 8, 9, 3, 7, 7, 8, 5, 4, 2, 9],
[3, 7, 9, 9, 8, 5, 9, 8, 9, 7, 3, 3, 9, 5, 5, 9],
[8, 4, 3, 6, 4, 9, 9, 9, 9, 9, 9, 7, 9, 7, 5, 8]])
Recent numpy versions (1.15 and later) have a function that does this for us:
np.take_along_axis(img, ids[:,:,None], 2)[:,:,0]
To set values, there is the matching np.put_along_axis.
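As a small check of the two approaches above, the following sketch (using a tiny random array as a stand-in for an image, which is an assumption for illustration) picks the per-pixel maximum with fancy indexing and with np.take_along_axis, and then writes values with np.put_along_axis.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 10, size=(4, 5, 3))   # small stand-in for an image
ids = np.argmax(img, axis=2)                # (4, 5): channel index per pixel

# Fancy indexing: row and column grids broadcast against ids
I, J = np.ix_(np.arange(img.shape[0]), np.arange(img.shape[1]))
picked_fancy = img[I, J, ids]

# take_along_axis: the index array needs the same number of dims as img
picked_take = np.take_along_axis(img, ids[:, :, None], axis=2)[:, :, 0]

# put_along_axis: write 255 at the argmax channel of every pixel
out = np.zeros_like(img)
np.put_along_axis(out, ids[:, :, None], 255, axis=2)
```

Both selections should give the same 2D array, equal to img.max(axis=2).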

To convert a numerical index along a given axis into values, you can use np.take_along_axis or fancy indexing. When using a fancy index, you need to index along all axes with arrays whose shapes broadcast to the shape of the final result. np.ogrid helps with this. For an MxNx3 array img (M, N, _ = img.shape), if you had ix = np.argmax(img, axis=2), the index would look like this:
r, c = np.ogrid[:M, :N]
maxes = img[r, c, ix]
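Using take_along_axis saves you a step and some temp arrays. The index array must have the same number of dimensions as img, so add a trailing axis to ix and drop it from the result:
maxes = np.take_along_axis(img, ix[..., None], 2)[..., 0]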
Now create your mask:
significant = np.abs(maxes - img.mean(axis=2)) > img.std(axis=2)
At this point you have a 2D boolean mask and an integer index in the third dimension. The simplest thing is probably to turn the mask into row and column index arrays:
r, c = np.where(significant)
Now you can construct the output:
color_index = np.zeros_like(img)
color_index[r, c, ix[significant]] = 255
While tempting, np.put_along_axis cannot be used here in a straightforward manner. The issue is that masking ix with significant would invalidate its shape: the index array has to keep the same number of dimensions as the target. You could, however, create an intermediate 2D array containing 255 at the locations marked by significant, and use that with put_along_axis (adding a trailing axis to both the indices and the values):
values = np.zeros(significant.shape, dtype=img.dtype)
values[significant] = 255
color_index = np.zeros_like(img)
np.put_along_axis(color_index, ix[..., None], values[..., None], 2)
All combined:
ix = np.argmax(img, axis=2)
significant = np.abs(np.take_along_axis(img, ix[..., None], 2)[..., 0] - img.mean(axis=2)) > img.std(axis=2)
color_index = np.zeros_like(img)
color_index[(*np.where(significant), ix[significant])] = 255
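To check that the vectorized version matches the loop from the question, here is a minimal sketch. The function names outstanding_loop and outstanding_vectorized and the random test image are assumptions made for illustration. It uses the one-sided test from the question (max > mean + std); since the maximum is never below the mean, taking the absolute value of the difference gives the same mask.

```python
import numpy as np

def outstanding_loop(img):
    """Reference: the per-pixel loop from the question (slow)."""
    h, w, _ = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            pix = img[y, x]
            c = np.argmax(pix)
            if pix[c] > np.average(pix) + np.std(pix):
                out[y, x, c] = 255
    return out

def outstanding_vectorized(img):
    """Same rule without Python loops."""
    ix = np.argmax(img, axis=2)
    # index must have the same ndim as img, hence the trailing axis
    maxes = np.take_along_axis(img, ix[..., None], axis=2)[..., 0]
    significant = maxes > img.mean(axis=2) + img.std(axis=2)
    out = np.zeros_like(img)
    r, c = np.nonzero(significant)
    out[r, c, ix[significant]] = 255
    return out

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)
same = np.array_equal(outstanding_loop(img), outstanding_vectorized(img))
```

Each output pixel has at most one channel set to 255, and pixels without an outstanding color stay black.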

Related

How to copy the next row from the previous row in matrix by using python?

I would like to ask how to copy rows to the next rows in a matrix using Python. I tried, but the result is not as expected. Here is the code and the expected result. Thank you in advance.
import numpy as np
Total_T = 6
Total_P = 8
M0 = np.array([[0,0,0,0,0,1,1,1]])
M = np.random.randint(10,size = (Total_T, Total_P))
for k in range(0,Total_T):
    for l in range (0,Total_P):
        if k == Total_T-1:
            M[Total_T-1][l] = M0[0][l]
        #if k == 0:
            #M[0][l] = M0[0][l]
        else:
            M
M_prev = np.empty((Total_T, Total_P), dtype = object)
if M is not None:
    for k in range(0,Total_T):
        for l in range (0,Total_P):
            if k == 0:
                M_prev[0][l] = M0[0][l]
            else:
                M_prev = M[:].copy()
The result of M:
array([[4, 8, 1, 6, 2, 4, 9, 6],
[3, 9, 1, 6, 9, 2, 7, 1],
[9, 4, 5, 9, 3, 5, 1, 2],
[8, 3, 5, 2, 8, 5, 4, 8],
[6, 3, 4, 3, 7, 9, 8, 4],
[0, 0, 0, 0, 0, 1, 1, 1]])
The expected result of M_prev:
array([[0, 0, 0, 0, 0, 1, 1, 1],
[4, 8, 1, 6, 2, 4, 9, 6],
[3, 9, 1, 6, 9, 2, 7, 1],
[9, 4, 5, 9, 3, 5, 1, 2],
[8, 3, 5, 2, 8, 5, 4, 8],
[6, 3, 4, 3, 7, 9, 8, 4]])
However, the result of M_prev from the code:
array([[4, 8, 1, 6, 2, 4, 9, 6],
[3, 9, 1, 6, 9, 2, 7, 1],
[9, 4, 5, 9, 3, 5, 1, 2],
[8, 3, 5, 2, 8, 5, 4, 8],
[6, 3, 4, 3, 7, 9, 8, 4],
[0, 0, 0, 0, 0, 1, 1, 1]])
It looks like you're overcomplicating things a lot…
import numpy as np
Total_T = 6
Total_P = 8
M0 = np.array([[0,0,0,0,0,1,1,1]])
# for reproducibility
np.random.seed(0)
# generate random 2D array
M = np.random.randint(10, size=(Total_T, Total_P))
# replace last row with M0
M[-1] = M0
# roll the array
M_prev = np.roll(M, 1, axis=0)
output:
# M
array([[5, 0, 3, 3, 7, 9, 3, 5],
[2, 4, 7, 6, 8, 8, 1, 6],
[7, 7, 8, 1, 5, 9, 8, 9],
[4, 3, 0, 3, 5, 0, 2, 3],
[8, 1, 3, 3, 3, 7, 0, 1],
[0, 0, 0, 0, 0, 1, 1, 1]])
# M_prev
array([[0, 0, 0, 0, 0, 1, 1, 1],
[5, 0, 3, 3, 7, 9, 3, 5],
[2, 4, 7, 6, 8, 8, 1, 6],
[7, 7, 8, 1, 5, 9, 8, 9],
[4, 3, 0, 3, 5, 0, 2, 3],
[8, 1, 3, 3, 3, 7, 0, 1]])
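Note that np.roll gives the expected M_prev only because the last row of M was set to M0, so the row that wraps around to the top is M0. As a hedged alternative sketch for the case where the last row of M is something else, M0 can be stacked on top of all but the last row of M (the variable names M2 and same_as_roll are made up for this illustration):

```python
import numpy as np

M0 = np.array([[0, 0, 0, 0, 0, 1, 1, 1]])
rng = np.random.default_rng(0)
M = rng.integers(10, size=(6, 8))    # last row is NOT replaced by M0 here

# Put M0 first, then every row of M except the last one
M_prev = np.vstack([M0, M[:-1]])

# When the last row of M is M0, this agrees with np.roll
M2 = M.copy()
M2[-1] = M0
same_as_roll = np.array_equal(np.roll(M2, 1, axis=0), np.vstack([M0, M2[:-1]]))
```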

Numpy: Sample group of indices with different values

Given some numpy array a
array([2,2,3,3,2,0,0,0,2,2,3,2,0,1,1,0])
what is the best way to get all groups of n indices with each of them having a different value in a?
Obviously there is no group larger than the number of unique elements in a, here 4.
So for example, one group of size 4 is
array([0,2,5,13])
Consider that a might be quite long, let's say up to 250k.
If the result gets too large, it might also be desirable not to compute all such groups, but only the first k requested.
For integer inputs, we can use a solution based on this post -
In [41]: sidx = a.argsort() # use kind='mergesort' for first occurrences
In [42]: c = np.bincount(a)
In [43]: np.sort(sidx[np.r_[0,(c[c!=0])[:-1].cumsum()]])
Out[43]: array([ 0, 2, 5, 13])
Another method, closely related to the previous one, works for generic inputs -
In [44]: b = a[sidx]
In [45]: np.sort(sidx[np.r_[True,b[:-1]!=b[1:]]])
Out[45]: array([ 0, 2, 5, 13])
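The generic method can be written out step by step. The sketch below assumes a stable sort (kind="stable") so that the index picked for each value is its first occurrence; the names starts and first_idx are chosen here for illustration.

```python
import numpy as np

a = np.array([2, 2, 3, 3, 2, 0, 0, 0, 2, 2, 3, 2, 0, 1, 1, 0])

sidx = a.argsort(kind="stable")          # stable: equal values keep their original order
b = a[sidx]                              # values in sorted order
starts = np.r_[True, b[:-1] != b[1:]]    # True where a new value begins
first_idx = np.sort(sidx[starts])        # one index per distinct value
print(first_idx)                         # [ 0  2  5 13]
```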
Another approach uses numba for memory efficiency, and hence performance, to select the first indices along those unique groups, with an additional k argument -
import numpy as np
from numba import njit

@njit
def _numba1(a, notfound, out, k):
    iterID = 0
    for i,e in enumerate(a):
        if notfound[e]:
            notfound[e] = False
            out[iterID] = i
            iterID += 1
            if iterID>=k:
                break
    return out

def unique_elems(a, k, maxnum=None):
    # feed in max of the input array as maxnum value if known
    if maxnum is None:
        L = a.max()+1
    else:
        L = maxnum+1
    notfound = np.ones(L, dtype=bool)
    out = np.ones(k, dtype=a.dtype)
    return _numba1(a, notfound, out, k)
Sample run -
In [16]: np.random.seed(0)
...: a = np.random.randint(0,10,200)
In [17]: a
Out[17]:
array([5, 0, 3, 3, 7, 9, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7, 7, 8, 1, 5, 9,
8, 9, 4, 3, 0, 3, 5, 0, 2, 3, 8, 1, 3, 3, 3, 7, 0, 1, 9, 9, 0, 4,
7, 3, 2, 7, 2, 0, 0, 4, 5, 5, 6, 8, 4, 1, 4, 9, 8, 1, 1, 7, 9, 9,
3, 6, 7, 2, 0, 3, 5, 9, 4, 4, 6, 4, 4, 3, 4, 4, 8, 4, 3, 7, 5, 5,
0, 1, 5, 9, 3, 0, 5, 0, 1, 2, 4, 2, 0, 3, 2, 0, 7, 5, 9, 0, 2, 7,
2, 9, 2, 3, 3, 2, 3, 4, 1, 2, 9, 1, 4, 6, 8, 2, 3, 0, 0, 6, 0, 6,
3, 3, 8, 8, 8, 2, 3, 2, 0, 8, 8, 3, 8, 2, 8, 4, 3, 0, 4, 3, 6, 9,
8, 0, 8, 5, 9, 0, 9, 6, 5, 3, 1, 8, 0, 4, 9, 6, 5, 7, 8, 8, 9, 2,
8, 6, 6, 9, 1, 6, 8, 8, 3, 2, 3, 6, 3, 6, 5, 7, 0, 8, 4, 6, 5, 8,
2, 3])
In [19]: unique_elems(a, k=6)
Out[19]: array([0, 1, 2, 4, 5, 8])
Use np.unique for this job. It has several other options; for instance, return_counts=True also returns the number of times each unique item appears in a.
import numpy as np
# Sample data
a = np.array([2,2,3,3,2,0,0,0,2,2,3,2,0,1,1,0])
# The unique values are in 'u'
# The indices of the first occurrence of the unique values are in 'indices'
u, indices = np.unique(a, return_index=True)
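np.unique gives one index per value, which is a single group. To enumerate further groups, or only the first k of them as the question asks, one possible sketch is to bucket the indices by value and then combine buckets lazily with itertools. The helper name index_groups and its parameters are hypothetical, and the number of groups can grow very quickly for a long array, so limiting with k is advisable.

```python
import itertools
import numpy as np

def index_groups(a, n, k=None):
    """Lazily yield up to k tuples of n indices whose values in `a` all differ."""
    order = np.argsort(a, kind="stable")
    # start position of each distinct value within the sorted array
    _, starts = np.unique(a[order], return_index=True)
    buckets = np.split(order, starts[1:])   # one index array per distinct value

    def gen():
        # pick n distinct values, then one index from each chosen bucket
        for chosen in itertools.combinations(buckets, n):
            yield from itertools.product(*chosen)

    return itertools.islice(gen(), k)        # k=None means no limit

a = np.array([2, 2, 3, 3, 2, 0, 0, 0, 2, 2, 3, 2, 0, 1, 1, 0])
first_three = list(index_groups(a, 4, k=3))
```

Because the result is a generator, nothing beyond the requested k groups is computed.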

Most efficient way to roll each column in a numpy array by x steps

I have a 2D array that holds data as shown in the picture below.
Each column needs to be rolled up by a number of steps equal to the column's index. What is the best way to do this? I am thinking about a for loop that iterates over the columns, rolls each one up by i, and assigns the rolled result back to the column of the array.
Is there a more efficient way?
You can use reshape like so:
>>> a = np.r_[np.tril(np.random.randint(1, 10, (10, 10))), np.random.randint(1, 10, (5, 10))]
>>> a
array([[7, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[9, 4, 0, 0, 0, 0, 0, 0, 0, 0],
[5, 3, 6, 0, 0, 0, 0, 0, 0, 0],
[7, 2, 8, 2, 0, 0, 0, 0, 0, 0],
[6, 1, 3, 6, 5, 0, 0, 0, 0, 0],
[7, 4, 1, 2, 9, 8, 0, 0, 0, 0],
[2, 9, 7, 3, 7, 7, 4, 0, 0, 0],
[2, 4, 2, 9, 1, 6, 8, 8, 0, 0],
[8, 5, 4, 5, 2, 6, 5, 5, 8, 0],
[9, 3, 4, 6, 1, 4, 8, 3, 9, 1],
[3, 7, 7, 7, 3, 7, 9, 8, 2, 1],
[8, 1, 2, 6, 3, 1, 4, 8, 7, 8],
[5, 8, 4, 6, 1, 1, 9, 7, 5, 6],
[7, 3, 9, 9, 1, 1, 3, 4, 8, 1],
[4, 7, 7, 5, 7, 9, 4, 8, 2, 7]])
>>>
>>> m, n = a.shape
>>> buffer = np.zeros((m+1, n), dtype=a.dtype, order='F')
>>> buffer.ravel(order='F')[:m*n].reshape(m, n, order='F')[...] = a
>>> result = buffer[:-1]
>>> result
array([[7, 4, 6, 2, 5, 8, 4, 8, 8, 1],
[9, 3, 8, 6, 9, 7, 8, 5, 9, 1],
[5, 2, 3, 2, 7, 6, 5, 3, 2, 8],
[7, 1, 1, 3, 1, 6, 8, 8, 7, 6],
[6, 4, 7, 9, 2, 4, 9, 8, 5, 1],
[7, 9, 2, 5, 1, 7, 4, 7, 8, 7],
[2, 4, 4, 6, 3, 1, 9, 4, 2, 0],
[2, 5, 4, 7, 3, 1, 3, 8, 0, 0],
[8, 3, 7, 6, 1, 1, 4, 0, 0, 0],
[9, 7, 2, 6, 1, 9, 0, 0, 0, 0],
[3, 1, 4, 9, 7, 0, 0, 0, 0, 0],
[8, 8, 9, 5, 0, 0, 0, 0, 0, 0],
[5, 3, 7, 0, 0, 0, 0, 0, 0, 0],
[7, 7, 0, 0, 0, 0, 0, 0, 0, 0],
[4, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
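The reshape approach above does not wrap entries around within a column, which is fine for the example because the shifted-out entries are zeros. If a true circular roll is wanted, where entries pushed off the top reappear at the bottom of the same column, one option is to build the source row for every output position and gather with np.take_along_axis. This is a sketch; the function name roll_columns_up is made up here.

```python
import numpy as np

def roll_columns_up(a):
    """Circularly roll column j of a 2D array up by j steps."""
    m, n = a.shape
    # rows[i, j] = (i + j) % m is the source row for output position (i, j)
    rows = (np.arange(m)[:, None] + np.arange(n)) % m
    return np.take_along_axis(a, rows, axis=0)

a = np.arange(12).reshape(4, 3)
rolled = roll_columns_up(a)
print(rolled)
# [[ 0  4  8]
#  [ 3  7 11]
#  [ 6 10  2]
#  [ 9  1  5]]
```

This is equivalent to applying np.roll(a[:, j], -j) to each column j, without the Python loop.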

Append arrays of different dimensions to get a single array

I have three vectors (numpy arrays), vector_1, vector_2, vector_3,
as follows:
Dimension(vector1)=(200,2048)
Dimension(vector2)=(200,8192)
Dimension(vector3)=(200,32768)
I would like to append these vectors to get vector_4:
Dimension(vector4)= (200,2048+8192+32768)= (200, 43008)
The columns of vector1 should come first, then vector2, then vector3.
I tried the following:
vector4=numpy.concatenate((vector1,vector2,vector3),axis=0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
and
vector4=numpy.append(vector4,[vector1,vector2,vectors3],axis=0)
TypeError: append() missing 1 required positional argument: 'values'
I believe you are looking for numpy.hstack.
>>> import numpy as np
>>> a = np.arange(4).reshape(2,2)
>>> b = np.arange(6).reshape(2,3)
>>> c = np.arange(8).reshape(2,4)
>>> a
array([[0, 1],
[2, 3]])
>>> b
array([[0, 1, 2],
[3, 4, 5]])
>>> c
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
>>> np.hstack((a,b,c))
array([[0, 1, 0, 1, 2, 0, 1, 2, 3],
[2, 3, 3, 4, 5, 4, 5, 6, 7]])
The error message is pretty much telling you exactly what the problem is:
ValueError: all the input array dimensions except for the concatenation axis must match exactly
But you are doing the opposite: the dimensions along the concatenation axis match exactly, but the others don't. Consider:
In [3]: arr1 = np.random.randint(0,10,(20, 5))
In [4]: arr2 = np.random.randint(0,10,(20, 3))
In [5]: arr3 = np.random.randint(0,10,(20, 11))
Note the dimensions. Just give it the correct axis. So use the second rather than the first:
In [8]: arr1.shape, arr2.shape, arr3.shape
Out[8]: ((20, 5), (20, 3), (20, 11))
In [9]: np.concatenate((arr1, arr2, arr3), axis=1)
Out[9]:
array([[3, 1, 4, 7, 3, 6, 1, 1, 6, 7, 4, 6, 8, 6, 2, 8, 2, 5, 0],
[4, 2, 2, 1, 7, 8, 0, 7, 2, 2, 3, 9, 8, 0, 7, 3, 5, 9, 6],
[2, 8, 9, 8, 5, 3, 5, 8, 5, 2, 4, 1, 2, 0, 3, 2, 9, 1, 0],
[6, 7, 3, 5, 6, 8, 3, 8, 4, 8, 1, 5, 4, 4, 6, 4, 0, 3, 4],
[3, 5, 8, 8, 7, 7, 4, 8, 7, 3, 8, 7, 0, 2, 8, 9, 1, 9, 0],
[5, 4, 8, 3, 7, 8, 3, 2, 7, 8, 2, 4, 8, 0, 6, 9, 2, 0, 3],
[0, 0, 1, 8, 6, 4, 4, 4, 2, 8, 4, 1, 4, 1, 3, 1, 5, 5, 1],
[1, 6, 3, 3, 9, 2, 3, 4, 9, 2, 6, 1, 4, 1, 5, 6, 0, 1, 9],
[4, 5, 4, 7, 1, 4, 0, 8, 8, 1, 6, 0, 4, 6, 3, 1, 2, 5, 2],
[6, 4, 3, 2, 9, 4, 1, 7, 7, 0, 0, 5, 9, 3, 7, 4, 5, 6, 1],
[7, 7, 0, 4, 1, 9, 9, 1, 0, 1, 8, 3, 6, 0, 5, 1, 4, 0, 7],
[7, 9, 0, 4, 0, 5, 5, 9, 8, 9, 9, 7, 8, 8, 2, 6, 2, 3, 1],
[4, 1, 6, 5, 4, 5, 6, 7, 9, 2, 5, 8, 6, 6, 6, 8, 2, 3, 1],
[7, 7, 8, 5, 0, 8, 5, 6, 4, 4, 3, 5, 9, 8, 7, 9, 8, 8, 1],
[3, 9, 3, 6, 3, 2, 2, 4, 0, 1, 0, 4, 3, 0, 1, 3, 4, 1, 3],
[5, 1, 9, 7, 1, 8, 3, 9, 4, 7, 6, 7, 4, 7, 0, 1, 2, 8, 7],
[6, 3, 8, 0, 6, 2, 1, 8, 1, 0, 0, 3, 7, 2, 1, 5, 7, 0, 7],
[5, 4, 7, 5, 5, 8, 3, 2, 6, 1, 0, 4, 6, 9, 7, 3, 9, 2, 5],
[1, 4, 8, 5, 7, 2, 0, 2, 6, 2, 6, 5, 5, 4, 6, 1, 8, 8, 1],
[4, 4, 5, 6, 2, 6, 0, 5, 1, 8, 4, 5, 8, 9, 2, 1, 0, 4, 2]])
In [10]: np.concatenate((arr1, arr2, arr3), axis=1).shape
Out[10]: (20, 19)
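Applied to the shapes from the question, a minimal sketch looks like this. The arrays are filled with placeholder values and use float32 only to keep the example light; the real data and dtype are not known.

```python
import numpy as np

# Placeholder arrays with the shapes from the question
vector1 = np.zeros((200, 2048), dtype=np.float32)
vector2 = np.ones((200, 8192), dtype=np.float32)
vector3 = np.full((200, 32768), 2, dtype=np.float32)

# Join along axis 1 (columns); axis 0 (200 rows) is what matches
vector4 = np.concatenate((vector1, vector2, vector3), axis=1)
print(vector4.shape)  # (200, 43008)
```

np.hstack((vector1, vector2, vector3)) gives the same result for 2D inputs.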

Python: what's an elegant way to select specific rows of an 2-d array into a new array?

For example, given a python numpy.ndarray a = array([[1, 2], [3, 4], [5, 6]]), I want to select the 0th and 2nd rows of array a into a new array b, such that b becomes array([[1,2],[5,6]]).
I need the solution to work on more general problems, where the original 2d array can have more rows and I should be able to select the rows based on some disjoint ranges. In general, I was looking for something like a[i:j] + a[k:p], which works for 1-d lists, but it seems 2d arrays won't add up this way.
update
It seems that I can use vstack((a[i:j], a[k:p])) to get this working, but is there a more elegant way to do this?
You can use list indexing:
a[ [0,2], ]
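More generally, to select rows i:j and k:p (I'm assuming in the python sense, meaning rows i to j but not including j):
a[ list(range(i,j)) + list(range(k,p)) , ]
Note that list(range(i,j)) + list(range(k,p)) creates a flat list of [ i, i+1, ..., j-1, k, k+1, ..., p-1 ], which is then used to index the rows of a. (In Python 2, range already returned a list, so range(i,j) + range(k,p) worked directly; in Python 3, range objects cannot be added.)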
numpy is kind of clever when it comes to indexing. You can give it a list of indices and it will return the selected rows.
In : a = numpy.array([[i]*10 for i in range(10)])
In : a
Out:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]])
In : a[[0,5,9]]
Out:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]])
In : a[list(range(0,2))+list(range(5,8))]
Out:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7, 7, 7, 7, 7]])
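Another way to build the row index from disjoint ranges is np.r_, which concatenates slices into a single index array. A short sketch, with a smaller array than above and example values for i, j, k, p chosen for illustration:

```python
import numpy as np

a = np.array([[i] * 3 for i in range(10)])
i, j, k, p = 0, 2, 5, 8

rows = np.r_[i:j, k:p]   # array([0, 1, 5, 6, 7])
b = a[rows]              # rows 0, 1, 5, 6, 7 of a, as a new array
```

The result is the same as np.vstack((a[i:j], a[k:p])), but it scales more easily to many ranges.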
