I have stacked 5 probability maps in a NumPy array (a, with shape 256x256x5) and taken the argmax across them; the final output is shown in 5 different colors. However, the values of the pixels within a region are not the same (they vary within [0, 1]).
max_= np.argmax(a, axis=2)
plt.imshow(max_)
plt.show()
I do not know how to separate each object by value, because the pixels inside a region do not have the same value. Does someone know how to label these five objects (the colored parts, including the background)?
If I understand the question, you want the maximum probabilities themselves, not the indices of the maximum probabilities. (Small point: if your array really is shape 5 × 256 × 256, then I think you did np.argmax(a, axis=0) to get that result.)
This will give you the maximum probabilities themselves:
max_prob = np.amax(a, axis=0)  # use axis=2 if a is 256 x 256 x 5, as in the question
If you want each 'object' on its own, you could then do this for each of the regions:
prob_1 = np.zeros((256, 256))
prob_1[max_ == 1] = max_prob[max_ == 1]
prob_1[prob_1 == 0] = np.nan
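As a minimal sketch of the steps above, assuming a is a 256 × 256 × 5 stack as in the question (random data stands in for the real probability maps, and the names labels and per_class are illustrative only):

```python
import numpy as np

# Stand-in for the real data: 5 probability maps stacked on the last axis.
rng = np.random.default_rng(0)
a = rng.random((256, 256, 5))

labels = np.argmax(a, axis=2)    # integer class index (0..4) per pixel
max_prob = np.amax(a, axis=2)    # winning probability per pixel

# One map per class: the winning probability where that class wins, NaN elsewhere.
per_class = np.full((5, 256, 256), np.nan)
for k in range(5):
    region = labels == k
    per_class[k][region] = max_prob[region]

print(labels.shape, max_prob.shape, per_class.shape)
```

Here labels holds one constant integer per region, so it can serve directly as the label image, while max_prob holds the values that vary inside a region.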
I'm following this tutorial online from kaggle and I can't get my head round why .T is changing the shape of the matrix. Here is the part I am stuck at:
#saleprice correlation matrix
k = 10 #number of variables for heatmap
cols = corrmat.nlargest(k, 'SalePrice')['SalePrice'].index
cm = np.corrcoef(df_train[cols].values.T)
sns.set(font_scale=1.25)
hm = sns.heatmap(cm, cbar=True, annot=True, square=True, fmt='.2f', annot_kws={'size': 10}, yticklabels=cols.values, xticklabels=cols.values)
plt.show()
I'm basically troubleshooting the code and tried this:
cm = np.corrcoef(df_train[cols].values)
cm.shape
returns a matrix with shape 1460x1460. But when I input:
cm = np.corrcoef(df_train[cols].values.T)
cm.shape
it returns a matrix with shape 10x10. Does anyone know why it does this? I can't figure it out.
The correlation gives you a normalized representation of the covariance matrix between all the "columns" of the dataframe. For instance, in the case of having only two variables, you'd end up with a matrix of the shape:
Rx = [[ 1, r_xy],
[r_yx, 1]]
This is quite an expensive computation, since it involves taking the dot product of each column with the rest, resulting in a correlation coefficient for each combination.
So in matrix notation, since you want to end up with a 10x10 matrix, you want to have the shapes correctly aligned. In this case you want (10,1460)x(1460,10) so you get a 10,10 matrix. Hence you need to transpose the 2D-array so that it has shape (10,1460) when you feed it to np.corrcoef.
Though you might find it a little easier by playing around with it yourself and seeing how the actual Pearson correlation is computed:
X = np.random.randint(0,10,(500,2))
print(np.corrcoef(X.T))
array([[1. , 0.04400245],
[0.04400245, 1. ]])
Which is doing the same as:
mean_X = X.mean(axis=0)
std_X = X.std(axis=0)
n, _ = X.shape
# divide entry (i, j) by n * std_i * std_j, not by n * std_j**2
print((X.T-mean_X[:,None]).dot(X-mean_X)/(n*np.outer(std_X, std_X)))
array([[1. , 0.04400245],
[0.04400245, 1. ]])
Note that, as mentioned, this gives a normalized dot product of X with itself, so for each (1,1460)x(1460,1) product you're getting a single number. So X here, just as in your example, has to be transposed so that the dimensions are correctly aligned.
From numpy documentation of corrcoef:
x : array_like
A 1-D or 2-D array containing multiple variables and observations.
Each row of x represents a variable, and
each column a single observation of all those variables. Also see rowvar below.
Note that each row represents a variable: in the first case you have 1460 rows and 10 columns, and in the second you have 10 rows and 1460 columns.
So when you transpose your NumPy array, you're basically changing from 1460 variables with 10 values each to 10 variables with 1460 values each.
If you are dealing with pandas you could just use the built-in .corr() method that computes the correlation between columns.
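The row-versus-column convention from the documentation excerpt can be checked directly. The following is a small sketch on random data with the same 1460 × 10 layout as the question (X and the variable names are made up for illustration); it also uses the rowvar parameter that the excerpt mentions, which avoids the transpose:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1460, 10))       # 1460 observations (rows) of 10 variables (columns)

by_rows = np.corrcoef(X)              # rows treated as variables    -> shape (1460, 1460)
by_cols = np.corrcoef(X.T)            # columns treated as variables -> shape (10, 10)
same = np.corrcoef(X, rowvar=False)   # columns as variables, without transposing

print(by_rows.shape, by_cols.shape, np.allclose(by_cols, same))
```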
I have found a working code for this, but don't understand everything about these lines:
counter = np.unique(img.reshape(-1, img.shape[2]), axis=0)
print(counter.shape[0])
Especially these values:
-1, img.shape[2], axis=0
What does that -1 do, why is the shape 2, and why is axis 0?
And after that, why do we print shape[0]?
If you don't understand a complex statement, break it up and print the shapes.
print(img.shape)
img2 = img.reshape(-1, img.shape[2]) # reshape the original image to (-1, 3); -1 is a placeholder. Say you have a
# numpy array with shape (6,2): if you reshape it to (-1, 3), the second dim is 3 and
# the first dim is (6*2)/3 = 4, so -1 is replaced with 4
print(img2.shape)
counter = np.unique(img2, axis=0) # find the unique rows, i.e. the unique pixel colors
'''
numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)[source]
Find the unique elements of an array.
Returns the sorted unique elements of an array. There are three optional outputs in addition to the unique elements:
the indices of the input array that give the unique values
the indices of the unique array that reconstruct the input array
the number of times each unique value comes up in the input array
'''
print(counter)
print(counter.shape) # (n, channels): one row per unique color
print(counter.shape[0])
Note that this counts unique rows, that is, unique (R, G, B) combinations, which is the number of distinct colors.
If you flatten the array, build a set from it, and print the length of the set, you instead count the distinct scalar values across all channels, which is a different quantity:
print(len(set(img.flatten())))
Try this:
a = np.array([
[[1,2,3],[1,2,3],[1,2,3],[1,2,3]],
[[1,2,3],[1,2,3],[1,2,3],[1,2,3]]
])
a # to print the contents
a.shape # returns (2, 4, 3)
Now, if you do reshape, it will change the shape, meaning it will re-arrange the items in the array.
# try:
a.reshape(2, 4, 3)
a.reshape(4, 2, 3)
# or even
a.reshape(12, 2, 1)
a.reshape(1, 1, 4, 2, 3)
a.reshape(1, 1, 4, 2, 1, 1, 3)
# or:
a.reshape(24, 1)
a.reshape(1, 24)
If you replace one of the numbers with -1, it will get calculated automatically. So:
a.reshape(-1, 3)
# is the same as
a.reshape(8, 3)
and that'll give you a "vector" of RGB values, in a way: one row per pixel.
So now you have got the reshaped array and you just need to count unique values.
np.unique(a.reshape(8, 3), axis=0)
will return an array of unique values over axis 0 and you will just count them.
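A small worked example may make the counting step concrete. It is only a sketch on a hand-made 2 × 2 image (the array and the names pixels, colours and counts are invented for illustration), and it also shows why counting flattened scalar values is not the same as counting colors:

```python
import numpy as np

# A tiny 2 x 2 RGB image: red, green, blue, red -> three distinct colors.
img = np.array([
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 0, 0]],
], dtype=np.uint8)

pixels = img.reshape(-1, img.shape[2])    # shape (4, 3): one row per pixel
colours, counts = np.unique(pixels, axis=0, return_counts=True)

print(colours.shape[0])                   # number of distinct colors: 3
print(counts)                             # how many pixels have each color
print(len(set(img.flatten().tolist())))   # distinct scalar values (0 and 255): 2
```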
It calculates the number of unique RGB pixel values in the image. In other words, it calculates the number of different colors in the image.
img.reshape(-1, img.shape[2]) : A three-channel image flattened to one row per pixel. A 3-channel image has shape height x width x 3, where the channels (3 here) correspond to RGB or BGR depending on how you read the image. We reshape it into 2 dimensions, with the channel values of each pixel in one row, so the second dimension is the number of channels. If you know the width w and height h of the image, this is equal to img.reshape(w*h, img.shape[2]), which is the same as img.reshape(img.shape[0]*img.shape[1], img.shape[2]). Intuitively, think of it as taking a 3-channel image and laying out the colors of the pixels one after the other. In NumPy you can always give one dimension as -1, and it is calculated automatically from the size of the array and the other dimensions.
Now that we have laid out the pixels one after the other, we can count the unique colors. Since a color is represented by 3 (RGB) values, we want the unique RGB triples, which np.unique gives with axis=0: each row (one pixel's channel values) is compared as a whole. This returns all the unique RGB values as an array of shape n x 3, where n is the number of unique colors. Finally, since we want the count and shape returns (n, 3), we select shape[0], which is n.
Code
# image of size 200 X 200 X 3 => 200 pixels width 200 pixels height => total 200*200 pixels
img = np.random.randint(0,256, (200,200,3))
print (img.shape)
# Flatten the image to 200*200 pixels
img = img.reshape(-1, img.shape[2])
print (img.shape)
# Count unique colors
counter = np.unique(img, axis=0)
# n unique colors (3 values per pixel)
print (counter.shape)
Output
(200, 200, 3)
(40000, 3)
(39942, 3)
I have a batch of images (a 4D tensor/array with dimensions "batchsize x channels x height x width") and I would like to draw horizontal bars of zeros of size s on each image, but across different rows for each image. I can do this trivially with a for loop, but I haven't been able to figure out a vectorized implementation.
Ideally I would generate a 1-D tensor r of "batchsize" random starting points, and do something like
t[:,:,r:r+s,:] = 0. If I try this I get TypeError: only integer scalar arrays can be converted to a scalar index
If I do a toy example and just try to pull out two different sections of a batch with only two images, doing something like t[:,:,torch.tensor(([1,2],[2,3])),:] I get back a 5D tensor because it is pulling both of those sections from both images in the batch. How do I grab those different sections but only one for each image? In this case if the input were 2xCxHxW I would want 2xCx2xW where the first item corresponds to rows 1 and 2 of the first image, and the second item corresponds to rows 2 and 3 of the second image. Thank you.
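You can use this function, which creates a mask that lets you apply an operation along one axis by index. It fills a tensor with each element's index along the last axis and compares that against one random start per image.
import torch
# sgs (the input batch) and size (the bar width) are assumed to be defined by the caller
bsg = sgs.data
device = sgs.device
bs, _, x, y = bsg.shape
max_y = y-size-1
rs = torch.randint(0, max_y, (bs,1), device=device) # one random start per image
m = torch.arange(y,device=device).repeat(bs, x) # (bs, x*y): index along the last axis for every element
gpumask = ((m < rs) | (m > (rs+size))).view(bs, 1, x, -1) # False inside each image's bar
gpumask*bsg # zeros out the bar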
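The same masking idea can be applied along the height axis to get the horizontal bars the question asks for. The sketch below is written in NumPy so that it runs without extra dependencies; the comparisons and broadcasting are expected to carry over to torch.arange and torch.randint unchanged. All sizes and names (B, C, H, W, s, r, bar, windows) are illustrative assumptions, not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)
B, C, H, W, s = 4, 3, 8, 6, 2             # illustrative sizes; s is the bar height
t = np.ones((B, C, H, W))

r = rng.integers(0, H - s + 1, size=B)    # one random starting row per image

rows = np.arange(H)                       # shape (H,)
# bar[b, h] is True when row h lies inside image b's bar [r[b], r[b] + s)
bar = (rows[None, :] >= r[:, None]) & (rows[None, :] < r[:, None] + s)

# Broadcast the (B, H) mask over channels and width, then zero those rows.
t = t * ~bar[:, None, :, None]

# Gathering a different s-row window from each image instead of zeroing it:
idx = r[:, None] + np.arange(s)               # shape (B, s): row indices per image
windows = t[np.arange(B)[:, None], :, idx]    # shape (B, s, C, W)
windows = windows.transpose(0, 2, 1, 3)       # shape (B, C, s, W)
```

Pairing a (B, 1) batch index with the (B, s) row index is what selects one window per image rather than every window from every image.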
I have a custom layer to multiply two tensors A & B of size (x,1) & (1,y), where I want to produce an output C of size (x,y).
To take into account batching i.e. matrices size are actually (?,x,1) & (?,1,y), I am calling:
C = K.batch_dot(A,B, axes = [2,1])
This seems to produce the desired output, but I don't really understand what the axes variable represents here. My intuition is that these are the axes over which we want to perform the matrix multiplication, but I don't understand why it is in the order [2,1] rather than [1,2] (which produced an error).
Can anyone assist me in my understanding?
As per the official documentation here
The lengths of axes[0] and axes[1] should be the same
In your case A has dimensions (?, x, 1) and B has dimensions (?, 1, y).
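So with axes = [2, 1], axis 2 of A (length 1) is matched with axis 1 of B (length 1); axes are counted from 0, with axis 0 being the batch axis. The two lengths are equal, so the product is defined and gives the desired (?, x, y) result. With [1, 2] you would be pairing axis 1 of A (length x) with axis 2 of B (length y), which fails when x and y differ.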
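To see which axes are being paired, here is a NumPy analogue of the contraction. It is a sketch of the same arithmetic using np.einsum, not Keras code, and the sizes are arbitrary:

```python
import numpy as np

batch, x, y = 4, 3, 5
rng = np.random.default_rng(0)
A = rng.random((batch, x, 1))
B = rng.random((batch, 1, y))

# Contract axis 2 of A with axis 1 of B (both of length 1), keeping the batch axis.
C = np.einsum('bik,bkj->bij', A, B)

print(C.shape)
```

The shared index k in the einsum string plays the role of axes = [2, 1]: it names the axis of A and the axis of B that are multiplied and summed.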
I want to remove the locations of bad pixels in my coordinate- and disparity-arrays. Therefore I wrote some code but it feels a bit circuitous and a little too long for the task. The main idea behind the code is that I want all array entries removed that contain a disparity value of -17. The same should happen for my pixel coordinate arrays of my 2000x2000 image.
Here is my code using a mask and flattened arrays. (In the end I want 3 arrays containing x, y and the disparity value, sorted in the same order and without the entries and coordinates of the bad pixels.)
Thanks for any hints that improve this code!
#define 2000x2000 index arrays
xyIdx = np.mgrid[0:disp.shape[0],0:disp.shape[1]]
xIdx = xyIdx[1]
yIdx = xyIdx[0]
#flatten indice-arrays and the disparity-array
xIdxFlat = xIdx.flatten()
yIdxFlat = yIdx.flatten()
dispFlat = disp.flatten()
#create mask with all values = 1 'true'
mask = np.ones(dispFlat.shape, dtype='bool')
#create false entries in the mask wherever the minimum disparity, or rather
#a bad pixel, is located
for x in range(0,len(dispFlat)):
    if dispFlat[x] == -17:
        mask[x] = False
#mask the arrays and remove the entries that belong to the bad pixels
xCoords = np.zeros((xIdxFlat[mask].size), dtype='float64')
xCoords[:] = xIdxFlat[mask]
yCoords = np.zeros((yIdxFlat[mask].size), dtype='float64')
yCoords[:] = yIdxFlat[mask]
dispPoints = np.zeros((dispFlat[mask].size), dtype='float64')
dispPoints[:] = dispFlat[mask]
Create a mask of the valid entries (disp != -17). Use this mask to get the valid row and column indices, which are the Y and X coordinates. Finally, index into the input array with either the mask or the row and column indices to get the filtered data array. That way you won't need any of the flattening.
Hence, the implementation would be -
mask = disp != -17
yCoords, xCoords = np.where(mask) # or np.nonzero(mask)
dispPoints = disp[yCoords, xCoords] # or disp[mask]
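As a quick check of the approach, here is a sketch on a tiny hand-made disparity array (the values are invented; only -17 as the bad-pixel marker comes from the question):

```python
import numpy as np

disp = np.array([[  5., -17.,   7.],
                 [-17.,   2.,   9.]])

mask = disp != -17                     # True for valid pixels
yCoords, xCoords = np.nonzero(mask)    # row and column index of each valid pixel
dispPoints = disp[mask]                # disparity values in the same order

print(xCoords)       # [0 2 1 2]
print(yCoords)       # [0 0 1 1]
print(dispPoints)    # [5. 7. 2. 9.]
```

All three arrays come out in the same row-major order, so entry i of each describes the same pixel.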