I found working code for this, but I don't understand everything about these lines:
counter = np.unique(img.reshape(-1, img.shape[2]), axis=0)
print(counter.shape[0])
Especially these values:
-1, img.shape[2], axis=0
What does that -1 do, why is the shape 2, and why is axis 0?
And after that, why do we print shape[0]?
If you don't understand a complex statement, break it up and print the shapes.
print(img.shape)
# Reshape the image to (-1, 3). The -1 is a placeholder that NumPy fills in:
# for an array of shape (6, 2) reshaped to (-1, 3), the second dim is 3,
# so the first dim must be (6*2)/3 = 4, and -1 is replaced with 4.
img2 = img.reshape(-1, img.shape[2])
print(img2.shape)
counter = np.unique(img2, axis=0) # find the unique rows (one row per pixel)
'''
numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)
Find the unique elements of an array.
Returns the sorted unique elements of an array. There are three optional outputs in addition to the unique elements:
the indices of the input array that give the unique values
the indices of the unique array that reconstruct the input array
the number of times each unique value comes up in the input array
'''
print(counter)
print(counter.shape) # (n_unique, channels): one row per unique color, channels along the second dim
print(counter.shape[0])
Note that this counts distinct RGB triples, i.e. distinct colors: each row is compared as a whole.
If you instead want the number of distinct scalar values across all channels, flatten the array to get a flat sequence, use set to find the unique elements, and print the len of the set.
A handy shortcut is ->
print(len(set(img.flatten())))
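To see concretely what each of the two approaches counts, here is a minimal sketch on a made-up 2x2 image (the pixel values are assumptions chosen only for illustration):

```python
import numpy as np

# Hypothetical 2x2 RGB image: two pixels are identical, and the values
# 0 and 255 recur across channels, so the two counts differ.
img = np.array([
    [[255, 0, 0], [255, 0, 0]],
    [[0, 255, 0], [0, 0, 255]],
], dtype=np.uint8)

# Distinct (R, G, B) triples: unique rows of the (n_pixels, 3) array.
n_colors = np.unique(img.reshape(-1, img.shape[2]), axis=0).shape[0]

# Distinct scalar values over all channels.
n_values = len(set(img.flatten().tolist()))

print(n_colors)  # 3
print(n_values)  # 2
```

The row-wise np.unique counts whole colors, while the set counts individual channel values, so pick whichever matches what you actually want to measure.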
Try this:
a = np.array([
[[1,2,3],[1,2,3],[1,2,3],[1,2,3]],
[[1,2,3],[1,2,3],[1,2,3],[1,2,3]]
])
a # to print the contents
a.shape # returns (2, 4, 3)
Now, if you reshape, the shape changes: the same items, in the same order, are arranged along different dimensions.
# try:
a.reshape(2, 4, 3)
a.reshape(4, 2, 3)
# or even
a.reshape(12, 2, 1)
a.reshape(1, 1, 4, 2, 3)
a.reshape(1, 1, 4, 2, 1, 1, 3)
# or:
a.reshape(24, 1)
a.reshape(1, 24)
If you replace one of the numbers with -1, it will get calculated automatically. So:
a.reshape(-1, 3)
# is the same as
a.reshape(8, 3)
and that'll give you a "vector" of RGB values, in a way.
So now you have got the reshaped array and you just need to count unique values.
np.unique(a.reshape(8, 3), axis=0)
will return an array of the unique rows (axis=0 compares whole rows), and you just count them.
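As a small variation on the array above (the one differing pixel [9, 9, 9] is an assumption added so that there is more than one unique row), this sketch shows what the row-wise np.unique returns, together with how often each row occurs:

```python
import numpy as np

a = np.array([
    [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]],
    [[1, 2, 3], [1, 2, 3], [1, 2, 3], [9, 9, 9]],
])

rows = a.reshape(-1, 3)  # -1 is inferred as 2 * 4 = 8
unique_rows, counts = np.unique(rows, axis=0, return_counts=True)

print(rows.shape)            # (8, 3)
print(unique_rows.tolist())  # [[1, 2, 3], [9, 9, 9]]
print(counts.tolist())       # [7, 1]
print(unique_rows.shape[0])  # 2
```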
It calculates the number of unique RGB pixel values in the image. In other words, it calculates the number of different colors in the image.
img.reshape(-1, img.shape[2]): a three-channel image flattened to one row per pixel. A 3-channel image has shape height x width x 3, where the 3 channels correspond to RGB or BGR depending on how you read the image. We are reshaping it into 2 dimensions, with one row of channel values per pixel, so the second dimension is the number of channels. If you know the width and height of the image, it is equal to img.reshape(w*h, img.shape[2]), which is the same as img.reshape(img.shape[0]*img.shape[1], img.shape[2]). Intuitively, think of it as taking a 3-channel image and laying out the colors of the pixels one after the other. In NumPy you can always pass one dimension as -1, and it is calculated automatically from the size of the array and the other dimensions.
Now that we have laid out the pixels one after the other, we can calculate the number of unique colors. Since a color is represented by 3 (RGB) values, we want the unique RGB triples, which np.unique gives us with axis=0: each row (one pixel's channel values) is compared as a whole. This returns all the unique RGB values as an array of shape n x 3, where n is the number of unique colors. Finally, since we want the count and shape returns (n, 3), we select shape[0], which gives n.
Code
# image of size 200 X 200 X 3 => 200 pixels width 200 pixels height => total 200*200 pixels
img = np.random.randint(0,256, (200,200,3))
print (img.shape)
# Flatten the image to 200*200 pixels
img = img.reshape(-1, img.shape[2])
print (img.shape)
# Count unique colors
counter = np.unique(img, axis=0)
# n unique colors (3 values per pixel)
print (counter.shape)
Output
(200, 200, 3)
(40000, 3)
(39942, 3)
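If you want to convince yourself that the -1 form and the explicit form are the same thing, here is a quick check on a small random image (the 4 x 5 size and the seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 5, 3))  # small stand-in image
h, w, c = img.shape

flat_auto = img.reshape(-1, c)         # first dim inferred by NumPy
flat_explicit = img.reshape(h * w, c)  # first dim spelled out

same = np.array_equal(flat_auto, flat_explicit)

# Row k of the flattened array is the pixel at (k // w, k % w),
# i.e. the pixels are laid out row by row.
k = 7
pixel_match = np.array_equal(flat_auto[k], img[k // w, k % w])

print(flat_auto.shape)    # (20, 3)
print(same, pixel_match)  # True True
```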
Related
I am analyzing some image datasets using Keras. I am stuck because my two feature arrays have different dimensions. Please see the snapshot. features has 14637 images with dimension (10,10,3) and features2 has dimension (10,10,100).
Is there any way I can merge/concatenate these two arrays together?
If features and features2 contain the features of the same batch of images, that is, features[i] and features2[i] describe the same image for each i, then it would make sense to group the features in a single array using the numpy function concatenate():
newArray = np.concatenate((features, features2), axis=3)
Where 3 is the axis along which the arrays will be concatenated. In this case, you'll end up with a new array having dimension (14637, 10, 10, 103).
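A minimal sketch of that concatenation, using dummy arrays and a batch of 4 as a stand-in for the 14637 images (both are assumptions, just to keep the example small):

```python
import numpy as np

n = 4  # stand-in for 14637
features = np.zeros((n, 10, 10, 3))
features2 = np.ones((n, 10, 10, 100))

# All axes except axis 3 match, so the arrays can be joined along axis 3.
merged = np.concatenate((features, features2), axis=3)

print(merged.shape)  # (4, 10, 10, 103)
# merged[..., :3] holds features, merged[..., 3:] holds features2.
```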
However, if they refer to completely different batches of images and you would like to merge them on the first axis, so that the 14637 images of features2 are placed after the first 14637 images, then there is no way to end up with a single array: a NumPy array is structured as a matrix, not as a list of objects, so all dimensions other than the concatenation axis must match.
For instance, if you try to execute:
a = np.array([[0, 1, 2]])  # shape = (1, 3)
b = np.array([[0, 1]])  # shape = (1, 2)
c = np.concatenate((a, b), axis=0)
Then, you'll get:
ValueError: all the input array dimensions except for the concatenation axis must match exactly
since you are concatenating along axis = 0 but axis 1's dimensions differ.
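The same failure can be reproduced in a way that does not stop the script. The sketch below also shows that a plain Python list, unlike an array, can hold items of different shapes (the variable names failed and ragged are made up for the example):

```python
import numpy as np

a = np.array([[0, 1, 2]])  # shape (1, 3)
b = np.array([[0, 1]])     # shape (1, 2)

try:
    np.concatenate((a, b), axis=0)
    failed = False
except ValueError as exc:
    failed = True
    print("concatenate failed:", exc)

# A list does not require matching shapes.
ragged = [a, b]
print([item.shape for item in ragged])  # [(1, 3), (1, 2)]
```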
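If dealing with NumPy arrays, you should be able to use np.concatenate and specify the axis along which the data should be merged, i.e. the axis on which the shapes differ. Basically, with axis=2 being the channel axis of a single (10, 10, 3) image (for a whole batch with the image index first it would be axis=3): np.concatenate((array_a, array_b), axis=2)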
I think it would be better to use a class.
class your_class:
    def __init__(self, array_1, array_2):
        self.array_1 = array_1
        self.array_2 = array_2

final_array = []
for x in range(len(your_previous_one_array)):
    # pair the x-th item of each array in one object
    temp_class = your_class(your_previous_one_array[x], your_previous_two_array[x])
    final_array.append(temp_class)
I'm trying to compute the dot product of each pixel of an RGB image with the corresponding pixel of a grayscale image.
I need to assign the result of the dot product to axis 2 (the last axis) of ori_img,
to create a "pixel by pixel dot product" image from np_img_R and np_img_S.
The code is as follows:
# I have (826, 600, 3) 3D tensor, which will be used as final result RGB color image
ori_img=np.zeros((826, 600, 3))
# And I also have (826, 600, 3) 3D tensor np_img_R as RGB color image,
# and (826, 600) 2D matrix np_img_S as grayscale image
# I'd like to perform dot product of 2 axis of np_img_R (3 length list like [x x x])
# and one scalar value of np_img_S indexed by row and column indices
# And I'd like to assign result of dot product into 3 axis of ori_img like [92 22 12]
# So, I used double for loop
for i in range(0,825):
    for j in range(0,599):
        # np_img_R[i,j,:]: select all from last axis
        # np_img_S[i,j]: index by row and column
        # ori_img[i,j,:]: assign result of dot product into last axis
        # to create new image
        ori_img[i,j,:]=np.dot(np_img_R[i,j,:],np_img_S[i,j])
# And I checked whether "result of dot product" and "assigned value in ori_img 3D tensor" are same
print(np.dot(np_img_R[456,232,:],np_img_S[456,232]))
# [130 250 255]
print(ori_img.astype(int)[456,232,:])
# [130 250 255]
# Above results show same result
print(np.dot(np_img_R[825,599,:],np_img_S[825,599]))
# [ 50 29 113]
print(ori_img.astype(int)[825,599,:])
# [0 0 0]
# But above results show different result, actually ori_img is not assigned by result of dot product
print(np_img_R.shape)
# (826, 600, 3)
print(ori_img.shape)
# (826, 600, 3)
# Shape of both show same
I'd like to assign the result of the dot product to the last axis of ori_img.
How can I fix it?
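The last row and column are never assigned because range(0, 825) stops at 824 and range(0, 599) stops at 598; range(826) and range(600) would cover the whole image. That said, np.dot() of a vector and a scalar is just elementwise multiplication, and because NumPy offers broadcasting, there isn't even a need for a double loop. Instead you can just do
ori_img = np_img_R * np_img_S[..., None]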
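To check that the broadcast one-liner really matches a pixel-by-pixel loop, here is a small sketch on random stand-in data (the 4 x 5 size, the value ranges, and the seed are assumptions; the real images are 826 x 600):

```python
import numpy as np

rng = np.random.default_rng(0)
np_img_R = rng.integers(0, 256, size=(4, 5, 3))  # small stand-in RGB image
np_img_S = rng.integers(0, 2, size=(4, 5))       # small stand-in grayscale image

# np_img_S[..., None] has shape (4, 5, 1) and broadcasts over the 3 channels.
vectorized = np_img_R * np_img_S[..., None]

# Reference loop covering every row and column of the image.
looped = np.zeros_like(vectorized)
for i in range(np_img_R.shape[0]):
    for j in range(np_img_R.shape[1]):
        looped[i, j, :] = np.dot(np_img_R[i, j, :], np_img_S[i, j])

print(np.array_equal(vectorized, looped))  # True
```

Looping over range(np_img_R.shape[0]) and range(np_img_R.shape[1]) rather than hard-coded bounds also avoids skipping the last row and column.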
I have stacked 5 probability maps in a NumPy array (a, with shape 256x256x5) and taken the argmax over them; the final output is shown in 5 different colors. However, the values corresponding to the pixels within an area are not the same (they vary within [0,1]).
max_= np.argmax(a, axis=2)
plt.imshow(max_)
plt.show()
I do not know how to separate each object by value, because the pixels inside a region do not have the same values. Does someone know how to label these five objects (the colored parts, including the background)?
If I understand the question, you want the maximum probabilities themselves, not the indices of the maximum probabilities. (Small point: the axis passed to np.argmax must be the one that indexes the five maps. For the 256 × 256 × 5 array you describe that is axis=2, as in your code; if your array were 5 × 256 × 256, it would be axis=0.)
This will give you the maximum probabilities themselves, using the same axis as the argmax:
max_prob = np.amax(a, axis=2)
If you want each 'object' on its own, you could then do this for each of the regions:
prob_1 = np.zeros((256, 256))
prob_1[max_ == 1] = max_prob[max_ == 1]
prob_1[prob_1 == 0] = np.nan
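Here is a tiny worked example of the same idea, assuming the maps are stacked along the last axis as in the question (the 2 x 2 x 3 array and its values are made up). It uses np.where as a compact alternative to filling a zero array and then replacing the zeros with NaN:

```python
import numpy as np

# Hypothetical stack of 3 probability maps over a 2x2 image, shape (2, 2, 3).
a = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]],
])

labels = np.argmax(a, axis=2)  # index of the winning map per pixel
max_prob = np.amax(a, axis=2)  # probability of the winning map per pixel

# Winning probability where map 1 wins, NaN everywhere else.
prob_1 = np.where(labels == 1, max_prob, np.nan)

print(labels.tolist())    # [[0, 1], [2, 0]]
print(max_prob.tolist())  # [[0.7, 0.8], [0.5, 0.6]]
print(prob_1)
```

labels is already a label image with one integer per object, so it can be passed to plt.imshow directly.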
I have a numpy array of size 5000x32x32x3. The number 5000 is the number of images and each image is 32x32 in width and height and has 3 color channels.
Now I would like to create a numpy array of shape 5000x3x32x32 in a way that the data is preserved.
What I mean by preserving data is :
There should be 5000 data points in the resulting array
The 2nd dimension (3) of the array correctly determines the color channel, i.e. all the elements whose 2nd index is 0 belong to the red channel, those whose 2nd index is 1 belong to the green channel, and those whose 2nd index is 2 belong to the blue channel.
Simply reshaping the data with np.reshape(data,(5000,3,32,32)) would not work, as it would not preserve the channels but just reshape the data into the desired shape.
I think you are looking for a permutation of the axes, numpy.transpose can get this job done:
data = np.transpose(data, (0, 3, 1, 2))
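A small check of the difference between the two operations, using a made-up batch of 2 images of size 4 x 4 instead of 5000 images of size 32 x 32 (the sizes are assumptions to keep it readable):

```python
import numpy as np

# Stand-in batch: 2 images, 4x4 pixels, 3 channels (channels last).
data = np.arange(2 * 4 * 4 * 3).reshape(2, 4, 4, 3)

moved = np.transpose(data, (0, 3, 1, 2))  # channels moved to axis 1

# After the transpose, moved[n, c] is exactly channel c of image n.
transpose_ok = all(
    np.array_equal(moved[n, c], data[n, :, :, c])
    for n in range(2)
    for c in range(3)
)

# A plain reshape gives the same shape but mixes the channels.
reshaped = data.reshape(2, 3, 4, 4)
reshape_ok = np.array_equal(reshaped[0, 0], data[0, :, :, 0])

print(moved.shape)   # (2, 3, 4, 4)
print(transpose_ok)  # True
print(reshape_ok)    # False
```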
I have a lot of 750x750 images. I want to take the geometric mean of non-overlapping 5x5 patches from each image, and then for each image, average those geometric means to create one feature per image. I wrote the code below, and it seems to work just fine. But, I know it's not very efficient. Running it on 300 or so images takes around 60 seconds. I have about 3000 images. So, while it works for my purpose, it's not efficient. How can I improve this code?
#each sublist of gmeans will contain a list of 22500 geometric means
#corresponding to the non-overlapping 5x5 patches for a given image.
gmeans = [[],[],[],[],[],[],[],[],[],[],[],[]]
#the loop here populates gmeans.
for folder in range(len(subfolders)):
    just_thefilename, colorsourceimages, graycroppedfiles = get_all_images(folder)
    for items in graycroppedfiles:
        myarray = misc.imread(items)
        area_of_big_matrix=750*750
        area_of_small_matrix= 5*5
        how_many = area_of_big_matrix / area_of_small_matrix
        n = 0
        p = 0
        mylist=[]
        while len(mylist) < how_many:
            mylist.append(gmean(myarray[n:n+5,p:p+5],None))
            n=n+5
            if n == 750:
                p = p+5
                n = 0
        gmeans[folder].append(mylist)
#each sublist of mean_of_gmeans will contain just one feature per image, the mean of the geometric means of the 5x5 patches.
mean_of_gmeans = [[],[],[],[],[],[],[],[],[],[],[],[]]
for folder in range(len(subfolders)):
    for items in range(len(gmeans[0])):
        mean_of_gmeans[folder].append((np.mean(gmeans[folder][items],dtype=np.float64)))
I can understand the suggestion to move this to the code review site,
but this problem provides a nice example of the power of using vectorized
numpy and scipy functions, so I'll give an answer.
The function below, cleverly called func, computes the desired value.
The key is to reshape the image into a four-dimensional array. Then
it can be interpreted as a two-dimensional array of two-dimensional
arrays, where the inner arrays are the 5x5 blocks.
scipy.stats.gmean can compute the geometric mean over more than one
dimension, so that is used to reduce the four-dimensional array to the
desired two-dimensional array of geometric means. The return value is the
(arithmetic) mean of those geometric means.
import numpy as np
from scipy.stats import gmean
def func(img, blocksize=5):
    # img must be a 2-d array whose dimensions are divisible by blocksize.
    if (img.shape[0] % blocksize) != 0 or (img.shape[1] % blocksize) != 0:
        raise ValueError("blocksize does not divide the shape of img.")
    # Reshape 'img' into a 4-d array 'blocks', so blocks[i, :, j, :] is
    # the subarray with shape (blocksize, blocksize).
    blocks_nrows = img.shape[0] // blocksize
    blocks_ncols = img.shape[1] // blocksize
    blocks = img.reshape(blocks_nrows, blocksize, blocks_ncols, blocksize)
    # Compute the geometric mean over axes 1 and 3 of 'blocks'. This results
    # in the array of geometric means with size (blocks_nrows, blocks_ncols).
    gmeans = gmean(blocks, axis=(1, 3), dtype=np.float64)
    # The return value is the average of 'gmeans'.
    avg = gmeans.mean()
    return avg
For example, here the function is applied to an array with shape (750, 750).
In [358]: np.random.seed(123)
In [359]: img = np.random.randint(1, 256, size=(750, 750)).astype(np.uint8)
In [360]: func(img)
Out[360]: 97.035648309350179
It isn't easy to verify that that is the correct result, so here is a much smaller example:
In [365]: np.random.seed(123)
In [366]: img = np.random.randint(1, 4, size=(3, 6))
In [367]: img
Out[367]:
array([[3, 2, 3, 3, 1, 3],
       [3, 2, 3, 2, 3, 2],
       [1, 2, 3, 2, 1, 3]])
In [368]: func(img, blocksize=3)
Out[368]: 2.1863131342986666
Here is the direct calculation:
In [369]: 0.5*(gmean(img[:,:3], axis=None) + gmean(img[:, 3:], axis=None))
Out[369]: 2.1863131342986666
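For readers who want to see what the block reshape is doing without SciPy, here is a NumPy-only sketch of the same computation. It is not the func above: the name block_gmean_mean is made up, the geometric mean is computed as exp(mean(log(x))), and it assumes all pixel values are strictly positive.

```python
import numpy as np

def block_gmean_mean(img, blocksize):
    """Mean of the geometric means of non-overlapping square blocks."""
    rows, cols = img.shape
    if rows % blocksize or cols % blocksize:
        raise ValueError("blocksize does not divide the shape of img.")
    # blocks[i, :, j, :] is the (blocksize, blocksize) block at block-row i,
    # block-column j.
    blocks = img.reshape(rows // blocksize, blocksize, cols // blocksize, blocksize)
    # Geometric mean = exp of the arithmetic mean of the logs (needs values > 0).
    log_means = np.log(blocks.astype(np.float64)).mean(axis=(1, 3))
    return np.exp(log_means).mean()

# The small 3x6 example from above, with its two 3x3 blocks.
img = np.array([[3, 2, 3, 3, 1, 3],
                [3, 2, 3, 2, 3, 2],
                [1, 2, 3, 2, 1, 3]])

result = block_gmean_mean(img, blocksize=3)

# Direct calculation on each 3x3 block for comparison.
left = np.exp(np.log(img[:, :3]).mean())
right = np.exp(np.log(img[:, 3:]).mean())
direct = 0.5 * (left + right)

print(result, direct)  # both approximately 2.18631313
```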