I have a matrix consisting of True and False values. I want to print this as an image where all the True values are white and the False values are black. The matrix is called indices. I have tried the following:
indices = indices.astype(int) #To convert the true to 1 and false to 0
indices*=255 #To change all the 1's to 255
cv2.imshow('Indices',indices)
cv2.waitKey()
This is printing a fully black image. When I try print((indices==255).sum()), it returns a value of 669, which means there are 669 elements/pixels in the indices matrix that should be white. But I can only see a pure black image. How can I fix this?
As far as I know, OpenCV represents an image either as a matrix of floats ranging from 0 to 1, or as a matrix of integers with values between the minimum and the maximum of that type.
A plain int has no fixed bounds (except the limits of what can be represented with all available memory), so astype(int) typically gives you a wide type such as int64. If you instead use np.uint8, you are working with unsigned bytes, where the minimum is 0 and the maximum is 255.
So there are several options. The two most popular would be:
Cast to np.uint8 and then multiply by 255:
indices = indices.astype(np.uint8) #convert to an unsigned byte
indices*=255
cv2.imshow('Indices',indices)
cv2.waitKey()
Use a float representation:
indices = indices.astype(float)
cv2.imshow('Indices',indices)
cv2.waitKey()
Note that you could also use np.uint16, for instance, for unsigned 16-bit integers; in that case you have to multiply by 65535. The advantage of this approach is that you can use an arbitrary color depth: although most image formats use 24-bit color (8 bits per channel), there is no reason not to use 48-bit color. If you are, for instance, doing image processing for a glossy magazine, it can be beneficial to work with more color depth.
Furthermore, even if the end result uses a 24-bit color palette, it can sometimes pay off to use a higher color depth for the intermediate steps of the image processing, as sketched below.
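For illustration, a minimal sketch of the np.uint16 variant (assuming indices is still the original boolean matrix):
indices = indices.astype(np.uint16) # booleans become 0 or 1 as unsigned 16-bit integers
indices *= 65535 # scale 1 up to 65535, the 16-bit white
cv2.imshow('Indices', indices)
cv2.waitKey()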
I was working with the Laplacian function to detect edges in OpenCV when I ran into some confusion regarding the underlying principles behind the code.
The documentation shows an image being read with the following code and passed through the Laplacian function.
img = cv2.imread("messi5.jpg", cv2.IMREAD_GRAYSCALE)
lap = cv2.Laplacian(img, cv2.CV_32F, ksize=1)
Now, I am able to understand the code written above pretty well. As I understand it, we read in an image and calculate the Laplacian at each pixel. This value can be bigger or smaller than the original 8-bit unsigned int pixel, so we store it in an array of 32-bit floats.
My confusion begins with the next few lines of code. In the documentation, the image is converted back to an 8-bit unsigned integer using the convertScaleAbs() function and then displayed, as seen below.
lap = cv2.convertScaleAbs(lap)
cv2.imshow('Laplacian', lap)
However, my instructor showed me the following method of converting back to uint8:
lap = np.uint8(np.absolute(lap))
cv2.imshow('Laplacian', lap)
Surprisingly, both solutions display identical images. However, I am unable to understand why this occurs. From what I've seen, np.uint8 simply truncates values (floats, etc.) down to unsigned 8-bit integers. So, for example, 1025 becomes 1, as all the bits beyond the 8th bit are discarded.
Yet this would literally mean that any value of our Laplacian for each pixel would become heavily reduced and muddled. If our Laplacian for a pixel was 1024 (signaling a non-zero second derivative in both the x and y dimensions), we would instead have the value 0 on hand (signaling a second derivative of zero and a possible local max/min, or in other words an edge). Thus, by my logic, my instructor's solution should fail miserably, but surprisingly everything works fine. Why is this?
On the other hand, I do not have any idea how convertScaleAbs() works. I'm going to assume it works similarly to my instructor's solution, but I'm not sure. Can someone please explain what's going on?
OpenCV BGR or grayscale images have pixel values from 0 to 255 when stored as CV_8U (8-bit), which corresponds to np.uint8; see the OpenCV documentation for more details.
So when you use the Laplacian function with ddepth (the desired depth of the destination image) set to cv2.CV_32F, you get this:
lap = cv2.Laplacian(img, cv2.CV_32F, ksize=1)
print(np.amax(lap)) #=> 317.0
print(np.amin(lap)) #=> -315.0
So, you need to convert back to np.uint8, for example:
lap_uint8 = lap.copy()
lap_uint8[lap > 255] = 255 # saturate everything above 255
lap_uint8[lap < 0] = 0 # saturate everything below 0
lap_uint8 = lap_uint8.astype(np.uint8) # now safe to cast
print(np.amax(lap_uint8)) #=> 255
print(np.amin(lap_uint8)) #=> 0
Or use any other, more straightforward way that does the same.
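For instance, np.clip performs the same saturation in one step (a sketch, reusing the CV_32F lap from above):
lap_uint8 = np.clip(lap, 0, 255).astype(np.uint8) # saturate to [0, 255], then cast
print(np.amax(lap_uint8)) #=> 255
print(np.amin(lap_uint8)) #=> 0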
But you can also pass -1 as the ddepth argument (see the documentation) to get the saturated 8-bit result directly:
lap = cv2.Laplacian(img, -1, ksize=1)
print(np.amax(lap)) #=> 255
print(np.amin(lap)) #=> 0
By contrast, taking the absolute value of the CV_32F result is where you can get a wrong result, because values above 255 survive:
lap_abs = np.absolute(lap)
print(np.amax(lap_abs)) #=> 317.0
print(np.amin(lap_abs)) #=> 0.0
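To see why that matters: casting to np.uint8 wraps modulo 256 rather than saturating, so values above 255 come back as small numbers. A quick check:
print(np.uint8(317.0)) #=> 61, since 317 % 256 == 61
print(np.uint8(np.absolute(-315.0))) #=> 59, since 315 % 256 == 59
This is also why cv2.convertScaleAbs() is the safer route: it takes the absolute value and then saturates to [0, 255] instead of wrapping.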
There is something that I probably misunderstand about data types in images. Let's say we have a uint8 image. Since uint8 ranges from 0 to 255, 0 is the darkest and 255 is the brightest intensity value.
The same logic would make -32768 the darkest and 32767 the brightest intensity value for an int16 image. However, I have an int16 image (it is originally a DICOM) where the darkest pixel is -1024 and the brightest is 4095. I say int16 because the pixels are stored in an int16 NumPy array.
In addition, when I concatenate two int16 NumPy arrays, one being a = np.ones((100,100), dtype=np.int16) * 32767 and the other b = np.ones((100,100), dtype=np.int16) * 32766, the result displays as a binary image where the 32767s are white and the 32766s are black.
Can someone help me about what I am getting wrong?
Short answer
Nothing is wrong, this is how DICOM works.
Long answer
In the DICOM standard, a pixel value is not directly related to its color (gray level). Pixel values correspond to physical properties of the acquired item (e.g. in Computed Tomography, pixel values are measured in Hounsfield Units*).
The gray level displayed for each pixel is determined dynamically from arbitrarily chosen minimum and maximum values, which are set by the user. Every pixel value less than or equal to the minimum is black, every pixel value greater than or equal to the maximum is white, and the values in between are linearly interpolated gray levels.
So it is perfectly fine that in your binary image the black minimum is 32766 and the white maximum is 32767.
If you use a DICOM viewer, you can dynamically change these minimum and maximum values, and thereby the overall contrast and brightness of the image. This is necessary for radiologists to diagnose e.g. lungs and bones in different ranges. And if you export DICOM to another file format, you have to choose a color mapping; normally it is the full range (the lowest value becomes black, the brightest becomes white).
There are two other values that are often used instead of minimum and maximum: "window width" (ww) and "window level" (wl), where ww = max - min and wl = (max + min) / 2. A sketch of applying such a window is shown below.
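As an illustration (not part of the original answer), a minimal sketch of windowing an int16 pixel array into an 8-bit grayscale image; wl and ww are the window level and width described above, and hu_array is a hypothetical array of Hounsfield values:
import numpy as np

def apply_window(pixels, wl, ww):
    lo = wl - ww / 2.0 # recover the minimum from level and width
    hi = wl + ww / 2.0 # recover the maximum
    out = (pixels.astype(np.float64) - lo) / (hi - lo) # linear interpolation
    return (np.clip(out, 0.0, 1.0) * 255).astype(np.uint8) # saturate and cast

# e.g. a typical CT lung window: wl = -600, ww = 1500
# display = apply_window(hu_array, wl=-600, ww=1500)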
You should look at these questions and answers:
Window width and center calculation of DICOM image
*You should also consider the tags "rescale intercept" (0028,1052) and "rescale slope" (0028,1053), which linearly rescale the values of the pixel array to the final values; normally this is implemented in the DICOM toolkit:
FinalPixelValue = (RawPixelValue * RescaleSlope) + RescaleIntercept
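In code, this might look as follows (a sketch assuming the pydicom toolkit and a placeholder file name; other toolkits expose the same tags):
import pydicom

ds = pydicom.dcmread("scan.dcm") # "scan.dcm" is a placeholder path
raw = ds.pixel_array # raw stored values (e.g. int16)
final = raw * float(ds.RescaleSlope) + float(ds.RescaleIntercept) # e.g. Hounsfield Units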
I am doing some image processing, and I need to check if a binary image is identical to another.
Processing speed isn't an issue, and the simple thing I thought to do was count the white pixels remaining after adding the inverse of image A to image B (these images are very nearly identical, but not quite--some sort of distance metric is the goal).
Note: take the logarithm to linearize the distance
However, in order to create the composite image, I need to include a "mask" that is the same size as the two images.
I am having trouble finding an example of creating the mask online and using it for the Image.composite function.
Here is my code:
compA = ImageOps.invert(imgA)
imgAB = Image.composite(compA,imgB,??? mask)
Right now, I have created a mask of all zeros--however, the composite image does not appear correctly (both A and B are exactly the same images; a mask of all zeros--or all ones for that matter--does not work).
mask = Image.fromarray(np.zeros(imgA.size,dtype=int),mode='L')
imgAB = Image.composite(compA,imgB,mask)
How do I just add these two binary images on top of each other?
Clearly you're using NumPy, so why not just work with NumPy arrays and explicitly do whatever arithmetic you want in that domain, such as subtracting one image from the other:
arrayA = numpy.asarray(imgA, dtype=int)
arrayB = numpy.asarray(imgB, dtype=int)
arrayDelta = arrayA - arrayB
print((arrayDelta != 0).sum()) # the number of non-identical pixels (why count them by hand?)
# NB: this number may be inflated by a factor of 3 if there are 3 identical channels R, G, B
imgDelta = Image.fromarray((numpy.sign(arrayDelta) * 127 + 127).astype('uint8')) # display this image to visualize where the differences are
You could do this even more simply, e.g.
print((numpy.asarray(imgA) != numpy.asarray(imgB)).sum())
but I thought casting to a signed integer type first and then subtracting would let you visualize more information (A white where B is black -> white pixel in the delta; A black where B is white -> black pixel in the delta).
I am using OpenCV 2 to do some image manipulation in the YCbCr color space. At the moment I can detect some noise due to the conversion RGB -> YCbCr followed by YCbCr -> RGB, but as said in the documentation:
If you use cvtColor with 8-bit images, the conversion will have some information lost. For many applications, this will not be noticeable but it is recommended to use 32-bit images in applications that need the full range of colors or that convert an image before an operation and then convert back.
So I would like to convert my image to 16 or 32 bits, but I haven't found how to do it with NumPy. Any ideas?
img = cv2.imread(imgNameIn)
# Here I want to convert img in 32 bits
cv2.cvtColor(img, cv2.COLOR_BGR2YCR_CB, img)
# Some image processing ...
cv2.cvtColor(img, cv2.COLOR_YCR_CB2BGR, img)
cv2.imwrite(imgNameOut, img, [cv2.cv.CV_IMWRITE_PNG_COMPRESSION, 0])
Thanks to #moarningsun, problem resolved:
i = cv2.imread(imgNameIn, cv2.CV_LOAD_IMAGE_COLOR) # Need to be sure to have a 8-bit input
img = np.array(i, dtype=np.uint16) # This line only changes the type, not the values
img *= 256 # Now we get the proper values in 16-bit format
The accepted answer is not accurate. A 16-bit image has 65536 intensity levels (2^16), hence values ranging from 0 to 65535.
If one wants to obtain a 16-bit image from an image represented as an array of floats ranging from 0 to 1, one has to multiply every coefficient of this array by 65535.
Also, it is good practice to cast the type of your end result as the very last step of the operations you perform, as sketched after the list below.
This is mainly for two reasons:
- If you perform divisions or multiplications by a float, the result will be a float and you will need to change the type again.
- In general (in the mathematical sense of the term), a transformation from float to integer can introduce errors. Casting the type at the very end of the operations prevents error propagation.
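A minimal sketch of that workflow (img_float and img16 are hypothetical names; the data is made up):
import numpy as np

img_float = np.random.rand(4, 4) # floats in [0, 1], standing in for a real image
img_float = img_float * 0.5 + 0.25 # ...arbitrary float operations...
img16 = np.clip(img_float * 65535, 0, 65535).astype(np.uint16) # cast only at the very end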
What I'm doing is reducing the colors in an image by quantization, but instead of floats I need the result translated into RGB triples (e.g. array([255, 255, 255])). I've found similar questions but not a straightforward/direct solution.
Returning clustered produces an array of floats. How do you convert the floats to the RGB equivalent?
# Imports assumed by this snippet
from numpy import reshape
from scipy.cluster.vq import kmeans, vq

# Pixel Matrix
pixel = reshape(img, (img.shape[0]*img.shape[1], 3))
# Clustering
centroids, _ = kmeans(pixel, 8) # eight colors will be found
# Quantization
qnt, _ = vq(pixel, centroids)
# Shape Quantization Result
centers_idx = reshape(qnt, (img.shape[0], img.shape[1]))
clustered = centroids[centers_idx]
If you want to convert any array of floats to array of bytes (8-bit unsigned integers, from 0 to 255), you have some options. The one I prefer for more general conversions is this:
byte_array = (floatarray*255).astype('uint8') # note: avoid shadowing Python's built-in bytearray
This should work if you have an array of positive floats whose values for each channel vary between 0.0 and 1.0. If you have arbitrary positive values, you could do floatarray /= floatarray.max() first to normalize them, as in the example below.
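For example (a small self-contained sketch with made-up data):
import numpy as np

floatarray = np.array([[0.0, 1.7], [3.4, 0.85]]) # arbitrary positive floats
floatarray /= floatarray.max() # normalize into [0.0, 1.0]
byte_array = (floatarray * 255).astype('uint8') # -> [[0, 127], [255, 63]]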
Hope this helps!