I am using OpenCV 2 to do some image manipulations in the YCbCr color space. At the moment I can detect some noise due to the conversion RGB -> YCbCr and then YCbCr -> RGB, but as the documentation says:
If you use cvtColor with 8-bit images, the conversion will have some information lost. For many applications, this will not be noticeable but it is recommended to use 32-bit images in applications that need the full range of colors or that convert an image before an operation and then convert back.
So I would like to convert my image to 16 or 32 bits, but I couldn't find how to do it with NumPy. Any ideas?
img = cv2.imread(imgNameIn)
# Here I want to convert img to 32 bits
cv2.cvtColor(img, cv2.COLOR_BGR2YCR_CB, img)
# Some image processing ...
cv2.cvtColor(img, cv2.COLOR_YCR_CB2BGR, img)
cv2.imwrite(imgNameOut, img, [cv2.cv.CV_IMWRITE_PNG_COMPRESSION, 0])
Thanks to @moarningsun, the problem is resolved:
import numpy as np

i = cv2.imread(imgNameIn, cv2.CV_LOAD_IMAGE_COLOR) # Make sure the input is 8-bit
img = np.array(i, dtype=np.uint16) # This line only changes the type, not the values
img *= 256 # Scale the values to the 16-bit range
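Putting this together, a minimal sketch of the full round trip using 32-bit floats, as the documentation recommends (cv2.CV_LOAD_IMAGE_COLOR is the OpenCV 2-era constant used above; the file names are hypothetical, and the in-between processing is assumed to keep values in [0, 1]):

import cv2
import numpy as np

imgNameIn, imgNameOut = "input.png", "output.png" # hypothetical file names

img = cv2.imread(imgNameIn, cv2.CV_LOAD_IMAGE_COLOR) # 8-bit BGR input
img = img.astype(np.float32) / 255.0 # 32-bit float in [0, 1]
img = cv2.cvtColor(img, cv2.COLOR_BGR2YCR_CB)
# ... image processing in YCbCr ...
img = cv2.cvtColor(img, cv2.COLOR_YCR_CB2BGR)
img = np.clip(img * 255.0, 0, 255).astype(np.uint8) # back to 8-bit at the very end
cv2.imwrite(imgNameOut, img)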
The accepted answer is not accurate. A 16-bit image has 65536 intensity levels (2^16), hence values ranging from 0 to 65535.
If one wants to obtain a 16-bit image from an image represented as an array of float ranging from 0 to 1, one has to multiply every coefficient of this array by 65535.
Also, it is good practice to cast the type of your end result as the very last step of the operations you perform.
This is mainly for two reasons:
- If you perform divisions or multiplications by a float, the result will be a float, and you will need to change the type again.
- In general (in the mathematical sense of the term), a transformation from float to integer can introduce errors. Casting the type at the very end of the operations prevents error propagation, as in the sketch below.
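For instance, a small sketch of this advice, where img_float is a hypothetical float image with values in [0, 1]:

import numpy as np

img_float = np.random.rand(4, 4) # hypothetical float image in [0, 1]
result = img_float * 0.5 + 0.25 # keep all intermediate math in float
img_u16 = (result * 65535).astype(np.uint16) # scale and cast only as the very last step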
Related
I am doing some denoising work and I'm not very familiar with Python. I applied BM3D to get the denoised picture, and I also have the original one.
Now I want to get the noise by doing this:
tmp = img - img_denoised
But it turns out to be a very strange black and white figure like this:
So how can I get a proper noise picture? What I wish to get is an image like this:
Edit:
I got an image from the Internet and did the same processing.
After processing:
Edit again:
Providing a simple example:
import cv2
img = cv2.imread("path of the original image")
img_denoised = cv2.imread("path of the denoised image")
tmp = img - img_denoised
cv2.imwrite("test_noise.jpg",tmp)
In your example code, both img and img_denoised are uint8 NumPy arrays. Operations on these arrays produce output of the same type, and the arithmetic is modulo 256: when a result exceeds 255 it wraps around (256 becomes 0), and when a result is negative it wraps around the other way (-1 becomes 255). For example:
np.array([5], np.uint8) - np.array([10], np.uint8)
returns array([251], dtype=uint8). Instead of -5, which cannot be represented in a uint8 value, we get 256 - 5 = 251.
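The same wrap-around happens in the other direction for results above 255:

import numpy as np

np.array([250], np.uint8) + np.array([10], np.uint8) # => array([4], dtype=uint8)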
The subtraction img - img_denoised results in some values just above zero, which look black, and some values just below zero, which wrap around to values near 255 and look white.
We can solve this in different ways. One is to force the operation to happen with floating-point values:
tmp = img.astype(float) - img_denoised.astype(float)
We now have an array of floats, about half of them negative. But a JPEG file can only store uint8 values, and casting our float values to uint8 would get us back where we started. So we need to shift the origin (the zero value) to middle gray (typically 128):
tmp = img.astype(float) - img_denoised.astype(float)
tmp += 128
cv2.imwrite("test_noise.jpg", tmp.astype(np.uint8))
This is very fiddly, but it works. I prefer using a library that takes care of data types for me, so I don't have to think about them when I don't want to. DIPlib is such a library (disclaimer: I'm an author):
import diplib as dip
img = dip.ImageRead("7yJS3.png")
img_denoised = dip.ImageRead("xjQIy.png")
tmp = img - img_denoised
tmp += 128
dip.ImageWrite(tmp, "test_noise.jpg")
In DIPlib, arithmetic operations automatically promote the images to a floating-point type, unless we explicitly prevent it. Saving as JPEG silently casts the image to uint8 (this is where errors will happen if the pixel values are outside the range of the uint8 type).
With limited information it is hard to pinpoint the problem. Please provide input images and more code.
It looks like the result image is a binary bitmap, only white or black, with no gray. Your tmp image's pixel format is probably incorrect, which might be because img and img_denoised do not have the same pixel format, or because both are wrong. Try displaying your input images to see whether they look normal.
Your img and img_denoised should have the same pixel format, maybe 8-bit grayscale or 24-bit RGB, and after img - img_denoised the result should still have the same pixel format.
It could also be because the type is unsigned; try making it signed, or add 128 to all pixels and see what happens.
What is your image's data type/range? 0-1 or 0-255?
If your image is 0-1 float32, the noise image will have a data range of [-1, 1]; around half of the pixels will be below 0 and will be displayed by cv2.imshow() as black.
Try
noise = origin - clean
noise = (noise + 1) * 0.5
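A small end-to-end sketch of this remapping, with origin and clean as hypothetical float32 arrays in [0, 1]:

import numpy as np
import cv2

origin = np.random.rand(64, 64).astype(np.float32) # hypothetical original image
clean = np.clip(origin - 0.05, 0, 1) # hypothetical denoised image

noise = origin - clean # values in [-1, 1]
noise = (noise + 1) * 0.5 # remap to [0, 1] so cv2.imshow can display it
cv2.imshow("noise", noise)
cv2.waitKey(0)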
I was working with the Laplacian function to detect edges in OpenCV when I ran into some confusion regarding the underlying principles behind the code.
The documentation has us read an image with the following code and pass it through the Laplacian function.
img = cv2.imread("messi5.jpg", cv2.IMREAD_GRAYSCALE)
lap = cv2.Laplacian(img, cv2.CV_32F, ksize=1)
Now, I am able to understand the code written above pretty well. As I understand it, we read in an image and calculate the Laplacian at each pixel. This value can be bigger or smaller than the original 8-bit unsigned int pixel, so we store it in an array of 32-bit floats.
My confusion begins with the next few lines of code. In the documentation, the image is converted back to an 8-bit unsigned integer using the convertScaleAbs() function, and then displayed as seen below.
lap = cv2.convertScaleAbs(lap)
cv2.imshow("Laplacian", lap)
However, my instructor showed me the following method of converting back to uint8:
lap = np.uint8(np.absolute(lap))
cv2.imshow("Laplacian", lap)
Surprisingly, both solutions display identical images. However, I am unable to understand why this occurs. From what I've seen, np.uint8 simply truncates values (floats, etc.) down to unsigned 8-bit integers. So, for example, 1025 becomes 1, as all the bits beyond the 8th bit are discarded.
Yet this would literally mean that any value of our Laplacian for each pixel would become heavily reduced and muddled. If our Laplacian for a pixel was 1024 (signaling a non-zero second derivative in both the x and y dimensions), we would instead have the value 0 on hand (signaling a second derivative of zero and a possible local max/min, or in other words an edge). Thus, by my logic, my instructor's solution should fail miserably, but surprisingly everything works fine. Why is this?
On the other hand, I do not have any idea about how convertScaleAbs() works. I'm going to assume it works similarly as my instructor's solution, but I'm not sure. Can someone please explain what's going on?
OpenCV BGR or grayscale images have pixel values from 0 to 255 when the type is CV_8U (8-bit), which corresponds to np.uint8.
So when you use the Laplacian function with ddepth (the desired depth of the destination image) set to cv2.CV_32F, you get this:
lap = cv2.Laplacian(img, cv2.CV_32F, ksize=1)
print(np.amax(lap)) #=> 317.0
print(np.amin(lap)) #=> -315.0
So, you need to convert back to np.uint8, for example:
lap_uint8 = lap.copy()
lap_uint8[lap > 255] = 255
lap_uint8[lap < 0] = 0
lap_uint8 = lap_uint8.astype(np.uint8)
print(np.amax(lap_uint8)) #=> 255
print(np.amin(lap_uint8)) #=> 0
Or use any other, more straightforward way that does the same.
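For instance, a one-line equivalent using np.clip:

lap_uint8 = np.clip(lap, 0, 255).astype(np.uint8)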
But you can also pass -1 as the ddepth argument (see the documentation) to keep the source depth, in which case the result is saturated to the uint8 range:
lap = cv2.Laplacian(img, -1, ksize=1)
print(np.amax(lap)) #=> 255
print(np.amin(lap)) #=> 0
Note, however, that taking the absolute value of the 32-bit result still leaves values above 255, which would wrap around and give a wrong result if cast directly with np.uint8:
lap_abs = np.absolute(lap)
print(np.amax(lap_abs)) #=> 317.0
print(np.amin(lap_abs)) #=> 0.0
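For reference, cv2.convertScaleAbs computes |alpha * src + beta| (alpha=1, beta=0 by default) and then saturate-casts to uint8, clipping at 255 instead of wrapping. So a rough NumPy equivalent, applied to the CV_32F result, is:

lap_abs8 = np.clip(np.abs(lap), 0, 255).astype(np.uint8)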
I'm dealing with some satellite images, consisting of 16-bit .tiff files with the color encoded as 16 bits per channel. I would like to know how I can convert these images to normal 8-bit RGB for further CNN processing.
I have tried OpenCV (cv2.imread('file', -1)) and PIL (Image.open('file')), but these two packages cannot recognize and read 16-bit TIFF images.
Generally, when you want to read or write images in Python, of any bit depth and format, it is best to use ImageIO. As the name suggests, its singular goal is image input/output. The only caveat: it may ignore the image's metadata. That is, it may not deal correctly with images defining a color space other than the standard sRGB, or it might fail to preserve the image's intended orientation.
You would read in the image, say example.tif, like so:
import imageio
image = imageio.imread('example.tif')
As for the conversion, that's just basic math. The data structure in which you'll receive the pixel data is a NumPy array. Introspect image.shape and image.dtype. You should expect your images to have a shape of (y, x, 3), where y is the number of pixels in the vertical, x in the horizontal direction, and 3 represents the three color channels: red, green, blue. Its dtype (data type) should be uint16, meaning unsigned 16-bit integers.
Side note: As there are three color channels, each sampled with a 16-bit resolution, the color depth of the image is more commonly described as "48 bits" (per pixel).
16-bit integer values range from 0 to 65535 (= 2^16 - 1). They need to be coerced to the 8-bit range: 0 to 255 (= 2^8 - 1). So divide by 256 (= 2^8):
image = image / 256
This will yield an array of floating-point pixel values. Its data type must be explicitly cast to 8-bit integer in order to drop any fractions.
image = image.astype('uint8')
Equivalently, and more efficiently, you may also bit-shift the 16-bit values 8 bits to the right:
image = (image >> 8).astype('uint8')
This makes the conversion faster (by a factor of 2 or so on modern hardware) as it skips the floating-point operations.
Then, either use the final image array for further processing, or save it to a new file:
imageio.imwrite('example.png', image)
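Putting the pieces together, a minimal end-to-end sketch (with the file names from the examples above):

import imageio

image = imageio.imread('example.tif')   # uint16 array, shape (y, x, 3)
image8 = (image >> 8).astype('uint8')   # scale each channel from 16-bit to 8-bit
imageio.imwrite('example.png', image8)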
If all you want is to convert your .tiff file's color mode to RGB, then try:
from PIL import Image
img = Image.open(r"Path_to_tiff_image")
img = img.convert("RGB")
img.save(r"path_of_destination_image")
The above code first opens a .tiff image, then changes its color mode to RGB, and then saves it to the destination location.
Hey, I used tifffile to handle the file and a calculation that I found in a different thread here for rescaling the 16-bit image to 8-bit.
import numpy as np
import tifffile as tif
import cv2
image = tif.imread('/home/trance/test.tiff')
# Rescale 16-bit to 8-bit
img_rescaled = 255 * (image - image.min()) / (image.max() - image.min())
# Colourising image and saving it with opencv
img_col = cv2.applyColorMap(img_rescaled.astype(np.uint8), cv2.COLORMAP_INFERNO)
cv2.imwrite('/home/trance/test.png', img_col)
I have a matrix consisting of True and False values. I want to print this as an image where all the True values are white and the False values are black. The matrix is called indices. I have tried the following:
indices = indices.astype(int) # to convert True to 1 and False to 0
indices *= 255 # to change all the 1s to 255
cv2.imshow('Indices',indices)
cv2.waitKey()
This is printing a fully black image. When I try print((indices == 255).sum()), it returns a value of 669, which means that there are 669 elements/pixels in the indices matrix that should be white. But I can only see a pure black image. How can I fix this?
As far as I know, OpenCV expects an image either as a matrix of floats ranging from 0 to 1, or as an integer type with values between the minimum and the maximum of that type.
A plain int has no bounds (except the limits of available memory). If you use np.uint8, however, you are working with unsigned bytes, where the minimum is 0 and the maximum is 255.
So there are several options. The two most popular would be:
Cast to np.uint8 and then multiply by 255:
indices = indices.astype(np.uint8) # convert to an unsigned byte
indices *= 255
cv2.imshow('Indices',indices)
cv2.waitKey()
Use a float representation:
indices = indices.astype(float)
cv2.imshow('Indices',indices)
cv2.waitKey()
Note that you could also choose to use np.uint16, for instance, to work with unsigned 16-bit integers; in that case you would have to multiply by 65535. The advantage of this approach is that you can use an arbitrary color depth: although most image formats use 24-bit color (8 bits per channel), there is no reason not to use 48-bit color. If, for instance, you are doing image processing for a glossy magazine, it can be beneficial to work with more color depth.
Furthermore, even if the end result uses a 24-bit color palette, it can sometimes be better to use a higher color depth for the intermediate steps of image processing.
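For example, a small sketch of the uint16 variant (indices is the Boolean matrix from the question):

import numpy as np
import cv2

indices16 = indices.astype(np.uint16) * 65535 # True -> 65535 (white), False -> 0 (black)
cv2.imshow('Indices', indices16)
cv2.waitKey()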
I'm trying to equalize a one-channel image, like so:
img = cv2.equalizeHist(img)
But since it's a float64 image, I get the following error:
error: (-215) _src.type() == CV_8UC1 in function equalizeHist
How do I go about this?
Basically, histogram equalization works on grayscale (single-channel) images.
If you want to apply histogram equalization to a color image, you can equalize each channel separately:
B, G, R = cv2.split(img) # cv2.imread returns channels in B, G, R order
output1_B = cv2.equalizeHist(B)
output1_G = cv2.equalizeHist(G)
output1_R = cv2.equalizeHist(R)
equ = cv2.merge((output1_B, output1_G, output1_R))
You can also use .astype(numpy.uint8).
The function equalizeHist performs histogram equalization of images and is only implemented for the CV_8UC1 type, which is a single-channel, 8-bit unsigned integral type.
To convert your image to this type you can use the function convertTo with the target type (it must have the same number of channels). Note that convertTo belongs to the C++ API; in Python, use NumPy's astype or cv2.convertScaleAbs instead.
Make sure that the source image has the right value range: typically, floating-point images are interpreted as 0 = black and 1 = white, with the gray range in between, while integer images are interpreted as 0 = black and the maximum value = white (255 for an unsigned 8-bit type). So you will often have to multiply your source image by 255 to fit the range. convertTo has a scale parameter to adjust your values during conversion, which can give you a speed improvement compared to manual scaling.
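In Python, a minimal sketch of the same conversion, assuming img is a float64 grayscale image with values in [0, 1]:

import numpy as np
import cv2

img8 = np.clip(img * 255, 0, 255).astype(np.uint8) # scale [0, 1] floats to [0, 255] uint8
equ = cv2.equalizeHist(img8)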
When initialising your image variable, don't forget the flag; that solved it for me.
img = cv2.imread("my_image.png", 0)
I used 0 as the flag because I was working with a greyscale image.
After reading the image and converting it to gray, convert it to 8-bit (equalizeHist requires CV_8UC1 input, so a float type will not work):
img = np.uint8(img)