What is a "twice HSV transformation"? - python

I learned this method from a SPIE Proceedings article; they used the twice HSV transformation for shadow detection. In their paper, the method was stated as follows:
Firstly, the color model of the image is transformed from RGB to HSV,
and the three components of the HSV model are normalized to 0 to 255,
then the image is transformed from RGB to HSV once again. Thirdly, the
image is turned into a gray image from a color image, only the gray
value of the red component is used. Fourthly, the OTSU thresholding
method is used to produce a threshold by which the image is converted
to a binary image. Since the gray value of the shadow area is usually
smaller than those areas which are not covered by shadow, the
objective is pixels whose gray value is below the threshold, and
background is pixels whose gray value is beyond the threshold.
Do the second and third steps make sense?

The second and third statements absolutely don't make any sense whatsoever. Even the pipeline is rather suspicious. However, after re-reading that statement probably a dozen times, here is what I came up with. Apologies for any errors in understanding.
Let's start with the second point:
Firstly, the color model of the image is transformed from RGB to HSV, and the three components of the HSV model are normalized to 0 to 255, then the image is transformed from RGB to HSV once again
You're well aware that transforming an image from RGB to HSV results in another three-channel output. Depending on which platform you're using, you'll either get 0-360 or 0-1 for the first channel (Hue), 0-100 or 0-255 for the second channel (Saturation), and 0-100 or 0-255 for the third channel (Value). Each channel may be unequal in magnitude compared with the other channels, and so these channels are normalized to the 0-255 range independently. Specifically, this means that the Hue, Saturation and Value components all get normalized so that they all span 0-255.
Once we do this, we now have an HSV image where each channel ranges from 0-255. My guess is they call this new image an RGB image because the channels all span 0-255, just like any 8-bit RGB image would. This also makes sense because when you transform an image from RGB to HSV, the conversion expects the input channels to span 0-255, so my guess is that they normalize all of the channels in the first HSV result to make it suitable for the next step.
Once they normalize the channels after the first HSV conversion as per above, they do another HSV conversion on this new result. Why they would do this a second time is beyond me and doesn't make any sense, but that's what I gathered from the above description, and that's probably what they mean by "twice HSV transformation": transform the original RGB image to HSV once, normalize that result so all channels span 0-255, then re-apply the HSV conversion to this intermediate result.
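As an illustration only (not necessarily what the authors did), one reading of "normalized to 0 to 255" is an independent min-max stretch of each channel. Here's a minimal sketch, with shadow.jpg as a placeholder input:
import cv2

img = cv2.imread('shadow.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Stretch each channel independently so it spans the full 0-255 range
for c in range(3):
    hsv[:, :, c] = cv2.normalize(hsv[:, :, c], None, 0, 255, cv2.NORM_MINMAX)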
Let's go to the third point:
Thirdly, the image is turned into a gray image from a color image, only the gray value of the red component is used.
After you transform the HSV image a second time, the final result is simply the first channel, which is inherently a grayscale image; this is the "red" channel. Coincidentally, this also corresponds to the Hue after an HSV conversion. I'm not quite sure what properties the Hue channel holds after converting the image using HSV twice, but maybe it worked for this particular method.
I decided to give this a whirl and see if this really works. Here's an example image of a shadow I found online:
Source: http://petapixel.com/
The basic pipeline is to take an image, convert it into HSV, renormalize the image so that the values are 0-255 again, do another HSV conversion, then compute a threshold automatically via Otsu's method. We threshold below the optimal value to segment out the shadows.
I'm going to use OpenCV Python, as I don't have the C++ libraries set up on my computer here. In OpenCV, when converting an image to HSV, if the image is unsigned 8-bit RGB, the Saturation and Value components are automatically scaled to [0-255], but the Hue component is scaled to [0-179] in order to fit the Hue (which is originally [0-360)) into the data type. As such, I scaled each value by (255/179) so that the Hue gets normalized to [0-255]. Here's the code I wrote:
import numpy as np # Import relevant libraries
import cv2
# Read in image
img = cv2.imread('shadow.jpg')
# Convert to HSV
hsv1 = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Renormalize Hue channel to 0-255
hsv1[:,:,0] = ((255.0/179.0)*hsv1[:,:,0]).astype('uint8')
# Convert to HSV again, treating the normalized
# HSV channels as if they were RGB
hsv2 = cv2.cvtColor(hsv1, cv2.COLOR_RGB2HSV)
# Extract out the "red" channel
red = hsv2[:,:,0]
# Perform Otsu thresholding and INVERT the image
# Anything smaller than threshold is white, anything greater is black
_,out = cv2.threshold(red, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# Show the image - shadow mask
cv2.imshow('Output', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
This is the output I get:
Hmm.... well there are obviously some noisy pixels, but I guess it does work.... kinda!

Related

Create a grayscale image from a color coded source image in Python

I want to convert a color-coded image to grayscale. The colors in that image correspond to given values (e.g. black is 0 and red is 25) like on this color scale.
Is there a way to convert this image to grayscale and keep that color gradient, so that 0 on the scale remains black and 25 is shown as white?
I tried to convert it with matplotlib and also cv2, but I ended up with grayscale images that did not respect my given color gradient. I would appreciate an answer very much. Thank you!
Depending on the tools you use you can
read/convert the color-coded image as RGB,
convert RGB to grayscale
or
convert the colors in the gradient to grayscale,
read/convert the color-coded image using that grayscale palette.
The second approach is more efficient.
Update:
Upon reading your last comment (which should be in the original question), the options become
read/convert the color-coded image as RGB,
convert RGB to grayscale,
rescale by multiplying the pixel values by 10
or
convert the colors in the gradient to grayscale, rescaled to 0-255,
read/convert the color-coded image using that grayscale palette.
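As a rough sketch of the first option (hedged: the filename scale.png is a placeholder, and multiplying by 10 assumes the grayscale conversion roughly recovers the coded 0-25 values):
import cv2

img = cv2.imread('scale.png')                     # read the color-coded image (BGR)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)      # convert to grayscale
rescaled = cv2.convertScaleAbs(gray, alpha=10.0)  # multiply by 10, saturating at 255
cv2.imwrite('scale_gray.png', rescaled)           # 0 stays black, 25 becomes 250 (near white)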

How cv2.imshow() function works in BGR model?

In the RGB color model, cv2.imshow() works in BGR order. When I use cv2.imshow() to display an image and also show it with matplotlib's plt.show(), the result looks wrong, because plt.show() uses RGB order. That can be solved by reversing the b, g, r channels of the image. But when I transform the RGB color model into another color model, like HSV, YUV or YCbCr, plt.show() still differs from cv2.show(). So after an RGB to HSV/YUV/YCbCr conversion, does cv2.show() still work as if v, s, h map to b, g, r, while plt.show() works as if h, s, v map to r, g, b? In other words, no matter what color model is used, do cv2.show() and plt.show() just apply their b, g, r and r, g, b orders to whatever the channels now are, like v, s, h and h, s, v in HSV, or v, u, y and y, u, v in YUV?
img_HSV = cv2.cvtColor(img_BGR, cv2.COLOR_BGR2HSV)
plt.show(img_HSV)
cv2.show(img_HSV)
plt.show(img_HSV[:,:,::-1])
The result of the line 1 code is different from line 2, but line 3 is the same as line 2.
So does plt.show work by mapping h, s, v to r, g, b, and cv2.show by mapping v, s, h to b, g, r?
plt.show(img_HSV) assumes the input is RGB
cv2.show(img_HSV) assumes the input is BGR
So if you give input in a different format, they take the channels as they assume anyway. For example, cv2.imshow(img_HSV) will take the Hue value as Blue, so there will be a different image representation.
So just convert back to RGB or BGR for the imshow() functions after you're done processing in the other color spaces.
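For example (a minimal sketch; image.jpg is a placeholder filename):
import cv2
import matplotlib.pyplot as plt

img_BGR = cv2.imread('image.jpg')
img_HSV = cv2.cvtColor(img_BGR, cv2.COLOR_BGR2HSV)
# ... do whatever processing you need in HSV space here ...
img_back = cv2.cvtColor(img_HSV, cv2.COLOR_HSV2BGR)     # back to BGR for cv2.imshow
cv2.imshow('result', img_back)
cv2.waitKey(0)
plt.imshow(cv2.cvtColor(img_back, cv2.COLOR_BGR2RGB))   # RGB for matplotlib
plt.show()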
I assume you mean cv2.imshow (and plt.imshow) as in your title, right? Most of the time (and in all your examples) you pass an image with three channels to the functions plt.imshow and cv2.imshow.
Those are numpy arrays, and the data type (uint8, float32, etc.) is the only information you implicitly pass along. So those functions simply don't know which color encoding you are currently using. They simply interpret it as RGB (matplotlib) or BGR (OpenCV). So the second channel will always be interpreted as green, and the other ones as red and blue respectively. That's all they do.
PS: If you pass a single-channel array you'll get a gray image.
PPS: Consider the range of the values in your numpy array. For example if you are passing float the values should be in the interval [0-1], whereas with uint8 it's [0-255] for matplotlib.
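A quick illustration of that range convention (a synthetic 64x64 image, just for demonstration):
import numpy as np
import matplotlib.pyplot as plt

img_u8 = np.full((64, 64, 3), 128, dtype=np.uint8)  # uint8: valid range [0, 255]
img_f = img_u8.astype(np.float32) / 255.0           # float: valid range [0, 1]
plt.imshow(img_u8)  # renders as mid-gray
plt.figure()
plt.imshow(img_f)   # renders as the same mid-gray
plt.show()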

What is the example reference threshold for each component of a LAB image?

I'm doing a project of face skin detection. I need to replace each pixel in a face image with a black pixel if the image intensity is less than some fixed constant T, or a white pixel if the image intensity is greater than that constant.
I know in opencv, cv2.threshold takes two arguments, First argument is the source image, which should be a grayscale image. Second argument is the threshold value which is used to classify the pixel values.
Can anybody tell me how to threshold color images by designating a separate threshold for each of the LAB components of the image and then combine them with an AND operation?
Example threshold ranges would be great!
Here is a sample code I wrote:
import cv2
color_image = cv2.imread("lena.png")
lab_image = cv2.cvtColor(color_image, cv2.COLOR_BGR2LAB)
L,A,B=cv2.split(lab_image)
th, th_image = cv2.threshold(L,100,255,cv2.THRESH_BINARY)
#cv2.imshow("original",color_image)
#cv2.imshow("l space",L)
cv2.imshow("th imaged",th_image)
# wait until escape is pressed
while True:
    keyboard = cv2.waitKey()
    if keyboard == 27:
        break
cv2.destroyAllWindows()
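To answer the actual question of thresholding each LAB component separately and combining them with AND, here is a hedged sketch building on the code above. The threshold values 100, 130 and 130 are purely illustrative, not reference values; note that in OpenCV's 8-bit LAB representation, the A and B channels are offset so that 128 is neutral:
import cv2

color_image = cv2.imread("lena.png")
lab_image = cv2.cvtColor(color_image, cv2.COLOR_BGR2LAB)
L, A, B = cv2.split(lab_image)
# Threshold each channel separately
_, mask_L = cv2.threshold(L, 100, 255, cv2.THRESH_BINARY)
_, mask_A = cv2.threshold(A, 130, 255, cv2.THRESH_BINARY)
_, mask_B = cv2.threshold(B, 130, 255, cv2.THRESH_BINARY)
# AND them together: a pixel is white only if it passes all three thresholds
mask = cv2.bitwise_and(mask_L, cv2.bitwise_and(mask_A, mask_B))
cv2.imshow("combined mask", mask)
cv2.waitKey(0)
cv2.destroyAllWindows()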
Here is the official doc:
cv.Threshold(src, dst, threshold, maxValue, thresholdType) → None
Parameters:
src – input array (single-channel, 8-bit or 32-bit floating point).
dst – output array of the same size and type as src.
threshold – threshold value.
maxValue – maximum value to use with the THRESH_BINARY and THRESH_BINARY_INV thresholding types.
thresholdType – thresholding type (see the details below).
The code will generate these three images:
Original:
L-Space of the LAB Convertion:
and finally a simple threshold example:
A really good tutorial on thresholding can be found here: OpenCV

Converting image to grayscale

I want to convert any image to grayscale, but I don't understand the difference between these implementations.
image = cv2.imread('lenna.jpg')
gray = cv2.cvtColor(image, cv2.IMREAD_GRAYSCALE)
gray1 = rgb2gray(image)
gray2 = cv2.imread('lenna.jpg', cv2.IMREAD_GRAYSCALE)
image1 = Image.open('lenna.jpg', 'r')
gray3 = image1.convert('L')
When I plot them, I get them in blue scale, green scale, green scale and gray, respectively. When should I use each one?
You've encountered a spot where Python's type system isn't protecting you in the way that C++ would.
cv2.IMREAD_GRAYSCALE and cv2.COLOR_BGR2GRAY are values from different enumerations. The former, whose numerical value is 0, applies to cv2.imread(). The latter, whose numerical value is 6, applies to cv2.cvtColor(). C++ would have told you that cv2.IMREAD_GRAYSCALE can't be passed to cv2.cvtColor(). Python quietly accepts the corresponding int value.
Thus, you think you're asking cv2 to convert a color image to gray, but by passing cv2.IMREAD_GRAYSCALE, cv2.cvtColor() sees the value 0, and thinks you're passing cv2.COLOR_BGR2BGRA. Instead of a grayscale image, you get the original image with an alpha channel added.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
is what you need instead.
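You can verify the clash yourself (these enum values are stable in current cv2 builds):
import cv2

print(cv2.IMREAD_GRAYSCALE)  # 0
print(cv2.COLOR_BGR2BGRA)    # 0, same integer, entirely different meaning
print(cv2.COLOR_BGR2GRAY)    # 6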
The other issue you're seeing, assuming you're using a Jupyter notebook, is that cv2 layers color planes in BGR order instead of RGB. To display them properly, first do
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
and then display the result.
The images that are not gray are still 3D arrays, that is to say, they still retain color information. The reason you are seeing blue and green is that in those 3D arrays the red and green channels in the first case, and the blue and red channels in the second, have been reduced to 0, leaving only the blue and green that you see.
In order to convert the loaded image to grayscale you would use
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(OpenCV loads images in BGR order, hence COLOR_BGR2GRAY rather than COLOR_RGB2GRAY). This will yield a 2D array with values between 0 and 255 corresponding to how bright the pixel should be, instead of how bright each of the 3 color channels of the pixel should be.
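A quick shape check illustrates the difference (assuming lenna.jpg as above):
import cv2

img = cv2.imread('lenna.jpg')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(img.shape)       # e.g. (512, 512, 3): three color channels
print(img_gray.shape)  # e.g. (512, 512): a single 2D intensity array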

How can I detect red particles on an image using python opencv?

My job is to detect and get the size of red particles from an image. I tried simple blob detection, but it works badly with a colour filter, and extracting red values using HSV also gave me poor results because the image has a small resolution (I work on a Raspberry Pi using a webcam).
Here is a sample picture:
Using the HSV colour space is perfectly fine. If you show the hue and saturation components of the image, you'll see that the red particles have a relatively large hue with a small saturation.
BTW, your image is rather large in resolution. I'm going to downsample for the purposes of fitting the images into the post as well as minimizing processing time. First let's load in your image, resize it down to 25% resolution, then extract out the HSV components:
import cv2
import numpy as np
im = cv2.imread('sample.png')
im_resize = cv2.resize(im, None, None, 0.25, 0.25)
out = cv2.cvtColor(im_resize, cv2.COLOR_BGR2HSV)
stacked = np.hstack([out[...,0], out[...,1]])
cv2.imshow("Hue & Saturation", stacked)
cv2.waitKey(0)
cv2.destroyAllWindows()
I'm also stacking the hue and saturation channels together into a single image so we can see what it looks like and displaying this to the screen.
We get this image:
The combination of a relatively large hue component with a low saturation component is unique in comparison to the rest of the image. Let's do some simple thresholding to extract out those components where we look for areas that have a hue component that is greater than one threshold and a saturation component that is smaller than another threshold:
hue_thresh = 100
saturation_thresh = 32
thresh = np.logical_and(out[...,0] > hue_thresh, out[...,1] < saturation_thresh)
cv2.imshow("Thresholded", 255*(thresh.astype(np.uint8)))
cv2.waitKey(0)
cv2.destroyAllWindows()
I set some tuned thresholds, then use numpy.logical_and to combine both conditions together. Because the image is now of type bool, and images must be of an unsigned or floating-point type to be displayed, we convert the image to uint8 and then multiply by 255.
We now get this image:
As you can see, we extract out the portions that are a reddish hue that is not common with the background. The thresholds will also need to be played around with, but it's fine for this particular example.
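Since the original goal was also to get the size of the particles, one possible next step (my suggestion, not part of the pipeline above) is to run connected components on the mask and read off the areas. Remember the image was downsampled to 25%, so the areas are in downsampled pixels:
mask = 255*(thresh.astype(np.uint8))
n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
# stats[i, cv2.CC_STAT_AREA] is the pixel area of component i (label 0 is the background)
for i in range(1, n):
    print('particle', i, 'area:', stats[i, cv2.CC_STAT_AREA])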
