My job is to detect red particles in an image and get their size. I tried simple blob detection, but it works badly. I also tried a colour filter, extracting the red values using HSV, but I got poor results because the image has a low resolution (I work on a Raspberry Pi using a webcam).
Here is a sample picture:
Using the HSV colour space is perfectly fine. If you show the hue and saturation components of the image, you'll see that the red particles have a relatively large hue with a small saturation.
BTW, your image is rather large in resolution. I'm going to downsample for the purposes of fitting the images into the post as well as minimizing processing time. First let's load in your image, resize it down to 25% resolution, then extract out the HSV components:
import cv2
import numpy as np
im = cv2.imread('sample.png')
im_resize = cv2.resize(im, None, None, 0.25, 0.25)
out = cv2.cvtColor(im_resize, cv2.COLOR_BGR2HSV)
stacked = np.hstack([out[...,0], out[...,1]])
cv2.imshow("Hue & Saturation", stacked)
I'm also stacking the hue and saturation channels together into a single image so we can see what it looks like and displaying this to the screen.
We get this image:
The combination of a relatively large hue component with a low saturation component is unique in comparison to the rest of the image. Let's do some simple thresholding to extract out those components where we look for areas that have a hue component that is greater than one threshold and a saturation component that is smaller than another threshold:
hue_thresh = 100
saturation_thresh = 32
thresh = np.logical_and(out[...,0] > hue_thresh, out[...,1] < saturation_thresh)
cv2.imshow("Thresholded", 255*(thresh.astype(np.uint8)))
cv2.waitKey(0)  # keep the windows open until a key is pressed
I set some tuned thresholds, then use numpy.logical_and to combine both conditions. The result is of type bool, but an image should be an unsigned integer or floating-point type to be displayed, so we convert it to uint8 and then multiply by 255.
We now get this image:
As you can see, we extract out the portions that are a reddish hue that is not common with the background. The thresholds will also need to be played around with, but it's fine for this particular example.
So, I extracted the radiometric raw data of thermograms (exiftools) and need to do some processing to enhance the visualization, so that I can annotate these images and get masks for segmentation later. However, I need to keep the radiometric values unchanged (they are 16-bit grayscale thermal images). The extracted raw PNG is too gray and I can barely see the image, so I thought of doing some basic processing (min-max normalization) to enhance the visualization. For this image, for example, the values range from a minimum of 16792 to a maximum of 19663, but this varies. When I normalize using min/max (code below) the image looks great for annotation, but it stretches the values, which I don't want.
I'm using this loop to process these images:
import glob
import os
import cv2

for filename in glob.iglob("*.png"):
    if "raw" in filename:
        # -1 (cv2.IMREAD_UNCHANGED) keeps the original 16-bit depth
        img = cv2.imread(filename, -1)
        #max = np.max(img)
        #min = np.min(img)
        img_16bits = cv2.normalize(img, None, 0, 65535, cv2.NORM_MINMAX, dtype = cv2.CV_16U)
        basename = os.path.splitext(os.path.basename(filename))[0]
Interestingly enough, when I plot the image using plt.imshow with a grayscale cmap, the image looks great and the values are unchanged. The same happens when I drag it into ImageJ (it automatically corrects the contrast). I tried several things to change this code to get where I want, without luck. Any help would be appreciated. Thanks.
Images (raw image / processed with stretched values):
I am trying to threshold images with challenging noise.
The numbers on the side are the dimensions. I have tried various standard methods:
ret,thresh1 = cv2.threshold(img,95,255,0)
thresh2 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C,cv2.THRESH_BINARY,7,0.5)
thresh3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,3,1.5)
# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(img,(3,3),0)
ret3,th3 = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
I want to segment the "lighter" portion inside the darker grey zone (or vice versa). I have played with various kernel sizes and constant values, but nothing gives me a good segmentation. Any ideas what else I can try or how to improve the results? Some sample results I get using the code are:
I was doing some research on how I can crop the dress in this image (see image1) using Python and some other libraries. I need to do this for different images with many models in the photos. They will have different sizes and shapes, so I need something generic that can take the image, analyze it and remove everything but the dress.
I have code that takes this image, builds a mask around the model's shape and puts it in the alpha channel, so I get this (image2):
As you can see, this is the result of my code, but it is not what I need. I really need to remove all the colors around the model, if possible all the colors around the dress, and it needs to be generic, i.e. it should work with different models that have different shapes and sizes.
This is the code I have written in Python using the PIL and numpy libraries. I was using Python 3.4.
import numpy
from numpy import array
from PIL import Image
import cv2  # needed for cv2.inRange below

# read image as RGB and add alpha (transparency)
im = Image.open("one.jpg").convert("RGBA")
# convert to numpy (for convenience)
imArray = numpy.asarray(im)
# create mask (zeros, with ones inside two hard-coded rectangles)
center = (100,100)
radius = 100
mask = numpy.zeros((imArray.shape[0],imArray.shape[1]))
for i in range(imArray.shape[0]):
    for j in range(imArray.shape[1]):
        #if (i-center[0])**2 + (j-center[0])**2 < radius**2:
        #    mask[i,j] = 1
        if ((j > 110 and j<240 and i>65 ) or (j > 440 and j<580 and i>83 )):
            mask[i, j] = 1
# near-black pixels in the RGB channels (computed, but not used below)
lower = numpy.array([0,0,0])
upper = numpy.array([15, 15, 15])
shapeMask = cv2.inRange(numpy.ascontiguousarray(imArray[:,:,:3]), lower, upper)
# assemble new image (uint8: 0-255)
newImArray = numpy.empty(imArray.shape,dtype='uint8')
# colors (three first columns, RGB)
newImArray[:,:,:3] = imArray[:,:,:3]
# transparency (4th column)
newImArray[:,:,3] = mask*255
# back to Image from numpy
newIm = Image.fromarray(newImArray, "RGBA")
newIm.save("one2.png")
The result should be a PNG image that is fully transparent except for the model, or the dress if possible.
As you can see, I'm only making a static mask that will always be in the same place, and it is rectangular, not adjusted to the model. Let me know if you need more explanation of what I need.
Thanks a lot!
This is a very hard problem, especially when you do not know what the background is going to be and when the background has shadows.
The netting of the dress is also going to be lost in part or whole as might the areas between the body and the arms.
Here is an attempt using ImageMagick. But OpenCV has similar commands.
First, blur the image slightly and then extract the Hue channel from HCL colorspace.
Second I change all white colors within a tolerance of 30% to black.
Third I perform Otsu thresholding using one of my scripts.
Fourth I do a small amount of morphology close.
Fifth I use connected components processing to remove all regions smaller than 150 pixels in area, then invert (negate) the result to use as a mask. In OpenCV, the region filtering would be blob detection (SimpleBlobDetector).
Last, I put the mask into the alpha channel of the input to make the background transparent (which will show up white here).
convert image.jpg -blur 0x1 -colorspace HCL -channel r -separate hue.png
convert hue.png -fuzz 30% -fill black -opaque white filled.png
otsuthresh -g save filled.png thresh.png
convert thresh.png -morphology open disk:1 morph.png
convert morph.png -type bilevel \
-define connected-components:mean-color=true \
-define connected-components:area-threshold=150 \
-connected-components 4 \
-negate \
mask.png
convert image.jpg mask.png -alpha off -compose copy_opacity -composite result.png
Here are the images for the steps:
Hue Image:
Filled Image after converting white to black:
Otsu Thresholded Image:
As you can see, the result is not very good at keeping to the outline of the woman and the dress, especially in the hair and the netting of the dress.
You might investigate OpenCV GrabCut Foreground Extraction at
If you can assume the background is fairly simple (uniform in color, or with only nearly horizontal lines), you could do edge detection and then remove all pixels that lie outside the first edge encountered.
Any edge detection filter should be sufficient, but I would probably go for a simple high-pass filter that enhances vertical edges only.
You're merely trying to figure out where the model's silhouette is!
Then remove all the pixels from the frame, going inwards, until the first edge is encountered. (This cleans up the background outside the model.)
To remove holes between arms and dress etc., take the median color value of the removed pixels to get the background color for this row, then remove pixels with a color value close to that median on the remainder of the row.
Removals should be done by building a mask image and then subtracting it from the image, as the mask can be used for an opacity / alpha channel afterwards.
If the dress or model is too close in colour to the background, holes will appear in the model/dress.
Patterns in the background disturb the algorithm and leave rows untouched.
Noise in the background can cause the removal, or the colour value, to be based on pixels close to the frame only.
Some of those problems can be minimized by opening and closing the deletion mask.
Others by a spatial median filter prior to edge detection.
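A rough sketch of the row scan described above, using only numpy on a grayscale image. The function name, the edge_thresh value and the choice to remove rows with no edge at all are my own assumptions for the example. It covers only the "remove from the frame inwards" step, not the median-based hole removal.

```python
import numpy as np

def outside_first_edge_mask(gray, edge_thresh=30):
    """Mark pixels lying outside the first and last strong vertical edge of each row."""
    g = gray.astype(np.int32)
    # Horizontal difference: a crude high-pass filter that responds to vertical edges
    grad = np.abs(np.diff(g, axis=1))
    is_edge = grad > edge_thresh
    h, w = gray.shape
    remove = np.ones((h, w), dtype=bool)  # assumption: rows with no edge are removed entirely
    for y in range(h):
        cols = np.flatnonzero(is_edge[y])
        if cols.size:
            left = cols[0] + 1   # first pixel after the left-most edge
            right = cols[-1]     # last pixel before the right-most edge
            remove[y, left:right + 1] = False
    return remove

# Tiny synthetic example: a bright object in columns 3..5 of rows 1..2
demo = np.full((4, 8), 10, dtype=np.uint8)
demo[1:3, 3:6] = 200
remove = outside_first_edge_mask(demo)
alpha = np.where(remove, 0, 255).astype(np.uint8)  # usable as an alpha channel
```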
The first step is to calculate the background color(s). Take a 50*50 block and find its variance, shift 10-20 pixels to the right and take another block, calculate its variance as well, and so on. Store the variances in an array (and their means as well).
The blocks with the lowest variance are background colors; you will see a bunch of those. After finding the background color, take 5*5 blocks, and if a block's variance is very small and its mean is equal to one of the backgrounds (i.e. similar characteristics), then make it white or do whatever you want.
That is just my intuition, I'm not a professional in image processing.
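The first step of this idea might be sketched as below. To keep it short it works on a single grayscale channel rather than on full colors, and the function name and the keep parameter are made up for the example.

```python
import numpy as np

def estimate_background_means(gray, block=50, step=20, keep=5):
    """Slide a block over the image and return the means of the lowest-variance blocks."""
    h, w = gray.shape
    results = []
    for y in range(0, h - block + 1, step):
        for x in range(0, w - block + 1, step):
            patch = gray[y:y + block, x:x + block].astype(np.float64)
            results.append((patch.var(), patch.mean()))
    # Flat (low-variance) blocks are assumed to belong to the background
    results.sort(key=lambda t: t[0])
    return [mean for _, mean in results[:keep]]

# Synthetic example: flat background of 200 with a noisy square in the middle
rng = np.random.default_rng(0)
demo = np.full((200, 200), 200, dtype=np.uint8)
demo[80:120, 80:120] = rng.integers(0, 255, size=(40, 40)).astype(np.uint8)
background_means = estimate_background_means(demo)
```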
You can give this a try in order to extract dress from image of a model.
The link is the GitHub repo of an image-conditional image generation model called PixelDTGAN. This model performs the challenging task of generating a piece of clothing from an input image of a dressed person.
This model transfers an input domain to a target domain at the semantic level, and generates the target image at the pixel level.
To generate realistic target images, a real/fake-discriminator is used as in Generative Adversarial Nets, and a domain-discriminator is used to make the generated image relevant to the input image.
I can only ever find examples in C/C++ and they never seem to map well to the OpenCV API. I'm loading video frames (both from files and from a webcam) and want to reduce them to 16 colors, but mapped to a 24-bit RGB color space (this is what my output requires: a giant LED display).
I read the data like this:
ret, frame =
image = cv2.cvtColor(frame, cv2.COLOR_RGB2BGRA)
I did find the below python example, but cannot figure out how to map that to the type of output data I need:
import numpy as np
import cv2
img = cv2.imread('home.jpg')
Z = img.reshape((-1,3))
# convert to np.float32
Z = np.float32(Z)
# define criteria, number of clusters(K) and apply kmeans()
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 8
ret,label,center=cv2.kmeans(Z,K,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)
# Now convert back into uint8, and make original image
center = np.uint8(center)
res = center[label.flatten()]
res2 = res.reshape((img.shape))
That obviously works for the OpenCV image viewer, but trying to do the same in my output code gives errors, since I need an RGB or RGBA format. My output works like this:
for y in range(self.height):
for x in range(self.width):
Each color is represented as an (r,g,b) tuple.
Any thoughts on how to make this work?
I think the following could be faster than kmeans, especially with k = 16.
Convert the color image to gray
Contrast stretch this gray image so that the resulting gray levels are between 0 and 255 (use normalize with NORM_MINMAX)
Calculate the histogram of this stretched gray image using 16 as the number of bins (calcHist)
Now you can modify these 16 values of the histogram. For example you can sort and assign ranks (say 0 to 15), or assign 16 uniformly distributed values between 0 and 255 (I think these could give you a consistent output for a video)
Backproject this histogram onto the stretched gray image (calcBackProject)
Apply a color-map to this backprojected image using applyColorMap (you might want to scale the backprojected image before applying the colormap)
Tip for kmeans:
If you are using kmeans for video, you can use the cluster centers from the previous frame as the initial positions in kmeans for the current frame. That way, it'll take less time to converge, so kmeans in the subsequent frames will most probably run faster.
You can speed up your processing by applying k-means to a downscaled version of your image. This will give you the cluster centroids. You can then quantize each pixel of the original image by picking the closest centroid.
I learned this method from an SPIE Proceedings article, in which they applied the HSV transformation twice for shadow detection. In their paper, the method was stated as follows:
Firstly, the color model of the image is transformed from RGB to HSV,
and the three components of the HSV model are normalized to 0 to 255,
then the image is transformed from RGB to HSV once again. Thirdly, the
image is turned into a gray image from a color image, only the gray
value of the red component is used. Fourthly, the OTSU thresholding
method is used to produce a threshold by which the image is converted
to a binary image. Since the gray value of the shadow area is usually
smaller than those areas which are not covered by shadow, the
objective is pixels whose gray value is below the threshold, and
background is pixels whose gray value is beyond the threshold.
Do the second and third steps make sense?
The second and third statements absolutely don't make any sense whatsoever. Even the pipeline is rather suspicious. However, after re-reading that statement probably a dozen times, here is what I came up with. Apologies for any errors in understanding.
Let's start with the second point:
Firstly, the color model of the image is transformed from RGB to HSV, and the three components of the HSV model are normalized to 0 to 255, then the image is transformed from RGB to HSV once again
You're well aware that transforming an image from RGB to HSV results in another three channel output. Depending on which platform you're using, you'll either get 0-360 or 0-1 for the first channel or Hue, 0-100 or 0-255 for the second channel or Saturation, and 0-100 or 0-255 for the third channel or Value. Each channel may be unequal in magnitude when comparing with the other channels, and so these channels are normalized to the 0-255 range independently. Specifically, this means that the Hue, Saturation and Value components all get normalized so that they all span from 0-255.
Once we do this, we now have a HSV image where each channel ranges from 0-255. My guess is they call this new image a RGB image because the channels all span from 0-255, just like any 8-bit RGB image would. This also makes sense because when you're transforming an image from RGB to HSV, the dynamic range of the channels all span from 0-255, so my guess is that they normalize all of the channels in the first HSV result to make it suitable for the next step.
Once they normalize the channels after doing HSV conversion as per above, they do another HSV conversion on this new result. The reasons why they would do this a second time are beyond me and don't make any sense, but that's what I gathered from the above description, and that's what they probably mean by "twice HSV transformation" - To transform the original RGB image to HSV once, normalize that result so all channels span from 0-255, then re-apply the HSV conversion again to this intermediate result.
Let's go to the third point:
Thirdly, the image is turned into a gray image from a color image, only the gray value of the red component is used.
After you transform the HSV image a second time, the final result is obtained by simply taking the first channel, which is inherently a grayscale image and is the "red" channel. Coincidentally, this also corresponds to the Hue after you do a HSV conversion. I'm not quite sure what properties the Hue channel holds after converting the image using HSV twice, but maybe it worked for this particular method.
I decided to give this a whirl and see if this really works. Here's an example image of a shadow I found online:
The basic pipeline is to take an image, convert it into HSV, renormalize the image so that the values are 0-255 again, do another HSV conversion, then do an adaptive threshold via Otsu. We threshold below the optimal value to segment out the shadows.
I'm going to use OpenCV Python, as I don't have the C++ libraries set up on my computer here. In OpenCV, when converting an image to HSV, if the image is unsigned 8-bit RGB, the Saturation and Value components are automatically scaled to [0-255], but the Hue component is scaled to [0-179] in order to fit the Hue (which is originally [0-360)) into the data type. As such, I scaled each value by (255/179) so that the Hue gets normalized to [0-255]. Here's the code I wrote:
import numpy as np # Import relevant libraries
import cv2
# Read in image
img = cv2.imread('shadow.jpg')
# Convert to HSV
hsv1 = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Renormalize Hue channel to 0-255
hsv1[:,:,0] = ((255.0/179.0)*hsv1[:,:,0]).astype('uint8')
# Convert to HSV again
# Remember, channels are now RGB
hsv2 = cv2.cvtColor(hsv1, cv2.COLOR_RGB2HSV)
# Extract out the "red" channel
red = hsv2[:,:,0]
# Perform Otsu thresholding and INVERT the image
# Anything smaller than threshold is white, anything greater is black
_,out = cv2.threshold(red, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# Show the image - shadow mask
cv2.imshow('Output', out)
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.destroyAllWindows()
This is the output I get:
Hmm.... well there are obviously some noisy pixels, but I guess it does work.... kinda!