Im trying to remove the background from a video and get a binary images( or 8-bit) where value of the object that moves is 1 and static background is 0.
something like this:
at first I tried it with getting the difference absDiff() from running average accumulateWeighted() and the current frame but the result was not what I expected( only the edges was 1 and inside of the moving object was 0).
so I went for createBackgroundSubtractorMOG2 and createBackgroundSubtractorMOG but this is not good either( same problem ).
is there a way to get the whole moving object?
The Mixture of Gaussians method is not going to solve all your problems. Common problem is sensitivity to light conditions, e.g. attaching shadow to extracted foreground object . If the image scenario (background) is roughly the same you can refine your results with some image processing.
If the background is similar as in attached image, try to build color histogram in HSI space, create image of extracted foreground object (not mask, actual colored image) and remove pixels that color is similar to the floor (technique known from skin detection methods). In that way you could remove some shadows attached to the person/objects.
Also, if real-time processing is not crucial in your application, you could use more sophisticated background/foreground detection like SubSENSE.
Related
I always wanted to have a device that, from a live camera feed, could detect an object, create a 3D model of it, and then identify it. It would work a lot like the Scanner tool from Subnautica. Imagine my surprise when I found OpenCV, a free-to-use computer vision tool for Python!
My first step is to get the computer to recognize that there is an object at the center of the camera feed. To do this, I found a Canny() function that could detect edges and display them as white lines in a black image, which should make a complete outline of the object in the center. I also used the floodFill() function to fill in the black zone between the white lines with gray, which would show that the computer recognizes that there is an object there. My attempt is in the following image.
The red dot is the center of the live video.
The issue is that the edge lines can have holes in them due to a blur between two colors, which can range from individual pixels to entire missing lines. As a result, the gray gets out and doesn't highlight me as the only object, and instead highlights the entire wall as well. Is there a way to fill those missing pixels in or is there a better way of doing this?
Welcome to SO and the exiting world of machine vision !
What you are describing is a very classical problem in the field, and not a trivial one at all. It depends heavily on the shape and appearance of what you define as the object of interest and the overall structure, homogeneity and color of the background. Remember, the computer has no concept of what an "object" is, the only thing it 'knows' is a matrix of numbers.
In your example, you might start out with selecting the background area by color (or hue, look up HSV). Everything else is your object. This is what classical greenscreening techniques do, and it only works with (a) a homogenous background, which does not share a color with your object and (b) a single or multiple not overlapping objects.
The problem with your edge based approach is that you won't get a closed edge safely, and deciding where the inside and outside of the object is might get tricky.
Advanced ways to do this would get you into Neural Network territory, but maybe try to get the basics down first.
Here are two links to tutorials on converting color spaces and extracting contours:
https://docs.opencv.org/4.x/df/d9d/tutorial_py_colorspaces.html
https://docs.opencv.org/3.4/d4/d73/tutorial_py_contours_begin.html
If you got that figured out, look into stereo vision or 3D imaging in general, and that subnautica scanner might just become reality some day ;)
Good luck !
I am trying to create a line tracking program with a drone with a forward facing camera. I understood this could be a bit difficult since the camera was not facing downward and would pick up on the environment. I need it to face forward for a face recognition algorithm. So I chose to make the line pink. I found on this site some parameters for color filtering. I thought they would be over compensating with the color range, but the tape doesn't show up in a full sheet, but rather in a ton of boxes inside the tape.
def pinkThreshold(image):
copy = image.copy()
copy = cv2.cvtColor(copy,cv2.COLOR_RGB2HSV)
lower_pink = np.array([125,30,100])
upper_pink = np.array([225,255,255])
pinkImage = cv2.inRange(copy, lower_pink, upper_pink)
edges = cv2.Canny(pinkImage,240,255)
return edges
The image I get is this:
I think it might have to do with the camera returning red squares, but i'm not completely sure what I should do about this and if this is even the issue. The red pattern areas seem to be like what I have seen, but i'm not completely sure. If that is true, what would be a good color filter with pink and red? Also, would this be solved by a large floodlight over the line to be tracked?
The camera is attached to a DJI Tello drone. I can't change the equipment.
I think it might have to do with the camera returning red squares, but i'm not completely sure what I should do about this and if this is even the issue.
Let's adjust contrast and use color substitution to see the actual problem:
As you can see, color noise is huge. If you try to do color-based segmentation or try to apply any other "color sensitive logic" by targeting any specific color you are going to see that noise being picked up:
You can always improve your lighting conditions and extend the range you defined, but there is another approach: you can use multiple colors to find the actual shape you need, you can use thresholding to boost some specific areas and so on:
Long story short:
improve lightning conditions
AND/OR do some specific preprocessing with wider color range to properly address noise and overall quality of the picture
When humans see markers suggesting the form of a shape, they immediately perceive the shape itself, as in https://en.wikipedia.org/wiki/Illusory_contours. I'm trying to accomplish something similar in OpenCV in order to detect the shape of a hand in a depth image with very heavy noise. In this question, assume that skin color based detection is not working (actually it is the best I've achieved so far but it is not robust under changing light conditions, shadows or skin colors. Also various paper shapes (flat and colorful) are on the table, confusing color-based approaches. This is why I'm attempting to use the depth cam instead).
Here's a sample image of the live footage that is already pre-processed for better contrast and with background gradient removed:
I want to isolate the exact shape of the hand from the rest of the picture. For a human eye this is a trivial thing to do. So here are a few attempts I did:
Here's the result with canny edge detection applied. The problem here is that the black shape inside the hand is larger than the actual hand, causing the detected hand to overshoot in size. Also, the lines are not connected and I fail at detecting contours.
Update: Combining Canny and a morphological closing (4x4 px ellipse) makes contour detection possible with the following result. It is still waaay too noisy.
Update 2: The result can be slightly enhanced by drawing that contour to an empty mask, save that in a buffer and re-detect yet another contour on a merge of three buffered images. The line that combines the buffered images is is hand_img = np.array(np.minimum(255, np.multiply.reduce(self.buf)), np.uint8) which is then morphed once again (closing) and finally contour detected. The results are slightly less horrible than in the picture above but laggy instead.
Alternatively I tried to use an existing CNN (https://github.com/victordibia/handtracking) for detecting the approximate position of the hand's center (this step works) and then flood from there. In order to detect contours the result is put into an OTSU filter and then the largest contour is taken, resulting in the following picture (ignore black rectangles in the left). The problem is that some of the noise is flooded as well and the results are mediocre:
Finally, I tried background removers such as MOG2 or GMG. They are confused by the enormous amount of fast-moving noise. Also they cut off the fingertips (which are crucial for this project). Finally, they don't see enough details in the hand (8 bit plus further color reduction via equalizeHist yield a very poor grayscale resolution) to reliably detect small movements.
It's ridiculous how simple it is for a human to see the exact precise shape of the hand in the first picture and how incredibly hard it is for the computer to draw a shape.
What would be your recommended method to achieve an exact hand segmentation?
After two days of desperate testing, the solution was to VERY carefully apply thresholding to an well-preprocessed image.
Here are the steps:
Remove as much noise as you possibly can. In my case, denoising was done using Intel's pyrealsense2 (I'm using an Intel RealSense depth camera and the algorithms were written for that camera family, thus they work very well). I used rs.temporal_filter() and directly after rs.hole_filling_filter() on every frame.
Capture the very first frame. Besides capturing the exact distance to the table (for later thresholding), this step also saves a still picture that is blurred by a 100x100 px kernel. Since the camera is never mounted perfectly but slightly tilted, there's an ugly grayscale gradient going over the picture and making operations impossible. This still picture is then subtracted from every single later frame, eliminating the gradient. BTW: this gradient removal step is already incorporated in the screenshots shown in the question above
Now the picture is almost noise-free. Do not use equalizeHist. This does not simply increase the general contrast regularly but instead empathizes the remaining noise way too much. This was my main error I did in almost all experiments. Instead, apply a threshold (binary with fixed border) directly. The border is extremely thin, setting it at 104 instead of 205 makes a huge difference.
Invert colors (unless you have taken BINARY_INV in the previous step), apply contours, take the largest one and write it to a mask
VoilĂ !
How to go from the image on the left to the image on the right programmatically using Python (and maybe some tools, like OpenCV)?
I made this one by hand using an online tool for clipping. I am completely noob in image processing (especially in practice). I was thinking to apply some edge or contour detection to create a mask, which I will apply later on the original image to paint everything else (except the region of interest) black. But I failed miserably.
The goal is to preprocess a dataset of very similar images, in order to train a CNN binary classifier. I tried to train it by just cropping the image close to the region of interest, but the noise is so high that the CNN learned absolutely nothing.
Can someone help me do this preprocessing?
I used OpenCV's implementation of watershed algorithm to solve your problem. You can find out how to use it if you read this great tutorial, so I will not explain this into a lot of detail.
I selected four points (markers). One is located on the region that you want to extract, one is outside and the other two are within lower/upper part of the interior that does not interest you. I then created an empty integer array (the so-called marker image) and filled it with zeros. Then I assigned unique values to pixels at marker positions.
The image below shows the marker positions and marker values, drawn on the original image:
I could also select more markers within the same area (for example several markers that belong to the area you want to extract) but in that case they should all have the same values (in this case 255).
Then I used watershed. The first input is the image that you provided and the second input is the marker image (zero everywhere except at marker positions). The algorithm stores the result in the marker image; the region that interests you is marked with the value of the region marker (in this case 255):
I set all pixels that did not have the 255 value to zero. I dilated the obtained image three times with 3x3 kernel. Then I used the dilated image as a mask for the original image (i set all pixels outside the mask to zero) and this is the result i got:
You will probably need some kind of method that will find markers automatically. The difficulty of this task depends heavily on the set of the input images. In some cases, the method can be really straightforward and simple (as in the tutorial linked above) but sometimes this can be a tough nut to crack. But I can't recommend anything because I don't know how your images look like in general (you only provided one). :)
I am working with a stack of noisy images, trying to isolate a blob in an image section. Below you can see the starting image, loaded and plotted with python, and the same image after some editing with Gimp.
What I want to do is to isolate and recognise the blob inside the light blue circle, editing the image and then using something like ndimage.label. Do you have any suggestion on how to edit the image? Thanks.
The background looks quite even so you should be able to isolate the main object using thresholding, allowing you to use array masking to identify regions within the main object. I would have a go with some tools from scikit image to see where that gets you http://scikit-image.org/docs/dev/auto_examples/
I would try gaussian/median filtering followed by thresholding/filling gaps. Or you could try random walker segmentation, or pherhaps texture classification might be more useful. When you have a list of smaller objects within the main object you can then filter these with respect to shape, size, roundness etc
http://scikit-image.org/docs/dev/auto_examples/plot_label.html#example-plot-label-py