I'm trying to detect camera tampering (the lens being blocked, resulting in a black frame). The approach I have taken so far is to apply background subtraction, threshold the foreground mask, and then find contours. Next, I compute the area of each contour, and if a contour's area is higher than a threshold value (say, larger than 3/4 of the camera frame area), the camera is considered tampered.
However, with this approach I get false tamper alerts even when the camera has its full, unobstructed view.
I'm not sure how to go about this detection.
Any help would be highly appreciated.
One solution would be to compute the median of the image. Then you can use a simple threshold on the median to detect a blocked camera. Of course, the median is not flexible enough to detect only 3/4 of the camera being blocked, but you could also loop over the pixels yourself, count the ones that are below a certain threshold, and then compute the percentage of the view that is blocked.
Here is a link where you can see how to compute the median: link
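As an illustration only, a median-based check could look like the sketch below; the snapshot filename and both thresholds are placeholders to tune.
import cv2
import numpy as np

frame = cv2.imread('yourCamSnapshot.jpg')          # placeholder snapshot path
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# If the lens is covered, most pixels are near black and the median drops sharply.
if np.median(gray) < 30:
    print("possible tamper: camera appears blocked")

# Alternative: percentage of near-black pixels, which also catches partial blocking.
black_ratio = np.mean(gray < 30)
if black_ratio > 0.75:
    print("possible tamper: {:.0%} of the frame is dark".format(black_ratio))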
A possible cause for this error could be slight jitter in the frame caused by the camera shaking a little.
If your background subtraction algorithm isn't tolerant enough to low-value colour changes, then a tamper alert will be triggered even if you shake the camera a bit.
I would suggest using MOG2 for background subtraction
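For illustration, a minimal MOG2 loop might look like the sketch below; the camera index and the history/varThreshold values are assumptions, not tuned settings.
import cv2

cap = cv2.VideoCapture(0)  # camera index is an assumption
# detectShadows=True marks shadows as gray (127) instead of white (255),
# which makes the mask more tolerant of soft lighting changes.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=32, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # Keep only confident foreground (255), dropping shadow pixels (127).
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    cv2.imshow("foreground mask", mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()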
Take a snapshot from your camera and save it.
Then run this Python code. If you don't have the PIL library (Pillow) you have to install it.
from PIL import Image

# Convert to RGB so every pixel is an (R, G, B) tuple regardless of the file's mode.
im = Image.open('yourCamSnapshot.jpg').convert('RGB')
pixels = im.getdata()

blackThresh = 30  # 30 is the RGB value below which a channel counts as black
nblack = 0
for pixel in pixels:
    if pixel[0] <= blackThresh and pixel[1] <= blackThresh and pixel[2] <= blackThresh:
        nblack = nblack + 1

n = len(pixels)
if (nblack / float(n)) > 0.8:  # 0.8 means 80% of the pixels are black
    print("raise exception")
I have a camera with a field of view of approximately 195*130 degrees. This lens will be put in a circular holder, and the lens should not see the holder. Here's an image of what I don't want.
I drew 4 rectangles in Paint. There are 4 black spots, which are the holder. The fully red one is just for censorship; it isn't actually there.
If the camera image streams like that, that's a no. I need to detect those black spots, and if the image looks like this it should give me an error message or simply 'false'. I searched Google and couldn't find anything on this. I'm a noob on this subject, but if you explain how to do it I can connect the dots.
Thank you for your help.
I get the stream via a USB capture card. It acts like a webcam.
#UPDATE1: I cropped the four corners of the image, thresholded them, added some basic if/else logic, and got what I wanted. Thank you anyway.
Try detection by generating custom Haar filters.
Or do it simply by applying a (nearly black) threshold and checking whether some tiny squares in the corners are completely black.
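A rough sketch of that corner-patch check is below; the patch size, darkness threshold, and "mostly black" ratio are guesses you would need to adapt to your stream.
import cv2
import numpy as np

def corners_are_black(frame, patch=40, dark_thresh=30, dark_ratio=0.9):
    # Return True if any corner patch of the frame is almost entirely dark.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    h, w = gray.shape
    patches = [gray[:patch, :patch],          # top-left
               gray[:patch, w - patch:],      # top-right
               gray[h - patch:, :patch],      # bottom-left
               gray[h - patch:, w - patch:]]  # bottom-right
    return any(np.mean(p < dark_thresh) > dark_ratio for p in patches)

cap = cv2.VideoCapture(0)   # the USB capture card shows up as a webcam
ok, frame = cap.read()
if ok and corners_are_black(frame):
    print(False)            # holder is visible in the image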
I'm currently working on my first assignment in image processing (using OpenCV in Python). My assignment is to calculate a precise score (to tenths of a point) for one or several shooting holes in an image uploaded by a user. One of the requirements is to transform the uploaded shooting-target image to a "bird's-eye view" for further processing. For that, I have decided that I need to find the center coordinates of the numbers (7 & 8) to use them as the 4 points of my quadrilateral.
Unfortunately, there are several limitations that need to be taken into account.
Limitations:
resolution of the processed shooting target image can vary
the image can be taken in different lighting conditions
the image processed by this part of my algorithm will always be taken under an angle (extreme angles will be automatically rejected)
the image can be slightly rotated (+/- 10 degrees)
the shooting target can be just a part of the image
the image can be only of the center black part of the target, meaning the user doesn't have to take a photo of the whole shooting target (but there always has to be the center black part on it)
this algorithm can take a maximum of 2000ms runtime
What I have tried so far:
Template matching
here I quickly realized that it was unusable since the numbers could be slightly rotated and at a different scale
Feature matching
I have tried all of the different feature matching types (SIFT, SURF, ORB...)
unfortunately, the numbers do not have a very distinctive set of features, so the matching produced quite a lot of false positives, though I could possibly filter them by adding shape matching, etc.
the biggest blocker was runtime: feature matching for a single number alone took around 5000 ms even after optimizations (on a MacBook Pro 2017)
Optical character recognition
I mostly tried using pytesseract library
even after thresholding the image to inverted binary (so the text of numbers 7 and 8 is black and the background white) it failed to recognize them
I also tried several ways of preprocessing the image and I played a lot with the tesseract config parameter but it didn't seem to help whatsoever
Contour detection
I easily detected all of the wanted numbers (7 & 8) as single contours, but failed to filter out all of the false positives (since the image can come in different resolutions, and there are two types of targets with different number sizes, I couldn't simply threshold the contours by width, height, or area)
After I would detect the numbers as contours I wanted to extract them as some ROI and then I would use OCR on them (but since there were so many false positives this would take a lot of time)
I also tried filtering them by using cv2.matchShapes function on both contours and cropped template / ROI but it seemed really unreliable
Example processed images:
(six example processed target photos; high-resolution versions were linked in the original post)
As of right now, I'm lost on how to proceed. I have tried everything I could think of. I would be immensely happy if any of you image recognition experts could give me any kind of advice, or even better, a usable code example to help me solve my problem.
Thank you all in advance.
Find the black disk by adaptive binarization and contour extraction (possibly blur first to erase the inner features); a rough sketch of these first two steps follows this list;
Fit an ellipse to the outline, as accurately as possible;
Find at least one edge of the square (Hough lines);
Classify the edge as one of NWSE (according to angle);
Use the ellipse and the line information to reconstruct the perspective transformation (it is a homography);
Apply the inverse homography to straighten the image and obtain the exact target center and axis;
Again by adaptive binarization, find the bullet holes (center/radius);
Rate the holes by their distance to the center, relative to the black disk radius.
If the marking scheme is variable, detect the circles (Hough circles, using the known center, or detect peaks in an oblique profile starting from the center).
If necessary, you could OCR the digits, but it seems that the score is implicitly starting at one in the outer ring.
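Hypothetically, the first two steps could look like the sketch below in OpenCV; the filename, blur size, and adaptive-threshold parameters are placeholders, and it assumes the black disk is the largest dark blob in the frame.
import cv2

img = cv2.imread("target.jpg")                 # placeholder filename
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (21, 21), 0)     # erase the inner rings and digits

# Adaptive binarization: block size and constant C need tuning per image set.
binary = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 51, 5)

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
disk = max(contours, key=cv2.contourArea)      # assume the black disk is the biggest blob

ellipse = cv2.fitEllipse(disk)                 # ((cx, cy), (major, minor), angle)
cv2.ellipse(img, ellipse, (0, 255, 0), 2)
cv2.imwrite("disk_ellipse.png", img)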
I am working with frames from a video. The video is overlaid with several semi-transparent boxes, and my goal is to find the coordinates of these boxes. These boxes are the only fixed elements in the video - the camera is moving, color intensity changes, and there is no fixed reference. The problem is that the boxes are semi-transparent, so they also change with the video, albeit not as much. It seems that neither background subtraction nor tracking has the right tools for this problem.
Nevertheless, I've tried the background subtractors that come with cv2 as well as some home-brewed methods using differences between frames and thresholding. Unfortunately, these don't work due to the box transparency.
For reference, here is what the mean difference between the first 50 frames looks like:
And here is what cv2 background subtractor KNN returns:
I've experimented with thresholds, number of frames taken into account, various contouring algorithms, blurring/sharpening/etc. I've also tried techniques from document layout analysis.
I wonder if maybe there is something I'm missing due to not knowing the right keyword. I don't expect anyone here to give me the perfect solution, but any pointers as to where to look/what approach to try, are appreciated. I'm not bound to cv2 either, anything that works in python will do.
If you take a sample of random frames as elements of an array and calculate the FFT along the time axis, all the semi-transparent boxes will have a very strong signal, while the rest of the pixels behave as noise, so removing the noise will leave just the semi-transparent boxes. You can add the results of your other methods as additional frames for the FFT.
You are trying to find something that does not change over the entire video, so do not use consecutive frames, or if you are forced to use consecutive frames, shuffle them randomly.
To gain speed, you may take only one color channel from each frame and pick that channel randomly. That way the colors become noise and cancel each other out.
If the FFT is too expensive, simply averaging random frames should filter out the noise.
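A minimal sketch of that averaging variant, assuming the video can be read with cv2.VideoCapture; the path and sample size are placeholders.
import random
import cv2
import numpy as np

cap = cv2.VideoCapture("video.mp4")                     # placeholder path
n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
sample = random.sample(range(n_frames), k=min(50, n_frames))

acc = None
count = 0
for idx in sample:
    cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
    ok, frame = cap.read()
    if not ok:
        continue
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
    acc = gray if acc is None else acc + gray
    count += 1
cap.release()

# The moving scene averages out to a smooth blur, while the static
# semi-transparent boxes keep their edges, so a threshold/contour pass
# on this mean image can pick them up.
mean_img = (acc / count).astype(np.uint8)
cv2.imwrite("mean_of_random_frames.png", mean_img)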
OK, here is a first step: you can run Canny on that image, and from the Canny edges you can find contours:
import cv2
import random as rng

# Use a raw string so the backslashes in the Windows path are not treated as escapes.
image = cv2.imread(r"c:\stackoverflow\interface.png")
edges = cv2.Canny(image, 100, 240)

# cv2.RETR_EXTERNAL would work better if the image were not framed.
contoursext, hierarchy = cv2.findContours(
    edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Draw each contour in a random color.
for i in range(len(contoursext)):
    color = (rng.randint(0, 255), rng.randint(0, 255), rng.randint(0, 255))
    cv2.drawContours(image, contoursext, i, color, 1, cv2.LINE_8, hierarchy, 0)

# Show in a window
cv2.imshow("Canny", edges)
cv2.imshow("Contour", image)
cv2.waitKey(0)
Then you can test whether a contour, or a combination of two contours, forms a rectangle, for example... which would probably detect most of the rectangular overlays...
Alternatively, you can try detecting Canny lines and check whether they are arranged like rectangles.
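As a follow-up to the rectangle test, one common heuristic is polygon approximation; this sketch is only an illustration, and the epsilon factor, area cut-off, and fill ratio are untuned guesses.
import cv2

def looks_like_rectangle(contour, min_area=500):
    # A contour is treated as rectangular if its simplified polygon has
    # 4 vertices, is convex, and covers most of its bounding box.
    if cv2.contourArea(contour) < min_area:
        return False
    peri = cv2.arcLength(contour, True)
    approx = cv2.approxPolyDP(contour, 0.02 * peri, True)
    if len(approx) != 4 or not cv2.isContourConvex(approx):
        return False
    x, y, w, h = cv2.boundingRect(approx)
    return cv2.contourArea(approx) / float(w * h) > 0.85

# Usage with the contours found above:
# boxes = [c for c in contoursext if looks_like_rectangle(c)]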
When humans see markers suggesting the form of a shape, they immediately perceive the shape itself, as in https://en.wikipedia.org/wiki/Illusory_contours. I'm trying to accomplish something similar in OpenCV in order to detect the shape of a hand in a depth image with very heavy noise. In this question, assume that skin color based detection is not working (actually it is the best I've achieved so far but it is not robust under changing light conditions, shadows or skin colors. Also various paper shapes (flat and colorful) are on the table, confusing color-based approaches. This is why I'm attempting to use the depth cam instead).
Here's a sample image of the live footage that is already pre-processed for better contrast and with background gradient removed:
I want to isolate the exact shape of the hand from the rest of the picture. For a human eye this is a trivial thing to do. So here are a few attempts I did:
Here's the result with canny edge detection applied. The problem here is that the black shape inside the hand is larger than the actual hand, causing the detected hand to overshoot in size. Also, the lines are not connected and I fail at detecting contours.
Update: Combining Canny and a morphological closing (4x4 px ellipse) makes contour detection possible with the following result. It is still waaay too noisy.
Update 2: The result can be slightly enhanced by drawing that contour to an empty mask, saving that in a buffer, and re-detecting yet another contour on a merge of three buffered images. The line that combines the buffered images is hand_img = np.array(np.minimum(255, np.multiply.reduce(self.buf)), np.uint8), which is then morphed once again (closing) and finally contour-detected. The results are slightly less horrible than in the picture above, but laggy instead.
Alternatively I tried to use an existing CNN (https://github.com/victordibia/handtracking) for detecting the approximate position of the hand's center (this step works) and then flood from there. In order to detect contours the result is put into an OTSU filter and then the largest contour is taken, resulting in the following picture (ignore black rectangles in the left). The problem is that some of the noise is flooded as well and the results are mediocre:
Finally, I tried background removers such as MOG2 or GMG. They are confused by the enormous amount of fast-moving noise. Also they cut off the fingertips (which are crucial for this project). Finally, they don't see enough details in the hand (8 bit plus further color reduction via equalizeHist yield a very poor grayscale resolution) to reliably detect small movements.
It's ridiculous how simple it is for a human to see the exact precise shape of the hand in the first picture and how incredibly hard it is for the computer to draw a shape.
What would be your recommended method to achieve an exact hand segmentation?
After two days of desperate testing, the solution was to VERY carefully apply thresholding to a well-preprocessed image.
Here are the steps:
Remove as much noise as you possibly can. In my case, denoising was done using Intel's pyrealsense2 (I'm using an Intel RealSense depth camera and the algorithms were written for that camera family, thus they work very well). I used rs.temporal_filter() and directly after rs.hole_filling_filter() on every frame.
Capture the very first frame. Besides capturing the exact distance to the table (for later thresholding), this step also saves a still picture that is blurred by a 100x100 px kernel. Since the camera is never mounted perfectly but is always slightly tilted, there's an ugly grayscale gradient across the picture that makes operations impossible. This still picture is then subtracted from every single later frame, eliminating the gradient. BTW: this gradient removal step is already incorporated in the screenshots shown in the question above.
Now the picture is almost noise-free. Do not use equalizeHist. It does not simply increase the general contrast evenly; instead it emphasizes the remaining noise far too much. This was the main error I made in almost all my experiments. Instead, apply a threshold (binary with a fixed value) directly. The margin is extremely thin: setting it at 104 instead of 105 makes a huge difference.
Invert the colors (unless you used BINARY_INV in the previous step), find contours, take the largest one, and write it to a mask.
Voilà!
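For reference, a condensed sketch of steps 2-4 in OpenCV terms is below. The 100x100 blur and the 104 threshold come from the description above; the function names, the 8-bit BGR input assumption, and the choice of THRESH_BINARY_INV are illustrative only.
import cv2
import numpy as np

def make_background_model(first_frame):
    # Step 2: heavily blur the first (empty) frame to capture the table's
    # brightness gradient so it can be subtracted from later frames.
    gray = cv2.cvtColor(first_frame, cv2.COLOR_BGR2GRAY)
    return cv2.blur(gray, (100, 100))

def hand_mask(frame, background, thresh=104):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Step 2 (continued): remove the gradient; do NOT apply equalizeHist afterwards.
    flat = cv2.subtract(gray, background)
    # Step 3: a plain fixed binary threshold on the flattened image. Whether you
    # need THRESH_BINARY or THRESH_BINARY_INV depends on whether the hand ends up
    # darker or brighter than the table after preprocessing.
    _, binary = cv2.threshold(flat, thresh, 255, cv2.THRESH_BINARY_INV)
    # Step 4: keep only the largest contour as the hand mask.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    mask = np.zeros_like(gray)
    if contours:
        cv2.drawContours(mask, [max(contours, key=cv2.contourArea)], -1, 255, -1)
    return mask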
So, I am developing a hobby project where a camera is mounted on a bot powered by a Raspberry Pi. The bot moves around a room and does some processing based on the camera's output. I apologize if this is not the right place to ask.
Problem:-
The camera attached to the bot performs background subtraction continuously while the bot is moving. If the background subtraction algorithm detects an object in front of the bot, it stops the bot and does further processing with respect to the object. The working assumption here is that the ground is of only one color and uniform to a great extent.
The algorithm works great under very controlled lighting conditions. The problem arises when there are slight lighting changes or when the ground has small patches/potholes/unevenness in it. These scenarios generate false flags, and as a result my bot stops. I want to know if there is any way to prevent these false flags with the help of any modifications to the following code.
import os
import time
import cv2
import numpy as np
from picamera import PiCamera
from picamera.array import PiRGBArray

camera = PiCamera()
camera.resolution = (512, 512)
camera.awb_mode = "fluorescent"
camera.iso = 800
camera.contrast = 25
camera.brightness = 64
camera.sharpness = 100
rawCapture = PiRGBArray(camera, size=(512, 512))

first_time = 0    # This flag is to capture the first frame as the background image.
frame_buffer = 0  # This counter is used to change the background image every 30 frames.
counter = 0

camera.start_preview()
time.sleep(1)

def imageSubtract(img):
    # Use the V channel of the LUV colour space for differencing.
    luv = cv2.cvtColor(img, cv2.COLOR_BGR2LUV)
    l, u, v = cv2.split(luv)
    return v

for frame in camera.capture_continuous(rawCapture, format="bgr", use_video_port=True):
    # The first 10 frames are rejected while the camera adjusts its colours.
    if first_time == 0:
        rawCapture.truncate(0)
        if frame_buffer < 10:
            print("Frame rejected -", str(frame_buffer))
            frame_buffer += 1
            continue
        os.system("clear")
        refImg = frame.array
        refThresh = imageSubtract(refImg)
        first_time = 1
        frame_buffer = 0

    frame_buffer += 1
    cv2.imshow("Background", refImg)
    image = frame.array
    cv2.imshow("Foreground", image)
    key = cv2.waitKey(1)
    rawCapture.truncate(0)

    newThresh = imageSubtract(image)
    diff = cv2.absdiff(refThresh, newThresh)  # Here the background image is subtracted from the foreground
    kernel = np.ones((5, 5), np.uint8)
    diff = cv2.morphologyEx(diff, cv2.MORPH_OPEN, kernel)
    diff = cv2.dilate(diff, kernel, iterations=2)
    _, thresholded = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, contours, _ = cv2.findContours(thresholded, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 3.x signature

    try:
        c = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(thresholded, (x, y), (x + w, y + h), (125, 125, 125), 2)
        if cv2.contourArea(c) > 300 and len(contours) <= 3:
            if counter == 0:
                print("Going to sleep for 0.1 second")  # allow the device to move ahead for 0.1 s before processing the object
                time.sleep(0.1)
                counter = 1
                continue
            else:
                print("Object found !")
        cv2.imshow("Threshold", thresholded)
        if frame_buffer % 30 == 0:
            frame_buffer = 0
            refImg = image
            refThresh = imageSubtract(refImg)
            os.system("clear")
            print("Reference image changed")
    except Exception as e:
        print(e)
NOTE :- The above algorithm uses the continuous capture mode of the PiCamera. The first 10 frames are rejected because I have noticed that the PiCamera takes some time to adjust its colors once it starts up. The background image is changed every 30 frames because I wanted the background image to remain as close as possible to the foreground image: since the room is quite big, there are going to be some local changes in the light/color of the ground between one corner of the room and the other, hence the need to update the background image every 30 frames. The object needs to have an area greater than 300 for it to be detected. I have also added a delay of 0.1 sec (time.sleep(0.1)) after the object has been detected, because I want the object to enter the frame completely and be right in the middle of the frame before the device stops.
Some solutions that I had in mind were:
I thought of attaching a few IR sensors at the base of the device. If an object is detected (real or false flag), the bot would check the output of the IR sensors to see whether they also pick up an object. In case of shadows and patches, the IR response is going to be null, so the bot continues to move forward.
I thought of calculating the height of the detected object. If the height were above a certain threshold, the presence of an object could be confirmed; otherwise it is a false flag. But the camera is going to be facing down, which means the image is taken from the top, so I don't think there is any way to ascertain the height of the object from its top-down image.
Please suggest any alternative solutions. I want to make the above algorithm as perfect as possible because the entire working of the device depends upon the accuracy of the background subtraction algorithm.
Thank you for your help in advance !
EDIT 1-
Background Image - Back
Foreground Image - Front
Background subtraction works well when your camera is not moving and objects are moving in front of the camera. Your case is just the opposite and it will easily detect movement everywhere.
If your background is uniform, any edge detection filter may work better than your background subtraction.
Another alternative is using thresholding (see https://docs.opencv.org/3.4.0/d7/d4d/tutorial_py_thresholding.html). I would suggest an adaptive threshold (Gaussian or mean) that will work better if the edges are darker than the center. Try different sizes of the neighbourhood and different values for the constant C.
Then you can erode/dilate as you already do and filter by size or position as you suggest. Adding more sensors to your robot is a good idea as well.
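For illustration, an adaptive-threshold pass of that kind might look like the sketch below; the blur size, block size, constant C, and area cut-off are untuned assumptions.
import cv2
import numpy as np

def find_obstacle(frame, min_area=300):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)
    # Adaptive mean threshold: every pixel is compared against its local
    # neighbourhood, which is far more tolerant of gradual lighting changes
    # than differencing against a stored background frame.
    mask = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY_INV, 31, 5)
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    big = [c for c in contours if cv2.contourArea(c) > min_area]
    return big  # a non-empty list means something stands out from the uniform floor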