OpenCV to Capture Unique Objects from a Video - Python

I was doing frame slicing with the OpenCV library in Python, and I am successfully able to create frames from the video being tested.
I am doing it on a CCTV camera installed at a parking entry gateway where the video plays 24x7, and at times a car stands still for a good number of minutes, leading to many consecutive frames of the same vehicle.
My question is how can I create a frame only when a new vehicle enters the parking lot?

Stack Overflow is for code-related queries. I suggest you try some code and share your results and your problems before posting anything here. That being said, you can start with object detection tutorials like this and then do tracking with SORT. Many pre-trained models are available that include a car class, so you won't even need to train a new model.

Do you need to detect license plates, etc., or just notice when something happens? For the latter, you could use a very simple approach. Take an average of, say, the frames from the last 30 seconds and subtract that from the current frame. If the mean absolute value of the difference image is above a threshold, that could be the change you are looking for.
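A minimal sketch of that idea, assuming an RTSP placeholder source and treating the averaging rate and threshold as values you would tune for your scene:

import cv2

cap = cv2.VideoCapture('rtsp_of_your_cctv')  # placeholder source
avg = None          # running average of recent frames
ALPHA = 0.05        # how quickly the average follows the scene; tune this
THRESHOLD = 10.0    # mean absolute difference that counts as "something changed"; tune this

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)       # suppress sensor noise
    if avg is None:
        avg = gray.astype("float")
        continue
    delta = cv2.absdiff(gray, cv2.convertScaleAbs(avg))
    cv2.accumulateWeighted(gray, avg, ALPHA)         # update the rolling average
    if delta.mean() > THRESHOLD:
        print("change detected - save this frame")

cap.release()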

You could do some simpler motion detection with OpenCV; it's nicely explained in https://www.pyimagesearch.com/2015/05/25/basic-motion-detection-and-tracking-with-python-and-opencv/
So if you have a picture of the background as a reference, you can compare each new image to the background and only save the image if it's different enough from the background (hopefully only when a car has entered). Then make this the new background, and reset for a new car when the new images again start looking like the original background.
Hopefully I stated my idea clearly enough and that link provides enough information to implement it. If not, just ask for clarification! A rough sketch of the idea follows.
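This sketch assumes the first frame is a clean shot of the empty gate and that the difference threshold is something you would find by experiment; the RTSP path is a placeholder:

import cv2

cap = cv2.VideoCapture('rtsp_of_your_cctv')   # placeholder source
ret, first = cap.read()
background = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)

THRESHOLD = 20.0      # mean difference that counts as "a vehicle is present"; tune this
car_present = False
saved = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    score = cv2.absdiff(gray, background).mean()

    if not car_present and score > THRESHOLD:
        # scene stopped looking like the empty gate: treat it as a new vehicle
        cv2.imwrite('vehicle_{}.jpg'.format(saved), frame)
        saved += 1
        car_present = True
    elif car_present and score < THRESHOLD:
        # scene looks like the empty background again: ready for the next vehicle
        car_present = False

cap.release()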

First, you have to have a specific XML cascade to detect only cars; you can get it here. I have developed code to uniquely identify and count the cars that are visible to the CCTV you are using. At times it depends heavily on the frame rate and on the detection itself, so you can control the frame rate and also the total count variable.
import cv2

cascade_src = 'cars.xml'
cap = cv2.VideoCapture('rtsp_of_ur_cctv')
car_cascade = cv2.CascadeClassifier(cascade_src)
prev_count = 0
total_count = 0
while True:
    ret, img = cap.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cars = car_cascade.detectMultiScale(gray, 1.1, 1)
    if len(cars) > prev_count:
        difference = len(cars) - prev_count
        total_count = total_count + difference
        # here you can save the unique new entry and possibly avoid the repeated ones
        print(total_count)
    for (x, y, w, h) in cars:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
    prev_count = len(cars)
    cv2.imshow('video', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

Related

Recording real-time face expression detection

I have coded face expression detection using a Jupyter notebook, detecting seven facial expressions (Anger, Sad, Disgust, Happy, ...), and tried real-time detection using my laptop's camera. Now I want to record the expressions detected by the model during real-time detection and create a figure of the detected expressions over time. First of all, is it possible to do so? If not, what other options do I have? For example, can I record the video taken by the camera, later detect the expressions from the video, and make a figure of all the expressions detected over time? Thank you all for helping me!
You could do something like this:
from tensorflow import keras
import cv2
import numpy as np

all_labels = ["Anger", "Sad", "Disgust", "Happy"]

# load the trained model, or train a model
model = keras.models.load_model('path/to/location')

# Open the camera
cap = cv2.VideoCapture(0)
# Or similarly open a saved video
# cap = cv2.VideoCapture('path/to/video')

# Check if the camera was opened correctly
if not cap.isOpened():
    print("Could not open video device")

# Fetch one frame at a time from your camera in real time or from the video
i = 0
while True:
    # frame is a numpy array that you can predict on
    ret, frame = cap.read()
    # Obtain the prediction (you may have to reshape frame according to your model)
    prediction = model(frame, training=False)
    # obtain a label from the prediction, depending on your label list
    label = all_labels[np.argmax(prediction)]
    # save the frame in a different folder depending on the label predicted
    if label in all_labels:
        cv2.imwrite('{}/frame_{}.jpg'.format(label, i), frame)
        i = i + 1
    # Wait for a user input to quit the application
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
I made an answer to a similar but not identical problem; maybe you can draw inspiration from that. Also, this is a great tutorial on capturing live video with OpenCV.
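If the end goal is a figure of the detected expressions over time rather than saved frames, one option (a rough sketch, assuming you collect one predicted label index and a timestamp per frame inside the loop above) is to plot them afterwards with matplotlib; the sample values below are placeholders for illustration, not real detections:

import matplotlib.pyplot as plt

all_labels = ["Anger", "Sad", "Disgust", "Happy"]

# collected inside the capture loop, e.g. one entry per processed frame:
#   timestamps.append(seconds_since_start); label_indices.append(all_labels.index(label))
timestamps = [0.0, 0.5, 1.0, 1.5, 2.0]   # placeholder values for illustration only
label_indices = [3, 3, 1, 1, 0]          # placeholder values for illustration only

# step plot: which expression was detected at each point in time
plt.step(timestamps, label_indices, where='post')
plt.yticks(range(len(all_labels)), all_labels)
plt.xlabel('time (s)')
plt.ylabel('detected expression')
plt.show()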

Trying to read Thermal Data from Hikvision Camera in Python

I'm looking for a solution to access the thermal data of the camera. I used OpenCV and could only get the original image; there is no additional data to process, such as temperature. I tried the available libraries for HikVision cameras and searched the net for this, but I could not succeed. I also tried a FLIR library with no success.
The second solution I have is converting RGB to temperature, but I don't know how to approach that kind of process. I also know the temperature range of the device, which is between -20 and 150 degrees.
I am looking for something like this:
# cam model: hikvision DS-2TD2615-10
import cv2
# import a hikvision API library, for example

thermal = cv2.VideoCapture()
thermal.open("rtsp://""user:pass#ip:port/Streaming/channels/202/")
ret, frame = thermal.read()
while True:
    ret, frame = thermal.read()
    temp_data = api.read_temperature(frame)  # -> array or excel file
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
thermal.release()
cv2.destroyAllWindows()
My video input is similar to this picture, and, for example, I want to find out how hot the nose is just by clicking on it:
A general answer for thermal images from any camera: you can't just convert the grayscale level (or color, if you have already applied a palette to your image) to temperature values. You need to know some coefficients that relate to the IR matrix. Some software may embed that data in the image file metadata, but there is no standard for that. Also, if you re-saved your image file without knowing it, you probably lost that metadata.
Also, like a plain visible-light camera, an IR camera can adapt its range to the current image. So, if you're shooting a human in a room, the minimum temperature in your picture will be around 22°C (a cold wall or floor) and the maximum around 37°C (the hottest part of the human body). In that case you get 256 gray levels covering a range of 15 degrees, so black is 22°C and white is 37°C (keep in mind the relation is not linear!). Move your camera to a cup of hot tea at around 60°C and the relation of gray level to temperature changes. So you need to get the coefficients for every frame.
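Just to illustrate that point: if you somehow knew the coldest and hottest temperature for one particular frame, a naive linear mapping from gray level to temperature would look like the sketch below. In reality those two values change every frame and the relation is not linear, which is exactly the missing-coefficient problem (the file name and click coordinates here are placeholders):

import cv2

# known (or guessed) only for this single frame - normally you do not have these numbers
FRAME_MIN_TEMP = 22.0   # degrees C mapped to gray level 0
FRAME_MAX_TEMP = 37.0   # degrees C mapped to gray level 255

gray = cv2.imread('thermal_frame.png', cv2.IMREAD_GRAYSCALE)  # placeholder file

# naive linear mapping of the 0..255 gray levels onto this frame's temperature span
temperature = FRAME_MIN_TEMP + (gray.astype(float) / 255.0) * (FRAME_MAX_TEMP - FRAME_MIN_TEMP)

# e.g. look up the value under a clicked pixel (x, y)
x, y = 100, 80
print("approx. temperature at ({}, {}): {:.1f} C".format(x, y, temperature[y, x]))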
It is possible to "fix" temperature range on some cameras but that depends on specific models.
More than that - most cheap thermal cameras don't deal with temperature values at all.
P.S. Oh, I just noticed the exact model of your camera. So the answer is an even stronger "you can't". Don't expect the capabilities of a scientific or medical thermal camera from that poorly documented Chinese surveillance hardware.

How to differentiate between two progressive images in OpenCV

I have a video file from evening time (6pm-9pm), and I want to detect the movement of people on the road.
While trying to find the difference between a handful of images from a 10-minute time frame (10 equally spaced images within any 10-minute video clip), I'm facing these challenges:
All the images come out as different (raising an alert) because a plant is moving in the wind all the time.
All 10 images also come out as different because the sun is setting, so due to the natural light variation the 10 images from the 10-minute frame differ even though there is no human movement.
How do I restrict my algorithm to focus only on movement in a certain area of the video rather than all of it? (I couldn't find anything on Google and don't know if there's an algorithm in OpenCV for this.)
This one is rather difficult to deal with. I recommend blurring the frames a little to reduce the noise from the moving plants. Also, if the range of the movement is not large, try changing the difference threshold and the area threshold (if your algorithm includes contour detection as the following step). Hope this helps a little.
For detecting the "movement" of people, 10 frames per 10 minutes is far too low a frame rate. The people in two frames can be totally different, which means you cannot track the movement of a single person, only find the differences between two frames. For low-fps video like this, I recommend trying Background Subtraction to find people in the frames instead of people's movements between frames. For background subtraction, to solve
All 10 images also come out as different because the sun is setting, so due to the natural light variation the 10 images from the 10-minute frame differ even though there is no human movement.
you can try using the average image of all frames as the background_img in
difference = current_img - background_img
If the time span is longer, you can use the average of the images closer to current_img as background_img, and keep updating background_img while running the video.
If your ROI is a rectangle in the frame, use
my_ROI = cv::Rect(x, y, width, height)
cv::Mat ROI_img= frame(my_ROI)
If not, try using a mask.
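In Python the same ideas could look roughly like the sketch below, using NumPy slicing for a rectangular ROI, a blur to damp the moving plants, and cv2.accumulateWeighted to keep the background tracking the slow light change; the file name, ROI coordinates, and thresholds are placeholders to tune:

import cv2

cap = cv2.VideoCapture('evening_road.mp4')   # placeholder video file
X, Y, W, H = 100, 200, 400, 150              # placeholder ROI covering only the road
THRESHOLD = 25                               # per-pixel difference threshold; tune this
background = None

while True:
    ret, frame = cap.read()
    if not ret:
        break
    roi = frame[Y:Y + H, X:X + W]                       # crop to the region of interest
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)          # damp plant/leaf noise
    if background is None:
        background = gray.astype("float")
        continue
    cv2.accumulateWeighted(gray, background, 0.05)      # slowly follow the sunset light change
    delta = cv2.absdiff(gray, cv2.convertScaleAbs(background))
    _, mask = cv2.threshold(delta, THRESHOLD, 255, cv2.THRESH_BINARY)
    if cv2.countNonZero(mask) > 500:                    # area threshold, tune per scene
        print("movement inside the ROI")

cap.release()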
I think what you are looking for is Pedestrian Detection. You can do this easily in Python with the OpenCV package.
import cv2

# Initialize a HOG descriptor
hog = cv2.HOGDescriptor()
# Set it up for pedestrian detection
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
# Then use the detector on a frame, e.g. one read from your video
frame = cv2.imread('frame.jpg')
(rects, weights) = hog.detectMultiScale(frame, winStride=(4, 4), padding=(8, 8), scale=1.05)
Example: Pedestrian Detection OpenCV

How do I prevent false flags in Background subtraction using OpenCV?

So, I am developing a hobby project where a camera is mounted on a bot, powered by a Raspberry Pi. The bot moves around a room and does some processing based on the camera's response. I apologize if this is not the right place to ask.
Problem:-
The camera attached to the bot will perform background subtraction continuously. The bot will be moving at the same time. In case the background algorithm detects an object in front of the bot, it will stop the bot and do further processing with respect to the object. The working assumption here is that the ground is of only one color and uniform to a great extent.
The algorithm works great under very controlled lighting conditions. The problem arises when there are slight lighting changes or when the ground has small patches/potholes/unevenness in it. The above scenarios generate false flags and as a result my bot stops. I want to know if there is any way to prevent these false flags with the help of any modifications in the following code?
import os
import cv2
import time
import numpy as np
from picamera import PiCamera
from picamera.array import PiRGBArray
from time import sleep

camera = PiCamera()
camera.resolution = (512, 512)
camera.awb_mode = "fluorescent"
camera.iso = 800
camera.contrast = 25
camera.brightness = 64
camera.sharpness = 100
rawCapture = PiRGBArray(camera, size=(512, 512))
first_time = 0    # This flag is to capture the first frame as the background image.
frame_buffer = 0  # This flag is to change the background image every 30 frames.
counter = 0
camera.start_preview()
sleep(1)

def imageSubtract(img):
    luv = cv2.cvtColor(img, cv2.COLOR_BGR2LUV)
    l, u, v = cv2.split(luv)
    return v

for frame in camera.capture_continuous(rawCapture, format="bgr", use_video_port=True):
    # Here the first 10 frames are rejected.
    if first_time == 0:
        rawCapture.truncate(0)
        if frame_buffer < 10:
            print("Frame rejected -", str(frame_buffer))
            frame_buffer += 1
            continue
        os.system("clear")
        refImg = frame.array
        refThresh = imageSubtract(refImg)
        first_time = 1
        frame_buffer = 0
    frame_buffer += 1
    cv2.imshow("Background", refImg)
    image = frame.array
    cv2.imshow("Foreground", image)
    key = cv2.waitKey(1)
    rawCapture.truncate(0)
    newThresh = imageSubtract(image)
    diff = cv2.absdiff(refThresh, newThresh)  # Here the background image is subtracted from the foreground
    kernel = np.ones((5, 5), np.uint8)
    diff = cv2.morphologyEx(diff, cv2.MORPH_OPEN, kernel)
    diff = cv2.dilate(diff, kernel, iterations=2)
    _, thresholded = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, contours, _ = cv2.findContours(thresholded, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    try:
        c = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(thresholded, (x, y), (x + w, y + h), (125, 125, 125), 2)
        if cv2.contourArea(c) > 300 and len(contours) <= 3:
            if counter == 0:
                print("Going to sleep for 0.1 second")  # allowing the device to move ahead for 0.1 s before processing the object
                time.sleep(0.1)
                counter = 1
                continue
            else:
                print("Object found !")
        cv2.imshow("Threshold", thresholded)
        if frame_buffer % 30 == 0:
            frame_buffer = 0
            refImg = image
            refThresh = imageSubtract(refImg)
            os.system("clear")
            print("Reference image changed")
    except Exception as e:
        print(e)
        pass
NOTE: The above algorithm uses the continuous capture mode of the PiCamera. The first 10 frames are rejected because I have noticed that the PiCamera takes some time to adjust the colors once it starts up. Another thing is that the background image is changed every 30 frames because I wanted the background image to remain as close as possible to the foreground image. Since the room is quite big, there are going to be some local changes in the light/color of the ground between one corner of the room and the other; hence I felt the need to update the background image every 30 frames. The object needs to have an area greater than 300 for it to be detected. I have also given a delay of 0.1 s (time.sleep(0.1)) after the object has been detected, because I wanted the object to enter the frame completely and be right in the middle of the frame before the device stops.
Some solutions that I had in mind was :-
I thought of attaching a few IR sensors at the base of the device. In case an object is detected (real or false flag), it will check the output from the IR sensors to see whether any object is picked up by them as well. In the case of shadows and patches, the IR response is going to be NULL, so the bot continues to move forward.
I thought of calculating the height of the detected object. If the height was above a certain threshold, then the presence of an object could be confirmed; otherwise it is a false flag. But the camera is going to be facing down, which means the image is taken from the top, so I don't think there is any way to ascertain the height of the object from its top-down image.
Please suggest any alternative solutions. I want to make the above algorithm as perfect as possible because the entire working of the device depends upon the accuracy of the background subtraction algorithm.
Thank you for your help in advance !
EDIT 1-
Background Image - Back
Foreground Image - Front
Background subtraction works well when your camera is not moving and objects are moving in front of the camera. Your case is just the opposite, so it will easily detect movement everywhere.
If your background is uniform, any edge detection filter may work better than your background subtraction.
Another alternative is using thresholding (see https://docs.opencv.org/3.4.0/d7/d4d/tutorial_py_thresholding.html). I would suggest an adaptive threshold (Gaussian or mean) that will work better if the edges are darker than the center. Try different sizes of the neighbourhood and different values for the constant C.
Then you can erode/dilate as you already do and filter by size or position as you suggest. Adding more sensors to your robot is a good idea as well.
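A minimal sketch of the adaptive-threshold idea on one frame (the block size and the constant C are exactly the knobs to experiment with; the Canny call shows the edge-detection alternative, and the file name is a placeholder):

import cv2

frame = cv2.imread('foreground.jpg')                 # placeholder frame from the bot's camera
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)

# adaptive threshold: each pixel is compared to its local neighbourhood,
# so gradual lighting changes across the floor matter much less
mask = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                             cv2.THRESH_BINARY_INV, 31, 5)   # 31x31 neighbourhood, C = 5

# edge-detection alternative for a uniform floor
edges = cv2.Canny(gray, 50, 150)

cv2.imshow('adaptive threshold', mask)
cv2.imshow('edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()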

Suggestions for improving webcam capture code

I have written a basic code which captures an image from webcam using OpenCV & Python 2.7.
The code is as follows:
import numpy
import cv2
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cv2.imshow('image',frame)
cap.release()
cv2.waitKey(0)
cv2.destroyAllWindows()
This code gives the correct output, but my camera takes a few seconds to focus, so I get a black or dim image as output instead of a bright, properly focused image.
How can I solve this problem in a more mature way?
You need an "auto capture" algorithm. Auto-capture algorithms vary depending on your use case. For example, if you need to take a shot of a document that you want to OCR later, you have to check how well the text can be OCRed before taking the image. In the general case, however, there is something called reference-less image quality assessment that will help you rate how good the image is, and then take the shot once it is good enough. However, implementing it is not an easy task.
If you need something fast and easy, just compute the sharpness of the image and use that to decide whether to take the photo. See this: http://answers.opencv.org/question/5395/how-to-calculate-blurriness-and-sharpness-of-a-given-image/
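One common sharpness measure is the variance of the Laplacian. A short sketch that keeps reading frames until that value crosses a threshold (the threshold itself is a placeholder you would tune for your camera and lighting):

import cv2

SHARPNESS_THRESHOLD = 100.0    # placeholder; tune for your camera and lighting

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()   # variance of the Laplacian
    if sharpness > SHARPNESS_THRESHOLD:
        cv2.imwrite('capture.jpg', frame)               # focused enough: keep this frame
        break
cap.release()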
Another option could be using a face detector if you are taking photos of humans. OpenCV has a cascade classifier with a pre-trained model for human faces. Just try to detect a face and take the shot when it is detected.
You can also combine the last two approaches in a hybrid mode: detect the face, then make sure it is sharp enough, then take the photo.
You could wait till the video capture has been initialized by modifying the code as follows:
import cv2

cv2.namedWindow("output")
cap = cv2.VideoCapture(0)
if cap.isOpened():  # Getting the first frame
    ret, frame = cap.read()
else:
    ret = False
while ret:
    cv2.imshow("output", frame)
    ret, frame = cap.read()
    key = cv2.waitKey(20)
    if key == 27:  # exit on Escape key
        break
cv2.destroyWindow("output")
