I have coded facial expression detection in a Jupyter notebook, detecting seven facial expressions (Anger, Sad, Disgust, Happy, ...), and tried real-time detection using my laptop's camera. Now I want to record the expressions detected by the model during real-time detection and create a figure of the detected expressions over time. First of all, is it possible to do so? If not, what other options do I have? For example, can I record the video taken by the camera, later detect the expressions from the video, and make a figure from all the expressions detected over time? Thank you all for helping me!
You could do something like this:
from tensorflow import keras
import numpy as np
import cv2

all_labels = ["Anger", "Sad", "Disgust", "Happy"]

# Load the trained model, or train a model here
model = keras.models.load_model('path/to/location')

# Open the camera
cap = cv2.VideoCapture(0)
# Or, similarly, open a saved video
# cap = cv2.VideoCapture('path/to/video')

# Check if the camera was opened correctly
if not cap.isOpened():
    print("Could not open video device")

# Fetch one frame at a time from your camera in real time or from the video
i = 0
while True:
    # frame is a numpy array that you can predict on
    ret, frame = cap.read()
    if not ret:
        break

    # Obtain the prediction (you may have to reshape frame according to your model)
    prediction = model(frame, training=False)

    # Obtain a label from the prediction, depending on how your label list
    # maps to the model output
    label = all_labels[int(np.argmax(prediction))]

    # Save the frame in a different folder depending on the predicted label
    if label in all_labels:
        cv2.imwrite('{}/frame_{}.jpg'.format(label, i), frame)
    i = i + 1

    # Show the frame so the window can receive the key press below
    cv2.imshow('frame', frame)
    # Wait for the user to press 'q' to quit the application
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
I wrote an answer to a similar but not identical problem; maybe you can draw inspiration from that. Also, this is a great OpenCV tutorial on capturing live video.
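For the figure of expressions over time that you asked about, you can also log the predicted label together with a timestamp on every loop iteration and plot the log afterwards. A minimal sketch with matplotlib, assuming you collect one label string per frame as in the loop above (the variable names here are just illustrative):

import time
import matplotlib.pyplot as plt

all_labels = ["Anger", "Sad", "Disgust", "Happy"]
timestamps, detected = [], []
start_time = time.time()

# Inside the capture loop, after computing `label`:
#     timestamps.append(time.time() - start_time)
#     detected.append(label)

# After the loop, plot the detected expression over time
plt.step(timestamps, [all_labels.index(lbl) for lbl in detected], where='post')
plt.yticks(range(len(all_labels)), all_labels)
plt.xlabel('Time (s)')
plt.ylabel('Detected expression')
plt.show()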
Related
I am working on detecting a QR code attached to someone from a video of that person walking away from me. I have tried a few different implementations using OpenCV and pyzbar but they are not very accurate and will detect the QR code in about 50% of frames.
I have found that cropping the image helps with detection but since it is stuck on a person, the cropping would have to be much more complex/robust than I would like to use.
I only really need to decode/read the QR code in a few frames of the video. Detecting its location is more important than reading and detecting if it exists at all is most important.
I can't fully understand how the OpenCV implementation for QRCodeDetector().detect() works but I am trying to find a method (does not have to be OpenCV or pyzbar) where I could find a QR code within a confidence interval. I am not sure what threshold the implementations I have tried so far use, but sometimes between frames, it does not look like the QR code changes position/orientation but it cannot 'detect' it in subsequent frames, so I am looking for some way to relax those requirements.
This is an example of a frame from the video. This shows the relative size of the QR code to the entire frame that needs to be searched.
This is my current setup:
import cv2
import numpy as np

video = cv2.VideoCapture(input_video)
qrCodeDetector = cv2.QRCodeDetector()

ret, frame = video.read()
while ret:
    points = qrCodeDetector.detect(frame)[1]
    if points is not None:
        points = points[0]
        # Get the center of the corner points
        center = tuple(np.mean(np.array(points), axis=0).astype(int))
        # Draw a circle around the QR code
        cv2.circle(frame, center, 50, color=(255, 0, 0), thickness=2)
    cv2.imshow('frame', frame)
    cv2.waitKey(1)
    ret, frame = video.read()

video.release()
cv2.destroyAllWindows()
For these cases, you should probably rely on a machine/deep learning based solution. As far as I have experimented, pyzbar works pretty well for decoding, but has a lower detection rate. For cases like the one you describe, I tend to use QReader. It implements a detection + decoding pipeline backed by a YOLOv7 QR detection model, and uses pyzbar, sweetened with some image preprocessing fallbacks, on the decoding side.
from qreader import QReader
import cv2

# Create a QReader instance
qreader = QReader()

video = cv2.VideoCapture(input_video)

ret, frame = video.read()
while ret:
    # Cast the frame to RGB (QReader expects RGB input)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Use the detect_and_decode function to get the bounding boxes and decoded QR data
    found_qrs = qreader.detect_and_decode(image=rgb_frame, return_bboxes=True)

    # Process the detections as you like
    for (x1, y1, x2, y2), decoded_text in found_qrs:
        # Draw on your image
        pass

    ret, frame = video.read()

video.release()
I'm working on a project that will eventually have to process webcam images in real time. I have some suitable test videos that I use to test my program. However, I don't know how to simulate real-time processing with a video file. I can read in each frame and process it, but this is not realistic, since the algorithm is too heavy to run on every frame. I would like to 'stream' the video separately and pull in a frame each time the algorithm starts, so I can test at a realistic fps, but I don't know how to do this.
Here is the basic layout of the code you can use:
import cv2

cap = cv2.VideoCapture('path to video file')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    ### YOUR CODE HERE ###

    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()  # destroy all opened windows
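If you also want to emulate a live camera's timing, i.e. always grab the frame that would be "live" right now rather than every frame in the file, one possible approach (a sketch of my own, not part of the answer above, assuming the file reports a usable FPS) is to compute the target frame index from the elapsed wall-clock time and seek to it:

import time
import cv2

cap = cv2.VideoCapture('path to video file')
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back to 30 if the file reports 0
start = time.time()

while cap.isOpened():
    # Seek to the frame a live camera would be delivering right now
    target_frame = int((time.time() - start) * fps)
    cap.set(cv2.CAP_PROP_POS_FRAMES, target_frame)
    ret, frame = cap.read()
    if not ret:
        break

    ### YOUR (slow) CODE HERE ###

    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()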
I was extracting frames from a video using the OpenCV library in Python, and I am successfully able to create frames from the video I am testing on.
I am doing it on a CCTV camera installed at a parking entry gate where the video runs 24x7, and at times a car stands still for a good number of minutes, which leads to many consecutive frames of the same vehicle.
My question is: how can I create a frame only when a new vehicle enters the parking lot?
Stack Overflow is for code-related queries. I suggest you try some code and share your results and your problems before posting anything here. That being said, you can start with object detection tutorials like this and then do tracking with SORT. Many pre-trained models that include the car class are available, so you won't even need to train a new model.
Do you need to detect license plates, etc., or just notice if something happens? For the latter, you could use a very simple approach: take an average of, say, the frames of the last 30 seconds and subtract that from the current frame. If the mean absolute value of the delta image is above a threshold, that could be the change you are looking for.
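A minimal sketch of that idea, assuming a capture handle and using cv2.accumulateWeighted for the running average (the blending factor of 0.01 and the threshold of 10 are arbitrary starting values to tune):

import cv2
import numpy as np

cap = cv2.VideoCapture('rtsp_or_video_path')
ret, frame = cap.read()
# Running average of recent frames, kept as float32 for accumulateWeighted
background = np.float32(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))

while ret:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Mean absolute difference between the current frame and the running average
    delta = cv2.absdiff(gray, cv2.convertScaleAbs(background))
    if delta.mean() > 10:  # threshold to tune
        print("Change detected, save this frame")
    # Slowly blend the current frame into the running average
    cv2.accumulateWeighted(gray, background, 0.01)
    ret, frame = cap.read()

cap.release()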
You could do some simpler motion detection with OpenCV; it's nicely explained in https://www.pyimagesearch.com/2015/05/25/basic-motion-detection-and-tracking-with-python-and-opencv/
So if you have a picture of the background as reference, you can compare each new image to the background and only save the image if it's different enough from the background (hopefully only when a car entered). Then make this the new background and reset for a new car when the new images again start looking like the original background.
Hopefully I stated my idea clear enough and that link provides enough information to implement it. If not just ask for a clarification!
First, you need a specific cascade XML file to detect only cars; you can get it from here. I have written code to uniquely identify and count the cars that are visible to the CCTV camera you are using. The result also depends on the frame rate and on the detection itself, so you may want to control the frame rate and the total count variable.
import cv2

cascade_src = 'cars.xml'
cap = cv2.VideoCapture('rtsp_of_ur_cctv')
car_cascade = cv2.CascadeClassifier(cascade_src)
prev_count = 0
total_count = 0

while True:
    ret, img = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cars = car_cascade.detectMultiScale(gray, 1.1, 1)
    if len(cars) > prev_count:
        difference = len(cars) - prev_count
        total_count = total_count + difference
        # Here you can save the new entry uniquely and avoid counting the same car repeatedly
        print(total_count)
    for (x, y, w, h) in cars:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
    prev_count = len(cars)
    cv2.imshow('video', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
I am using OpenCV for video processing. What I do is read a video frame by frame, apply some processing to each frame, and then display the new modified frame. My code looks like this:
import cv2

video_capture = cv2.VideoCapture('video.mp4')

while True:
    # Capture frame-by-frame
    ret, frame = video_capture.read()

    # Applying some processing to frame
    # .
    # .
    # .

    # Displaying the new frame with the processing applied
    cv2.imshow('title', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
This way I can display the processed video instantaneously. The problem is that the display lags a lot due to the presence of waitKey. Is there another way to display images in real time to form a video, but with a module other than cv2?
Thank you
One option is Tkinter; you can find some information here. Along with Tkinter, it uses python-gstreamer and python-gobject. It is much more complicated to set up, but it allows for more customization options.
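If you only need to swap out cv2.imshow/waitKey for display, a simpler Tkinter-based sketch is possible using Pillow for the image conversion (Pillow is an extra dependency and my assumption here; the linked answer uses python-gstreamer/python-gobject instead):

import cv2
import tkinter as tk
from PIL import Image, ImageTk  # pip install pillow

video_capture = cv2.VideoCapture('video.mp4')

root = tk.Tk()
label = tk.Label(root)
label.pack()

def show_frame():
    ret, frame = video_capture.read()
    if not ret:
        root.destroy()
        return
    # Apply your processing to frame here, then convert BGR -> RGB for display
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    photo = ImageTk.PhotoImage(Image.fromarray(rgb))
    label.configure(image=photo)
    label.image = photo        # keep a reference so it is not garbage-collected
    root.after(1, show_frame)  # schedule the next frame instead of calling waitKey

show_frame()
root.mainloop()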
I have written basic code which captures an image from a webcam using OpenCV and Python 2.7.
The code is as follows:
import numpy
import cv2
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cv2.imshow('image',frame)
cap.release()
cv2.waitKey(0)
cv2.destroyAllWindows()
This code gives the correct output, but my camera takes a few seconds to focus, so I get a black or dim image as output instead of a bright, properly focused image.
How can I solve this problem in a more mature way?
You need an "auto capture" algorithm. Auto-capture algorithms vary depending on your use case. For example, if you need to take a shot of a document that you want to OCR later, you have to check how OCR-able the text is before capturing the image. In the general case, there is something called reference-less image quality assessment that will help you rate how good an image is; then, if it is good enough, take the shot. However, implementing it is not an easy task.
If you need something fast and easy, just compute the sharpness of the image and use it to decide whether to take the photo or not. See this: http://answers.opencv.org/question/5395/how-to-calculate-blurriness-and-sharpness-of-a-given-image/
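For example, a common sharpness measure is the variance of the Laplacian; a rough sketch of auto-capturing once the frame is sharp enough (the threshold of 100 is just a starting value to tune for your camera):

import cv2

cap = cv2.VideoCapture(0)
SHARPNESS_THRESHOLD = 100.0  # tune for your camera and lighting

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Variance of the Laplacian: low values indicate a blurry / unfocused frame
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    if sharpness > SHARPNESS_THRESHOLD:
        cv2.imwrite('capture.jpg', frame)
        break

cap.release()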
Another option could be to use a face detector if you are taking photos of people. OpenCV ships a cascade classifier with a pre-trained model for human faces; just try to detect a face and, when it is detected, take the shot.
You may also combine the last two approaches in a hybrid mode: detect the face, make sure it is sharp enough, and then take the photo, as in the sketch below.
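A sketch of that hybrid idea, combining OpenCV's bundled frontal-face cascade with the Laplacian sharpness check from above (the sharpness threshold is again an arbitrary value to tune; cv2.data.haarcascades is available in recent opencv-python packages, on older installs point to the XML file path directly):

import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
cap = cv2.VideoCapture(0)
SHARPNESS_THRESHOLD = 100.0
captured = False

while not captured:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.1, 5):
        # Only check sharpness inside the detected face region
        face_roi = gray[y:y + h, x:x + w]
        if cv2.Laplacian(face_roi, cv2.CV_64F).var() > SHARPNESS_THRESHOLD:
            cv2.imwrite('capture.jpg', frame)
            captured = True
            break

cap.release()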
You could wait until video capturing has been initialized by modifying the code as follows:
import cv2

cv2.namedWindow("output")
cap = cv2.VideoCapture(0)

if cap.isOpened():  # Getting the first frame
    ret, frame = cap.read()
else:
    ret = False

while ret:
    cv2.imshow("output", frame)
    ret, frame = cap.read()
    key = cv2.waitKey(20)
    if key == 27:  # exit on Escape key
        break

cv2.destroyWindow("output")
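Another common workaround (my suggestion, not part of the answer above) is to simply grab and discard the first few frames so the camera has time to adjust its exposure and focus before you keep one:

import cv2

cap = cv2.VideoCapture(0)

# Discard roughly the first 30 frames to give the camera time to adjust
for _ in range(30):
    cap.read()

ret, frame = cap.read()
if ret:
    cv2.imshow('image', frame)
    cv2.waitKey(0)

cap.release()
cv2.destroyAllWindows()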