I'm pretty new to OpenCV and I've been trying to build an automatic stat tracker for a game I'm playing. I want to collect videos to test the object recognition and stat tracking code, and figured I might as well use the capture code I've already written to make the test videos. So I've been using the pywin32 API to make images for OpenCV to process, and thought I'd just directly save those images to video.
The problem I'm having is that the videos being made are invalid and useless.
So now that I've gone through the background I can move on to the actual questions.
Is there a specific format the OpenCV images need to be in before they can be saved? Or is there a way to feed the images through the VideoCapture class in a non-clunky way (like recapturing the cv.imshow window)? I don't want to use the VideoCapture class directly, because it requires the specified window to always be on top. I might also just be doing it wrong.
Code included below.
import cv2 as cv
from os.path import sep

class VideoScreenCapture:
    frame_time_limit = 0
    time_elapsed = 0
    frame_width = None
    frame_height = None
    frame_size = None
    fps = None
    window_capture = None
    filming = False
    output_video = None
    scaling_factor = 2/3

    def __init__(self, window_capture=None, fps=5):
        """
        window_capture: a custom window capture object that is associated with the window to be captured
        fps: the frames per second of the resulting video
        """
        self.fps = fps
        # calculate the amount of time per frame (inverse of fps)
        self.frame_time_limit = 1 / self.fps
        self.window_capture = window_capture
        # get the size (in pixels) of the window being captured
        self.frame_width, self.frame_height, self.frame_size = self.window_capture.get_frame_dimentions()
        # initialize the video writer
        self.output_video = cv.VideoWriter(
            f'assets{sep}Videos{sep}test.avi',
            cv.VideoWriter_fourcc(*'XVID'),
            self.fps,
            self.frame_size)

    def update(self, dt):
        """dt: delta time -> the time since the last update"""
        # only update if the camera is on
        if self.filming:
            # update the elapsed time
            self.time_elapsed += dt
            # if the elapsed time exceeds the time segment calculated for the given fps
            if self.time_elapsed >= self.frame_time_limit:
                self.time_elapsed -= self.frame_time_limit
                # window capture takes a screenshot of the capture window and resizes it to the wanted size
                frame = self.window_capture.get_screenshot()
                frame = cv.resize(
                    frame,
                    dsize=(0, 0),
                    fx=self.scaling_factor,
                    fy=self.scaling_factor)
                # the image is then written to the video
                self.output_video.write(frame)
                # show the image in a separate window
                cv.imshow("Computer Vision", frame)

    """methods for stopping and starting the video capture"""
    def start(self):
        self.filming = True

    def stop(self):
        self.filming = False

    """releases the video writer: use when done"""
    def release_vw(self):
        self.output_video.release()
This while loop runs the code:

from time import time

vsc.start()
loop_time = time()
while cv.waitKey(1) != ord('q'):
    vsc.update(time() - loop_time)
    loop_time = time()
I was able to get the current image from a Thorlabs uc480 camera using instrumental. My issue is when I try to adjust the parameters for grab_image. I can change cx and left to any value and get an image, but cy and top only work if cy=600 and top=300. The purpose is to create a GUI so that the user can select values for these parameters to zoom the image in and out.
Here is my code
import instrumental
from instrumental.drivers.cameras import uc480
from matplotlib.figure import Figure
import matplotlib.pyplot as plt
paramsets = instrumental.list_instruments()
cammer = instrumental.instrument(paramsets[0])
plt.figure()
framer = cammer.grab_image(timeout='1s', copy=True, n_frames=1, exposure_time='5ms',
                           cx=640, left=10, cy=600, top=300)
plt.pcolormesh(framer)
The above code does not give an image if I choose cy=600 and top=10. Are there particular values that must be used for these parameters? How can I get an image of the full sensor size?
Thorlabs has a Python programming interface available as a download on their website. It is very well documented, and can be installed locally via pip.
Link:
https://www.thorlabs.com/software_pages/ViewSoftwarePage.cfm?Code=ThorCam
Here is an example of a simple capture algorithm that might help get you started:
import numpy as np

from thorlabs_tsi_sdk.tl_camera import TLCameraSDK
from thorlabs_tsi_sdk.tl_mono_to_color_processor import MonoToColorProcessorSDK
from thorlabs_tsi_sdk.tl_camera_enums import SENSOR_TYPE

# open the TLCameraSDK dll
with TLCameraSDK() as sdk:
    cameras = sdk.discover_available_cameras()
    if len(cameras) == 0:
        print("Error: no cameras detected!")

    with sdk.open_camera(cameras[0]) as camera:
        # camera.disarm()  # ensure any previous session is closed

        # set up the camera for continuous acquisition
        camera.frames_per_trigger_zero_for_unlimited = 0
        camera.image_poll_timeout_ms = 2000  # 2 second timeout
        camera.arm(2)

        # need to save the image width and height for color processing
        image_width = camera.image_width_pixels
        image_height = camera.image_height_pixels

        # initialize a mono to color processor if this is a color camera
        is_color_camera = (camera.camera_sensor_type == SENSOR_TYPE.BAYER)
        mono_to_color_sdk = None
        mono_to_color_processor = None
        if is_color_camera:
            mono_to_color_sdk = MonoToColorProcessorSDK()
            mono_to_color_processor = mono_to_color_sdk.create_mono_to_color_processor(
                camera.camera_sensor_type,
                camera.color_filter_array_phase,
                camera.get_color_correction_matrix(),
                camera.get_default_white_balance_matrix(),
                camera.bit_depth
            )

        # begin acquisition
        camera.issue_software_trigger()

        # get the next frame
        frame = camera.get_pending_frame_or_null()

        # initialize frame attempts and max limit
        frame_attempts = 0
        max_attempts = 10

        # if the frame is null, keep polling until a frame arrives
        # or until max_attempts is reached
        while frame is None:
            frame = camera.get_pending_frame_or_null()
            frame_attempts += 1
            if frame_attempts == max_attempts:
                raise TimeoutError("Timeout was reached while polling for a frame, program will now exit")

        image_data = frame.image_buffer

        if is_color_camera:
            # transform the raw image data into RGB color data
            color_data = mono_to_color_processor.transform_to_24(image_data, image_width, image_height)
            save_data = np.reshape(color_data, (image_height, image_width, 3))

        camera.disarm()
You can also process the image after capture with the PIL library.
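A minimal sketch of that post-processing step, assuming Pillow is installed; the dimensions and the zero-filled buffer are made-up stand-ins for the real `transform_to_24` output:

```python
import numpy as np
from PIL import Image

# Hypothetical dimensions standing in for the camera's actual sensor size
image_height, image_width = 4, 6

# Stand-in for the flat 24-bit color buffer returned by transform_to_24
color_data = np.zeros(image_height * image_width * 3, dtype=np.uint8)
save_data = np.reshape(color_data, (image_height, image_width, 3))

# Wrap the array in a PIL image; from here any PIL operation
# (crop, rotate, save, ...) is available
img = Image.fromarray(save_data, mode="RGB")
img.save("capture.png")
```

Note that PIL reports size as (width, height), the reverse of the NumPy shape order.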
I have a flask application which reads frame from camera and streams it to the website.
Camera.py
from threading import Thread
from copy import deepcopy
import queue
import cv2
class Camera(Thread):
    def __init__(self, cam, normalQue, detectedQue):
        Thread.__init__(self)
        self.__cam = cam
        self.__normalQue = normalQue
        self.__detectedQue = detectedQue
        self.__shouldStop = False

    def __del__(self):
        self.__cam.release()
        print('Camera released')

    def run(self):
        while True:
            rval, frame = self.__cam.read()
            if rval:
                frame = cv2.resize(frame, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)
                _, jpeg = cv2.imencode('.jpg', frame)
                self.__normalQue.put(jpeg.tobytes())
                self.__detectedQue.put(deepcopy(jpeg.tobytes()))
            if self.__shouldStop:
                break

    def stopCamera(self):
        self.__shouldStop = True
From what you can see, I am just reading a frame, resizing it, and storing it in two different queues. Nothing too complex.
I also have two classes responsible for the MJPEG streams:
NormalVideoStream.py
from threading import Thread
import traceback

class NormalVideoStream(Thread):
    def __init__(self, framesQue):
        Thread.__init__(self)
        self.__frames = framesQue
        self.__img = None

    def run(self):
        while True:
            if self.__frames.empty():
                continue
            self.__img = self.__frames.get()

    def gen(self):
        while True:
            try:
                if self.__img is None:
                    print('Normal stream frame is none')
                    continue
                yield (b'--frame\r\n'
                       b'Content-Type: image/jpeg\r\n\r\n' + self.__img + b'\r\n')
            except:
                traceback.print_exc()
                print('Normal video stream generation exception')
and
DetectionVideoStream.py
from threading import Thread
import cv2
import traceback
class DetectionVideoStream(Thread):
    def __init__(self, framesQue):
        Thread.__init__(self)
        self.__frames = framesQue
        self.__img = None
        self.__faceCascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

    def run(self):
        while True:
            if self.__frames.empty():
                continue
            self.__img = self.__detectFace()

    def gen(self):
        while True:
            try:
                if self.__img is None:
                    print('Detected stream frame is none')
                    continue
                yield (b'--frame\r\n'
                       b'Content-Type: image/jpeg\r\n\r\n' + self.__img + b'\r\n')
            except:
                traceback.print_exc()
                print('Detection video stream generation exception')

    def __detectFace(self):
        retImg = None
        try:
            img = self.__frames.get()
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            faces = self.__faceCascade.detectMultiScale(gray, 1.1, 4)
            for (x, y, w, h) in faces:
                cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
            (_, encodedImage) = cv2.imencode('.jpg', img)
            retImg = encodedImage.tobytes()
        except:
            traceback.print_exc()
            print('Face detection exception')
        return retImg
From what you can see, in both streams I am reading camera frames from the queues in an infinite loop. Both classes also have a gen() method which generates frames for the site itself. The only difference is that in the detection stream I am also doing face recognition.
Now in my main file:
main.py
from flask import Blueprint, render_template, Response, abort, redirect, url_for
from flask_login import login_required, current_user
from queue import Queue
from . import db
from .Camera import Camera
from .NormalVideoStream import NormalVideoStream
from .DetectionVideoStream import DetectionVideoStream
from .models import User
import cv2
main = Blueprint('main', __name__)

# Queues for both streams
framesNormalQue = Queue(maxsize=0)
framesDetectionQue = Queue(maxsize=0)
print('Queues created')

# RPi camera instance
camera = Camera(cv2.VideoCapture(0), framesNormalQue, framesDetectionQue)
camera.start()
print('Camera thread started')

# Streams
normalStream = NormalVideoStream(framesNormalQue)
detectionStream = DetectionVideoStream(framesDetectionQue)
print('Streams created')

normalStream.start()
print('Normal stream thread started')
detectionStream.start()
print('Detection stream thread started')

@main.route('/')
def index():
    return render_template('index.html')

@main.route('/profile', methods=["POST", "GET"])
def profile():
    if not current_user.is_authenticated:
        abort(403)
    return render_template('profile.html', name=current_user.name, id=current_user.id, detectionState=current_user.detectionState)

@main.route('/video_stream/<int:stream_id>')
def video_stream(stream_id):
    if not current_user.is_authenticated:
        abort(403)
    print(f'Current user detection: {current_user.detectionState}')

    global detectionStream
    global normalStream

    stream = None
    if current_user.detectionState:
        stream = detectionStream
        print('Stream set to detection one')
    else:
        stream = normalStream
        print('Stream set to normal one')

    return Response(stream.gen(), mimetype='multipart/x-mixed-replace; boundary=frame')

@main.route('/detection')
def detection():
    if not current_user.is_authenticated:
        abort(403)

    current_user.detectionState = not current_user.detectionState
    user = User.query.filter_by(id=current_user.id)
    user.detectionState = current_user.detectionState
    db.session.commit()

    return redirect(url_for('main.profile', id=current_user.id, user_name=current_user.name))

@main.errorhandler(404)
def page_not_found(e):
    return render_template('404.html'), 404

@main.errorhandler(403)
def page_forbidden(e):
    return render_template('403.html'), 403
I create the camera, queue, and stream objects globally. When a user logs in on the website, they can see the live video stream. There is also a button which switches the stream that is currently presented.
The whole project works well with one exception: when I change to the detection stream, it has a huge lag (around 10-15 seconds), which makes the whole thing unusable. I've tried to find a bug or optimization on my own but can't find anything. I deliberately run everything on separate threads to avoid overloading the app, but it looks like that is not enough. A lag of 1-2 seconds would be acceptable, but not 10+. So guys, maybe you can see some bug here? Or know how to optimize it?
I should also mention that the whole app runs on an RPi 4B 4GB and I access the website from my desktop. The default server has been changed to Nginx and Gunicorn. From what I can see, the Pi's CPU usage is 100% while the app is working; when testing on the default server the behaviour is the same. I'd guess that a 1.5 GHz CPU has enough power to run it more smoothly.
One option is using VideoStream.
The reason VideoCapture is so slow is that the VideoCapture pipeline spends most of its time reading and decoding the next frame. While the next frame is being read, decoded, and returned, the OpenCV application is completely blocked.
VideoStream solves the problem by using a queue structure and a background thread to concurrently read, decode, and return the current frame.
VideoStream supports both PiCamera and webcam.
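The producer/consumer idea behind it can be sketched with nothing but the standard library; read_frame and the frame strings here are hypothetical stand-ins for the real camera read/decode step:

```python
import queue
import threading

# Stand-in for the real camera read + decode step
def read_frame(i):
    return f"frame-{i}"

frames = queue.Queue(maxsize=8)

def reader(n_frames):
    # Background thread keeps filling the queue so the consumer
    # never blocks on I/O
    for i in range(n_frames):
        frames.put(read_frame(i))
    frames.put(None)  # sentinel: no more frames

t = threading.Thread(target=reader, args=(5,), daemon=True)
t.start()

received = []
while True:
    frame = frames.get()
    if frame is None:
        break
    received.append(frame)  # process/display the frame here
t.join()
```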
All you need to do is:
Install imutils:
For virtual environment: pip install imutils
For anaconda environment: conda install -c conda-forge imutils
Initialize VideoStream in main.py:
import time
from imutils.video import VideoStream
vs = VideoStream(usePiCamera=True).start() # For PiCamera
# vs = VideoStream(usePiCamera=False).start() # For Webcam
camera = Camera(vs, framesNormalQue, framesDetectionQue)
In your Camera.py, in the run(self) method:

```python
def run(self):
    while True:
        frame = self.__cam.read()
        frame = cv2.resize(frame, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)
        _, jpeg = cv2.imencode('.jpg', frame)
        self.__normalQue.put(jpeg.tobytes())
        self.__detectedQue.put(deepcopy(jpeg.tobytes()))
        if self.__shouldStop:
            break
```
One of the issues I also had was regarding encoding and decoding: OpenCV's encoder is slow, so try the encoder from simplejpeg instead. Install it with pip3 install simplejpeg and, instead of cv2.imencode(), use simplejpeg.encode_jpeg().
I'm not really surprised by your problem; in general, "detection" uses a lot of computation time, because performing a cascaded classification algorithm is a computationally demanding task.
I found a source which compares cascaded classification algorithms for their performance: link
An easy solution would be to reduce the frame rate when processing your detection.
An easy implementation to reduce the performance demand could be a skip counter, e.g.:

frameSkipFactor = 3  # use every third frame
frameCounter = 0

if frameCounter % frameSkipFactor == 0:
    # process
else:
    print("ignore frame", frameCounter)

frameCounter += 1
Nevertheless you will still have a lag, because the detection calculation always produces a time offset.
If you are planning to build a "real time" classification camera system, please look at another class of classification algorithms which are designed for this use case.
I followed a discussion here: real time class algos
Another solution could be using a bigger "hardware hammer" than the RPi, e.g. a GPU implementation of the algorithm via CUDA etc.
What I have found is that the main reason for slow frames is a high resolution and frame rate.
To tackle this problem you can change the resolution to something like 640 width by 480 height at 30 fps, or as low as 5 fps (if you only need face detection), and resize with OpenCV's cv2.resize() function using a 0.25 resizing factor for fx and fy. Do this if you don't need higher-resolution streams; it works smoothly with OpenCV. I also wanted to try out that VideoStream code (from imutils), as it is used by Adrian Rosebrock of PyImageSearch; I will use it in later projects.
For reference, I am posting a code snippet below. Special thanks to Adrian Rosebrock and the ageitgey face_recognition project, as their code helped me make it.
import cv2

class Camera(object):
    SHRINK_RATIO = 0.25
    FPS = 5
    FRAME_RATE = 5
    WIDTH = 640
    HEIGHT = 480

    def __init__(self):
        """ initializing camera with settings """
        self.cap = cv2.VideoCapture(0)
        self.cap.set(3, Camera.WIDTH)
        self.cap.set(4, Camera.HEIGHT)
        self.cap.set(5, Camera.FPS)
        self.cap.set(7, Camera.FRAME_RATE)

    def get_frame(self):
        """ get frames from the camera """
        success, frame = self.cap.read()
        if success:
            # resizing to 0.25 scale
            rescale_frame = cv2.resize(frame, None, fx=Camera.SHRINK_RATIO, fy=Camera.SHRINK_RATIO)

            cascPath = "haarcascade_frontalface_default.xml"
            faceCascade = cv2.CascadeClassifier(cascPath)
            gray = cv2.cvtColor(rescale_frame, cv2.COLOR_BGR2GRAY)

            # detecting faces
            faces = faceCascade.detectMultiScale(
                gray,
                scaleFactor=1.1,
                minNeighbors=5,
                minSize=(30, 30)
            )

            # draw a rectangle around the faces, scaling the coordinates
            # back up to the full-size frame
            for (x, y, w, h) in faces:
                x *= 4
                y *= 4
                w *= 4
                h *= 4
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

        # return frame outside the if-statement
        return frame
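The coordinate math in the snippet above boils down to this: detection runs on a frame shrunk by 0.25, so each box coordinate is multiplied by 1/0.25 = 4 before drawing on the full-size frame. A stand-alone sketch (the box values are made up):

```python
# Detection happens on the shrunk frame but drawing happens on the
# full-size one, so boxes found at 0.25 scale are mapped back by 4
shrink_ratio = 0.25
scale = int(1 / shrink_ratio)  # 4

x, y, w, h = 12, 30, 20, 20  # hypothetical box on the shrunk frame
full_box = tuple(v * scale for v in (x, y, w, h))
```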
Also keep in mind that JPEG is the accelerated codec, use:
cv2.imencode(".jpg",frame)[1].tobytes()
I am trying to make an application with Flask that:
Captures a video stream from a webcam
Applies an object detection algorithm on each frame
Concurrently displays the frames, with bounding boxes and the data provided by the above function, in the form of a video
The problem is that due to the function's processing time, the result renders with a bit of lag. To overcome this I tried to run the function on another thread.
Can you please help with running the function over each frame of the video and displaying the resulting frames as a video?
My main function looks somewhat like:
def detect_barcode():
    global vs, outputFrame, lock
    total = 0

    # loop over frames from the video stream
    while True:
        frame = vs.read()
        frame = imutils.resize(frame, width=400)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (7, 7), 0)

        timestamp = datetime.datetime.now()
        cv2.putText(frame, timestamp.strftime(
            "%A %d %B %Y %I:%M:%S%p"), (10, frame.shape[0] - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.35, (0, 0, 255), 1)

        if total > frameCount:
            # detect barcodes in the image
            barcodes = md.detect(gray)
            # draw a box around each barcode that was found
            for barcode in barcodes:
                x, y, w, h = cv2.boundingRect(barcode)
                cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)

        total += 1

        # lock
        with lock:
            outputFrame = frame.copy()

if __name__ == '__main__':
    # start a thread that will perform motion detection
    t = threading.Thread(target=detect_barcode)
    t.daemon = True
    t.start()
    app.run(debug=True)

vs.stop()
I literally just did something like this. My solution was the concurrent.futures module from the standard library.

import concurrent.futures as futures

Basically, you create your executor and submit() your blocking function with its arguments. It returns an object called a Future, whose result (if any) you can check every frame.
executor = futures.ThreadPoolExecutor()
would initialize the executor
future_obj = executor.submit(YOUR_FUNC, *args)
would start execution. Some time later you can check its status:
if future_obj.done():
    print(future_obj.result())
else:
    print("task running")
    # come back next frame and check again
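Putting those pieces together, a self-contained sketch; detect here is a toy stand-in for the real per-frame detection function:

```python
import concurrent.futures as futures
import time

# Toy stand-in for the expensive per-frame detection function
def detect(frame_id):
    time.sleep(0.05)
    return f"barcode-in-frame-{frame_id}"

executor = futures.ThreadPoolExecutor()
future_obj = executor.submit(detect, 7)

# The main loop keeps running; poll the future each iteration instead
# of blocking on the detection result
while not future_obj.done():
    time.sleep(0.01)  # this is where the next frame would be rendered

result = future_obj.result()
executor.shutdown()
```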
I am facing an issue while accessing a USB camera using a BeagleBone Black Wireless.
At first the error was a "select timeout" exception, which was resolved by this post.
Now I am getting a black screen in the output.
Here is the testing code I am using:
from cv2 import *

# initialize the camera
cam = VideoCapture(0)  # 0 -> index of camera
print("Cam capture")
cam.set(3, 320)
cam.set(4, 240)
print("Cam set")

s, img = cam.read()
print("Cam read")
if s:  # frame captured without any errors
    namedWindow("cam-test", CV_WINDOW_AUTOSIZE)
    imshow("cam-test", img)
    while True:
        key = waitKey(30)
        if key == ord('q'):
            destroyWindow("cam-test")
I have already checked that video0 is in the /dev directory.
The issue is that you need to call `cam.read()` and `imshow()` inside the while loop.
What you're doing is reading just the first frame and then showing it; your while loop isn't doing anything else. When the camera boots, the first frame is just a blank screen, which is what you see.
The code should be more like:
while True:
    s, img = cam.read()
    imshow("cam-test", img)
    key = waitKey(30)
    if key == ord('q'):
        destroyWindow("cam-test")
        break