I need to extract specific frames from an online video to work on an algorithm, but I don't want to download the whole video because that would be highly inefficient.
For starters, I tried working with YouTube videos. I can download the whole video using youtube-dl like this:
ydl_opts = {'outtmpl': r'OUTPUT_DIRECTORY_HERE',}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download([url])
And then I can capture individual frames.
I need to avoid downloading the whole video. After some research, I found that ffmpeg might help me do this. I found no way to download just the frames, so if that is not possible, the second option is to download specific portions of the video. One such example on Linux is here, but I couldn't find any solution for Python.
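For context on the ffmpeg route: it can be driven from Python via subprocess. The following is only a sketch, assuming ffmpeg is installed and that stream_url is a direct media URL (for example, one obtained from youtube-dl's info dict):

import subprocess

stream_url = "https://..."  # hypothetical direct stream URL

# Seek to 1:00, then save at most 10 seconds' worth of frames as PNGs.
subprocess.run([
    "ffmpeg",
    "-ss", "00:01:00",   # seek before -i, so ffmpeg skips straight there
    "-i", stream_url,    # read directly from the URL, no full download
    "-t", "10",          # stop after 10 seconds of input
    "out_%03d.png",      # numbered frame images
], check=True)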
What is a good way to download just the frames, or portions of videos, in Python, without downloading the entire thing?
To add on to the current answer: performance can be further enhanced using multiprocessing. For example, if you wanted to split the video into frames and process them independently across num_cpu processes:
import os
from functools import partial
from multiprocessing.pool import Pool

import cv2
import youtube_dl


def process_video_parallel(url, skip_frames, process_number):
    cap = cv2.VideoCapture(url)
    num_processes = os.cpu_count()
    frames_per_process = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) // num_processes
    # Each worker starts at its own slice of the video
    start = frames_per_process * process_number
    cap.set(cv2.CAP_PROP_POS_FRAMES, start)
    x = 0
    count = start
    while x < 10 and count < start + frames_per_process:
        ret, frame = cap.read()
        if not ret:
            break
        # Tag the filename with the process number so workers don't overwrite each other
        filename = r"PATH\shot_" + str(process_number) + "_" + str(x) + ".png"
        x += 1
        cv2.imwrite(filename, frame)
        count += skip_frames  # Skip 300 frames, i.e. 10 seconds at 30 fps
        cap.set(cv2.CAP_PROP_POS_FRAMES, count)
    cap.release()
video_url = "..." # The Youtube URL
ydl_opts = {}
ydl = youtube_dl.YoutubeDL(ydl_opts)
info_dict = ydl.extract_info(video_url, download=False)
formats = info_dict.get('formats', None)
print("Obtaining frames")
for f in formats:
    if f.get('format_note', None) == '144p':
        url = f.get('url', None)
        cpu_count = os.cpu_count()
        with Pool(cpu_count) as pool:
            pool.map(partial(process_video_parallel, url, 300), range(cpu_count))
For the purposes of this application, since images are just being saved from the video, this may not result in a huge improvement (maybe a few seconds), but if additional algorithms needed to be applied to the frames, it could be beneficial; a sketch of how such a step could slot in follows.
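As an illustration, a per-frame step could be passed into the worker. This is a hypothetical sketch; process_frame and its Canny edge detection are stand-ins, not part of the original answer:

import cv2

def process_frame(frame):
    # Stand-in for a real per-frame algorithm: edge detection
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Canny(gray, 50, 150)

# Inside process_video_parallel, instead of saving the raw frame:
#     cv2.imwrite(filename, process_frame(frame))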
I tried what @AyeshaKhan shared in the comments.
After importing cv2, numpy, and youtube_dl:
url = saved_url  # The YouTube URL
ydl_opts = {}
ydl = youtube_dl.YoutubeDL(ydl_opts)
info_dict = ydl.extract_info(url, download=False)
formats = info_dict.get('formats', None)

print("Obtaining frames")
for f in formats:
    if f.get('format_note', None) == '144p':
        url = f.get('url', None)
        cap = cv2.VideoCapture(url)
        x = 0
        count = 0
        while x < 10:
            ret, frame = cap.read()
            if not ret:
                break
            filename = r"PATH\shot" + str(x) + ".png"
            x += 1
            cv2.imwrite(filename, frame)
            count += 300  # Skip 300 frames, i.e. 10 seconds at 30 fps
            cap.set(cv2.CAP_PROP_POS_FRAMES, count)
            if cv2.waitKey(30) & 0xFF == ord('q'):
                break
        cap.release()
The code shared in the comments was downloading all of the frames; the count I added, together with cap.set(cv2.CAP_PROP_POS_FRAMES, count), ensures that frames are skipped as per my requirement.
Additionally, x here limits the number of saved frames to 10.
I am still not sure, though, whether this method actually fetches only the specified frames, or whether it fetches all the frames and merely saves the specified ones to my local storage. I needed the former.
But this is still fast enough and works for me!
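One way to probe that question is to time a direct seek against decoding the same frames sequentially; if setting CAP_PROP_POS_FRAMES were secretly decoding everything in between, the two timings would be close. A rough sketch, reusing the stream url from above:

import time
import cv2

cap = cv2.VideoCapture(url)

# Jump straight to frame 300 and read it
t0 = time.time()
cap.set(cv2.CAP_PROP_POS_FRAMES, 300)
cap.read()
print("seek:", time.time() - t0)

# Go back and decode the same 300 frames one by one
cap.set(cv2.CAP_PROP_POS_FRAMES, 0)
t0 = time.time()
for _ in range(300):
    cap.read()
print("sequential:", time.time() - t0)
cap.release()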
An alternative to @danielcahall's answer:
This method uses Ray for parallelization instead of Pool.
Note: the first run might take more time while the Ray components initialize; after that, it will be fine.
from timeit import default_timer as timer
import os, ray, shutil
import cv2
import youtube_dl

# Start with an empty output folder
try:
    os.makedirs('test_fol')
except FileExistsError:
    shutil.rmtree('test_fol')
    os.makedirs('test_fol')

ray.init()

@ray.remote
def process_video_parallel(url, total_frames, process_number):
    cap = cv2.VideoCapture(url)
    num_processes = os.cpu_count()
    frames_per_process = int(total_frames) // num_processes
    cap.set(cv2.CAP_PROP_POS_FRAMES, frames_per_process * process_number)
    count = frames_per_process * process_number
    while count < frames_per_process * (process_number + 1):
        ret, frame = cap.read()
        if not ret:
            break
        filename = f"test_fol/{count}.jpg"
        cv2.imwrite(filename, frame)
        count += 1
    cap.release()
t1 = timer()

video_url = "..."  # The YouTube URL
ydl_opts = {}
ydl = youtube_dl.YoutubeDL(ydl_opts)
info_dict = ydl.extract_info(video_url, download=False)
duration = info_dict['duration']
formats = info_dict.get('formats', None)
for f in formats:
    if f.get('format_note', None) == '144p':
        url = f.get('url', None)
        cpu_count = os.cpu_count()
        # duration * 31 approximates the total frame count, assuming roughly 31 fps
        data = ray.get([process_video_parallel.remote(url, int(duration * 31), x) for x in range(cpu_count)])
        break
print("Total Time", timer() - t1)
I am using multiprocessing to get frames of a video using OpenCV in Python.
My class looks like this:
import cv2
from multiprocessing import Process, Queue

class StreamVideos:
    def __init__(self):
        self.image_data = Queue()

    def start_proces(self):
        p = Process(target=self.echo)
        p.start()

    def echo(self):
        cap = cv2.VideoCapture('videoplayback.mp4')
        while cap.isOpened():
            ret, frame = cap.read()
            self.image_data.put(frame)
            # print("frame")
I start the "echo" process using:
p = Process(target=self.echo)
p.start()
The echo function looks like this:
def echo(self):
    cap = cv2.VideoCapture('videoplayback.mp4')
    while cap.isOpened():
        ret, frame = cap.read()
        self.image_data.put(frame)
in which I am using a queue to put these frames:
self.image_data.put(frame)
and then in another process I start retrieving these frames:
self.obj = StreamVideos()

def start_process(self):
    self.obj.start_proces()
    p = Process(target=self.stream_videos)
    p.start()

def stream_videos(self):
    while True:
        self.img = self.obj.image_data.get()
        print(self.img)
But as soon as I start putting frames into the queue, the RAM fills up very quickly and the system gets stuck. The video I am using is just 25 fps and 39 MB in size, so it does not make any sense.
One thing I noticed is that the "echo" process is putting a lot of frames in the queue before the "stream_videos" process retrieves them.
What could be the root of this problem?
Thanks in advance.
Expectations:
Able to retrieve the frames continuously.
Tried:
Not putting frames in the queue, in which case the RAM does not fill up.
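For what it's worth, a minimal way to keep the producer from racing ahead is to bound the queue: put() then blocks until the consumer drains a slot, so memory stays capped. A sketch of that one change, assuming the class above:

from multiprocessing import Queue

class StreamVideos:
    def __init__(self):
        # Bounded queue: put() blocks once 32 frames are waiting,
        # so "echo" can never outrun "stream_videos" by more than that.
        self.image_data = Queue(maxsize=32)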
The following is a general purpose single producer/multiple consumer implementation. The producer (class StreamVideos) creates a shared memory array whose size is the number of bytes in the video frame. One or more consumers (you specify the number of consumers to StreamVideos) can then call StreamVideos.get_next_frame() to retrieve the next frame. This method converts the shared array back into a numpy array for subsequent processing. The producer will only read the next frame into the shared array after all consumers have called get_next_frame:
#!/usr/bin/env python3

import multiprocessing
import numpy as np
import ctypes
import cv2

class StreamVideos:
    def __init__(self, path, n_consumers):
        """
        path is the path to the video:
        n_consumers is the number of tasks to which we will be streaming this.
        """
        self._path = path
        self._event = multiprocessing.Event()
        self._barrier = multiprocessing.Barrier(n_consumers + 1, self._reset_event)
        # Discover how large a framesize is by getting the first frame
        cap = cv2.VideoCapture(self._path)
        ret, frame = cap.read()
        if ret:
            self._shape = frame.shape
            frame_size = self._shape[0] * self._shape[1] * self._shape[2]
            self._arr = multiprocessing.RawArray(ctypes.c_ubyte, frame_size)
        else:
            self._arr = None
        cap.release()

    def _reset_event(self):
        self._event.clear()

    def start_streaming(self):
        cap = cv2.VideoCapture(self._path)
        while True:
            self._barrier.wait()
            ret, frame = cap.read()
            if not ret:
                # No more readable frames:
                break
            # Store frame into shared array:
            temp = np.frombuffer(self._arr, dtype=frame.dtype)
            temp[:] = frame.flatten(order='C')
            self._event.set()
        cap.release()
        self._arr = None
        self._event.set()

    def get_next_frame(self):
        # Tell producer that this consumer is through with the previous frame:
        self._barrier.wait()
        # Wait for next frame to be read by the producer:
        self._event.wait()
        if self._arr is None:
            return None
        # Return shared array as a numpy array:
        return np.ctypeslib.as_array(self._arr).reshape(self._shape)

def consumer(producer, id):
    frame_name = f'Frame - {id}'
    while True:
        frame = producer.get_next_frame()
        if frame is None:
            break
        cv2.imshow(frame_name, frame)
        cv2.waitKey(1)
    cv2.destroyAllWindows()

def main():
    producer = StreamVideos('videoplayback.mp4', 2)

    consumer1 = multiprocessing.Process(target=consumer, args=(producer, 1))
    consumer1.start()
    consumer2 = multiprocessing.Process(target=consumer, args=(producer, 2))
    consumer2.start()

    """
    # Run as a child process:
    producer_process = multiprocessing.Process(target=producer.start_streaming)
    producer_process.start()
    producer_process.join()
    """
    # Run in main process:
    producer.start_streaming()

    consumer1.join()
    consumer2.join()

if __name__ == '__main__':
    main()
I have code to get images from the video stream of a laptop camera. I want to reduce the photo-saving interval to one photo per minute. The original code looks like this:
# Importing all necessary libraries
import cv2
import os

# Read the video from specified path
cam = cv2.VideoCapture(0)

try:
    # creating a folder named data
    if not os.path.exists('data'):
        os.makedirs('data')
# if not created then raise error
except OSError:
    print('Error: Creating directory of data')

# frame
currentframe = 0

while(True):
    # reading from frame
    ret, frame = cam.read()
    if ret:
        # if video is still left continue creating images
        name = './data/frame' + str(currentframe) + '.jpg'
        print('Creating...' + name)
        # writing the extracted images
        cv2.imwrite(name, frame)
        # increasing counter so that it will
        # show how many frames are created
        currentframe += 1
    else:
        break

# Release all space and windows once done
cam.release()
cv2.destroyAllWindows()
For this task I tried to use the CAP_PROP_POS_MSEC parameter:
[...]
# Read the video from specified path
cam = cv2.VideoCapture(0)
cam.set(cv2.CAP_PROP_POS_MSEC, 20000)
[...]
while(True):
    [...]
    # writing the extracted images
    cv2.imwrite(name, frame)
    cv2.waitKey()
[...]
But the saving speed remains the same, and I see the following error:
videoio error v4l2 property pos_msec is not supported
I use Ubuntu 18.04, Python 3.7, and OpenCV 4.1.
Where is my mistake, and did I choose the right way to minimize the load on my computer's resources?
UPD
Using the recommendation of J.D., this code is working:
import cv2
import os
import time

prev_time = time.time()
delay = 1  # in seconds

# Read the video from specified path
cam = cv2.VideoCapture(0)
currentframe = 0

while (True):
    # reading from frame
    ret, frame = cam.read()
    if ret:
        if time.time() - prev_time > delay:
            # if video is still left continue creating images
            name = './data/frame' + str(currentframe) + '.jpg'
            print('Creating...' + name)
            # writing the extracted images
            cv2.imwrite(name, frame)
            currentframe += 1
            prev_time = time.time()
    else:
        break
EDIT: this answer is not a good solution, due to the frame buffer, as described in the comments. Because of the relevant information in the comments, I will leave the answer up.
If you don't plan to expand the code to do other things, you can just use waitKey:
cv2.waitKey(60000) will freeze code execution for 60 seconds.
If you want to expand the code, you have to create a time-based loop:
import time

prev_time = time.time()
count = 0
delay = 1  # in seconds

while True:
    if time.time() - prev_time > delay:
        count += 1
        print(count)
        prev_time = time.time()
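For the one-photo-per-minute goal in the question, delay would simply be set to 60.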
You should read separately and set the values again for each image, like this:
while(True):
    cam = cv2.VideoCapture(0)
    cam.set(cv2.CAP_PROP_POS_MSEC, 20000)
    # reading from frame
    ret, frame = cam.read()
    if ret:
        # if video is still left continue creating images
        name = './data/frame' + str(currentframe) + '.jpg'
        print('Creating...' + name)
        # writing the extracted images
        cv2.imwrite(name, frame)
        cv2.waitKey()
        # increasing counter so that it will
        # show how many frames are created
        currentframe += 1
    else:
        break
    cam.release()
And check your OpenCV version for Python regarding the error.
Here are my goals:
Capture video continuously until 'q' is pressed
Every ten seconds, save the video to a file in the created directory
Continue step two until 'q' is pressed
I am executing the following code, but when creating files it creates 6 KB files that say they cannot be played. I am fairly new to OpenCV and Python and not sure what I am missing. I am running this code in PyCharm with Python 3.6. Also, the
cv2.imshow('frame',frame)
window stops after ten seconds, but recording is happening in the background and files are created.
import numpy as np
import cv2
import time
import os
import random
import sys

fps = 24
width = 864
height = 640
video_codec = cv2.VideoWriter_fourcc('D', 'I', 'V', 'X')

name = random.randint(0, 1000)
print(name)

if (os.path.isdir(str(name)) is False):
    name = random.randint(0, 1000)
    name = str(name)

name = os.path.join(os.getcwd(), str(name))
print('ALl logs saved in dir:', name)
os.mkdir(name)

cap = cv2.VideoCapture(0)
ret = cap.set(3, 864)
ret = cap.set(4, 480)

cur_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
start = time.time()
video_file_count = 1
video_file = os.path.join(name, str(video_file_count) + ".avi")
print('Capture video saved location : {}'.format(video_file))

while (cap.isOpened()):
    start_time = time.time()
    ret, frame = cap.read()
    if ret == True:
        cv2.imshow('frame', frame)
        if (time.time() - start > 10):
            start = time.time()
            video_file_count += 1
            video_file = os.path.join(name, str(video_file_count) + ".avi")
            video_writer = cv2.VideoWriter(video_file, video_codec, fps, (int(cap.get(3)), int(cap.get(4))))
            time.sleep(10)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

cap.release()
cv2.destroyAllWindows()
I want files with the recorded videos. Files are generated, but they are 6 KB in size and nothing is recorded in them.
You're almost there! Given that I understood what your goal is, and with minimal change to your code, here is what worked for me.
This writes a new video file every ten seconds while recording each frame into the current video.
import numpy as np
import cv2
import time
import os
import random
import sys

fps = 24
width = 864
height = 640
video_codec = cv2.VideoWriter_fourcc("D", "I", "V", "X")

name = random.randint(0, 1000)
print(name)
if os.path.isdir(str(name)) is False:
    name = random.randint(0, 1000)
    name = str(name)
name = os.path.join(os.getcwd(), str(name))
print("ALl logs saved in dir:", name)
os.mkdir(name)

cap = cv2.VideoCapture(0)
ret = cap.set(3, 864)
ret = cap.set(4, 480)

cur_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
start = time.time()
video_file_count = 1
video_file = os.path.join(name, str(video_file_count) + ".avi")
print("Capture video saved location : {}".format(video_file))

# Create a video writer before entering the loop
video_writer = cv2.VideoWriter(
    video_file, video_codec, fps, (int(cap.get(3)), int(cap.get(4)))
)

while cap.isOpened():
    start_time = time.time()
    ret, frame = cap.read()
    if ret == True:
        cv2.imshow("frame", frame)
        if time.time() - start > 10:
            start = time.time()
            video_file_count += 1
            video_file = os.path.join(name, str(video_file_count) + ".avi")
            video_writer = cv2.VideoWriter(
                video_file, video_codec, fps, (int(cap.get(3)), int(cap.get(4)))
            )
            # No sleeping! We don't want to sleep, we want to write
            # time.sleep(10)
        # Write the frame to the current video writer
        video_writer.write(frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        break

cap.release()
cv2.destroyAllWindows()
Videos coming out at 6 KB is a sign of a codec error. You need to download opencv_ffmpeg.dll, place it in your Python folder, and rename it according to your OpenCV version, e.g. opencv_ffmpeg321.dll for OpenCV 3.2.1.
This solved the problem for me; before that, 5.6 KB videos were created regardless of what I did. But the problem can run deeper than it seems: it can also be caused by a mismatch between the resolution of the stream and that of the recording.
For OpenCV version X.Y.Z:
opencv_ffmpeg.dll ==> opencv_ffmpegXYZ.dll
For the 64-bit version of OpenCV X.Y.Z:
opencv_ffmpeg.dll ==> opencv_ffmpegXYZ_64.dll
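As a quick check for that kind of mismatch, a sketch along these lines can help; VideoWriter silently produces a tiny, unplayable file when the codec fails to open or when written frames don't match the declared size:

import cv2

cap = cv2.VideoCapture(0)
fourcc = cv2.VideoWriter_fourcc(*"DIVX")
size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
writer = cv2.VideoWriter("out.avi", fourcc, 24, size)

# If this prints False, the codec/backend failed to open the file.
print("writer opened:", writer.isOpened())

ret, frame = cap.read()
if ret:
    # Frame dimensions must match the size passed to VideoWriter.
    assert (frame.shape[1], frame.shape[0]) == size
    writer.write(frame)

writer.release()
cap.release()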
I'm using OpenCV and dlib to run facial recognition with landmarks, live from the webcam stream. The language is Python. It works fine on my MacBook laptop, but I need it to run from a desktop computer 24/7. The computer is a PC with an Intel® Core™2 Quad CPU Q6600 @ 2.40GHz, 32-bit, running Debian Jessie. The drop in performance is drastic: there is a 10-second delay due to processing!
I therefore looked into multi-threading to gain performance:
I first tried the sample code by OpenCV, and the result is great! All four cores hit 100%, and the performance is much better.
I then replaced the frame-processing code with my code, and it doesn't improve performance at all! Only one core hits 100%; the others stay very low. I even think it's worse with multi-threading on.
I got the facial landmark code from the dlib sample code. I know it can probably be optimized, but I want to understand why I am not able to use my (old) computer's full power with multi-threading.
I'll drop my code below, thanks a lot for reading :)
from __future__ import print_function

import numpy as np
import cv2
import dlib
from multiprocessing.pool import ThreadPool
from collections import deque

from common import clock, draw_str, StatValue
import video

class DummyTask:
    def __init__(self, data):
        self.data = data
    def ready(self):
        return True
    def get(self):
        return self.data

if __name__ == '__main__':
    import sys

    print(__doc__)

    try:
        fn = sys.argv[1]
    except:
        fn = 0
    cap = video.create_capture(fn)

    # Face detector
    detector = dlib.get_frontal_face_detector()

    # Landmarks shape predictor
    predictor = dlib.shape_predictor("landmarks/shape_predictor_68_face_landmarks.dat")

    # This is where the facial detection takes place
    def process_frame(frame, t0, detector, predictor):
        # some intensive computation...
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        clahe_image = clahe.apply(gray)
        detections = detector(clahe_image, 1)
        for k, d in enumerate(detections):
            shape = predictor(clahe_image, d)
            for i in range(1, 68):  # There are 68 landmark points on each face
                cv2.circle(frame, (shape.part(i).x, shape.part(i).y), 1, (0, 0, 255), thickness=2)
        return frame, t0

    threadn = cv2.getNumberOfCPUs()
    pool = ThreadPool(processes=threadn)
    pending = deque()

    threaded_mode = True

    latency = StatValue()
    frame_interval = StatValue()
    last_frame_time = clock()
    while True:
        while len(pending) > 0 and pending[0].ready():
            res, t0 = pending.popleft().get()
            latency.update(clock() - t0)
            draw_str(res, (20, 20), "threaded : " + str(threaded_mode))
            draw_str(res, (20, 40), "latency : %.1f ms" % (latency.value * 1000))
            draw_str(res, (20, 60), "frame interval : %.1f ms" % (frame_interval.value * 1000))
            cv2.imshow('threaded video', res)
        if len(pending) < threadn:
            ret, frame = cap.read()
            t = clock()
            frame_interval.update(t - last_frame_time)
            last_frame_time = t
            if threaded_mode:
                task = pool.apply_async(process_frame, (frame.copy(), t, detector, predictor))
            else:
                task = DummyTask(process_frame(frame, t, detector, predictor))
            pending.append(task)
        ch = cv2.waitKey(1)
        if ch == ord(' '):
            threaded_mode = not threaded_mode
        if ch == 27:
            break
    cv2.destroyAllWindows()
The performance issue was due to a bad compilation of dlib. Do not use pip install dlib, which for some reason runs very, very slowly compared to a proper compilation. I went from almost 10 seconds of lag to about 2 seconds this way. So in the end I didn't need multi-threading/processing, but I'm still working on it to improve the speed even more. Thanks for the help :)
I tried a simplified approach like P.Ro mentioned in his answer, with processes writing to an output queue, but somehow the queue got locked most of the time because all the processes wrote to it at the same time (just my guess); I probably did something wrong.
In the end I ended up using pipes.
The code is nasty, but if I were myself from a few hours ago, I would still be glad to find an example that actually runs without effort.
from multiprocessing import Process, Queue, Manager, Pipe
import multiprocessing
import face_recognition as fik
import cv2
import time

video_input = 0

obama_image = fik.load_image_file("obama.png")
obama_face_encoding = fik.face_encodings(obama_image)[0]

quality = 0.7

def f(id, fi, fl):
    import face_recognition as fok
    while True:
        small_frame = fi.get()
        print("running thread" + str(id))
        face_locations = fok.face_locations(small_frame)
        if (len(face_locations) > 0):
            print(face_locations)
            for (top7, right7, bottom7, left7) in face_locations:
                small_frame_c = small_frame[top7:bottom7, left7:right7]
                fl.send(small_frame_c)

fps_var = 0

if __name__ == '__main__':
    multiprocessing.set_start_method('spawn')
    # global megaman
    with Manager() as manager:
        video_capture = cv2.VideoCapture(video_input)
        fi = Queue(maxsize=14)
        threads = 8
        proc = []
        parent_p = []
        thread_p = []
        # procids = range(0,threads)
        for t in range(0, threads):
            p_t, c_t = Pipe()
            parent_p.append(p_t)
            thread_p.append(c_t)
            print(t)
            proc.append(Process(target=f, args=(t, fi, thread_p[t])))
            proc[t].start()

        useframe = False
        frame_id = 0
        while True:
            # Grab a single frame of video
            ret, frame = video_capture.read()
            effheight, effwidth = frame.shape[:2]
            if effwidth < 20:
                break
            # Resize frame of video to 1/4 size for faster face recognition processing
            xxx = 930
            yyy = 10/16  # 0.4234375
            small_frame = cv2.resize(frame, (xxx, int(xxx*yyy)))
            if frame_id % 2 == 0:
                if not fi.full():
                    fi.put(small_frame)
                    print(frame_id)

            cv2.imshow('Video', small_frame)
            print("FPS: ", int(1.0 / (time.time() - fps_var)))
            fps_var = time.time()

            # GET ALL DETECTIONS
            for t in range(0, threads):
                if parent_p[t].poll():
                    small_frame_c = parent_p[t].recv()
                    cv2.imshow('recc', small_frame_c)
                    height34, width34 = small_frame_c.shape[:2]
                    # print fsizeee
                    if (width34 < 20):
                        print("face 2 small")
                        print(width34)
                        break
                    face_encodings_cam = fik.face_encodings(small_frame_c, [(0, width34, height34, 0)])
                    match = fik.compare_faces([obama_face_encoding], face_encodings_cam[0])
                    name = "Unknown"
                    if match[0]:
                        name = "Barack"
                    print(name)
                    break

            frame_id += 1

            # Hit 'q' on the keyboard to quit!
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
I do not have much experience with ThreadPool, but I always just use Process as shown below. You should be able to easily edit this code to fit your needs; I wrote it with your implementation in mind.
This code will get the number of cores and start however many worker processes, all implementing the desired function in parallel. They all share a Queue of frames for input and all put to the same output Queue for the main process to get and show. Each Queue has a maximum size, in this case 5. This ensures that despite the CPU time it takes to process, the output will always stay relatively live.
import numpy as np
import cv2
from multiprocessing import Process, Queue
import time
# from common import clock, draw_str, StatValue
# import video

class Canny_Process(Process):

    def __init__(self, frame_queue, output_queue):
        Process.__init__(self)
        self.frame_queue = frame_queue
        self.output_queue = output_queue
        self.stop = False
        # Initialize your face detectors here

    def get_frame(self):
        if not self.frame_queue.empty():
            return True, self.frame_queue.get()
        else:
            return False, None

    def stopProcess(self):
        self.stop = True

    def canny_frame(self, frame):
        # some intensive computation...
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 50, 100)

        # To simulate CPU time
        #############################
        for i in range(1000000):
            x = 546 * 546
            res = x / (i + 1)
        #############################
        # REPLACE WITH FACE DETECT CODE HERE

        if self.output_queue.full():
            self.output_queue.get_nowait()
        self.output_queue.put(edges)

    def run(self):
        while not self.stop:
            ret, frame = self.get_frame()
            if ret:
                self.canny_frame(frame)

if __name__ == '__main__':

    frame_sum = 0
    init_time = time.time()

    def put_frame(frame):
        if Input_Queue.full():
            Input_Queue.get_nowait()
        Input_Queue.put(frame)

    def cap_read(cv2_cap):
        ret, frame = cv2_cap.read()
        if ret:
            put_frame(frame)

    cap = cv2.VideoCapture(0)

    threadn = cv2.getNumberOfCPUs()
    threaded_mode = True

    process_list = []
    Input_Queue = Queue(maxsize=5)
    Output_Queue = Queue(maxsize=5)

    for x in range((threadn - 1)):
        canny_process = Canny_Process(frame_queue=Input_Queue, output_queue=Output_Queue)
        canny_process.daemon = True
        canny_process.start()
        process_list.append(canny_process)

    ch = cv2.waitKey(1)
    cv2.namedWindow('Threaded Video', cv2.WINDOW_NORMAL)
    while True:
        cap_read(cap)

        if not Output_Queue.empty():
            result = Output_Queue.get()
            cv2.imshow('Threaded Video', result)
            ch = cv2.waitKey(5)

        if ch == ord(' '):
            threaded_mode = not threaded_mode
        if ch == 27:
            break
    cv2.destroyAllWindows()
This should do the trick; just change my Canny function to do your face detection. I wrote this from your code and compared the two: this is significantly faster. I am using multiprocessing.Process here. In Python, processes are truly parallel, while threads are not quite, because of the GIL. I am using two queues to send data back and forth between the main process and the workers. Queues are both thread- and process-safe.
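One caveat worth noting: stopProcess() sets self.stop on the parent's copy of the object, and a running child process will never see that change. A shared flag, such as a multiprocessing.Event, is one way to get a clean shutdown; a sketch of the idea:

from multiprocessing import Event

stop_event = Event()  # pass this into each Canny_Process instead of self.stop

# In run(), the loop condition would become:
#     while not self.stop_event.is_set():

# Then, from the main process:
stop_event.set()
for p in process_list:
    p.join()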
You may use this, multithreaded:
from imutils.video import VideoStream

# Initialize the multithreaded video stream.
videostream = "rtsp://192.168.x.y/user=admin=xxxxxxx_channel=vvvv=1.sdp?params"
vs = VideoStream(src=videostream, resolution=frameSize,  # frameSize: a (width, height) tuple defined elsewhere
                 framerate=32).start()

frame = vs.read()
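For a continuous feed, a read loop along these lines is typical (a sketch, assuming the vs object above); VideoStream keeps grabbing frames on its own thread, so read() just returns the latest one:

import cv2

while True:
    frame = vs.read()
    if frame is None:
        break
    cv2.imshow("Stream", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

vs.stop()
cv2.destroyAllWindows()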
I have gotten both OpenCV and PyAudio working; however, I am not sure how I would sync them together. I am unable to get a framerate from OpenCV, and measuring the call time for a frame changes from moment to moment. PyAudio, however, works by grabbing a certain sample rate. How would I sync them to run at the same rate? I assume there is some standard, or some way codecs do it. (I've tried Google; all I got was information on lip syncing :/)
OpenCV Frame rate
from __future__ import division
import time
import math
import cv2, cv

vc = cv2.VideoCapture(0)
# get the frame
while True:
    before_read = time.time()
    rval, frame = vc.read()
    after_read = time.time()
    if frame is not None:
        print len(frame)
        print math.ceil((1.0 / (after_read - before_read)))
        cv2.imshow("preview", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        print "None..."
        cv2.waitKey(1)

# display the frame
while True:
    cv2.imshow("preview", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
Grabbing and saving audio
from sys import byteorder
from array import array
from struct import pack

import pyaudio
import wave

THRESHOLD = 500
CHUNK_SIZE = 1024
FORMAT = pyaudio.paInt16
RATE = 44100

def is_silent(snd_data):
    "Returns 'True' if below the 'silent' threshold"
    print "\n\n\n\n\n\n\n\n"
    print max(snd_data)
    print "\n\n\n\n\n\n\n\n"
    return max(snd_data) < THRESHOLD

def normalize(snd_data):
    "Average the volume out"
    MAXIMUM = 16384
    times = float(MAXIMUM)/max(abs(i) for i in snd_data)

    r = array('h')
    for i in snd_data:
        r.append(int(i*times))
    return r

def trim(snd_data):
    "Trim the blank spots at the start and end"
    def _trim(snd_data):
        snd_started = False
        r = array('h')

        for i in snd_data:
            if not snd_started and abs(i) > THRESHOLD:
                snd_started = True
                r.append(i)
            elif snd_started:
                r.append(i)
        return r

    # Trim to the left
    snd_data = _trim(snd_data)

    # Trim to the right
    snd_data.reverse()
    snd_data = _trim(snd_data)
    snd_data.reverse()
    return snd_data

def add_silence(snd_data, seconds):
    "Add silence to the start and end of 'snd_data' of length 'seconds' (float)"
    r = array('h', [0 for i in xrange(int(seconds*RATE))])
    r.extend(snd_data)
    r.extend([0 for i in xrange(int(seconds*RATE))])
    return r

def record():
    """
    Record a word or words from the microphone and
    return the data as an array of signed shorts.

    Normalizes the audio, trims silence from the
    start and end, and pads with 0.5 seconds of
    blank sound to make sure VLC et al can play
    it without getting chopped off.
    """
    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT, channels=1, rate=RATE,
                    input=True, output=True,
                    frames_per_buffer=CHUNK_SIZE)

    num_silent = 0
    snd_started = False

    r = array('h')

    while 1:
        # little endian, signed short
        snd_data = array('h', stream.read(1024))
        if byteorder == 'big':
            snd_data.byteswap()

        print "\n\n\n\n\n\n"
        print len(snd_data)
        print snd_data

        r.extend(snd_data)

        silent = is_silent(snd_data)

        if silent and snd_started:
            num_silent += 1
        elif not silent and not snd_started:
            snd_started = True

        if snd_started and num_silent > 1:
            break

    sample_width = p.get_sample_size(FORMAT)
    stream.stop_stream()
    stream.close()
    p.terminate()

    r = normalize(r)
    r = trim(r)
    r = add_silence(r, 0.5)
    return sample_width, r

def record_to_file(path):
    "Records from the microphone and outputs the resulting data to 'path'"
    sample_width, data = record()
    data = pack('<' + ('h'*len(data)), *data)

    wf = wave.open(path, 'wb')
    wf.setnchannels(1)
    wf.setsampwidth(sample_width)
    wf.setframerate(RATE)
    wf.writeframes(data)
    wf.close()

if __name__ == '__main__':
    print("please speak a word into the microphone")
    record_to_file('demo.wav')
    print("done - result written to demo.wav")
I think you'd be better off using either GStreamer or ffmpeg, or if you're on Windows, DirectShow. These libs can handle both audio and video, and should have some kind of multiplexer to allow you to mix video and audio properly.
But if you really want to do this using OpenCV, you should be able to use VideoCapture to get the frame rate. Have you tried this?
fps = cv.GetCaptureProperty(vc, cv.CV_CAP_PROP_FPS)
Another way would be to estimate fps as number of frames divided by duration:
nFrames = cv.GetCaptureProperty(vc, cv.CV_CAP_PROP_FRAME_COUNT)
cv.SetCaptureProperty(vc, cv.CV_CAP_PROP_POS_AVI_RATIO, 1)
duration = cv.GetCaptureProperty(vc, cv.CV_CAP_PROP_POS_MSEC)
fps = 1000 * nFrames / duration
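For reference, with the modern cv2 API the equivalent properties are read directly off the VideoCapture object (vc here is the capture from the question); note that both can be unreliable for live cameras:

fps = vc.get(cv2.CAP_PROP_FPS)               # may be 0 for some cameras/backends
n_frames = vc.get(cv2.CAP_PROP_FRAME_COUNT)  # only meaningful for files, not live cameras
print(fps, n_frames)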
I'm not sure I understand what you were trying to do here:
before_read = time.time()
rval, frame = vc.read()
after_read = time.time()
It seems to me that doing after_read - before_read only measures how long it took OpenCV to load the next frame; it doesn't measure the fps. OpenCV is not trying to do playback: it's only loading frames, and it will try to do so as fast as it can; I think there's no way to configure that. I think that putting a waitKey(1000/fps) after displaying each frame (waitKey takes milliseconds) will achieve what you're looking for.
You could have two counters, one for audio and one for video.
The video counter increases by 1/fps when showing an image, and the audio counter by sec, where sec is the seconds of audio you write to the stream each time. Then in the audio part of the code you can do something like:
while audiosec - videosec >= 0.05:  # Audio is ahead
    time.sleep(0.05)
And in the video part:
while videosec - audiosec >= 0.2:  # Video is ahead
    time.sleep(0.2)
You can play with the numbers.
This is how I achieved some sort of synchronization on my own video player project, using pyaudio and, recently, ffmpeg instead of cv2.
Personally, I used threading for this.
import concurrent.futures
import pyaudio
import cv2

class Aud_Vid():

    def __init__(self, arg):
        self.video = cv2.VideoCapture(0)
        self.CHUNK = 1470
        self.FORMAT = pyaudio.paInt16
        self.CHANNELS = 2
        self.RATE = 44100
        self.audio = pyaudio.PyAudio()
        self.instream = self.audio.open(format=self.FORMAT, channels=self.CHANNELS, rate=self.RATE, input=True, frames_per_buffer=self.CHUNK)
        self.outstream = self.audio.open(format=self.FORMAT, channels=self.CHANNELS, rate=self.RATE, output=True, frames_per_buffer=self.CHUNK)

    def sync(self):
        with concurrent.futures.ThreadPoolExecutor() as executor:
            tv = executor.submit(self.video.read)
            ta = executor.submit(self.instream.read, 1470)
            vid = tv.result()
            aud = ta.result()
            return (vid[1].tobytes(), aud)
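A hypothetical usage sketch: each sync() call grabs one video frame and one audio chunk in parallel. At RATE = 44100 and CHUNK = 1470, each chunk is exactly 1/30 s of audio, which lines up with a 30 fps camera:

av = Aud_Vid(None)  # the constructor's arg parameter is unused
while True:
    frame_bytes, audio_chunk = av.sync()
    # Play the chunk back while the matching frame is current.
    av.outstream.write(audio_chunk)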