I have a problem understanding video processing in OpenCV.
I have a dynamically generated image that comprises the 4 most recent face images (extracted from frames of a live video) concatenated together vertically.
I want to attach this image to the live video frame so that on one side of an OpenCV window the live video from the capture is displayed, and on the other side of the window the image described above is shown.
I am wondering whether I should use cv2.addWeighted to overlay this image on the frame from the live video capture, or whether I should concatenate the frame and the image and then pass the result to cv2.imshow to display the image and the live video in a single window.
Can this problem be considered as concatenating two videos with different frame rates: one a video with some fps, the other a static image at 0 fps? Kindly suggest any approach that could help, or share any resource on a similar problem!
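For reference, a minimal sketch of the two options under consideration (the file names are placeholders): cv2.addWeighted blends two images of identical shape, while a side-by-side layout only needs the heights to match before concatenation.
import cv2

frame = cv2.imread('frame.jpg')        # a single live-video frame (placeholder file)
strip = cv2.imread('faces_strip.jpg')  # the vertically concatenated face images (placeholder file)

# Option 1: blend with addWeighted -- both inputs must have the same shape,
# so the strip is resized to the frame size first.
overlay = cv2.addWeighted(frame, 0.7,
                          cv2.resize(strip, (frame.shape[1], frame.shape[0])), 0.3, 0)

# Option 2: side-by-side display -- only the heights have to match.
strip_resized = cv2.resize(strip, (strip.shape[1], frame.shape[0]))
side_by_side = cv2.hconcat([frame, strip_resized])

cv2.imshow('overlay', overlay)
cv2.imshow('side by side', side_by_side)
cv2.waitKey(0)
cv2.destroyAllWindows()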
import cv2
import numpy as np

# extract faces from a live video stream
# and save them to a file
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

def extract_face(image):
    faces = face_cascade.detectMultiScale(image, 1.3, 5)
    face = None
    for (x, y, w, h) in faces:
        face = image[y:y+h, x:x+w]
        break  # keep only the first detected face
    return face

vid = cv2.VideoCapture(1)
while True:
    ret, frame = vid.read()
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    faced = extract_face(frame)
    if faced is None:
        continue  # no face in this frame
    cv2.imwrite('face.jpg', faced)
    # horizontal concat of the saved face and the frame
    # (both must have the same height, so the face crop is resized first)
    faced = cv2.resize(faced, (faced.shape[1], frame.shape[0]))
    hconcat = np.concatenate((frame, faced), axis=1)
    # display the video
    cv2.imshow('frame', hconcat)
vid.release()
cv2.destroyAllWindows()
This is a sample of what I am trying to implement.
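For the layout described above (live video on one side, the last four detected faces stacked on the other), a rolling buffer of face crops could be kept, stacked with cv2.vconcat, resized to the frame height, and joined to the frame with cv2.hconcat. A minimal sketch along those lines, where the buffer size, thumbnail size, and camera index are assumptions:
import cv2
from collections import deque

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
recent_faces = deque(maxlen=4)  # rolling buffer of the last 4 face crops
thumb_size = (120, 120)         # assumed thumbnail size for each face

cap = cv2.VideoCapture(0)       # assumed camera index
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        recent_faces.append(cv2.resize(frame[y:y+h, x:x+w], thumb_size))
        break  # keep only the first face per frame, as in the snippet above
    if recent_faces:
        strip = cv2.vconcat(list(recent_faces))
        # the strip must share the frame's height before horizontal concatenation
        scale = frame.shape[0] / strip.shape[0]
        strip = cv2.resize(strip, (int(strip.shape[1] * scale), frame.shape[0]))
        display = cv2.hconcat([frame, strip])
    else:
        display = frame
    cv2.imshow('live + last faces', display)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()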
I use the following function to obtain video frames. I either pass noise_type=None to obtain the original frames, or pass 'salt and pepper' to overlay the frames with salt-and-pepper noise (randomly replacing some RGB pixels with (0, 0, 0) or (255, 255, 255)). This is passed alongside the probability that a pixel will be replaced with a black or white pixel (e.g. prob=0.1 to replace 10% of pixels with either a black or a white pixel).
Please note, I am using Python 3.7.9 and OpenCV 4.4.0. Also, as the videos are ultimately to be written alongside audio data using moviepy, they are in RGB space; so running this code and viewing the video will show the wrong colourspace, but you should still see that the video hangs during playback.
import cv2
import numpy as np

def get_video_frames(filename, noise_type=None, prob=None):
    all_frames = []
    video_capture = cv2.VideoCapture()
    if not video_capture.open(filename):
        print('Error: Cannot open video file {}'.format(filename))
        return
    fps = video_capture.get(cv2.CAP_PROP_FPS)
    print("fps: {}".format(fps))
    while True:
        has_frames, frame = video_capture.read()
        if not has_frames:
            video_capture.release()
            break
        if noise_type is None:
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frame = cv2.resize(frame, dsize=(224, 224))
        elif noise_type == 'salt and pepper':
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frame = cv2.resize(frame, dsize=(224, 224))
            row, col, ch = frame.shape
            s_vs_p = 0.5
            num_noisy = int(np.ceil(row * col * prob * s_vs_p))
            salty_x_coords = np.random.choice(row, num_noisy)
            salty_y_coords = np.random.choice(col, num_noisy)
            frame[salty_x_coords, salty_y_coords] = 255, 255, 255
            peppery_x_coords = np.random.choice(row, num_noisy)
            peppery_y_coords = np.random.choice(col, num_noisy)
            frame[peppery_x_coords, peppery_y_coords] = 0, 0, 0
        all_frames.append(frame)
    return all_frames, fps
The issue seems to come with playback. I generate clean frames and display them using OpenCV:
frames_clean, fps = get_video_frames('C:/some_video_file.mp4')
for f in frames_clean:
    cv2.imshow('clean', f)
    cv2.waitKey(33)
cv2.destroyAllWindows()
Then I generate noisy frames and display them using OpenCV:
frames_noisy, fps = get_video_frames('C:/some_video_file.mp4', noise_type='salt and pepper', prob=0.1)
for f in frames_noisy:
    cv2.imshow('noisy', f)
    cv2.waitKey(33)
cv2.destroyAllWindows()
The noisy video hangs/pauses/stutters on some frames. It is really unusual, as both frames_clean and frames_noisy are lists of uint8 frames of the same shape; the only difference is that the noisy frames have some different pixel values. This behaviour is also present if I create a video clip from these frame lists using moviepy, write it to disk, and play it with VLC/Windows Media Player. After two days of scouring the internet, I can't find any explanation. I would like the noisy videos I generate to play with a seemingly stable display rate, just as the clean video without noise does. Thanks for any help!
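For completeness, the display delay in the loops above can also be derived from the fps value that get_video_frames already returns, rather than hard-coding 33 ms; a minimal sketch (this is not offered as a diagnosis of the stutter, only as a way to tie playback to the real frame rate):
import cv2

frames_noisy, fps = get_video_frames('C:/some_video_file.mp4',
                                     noise_type='salt and pepper', prob=0.1)
delay_ms = max(1, int(round(1000.0 / fps)))  # milliseconds per frame at the source fps
for f in frames_noisy:
    cv2.imshow('noisy', f)
    if cv2.waitKey(delay_ms) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()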
I want to get each frame from a video as an image. The background to this is the following: I have written a neural network that is able to recognize hand signs. Now I want to start a video stream, where each image/frame of the stream is put through the neural network. To fit it into my neural network, I want to process each frame and reduce the image to 28x28 pixels. In the end it should look similar to this: https://www.youtube.com/watch?v=JfSao30fMxY
I have searched the web and found out that I can use cv2.VideoCapture to get the stream. But how can I pick each frame, process it, and print the result back on the screen? My code looks like this so far:
import numpy as np
import cv2

cap = cv2.VideoCapture(0)
# Todo: each frame/image from the video should be saved as a variable and passed to imageToLabel()
# Todo: before the image is handed to the method, it needs to be translated into a 28*28 np array
# Todo: the returned label should be printed onto the video
i = 0
while True:
    # Capture frame-by-frame
    # Load the model once and pass it as a parameter
    ret, frame = cap.read()
    i += 1
    cv2.imwrite('database/{index}.png'.format(index=i), frame)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # COLOR_BGR2BGRAY is not a valid flag
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
import tensorflow as tf

def imageToLabel(imgArr, checkpointLoad):
    new_model = tf.keras.models.load_model(checkpointLoad)
    imgArrNew = imgArr.reshape(1, 28, 28, 1) / 255
    prediction = new_model.predict(imgArrNew)
    label = np.argmax(prediction)
    return label
frame is the BGR image you get from the stream.
gray is the grayscale-converted image.
I suppose your network takes grayscale images because of its input shape. Therefore you need to first resize the image to (28,28) and then pass it to your imageToLabel function:
resizedImg = cv2.resize(gray,(28,28))
label = imageToLabel(resizedImg,yourModel)
Now that you know the prediction, you can draw it on the frame using e.g. cv2.putText() and then display that annotated frame instead of the original frame.
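Putting the pieces together, a rough sketch of how the resize, prediction, and cv2.putText call could sit inside the capture loop (yourModel stands for the checkpoint path expected by imageToLabel; the text position, font, and colour are arbitrary, and loading the model once outside the loop would be faster than reloading it on every frame):
import cv2

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    resizedImg = cv2.resize(gray, (28, 28))
    label = imageToLabel(resizedImg, yourModel)  # yourModel: path to the saved checkpoint
    cv2.putText(frame, str(label), (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()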
edit:
If you want to use parts of the image for your network you can slice the image like this:
slicedImg = gray[50:150,50:150]
resizedImg = cv2.resize(slicedImg,(28,28))
label = imageToLabel(resizedImg,yourModel)
If you're not that familiar with indexing in Python, you might want to take a look at this.
Also, if you want it to look like the linked video, you can draw a rectangle from e.g. (50,50) to (150,150) that is green (0,255,0):
cv2.rectangle(frame,(50,50),(150,150),(0,255,0))
I have a conference call video with different people's tiles arranged in a grid.
Example: [screenshot of a Zoom gallery view]
Can I crop every video tile to a separate file using Python or Node.js?
Yes, you can achieve that using the OpenCV library:
Read the video in OpenCV using the VideoCapture API. Note down the framerate while reading.
Parse through each frame and crop it.
Write the cropped frame to a video using the OpenCV VideoWriter.
Here is example code, using (640,480) as the new dimensions:
import cv2

cap = cv2.VideoCapture('<video_file_name>')
fps = cap.get(cv2.CAP_PROP_FPS)
out = cv2.VideoWriter('<output video file name>', -1, fps, (640, 480))
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # x, y, w, h define the region to keep; (w, h) should match the writer size (640, 480)
    crop_frame = frame[y:y+h, x:x+w]
    # write the cropped frame
    out.write(crop_frame)
# Release reader and writer after parsing all frames
cap.release()
out.release()
Here's the code (tested). It works by initialising a number of video outputs, then, for each frame of the input video, cropping the region of interest (roi) and writing it to the relevant output video. You might need to make tweaks depending on the input video dimensions, number of tiles, offsets, etc.
import numpy as np
import cv2

cap = cv2.VideoCapture('in.mp4')
ret, frame = cap.read()
(h, w, d) = np.shape(frame)
horiz_divisions = 5  # Number of tiles stacked horizontally
vert_divisions = 5   # Number of tiles stacked vertically
divisions = horiz_divisions * vert_divisions  # Total number of tiles
seg_h = int(h / vert_divisions)   # Tile height
seg_w = int(w / horiz_divisions)  # Tile width

# Initialise the output videos
outvideos = [0] * divisions
for i in range(divisions):
    outvideos[i] = cv2.VideoWriter('out{}.avi'.format(str(i)),
                                   cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'),
                                   10, (seg_w, seg_h))

# main code
while cap.isOpened():
    ret, frame = cap.read()
    if ret == True:
        vid = 0  # video counter
        for i in range(vert_divisions):
            for j in range(horiz_divisions):
                # Get the coordinates (top left corner) of the current tile
                row = i * seg_h
                col = j * seg_w
                roi = frame[row:row+seg_h, col:col+seg_w, 0:3]  # Copy the region of interest
                outvideos[vid].write(roi)
                vid += 1
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

# Release all the objects
cap.release()
for i in range(divisions):
    outvideos[i].release()
# Release everything once the job is finished
cv2.destroyAllWindows()
Hope this helps!
I have recently bought a stereo camera through Amazon and I want to use it for depth mapping. The problem is that the output I get from the camera is a single video containing the output of both cameras.
What I want is two separate outputs from the single USB port, if that is possible. I could use cropping, but I don't want to, because I am trying to reduce the processing time and I want the outputs separately.
The above image was generated with the following code:
import numpy as np
import cv2

cam = cv2.VideoCapture(1)
cam.set(cv2.CAP_PROP_FPS, 120)
cam.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
while True:
    s, original = cam.read()
    cv2.imshow('original', original)
    if cv2.waitKey(1) & 0xFF == ord('w'):
        break
cam.release()
cv2.destroyAllWindows()
I have also tried other techniques such as:
import numpy as np
import cv2

left = cv2.VideoCapture(1)
right = cv2.VideoCapture(2)
left.set(cv2.CAP_PROP_FRAME_WIDTH, 720)
left.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
right.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
right.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
left.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
right.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))

# Grab both frames first, then retrieve, to minimize latency between cameras
while True:
    left.grab()
    right.grab()
    _, leftFrame = left.retrieve()
    leftHeight, leftWidth = leftFrame.shape[:2]
    _, rightFrame = right.retrieve()
    rightHeight, rightWidth = rightFrame.shape[:2]
    # TODO: Calibrate the cameras and correct the images
    cv2.imshow('left', leftFrame)
    cv2.imshow('right', rightFrame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
left.release()
right.release()
cv2.destroyAllWindows()
but they are not recognising the third camera. Any help would be appreciated.
My OpenCV version is 3.4.
P.S. If anyone can present a solution in C++, that would also work for me.
OK, so after analysing the problem I figured that the best way would be to crop the image in half, as it saves processing time. If you have two different image sources, then your pipeline time for getting these images is doubled. After testing the stereo camera with and without cropping, I saw no noticeable change in the FPS. Here is a simple piece of code for cropping the video and displaying it in two different windows.
import numpy as np
import cv2

cam = cv2.VideoCapture(1)
cam.set(cv2.CAP_PROP_FPS, 120)
cam.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
s, original = cam.read()
height, width, channels = original.shape
print(width)
print(height)
while True:
    s, original = cam.read()
    left = original[0:height, 0:int(width/2)]
    right = original[0:height, int(width/2):width]
    cv2.imshow('Left', left)
    cv2.imshow('Right', right)
    if cv2.waitKey(1) & 0xFF == ord('w'):
        break
cam.release()
cv2.destroyAllWindows()
As a task for school, our group has to create an application that knows when a goal is scored. This means that a ball-shaped object passes a line.
First we are attempting to input a video, get OpenCV to track the ball, and then output the result as a video.
I have put together a bunch of code snippets that I found on Stack Overflow, but it doesn't work.
I am creating a new post because all the other related threads are either in C++ or use colour detection instead of the shape detection that we use. I also can't find a clear answer on writing out a video file once it has been turned into a series of images.
Following is the code that I have so far:
import cv2
import numpy as np

cap = cv2.VideoCapture('bal.mp4')
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output1.avi', fourcc, 20.0, (640, 480))
while True:
    # Take each frame
    ret, frame = cap.read()
    if not ret:
        break
    frame = cv2.medianBlur(frame, 5)
    cimg = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    circles = cv2.HoughCircles(cimg, 3, 1, 20, param1=50, param2=30, minRadi...)  # the minRadius/maxRadius arguments were cut off in the original
    if circles is None:
        print("NoneType")
        break
    circles = np.uint16(np.around(circles.astype(np.double), 3))
    for i in circles[0, :]:
        # draw the outer circle
        cv2.circle(cimg, (i[0], i[1]), i[2], (0, 255, 0), 2)
        # draw the center of the circle
        cv2.circle(cimg, (i[0], i[1]), 2, (0, 0, 255), 3)
    cv2.imwrite('test.jpg', cimg)
    out.write(cimg)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
out.release()
cv2.destroyAllWindows()
We get working images, but the video is unplayable with VLC or any other media player.
This is an image from the program:
The issue now is turning it into a playable video.
Thanks in advance.
Not sure if you got this working in the end, but changing it to mp4 worked for me on my Mac:
frame_width = int(cap.get(3))   # cv2.CAP_PROP_FRAME_WIDTH
frame_height = int(cap.get(4))  # cv2.CAP_PROP_FRAME_HEIGHT
out = cv2.VideoWriter('static/video/outpy.mp4', 0x7634706d, 20, (frame_width, frame_height))  # 0x7634706d is the 'mp4v' FOURCC
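For completeness, a minimal sketch of how that writer might be wired up end to end; note that cv2.VideoWriter silently produces an unplayable file if the frames written do not match the size passed to the constructor, or if they are single-channel while isColor is left at its default of True (the output file name and fps below are assumptions):
import cv2

cap = cv2.VideoCapture('bal.mp4')
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # same codec as the 0x7634706d literal above
out = cv2.VideoWriter('output1.mp4', fourcc, 20, (frame_width, frame_height))

while True:
    ret, frame = cap.read()
    if not ret:
        break
    # ... draw the detected circles on `frame` here ...
    # frames must stay 3-channel and (frame_width, frame_height), or the file will not play
    out.write(frame)

cap.release()
out.release()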