Why do some pixel values cause stuttering during playback? - python

I use the following function to obtain video frames. I pass either noise_type=None to get the original frames, or noise_type='salt and pepper' to overlay the frames with salt-and-pepper noise (randomly replacing some RGB pixels with (0, 0, 0) or (255, 255, 255)). The noise type is passed alongside the probability that a pixel will be replaced with a black or white pixel (e.g. prob=0.1 replaces 10% of pixels with either a black or a white pixel).
Please note, I am using Python 3.7.9 and OpenCV 4.4.0. Also, because the videos are ultimately written alongside audio data using moviepy, the frames are in RGB space; running this code and viewing the video will therefore show the wrong colours, but you should still see that the video hangs during playback.
import cv2
import numpy as np

def get_video_frames(filename, noise_type=None, prob=None):
    all_frames = []
    video_capture = cv2.VideoCapture()
    if not video_capture.open(filename):
        print('Error: Cannot open video file {}'.format(filename))
    fps = video_capture.get(cv2.CAP_PROP_FPS)
    print("fps: {}".format(fps))
    while True:
        has_frames, frame = video_capture.read()
        if not has_frames:
            break  # end of the stream (or the file could not be read)
        if noise_type is None:
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frame = cv2.resize(frame, dsize=(224, 224))
        elif noise_type == 'salt and pepper':
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frame = cv2.resize(frame, dsize=(224, 224))
            row, col, ch = frame.shape
            s_vs_p = 0.5  # share of the noisy pixels that become white
            # int() replaces np.int, which newer NumPy releases no longer provide
            salty_x_coords = np.random.choice(frame.shape[0], int(np.ceil(frame.shape[0]*frame.shape[1]*prob*s_vs_p)))
            salty_y_coords = np.random.choice(frame.shape[1], int(np.ceil(frame.shape[0]*frame.shape[1]*prob*s_vs_p)))
            frame[salty_x_coords, salty_y_coords] = 255, 255, 255
            peppery_x_coords = np.random.choice(frame.shape[0], int(np.ceil(frame.shape[0]*frame.shape[1]*prob*s_vs_p)))
            peppery_y_coords = np.random.choice(frame.shape[1], int(np.ceil(frame.shape[0]*frame.shape[1]*prob*s_vs_p)))
            frame[peppery_x_coords, peppery_y_coords] = 0, 0, 0
        all_frames.append(frame)
    video_capture.release()
    return all_frames, fps
The issue comes with playback, it seems. I generate clean frames and display them using opencv:
frames_clean, fps = get_video_frames('C:/some_video_file.mp4')
for f in frames_clean:
    cv2.imshow('clean', f)
    cv2.waitKey(int(1000 / fps))  # assumed pacing; imshow needs a waitKey call to refresh the window
Then I generate noisy frames and display them using opencv:
frames_noisy, fps = get_video_frames('C:/some_video_file.mp4', noise_type='salt and pepper', prob=0.1)
for f in frames_noisy:
    cv2.imshow('noisy', f)
    cv2.waitKey(int(1000 / fps))  # assumed pacing; imshow needs a waitKey call to refresh the window
The noisy video hangs/pauses/stutters on some frames. It's really unusual as both frames_clean and frames_noisy are lists of uint8 frames of the same shape. The only difference is that the noisy frames have some different pixel values. This behaviour is also present if I create a videoclip using moviepy with these frame lists, write them to disk, and play them with VLC/ Windows Media Player. After 2 days of scouring the internet, I can't find any explanation. I would like the noisy videos I generate to play as expected with a seemingly stable display rate as per the clean video without noise. Thanks for any help!
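For reference, the noise step can be isolated from the file reading so that it is easier to inspect on its own. The following is only a minimal sketch under some assumptions: the helper name add_salt_and_pepper and its rng parameter are hypothetical and not part of the function above, and it draws one random number per pixel instead of sampling coordinates, so a pixel is never picked twice.

```python
import numpy as np

def add_salt_and_pepper(frame, prob, s_vs_p=0.5, rng=None):
    """Return a copy of frame with roughly a `prob` share of pixels set to white or black.

    Hypothetical helper: s_vs_p is the share of noisy pixels that become white.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = frame.copy()
    # one uniform draw in [0, 1) per pixel (not per channel)
    draws = rng.random(frame.shape[:2])
    salt = draws < prob * s_vs_p
    pepper = (draws >= prob * s_vs_p) & (draws < prob)
    noisy[salt] = 255   # white pixels, all channels
    noisy[pepper] = 0   # black pixels, all channels
    return noisy
```

Calling add_salt_and_pepper(frame, 0.1) on a 224x224 RGB frame should change about 10% of its pixels while leaving the input frame untouched.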


concatenating image and video frame opencv

I have a problem understanding video processing in OpenCV.
I have a dynamically generated image that consists of the 4 most recent face images (extracted from frames of a live video) concatenated vertically.
I want to attach this image to the live video frame so that one side of an OpenCV window shows the live video from the video capture and the other side shows the above-mentioned image.
I am wondering whether I should use cv2.addWeighted to overlay this image on the live video frame, or whether I should concatenate the frame and the image and then pass the result to cv2.imshow to display both in a single window.
Can this problem be considered as concatenating two videos of different frame rates, where one is a video with some fps and the other is a static image with 0 fps? Kindly suggest any approach that could help, or share any resource similar to this problem!
import cv2
import numpy as np

# extract faces from a live video stream
# and save them to a file
def extract_face(image):
    # face_cascade is a cascade classifier created elsewhere (not shown here)
    faces = face_cascade.detectMultiScale(image, 1.3, 5)
    face = None
    for (x, y, w, h) in faces:
        face = image[y:y+h, x:x+w]
    return face
vid = cv2.VideoCapture(1)
while True:
    ret, frame = vid.read()
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    faced = extract_face(frame)
    cv2.imwrite('face.jpg', faced)
    # horizontally concatenate the saved face and the frame
    # note: np.concatenate along axis=1 requires both arrays to have the same height
    hconcat = np.concatenate((frame, faced), axis=1)
    # display the video
    cv2.imshow('frame', hconcat)
This is a sample I am trying to implement.

OpenCv Project Guidance for tracking humans

I am currently making a program with OpenCv that detects 2 colours. My next step is to leave a "translucent" path of where both these colours have moved. The idea is that every time they cross over their trail it will get a shade darker.
Here is my current code:
# required libraries
import cv2
import numpy as np
# main function
def main():
    # returns vid from camera -- cameras are indexed (0 is the front camera, 1 is rear)
    cap = cv2.VideoCapture(0)
    # cap is opened if pc is receiving cam data
    if cap.isOpened():
        ret, frame = cap.read()
    else:
        ret = False
    while ret:
        ret, frame = cap.read()
        # setting color range
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        # BLUE color range
        blue_low = np.array([100, 50, 50])
        blue_high = np.array([140, 255, 255])
        # GREEN color range
        green_low = np.array([40, 40, 40])
        green_high = np.array([80, 255, 255])
        # creating masks
        blue_mask = cv2.inRange(hsv, blue_low, blue_high)
        green_mask = cv2.inRange(hsv, green_low, green_high)
        # combination of masks
        blue_green_mask = cv2.bitwise_xor(blue_mask, green_mask)
        blue_green_mask_colored = cv2.bitwise_and(blue_mask, green_mask, mask=blue_green_mask)
        # create the masked version (shows the background black and the specified color the color coming through cam)
        output = cv2.bitwise_and(frame, frame, mask=blue_green_mask)
        # create/open windows
        cv2.imshow("image mask", blue_green_mask)
        cv2.imshow("orig webcam feed", frame)
        cv2.imshow("color tracking", output)
        # if q is pressed the project breaks
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    # once broken the program will run remaining instructions (closing windows and stopping cam)
    cv2.destroyAllWindows()
    cap.release()
if __name__ == "__main__":
    main()
My question now is: how would I add the trail of where both colours have gone? I have also read that I will run into a problem once the trail is implemented, as insignificant objects may be detected as one of the colours and leave an unwanted trail, meaning I will need to find a way to trail only the largest object of each specified colour.
For further clarification:
I am using 2 black highlighters (one with a blue cap and one with a green cap).
With regard to the trail, I am referring to something similar to this:
[image: trail clarification]
This guy did an okay job at explaining it, but I was still very confused, which is why I came to Stack Overflow for help.
As for the trails, I would like them to be 'translucent' and not solid like in the picture above, so that if the object crosses over its path again, that section of the path becomes a shade darker.
Hope this helped :)

Background subtractor python opencv ( remove granulation )

Hello, I'm using MOG2 to build a background subtractor that compares a base frame with the following frames,
but it is showing me too much noise.
I'd like to know if there is another background subtractor that can eliminate these points.
I also have another problem.
When a car passes with its lights on, the lights show up as white in my image. I need to ignore the reflection of the lights on the ground.
Does anyone know how to do that?
My code for BGS:
backSub = cv2.createBackgroundSubtractorMOG2(history=1, varThreshold=150, detectShadows=True)
fgMask = backSub.apply(frame1)
fgMask2 = backSub.apply(actualframe)
maskedFrame = fgMask2 - fgMask
cv2.imshow("maskedFrame1 "+str(id), maskedFrame)
You can try to perform a Gaussian blur before sending the frame to backSub.apply() or experiment with the parameters for cv2.createBackgroundSubtractorMOG2(): if you need a better explanation of what they do, try this page.
This is the result from a 7x7 Gaussian blur using this video.
import cv2
import numpy as np
import sys
# read input video
cap = cv2.VideoCapture('traffic.mp4')
if (cap.isOpened()== False):
    print("!!! Failed to open video")
    sys.exit(-1)
# retrieve input video frame size
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)
print('* Input Video settings:', frame_width, 'x', frame_height, '#', fps)
# adjust output video size
frame_height = int(frame_height / 2)
print('* Output Video settings:', frame_width, 'x', frame_height, '#', fps)
# create output video
video_out = cv2.VideoWriter('traffic_out.mp4', cv2.VideoWriter_fourcc(*'MP4V'), fps, (frame_width, frame_height))
#video_out = cv2.VideoWriter('traffic_out.avi', cv2.VideoWriter_fourcc('M','J','P','G'), fps, (frame_width, frame_height), True)
# create MOG
backSub = cv2.createBackgroundSubtractorMOG2(history=5, varThreshold=60, detectShadows=True)
while (True):
    # retrieve frame from the video
    ret, frame = cap.read() # 3-channels
    if (frame is None):
        break
    # resize to 50% of its original size
    frame = cv2.resize(frame, None, fx=0.5, fy=0.5)
    # gaussian blur helps to remove noise
    blur = cv2.GaussianBlur(frame, (7,7), 0)
    #cv2.imshow('frame_blur', blur)
    # subtract background
    fgmask = backSub.apply(blur) # single channel
    #cv2.imshow('fgmask', fgmask)
    # concatenate both frames horizontally and write it as output
    fgmask_bgr = cv2.cvtColor(fgmask, cv2.COLOR_GRAY2BGR) # convert single channel image to 3-channels
    out_frame = cv2.hconcat([blur, fgmask_bgr])
    #print('output=', out_frame.shape) # shape=(360, 1280, 3)
    video_out.write(out_frame)
    cv2.imshow('output', out_frame)
    # quick pause to display the windows
    if (cv2.waitKey(1) == 27):
        break
# release resources
cap.release()
video_out.release()
cv2.destroyAllWindows()
You can use SuBSENSE: A Universal Change Detection Method With Local Adaptive Sensitivity https://ieeexplore.ieee.org/document/6975239.
BackgroundSubtractionSuBSENSE bgs(/*...*/);
for(/*all frames in the video*/) {
You can find the complete implementation at
Also, I don't know the scale of your work or your requirements, but Murari Mandal has compiled a very informative repository on GitHub listing resources related to background subtraction, which may help with the problems mentioned above.

cv2.imshow shows 9 screens instead of 1

I'm building some code to adaptively detect skin from webcam video. I have it almost working, however, when outputting the video, it shows 9 screens of the "skin" mask instead of just one. Seems like I'm just missing something simple, but I can't figure it out.
image shown here
Code below:
import cv2
import numpy as np

# first let's train the data
data, labels = ReadData()
classifier = TrainTree(data, labels)
# get the webcam. The input is either a video file or the camera number
# since using laptop webcam (only 1 cam), input is 0. A 2nd cam would be input 1
camera = cv2.VideoCapture(0)
while True:
    # reads in the current frame
    # .read() returns True if frame read correctly, and False otherwise
    ret, frame = camera.read() # frame.shape: (480,640,3)
    if ret:
        # reshape the frame to follow format of training data (rows*col, 3)
        data = np.reshape(frame, (frame.shape[0] * frame.shape[1], 3))
        bgr = np.reshape(data, (data.shape[0], 1, 3))
        hsv = cv2.cvtColor(np.uint8(bgr), cv2.COLOR_BGR2HSV)
        # once we have converted to HSV, we reshape back to original shape of (245057,3)
        data = np.reshape(hsv, (hsv.shape[0], 3))
        predictedLabels = classifier.predict(data)
        # the AND operator applies the skinMask to the image
        # predictedLabels consists of 1 (skin) and 2 (non-skin), needs to change to 0 (non-skin) and 255 (skin)
        predictedMask = (-(predictedLabels - 1) + 1) * 255 # predictedMask.shape: (307200,)
        # resize to match frame shape
        imgLabels = np.resize(predictedMask, (frame.shape[0], frame.shape[1], 3)) # imgLabels.shape: (480,640,3)
        # masks require 1 channel, not 3, so change from BGR to GRAYSCALE
        imgLabels = cv2.cvtColor(np.uint8(imgLabels), cv2.COLOR_BGR2GRAY) # imgLabels.shape: (480,640)
        # do bitwise AND to pull out skin pixels. All skin pixels are anded with 255 and all others are 0
        skin = cv2.bitwise_and(frame, frame, mask=imgLabels) # skin.shape: (480,640,3)
        # show the skin in the image along with the mask, show images side-by-side
        # **********THE BELOW LINE OUTPUTS 9 screens of the skin mask instead of just 1 ****************
        cv2.imshow("images", np.hstack([frame, skin]))
        # if the 'q' key is pressed, stop the loop
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
# release the video capture
camera.release()
cv2.destroyAllWindows()
You're working with bitmaps. To get an idea what they hold, cv2.imshow them individually. Then you're going to see (literally) where the data goes wrong.
Now, the culprit is most probably np.resize():
np.resize(a, new_shape)
Return a new array with the specified shape.
If the new array is larger than the original array, then the new array
is filled with repeated copies of a. Note that this behavior is
different from a.resize(new_shape) which fills with zeros instead of
repeated copies of a.
To scale a bitmap (=resize while striving to preserve the same visual image), use cv2.resize() as per OpenCV: Geometric Transformations of Images.
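Since the predicted mask in the question already holds exactly one value per pixel (307200 = 480 * 640), no scaling seems to be needed there at all; reshaping to (rows, cols) would avoid the repeated copies. A minimal sketch of that, assuming the label convention from the question (1 = skin, 2 = non-skin); the helper name labels_to_mask is made up for illustration:

```python
import numpy as np

def labels_to_mask(predicted_labels, height, width):
    """Turn a flat label vector (1 = skin, 2 = non-skin) into a (height, width) uint8 mask."""
    mask = np.where(predicted_labels == 1, 255, 0).astype(np.uint8)
    # reshape only reinterprets the existing values; it never repeats them
    return mask.reshape(height, width)

# np.resize, by contrast, repeats the data to fill a larger shape:
repeated = np.resize(np.arange(4), (2, 2, 3))  # 4 values stretched over 12 slots
```

The resulting single-channel mask can be passed directly as mask= to cv2.bitwise_and, so the conversion through a 3-channel image and cv2.cvtColor is not required.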

Reduced camera resolution but higher display window in python open cv

I am working on a project that requires face detection on a Raspberry Pi, using a USB camera. The frame rate was very slow, so I scaled down the capture resolution using VideoCapture.set(). This decreased the resolution to 320 x 214 as I set it and increased the capture frame rate considerably, but the feed is now displayed in a 320 x 214 window. I want to keep the same capture resolution but display the feed in a larger window. I am just a beginner in Python and OpenCV. Please help me do it. Below is the code I wrote for a simple camera feed.
import numpy as np
import cv2
import time
cap = cv2.VideoCapture(-1)
cap.set(3, 320) #width
cap.set(4, 216) #height
cap.set(5, 15) #frame rate
while True:
    ret, frame = cap.read()
    cv2.imshow("captured video", frame)
    if cv2.waitKey(33) == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
If I understand you correctly, you want the displayed image to be a scaled-up version of the original. If so, you just need cv2.resize:
display_scale = 4
height, width = frame.shape[0:2]
height_display, width_display = display_scale * height, display_scale * width
# you can choose different interpolation methods via the interpolation= argument
# (the default is cv2.INTER_LINEAR); note that dsize is given as (width, height)
frame_display = cv2.resize(frame, (width_display, height_display))
cv2.imshow("captured video", frame_display)
