cv2.imshow shows 9 screens instead of 1 - python

I'm building some code to adaptively detect skin from webcam video. I have it almost working, however, when outputting the video, it shows 9 screens of the "skin" mask instead of just one. Seems like I'm just missing something simple, but I can't figure it out.
[image: output window showing a 3×3 tiling of the skin mask]
Code below:
# first let's train the data
data, labels = ReadData()
classifier = TrainTree(data, labels)

# get the webcam. The input is either a video file or the camera number:
# since I'm using the laptop webcam (only 1 cam), the input is 0. A 2nd cam would be input 1.
camera = cv2.VideoCapture(0)

while True:
    # read in the current frame
    # .read() returns True if the frame was read correctly, and False otherwise
    ret, frame = camera.read()  # frame.shape: (480, 640, 3)

    if ret:
        # reshape the frame to follow the format of the training data (rows*cols, 3)
        data = np.reshape(frame, (frame.shape[0] * frame.shape[1], 3))
        bgr = np.reshape(data, (data.shape[0], 1, 3))
        hsv = cv2.cvtColor(np.uint8(bgr), cv2.COLOR_BGR2HSV)
        # once converted to HSV, reshape back to shape (rows*cols, 3)
        data = np.reshape(hsv, (hsv.shape[0], 3))
        predictedLabels = classifier.predict(data)

        # the AND operator applies the skin mask to the image
        # predictedLabels consists of 1 (skin) and 2 (non-skin); map to 255 (skin) and 0 (non-skin)
        predictedMask = (-(predictedLabels - 1) + 1) * 255  # predictedMask.shape: (307200,)

        # resize to match frame shape
        imgLabels = np.resize(predictedMask, (frame.shape[0], frame.shape[1], 3))  # imgLabels.shape: (480, 640, 3)

        # masks require 1 channel, not 3, so convert from BGR to grayscale
        imgLabels = cv2.cvtColor(np.uint8(imgLabels), cv2.COLOR_BGR2GRAY)  # imgLabels.shape: (480, 640)

        # bitwise AND pulls out the skin pixels: skin pixels are ANDed with 255 and all others with 0
        skin = cv2.bitwise_and(frame, frame, mask=imgLabels)  # skin.shape: (480, 640, 3)

        # show the skin in the image along with the mask, side by side
        # **********THE LINE BELOW OUTPUTS 9 screens of the skin mask instead of just 1****************
        cv2.imshow("images", np.hstack([frame, skin]))

        # if the 'q' key is pressed, stop the loop
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        break

# release the video capture
camera.release()
cv2.destroyAllWindows()

You're working with bitmaps. To get an idea of what they hold, cv2.imshow them individually; then you're going to see (literally) where the data goes wrong.
Now, the culprit is most probably np.resize():
np.resize(a, new_shape)
    Return a new array with the specified shape.
    If the new array is larger than the original array, then the new array is filled with repeated copies of a. Note that this behavior is different from a.resize(new_shape), which fills with zeros instead of repeated copies of a.
To scale a bitmap (=resize while striving to preserve the same visual image), use cv2.resize() as per OpenCV: Geometric Transformations of Images.
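In this case, though, no scaling is needed at all: predictedMask already holds exactly one value per pixel (480 × 640 = 307200 entries), so a plain reshape produces a valid single-channel mask directly. A minimal sketch of that fix, based on the shapes stated in the question:

# predictedMask has one entry per pixel, so reshape (not resize) it to
# the frame's height x width and use it as the 1-channel mask directly
imgLabels = np.uint8(predictedMask.reshape(frame.shape[0], frame.shape[1]))
skin = cv2.bitwise_and(frame, frame, mask=imgLabels)

np.resize, by contrast, had to repeat the 307200 mask values to fill 480 × 640 × 3 = 921600 slots, and after the grayscale conversion those repeats show up as a 3 × 3 grid of squeezed copies: the 9 screens.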

Related

Open CV imshow() - ARGB32 to OpenCV image

I am trying to process images from Unity3D's WebCamTexture graphics format (ARGB32) using OpenCV in Python, but I am having trouble interpreting the image on the OpenCV side. The image is all blue (possibly due to ARGB).
try:
    while True:
        data = sock.recv(480 * 640 * 4)
        if len(data) == 480 * 640 * 4:
            image = numpy.fromstring(data, numpy.uint8).reshape(480, 640, 4)
            # imageNoAlpha = image[:, :, 0:2]
            cv2.imshow('Image', image)  # further image processing happens here
            key = cv2.waitKey(1) & 0xFF
            if key == ord("q"):
                break
finally:
    sock.close()
The reason is the order of the channels. I think the sender reads the image as RGB and you show it as BGR, or vice versa.
Swapping the order of the R and B channels will solve the problem:
image = image[..., [0, 3, 2, 1]]  # swap channels 1 and 3 (R and B)
You will meet this problem frequently if you work with PIL.Image and OpenCV: PIL.Image reads images as RGB while cv2 reads them as BGR, which is why all the red points in your image become blue.
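As a quick illustration of that round trip (a sketch; 'photo.jpg' is just a placeholder filename):

import numpy as np
import cv2
from PIL import Image

pil_img = Image.open('photo.jpg')                            # PIL: RGB order
cv_img = cv2.cvtColor(np.array(pil_img), cv2.COLOR_RGB2BGR)  # OpenCV: BGR order
pil_again = Image.fromarray(cv2.cvtColor(cv_img, cv2.COLOR_BGR2RGB))  # back to PIL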
OpenCV uses BGR ordering (BGRA when an alpha channel is included) when working with color images [1][2]. This applies to images read and written with imread() and imwrite(), images acquired with VideoCapture, drawing functions such as ellipse() and rectangle(), and so on. The convention is self-consistent within the library: if you read an image with imread() and show it with imshow(), the correct colors will appear.
OpenCV is the only library I know of that uses this ordering; PIL and Matplotlib, for example, both use RGB. If you want to convert from one color space to another, use cvtColor(), for example:
# Convert RGB to BGR.
new_image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
See the ColorConversionCodes enum for all supported conversion pairs. Unfortunately there is no ARGB to BGR, but you can always manually manipulate the NumPy array anyway:
# Reverse channels ARGB to BGRA.
image_bgra = image[..., ::-1]
# Convert ARGB to BGR.
image_bgr = image[..., [3, 2, 1]]
There is also a mixChannels() function and a bunch of other array-manipulation utilities, but most of these are redundant in OpenCV Python: since images are backed by NumPy arrays, it's easier to just use the NumPy counterparts.
OpenCV uses BGR for seemingly historical reasons: Why OpenCV Using BGR Colour Space Instead of RGB.
References:
[1] OpenCV: Mat - The Basic Image Container (Search for 'BGR' under Storing methods.)
[2] OpenCV: How to scan images, lookup tables and time measurement with OpenCV
Image from [2] showing BGR layout in memory.
IMAGE_WIDTH = 640
IMAGE_HEIGHT = 480
IMAGE_SIZE = IMAGE_HEIGHT * IMAGE_WIDTH * 4

try:
    while True:
        data = sock.recv(IMAGE_SIZE)
        if len(data) == IMAGE_SIZE:
            # numpy.frombuffer avoids the deprecated numpy.fromstring
            image = numpy.frombuffer(data, numpy.uint8).reshape(IMAGE_HEIGHT, IMAGE_WIDTH, 4)
            imageDisp = cv2.cvtColor(image, cv2.COLOR_RGBA2BGR)
            cv2.imshow('Image', imageDisp)
            key = cv2.waitKey(1) & 0xFF
            if key == ord("q"):
                break
finally:
    sock.close()
Edited as per the suggestions from the comments.

How to save the previous frame in an array and compare it with the current frame in Python

I want to remove duplication of objects: when the camera opens, it captures the first frame and saves it to disk, and then it does not save another frame until the next object appears in the scene (it should not save the same frame consecutively).
I have written code to compare two consecutive webcam frames. I want to store one frame in an array (max limit 3) and compare it with the current frame, so the first frame is saved to disk and the comparison continues until the next object appears (a threshold value is used for this purpose).
How can I save a frame to an array and compare it with the current frame?
from skimage.metrics import structural_similarity
import imutils
import sys
import datetime
import cv2
import time
import numpy as np

cap = cv2.VideoCapture(0)

while True:
    # Capture two consecutive frames
    ret, frame1 = cap.read()  # first image
    time.sleep(1 / 50)        # slight delay
    ret, frame2 = cap.read()  # second image

    gray1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    gray2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

    # compute the Structural Similarity Index (SSIM) between the two
    # images, ensuring that the difference image is returned
    (score, diff) = structural_similarity(gray1, gray2, full=True)
    diff = (diff * 255).astype("uint8")
    print("SSIM: {}".format(score))

    # threshold the difference image, followed by finding contours to
    # obtain the regions of the two input images that differ
    thresh = cv2.threshold(diff, 0, 255,
                           cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

    if np.mean(thresh) < 0.4:
        print("New object Detected")
        date_string = datetime.datetime.now().strftime("%Y-%m-%d-%H:%M:%S")
        # NOTE: x, y, w, h are undefined here; they need to come from a
        # detected bounding box (e.g. cv2.boundingRect on a contour)
        cv2.imwrite('img/img-' + date_string + '.png', frame2[y:y + h + 30, x:x + w + 30])

    # Display the resulting frames
    cv2.imshow('frame1', frame1)
    cv2.imshow('frame2', frame2)
    cv2.imshow("Diff", diff)
    cv2.imshow("Thresh", thresh)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
Not hard. Each frame you read is already a NumPy array; its shape depends on the camera, e.g. (720, 1280, 3) for a 720p feed. To save it, try this:
...
from PIL import Image

ret, frame1 = cap.read()  # first image
print(frame1.shape)
rgb_frame1 = frame1[..., ::-1]  # reverse the channel order: BGR -> RGB
im = Image.fromarray(rgb_frame1)
im.save("your_file.jpeg")
time.sleep(1 / 50)  # slight delay
...
Note: you need to reverse the channel order as above or you will get a blue-tinted image, because OpenCV stores frames in BGR format while PIL expects RGB.
Then you can store the frame, for example in a small rolling buffer:
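A minimal sketch of one way to do this, assuming a rolling buffer of at most 3 grayscale frames is wanted (the 0.9 SSIM threshold is an arbitrary placeholder to tune):

from collections import deque

import cv2
from skimage.metrics import structural_similarity

buffer = deque(maxlen=3)  # keeps only the 3 most recent frames; the oldest drops out

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if buffer:
        # compare the current frame against the most recently stored one
        score, _ = structural_similarity(buffer[-1], gray, full=True)
        if score < 0.9:  # placeholder threshold: lower score = bigger change
            print("New object detected, SSIM:", score)
    buffer.append(gray)
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()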

Why do some pixel values cause stuttering during playback?

I use the following function to obtain video frames. I either pass noise_type=None to obtain the original frames, or pass 'salt and pepper' to overlay the frames with salt-and-pepper noise (randomly replacing some RGB pixels with (0, 0, 0) or (255, 255, 255)). This is passed alongside the probability that a pixel will be replaced with a black or white pixel (e.g. prob=0.1 to replace 10% of pixels).
Please note, I am using Python 3.7.9 and OpenCV 4.4.0. Also, as the videos are ultimately written alongside audio data using moviepy, they are kept in RGB space; so running this code and viewing the video will show the wrong colourspace, but you should still see that the video hangs during playback.
def get_video_frames(filename, noise_type=None, prob=None):
    all_frames = []
    video_capture = cv2.VideoCapture()
    if not video_capture.open(filename):
        print('Error: Cannot open video file {}'.format(filename))
        return
    fps = video_capture.get(cv2.CAP_PROP_FPS)
    print("fps: {}".format(fps))

    while True:
        has_frames, frame = video_capture.read()
        if not has_frames:
            video_capture.release()
            break

        if noise_type is None:
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frame = cv2.resize(frame, dsize=(224, 224))
        elif noise_type == 'salt and pepper':
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frame = cv2.resize(frame, dsize=(224, 224))
            row, col, ch = frame.shape
            s_vs_p = 0.5
            # number of salt (and, separately, pepper) pixels to set
            n_noisy = int(np.ceil(row * col * prob * s_vs_p))
            salty_x_coords = np.random.choice(row, n_noisy)
            salty_y_coords = np.random.choice(col, n_noisy)
            frame[salty_x_coords, salty_y_coords] = 255, 255, 255
            peppery_x_coords = np.random.choice(row, n_noisy)
            peppery_y_coords = np.random.choice(col, n_noisy)
            frame[peppery_x_coords, peppery_y_coords] = 0, 0, 0

        all_frames.append(frame)

    return all_frames, fps
The issue comes with playback, it seems. I generate clean frames and display them using opencv:
frames_clean, fps = get_video_frames('C:/some_video_file.mp4')
for f in frames_clean:
    cv2.imshow('clean', f)
    cv2.waitKey(33)
cv2.destroyAllWindows()
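(Side note: since get_video_frames already returns fps, the per-frame delay could be derived from it rather than hard-coded at 33 ms; a small sketch:

delay = max(1, int(1000 / fps))  # milliseconds per frame at the source rate
for f in frames_clean:
    cv2.imshow('clean', f)
    if cv2.waitKey(delay) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()
)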
Then I generate noisy frames and display them using opencv:
frames_noisy, fps = get_video_frames('C:/some_video_file.mp4', noise_type='salt and pepper', prob=0.1)
for f in frames_noisy:
    cv2.imshow('noisy', f)
    cv2.waitKey(33)
cv2.destroyAllWindows()
The noisy video hangs/pauses/stutters on some frames. It's really unusual, as both frames_clean and frames_noisy are lists of uint8 frames of the same shape; the only difference is that the noisy frames have some different pixel values. This behaviour is also present if I create a video clip from these frame lists using moviepy, write it to disk, and play it with VLC or Windows Media Player. After 2 days of scouring the internet, I can't find any explanation. I would like the noisy videos I generate to play with a stable display rate, just as the clean video does. Thanks for any help!

How to get each frame as an image from a cv2.VideoCapture in python

I want to get each frame from a video as an image. The background is the following: I have written a neural network which is able to recognize hand signs. Now I want to start a video stream, where each image/frame of the stream is put through the neural network. To fit it into my neural network, I want to render each frame and reduce the image to 28*28 pixels. In the end it should look similar to this: https://www.youtube.com/watch?v=JfSao30fMxY
I have searched the web and found out that I can use cv2.VideoCapture to get the stream. But how can I pick each frame as an image, process it, and print the result back on the screen? My code looks like this so far:
import numpy as np
import cv2
import tensorflow as tf  # needed for imageToLabel() below

cap = cv2.VideoCapture(0)
# Todo: each Frame/Image from the video should be saved as a variable and passed to imageToLabel()
# Todo: before the image is handed to the method, it needs to be translated into a 28*28 np Array
# Todo: the returned Label should be printed onto the video (otherwise it can be )
i = 0

while True:
    # Capture frame-by-frame
    # Load model once and pass it as a parameter
    ret, frame = cap.read()
    i += 1
    cv2.imwrite('database/{index}.png'.format(index=i), frame)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()


def imageToLabel(imgArr, checkpointLoad):
    new_model = tf.keras.models.load_model(checkpointLoad)
    imgArrNew = imgArr.reshape(1, 28, 28, 1) / 255
    prediction = new_model.predict(imgArrNew)
    label = np.argmax(prediction)
    return label
frame is the color image you get from the stream (in OpenCV's BGR channel order).
gray is the grayscale-converted image.
I suppose your network takes grayscale images because of its input shape. Therefore you need to first resize the image to (28, 28) and then pass it to your imageToLabel function:
resizedImg = cv2.resize(gray, (28, 28))
label = imageToLabel(resizedImg, yourModel)
Now that you know the prediction, you can draw it on the frame using e.g. cv2.putText() and then display the annotated frame instead of frame, as sketched below.
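A small sketch of that (position, font, and color are arbitrary choices):

# draw the predicted label near the top-left corner of the frame
cv2.putText(frame, str(label), (10, 30),
            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
cv2.imshow('frame', frame)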
edit:
If you want to use parts of the image for your network, you can slice the image like this:
slicedImg = gray[50:150, 50:150]
resizedImg = cv2.resize(slicedImg, (28, 28))
label = imageToLabel(resizedImg, yourModel)
If you're not that familiar with indexing in Python, you might want to take a look at this.
Also, if you want it to look like the linked video, you can draw a green (0, 255, 0) rectangle from e.g. (50, 50) to (150, 150):
cv2.rectangle(frame, (50, 50), (150, 150), (0, 255, 0))

How can I insert images into captured video in Python

I captured video using cv2.VideoCapture and displayed it. The captured video is displayed live, not saved. How can I insert an image into this captured video as it is displayed?
Assuming you want to add the image directly to the video frames at a certain (x, y) location, without any color blending or transparency, you can use the following Python code:
#!/usr/bin/python3
import cv2

# load the overlay image. size should be smaller than video frame size
img = cv2.imread('logo.png')

# get image dimensions
img_height, img_width, _ = img.shape

# start capture
cap = cv2.VideoCapture(0)

# get frame dimensions
frame_width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
frame_height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)

# print dimensions
print('image dimensions (HxW):', img_height, "x", img_width)
print('frame dimensions (HxW):', int(frame_height), "x", int(frame_width))

# Decide the X,Y location of the overlay image inside the video frame.
# The following must hold:
# * image dimensions must be smaller than frame dimensions
# * x + img_width  <= frame_width
# * y + img_height <= frame_height
# otherwise, resize the image as part of your code if required
x = 50
y = 50

while True:
    # capture frame-by-frame
    ret, frame = cap.read()

    # add the image to the frame
    frame[y:y + img_height, x:x + img_width] = img

    # display the resulting frame
    cv2.imshow('frame', frame)

    # exit if the ESC key is pressed
    if cv2.waitKey(20) & 0xFF == 27:
        break

# when everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
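If some transparency is wanted instead of a hard replacement, one option is blending the overlay into the region of interest with cv2.addWeighted. A sketch with an assumed 50/50 mix, meant to replace the direct assignment inside the loop:

# blend the overlay into the frame instead of overwriting the pixels
roi = frame[y:y + img_height, x:x + img_width]
frame[y:y + img_height, x:x + img_width] = cv2.addWeighted(roi, 0.5, img, 0.5, 0)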
Please give more details if my assumption was wrong.
