I am trying to view the output of an Omnivision OV7251 camera in OpenCV 4.2.0 with Python 3.5.6. The camera outputs 10-bit raw greyscale data, which I believe is right-aligned in 16-bit words.
When I use this OpenCV code:
import cv2

cam2 = cv2.VideoCapture(0)
cam2.set(3, 640)  # horizontal pixels
cam2.set(4, 480)  # vertical pixels

while True:
    b, frame = cam2.read()
    if b:
        cv2.imshow("Video", frame)
        k = cv2.waitKey(5)
        if k & 0xFF == 27:
            cam2.release()
            cv2.destroyAllWindows()
            break
This is the image I get (screenshot omitted):
Presumably OpenCV is applying the wrong conversion from 10-bit raw to RGB, treating the data as some kind of YUV format.
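For what it's worth, here is a quick diagnostic (a sketch; what it reports depends on the capture backend) that prints the FOURCC code OpenCV negotiated with the camera:

import cv2

cam2 = cv2.VideoCapture(0)
fourcc = int(cam2.get(cv2.CAP_PROP_FOURCC))
# Decode the four character code byte by byte
print("Negotiated FOURCC:", "".join(chr((fourcc >> (8 * i)) & 0xFF) for i in range(4)))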
Is there some way I can either:
Tell OpenCV the camera's correct data format so that it does the conversion properly?
Get hold of the raw camera data so that I can do the conversion manually?
One way to do this is to grab the raw camera data, then use numpy to correct it:
import cv2
import numpy as np

cam2 = cv2.VideoCapture(0)
cam2.set(3, 640)  # horizontal pixels
cam2.set(4, 480)  # vertical pixels
cam2.set(cv2.CAP_PROP_CONVERT_RGB, False)  # Request raw camera data

while True:
    b, frame = cam2.read()
    if b:
        frame_16 = frame.view(dtype=np.int16)   # reinterpret the data as 16-bit pixels
        frame_sh = np.right_shift(frame_16, 2)  # shift away the bottom 2 bits
        frame_8 = frame_sh.astype(np.uint8)     # keep the top 8 bits
        img = frame_8.reshape(480, 640)         # arrange them into a rectangle
        cv2.imshow("Video", img)
        k = cv2.waitKey(5)
        if k & 0xFF == 27:
            cam2.release()
            cv2.destroyAllWindows()
            break
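If you would rather keep the full 10-bit precision for display instead of truncating to 8 bits, a variant (a sketch, assuming the same right-aligned layout) is to left-shift the samples into the top of a 16-bit word; cv2.imshow scales a uint16 image's 0-65535 range down for display:

import cv2
import numpy as np

cam2 = cv2.VideoCapture(0)
cam2.set(3, 640)
cam2.set(4, 480)
cam2.set(cv2.CAP_PROP_CONVERT_RGB, False)

while True:
    b, frame = cam2.read()
    if b:
        frame_16 = frame.view(dtype=np.uint16)  # 10-bit sample in the low bits
        frame_16 = np.left_shift(frame_16, 6)   # move it to the high bits of the 16-bit word
        img = frame_16.reshape(480, 640)
        cv2.imshow("Video", img)                # imshow divides uint16 values by 256 for display
        if cv2.waitKey(5) & 0xFF == 27:
            cam2.release()
            cv2.destroyAllWindows()
            break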
I am trying to write a script to manipulate video from a webcam. I am doing this through OpenCV with Python, but I am running into some issues.
If I run the video capture stream with no pixel manipulation applied, the stream works fine and has a smooth frame rate. However, when I apply a threshold loop as a test, the stream lags badly and updates only once every few seconds. Is there any way to optimise this? Ideally, I am looking for a 30 fps stream with the video manipulation applied. Here is the code:
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
T = 100

while True:
    ret, frame = cap.read()
    height, width, channels = frame.shape
    for x in range(width):
        for y in range(height):
            if frame[y, x, 0] < T:
                frame[y, x] = 0
            else:
                frame[y, x] = 255
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Accessing pixels one by one is very bad practice in image processing, since it slows performance dramatically. Packages like OpenCV and NumPy optimise this with vectorised matrix operations, which makes your program much faster. Here is sample code that performs your task, but much faster.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
T = 100

while True:
    ret, frame = cap.read()
    height, width, channels = frame.shape
    B, G, R = cv2.split(frame)
    # for x in range(width):
    #     for y in range(height):
    #         if frame[y,x,0] < T:
    #             frame[y,x]=0
    #         else:
    #             frame[y,x]=255
    _, B = cv2.threshold(B, T, 255, cv2.THRESH_BINARY)
    frame = cv2.merge((B, G, R))
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
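Note that the original loop sets all three channels to 0 or 255 based on the blue channel, while the split/threshold version above only replaces the blue channel and keeps G and R untouched. A closer vectorised equivalent of the loop (a sketch using plain NumPy boolean indexing) is:

mask = frame[:, :, 0] < T  # boolean mask from the blue channel
frame[mask] = 0            # zero every channel where blue < T
frame[~mask] = 255         # saturate every channel elsewhere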
I want to get each frame from a video as an image. The background is as follows: I have written a neural network which is able to recognise hand signs. Now I want to start a video stream where each image/frame of the stream is put through the neural network. To fit my network, I want to reduce each frame to 28*28 pixels. In the end it should look similar to this: https://www.youtube.com/watch?v=JfSao30fMxY
I have searched the web and found that I can use cv2.VideoCapture to get the stream. But how can I pick each frame of the stream, process it, and print the result back on the screen? My code looks like this so far:
import numpy as np
import cv2
import tensorflow as tf  # needed by imageToLabel below

cap = cv2.VideoCapture(0)

# Todo: each frame/image from the video should be saved as a variable and passed to imageToLabel()
# Todo: before the image is handed to the method, it needs to be translated into a 28*28 np array
# Todo: the returned label should be printed onto the video

i = 0
while True:
    # Capture frame-by-frame
    # Load model once and pass it as a parameter
    ret, frame = cap.read()
    i += 1
    cv2.imwrite('database/{index}.png'.format(index=i), frame)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # was COLOR_BGR2BGRAY, which does not exist
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()

def imageToLabel(imgArr, checkpointLoad):
    new_model = tf.keras.models.load_model(checkpointLoad)
    imgArrNew = imgArr.reshape(1, 28, 28, 1) / 255
    prediction = new_model.predict(imgArrNew)
    label = np.argmax(prediction)
    return label
frame is the colour (BGR) image you get from the stream.
gray is the grayscale-converted image.
I suppose your network takes grayscale images because of its input shape. Therefore you need to first resize the image to (28,28) and then pass it to your imageToLabel function:
resizedImg = cv2.resize(gray,(28,28))
label = imageToLabel(resizedImg,yourModel)
Now that you have the prediction, you can draw it onto the frame using e.g. cv2.putText(), and then display the annotated frame instead of the original.
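A minimal sketch (position, font, and colour are illustrative):

cv2.putText(frame, str(label), (10, 30),
            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)  # draw label in green
cv2.imshow('frame', frame)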
edit:
If you want to use parts of the image for your network you can slice the image like this:
slicedImg = gray[50:150,50:150]
resizedImg = cv2.resize(slicedImg,(28,28))
label = imageToLabel(resizedImg,yourModel)
If you're not that familiar with indexing in Python, you might want to take a look at this.
Also, if you want it to look like the linked video, you can draw a green (0,255,0) rectangle from e.g. (50,50) to (150,150):
cv2.rectangle(frame,(50,50),(150,150),(0,255,0))
I have a conference call video with different people's tiles arranged on a grid.
Example: a Zoom gallery-view screenshot (image omitted).
Can I crop every video tile to a separate file using Python or Node.js?
Yes, you can achieve that using the OpenCV library:
1. Read the video in OpenCV using the VideoCapture API, noting the frame rate as you read.
2. Parse through each frame and crop it.
3. Write each cropped frame to a video using the OpenCV VideoWriter.
Here is example code, using (640,480) as the new dimensions:
import cv2

cap = cv2.VideoCapture('<video_file_name>')
fps = cap.get(cv2.CAP_PROP_FPS)
out = cv2.VideoWriter('<output video file name>', -1, fps, (640, 480))

x, y, w, h = 0, 0, 640, 480  # crop region: top-left corner plus width/height

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    crop_frame = frame[y:y + h, x:x + w]
    # write the cropped frame
    out.write(crop_frame)

# Release reader and writer after parsing all frames
cap.release()
out.release()
Here's the code (tested). It works by initialising a number of video outputs, then, for each frame of the input video, cropping each region of interest (roi) and writing it to the relevant output video. You might need to make tweaks depending on the input video dimensions, number of tiles, offsets, etc.
import numpy as np
import cv2

cap = cv2.VideoCapture('in.mp4')
ret, frame = cap.read()
(h, w, d) = np.shape(frame)
horiz_divisions = 5  # Number of tiles stacked horizontally
vert_divisions = 5   # Number of tiles stacked vertically
divisions = horiz_divisions * vert_divisions  # Total number of tiles
seg_h = int(h / vert_divisions)   # Tile height
seg_w = int(w / horiz_divisions)  # Tile width

# Initialise the output videos
outvideos = [0] * divisions
for i in range(divisions):
    outvideos[i] = cv2.VideoWriter('out{}.avi'.format(str(i)),
                                   cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'),
                                   10, (seg_w, seg_h))

# Main loop
while cap.isOpened():
    ret, frame = cap.read()
    if ret:
        vid = 0  # video counter
        for i in range(vert_divisions):
            for j in range(horiz_divisions):
                # Get the coordinates (top left corner) of the current tile
                row = i * seg_h
                col = j * seg_w
                roi = frame[row:row + seg_h, col:col + seg_w, 0:3]  # Copy the region of interest
                outvideos[vid].write(roi)
                vid += 1
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

# Release everything when the job is finished
cap.release()
for i in range(divisions):
    outvideos[i].release()
cv2.destroyAllWindows()
Hope this helps!
I have recently bought a stereo camera through Amazon and I want to use it for depth mapping. The problem is that the output I get from the camera is a single video containing the output of both sensors.
What I want is two separate outputs from the single USB port, if that is possible. I could use cropping, but I don't want to, because I am trying to reduce processing time and I want the outputs separately.
The above image was generated from the following code:
import numpy as np
import cv2

cam = cv2.VideoCapture(1)
cam.set(cv2.CAP_PROP_FPS, 120)
cam.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

while True:
    s, original = cam.read()
    cv2.imshow('original', original)
    if cv2.waitKey(1) & 0xFF == ord('w'):
        break

cam.release()
cv2.destroyAllWindows()
I have also tried other techniques such as:
import numpy as np
import cv2

left = cv2.VideoCapture(1)
right = cv2.VideoCapture(2)

left.set(cv2.CAP_PROP_FRAME_WIDTH, 720)
left.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
right.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
right.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
left.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
right.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))

while True:
    # Grab both frames first, then retrieve, to minimise latency between cameras
    left.grab()
    right.grab()
    _, leftFrame = left.retrieve()
    leftHeight, leftWidth = leftFrame.shape[:2]  # shape is (rows, cols) = (height, width)
    _, rightFrame = right.retrieve()
    rightHeight, rightWidth = rightFrame.shape[:2]

    # TODO: Calibrate the cameras and correct the images

    cv2.imshow('left', leftFrame)
    cv2.imshow('right', rightFrame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

left.release()
right.release()
cv2.destroyAllWindows()
but it does not recognise the 3rd camera; any help would be appreciated.
My OpenCV version is 3.4.
P.S. If anyone can present a solution in C++, that would also work for me.
OK, so after analysing the problem, I figured that the best way would be to crop the image in half, as it saves processing time. If you have two different image sources, your pipeline time for getting the images is doubled. After testing the stereo camera with and without cropping, I saw no noticeable change in FPS. Here is simple code that crops the video and displays it in two different windows.
import numpy as np
import cv2

cam = cv2.VideoCapture(1)
cam.set(cv2.CAP_PROP_FPS, 120)
cam.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

s, original = cam.read()
height, width, channels = original.shape
print(width)
print(height)

while True:
    s, original = cam.read()
    left = original[0:height, 0:int(width / 2)]
    right = original[0:height, int(width / 2):width]
    cv2.imshow('left', left)
    cv2.imshow('right', right)
    if cv2.waitKey(1) & 0xFF == ord('w'):
        break

cam.release()
cv2.destroyAllWindows()
What I need to do is fairly simple:
load a 5-frame video file
detect the background
On every frame, one by one:
subtract the background (create a foreground mask)
do some calculations on the foreground mask
save both the original frame and the foreground mask
Just to see the 5 frames and the 5 corresponding fgmasks:
import numpy as np
import cv2

cap = cv2.VideoCapture('test.avi')
fgbg = cv2.BackgroundSubtractorMOG()

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    fgmask = fgbg.apply(frame)
    # Display the fgmask frame
    cv2.imshow('fgmask', fgmask)
    # Display the original frame
    cv2.imshow('img', frame)
    k = cv2.waitKey(0) & 0xff
    if k == 5:
        break

cap.release()
cv2.destroyAllWindows()
Every frame gets opened and displayed correctly, but the displayed fgmask does not correspond to the displayed original frame. Somewhere in the process, the order of the fgmasks gets mixed up.
The background does get subtracted correctly, but I don't get the 5 expected fgmasks.
What am I missing? I feel like this should be straightforward: the while loop runs over the 5 frames of the video, and fgbg.apply applies the background subtraction to each frame.
The OpenCV version I use is opencv-2.4.9-3.
As bikz05 suggested, the running average method worked pretty well on my 5-image sets. Thanks for the tip!
import cv2
import numpy as np

c = cv2.VideoCapture('test.avi')
_, f = c.read()

avg1 = np.float32(f)
avg2 = np.float32(f)

# Loop over the remaining frames and estimate the background
for x in range(0, 4):
    _, f = c.read()
    cv2.accumulateWeighted(f, avg1, 1)
    cv2.accumulateWeighted(f, avg2, 0.01)
    res1 = cv2.convertScaleAbs(avg1)
    res2 = cv2.convertScaleAbs(avg2)
    cv2.imshow('img', f)
    cv2.imshow('avg1', res1)
    cv2.imshow('avg2', res2)
    k = cv2.waitKey(0) & 0xff
    if k == 5:
        break
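To get the foreground mask the question asked for, one option (a sketch; the threshold value of 30 is illustrative) is to diff each frame against the estimated background:

diff = cv2.absdiff(f, res2)  # per-pixel difference from the estimated background
gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
_, fgmask = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)
cv2.imshow('fgmask', fgmask)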