I am using OpenCV to compute a disparity map of a scene. I have already calibrated the stereo camera: first by finding the intrinsic parameters of each camera individually with cv2.calibrateCamera, and then by using cv2.stereoCalibrate to find the rotation matrix and the translation vector between them.
My calibration code is below, although I do not think the problem is there:
import numpy as np
import cv2
# termination criteria
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.000001)
# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
objp = np.zeros((6*7,3), np.float32)
objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)
# Arrays to store object points and image points from all the images.
objpointsL = []
imgpointsL = []
objpointsR = []
imgpointsR = []
imgR = cv2.imread('right2.jpg',0)
# Find the chess board corners
ret, cornersR = cv2.findChessboardCorners(imgR, (7,6),None)
# If found, add object points, image points (after refining them)
if ret == True:
imgL = cv2.imread('left3.jpg',0)
#grayL = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Find the chess board corners
ret, cornersL = cv2.findChessboardCorners(imgL, (7,6),None)
# If found, add object points, image points (after refining them)
if ret == True:
#Intrinsic parameters
distCoeffsR = np.array([1.191755160158372399e-02, -8.585146447098485067e-03, 8.429413399383550720e-04, -6.222973871460263460e-05, -7.966310474599684957e-03])
distCoeffsL = np.array([-1.627558337813042599e-02, 2.409982163230293128e-01, 4.443126374210568282e-03, 1.288079049351137243e-03, -3.177831292965794807e-01])
cameraMatrixR = np.matrix('3.252248978261580987e+02 0 3.269955537627058106e+02;0 3.228400384496266042e+02 2.341068611530280350e+02;0 0 1')
cameraMatrixL = np.matrix('4.570360097428241488e+02 0 3.465188967298854550e+02;0 4.573286269805292363e+02 2.691439570063795372e+02;0 0 1')
retval,cameraMatrixL, distCoeffsL, cameraMatrixR, distCoeffsR, R, T, E, F = cv2.stereoCalibrate(objpointsL, imgpointsL, imgpointsR, cameraMatrixL, distCoeffsL, cameraMatrixR, distCoeffsR, (640,480))
Now I call cv2.stereoRectify:
lFrame = cv2.imread('izquierda.jpg')
rFrame = cv2.imread('derecha.jpg')
w, h = lFrame.shape[:2] # both frames should be of same shape
#Perform stereorectification
R1, R2, P1, P2, Q, validPixROI1, validPixROI2 = cv2.stereoRectify(cameraMatrixL, distCoeffsL, cameraMatrixR, distCoeffsR, (w,h), R, T, cv2.CALIB_ZERO_DISPARITY,0, (0,0))
#computes undistort and rectify maps
mapxL, mapyL = cv2.initUndistortRectifyMap(cameraMatrixL, distCoeffsL, R1, P1, (w,h), cv2.CV_32FC1)
mapxR, mapyR = cv2.initUndistortRectifyMap(cameraMatrixR, distCoeffsR, R2, P2, (w,h), cv2.CV_32FC1)
dstL = cv2.remap(lFrame, mapxL, mapyL,cv2.INTER_LINEAR)
dstR = cv2.remap(rFrame, mapxR, mapyR,cv2.INTER_LINEAR)
while (True):
    cv2.imshow('Left normal',lFrame)
    cv2.imshow('Right normal',rFrame)
    cv2.imshow('Left rectify',dstL)
    cv2.imshow('Right rectify',dstR)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
And these are the rectified images:
Left rectify
Right rectify
Try to switch height and width, namely change this line of your code:
w, h = lFrame.shape[:2] # both frames should be of same shape
to this:
h, w = lFrame.shape[:2] # both frames should be of same shape
I ran into the same problem and this fixed it. The reason is that a NumPy image array has shape (rows, columns), i.e. (height, width), whereas OpenCV functions such as cv2.stereoRectify and cv2.initUndistortRectifyMap expect the image size as (width, height).
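To make the ordering explicit, here is a minimal sketch using only NumPy. The helper name image_size is my own and not part of OpenCV, and the blank array merely stands in for a frame loaded with cv2.imread.

```python
import numpy as np

def image_size(frame):
    """Return (width, height) for an image stored as a NumPy array."""
    h, w = frame.shape[:2]  # NumPy order: rows (height) first, columns (width) second
    return (w, h)

# A blank 640x480 colour frame stands in for an image read from disk.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
print(image_size(frame))
```

The returned tuple is what you would pass wherever OpenCV asks for an image size, instead of unpacking shape directly into w, h.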
I was facing a similar problem. I compared all of my matrices (obtained from Python) with the matrices obtained from Matlab's stereo calibration app, and found that the cause was an incorrect calibration: my camera distortion coefficients were wrong, which caused this error. You can read my full answer here: Python 2.7/OpenCV 3.3: Error in cv2.initUndistortRectifyMap . Not showing undistort rectified images
For testing, I generate a grid image as a matrix and the same grid points as a point array:
This represents a "distorted" camera image along with some feature points.
When I now undistort both the image and the grid points, I get the following result:
(Note that the fact that the "distorted" image is straight and the "undistorted" image is morphed is not the point, I'm just testing the undistortion functions with a straight test image.)
The grid image and the red grid points are totally misaligned now. I googled and found that some people forget to specify the "new camera matrix" parameter in undistortPoints but I didn't. The documentation also mentions a normalization but I still have the problem when I use the identity matrix as camera matrix. Also, in the central region it fits perfectly.
Why are the two results not identical? Am I using something in the wrong way?
I use cv2 (4.1.0) in Python. Here is the code for testing:
import numpy as np
import matplotlib.pyplot as plt
import cv2
w = 401
h = 301
# helpers
def plotImageAndPoints(im, pu, pv):
    plt.imshow(im, cmap="gray")
    plt.scatter(pu, pv, c="red", s=16)
    plt.xlim(0, w)
    plt.ylim(0, h)
def cv2_undistortPoints(uSrc, vSrc, cameraMatrix, distCoeffs):
    uvSrc = np.array([np.matrix([uSrc, vSrc]).transpose()], dtype="float32")
    uvDst = cv2.undistortPoints(uvSrc, cameraMatrix, distCoeffs, None, cameraMatrix)
    uDst = [uv[0] for uv in uvDst[0]]
    vDst = [uv[1] for uv in uvDst[0]]
    return uDst, vDst
# test data
# generate grid image
img = np.ones((h, w), dtype = "float32")
img[0::20, :] = 0
img[:, 0::20] = 0
# generate grid points
uPoints, vPoints = np.meshgrid(range(0, w, 20), range(0, h, 20), indexing='xy')
uPoints = uPoints.flatten()
vPoints = vPoints.flatten()
# see if points align with the image
plotImageAndPoints(img, uPoints, vPoints) # perfect!
# undistort both image and points individually
# camera matrix parameters
fx = 1
fy = 1
cx = w/2
cy = h/2
# distortion parameters
k1 = 0.00003
k2 = 0
p1 = 0
p2 = 0
# convert for opencv
mtx = np.matrix([
    [fx, 0, cx],
    [ 0, fy, cy],
    [ 0, 0, 1]
], dtype = "float32")
dist = np.array([k1, k2, p1, p2], dtype = "float32")
# undistort image
imgUndist = cv2.undistort(img, mtx, dist)
# undistort points
uPointsUndist, vPointsUndist = cv2_undistortPoints(uPoints, vPoints, mtx, dist)
# test if they still match
plotImageAndPoints(imgUndist, uPointsUndist, vPointsUndist) # awful!
A bit late to the party, but to help others running into this issue:
The problem is that undistortPoints is an iterative calculation which in some cases exits before a stable solution has been reached. This can be fixed by changing the termination criteria for the calculation, which can be done by using undistortPointsIter. You should replace:
uvDst = cv2.undistortPoints(uvSrc, cameraMatrix, distCoeffs, None, cameraMatrix)
with:
uvDst = cv2.undistortPointsIter(uvSrc, cameraMatrix, distCoeffs, None, cameraMatrix,(cv2.TERM_CRITERIA_COUNT | cv2.TERM_CRITERIA_EPS, 40, 0.03))
Now it tries up to 40 iterations to find a solution, rather than the default 5 iterations.
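To see why the iteration count matters, here is a small pure-Python sketch of a fixed-point inversion for a single radial coefficient. It is a simplified model of this kind of iterative undistortion, not OpenCV's actual implementation; the function names and the value of k1 are my own choices for illustration.

```python
def distort(x, y, k1):
    """Apply a one-coefficient radial distortion to a normalised point."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2
    return x * factor, y * factor

def undistort_fixed_point(xd, yd, k1, max_iter):
    """Invert distort() by fixed-point iteration, starting from the distorted point."""
    x, y = xd, yd
    for _ in range(max_iter):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2
        x, y = xd / factor, yd / factor
    return x, y

k1 = 0.5                          # illustrative value, not the one from the question
xd, yd = distort(1.0, 0.0, k1)    # the true undistorted point is (1, 0)
for n in (5, 40):
    x, _ = undistort_fixed_point(xd, yd, k1, n)
    print(n, abs(x - 1.0))        # remaining error after n iterations
```

With only a few iterations the estimate is still visibly off for strongly distorted points, which matches the misalignment seen away from the image centre.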
def run(self):
    while True:
        _ret, frame = self.cam.read()
        frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        vis = frame.copy()
        if len(self.tracks) > 0:
            img0, img1 = self.prev_gray, frame_gray
            p0 = np.float32([tr[-1] for tr in self.tracks]).reshape(-1, 1, 2)
            p1, _st, _err = cv2.calcOpticalFlowPyrLK(img0, img1, p0, None, **lk_params)
            p0r, _st, _err = cv2.calcOpticalFlowPyrLK(img1, img0, p1, None, **lk_params)
            d = abs(p0-p0r).reshape(-1, 2).max(-1)
            good = d < 1
            new_tracks = []
            for i in range(len(p1)):
                A.append(math.sqrt((p1[i][0][0])**2 + (p1[i][0][1])**2))
            counts,bins,bars = plt.hist(A)
            for tr, (x, y), good_flag in zip(self.tracks, p1.reshape(-1, 2), good):
                if not good_flag:
                    continue  # skip tracks that fail the forward-backward check (as in lk_track.py)
                tr.append((x, y))
                if len(tr) > self.track_len:
                    del tr[0]
                new_tracks.append(tr)
                cv2.circle(vis, (x, y), 2, (0, 255, 0), -1)
            self.tracks = new_tracks
            cv2.polylines(vis, [np.int32(tr) for tr in self.tracks], False, (0, 255, 0))
            draw_str(vis, (20, 20), 'track count: %d' % len(self.tracks))
        if self.frame_idx % self.detect_interval == 0:
            mask = np.zeros_like(frame_gray)
            mask[:] = 255
            for x, y in [np.int32(tr[-1]) for tr in self.tracks]:
                cv2.circle(mask, (x, y), 5, 0, -1)
            p = cv2.goodFeaturesToTrack(frame_gray, mask = mask, **feature_params)
            if p is not None:
                for x, y in np.float32(p).reshape(-1, 2):
                    self.tracks.append([(x, y)])
        self.frame_idx += 1
        self.prev_gray = frame_gray
        cv2.imshow('lk_track', vis)
        ch = cv2.waitKey(1)
        if ch == 27:
            break
I am using lk_track.py from the OpenCV samples to try to detect a moving object. I am trying to find the camera motion using the histogram of the magnitudes of the optical flow vectors, and then calculate the average of the similar values, which should be directly proportional to the camera motion. I have calculated the magnitudes of the vectors and saved them in a list A. Can someone suggest how to find the most frequent similar values in it and calculate the average of only those values?
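To make the question concrete, this is a minimal sketch of what I mean by "average of the most similar values": take the most populated histogram bin and average only the magnitudes that fall into it. The function name dominant_bin_mean and the bin count are placeholders of mine, and the sample list is made up.

```python
import numpy as np

def dominant_bin_mean(magnitudes, bins=10):
    """Mean of the values that fall in the most populated histogram bin."""
    values = np.asarray(magnitudes, dtype=np.float64)
    counts, edges = np.histogram(values, bins=bins)
    k = int(np.argmax(counts))                 # index of the fullest bin
    lo, hi = edges[k], edges[k + 1]
    if k == len(counts) - 1:
        in_bin = (values >= lo) & (values <= hi)   # last bin includes its right edge
    else:
        in_bin = (values >= lo) & (values < hi)
    return float(values[in_bin].mean())

A = [1.0, 1.1, 0.9, 1.05, 5.0, 9.0]   # made-up magnitudes: four similar values plus two outliers
print(dominant_bin_mean(A))
```

The result depends on the bin width, so the number of bins would have to be tuned to the expected spread of the camera motion.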
I created a toy problem to model the approach of binarizing the images by optical flow. This is a massively simplified view of the problem, but gives the general idea well. I'll split the problem up into a few chunks and give functions for them. If you're working directly with video, there will be a lot of additional code needed of course, and I just hardcoded a lot of values that you'll need to turn into parameters.
The first function just generates the image sequence. The view translates across a scene, while an object is pasted at the same position in every frame. The object therefore appears stationary in the sequence while the background moves, which means the object is actually moving relative to the scene.
import numpy as np
import cv2
def gen_seq():
    """Generate motion sequence with an object"""
    scene = cv2.GaussianBlur(np.uint8(255*np.random.rand(400, 500)), (21, 21), 3)
    h, w = 400, 400
    step = 4
    obj_mask = np.zeros((h, w), bool)  # plain bool: the np.bool alias was removed in NumPy 1.24
    obj_h, obj_w = 50, 50
    obj_x, obj_y = 175, 175
    obj_mask[obj_y:obj_y+obj_h, obj_x:obj_x+obj_w] = True
    obj_data = np.uint8(255*np.random.rand(obj_h, obj_w)).ravel()
    imgs = []
    for i in range(0, 1+w//step, step):
        img = scene[:, i:i+w].copy()
        img[obj_mask] = obj_data
        imgs.append(img)
    return imgs
# generate image sequence
imgs = gen_seq()
# display images
for img in imgs:
    cv2.imshow('Image', img)
    k = cv2.waitKey(100) & 0xFF
    if k == ord('q'):
        break
So here's the basic image sequence visualized. I just used a random scene, translated through, and added a random object in the center.
Great! Now we need to calculate the flow between each frame. I used dense flow here, but sparse flow would be more robust for actual images.
def find_flows(imgs):
    """Finds the dense optical flows"""
    optflow_params = [0.5, 3, 15, 3, 5, 1.2, 0]
    prev = imgs[0]
    flows = []
    for img in imgs[1:]:
        flow = cv2.calcOpticalFlowFarneback(prev, img, None, *optflow_params)
        flows.append(flow)
        prev = img
    return flows
# find optical flows between images
flows = find_flows(imgs)
# display flows
h, w = imgs[0].shape[:2]
hsv = np.zeros((h, w, 3), dtype=np.uint8)
hsv[..., 1] = 255
for flow in flows:
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang*180/np.pi/2
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    cv2.imshow('Flow', rgb)
    k = cv2.waitKey(100) & 0xFF
    if k == ord('q'):
        break
Here I colorized the flow based on its angle and magnitude. The angle determines the color and the magnitude determines the intensity/brightness of the color. This is the same view the OpenCV tutorial on dense optical flow uses.
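For anyone who wants to see the angle-to-hue and magnitude-to-brightness mapping without OpenCV, here is a rough NumPy-only sketch. The function name flow_to_hsv is mine; it assumes OpenCV's 8-bit convention of hue in the range 0 to 179, and it only builds the HSV array (converting to BGR for display would still need cv2.cvtColor).

```python
import numpy as np

def flow_to_hsv(flow):
    """Map a dense flow field of shape (h, w, 2) to an 8-bit HSV image."""
    fx, fy = flow[..., 0], flow[..., 1]
    mag = np.hypot(fx, fy)                         # vector length per pixel
    ang = np.mod(np.arctan2(fy, fx), 2 * np.pi)    # direction in [0, 2*pi)
    hsv = np.zeros(flow.shape[:2] + (3,), dtype=np.uint8)
    hsv[..., 0] = np.mod(np.round(np.degrees(ang) / 2), 180)  # hue from direction
    hsv[..., 1] = 255                              # full saturation
    span = mag.max() - mag.min()
    if span > 0:                                   # min-max normalise brightness to 0..255
        hsv[..., 2] = np.round(255 * (mag - mag.min()) / span)
    return hsv

flow = np.zeros((1, 2, 2), dtype=np.float32)
flow[0, 0] = (1, 0)   # moving right, slow
flow[0, 1] = (0, 2)   # moving along +y, fast
print(flow_to_hsv(flow))
```

Because the brightness is normalised per frame, the same color intensity can mean different speeds in different frames.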
Then, we need to binarize this flow so that we get two distinct sets of pixels based on how they're moving. In the sparse case, this works out the same except you will get two distinct sets of features.
def label_flows(flows):
    """Binarizes the flows by direction and magnitude"""
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    flags = cv2.KMEANS_RANDOM_CENTERS  # assumed: flags was left undefined in the post
    h, w = flows[0].shape[:2]
    labeled_flows = []
    for flow in flows:
        flow = flow.reshape(h*w, -1)
        comp, labels, centers = cv2.kmeans(flow, 2, None, criteria, 10, flags)
        n = np.sum(labels == 1)
        camera_motion_label = np.argmax([labels.size-n, n])
        labeled = np.uint8(255*(labels.reshape(h, w) == camera_motion_label))
        labeled_flows.append(labeled)
    return labeled_flows
# binarize the flows
labeled_flows = label_flows(flows)
# display binarized flows
for labeled_flow in labeled_flows:
    cv2.imshow('Labeled Flow', labeled_flow)
    k = cv2.waitKey(100) & 0xFF
    if k == ord('q'):
        break
cv2.destroyWindow('Labeled Flow')
The annoying thing here is that the labels are assigned arbitrarily, i.e. they can differ from frame to frame. If you visualized the raw binary image, it would flip between black and white randomly. I'm only using binary labels, 0 and 1, so I treated the label assigned to more pixels as the "camera motion label", set that label to white in the resulting images and set the other label to black. That way the camera motion label is always the same in each frame. This may need to be much more sophisticated when working on a video feed.
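The majority rule on its own can be sketched with plain NumPy as below. The function name majority_mask is made up for this illustration, and it assumes exactly two labels, 0 and 1, as in the k-means call.

```python
import numpy as np

def majority_mask(labels, shape):
    """Return a uint8 image that is 255 wherever the more frequent label occurs."""
    labels = np.asarray(labels).ravel()
    n_ones = int(np.sum(labels == 1))
    majority = 1 if n_ones > labels.size - n_ones else 0
    return np.where(labels.reshape(shape) == majority, 255, 0).astype(np.uint8)

# The same segmentation with the two label names swapped gives the same image.
print(majority_mask([0, 0, 0, 1], (2, 2)))
print(majority_mask([1, 1, 1, 0], (2, 2)))
```

This only stays stable while the camera motion really covers more pixels than the object; a large object close to the camera would break the assumption.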
But here we have it, a binarized flow where the color is just showing the two distinct sets of flow vectors.
Now if we wanted to find the target in this flow, we could invert the image and find the connected components of the binary image. The inversion will make the camera motion the background label (0). Then each of the black blobs will be white and will be labeled, and we could find the blob relating to the largest component which, in this case, will be the target. That will give a mask around the target, and we can draw the contours of that mask on the original images to see the target being detected. I'll also cut the borders of the image off before finding the connected components so edge effects from dense flow are ignored.
def find_target_in_labeled_flow(labeled_flow):
    labeled_flow = cv2.bitwise_not(labeled_flow)
    bw = 10
    h, w = labeled_flow.shape[:2]
    border_cut = labeled_flow[bw:h-bw, bw:w-bw]
    conncomp, stats = cv2.connectedComponentsWithStats(border_cut, connectivity=8)[1:3]
    target_label = np.argmax(stats[1:, cv2.CC_STAT_AREA]) + 1
    img = np.zeros_like(labeled_flow)
    img[bw:h-bw, bw:w-bw] = 255*(conncomp == target_label)
    return img
for labeled_flow, img in zip(labeled_flows, imgs[:-1]):
    target_mask = find_target_in_labeled_flow(labeled_flow)
    display_img = cv2.merge([img, img, img])
    contours = cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[1]
    display_img = cv2.drawContours(display_img, contours, -1, (0, 255, 0), 2)
    cv2.imshow('Detected Target', display_img)
    k = cv2.waitKey(100) & 0xFF
    if k == ord('q'):
        break
And of course this could get some cleaning up, and you won't be doing exactly this for sparse flow. You could just define a region of interest around the tracked points.
Now, there is still a lot of work to do. You have a binarized flow, and you can probably safely assume (like I did) that the label which occurs most frequently is the camera motion. However, you'll have to make sure that the other label is the object you're interested in tracking. You'll have to keep track of it between flows so that if it stops moving, you'll know where it is as the camera is moving. When you do the k-means step, you'll want to check whether the centers from k-means are "far enough" apart, so that you know whether the object is moving or not.
The basic steps for that would be, from the starting frame of the video:
1. If the two centers are "close", then you can assume your object is either not in the scene or not moving in the scene.
2. Once the centers are split far enough apart, you'll have found the object to track. Keep track of the location of the object.
3. During tracking of the object, verify the location is near a prediction. You can use the optical flow velocity vectors from the previous frame to predict the location of each pixel/feature in the new frame, so make sure your predictions agree with your tracking result.
4. If the object stops moving, the centers from k-means should be close. Keep track of the optical flow vectors around the object location and follow them to have a prediction of where the object is again once it resumes moving, and again verify the detected location with this prediction.
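The "close" versus "far enough apart" test on the two k-means centers could look roughly like this. The function name and the threshold min_dist (in pixels per frame) are assumptions of mine and would need tuning on real footage.

```python
import numpy as np

def centers_are_separated(centers, min_dist=1.0):
    """True if the two flow cluster centres differ by at least min_dist."""
    c = np.asarray(centers, dtype=np.float64)
    return bool(np.linalg.norm(c[0] - c[1]) >= min_dist)

# Camera flow of about 4 px/frame versus a stationary-looking object: clearly split.
print(centers_are_separated([[4.0, 0.0], [0.0, 0.0]]))
# Two nearly identical centres: treat as "no moving object found".
print(centers_are_separated([[4.0, 0.0], [3.9, 0.1]]))
```

Here centers is the third value returned by cv2.kmeans when it is run with two clusters on the flow vectors.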
I've never used these methods before so I'm not sure how robust they are. The typical approach for HOOF or "Histogram of oriented optical flow" is much more advanced than this (see the seminal paper here). Instead of just binarizing, the idea is to use histograms from each frame as a probability distribution, and the way this probability distribution changes over time can be analyzed with the tools from time series analysis, which I assume give a more robust framework to this approach.
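When running @alkasm's answer, you may get the following error:
(-215:Assertion failed) npoints > 0 in function 'drawContours'
cv2.findContours returns (image, contours, hierarchy) in OpenCV 3.x but only (contours, hierarchy) in OpenCV 4.x, so on 4.x the index [1] picks the hierarchy instead of the contours. To avoid the error, simply replace:
contours = cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[1]
with:
contours, _ = cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)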
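If the code has to run on both OpenCV 3.x (three return values) and 4.x (two return values), one common workaround is to index from the end of the returned tuple. A small sketch, where the helper name contours_from_result is my own and the tuples below are stand-ins for real return values:

```python
def contours_from_result(result):
    """Pick the contour list out of whatever cv2.findContours returned.

    The contours are the second-to-last element in both the
    two-value and the three-value return layout.
    """
    return result[-2]

# Stand-ins for the two return layouts (strings instead of real arrays).
print(contours_from_result(("image", ["c0", "c1"], "hierarchy")))
print(contours_from_result((["c0", "c1"], "hierarchy")))
```

In the answer's loop this would be used as contours = contours_from_result(cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)).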
I'm working through the optical flow tutorial of OpenCV using Python 2.7 with OpenCV 3.1.0 and have a question concerning the use of cv2.line(). Here is the original code with the part of interest highlighted:
import numpy as np
import cv2
cap = cv2.VideoCapture('slow.flv')
# params for ShiTomasi corner detection
feature_params = dict( maxCorners = 100,
qualityLevel = 0.3,
minDistance = 7,
blockSize = 7 )
# Parameters for lucas kanade optical flow
lk_params = dict( winSize = (15,15),
maxLevel = 2,
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))
# Create some random colors
color = np.random.randint(0,255,(100,3))
# Take first frame and find corners in it
ret, old_frame = cap.read()
old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)
p0 = cv2.goodFeaturesToTrack(old_gray, mask = None, **feature_params)
# Create a mask image for drawing purposes
mask = np.zeros_like(old_frame)
while(1):
    ret,frame = cap.read()
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # calculate optical flow
    p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)
    # Select good points
    good_new = p1[st==1]
    good_old = p0[st==1]
    ################## IMPORTANT ##################
    # draw the tracks
    for i,(new,old) in enumerate(zip(good_new,good_old)):
        a,b = new.ravel()
        c,d = old.ravel()
        mask = cv2.line(mask, (a,b),(c,d), color[i].tolist(), 2)
        frame = cv2.circle(frame,(a,b),5,color[i].tolist(),-1)
    ################## IMPORTANT ##################
    ########### START insert code below ###########
    # Mean-vector of camera movement
    ############ END insert code below ############
    img = cv2.add(frame,mask)
    cv2.imshow('frame',img)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
    # Now update the previous frame and previous points
    old_gray = frame_gray.copy()
    p0 = good_new.reshape(-1,1,2)
In my workspace the variables a, b, c and d are shown as array scalars of type float32. So I would assume that they need to be converted to tuples of int in order to execute cv2.line() or cv2.circle().
When I try to add code using cv2.line() I have to use a conversion to int (see below), otherwise I receive a very clear message: TypeError: integer argument expected, got float
###################### START added code
ofvec = p1 - p0
ofvec = np.mean(ofvec, 1) # Collapse the first dimension
ofvec_cam = np.mean(ofvec,0) # mean of camera movement
height, width = old_frame.shape[:2]
x0 = np.int(width/2)
y0 = np.int(height/2)
pt_center = (x0, y0)
x = np.int( x0 - ofvec_cam[0].tolist() )
y = np.int( y0 - ofvec_cam[1].tolist() )
pt_ofvec_cam = (x, y)
frame = cv2.line(frame, pt_center, pt_ofvec_cam, [0, 0, 255], 2)
###################### END added code
Can anyone explain this difference to me? Thanks in advance and have a nice day!
It seems that cv2.line() treats two types of floats differently: "standard" Python floats and NumPy floats. See this minimal working example using Python 2.7 with OpenCV 3.1.0:
import numpy as np, cv2
mask = np.zeros([10, 20, 3], dtype=np.uint8)
color = [0, 0, 0]
# Using Numpy
a = np.float32(12.34)
mask = cv2.line(mask, (a,a), (a,a), color)
# Using standard Python data type
b = 12.34
mask = cv2.line(mask, (b,b), (b,b), color)
In case a the command executes without a hitch; in case b we get the above-mentioned error:
in <module> mask = cv2.line(mask, (b,b), (b,b), color)
TypeError: integer argument expected, got float
Concerning the initial question I confirm that in the OpenCV tutorial the variables a, b, c and d are all numpy-floats whereas in the added code the variables x and y are standard Python floats before they are converted to numpy-ints by np.int().
Both data types provide a method __int__() which returns the int value of the float (see also difference between native int type and the numpy int types).
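Whatever the binding does internally, converting explicitly before drawing avoids depending on this behaviour. A minimal sketch, where the helper name to_int_point is my own:

```python
import numpy as np

def to_int_point(pt):
    """Convert an (x, y) pair of Python or NumPy numbers to a tuple of Python ints."""
    return tuple(int(round(float(v))) for v in pt)

a = np.float32(12.34)   # NumPy float, as produced by new.ravel() in the tutorial
b = 12.34               # standard Python float
print(to_int_point((a, b)))
```

The result can be passed to cv2.line() or cv2.circle() regardless of which float type the coordinates started as.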
The only reference to speak of that I have found is this note concerning the method fromarray in the documentation of OpenCV 2.4.13:
Note In the new Python wrappers (cv2 module) the function is not needed, since cv2 can process Numpy arrays (and this is the only supported array type).
In the docs of OpenCV 3.1.0 the method fromarray does not exist anymore.
I am trying to do camera calibration with OpenCV 2.4.9 on Windows 8.1 (switching to Ubuntu doesn't resolve the problem).
Problem: I am using the code below to calibrate my camera, but it seems that if the number of sample images (with the checkerboard pattern) is more than 2, then the roi returned by newcameramtx,roi=cv2.getOptimalNewCameraMatrix(mtx,dist,(w,h),1,(w,h)) is [0,0,0,0]. How is the number of samples connected to that result? (Earlier, before making some changes to this code, the maximum number of samples was 12.)
By "maximum number of samples" I mean the number of images of the chessboard pattern acquired from my camera beyond which the roi no longer gives a good result.
The corner detection works very well. You can find my sample images here.
# -*- coding: utf-8 -*-
# Created on Fri May 16 15:23:00 2014
#author: kakarot
import numpy as np
import cv2
#import os
#import time
from matplotlib import pyplot as plt
LeftorRight = 'L'
numer = 12
chx = 6
chy = 9
chd = 25
# termination criteria
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, numer, 0.001)
# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
objp = np.zeros((chy*chx,3), np.float32)
objp[:,:2] = np.mgrid[0:chy,0:chx].T.reshape(-1,2)
# Arrays to store object points and image points from all the images.
objpoints = [] # 3d point in real world space, (x25mm)
imgpoints = [] # 2d points in image plane.
enum = 1
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Find the chess board corners
ret, corners = cv2.findChessboardCorners(gray, (chy,chx),None)
# If found, add object points, image points (after refining them)
if ret == True and enum <= numer:
# Draw and display the corners
cv2.drawChessboardCorners(img, (chy,chx),corners,ret)
print enum
if enum == numer:
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1],None,None)
img = cv2.imread('1280x720p/BestAsPerMatlab/calib_'+str(LeftorRight)+'7.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY)
h, w = img.shape[:2] #a (1 to see the whole picture)
newcameramtx, roi=cv2.getOptimalNewCameraMatrix(mtx,dist,(w,h),1,(w,h))
if (np.size(roi) == 4 and np.mean(roi) != 0):
# undistort
mapx,mapy = cv2.initUndistortRectifyMap(mtx,dist,None,newcameramtx,(w,h),5)
dst = cv2.remap(img,mapx,mapy,cv2.INTER_LINEAR)
# crop the image
x,y,w,h = roi
dst = dst[y:y+h, x:x+w]
dst = cv2.cvtColor(dst,cv2.COLOR_RGB2BGR)
np.disp('Something Went Wrong')
enum += 1
k = cv2.waitKey(1) & 0xFF
if k == 27:
EDIT: I am using two cheap USB cameras. I figured out that the sample set of one of the cameras is OK, and I can use more than 19 samples without a problem. But when using the calibration samples of the other camera, the maximum number of sample images is 2 (if I make another set of samples, the number varies). In conclusion, it seems that something is wrong with the calibration matrices that are produced, but it is still weird.
Finally, I am using fisheye cameras, in the belief that by cropping enough pixels around the edges of each capture I would simulate a normal camera... maybe this is what is causing the trouble!
You should change dist so that only the first two coefficients are kept, for example
dist = np.array([-0.13615181, 0.53005398, 0, 0, 0]) # keep k1, k2; p1, p2 and k3 set to zero
and then make call to
newcameramtx, roi=cv2.getOptimalNewCameraMatrix(mtx,dist,(w,h),1,(w,h))
It worked for me.
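If you would rather derive that vector from your own calibration result than type the numbers in by hand, something like the following sketch could be used. It assumes OpenCV's usual coefficient order (k1, k2, p1, p2, k3); the function name keep_k1_k2 and the sample values are made up.

```python
import numpy as np

def keep_k1_k2(dist):
    """Copy a distortion vector and zero everything after the first two terms."""
    d = np.asarray(dist, dtype=np.float64).ravel().copy()
    d[2:] = 0.0   # drops p1, p2, k3 (and any further terms)
    return d

# calibrateCamera returns dist with shape (1, 5); these values are illustrative only.
dist_full = np.array([[-0.136, 0.53, 0.01, -0.02, 1.5]])
print(keep_k1_k2(dist_full))
```

Whether discarding the higher-order terms is acceptable depends on the lens; for a strongly distorting lens it may hide the problem rather than solve it.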
I am trying to simulate an image standing up out of a marker. This is my code so far, which does what is pictured. Essentially, I just want to rotate the image so that it appears to stand out orthogonal to the checkerboard.
As you can see, I use the code to find the transformation matrix between a normalized square image and the corresponding checkerboard corners. I then use warpPerspective to get the image you see. I know that I can use the rotation vector from solvePnP to obtain a rotation matrix through cv2.Rodrigues(), but I don't know what the next step is.
def transformTheSurface(inputFrame):
    ret, frameLeft = capleft.read()
    capGray = cv2.cvtColor(frameLeft,cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(capGray, (5,4), None, cv2.CALIB_CB_NORMALIZE_IMAGE + cv2.CALIB_CB_ADAPTIVE_THRESH ) #,None,cv2.CALIB_CB_FAST_CHECK)
    if (found):
        npGameFrame = pygame.surfarray.array3d(inputFrame)
        inputFrameGray = cv2.cvtColor(npGameFrame,cv2.COLOR_BGR2GRAY)
        cv2.drawChessboardCorners(frameLeft, (5,4), corners, found)
        q = corners[[0, 4, 15, 19]]
        ret, rvecs, tvecs = cv2.solvePnP(objp, corners, mtx, dist)
        ptMatrix = cv2.getPerspectiveTransform( muffinCoords, q)
        npGameFrame = cv2.flip(npGameFrame, 0)
        ptMatrixWithXRot = ptMatrix * rodRotMat[0]
        #inputFrameConv = cv2.cvtColor(npGameFrame,cv2.COLOR_BGRA2GRAY)
        transMuffin = cv2.warpPerspective(npGameFrame, ptMatrix, (640, 480)) #, muffinImg, cv2.INTER_NEAREST, cv2.BORDER_CONSTANT, 0)
I have added some more code in the hope of creating my own 3x3 transformation matrix. I used the following reference. Here is my code for that:
#initialization happens earlier in code
muffinCoords = np.zeros((4,2), np.float32)
muffinCoords[0] = (0,0)
muffinCoords[1] = (200,0)
muffinCoords[2] = (0,200)
muffinCoords[3] = (200,200)
A1 = np.zeros((4,3), np.float32)
A1[0] = (1,0,322)
A1[1] = (0,1,203)
A1[2] = (0,0,0)
A1[3] = (0,0,1)
R = np.zeros((4,4), np.float32)
R[3,3] = 1.0
T = np.zeros((4,4), np.float32)
T[0] = (1,0,0,0)
T[1] = (0,1,0,0)
T[2] = (0,0,1,0)
T[3] = (0,0,0,1)
#end initialization
#load calib data derived using cv2.calibrateCamera, my Fx and Fy are about 800
loadedCalibFileMTX = np.load('calibDataMTX.npy')
mtx = np.zeros((3,4), np.float32)
mtx[:3,:3] = loadedCalibFileMTX
#this is new to my code, creating what I interpret as Rx*Ry*Rz
ret, rvecCalc, tvecs = cv2.solvePnP(objp, corners, loadedCalibFileMTX, dist)
rodRotMat = cv2.Rodrigues(rvecCalc)
R[:3,:3] = rodRotMat[0]
#then I create T
T[0,3] = tvecs[0]
T[1,3] = tvecs[1]
T[2,3] = tvecs[2]
# A1 -> 2d to 3d projection matrix
# R-> rotation matrix as calculated by solve PnP, or Rx * Ry * Rz
# T -> converted translation matrix, reference from site, vectors pulled from tvecs of solvPnP
# mtx -> 3d to 2d matrix
# customTransformMat = mtx * (T * (R * A1)) {this is intended calculation of following}
first = np.dot(R, A1)
second = np.dot(T, first)
finalCalc = np.dot(mtx, second)
finalNorm = finalCalc/(finalCalc[2,2]) # to make sure that the [2,2] element is 1
transMuffin = cv2.warpPerspective(npGameFrame, finalNorm, (640, 480), None, cv2.INTER_NEAREST, cv2.BORDER_CONSTANT, 0)
#transMuffin is returned as undefined here, any help?
# using the cv2.getPerspectiveTransform method to find what you can find pictured at the top
ptMatrix = cv2.getPerspectiveTransform( muffinCoords, q)
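For reference, the intended product mtx * (T * (R * A1)) from the comments above can be written compactly with NumPy as in the sketch below. The function name compose_homography is mine, the A1 used here has zero offsets (unlike the 322/203 offsets in my code), and the intrinsics are round illustrative numbers, so treat it as a shape check rather than a fix.

```python
import numpy as np

def compose_homography(K, R, t, A1):
    """Build a 3x3 warp as K[3x4] @ T[4x4] @ R[4x4] @ A1[4x3], scaled so H[2,2] == 1."""
    Rt = np.eye(4)
    Rt[:3, :3] = R              # rotation block, e.g. cv2.Rodrigues(rvec)[0]
    T = np.eye(4)
    T[:3, 3] = np.ravel(t)      # translation column, e.g. tvecs from solvePnP
    K34 = np.zeros((3, 4))
    K34[:, :3] = K              # 3x3 intrinsics padded to 3x4
    H = K34 @ (T @ (Rt @ A1))
    return H / H[2, 2]          # assumes H[2, 2] is non-zero

A1 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 1]], dtype=float)  # 2D -> 3D lift
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=float)      # illustrative intrinsics
H = compose_homography(K, np.eye(3), [0, 0, 1], A1)
print(H)
```

With no rotation and the plane one unit in front of the camera, the result reduces to the intrinsic matrix itself, which is a quick sanity check before plugging in the solvePnP output.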
I finally figured out the right methodology. You can find the code here: https://github.com/mikezucc/augmented-reality-fighter-pygame
Almost ALL of the game code is written by Leif Theiden, and is under the license specified in the .py files. The code relevant to the computer vision is in states.py. I used the game just to show others who want to get started in simple computer vision that it can be done.
My code opens a thread every time a new surface (simply PyGame's term for a frame) is about to be displayed on the main window. In that thread I execute a simple computer vision function that does the following:
1. Searches a camera stream frame for the 5x4 chessboard (cv2.findChessboardCorners).
2. The found corners are then drawn onto the image.
3. Using cv2.solvePnP, the approximate pose (rotation and translation vectors) is derived.
4. The 3D points that describe a square are then projected from the 3D space determined by step 3 into 2D space. This is used to convert a predetermined 3D structure into something you can draw on a 2D image.
5. However, this step instead finds the transformation to get from a set of 2D square points (the dimensions of the game frame) to the newly found projected 2D points (of the 3D frame). So what we are doing is simply a two-step transformation.
6. I then perform a basic tutorial-style addition of the captured stream frame and the transformed game frame to get the final image.
+from3dTransMatrix -> points of the projected 3d structure into 2d points. these are the red dots you see
+q -> this is the reference plane that we determine the pose from
+ptMatrix -> the final transformation, to transform the game frame to fit in the projected frame
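The projection of the 3D square into 2D can be sketched with a plain pinhole model as below. This ignores lens distortion (cv2.projectPoints handles that properly), and the function name project_points, the intrinsics and the pose are all made-up values for illustration.

```python
import numpy as np

def project_points(points_3d, R, t, K):
    """Project Nx3 world points to Nx2 pixel coordinates with a pinhole camera."""
    pts = np.asarray(points_3d, dtype=np.float64)
    cam = pts @ np.asarray(R, dtype=np.float64).T + np.ravel(t)  # world -> camera frame
    uvw = cam @ np.asarray(K, dtype=np.float64).T                # camera -> homogeneous pixels
    return uvw[:, :2] / uvw[:, 2:3]                              # divide by depth

K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=float)
square = [[0, 0, 0], [1, 0, 0], [1, 0, -1], [0, 0, -1]]   # a unit square standing up from the board
print(project_points(square, np.eye(3), [0, 0, 2], K))
```

The four projected points play the role of the destination corners that are then handed to cv2.getPerspectiveTransform together with the corners of the game frame.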
Check out the screens in the topmost folder ;]