I'm an OpenCV beginner, just wondering which approach would be best for measuring the distance from the camera to an object in a given video.
Every tutorial I have encountered so far starts by calibrating the camera and then undistorting the images. But in this case I am not using my own camera, so is it necessary for me to use these functions?
In addition, I have some data about the recording camera, such as:
(fx,fy) = focal length
(cx,cy) = principal point
(width,height) = image shape
radial = radial distortion
(t1,t2) = tangential distortion.
Usually, one measures the distance between a single camera and an object with prior knowledge of the object. That could be the dimensions of a planar pattern or the 3D positions of edges that can easily be detected automatically using image analysis.
The computation of the position of the object with respect to the camera is usually done by solving a PnP problem.
https://en.m.wikipedia.org/wiki/Perspective-n-Point
Solving the PnP equations does require the camera parameters (at least the intrinsic matrix, and ideally the distortion coefficients for more accuracy).
These parameters can be estimated by calibrating your camera. OpenCV provides a handful of functions that you can use to calibrate your monocular camera. Alternatively, you can use a platform like CalibPro to compute these parameters for you.
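For illustration, here is a minimal sketch of a solvePnP-based distance estimate using the intrinsics listed in the question; the numeric values, the object geometry, and the detected pixel coordinates are all placeholders you would replace with your own data.

import numpy as np
import cv2

# Intrinsics given for the recording camera (numbers below are placeholders)
fx, fy = 1000.0, 1000.0            # focal length in pixels
cx, cy = 640.0, 360.0              # principal point
k1, k2 = -0.1, 0.01                # radial distortion
t1, t2 = 0.001, -0.001             # tangential distortion

K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])
dist = np.array([k1, k2, t1, t2])  # OpenCV order: k1, k2, p1, p2

# Known 3D points on the object (e.g. corners of a planar marker of known size, in mm)
object_points = np.array([[0, 0, 0], [100, 0, 0], [100, 100, 0], [0, 100, 0]], dtype=np.float64)
# Their detected pixel positions in one video frame (placeholders)
image_points = np.array([[322, 240], [540, 245], [535, 455], [318, 450]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
if ok:
    # tvec is the object origin expressed in the camera frame, in the same unit as object_points
    print("camera-to-object distance:", np.linalg.norm(tvec))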
[Disclaimer] I am the founder of CalibPro. I am happy to help you use the platform and I'd love your feedback on your experience with it.
I am trying to scan an object with a laser to extract 3D point clouds. There are 2 cameras and 1 laser in my setup. What I do is pass the nonzero points of the masks to OpenCV's triangulatePoints function as the projPoints arguments. Since triangulatePoints requires the same number of points in both sets and there are 2 masks, if one mask has more nonzero points than the other, I simply truncate it to the other's size like this:
l1 = len(pts1)
l2 = len(pts2)
# naive truncation: keep only the first l2 points of pts1 (assumes l1 >= l2, discards the rest)
newPts1 = pts1[0:l2]
Is there a good way for matching left and right frame nonzero points?
First, if your images normally look like that, your sensors are deeply saturated, and consequently your 3D ranges are either worthless or much less accurate than they could be.
Second, you should aim for matching one point per rectified scanline on each image of the pair, rather than a set of points. The whole idea of using a laser stripe is to get a well focused beam of light on as small a spot or band as possible, so you can probe the surface in detail.
For best accuracy, the peak-finding should be done independently on each scanline of the original (distorted and not rectified) images, so it is not affected by the interpolation used by the undistortion and stereo rectification procedures. Rather, you would use the geometrical undistortion and stereo rectification transforms to map the peaks detected in original images into the rectified ones.
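As a rough sketch of that idea (not the exact method of any particular paper), the snippet below finds one sub-pixel peak per row of the original left image and then maps those peaks into the rectified frame with cv2.undistortPoints; the intrinsics K1, dist1 and the rectification outputs R1, P1 from cv2.stereoRectify are assumed to exist, and the brightness threshold is arbitrary.

import numpy as np
import cv2

def stripe_peaks_per_row(gray):
    # One sub-pixel peak column per image row, via center of mass around the brightest pixel;
    # rows where the stripe is absent get NaN.
    peaks = np.full(gray.shape[0], np.nan)
    img = gray.astype(np.float64)
    for r in range(img.shape[0]):
        row = img[r]
        c = int(np.argmax(row))
        if row[c] < 50:                              # arbitrary "no stripe in this row" threshold
            continue
        lo, hi = max(c - 2, 0), min(c + 3, len(row))
        w = row[lo:hi]
        peaks[r] = (np.arange(lo, hi) * w).sum() / w.sum()
    return peaks

# peaks = stripe_peaks_per_row(original_left_gray)
# rows = np.where(~np.isnan(peaks))[0]
# pts = np.stack([peaks[rows], rows], axis=1).reshape(-1, 1, 2)
# rect_pts = cv2.undistortPoints(pts, K1, dist1, R=R1, P=P1)   # peaks in rectified pixel coords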
There are several classical algorithms for peak-finding in laser-stripe-based triangulation methods; you may find this other answer of mine useful.
Last, if your setup is expected to be as in the picture, with the laser stripe illuminating two orthogonal planes in addition to the object of interest, then you do not need to use stereo at all: you can solve for the 3D plane spanned by the laser stripe projector and triangulate by intersecting that plane with each ray back-projecting the peaks of the image of the laser stripe on the object. This is similar to one of the methods J. Y. Bouguet used in his old Ph.D. thesis on desktop photography (here is a summary by S. Seitz). One implementation using a laser striper is detailed in this patent. This method is surprisingly accurate: with it we achieved approximately 0.2 mm accuracy in a cubic foot of volume using a dinky 640x480 CCD video camera back in 1999. The patent has expired, so you are free to enjoy it.
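For intuition only, here is a minimal sketch of the plane-ray triangulation step, under the assumption that the laser plane is already known in camera coordinates as n·X + d = 0 and that the camera intrinsics K are available; all numeric values are placeholders.

import numpy as np

def triangulate_on_laser_plane(peaks_px, K, n, d):
    # Back-project each pixel through a pinhole camera with intrinsics K and intersect
    # the resulting ray with the laser plane n.X + d = 0 (everything in the camera frame).
    K_inv = np.linalg.inv(K)
    ones = np.ones((peaks_px.shape[0], 1))
    rays = (K_inv @ np.hstack([peaks_px, ones]).T).T   # one ray direction per peak
    t = -d / (rays @ n)                                # ray parameter where the plane is hit
    return rays * t[:, None]                           # (N, 3) points on the object surface

# Placeholder example: plane 0.5*Y + 0.866*Z - 300 = 0 (mm), generic intrinsics
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
pts = triangulate_on_laser_plane(np.array([[330.2, 120.0], [331.0, 121.5]]),
                                 K, np.array([0.0, 0.5, 0.866]), -300.0)
print(pts)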
I have a setup where a (2D) camera is mounted on the end-effector of a robot arm - similar to the OpenCV documentation:
I want to calibrate the camera and find the transformation from camera to end-effector.
I have already calibrated the camera using this OpenCV guide, Camera Calibration, with a checkerboard, and obtained the undistorted images.
My problem is finding the transformation from camera to end-effector. I can see that OpenCV has a function, calibrateHandEye(), which supposedly should achieve this. I already have the "gripper2base" vectors and am missing the "target2cam" vectors. Should these be based on the size of the checkerboard squares, or what am I missing?
Any guidance in the right direction will be appreciated.
You are close to the answer.
Yes, it is based on the size of the checkerboard squares. But instead of taking those parameters and an image directly, this function takes target2cam. How do you get target2cam? Simply move your robot arm above the chessboard so that the camera can see it and take a picture. From the picture of the chessboard and the camera intrinsics, you can find target2cam. Computing the extrinsics from a chessboard is already provided in OpenCV.
Repeat this a couple of times at different robot poses and collect multiple target2cam. Pass them to calibrateHandEye() together with the corresponding gripper2base poses and you will get what you need.
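A minimal sketch of that loop, assuming you already have the camera intrinsics (K, dist), a list of images taken at different robot poses, and the matching gripper2base rotations and translations from the robot controller; all of these names are placeholders for your own data.

import numpy as np
import cv2

pattern_size = (9, 6)        # inner corners of the checkerboard (placeholder)
square_size = 25.0           # mm; target2cam (and cam2gripper) will come out in mm

# 3D corner coordinates in the board (target) frame
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2) * square_size

R_target2cam, t_target2cam = [], []
for img in calibration_images:                             # one image per robot pose (placeholder)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if not found:
        continue
    ok, rvec, tvec = cv2.solvePnP(objp, corners, K, dist)  # board pose in the camera frame
    R, _ = cv2.Rodrigues(rvec)
    R_target2cam.append(R)
    t_target2cam.append(tvec)

# R_gripper2base / t_gripper2base: poses reported by the robot for the images kept above
R_cam2gripper, t_cam2gripper = cv2.calibrateHandEye(
    R_gripper2base, t_gripper2base, R_target2cam, t_target2cam)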
Good day Community,
I am currently working on the subject of multiple-camera calibration. There is one point that I have not yet understood.
Suppose I have calibrated my cameras separately with N images on which a checkerboard can be seen in different positions and orientations within the same FOV. After the calibration I get the intrinsic parameters for each camera. Furthermore, I get the extrinsic parameters rvec and tvec for each image. Now I want to determine each camera pose with respect to an arbitrary world coordinate system and then use one camera as a reference for the others. At this point some questions arise:
Which of the rvecs and tvecs should I use for the further processing?
Do I need to average rvec and tvec to determine the camera pose as accurately as possible?
I hope someone can give me some advice.
Firstly, I want to know the metric unit of the 3D points we get from OpenCV's reprojectImageTo3D() function.
Secondly, I have calibrated each camera individually with a chessboard using "mm" as the metric unit, and then used the OpenCV functions to calibrate the stereo system, rectify the stereo pair, and compute the disparity map.
Basically I want the distance to the center of a bounding box.
So I compute the disparity map, reproject it to 3D with the reprojectImageTo3D() function, and then take from those 3D points the one that corresponds to the center of the bbox (x, y).
But which image should I use to get the center of the bbox: the rectified one or the original?
Finally, is it better to use the same camera model for a stereo system?
Thank you
During the calibration process (calibrateCamera) you have to provide the grid of object points of your calibration target. The unit you use there then defines the unit for the rest of the process.
When calling reprojectImageTo3D, you probably used the matrix Q output by stereoRectify, which takes in the individual calibrations (cameraMatrix1, cameraMatrix2). That's where the unit came from.
So in your case you get mm I guess.
reprojectImageTo3D has to use the rectified image, since the disparity is calculated from the rectified images (it wouldn't be properly aligned otherwise). Also, the disparity is calculated relative to the first image given (the left one in the docs). So you should use the left rectified image if you computed the disparity like this: cv::StereoMatcher::compute(left, right)
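Putting that together, a rough sketch of the bbox-distance lookup might look like the following, assuming left_rect and right_rect are the rectified grayscale pair and Q is the 4x4 matrix returned by cv2.stereoRectify; the matcher settings and the bbox values are placeholders.

import numpy as np
import cv2

stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = stereo.compute(left_rect, right_rect).astype(np.float64) / 16.0  # SGBM output is fixed-point, scaled by 16

points_3d = cv2.reprojectImageTo3D(disparity, Q)   # same unit as the calibration target (mm here)

# Bounding box detected on the LEFT RECTIFIED image (placeholder values)
x, y, w, h = 100, 80, 60, 40
u, v = x + w // 2, y + h // 2
X, Y, Z = points_3d[v, u]
print("distance to bbox center (mm):", np.linalg.norm([X, Y, Z]))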
I have never used two different cameras, but it makes sense to use the same model. I think that if the two images differ a lot in color rendering, sharpness, or anything else, that could degrade the disparity quality.
What is actually very important (unless you are only working with still pictures) is to use cameras that can be synchronized by hardware (e.g. a GENLOCK signal: https://en.wikipedia.org/wiki/Genlock). If there is a bit of delay between left and right and a moving subject, the disparity can be wrong. This also applies to the calibration.
Hope this helps!
I am trying to compute the distance of an object from the camera using stereo vision approach. But before computing the disparity map, I must ensure that my cameras are calibrated.
I followed the OpenCV Python tutorial on camera calibration. They have used a chessboard to calibrate their cameras. Now my question is: if I want to calibrate my cameras, do I need to take photos of a chessboard from various angles myself, or can I use the 14 chessboard images they have made available?
My next question (depending on the answer to the previous one) is: if I can use their images to calibrate my cameras, what is the logic behind this? How can images taken with their cameras be used to calibrate my cameras, i.e. to obtain the camera matrix for my cameras? I would like to get more intuition about this camera calibration process.
Any help will be appreciated. Thanks.
1- No. You print a similar chessboard pattern and use it to take your own calibration pictures with your own camera; images taken with someone else's camera cannot give you your camera's parameters. You can use the code here.
2- The process basically goes like this: to determine where a 3D point ends up as a pixel in an image, you need to know two sets of parameters (counting only the most fundamental ones; I am excluding the distortion parameters here). The first set are the inner parameters of your camera (intrinsic parameters): the optical center of your camera (basically the center pixel of your sensor/lens) and the focal length. The intrinsic parameters are fixed for your camera unless you change some settings of the device, or unless some settings drift over time. The second set is a position and rotation vector that describes where your camera is in the 3D world (the extrinsic parameters). The extrinsic parameters change for every example image you have. You can think of camera calibration as an optimization process that tries to find the best parameters (the parameters that give minimum reprojection error) for the example images you have given.
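For intuition, here is a condensed sketch of what the tutorial's calibration loop does; the image folder, board size, and square size are placeholders for your own printed pattern and photos.

import glob
import numpy as np
import cv2

pattern_size = (9, 6)        # inner corners of the printed chessboard (placeholder)
square_size = 24.0           # mm; this choice defines the unit of the resulting tvecs

# The same board geometry is reused for every image (corners lie in the Z = 0 plane of the board frame)
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2) * square_size

obj_points, img_points = [], []
for path in glob.glob("my_calib_images/*.jpg"):        # photos YOU took of YOUR printed board
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Jointly estimates the intrinsics (shared by all images) and one extrinsic pose per image,
# minimizing the reprojection error of the detected corners.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)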