Depth from stereo cameras with different focal lengths - python

I am working on stereo camera depth estimation. However, for my particular purpose, I need to use two cameras with different FOVs and focal lengths. After a lot of Google research, I know that I can still calibrate the two cameras and rectify them to generate a disparity map, but I have no idea how to convert the disparity map to depth, because their focal lengths differ, which does not fit the model in
http://docs.opencv.org/trunk/dd/d53/tutorial_py_depthmap.html
Does anyone have a solution?
Thanks!

Have you considered using cv::triangulatePoints() to obtain the (X, Y, Z) 3D world coordinates?
I believe your end goal is to obtain a depth map.
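For what it's worth, here is a minimal sketch (with made-up calibration numbers standing in for your own stereoCalibrate results) showing that stereoRectify already copes with two different intrinsic matrices: the returned P1/P2 let you triangulate matched points, and the Q matrix converts a dense disparity map to depth without you handling the two focal lengths yourself.

```python
import cv2
import numpy as np

# Placeholder calibration data; replace with the output of cv2.stereoCalibrate on your rig.
image_size = (640, 480)
K1 = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])    # wide camera
K2 = np.array([[1200., 0., 320.], [0., 1200., 240.], [0., 0., 1.]])  # narrow camera
d1 = np.zeros(5)
d2 = np.zeros(5)
R = np.eye(3)                        # relative rotation from stereoCalibrate
T = np.array([[-60.], [0.], [0.]])   # baseline, in the unit of your calibration target (mm here)

# stereoRectify accepts two different intrinsic matrices; it returns rectified
# projection matrices P1, P2 and the 4x4 disparity-to-depth matrix Q.
R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K1, d1, K2, d2, image_size, R, T)

# Option A: triangulate sparse correspondences given in *rectified* pixel coordinates.
pts_left = np.array([[300., 310.], [250., 260.]], dtype=np.float64)   # 2xN (x row, y row)
pts_right = np.array([[280., 290.], [250., 260.]], dtype=np.float64)  # 2xN
pts_4d = cv2.triangulatePoints(P1, P2, pts_left, pts_right)
pts_3d = (pts_4d[:3] / pts_4d[3]).T   # Nx3; Z is the depth, in the same unit as T

# Option B: convert a dense disparity map with Q (no per-camera focal length needed).
disparity = np.full((480, 640), 20.0, dtype=np.float32)  # placeholder disparity map
xyz = cv2.reprojectImageTo3D(disparity, Q)                # HxWx3 point map
print(pts_3d, xyz[240, 320])
```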

Related

How to measure distance between camera and an object?

I'm an OpenCV beginner, just wondering which way would be best to measure
the distance between the camera and an object in a given video.
Every tutorial I have encountered so far teaches camera calibration first and then undistorting the camera lens. But in this case I don't use my own camera, so is it necessary for me to use these functions?
In addition, I have some data about the recording camera, such as:
(fx,fy) = focal length
(cx,cy) = principal point
(width,height) = image shape
radial = radial distortion
(t1,t2) = tangential distortion.
Usually, one measures the distance between a single camera and an object using prior knowledge of the object. This could be the dimensions of a planar pattern, or the 3D positions of edges that can easily be detected automatically using image analysis.
The computation of the position of the object with respect to the camera is usually done by solving a PnP problem.
https://en.m.wikipedia.org/wiki/Perspective-n-Point
Solving the PnP equations does require the camera parameters (at least the intrinsic matrix, and ideally the distortion coefficients for more accuracy).
These parameters can be estimated by calibrating your camera. OpenCV provides a handful of functions that you can use to calibrate your monocular camera. Alternatively, you can use a platform like CalibPro to compute these parameters for you.
[Disclaimer] I am the founder of CalibPro. I am happy to help you use the platform and I'd love your feedback on your experience using it.
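For illustration, a minimal solvePnP sketch: the intrinsics, distortion values, object dimensions and corner coordinates below are placeholders, so substitute the (fx, fy), (cx, cy) and distortion data you already have, plus the real dimensions of a known object and its detected image corners.

```python
import cv2
import numpy as np

# Placeholder intrinsics; use the values you have for the recording camera.
fx, fy, cx, cy = 1000.0, 1000.0, 640.0, 360.0
K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1]])
dist = np.array([0.1, -0.05, 0.001, 0.001, 0.0])   # k1, k2, t1, t2, k3 (placeholders)

# 3D corners of a known planar object in its own frame (e.g. a 200 x 100 mm card).
object_pts = np.array([[0, 0, 0], [200, 0, 0], [200, 100, 0], [0, 100, 0]], dtype=np.float64)
# The same corners detected in the image (pixel coordinates) - hypothetical values.
image_pts = np.array([[310, 220], [530, 225], [525, 340], [308, 335]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)
# tvec is the object's origin expressed in the camera frame, so its norm is the
# camera-to-object distance in the same unit as object_pts (mm here).
print("distance:", np.linalg.norm(tvec))
```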

Align RGB Depth Map with RGB image without intrinsic matrix

I have pairs of images: an image (blurred intentionally) and its depth map (given as a PNG).
For example:
However, there seems to be a shift between the depth map and the real image, as can be seen in this example:
All I know is that these images were shot with a RealSense LiDAR Camera L515 (I have no knowledge of the underlying camera characteristics or the distance between the RGB and infrared sensors).
Is there a way to align both images? I searched the internet for possible solutions; however, they all rely on data that I do not have, such as the intrinsic matrix, the camera's SDK and more.
Since the two imaging systems are physically very close, a homography between them would likely be a good approximation. You can find the homography using four corresponding points that you choose manually.
You can use the OpenCV implementation.
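As a sketch of that idea (the file names and the four hand-picked point pairs below are placeholders for your own data):

```python
import cv2
import numpy as np

# Load the RGB frame and the depth map (keep the depth values untouched).
rgb = cv2.imread("rgb.png")
depth = cv2.imread("depth.png", cv2.IMREAD_UNCHANGED)

# Four corresponding points picked manually: first in the depth map, then in the RGB image.
pts_depth = np.array([[102, 80], [510, 78], [515, 400], [98, 405]], dtype=np.float32)
pts_rgb = np.array([[120, 95], [528, 92], [532, 415], [115, 420]], dtype=np.float32)

H, _ = cv2.findHomography(pts_depth, pts_rgb)   # pass cv2.RANSAC if you pick more than 4 points
aligned_depth = cv2.warpPerspective(depth, H, (rgb.shape[1], rgb.shape[0]),
                                    flags=cv2.INTER_NEAREST)  # nearest keeps depth values intact
cv2.imwrite("depth_aligned.png", aligned_depth)
```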

opencv: reprojectImageTo3D - what is the metric unit of the (X, Y, Z) points?

Firstly, I want to know the metric unit of the 3D points we get from the OpenCV reprojectImageTo3D() function.
Secondly, I have calibrated each camera individually with a chessboard using "mm" as the metric unit, and then used the OpenCV functions to calibrate the stereo system, rectify the stereo pair and compute the disparity map.
Basically, I want the distance to the center of a bounding box.
So I compute the disparity map, reproject it to 3D with the reprojectImageTo3D() function, and then take, from those 3D points, the one which corresponds to the center of the bbox (x, y).
But which image should I use to get the center of the bbox: the rectified one or the original?
Lastly, is it better to use the same camera model for both cameras of a stereo system?
Thank you
During the calibration process (calibrateCamera) you have to provide the grid of points of your calibration target. The unit you use there then defines the unit for the rest of the process.
When calling reprojectImageTo3D, you probably used the matrix Q output by stereoRectify, which takes in the individual calibrations (cameraMatrix1, cameraMatrix2). That's where the unit comes from.
So in your case you get mm, I guess.
reprojectImageTo3D has to use the rectified image, since the disparity is calculated from the rectified images (it wouldn't be properly aligned otherwise). Also, the disparity is calculated relative to the first image given (the left one in the documentation), so you should use the left rectified image if you computed the disparity like this: cv::StereoMatcher::compute(left, right)
I have never used two different cameras, but it makes sense to use identical ones. I think that if the two images differ noticeably in color, edges or anything else, that could hurt the disparity quality.
What is actually very important (unless you are only working with still pictures) is to use cameras that can be synchronized by hardware (e.g. a GENLOCK signal: https://en.wikipedia.org/wiki/Genlock). If there is a bit of delay between left and right and the subject is moving, the disparity can be wrong. This also applies to the calibration.
Hope this helps!
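Putting the above together, a minimal sketch of the pipeline from a rectified pair to the depth at a bbox center (the file names, the source of Q and the bbox coordinates are placeholders for your own data):

```python
import cv2
import numpy as np

# Rectified left/right images and the Q matrix come from your own
# stereoRectify / initUndistortRectifyMap / remap step.
left_rect = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)
right_rect = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)
Q = np.load("Q.npy")                 # 4x4 disparity-to-depth matrix from cv2.stereoRectify

matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = matcher.compute(left_rect, right_rect).astype(np.float32) / 16.0  # SGBM output is fixed-point *16

xyz = cv2.reprojectImageTo3D(disparity, Q)   # unit is that of the calibration target (mm here)

# Bounding box given in *rectified left image* coordinates (placeholder values).
x, y, w, h = 200, 150, 80, 60
cx, cy = x + w // 2, y + h // 2
X, Y, Z = xyz[cy, cx]
print("distance to bbox center: %.1f mm" % np.linalg.norm([X, Y, Z]))
```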

Chessboard Camera Calibration using OpenCV

I am trying to compute the distance of an object from the camera using a stereo vision approach. But before computing the disparity map, I must ensure that my cameras are calibrated.
I followed the OpenCV Python tutorial on camera calibration. They used a chessboard to calibrate their cameras. Now my question is: if I want to calibrate my cameras, do I need to take photos of a chessboard from various angles myself? Or can I use the 14 chessboard images they have made available?
My next question (depending on the answer to the previous one) is: if I can use their images to calibrate my cameras, what is the logic behind this? How can images taken with their cameras be used to calibrate my cameras, i.e. to obtain the camera matrix for my cameras? I would like to get more intuition behind this camera calibration process.
Any help will be appreciated. Thanks.
1- No, you print something similar to a chessboard pattern and use it to calibrate your own camera. You can use the code here.
2- The process basically goes like this: to determine the coordinates of a pixel in an image, you need to know two sets of parameters (counting only the most fundamental ones; I am excluding the distortion parameters here). The first set are the inner parameters of your camera (the intrinsic parameters): the optical center of your camera (basically the center pixel of your sensor/lens) and the focal length of your camera. The intrinsic parameters are fixed for your camera unless you change some settings of the device, or some settings drift over time. The second set are a position and a rotation vector that describe where your camera is in the 3D world (the extrinsic parameters). The extrinsic parameters change for every example image you have. You can think of camera calibration as an optimization process that tries to find the best parameters (the parameters that give the minimum reprojection error) for the example images you have given.
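For reference, a minimal calibration sketch along those lines, assuming a 9x6 inner-corner chessboard with 25 mm squares and your own photos in a placeholder folder:

```python
import glob
import cv2
import numpy as np

# Chessboard with 9x6 inner corners and 25 mm squares (adjust to your printout).
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 25.0  # unit: mm

obj_points, img_points = [], []
for fname in glob.glob("calib_images/*.png"):           # placeholder path to your photos
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        # Refine the corner locations to sub-pixel accuracy.
        corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1),
                                   (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, gray.shape[::-1], None, None)
print("reprojection error:", rms)   # the quantity the optimization minimizes
print("intrinsic matrix:\n", K)
```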

Processing Images with Depth Information

I've been doing a lot of image processing recently in Python using OpenCV, and so far I've worked only with 2-D images in the generic BGR style.
Now, I'm trying to figure out how to incorporate depth and work with depth information as well.
I've seen the documentation on creating simple point clouds using the left and right images of a stereo camera, but I was hoping to gain some intuition about depth-based cameras themselves, like the Kinect.
What kind of camera should I use for this purpose, and more importantly: how do I process these images in Python? I can't find much documentation on handling RGB-D images in OpenCV.
If you want to work with depth-based cameras, you can go for Time-of-Flight (ToF) cameras like the picoflexx and picomonstar. They will give you X, Y and Z values, where X and Y are the distances of a point from the camera center (as in 2D space) and Z is the direct distance of that point from the camera center (not the perpendicular distance).
For these cameras and this kind of 3D data processing you can use the Point Cloud Library (PCL).
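If you prefer to stay in Python, Open3D is a commonly used alternative to PCL. Here is a minimal sketch that builds a point cloud from an RGB + depth pair; the file names and intrinsics are placeholders for whatever your depth camera reports.

```python
import open3d as o3d

# Placeholder file names: a color image and a 16-bit depth image in millimetres.
color = o3d.io.read_image("color.png")
depth = o3d.io.read_image("depth.png")

# Combine them into an RGB-D image (depth_scale=1000 converts mm to metres).
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
    color, depth, depth_scale=1000.0, convert_rgb_to_intensity=False)

# Placeholder pinhole intrinsics; use the values your camera/SDK reports.
intrinsic = o3d.camera.PinholeCameraIntrinsic(640, 480, 525.0, 525.0, 319.5, 239.5)

# Back-project the RGB-D image into a colored point cloud and display it.
pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsic)
o3d.visualization.draw_geometries([pcd])
```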
