I want to use ArUco to find the "space coordinates" of a marker.
I have problems understanding the tvecs and rvecs. I have got as far as understanding that the tvecs are the translation and the rvecs are the rotation. But how are they oriented, in which order are they written in the code, and how do I orient them?
I have a camera (a laptop webcam, just drawn to illustrate the orientation of the camera) at the position X, Y, Z. The camera's orientation can be described by angle a around X, angle b around Y, and angle c around Z (angles in radians).
So if my camera is stationary, I would take different pictures of the ChArUco boards and give the camera calibration algorithm the tvecs_camerapos (Z, Y, X) and the rvecs_camerapos (c, b, a). I get back the cameraMatrix, distCoeffs, tvecs_cameracalib and rvecs_cameracalib. The t/rvecs_camerapos and t/rvecs_cameracalib are different, which I find weird.
Is this naming/ordering of the t/rvecs correct?
Should I use camerapos or cameracalib for pose estimation if the camera does not move?
I think the t/rvecs_cameracalib can be ignored, because I am only interested in the intrinsic parameters from the camera calibration algorithm.
Now I want to find the X, Y, Z position of the marker. I use aruco.estimatePoseSingleMarkers with t/rvecs_camerapos and retrieve t/rvecs_markerpos. The tvecs_markerpos don't match my expected values.
Do I need a transformation of t/rvecs_markerpos to find X,Y,Z of the Marker?
Where is my misconception?
OpenCV routines that deal with cameras and camera calibration (including ArUco) use a pinhole camera model. The world origin is defined as the centre of projection of the camera model (where all light rays entering the camera converge), the Z axis is defined as the optical axis of the camera model, and the X and Y axes form an orthogonal system with Z. +Z is in front of the camera, +X is to the right, and +Y is down. All ArUco coordinates are defined in this coordinate system. That explains why your "camera" tvecs and rvecs change: they do not define your camera's position in some world coordinate system, but rather the markers' positions relative to your camera.
You don't really need to know how the camera calibration algorithm works, other than that it will give you a camera matrix and some lens distortion parameters, which you use as input to other ArUco and OpenCV routines.
Once you have calibration data, you can use ArUco to identify markers and return their positions and orientations in the 3D coordinate system defined by your camera, with correct compensation for the distortion of your camera lens. This is enough to do, for example, augmented reality using OpenGL on top of the video feed from your camera.
The tvec of a marker is the translation (x, y, z) of the marker from the origin; the distance unit is whatever unit you used to define your printed calibration chart (i.e., if you described your calibration chart to OpenCV using mm, then the distance unit in your tvecs is mm).
The rvec of a marker is a 3D rotation vector which defines both an axis of rotation and the rotation angle about that axis, and gives the marker's orientation. It can be converted to a 3x3 rotation matrix using the Rodrigues function (cv::Rodrigues()). It is either the rotation which transforms the marker's local axes onto the world (camera) axes, or the inverse -- I can't remember, but you can easily check.
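For example, a minimal sketch to check which way the convention goes, assuming rvec and tvec are the pose of a single marker (e.g. rvecs[i] and tvecs[i] from estimatePoseSingleMarkers):

import cv2
import numpy as np

# Convert the rotation vector of one marker to a 3x3 rotation matrix.
R, _ = cv2.Rodrigues(rvec.reshape(3, 1))
t = tvec.reshape(3, 1)

# A point given in the marker's local frame (here its origin, i.e. the marker centre).
p_marker = np.zeros((3, 1))
p_camera = R @ p_marker + t        # the same point expressed in camera coordinates

# The inverse transform gives the camera position expressed in the marker's frame.
cam_pos_in_marker = -R.T @ t

If p_camera for a known marker point lands where you expect it in front of the camera, the rvec/tvec pair maps marker coordinates into camera coordinates; otherwise it is the inverse.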
In my understanding, the camera coordinate system is the reference frame of the 3D world. The rvec and tvec are the transformations used to get the position of any other 3D point (in the world reference frame) with respect to the camera coordinate system. So both these vectors are the extrinsic parameters [R|t]. The intrinsic parameters are generally derived from calibration. Now, if you want to project any other 3D point given in the world reference frame onto the image plane, you first need to bring that 3D point into the camera coordinate system and then project it onto the image to get the correct perspective.
Point in the image plane: (u, v, 1)^T = [intrinsic] [R|t] (X, Y, Z, 1)^T
The reference coordinate system is the camera. rvec and tvec give the 6-DoF pose of the marker with respect to the camera.
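As a rough illustration of that projection, a sketch using cv2.projectPoints (assuming rvec and tvec are the pose of one marker, and cameraMatrix, distCoeffs come from your calibration):

import cv2
import numpy as np

# A 3D point given in the marker's own frame (here the marker centre).
point_3d = np.array([[0.0, 0.0, 0.0]])

# projectPoints applies the extrinsics (rvec, tvec) and then the intrinsics,
# including lens distortion, as in the equation above.
img_pts, _ = cv2.projectPoints(point_3d, rvec, tvec, cameraMatrix, distCoeffs)
u, v = img_pts[0, 0]               # pixel coordinates of the marker centre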
How can I use the ArUco Framework from OpenCV to get the distance from the camera to the ArUco Marker?
There are several steps required to get the distance from your camera to the ArUco Marker.
Calibrate the camera
This step is the most important one. If you skip it, the results of all the following steps will be inaccurate.
The camera you will be using needs to be calibrated. It is important that you use the camera for this step in exactly the same way you will be using it for the marker detection: same resolution, same focal length (same lens).
There are a lot of good guides out there so I will skip this task to focus on the main question.
Possible guides:
https://docs.opencv.org/master/da/d13/tutorial_aruco_calibration.html
https://mecaruco2.readthedocs.io/en/latest/notebooks_rst/Aruco/Projet+calibration-Paul.html
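For reference, a minimal ChArUco calibration sketch along the lines of the linked tutorials, using the same older cv.aruco API as the rest of this answer; the board dimensions below are made-up examples and images is a placeholder for your list of calibration frames:

import cv2 as cv

dictionary = cv.aruco.Dictionary_get(cv.aruco.DICT_4X4_50)
# 5x7 squares, 4 cm squares with 2 cm markers; adjust to your printed board.
board = cv.aruco.CharucoBoard_create(5, 7, 0.04, 0.02, dictionary)

all_corners, all_ids = [], []
for img in images:
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    corners, ids, rejected = cv.aruco.detectMarkers(gray, dictionary)
    if ids is not None:
        n, ch_corners, ch_ids = cv.aruco.interpolateCornersCharuco(corners, ids, gray, board)
        if n > 3:
            all_corners.append(ch_corners)
            all_ids.append(ch_ids)

ret, mtx, dist, rvecs, tvecs = cv.aruco.calibrateCameraCharuco(
    all_corners, all_ids, board, gray.shape[::-1], None, None)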
Receive distance from camera to ArUco marker
This step requires the camera matrix and the distortion coefficients from the previous step. I will call the cameraMatrix mtx and the distCoeffs dist in the following examples.
Make sure that you adjust all the parameters needed for the ArUco detection.
I won't explain what is needed for the simple marker detection as there are a lot of good guides out there.
dictionary = cv.aruco.Dictionary_get(cv.aruco.DICT_4X4_50)
parameters = cv.aruco.DetectorParameters_create()
Search for markers in your frame
frame will be your image containing the marker
(corners, ids, rejected) = cv.aruco.detectMarkers(frame, dictionary, parameters=parameters)
Estimate the pose of the marker
You will need the size of the marker you want to find, i.e. the actual real-life size it is printed at on the paper. Measure it, as we will need it. It doesn't matter which measuring unit you use, but be aware that the resulting distance will be in the same unit. I used cm in this example.
markerSizeInCM = 15.9
rvec, tvec, _ = cv.aruco.estimatePoseSingleMarkers(corners, markerSizeInCM, mtx, dist)
Read out Distance
We now have the rotation vector of the marker (rvec) relative to the camera and the translation vector (tvec). See source: OpenCV
The translation vector is in the same unit in which we provided the real marker size in step 3. It is in the format [x, y, z], which is the position of the marker in 3D space.
We now only need to read out z from the tvec; this is the distance from the camera to the marker centre along the camera's viewing (optical) axis, in the same measuring unit we used in step 3. (For the straight-line distance to the marker, take the Euclidean norm of the whole tvec.)
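A small sketch of that read-out, continuing from the calls above (tvec holds one [x, y, z] entry per detected marker):

import numpy as np

# tvec has shape (N, 1, 3): one translation per detected marker, in the same
# unit as markerSizeInCM (here: cm).
for i in range(len(ids)):
    x, y, z = tvec[i][0]
    dist_to_marker = np.linalg.norm(tvec[i][0])   # straight-line distance to the marker centre
    print("Marker {}: z = {:.1f} cm, distance = {:.1f} cm".format(ids[i][0], z, dist_to_marker))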
I’m trying to find coordinates of object from one image in another image.
There are two fixed, vertically arranged cameras, one above the other (for example, with 10 cm between the cameras). They look in the same direction.
Using calibrateCamera from OpenCV I found the following parameters for each camera: ret, mtx, dist, rvecs, tvecs.
How do I calculate coordinates of an object from an image from one camera in another camera image? This is assuming that this object exists in both camera images.
Consider this as a stereo setup and do stereo calibration.
It will provide you with the rotation and translation between the coordinate systems of the two cameras, as well as the essential matrix and the fundamental matrix.
You can also estimate the fundamental matrix using just point correspondences; refer to the following link.
The fundamental matrix gives you, for a point in the reference image, a line in the other image along which the corresponding point must lie.
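A hedged sketch of both options (all inputs, i.e. objpoints, imgpoints1, imgpoints2, the per-camera mtx/dist results, image_size and the matched points pts1/pts2, are placeholders for your own data):

import cv2
import numpy as np

# Option 1: full stereo calibration. objpoints/imgpoints1/imgpoints2 are the usual
# calibration lists; mtx1, dist1, mtx2, dist2 come from the per-camera calibrateCamera runs.
ret, mtx1, dist1, mtx2, dist2, R, T, E, F = cv2.stereoCalibrate(
    objpoints, imgpoints1, imgpoints2,
    mtx1, dist1, mtx2, dist2, image_size,
    flags=cv2.CALIB_FIX_INTRINSIC)

# Option 2: estimate only the fundamental matrix from matched points pts1, pts2 ((N, 2) arrays).
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)

# For each point in image 1, the epipolar line a*x + b*y + c = 0 in image 2:
lines2 = cv2.computeCorrespondEpilines(pts1.reshape(-1, 1, 2), 1, F)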
I have a point cloud which I convert from .dat to .ply using open3d. The contents of the .ply file are an (n x 3) matrix corresponding to the x, y, z points, as well as another (n x 3) matrix that holds the RGB information. The total number of points is well over 2 million (the lidar was mounted on top of a vehicle). I also have a pair of stereo cameras that were mounted alongside the LiDAR (one left, one right), of which I only have the camera intrinsic parameters.
I am trying to replicate a formula found in several papers, which can be seen here, equations 2 & 3. It is originally found in the KITTI dataset paper, equation 8. Basically, they project a point cloud into the camera with an equation of the form y = P · R · T_{cam}^{velo} · T_{velo}^{imu} · x, where P is the projection matrix containing the camera intrinsic parameters, R the rectifying rotation matrix of the reference camera, T_{cam}^{velo} the rigid body transformation from lidar coordinates to camera coordinates, and T_{velo}^{imu} the rigid body transformation from IMU coordinates to lidar coordinates.
I want to note that not all papers used the last parameter (T_{velo}^{imu}), and because I don't have the imu information I will omit that parameter.
While I only have the camera intrinsic parameters, I am able to extract the camera rotation and translation by way of the essential matrix. Along with the data, I also have a file containing the yaw, pitch and roll (in degrees) of the camera and lidar at the time the images were being taken. I know that I can extract a rotation matrix from these parameters, but I am not quite sure how to use them in this case, specifically to obtain the rigid body transformation from lidar to camera coordinates. I should also mention that I have the real-world coordinates of the camera at the time each image was taken (in x, y, z coordinates).
Here is the detailed answer to transform points between the lidar and the camera coordinate systems. I used the homogeneous representation of 3D transformations (4x4 matrices) because it is easier to manipulate, but it would be exactly the same if you used R and T instead (the equation would just be longer).
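For example, a rough sketch of that chain in homogeneous form; R_cam_velo, t_cam_velo, the intrinsic matrix K and the (N, 3) lidar array pts_velo are placeholders for your own data:

import numpy as np

def to_homogeneous(R, t):
    # Build a 4x4 homogeneous transform from a 3x3 rotation and a translation.
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t.ravel()
    return T

T_cam_velo = to_homogeneous(R_cam_velo, t_cam_velo)   # rigid body transform: lidar -> camera

pts_h = np.hstack([pts_velo, np.ones((pts_velo.shape[0], 1))])   # (N, 4) homogeneous points
pts_cam = (T_cam_velo @ pts_h.T)[:3, :]                          # (3, N) points in the camera frame

# Keep only points in front of the camera, then project with the intrinsics.
in_front = pts_cam[2, :] > 0
uvw = K @ pts_cam[:, in_front]
uv = uvw[:2, :] / uvw[2, :]                                      # pixel coordinates (u, v)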
I assume the pose estimate from aruco markers is only valid when they are affixed to a flat surface. What about attaching them to a slightly curved surface? Is there any way to correct for the surface curvature in the resulting pose?
Yes, you should be able to get the pose estimate for a curved surface using an ArUco board, though it may be physically difficult to construct and measure. ArUco boards do not need to be planar; they can describe any arrangement of markers in 3D space. So the following steps should work (a code sketch is given at the end of this answer):
attach markers to your curved surface (which may be a challenge if the surface is not developable).
calculate, or directly measure, the 3D positions of the physical markers' corners in your preferred Cartesian coordinate system.
define an Aruco board using the markers' 3D corner positions and IDs.
tune the Aruco detection parameters (at least adaptive threshold and polygonal approximation) to give robust marker detection in the presence of curvature in the marker edges and localised lighting variations due to the curved surface.
once marker detection is reliable, use estimatePoseBoard() to get the pose estimate of the board, and hence the curved surface, in the same Cartesian coordinate system. estimatePoseBoard() finds a best-fit solution for the pose while considering all visible markers simultaneously.
Note: I haven't actually tried this.
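A rough sketch of the board-definition and pose-estimation steps, using the same older cv.aruco API as the other answers here. measured_corners and ids_of_those_markers are placeholders for the 3D corner positions and IDs from steps 2 and 3, mtx/dist are your calibration results, and depending on your OpenCV version you may need to pass preallocated rvec/tvec arrays instead of None:

import numpy as np
import cv2 as cv

# One (4, 3) float32 array of corner positions per marker, in your chosen 3D
# coordinate system (ArUco expects the corners ordered clockwise from the top-left).
obj_points = [np.asarray(c, dtype=np.float32) for c in measured_corners]
marker_ids = np.asarray(ids_of_those_markers, dtype=np.int32)

dictionary = cv.aruco.Dictionary_get(cv.aruco.DICT_4X4_50)
board = cv.aruco.Board_create(obj_points, dictionary, marker_ids)

parameters = cv.aruco.DetectorParameters_create()    # tune thresholds here (step 4)
corners, ids, rejected = cv.aruco.detectMarkers(frame, dictionary, parameters=parameters)
n_used, rvec, tvec = cv.aruco.estimatePoseBoard(corners, ids, board, mtx, dist, None, None)
# rvec/tvec is the board pose in camera coordinates, fitted over all visible markers.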
I need to evaluate whether a camera is viewing a 3D real object. To do so I have the 3D model of the world I am moving in and the pose of the robot my camera is attached to. So far, so good; the object in camera coordinates will be
[x, y, z]' = R·X + T
where X is the real object position and R, T are the camera rotation and translation (the extrinsics). The camera I am using is a 170° FOV camera, and I need to calibrate it in order to convert these [x, y, z] into pixel coordinates I can evaluate. If the pixel coordinates are greater than (0, 0) and smaller than (width, height), I will consider that the camera is looking at the object.
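Once the fisheye calibration has produced an intrinsic matrix K and distortion coefficients D, the test I have in mind could be sketched roughly as follows, using OpenCV's cv2.fisheye.projectPoints (every variable name here is a placeholder):

import cv2
import numpy as np

# X_world: (N, 3) object points in world coordinates; R (3x3) and T (3,) are the
# camera extrinsics derived from the robot pose; K, D come from the fisheye calibration.
rvec, _ = cv2.Rodrigues(R)                      # projectPoints expects a rotation vector
img_pts, _ = cv2.fisheye.projectPoints(
    X_world.reshape(-1, 1, 3).astype(np.float64), rvec, T.reshape(3, 1), K, D)

u, v = img_pts[0, 0]
# The point should also be in front of the camera (positive z in camera coordinates).
in_view = (0 <= u < width) and (0 <= v < height)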
Can I do a similar test without the conversion to pixel coordinates? I guess not, so I am trying to calibrate the fisheye camera with https://bitbucket.org/amitibo/pyfisheye/src, which is a wrapper over the faulty opencv 3.1.0 fisheye model.
Here is one of my calibration images:
Using the simplest test (https://bitbucket.org/amitibo/pyfisheye/src/default/example/test_fisheye.py), this is the comparison with the undistorted image:
It looks really nice, and here is the undistorted image:
How can I get the whole "butterfly" undistorted image? I am currently seeing the lower border...