I've been doing a lot of Image Processing recently on Python using OpenCV and I've worked all this while with 2-D Images in the generic BGR style.
Now, I'm trying to figure out how to incorporate depth and work with depth information as well.
I've seen the documentation on creating simple point clouds using the Left and Right images of a Stereocamera, but I was hoping to gain some intuition on Depth-based cameras themselves like Kinect.
What kind of camera should I use for this purpose, and more importantly: how do I process these images in Python - as I can't find a lot of documentation on handling RGBD images in OpenCV.
If you want to work with depth based cameras you can go for Time of Flight(ToF) cameras like picoflexx and picomonstar cameras. They will give you X,Y and Z values. Where your x and y values are distances from camera centre of that point (like in 2D space) and Z will five you the direct distance of that point (not perpendicular) from camera centre.
For this camera and this 3d data processing you can use Point Cloud Library fro processing.
Related
I am working on a python project which involves stitching high resolution images which are mostly greyscale. The method of extracting features is not viable for this project as the images do not contain enough key points/descriptors for feature extracting with algorithms such as SIFT/SURF.
I have attempted to predict a pixels location between 2 images using the intrinsic matrix of the camera to create a rotation matrix but this was unsuccessful.
is it possible to project pixels from 1 image into another in this way at all or should I be looking at something else? possibly working with the extrinsic matrix?
the camera uses a 600mm lens with a full frame sensor.
I have a setup where a (2D) camera is mounted on the end-effector of a robot arm - similar to the OpenCV documentation:
I want to calibrate the camera and find the transformation from camera to end-effector.
I have already calibrated the camera using this OpenCV guide, Camera Calibration, with a checkerboard where the undistorted images are obtained.
My problem is about finding the transformation from camera to end-effector. I can see that OpenCV has a function, calibrateHandEye(), which supposely should achieve this. I already have the "gripper2base" vectors and are missing the "target2cam" vectors. Should this be based on the size of the checkerboard squares or what am I missing?
Any guidance in the right direction will be appreciated.
You are close to the answer.
Yes, it is based on the size of the checkerboard. But instead of directly taking those parameters and an image, this function is taking target2cam. How to get target2cam? Just simply move your robot arm above the chessboard so that the camera can see the chessboard and take a picture. From the picture of the chessboard and camera intrinsics, you can find target2cam. Calculating the extrinsic from the chessboard is already given in opencv.
Repeat this a couple of times at different robot poses and collect multiple target2cam. Put them calibrateHandEye() and you will get what you need.
I have pairs of image - an image (blurred intentionally) and its depth map (given as PNG).
For example:
However, there seems to be a shift between the depth map and the real image as can be seen in this example:
All i know that these images were shot with a RealSense LiDAR Camera L515 (I do not have knowledge of the underlying camera characteristics or the distance between both rgb and infrared sensors).
Is there a way to align both images? I searched the internet for possible solutions. However, all solutions rely on data that I do not have, such as the intrinsic matrix, cameras SDK and more.
Since the two imaging systems are very close physically, the homography between them would likely be a good approximation. You can find the homography using 4 corresponding points that you choose manually.
You can use the OpenvCV implementation.
recently I have been playing with the 360 fly HD camera and wondering if Aruco Marker can be detected during real time. The first thing come to my mind is to convert the fisheye image into perspective image first and then perform the detection on the perspective image(I am gonna try it and will update my result here later).
Converting a fisheye image into a panoramic, spherical or perspective projection
Hugin HowTo: Convert 360 Image to Cropped Flat Panoramic Image
I am not an expert in this field. Has anyone done this before? Is this something can be achieved by calibrating the camera differently such as correcting the camera matrix and distortion coefficient matrix?
If I am heading to the wrong direction, please let me know.
I was able to get a better understanding during the process.
First, I want to say that 360(fisheye, spherical, however you call it) image is NOT distorted. I was so tricked by my intuition and thought that the image was distorted based on what it looks like. NO it is not distorted. Please read enter link description here for more information.
Next, I have tried both 360 fly cameras and neither works. Every time I tried to access the camera with opencv, it automatically powers off and switch to storage mode. I guess the 360 dev team purposely implements this switching function to prevent "hacking" of their products. But, I've seen people successfully hacked the 360 fly, it's definitely workable.
At last, I was able to detect Aruco with Ricoh theta V(theta S should also work). It's so developer friendly and I was able to make it run in my first attempt. You just have to select the right camera and let the code run. The only problem is the range, which is expected(about 6ft) and Ricoh camera is kind of expensive($499).
click here to view succesful detection
I was implementing the depth map construction, code of which (in Python) is available here OpenCv Docs - depthMap I was successful in getting the depth map as they showed in the doc for their given images-pair (left and right stereo images) tsukuba_l.png and tsukuba_2.png. I considered testing my own image pairs, so I took from my mobile a pair of images, as shown below:
When I run the code, I'm getting the depth map something like this
I tried playing with numDisparities and blocksize, but it didn't help in getting the best map.
I thought of checking the script of cv2.StereoBM_create in its master folder in Github, but couldn't get that online. Can you help me with a way to implement depth maps for custom images taken by me? is there a way that we can play with the parameters or at least get me the link to GitHub master module that has all Stereo related modules. Thank you.
I guess you did not rectify the images which is fundamental for stereo matching.
You should first calibrate your stereo system (if you took them with mobile phone every image pair you take will have a different transform, the two cameras need to have always the same transformation between each other) and then rectify the images, in that way they are projected onto the same plane, then the stereo match algorithm looks for correspondences in the other image on the same rows.
Check in the docs for stereoRectify(), you will see some images as example of the rectification process.
By the way there is another python example based on SemiGlboal Block Matching algorithm in opencv/samples/python/stereo_match.py.