I am using Object segmentation dataset having following information:
Introduced: IROS 2012
Device: Kinect v1
Description: 111 RGBD images of stacked and occluding objects on table.
Labelling: Per-pixel segmentation into objects.
link for the page: http://www.acin.tuwien.ac.at/?id=289
I am trying to use the depth map provided by the dataset. However, it seems the depth map is completely black.
Original image for the above depth map
I tried to do some preprocessing and normalised the image so that the depth map could be visualised in the form of a gray image.
img_depth = cv2.imread("depth_map.png",-1) #depth_map.png has uint16 data type
depth_array = np.array(img_depth, dtype=np.float32)
frame = cv2.normalize(depth_array, depth_array, 0, 1, cv2.NORM_MINMAX)
cv2.imwrite('capture_depth.png',frame*255)
The result of doing this preprocessing is:
In one of the posts in stackoverflow, i read that these black patches are the regions where the depth map was not defined.
If i have to use this depth map, what is the best possible way to fill these undefined regions? (I am thinking of filling these regions with K-nearest neighbour but feel there could be better ways for this).
Are there any RGB-D datasets that do not have such problems or these kind of problems always exists? what are the best possible way to tackle such problems?
Thanks in Advance!
Pretty much every 3d imaging technology will produce data with invalid or missing points. Lack of texture, too steep slopes, obscuration, transparency, reflections,... you name it.
There is no magic solution to filling these holes. You'll need some sort of interpolation or you maybe replace missing points based on some model.
The internet is full of methods for filling holes. Most techniques for intensity images can be successsfully applied to depth images.
It will depend on your application, your requirements and what you know about your objects.
Data quality in 3d is a question of time, money and the right combination of object and technology.
Areas that absorb or scatter the Kinect IR (like glossy surfaces or sharp edges) are filled with zero pixel value (indicating non-calculated depth). A method to approximately fill the non-captured data around these areas is by using the statistical median of a 5x5 window. This method works just fine for Kinect depth images. An example implementation can be seen for Matlab and C# in the links.
Related
I have pairs of image - an image (blurred intentionally) and its depth map (given as PNG).
For example:
However, there seems to be a shift between the depth map and the real image as can be seen in this example:
All i know that these images were shot with a RealSense LiDAR Camera L515 (I do not have knowledge of the underlying camera characteristics or the distance between both rgb and infrared sensors).
Is there a way to align both images? I searched the internet for possible solutions. However, all solutions rely on data that I do not have, such as the intrinsic matrix, cameras SDK and more.
Since the two imaging systems are very close physically, the homography between them would likely be a good approximation. You can find the homography using 4 corresponding points that you choose manually.
You can use the OpenvCV implementation.
How to go from the image on the left to the image on the right programmatically using Python (and maybe some tools, like OpenCV)?
I made this one by hand using an online tool for clipping. I am completely noob in image processing (especially in practice). I was thinking to apply some edge or contour detection to create a mask, which I will apply later on the original image to paint everything else (except the region of interest) black. But I failed miserably.
The goal is to preprocess a dataset of very similar images, in order to train a CNN binary classifier. I tried to train it by just cropping the image close to the region of interest, but the noise is so high that the CNN learned absolutely nothing.
Can someone help me do this preprocessing?
I used OpenCV's implementation of watershed algorithm to solve your problem. You can find out how to use it if you read this great tutorial, so I will not explain this into a lot of detail.
I selected four points (markers). One is located on the region that you want to extract, one is outside and the other two are within lower/upper part of the interior that does not interest you. I then created an empty integer array (the so-called marker image) and filled it with zeros. Then I assigned unique values to pixels at marker positions.
The image below shows the marker positions and marker values, drawn on the original image:
I could also select more markers within the same area (for example several markers that belong to the area you want to extract) but in that case they should all have the same values (in this case 255).
Then I used watershed. The first input is the image that you provided and the second input is the marker image (zero everywhere except at marker positions). The algorithm stores the result in the marker image; the region that interests you is marked with the value of the region marker (in this case 255):
I set all pixels that did not have the 255 value to zero. I dilated the obtained image three times with 3x3 kernel. Then I used the dilated image as a mask for the original image (i set all pixels outside the mask to zero) and this is the result i got:
You will probably need some kind of method that will find markers automatically. The difficulty of this task depends heavily on the set of the input images. In some cases, the method can be really straightforward and simple (as in the tutorial linked above) but sometimes this can be a tough nut to crack. But I can't recommend anything because I don't know how your images look like in general (you only provided one). :)
I am in the process of putting together an OpenCV script to analyze immunohistochemically stained heart tissue. Our staining procedure renders cell types expressing certain proteins in their plasma membranes with pigments visible under a light microscope, which we use to photograph the images.
So far, I've succeded in segmenting the images to different layers based on color range using a modified version of the frequently cited color segmentation script available through the OpenCV community(http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_imgproc/py_colorspaces/py_colorspaces.html).
A screen shot of the original image:
B-Cell layer displayed:
At this point, I would like to calculate the ratio of area of B-Cells to unstained tissue. This operation prompted an extraction of the background cell layer as such based on color range:
Obviously, these results leave much to be desired.
Does anyone have ideas of how to approach this problem? Again, I would like to segment the background tissue (transparent) layer, which is unfortunately fairly sponge-like in texture. My goal is to create a mask representive of the area of unstained tissue. It seems a blur technique is necessary to fill the gaps in the tissue, but the loss in accuracy this approach entails is obvious.
In the sample image, the channels look highly correlated. If you apply decorrelation-stretching to the image you should be able to see more detail. Here in my blog post I've implemented decorrelation-stretching in C++ (unfortualtely not Python).
Using the sample code in the blog I did the following to segment the cell region:
dstretch the CIE Lab image with following targetMean and tergetSigma.
float mu[3] = {128.0f, 128.0f, 128.0f};
float sd[3] = {128.0f, 5.0f, 5.0f};
Mat mean = Mat(3, 1, CV_32F, mu);
Mat sigma = Mat(3, 1, CV_32F, sd);
Convert the dstretched CIE Lab image back to BGR.
Erode this BGR image with a 3x3 rectangular structuring element once.
Apply kmeans clustering to this eroded image with k = 2.
I don't know how good this segmentation is. I think it is possible to get a better segmentation by trying different values for the above parameters (mean, sigma, structuring element size and number of times the image is eroded).
(Following images are not to the original scale)
Original:
dstretched CIE Lab converted back to BGR:
Eroded:
kmeans with k = 2:
Imagine someone taking a burst shot from camera, he will be having multiple images, but since no tripod or stand was used, images taken will be slightly different.
How can I align them such that they overlay neatly and crop out the edges
I have searched a lot, but most of the solutions were either making a 3D reconstruction or using matlab.
e.g. https://github.com/royshil/SfM-Toy-Library
Since I'm very new to openCV, I will prefer a easy to implement solution
I have generated many datasets by manually rotating and cropping images in MSPaint but any link containing corresponding datasets(slightly rotated and translated images) will also be helpful.
EDIT:I found a solution here
http://www.codeproject.com/Articles/24809/Image-Alignment-Algorithms
which gives close approximations to rotation and translation vectors.
How can I do better than this?
It depends on what you mean by "better" (accuracy, speed, low memory requirements, etc). One classic approach is to align each frame #i (with i>2) with the first frame, as follows:
Local feature detection, for instance via SIFT or SURF (link)
Descriptor extraction (link)
Descriptor matching (link)
Alignment estimation via perspective transformation (link)
Transform image #i to match image 1 using the estimated transformation (link)
I need some help developing some code that segments a binary image into components of a certain pixel density. I've been doing some research in OpenCV algorithms, but before developing my own algorithm to do this, I wanted to ask around to make sure it hasn't been made already.
For instance, in this picture, I have code that imports it as a binary image. However, is there a way to segment objects in the objects from the lines? I would need to segment nodes (corners) and objects (the circle in this case). However, the object does not necessarily have to be a shape.
The solution I thought was to use pixel density. Most of the picture will made up of lines, and the objects have a greater pixel density than that of the line. Is there a way to segment it out?
Below is a working example of the task.
Original Picture:
Resulting Images after Segmentation of Nodes (intersection of multiple lines) and Components (Electronic components like the Resistor or the Voltage Source in the picture)
You can use an integral image to quickly compute the density of black pixels in a rectangular region. Detection of regions with high density can then be performed with a moving window in varying scales. This would be very similar to how face detection works but using only one super-simple feature.
It might be beneficial to make all edges narrow with something like skeletonizing before computing the integral image to make the result insensitive to wide lines.
OpenCV has some functionality for finding contours that is able to put the contours in a hierarchy. It might be what you are looking for. If not, please add some more information about your expected output!
If I understand correctly, you want to detect the lines and the circle in your image, right?
If it is the case, have a look at the Hough line transform and Hough circle transform.