I have an image containing cells. I can't provide it, but it is similar to the image used as an example here: http://blogs.mathworks.com/steve/2006/06/02/cell-segmentation/ but without the characteristic nuclei.
I have done some processing and am now left with a pretty good segmentation, but some cells are close to each other and I need to split them. Most of them consist of more or less overlapping ellipses.
I am certain that a few iterations of simple erosion will split almost all of those regions. But some of the other cells are so small, they will disappear before the others split. Therefore I need an algorithm that erodes the image, allowing region splitting, but does not delete the last pixel of a region.
I want to use watershed afterwards to segment the cells.
I guess I could implement this on my own by searching for connected regions and then tracking that I don't lose any, or something like that, but the implementation seems messy even in my head and I think there must be an easier way. So my question is basically: what's the name of this, so I can google an implementation? Or, if there is no off-the-shelf solution, what's an elegant way of implementing this without dozens of iterations, for loops, etc.?
(Language is Python)
It's a classical problem, and if the overlap between cells is too large, say 40% or more, then there is no good solution.
However, if the overlap is small, here is the solution:
You start from the segmentation you have; let's call it S.
You compute the ultimate erosion UE(S). It will give you the center of each cell, something like the red points on this image. In this image they use a distance map; an ultimate erosion will be more stable. If there are still several red points per cell, then a dilation of UE(S) will fix your problem, like this example.
You invert S to get Inv(S), or compute the Voronoi diagram Voi(S), in order to have a marker in the background.
Run a watershed on the gradient image of S, using UE(S) as the inner marker (perfect, because you have one point per cell) and Inv(S) or Voi(S) as the background/outer marker.
You will get something like this example.
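The steps above can be sketched roughly as follows. This is only a minimal sketch, assuming numpy and scipy are available and that S is a boolean mask; the function names ultimate_erosion and split_cells are made up for illustration. The ultimate erosion is done by eroding repeatedly and keeping every connected component just before it would vanish, which is exactly the "erode but never delete the last pixel" behaviour asked for. For the final step the sketch does not run a real watershed: it simply assigns each foreground pixel to its nearest marker, as a stand-in.

```python
import numpy as np
from scipy import ndimage as ndi


def ultimate_erosion(mask):
    """Erode repeatedly, keeping each component right before it would vanish."""
    current = np.asarray(mask, dtype=bool)
    kept = np.zeros_like(current)
    while current.any():
        eroded = ndi.binary_erosion(current)
        labels, _ = ndi.label(current)
        # Labels of components that still own at least one pixel after eroding.
        surviving = np.unique(labels[eroded])
        # Components with no surviving pixel would disappear: keep them as markers.
        kept |= (labels > 0) & ~np.isin(labels, surviving)
        current = eroded
    return kept


def split_cells(mask):
    """Label each cell of `mask` using the ultimate erosion as inner markers.

    Stand-in for the watershed step: every foreground pixel gets the label
    of its nearest marker (Euclidean distance).
    """
    mask = np.asarray(mask, dtype=bool)
    seeds, n_cells = ndi.label(ultimate_erosion(mask))
    # For every pixel, the index of the nearest seed pixel.
    _, nearest = ndi.distance_transform_edt(seeds == 0, return_indices=True)
    labels = seeds[nearest[0], nearest[1]]
    labels[~mask] = 0  # background stays unlabeled
    return labels, n_cells
```

If scikit-image is installed, the marker array from ndi.label(ultimate_erosion(mask)) can be passed to skimage.segmentation.watershed together with mask=mask instead of using the nearest-marker assignment.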
Related
I have the following JPG image. I want to find the edges where the white page meets the black background, so I can rotate the contents a few degrees clockwise. My aim is to straighten the text for use with Tesseract OCR conversion. I don't see the need to rotate the text blocks, as I have seen in similar examples.
In the Canny Edge Detection docs, the third argument (200 in, e.g., edges = cv.Canny(img,100,200)) is maxVal, and edges above it are said to be 'sure to be edges'. Is there any way to determine these (max/min) values without resorting to trial and error?
I have used code examples which utilize the Python cv2 module. But the edge detection is set up for simpler applications.
Is there any approach I can use to take the text out of the equation? For example, only detecting edge lines greater than a specified length?
Any suggestions would be appreciated.
Below is an example of edge detection (the above image, same min/max values). The outer edge of the page is clearly defined. The image is high-contrast b/w with even lighting. I can't see a need for an adaptive threshold; a simple global one is working. It's just a question of what ratio to use.
I don't have the answer to this yet, but to add: I now have the contours of the above doc.
I used find contours tutorial with some customization of the file loading. Note: removing words gives a thinner/cleaner outline.
Consider Otsu.
Its chief virtue is that it adapts to the image: it picks a single
global threshold from the image's own histogram.
In your case, blank margins might be the saving grace.
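To make the Otsu suggestion concrete, here is a minimal numpy sketch of the threshold computation, assuming an 8-bit grayscale image. The helper canny_thresholds_from_otsu is hypothetical: using the Otsu value as Canny's maxVal and half of it as minVal is a common rule of thumb, not something from the docs quoted in the question. In OpenCV the same threshold is returned by cv.threshold with the cv.THRESH_OTSU flag.

```python
import numpy as np


def otsu_threshold(gray):
    """Return t maximizing the between-class variance (classes: <= t and > t)."""
    values = np.asarray(gray, dtype=np.uint8).ravel()
    hist = np.bincount(values, minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # weight of the dark class for each t
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean up to t
    denom = omega * (1.0 - omega)
    between = np.zeros(256)
    valid = denom > 0
    between[valid] = (mu[-1] * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))


def canny_thresholds_from_otsu(gray, ratio=0.5):
    """Hypothetical helper: (minVal, maxVal) guess for Canny from Otsu."""
    high = float(otsu_threshold(gray))
    return ratio * high, high
```

The resulting pair is only a starting point for cv.Canny; it removes the blind guessing, not the need to look at the result.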
Consider working on a series of 2x reduced resolution images,
where new pixel is min() (or even max()!) of original four pixels.
These reduced images might help you to focus on the features
that matter for your use case.
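A small numpy sketch of that 2x reduction, assuming a single-channel image; reduce2x is an illustrative name. Odd trailing rows or columns are simply dropped.

```python
import numpy as np


def reduce2x(img, op=np.min):
    """Halve the resolution; each new pixel is op() of a 2x2 block."""
    img = np.asarray(img)
    h = img.shape[0] // 2 * 2
    w = img.shape[1] // 2 * 2
    # Axes 1 and 3 index the position inside each 2x2 block.
    blocks = img[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return op(blocks, axis=(1, 3))
```

With op=np.min dark features (text on a white page) grow and survive the reduction; with op=np.max they shrink, which may help remove text while keeping the page outline.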
The usual way to deskew scanned text is to binarize and
then keep changing theta until "sum of pixels across raster"
is zero, or small. In particular, with few descenders
and decent inter-line spacing, we will see "lots" of pixels
on each line of text and "near zero" between text lines,
when theta matches the original printing orientation.
Which lets us recover (1.) pixels per line, and (2.) inter-line spacing, assuming we've found a near-optimal theta.
In your particular case, focusing on the ... leader dots
seems a promising approach to finding the globally optimal
deskew correction angle. Discarding large rectangles of
pixels in the left and right regions of the image could
actually reduce noise and enhance the accuracy of
such an approach.
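The theta search described above could look something like this. It is a sketch under a few assumptions: scipy is available, the input is a binarized image with text pixels set to 1, and "rows are either full or empty" is scored by the variance of the row sums, which is my choice of score rather than anything prescribed above. The names profile_score and estimate_skew are illustrative.

```python
import numpy as np
from scipy import ndimage as ndi


def profile_score(binary, angle):
    """Variance of the row sums after rotating by `angle` degrees."""
    rotated = ndi.rotate(binary.astype(float), angle, reshape=False, order=1)
    # Sharp text lines give rows that are either nearly full or nearly empty,
    # which makes the row-sum profile vary a lot.
    return rotated.sum(axis=1).var()


def estimate_skew(binary, max_angle=5.0, step=0.5):
    """Return the rotation (degrees) that best aligns text lines with the rows."""
    angles = np.arange(-max_angle, max_angle + step, step)
    scores = [profile_score(binary, a) for a in angles]
    return float(angles[int(np.argmax(scores))])
```

The returned angle is the one to pass to ndi.rotate (or an equivalent rotation) to straighten the page. Cropping the input to the columns containing the leader dots before calling estimate_skew is one way to try the suggestion above.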
I am relatively new to Python and would like some help with some ideas to solve this problem...
I have a black and white image as so:
black image with white dots
And essentially need to get the midpoint (or honestly any point, as long as it's consistent across all of the dots) of each of those white dots. The program could spit out a list of coordinate points for each of those dots.
I am doing this because I want to have a list of the distances of each dot from its place to the bottom of the image. I said getting the mid-point doesn't matter, it could be any point as long as it's consistent across the dots because I am comparing the values of one image to the values of another that would be measured in the same way.
I had tried to split the image into rows and then count the number of pixels in each row, but that felt like it was limiting and wouldn't really do the best job.
I was thinking to maybe make a loop that looks at one pixel and then checks to see the pixels around it until it reaches the edge or something like that, but it seems like that would take a lot of computing power even with B&W as I have to run this through hundreds of images that have approximately 10 million pixels.
Possibly a solution related to converting the coordinates of the image into a graph and performing cluster analysis?
If you have a binary image, then I think using skimage to label the dots and then get their region properties is the way to go. I think this tutorial should get you moving on the task you are hoping to accomplish:
https://scikit-image.org/docs/stable/auto_examples/segmentation/plot_regionprops.html
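For reference, the same label-then-measure idea can be sketched with scipy.ndimage instead of skimage (an assumption made here to keep the example small; skimage.measure.label and skimage.measure.regionprops give the same centroids). The function name dot_positions is made up, and the distance to the bottom is measured from the centroid row to the last image row.

```python
import numpy as np
from scipy import ndimage as ndi


def dot_positions(binary):
    """Return (row, col, distance_to_bottom) for the centroid of every white dot."""
    binary = np.asarray(binary, dtype=bool)
    labels, n = ndi.label(binary)
    if n == 0:
        return []
    centroids = ndi.center_of_mass(binary.astype(float), labels,
                                   index=np.arange(1, n + 1))
    bottom_row = binary.shape[0] - 1
    return [(float(r), float(c), float(bottom_row - r)) for r, c in centroids]
```

Labeling is a single pass over the image, so this should stay fast even for 10-megapixel images.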
I am having trouble finding an algorithm to remove the convexity from my photos. As you can see, the photos are captured from book pages, and I want to remove the convexity. My question is similar to this one, but all I have as input is the page boundaries; I have no grid, and I am not able to find one with processing algorithms.
I want output like the right one in the photo below.
Obviously, perspective transformation is the first thing that comes to mind. However, as you can see, the result is not promising:
Here's a possible pipeline to solve your problem. The main idea is to identify the text, create a super blob of it with some morphology, locate the 4 corners of this super blob and feed the points to a perspective "unwarper" (or rectifier, or whatever you wish to call that perspective correction method).
Start by converting your image to grayscale and apply adaptive thresholding to it. Try the Gaussian or Mean methods with parameters that better fit your tests. This is the result I obtain after fiddling with the values for a bit:
Now, the idea is to isolate just the text. The solution I applied is: obtain the biggest blobs and subtract them from the original image. You're going to need a method to calculate the area of each binary blob. Check this previous post for suggestions on how to implement one.
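One possible way to write such an area filter, sketched with scipy.ndimage connected-component labeling (an assumption; the linked post may do it differently). The names area_filter and largest_blobs are illustrative.

```python
import numpy as np
from scipy import ndimage as ndi


def blob_areas(binary):
    """Label the blobs and return (labels, areas), with areas[0] for background."""
    labels, n = ndi.label(binary)
    areas = np.bincount(labels.ravel(), minlength=n + 1)
    return labels, areas


def area_filter(binary, min_area=0, max_area=None):
    """Keep only blobs whose pixel count lies in [min_area, max_area]."""
    labels, areas = blob_areas(binary)
    keep = areas >= min_area
    if max_area is not None:
        keep &= areas <= max_area
    keep[0] = False  # never keep the background
    return keep[labels]


def largest_blobs(binary, count=1):
    """Return a mask with only the `count` biggest blobs."""
    labels, areas = blob_areas(binary)
    areas[0] = 0
    biggest = np.argsort(areas)[::-1][:count]
    biggest = biggest[areas[biggest] > 0]
    return np.isin(labels, biggest)
```

Subtracting the biggest blobs is then binary & ~largest_blobs(binary, count), and the same area_filter with a min_area removes the leftover specks.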
These are the biggest blobs from the image:
Subtract the largest blobs from the original image. This is the result:
As you can see, the text is almost isolated. Let me clean up the little bits of pixels by applying, again, an area filter. This time to eliminate the small blobs. This is the result:
Very good. Some characters are lost during the operation, but that's ok. We need a nice continuous block of text, because we are going to dilate the hell out of it. I tried applying a rectangular structuring element of size 5 and 5 iterations of the dilation. Erode the output with 5 more iterations afterward, so you end up with this nice, isolated super blob where the text used to be:
Check it out. The 3 markers you see are the centroids of the biggest blobs that I detected on the image. We need to find the 4 corners of the super blob. The biggest blob in the image is what we are after. I decided to re-use the area filter and look for the blob with the biggest area. This is the isolated super blob:
From here, the operations are pretty straightforward. Again, the goal is to get the four corners of this blob. You can fit a rectangle or apply an edge detector followed by Hough transform, to get the straight lines that follow the edges of the super blob.
I decided to apply a Canny Edge detector followed by Hough transform. Of course, I tuned the transform to filter only the possible lines I’m interested in – straight lines above a certain length. This is the result of the line detection:
There's some extra info plotted on the image. The markers you see (red and yellow) are the start/endpoints of the lines. My idea here was to find a bunch of these lines and compute the mean of these points. The idea is that we have a cluster of points that are separated in "quadrants". If we compute the mean of the start and endpoints of each line per quadrant, we will end up with 4 means – and these are the approximate values of the super blob’s corners!
I applied K-means to the start and endpoints of the lines, but you may very well prefer other methods of processing. That's ok. My approximate corners are identified by the big red O markers in the above image.
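A tiny version of that clustering step, written with plain numpy so it is deterministic. It is a sketch, not the code I actually ran: it assumes the line endpoints come as (x, y) pairs and initializes the 4 cluster centers at the corners of the points' bounding box, which is an assumption that only works when the blob is not rotated too far.

```python
import numpy as np


def estimate_corners(points, iterations=20):
    """Cluster line endpoints into 4 groups; return centers as TL, TR, BR, BL."""
    pts = np.asarray(points, dtype=float)
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    # Initial guesses: corners of the bounding box (an assumption of this sketch).
    centers = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]])
    for _ in range(iterations):
        dist = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        for k in range(4):
            members = pts[assign == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return centers
```

Because the initial centers are ordered, the output keeps the top-left, top-right, bottom-right, bottom-left order, which is convenient for the perspective mapping.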
As I suggested, try giving a fixed output position for these corners. I defined the red rectangle for the corners to be mapped on. For this test, I pretty much adjusted the rectangle manually. The perspective correction yields this result:
Some suggestions:
Depending on the resolution of the input image, you could downsize it
for a faster and better result, as your input seems big enough for
that.
Tune Hough Line Detection to yield larger lines. My current
configuration detects some smaller lines and that can hinder the
corner approximation.
I chose a somewhat robust method for calculating the 4 corners of
the super blob that I've personally used before (Edge detection +
Hough Line Transform + K-means) but whatever processing chain you
choose to obtain the data is entirely up to you!
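For completeness, this is roughly what mapping the 4 estimated corners onto a fixed output rectangle involves. It is a plain numpy sketch that solves for the 3x3 homography from the four point pairs; in OpenCV, cv2.getPerspectiveTransform followed by cv2.warpPerspective does the same job on the whole image. The corner coordinates in the usage note are made-up values.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Solve for H (3x3, H[2, 2] = 1) mapping 4 src points onto 4 dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, points):
    """Map (x, y) points through H, including the perspective divide."""
    pts = np.asarray(points, dtype=float)
    mapped = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

For example, with src = [(10, 10), (90, 12), (88, 80), (12, 78)] and dst = [(0, 0), (100, 0), (100, 80), (0, 80)], apply_homography(H, src) returns the dst rectangle.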
When humans see markers suggesting the form of a shape, they immediately perceive the shape itself, as in https://en.wikipedia.org/wiki/Illusory_contours. I'm trying to accomplish something similar in OpenCV in order to detect the shape of a hand in a depth image with very heavy noise. In this question, assume that skin color based detection is not working (actually it is the best I've achieved so far, but it is not robust under changing light conditions, shadows or skin colors). Also, various paper shapes (flat and colorful) are on the table, confusing color-based approaches. This is why I'm attempting to use the depth cam instead.
Here's a sample image of the live footage that is already pre-processed for better contrast and with background gradient removed:
I want to isolate the exact shape of the hand from the rest of the picture. For a human eye this is a trivial thing to do. So here are a few attempts I did:
Here's the result with canny edge detection applied. The problem here is that the black shape inside the hand is larger than the actual hand, causing the detected hand to overshoot in size. Also, the lines are not connected and I fail at detecting contours.
Update: Combining Canny and a morphological closing (4x4 px ellipse) makes contour detection possible with the following result. It is still waaay too noisy.
Update 2: The result can be slightly enhanced by drawing that contour to an empty mask, saving that in a buffer and re-detecting yet another contour on a merge of three buffered images. The line that combines the buffered images is hand_img = np.array(np.minimum(255, np.multiply.reduce(self.buf)), np.uint8) which is then morphed once again (closing) and finally contour detected. The results are slightly less horrible than in the picture above but laggy instead.
Alternatively I tried to use an existing CNN (https://github.com/victordibia/handtracking) for detecting the approximate position of the hand's center (this step works) and then flood from there. In order to detect contours the result is put into an OTSU filter and then the largest contour is taken, resulting in the following picture (ignore black rectangles in the left). The problem is that some of the noise is flooded as well and the results are mediocre:
Finally, I tried background removers such as MOG2 or GMG. They are confused by the enormous amount of fast-moving noise. Also they cut off the fingertips (which are crucial for this project). Finally, they don't see enough details in the hand (8 bit plus further color reduction via equalizeHist yield a very poor grayscale resolution) to reliably detect small movements.
It's ridiculous how simple it is for a human to see the exact precise shape of the hand in the first picture and how incredibly hard it is for the computer to draw a shape.
What would be your recommended method to achieve an exact hand segmentation?
After two days of desperate testing, the solution was to VERY carefully apply thresholding to a well-preprocessed image.
Here are the steps:
Remove as much noise as you possibly can. In my case, denoising was done using Intel's pyrealsense2 (I'm using an Intel RealSense depth camera and the algorithms were written for that camera family, thus they work very well). I used rs.temporal_filter() and directly after rs.hole_filling_filter() on every frame.
Capture the very first frame. Besides capturing the exact distance to the table (for later thresholding), this step also saves a still picture that is blurred by a 100x100 px kernel. Since the camera is never mounted perfectly but slightly tilted, there's an ugly grayscale gradient going over the picture and making operations impossible. This still picture is then subtracted from every single later frame, eliminating the gradient. BTW: this gradient removal step is already incorporated in the screenshots shown in the question above
Now the picture is almost noise-free. Do not use equalizeHist. It does not simply increase the general contrast uniformly but instead emphasizes the remaining noise way too much. This was the main error I made in almost all experiments. Instead, apply a threshold (binary with fixed border) directly. The border is extremely thin, setting it at 104 instead of 205 makes a huge difference.
Invert colors (unless you have taken BINARY_INV in the previous step), apply contours, take the largest one and write it to a mask
Voilà!
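Stripped of the RealSense-specific filters, the remaining steps look roughly like this. This is a simplified sketch using numpy and scipy rather than my actual OpenCV code: the blur is a box filter, the hand is assumed to be brighter than the table after gradient removal, the largest connected component stands in for the largest contour, and the function names and the threshold value are placeholders to tune.

```python
import numpy as np
from scipy import ndimage as ndi


def make_background(first_frame, kernel=100):
    """Heavily blurred still of the empty scene, used to remove the tilt gradient."""
    return ndi.uniform_filter(first_frame.astype(float), size=kernel, mode="nearest")


def hand_mask(frame, background, threshold):
    """Subtract the gradient, threshold directly, keep the largest region."""
    flattened = frame.astype(float) - background
    binary = flattened > threshold           # fixed threshold, no equalizeHist
    labels, n = ndi.label(binary)
    if n == 0:
        return np.zeros(frame.shape, dtype=bool)
    areas = np.bincount(labels.ravel())
    areas[0] = 0                             # ignore the background label
    return labels == areas.argmax()
```

make_background is called once on the very first frame; hand_mask is then called on every later frame.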
I have a grid on pictures (they are from camera). After binarization they look like this (red is 255, blue is 0):
What is the best way to detect grid nodes (crosses) on these pictures?
Note: grid is distorted from cell to cell non-uniformly.
Update:
Some examples of different grids and their distortions before binarization:
In cases like this I first try to find the best starting point.
So, first I thresholded your image (I could also skeletonize it and only then threshold, but that way some data is lost irrecoverably):
Then, I tried loads of tools to get the most prominent features emphasized in bulk. Finally, playing with Gimp's G'MIC plugin I found this:
Based on the above I prepared a universal pattern that looks like this:
Then I just got a part of this image:
To help determine angle I made local Fourier freq graph - this way you can obtain your pattern local angle:
Then you can use a simple trick that works fast on modern GPUs: compute the difference between the pattern and the image, like this (miss case):
When there is a hit, the difference is minimal; what I had in mind when talking about local maximums refers more or less to how the resulting difference should be treated. It wouldn't be wise to weight the difference outside the pattern circle the same as inside, due to scale factor sensitivity. Thus, the inside, with the cross, should be weighted more in the algorithm used. Nevertheless, the pattern differenced with the image looks like this:
As you can see it's possible to differentiate between hit and miss. What is crucial is to set proper tolerance and use Fourier frequencies to obtain angle (with thresholded images Fourier usually follows overall orientation of image analyzed).
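A minimal numpy sketch of reading a local angle off the Fourier spectrum, assuming a small grayscale or binarized patch; dominant_orientation is an illustrative name. It only reports the single strongest frequency, so on a grid patch it gives the direction of one of the two line families.

```python
import numpy as np


def dominant_orientation(patch):
    """Angle (degrees, in [0, 180)) of the strongest spatial frequency.

    The frequency vector is perpendicular to the stripes that produce it.
    """
    patch = np.asarray(patch, dtype=float)
    # Remove the mean so the DC term does not win the argmax.
    spectrum = np.abs(np.fft.fft2(patch - patch.mean()))
    iy, ix = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    fy = np.fft.fftfreq(patch.shape[0])[iy]
    fx = np.fft.fftfreq(patch.shape[1])[ix]
    return float(np.degrees(np.arctan2(fy, fx)) % 180.0)
```

The pattern can then be rotated by this angle before taking the difference.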
The above way can be later complemented by Harris detection, or Harris detection can be modified using above patterns to distinguish two to four closely placed corners.
Unfortunately, all techniques are scale dependent in such case and should be adjusted to it properly.
There are also other approaches to your problem, for instance by watershedding it first, then getting regions, then disregarding foreground, then simplifying curves, then checking if their corners form a consecutive equidistant pattern. But to my nose it would not produce correct results.
One more thing - libgmic is G'MIC library from where you can directly or through bindings use transformations shown above. Or get algorithms and rewrite them in your app.
I suppose that this can be a potential answer (actually mentioned in comments): http://opencv.itseez.com/2.4/modules/imgproc/doc/feature_detection.html?highlight=hough#houghlinesp
There can also be other ways using skimage tools for feature detection.
But actually I think that instead of the Hough transformation, which could contribute to huge bloat and lack of precision (straight lines), I would suggest trying Harris corner detection - http://docs.opencv.org/2.4/doc/tutorials/features2d/trackingmotion/harris_detector/harris_detector.html .
This can be further adjusted (cross corners, so the local maximum should depend on the crosses' distribution) to your specific issue. Then some curve approximation can be done based on the points obtained.
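To show what the Harris response looks like on a grid crossing, here is a small sketch written with scipy.ndimage rather than OpenCV (cv2.cornerHarris is the ready-made equivalent). The sigma and k values are common defaults chosen for illustration, not values tuned for these images.

```python
import numpy as np
from scipy import ndimage as ndi


def harris_response(image, sigma=2.0, k=0.04):
    """Harris corner response: det(M) - k * trace(M)^2 of the structure tensor."""
    img = np.asarray(image, dtype=float)
    ix = ndi.sobel(img, axis=1)   # horizontal gradient
    iy = ndi.sobel(img, axis=0)   # vertical gradient
    # Gaussian-weighted local sums of the gradient products.
    ixx = ndi.gaussian_filter(ix * ix, sigma)
    iyy = ndi.gaussian_filter(iy * iy, sigma)
    ixy = ndi.gaussian_filter(ix * iy, sigma)
    det = ixx * iyy - ixy ** 2
    trace = ixx + iyy
    return det - k * trace ** 2
```

The response is positive around a crossing and negative along plain line segments, so thresholding it and taking local maxima gives node candidates. Each crossing of thick lines tends to produce up to four nearby maxima, which then need to be merged into one node.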
Maybe you could calculate Hough lines and determine the intersections. The OpenCV documentation can be found here
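Computing an intersection is a 2x2 linear solve. A minimal sketch, assuming each line is given as a (rho, theta) pair with theta in radians, which is the form cv2.HoughLines returns; intersect_hough_lines is an illustrative name.

```python
import numpy as np


def intersect_hough_lines(line_a, line_b, eps=1e-9):
    """Intersection (x, y) of two lines x*cos(theta) + y*sin(theta) = rho.

    Returns None when the lines are (nearly) parallel.
    """
    (rho1, theta1), (rho2, theta2) = line_a, line_b
    A = np.array([[np.cos(theta1), np.sin(theta1)],
                  [np.cos(theta2), np.sin(theta2)]])
    if abs(np.linalg.det(A)) < eps:
        return None
    x, y = np.linalg.solve(A, np.array([rho1, rho2], dtype=float))
    return float(x), float(y)
```

Since the grid is distorted, straight Hough lines will only fit locally, so this probably works best on small tiles of the image.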