I have a grid in pictures taken with a camera. After binarization they look like this (red is 255, blue is 0):
What is the best way to detect grid nodes (crosses) on these pictures?
Note: the grid is distorted non-uniformly from cell to cell.
Update:
Some examples of different grids and their distortions before binarization:
In cases like this I first try to find the best starting point.
So, first I thresholded your image (I could also have skeletonized it and only then thresholded, but that way some data is lost irrecoverably):
Then I tried loads of tools to emphasize the most prominent features in bulk. Finally, playing with GIMP's G'MIC plugin, I found this:
Based on the above I prepared a universal pattern that looks like this:
Then I just got a part of this image:
To help determine the angle I made a local Fourier frequency graph - this way you can obtain the local angle of your pattern:
Then you can apply a simple trick that works fast on modern GPUs - compute a difference like this (missed case):
When there is a hit, the difference is minimal. What I had in mind when talking about local maxima refers more or less to how the resulting difference should be treated: it wouldn't be wise to weight the difference outside the pattern circle the same as the difference inside, due to scale-factor sensitivity, so the inside containing the cross should be weighted more in the algorithm used. Nevertheless, the pattern differenced with the image looks like this:
As you can see, it's possible to differentiate between a hit and a miss. What is crucial is to set a proper tolerance and to use the Fourier frequencies to obtain the angle (with thresholded images the Fourier spectrum usually follows the overall orientation of the analyzed image).
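A minimal sketch of that weighted-difference scoring, assuming you already have an aligned cross pattern and an image patch of the same size (the file names and the 2:1 weighting are placeholders to tune):

import cv2
import numpy as np

# Hypothetical inputs: a patch cut from the thresholded image and the prepared
# cross pattern, already rotated to the local angle and of equal size.
patch = cv2.imread("patch.png", cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
pattern = cv2.imread("cross_pattern.png", cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0

h, w = pattern.shape
yy, xx = np.mgrid[0:h, 0:w]
# Weight the circle around the cross centre more than the outside, since the
# outside is more sensitive to scale errors.
inside = (xx - w / 2) ** 2 + (yy - h / 2) ** 2 <= (min(h, w) / 2) ** 2
weights = np.where(inside, 2.0, 1.0)

diff = np.abs(patch - pattern)
score = np.sum(diff * weights) / np.sum(weights)  # low score = hit, high score = miss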
The above approach can later be complemented by Harris corner detection, or Harris detection can be modified using the above patterns to distinguish two to four closely placed corners.
Unfortunately, all of these techniques are scale dependent in a case like this and should be adjusted to the scale accordingly.
There are also other approaches to your problem, for instance watershedding it first, then getting regions, then disregarding the foreground, then simplifying the curves, then checking whether their corners form a consecutive equidistant pattern. But my gut feeling is that it would not produce correct results.
One more thing - libgmic is the G'MIC library, from which you can use the transformations shown above directly or through bindings, or take the algorithms and rewrite them in your app.
I suppose that this can be a potential answer (actually mentioned in comments): http://opencv.itseez.com/2.4/modules/imgproc/doc/feature_detection.html?highlight=hough#houghlinesp
There can also be other ways using skimage tools for feature detection.
But actually I think that instead of the Hough transform, which could lead to huge bloat and a lack of precision (it fits straight lines), I would suggest trying Harris corner detection - http://docs.opencv.org/2.4/doc/tutorials/features2d/trackingmotion/harris_detector/harris_detector.html .
This can be further adjusted to your specific issue (the nodes are crossings, so the local maxima should follow the cross-like distribution). Then some curve approximation can be done based on the points obtained.
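A minimal Harris sketch along these lines (the file name, block size and the 0.01 response factor are just starting points to tune):

import cv2
import numpy as np

img = cv2.imread("grid_binary.png", cv2.IMREAD_GRAYSCALE)
# Harris response map; blockSize/ksize/k control the neighbourhood and sensitivity.
harris = cv2.cornerHarris(np.float32(img), blockSize=5, ksize=3, k=0.04)
# Keep only strong responses as candidate grid nodes.
nodes = np.argwhere(harris > 0.01 * harris.max())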
Maybe you could calculate Hough lines and determine their intersections. The OpenCV documentation can be found here.
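For example, a rough sketch with HoughLinesP and pairwise segment intersection (the thresholds and file name are guesses; the intersections would still need clustering to merge duplicates):

import cv2
import numpy as np

img = cv2.imread("grid_binary.png", cv2.IMREAD_GRAYSCALE)
lines = cv2.HoughLinesP(img, 1, np.pi / 180, threshold=80,
                        minLineLength=30, maxLineGap=5)

def intersect(l1, l2):
    # Intersection of the two infinite lines through the segments.
    x1, y1, x2, y2 = l1
    x3, y3, x4, y4 = l2
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < 1e-9:   # parallel lines
        return None
    px = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / d
    py = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / d
    return px, py

nodes = []
if lines is not None:
    segs = [l[0] for l in lines]
    for i in range(len(segs)):
        for j in range(i + 1, len(segs)):
            p = intersect(segs[i], segs[j])
            if p is not None:
                nodes.append(p)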
I have the following JPG image. I want to find the edges where the white page meets the black background so I can rotate the contents a few degrees clockwise. My aim is to straighten the text for use with Tesseract OCR. I don't see the need to rotate individual text blocks as I have seen in similar examples.
In the Canny Edge Detection docs, the third argument (200 in e.g. edges = cv.Canny(img,100,200)) is maxVal, and pixels above it are said to be "sure to be edges". Is there any way to determine these (max/min) values ahead of a trial-and-error approach?
I have used code examples which utilize the Python cv2 module. But the edge detection is set up for simpler applications.
Is there any approach I can use to take the text out of the equation? For example: only detecting edge lines longer than a specified length?
Any suggestions would be appreciated.
Below is an example of edge detection on the above image (same min/max values). The outer edge of the page is clearly defined. The image is high-contrast black and white with even lighting, so I can't see a need for an adaptive threshold; a simple global one is working. It's just a question of what ratio to use.
I don't have the answer to this yet, but to add: I now have the contours of the above document.
I used the find contours tutorial with some customization of the file loading. Note: removing the words gives a thinner/cleaner outline.
Consider Otsu thresholding. Its chief virtue is that it picks the threshold automatically from the image histogram, adapting to the image's overall illumination (and, applied per block, it can cope with local illumination as well). In your case, the blank margins might be the saving grace.
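For example (the file name is a placeholder):

import cv2

img = cv2.imread("page.jpg", cv2.IMREAD_GRAYSCALE)
# Otsu picks the global threshold from the histogram automatically.
otsu_thresh, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)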
Consider working on a series of 2x reduced-resolution images, where each new pixel is the min() (or even the max()!) of the original four pixels. These reduced images might help you to focus on the features that matter for your use case.
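A small sketch of that reduction, assuming a single-channel numpy image (the file name and pyramid depth are arbitrary):

import cv2
import numpy as np

def reduce_min(img):
    # Halve the resolution, taking the min() of each 2x2 block
    # (swap in .max() to favour bright features instead).
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).min(axis=(1, 3))

img = cv2.imread("page.jpg", cv2.IMREAD_GRAYSCALE)
pyramid = [img]
for _ in range(3):
    pyramid.append(reduce_min(pyramid[-1]))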
The usual way to deskew scanned text is to binarize and then keep changing theta until the "sum of pixels across a raster line" is zero, or small, between lines of text. In particular, with few descenders and decent inter-line spacing, we will see "lots" of pixels on each line of text and "near zero" between text lines when theta matches the original printing orientation. That lets us recover (1.) pixels per line and (2.) inter-line spacing, assuming we've found a near-optimal theta.
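A sketch of that idea (the angle range, step and file name are assumptions); here each candidate theta is scored by the variance of the row sums, which peaks when text rows are dense and inter-line rows are near zero:

import cv2
import numpy as np

img = cv2.imread("page.jpg", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

def profile_score(binary, theta_deg):
    h, w = binary.shape
    m = cv2.getRotationMatrix2D((w / 2, h / 2), theta_deg, 1.0)
    rotated = cv2.warpAffine(binary, m, (w, h))
    row_sums = rotated.sum(axis=1)
    return row_sums.var()   # high variance = crisp text-line / gap alternation

angles = np.arange(-5.0, 5.0, 0.1)
best_theta = max(angles, key=lambda t: profile_score(binary, t))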
In your particular case, focusing on the ... leader dots seems a promising approach to finding the globally optimal deskew correction angle. Discarding large rectangles of pixels in the left and right regions of the image could actually reduce noise and enhance the accuracy of such an approach.
I am trying to obtain a radius and diameter distribution from some AFM (Atomic force microscopy) measurements. So far I am trying out Gwyddion, ImageJ and different workflows in Matlab.
At the moment the best result I have found is to use Gwyddion: take the phase image, high-pass filter it and then try edge detection with 'Laplacian of Gaussian'. The result is shown in figure 3. However, this image is still too noisy and doesn't really capture the edges of all the particles (some are merged together, others do not have a clear perimeter).
In the end I need an image which segments each of the spherical particles which I can use for blob detection/analysis to obtain size/radius information.
Can anyone recommend a different method?
I would definitely try granulometry; it was designed for something really similar. There is a good explanation of granulometry here, starting at page 158.
The granulometry performs consecutive openings of increasing size that erase the different patterns according to their dimensions: the bigger the pattern, the later it is erased. It gives you a curve that represents the distribution of pattern sizes in your image, so exactly what you want.
However, it will not give you any information about positions inside the image. If you want a rough modeling of the blobs present in your image, you can take a look at the Ultimate Opening.
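A rough granulometry sketch with plain morphological openings (the disk radii and file name are placeholders); the drops in the surviving-area curve give the size distribution:

import cv2
import numpy as np

binary = cv2.imread("particles_binary.png", cv2.IMREAD_GRAYSCALE)

surviving = []
radii = range(1, 30)
for r in radii:
    # Disk-shaped structuring element of growing radius.
    se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * r + 1, 2 * r + 1))
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, se)
    surviving.append(int(opened.sum()))

# Negative differences of the curve = how much area disappears at each size.
size_distribution = -np.diff(surviving)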
Maybe you can use Avizo; it's a powerful piece of software for dealing with image issues, especially for 3D data (CT).
Suppose that I have an array of sensors that allows me to come up with an estimate of my pose relative to some fixed rectangular marker. I thus have an estimate as to what the contour of the marker will look like in the image from the camera. How might I use this to better detect contours?
The problem that I'm trying to overcome is that sometimes, the marker is occluded, perhaps by a line cutting across it. As such, I'm left with two contours that if merged, would yield the marker. I've tried opening and closing to try and fix the problem, but it isn't robust to the different types of lighting.
One approach that I'm considering is to use the predicted contour, and perform a local convolution with the gradient of the image, to find my true pose.
Any thoughts or advice?
The obvious advantage of having a pose estimate is that it restricts the image region for searching your target.
Next, if your problem is occlusion, then you need to model that explicitly, rather than just trying to paper it over with image processing tricks: add to your detector's objective function a term that expresses what your target may look like when partially occluded. This can be either an explicit "occluded appearance" model, or an implicit one - e.g. an algorithm that is able to recognize visible portions of the target independently of the whole.
I would like to implement a Maya plugin (this question is independent of Maya) to create 3D Voronoi patterns, something like this:
I just know that I have to start from point sampling (I implemented the adaptive Poisson sampling algorithm described in this paper).
I thought that, from those points, I should create the 3D wireframe of the mesh by applying Voronoi, but the result was something different from what I expected.
Here are a few examples of what I get when handling the result from scipy.spatial.Voronoi like this (as suggested here):
from scipy.spatial import Voronoi

vor = Voronoi(points)
for vpair in vor.ridge_vertices:
    # Skip any ridge that extends to infinity (SciPy marks those vertices with -1).
    if all(x >= 0 for x in vpair):
        # Draw the ridge as consecutive segments between its vertices.
        for i in range(len(vpair) - 1):
            v0 = vor.vertices[vpair[i]]
            v1 = vor.vertices[vpair[i + 1]]
            create_line(v0.tolist(), v1.tolist())  # my Maya helper
The grey vertices are the sampled points (the original shape was a simple sphere):
Here is a more complex shape (an arm)
Am I missing something? Can anyone suggest the proper pipeline and algorithms I need to implement to create such patterns?
I've seen your question since you posted it but didn't have a real answer for you; however, as I see you still haven't got any response, I'll at least write down some ideas. Unfortunately it's still not a full solution to your problem.
For me it seems you’re mixing few separate problems in this question so it would help to break it down to few pieces:
Voronoi diagram:
The diagram is by definition infinite, so when you draw it directly you should expect a mess similar to what you got in your second image, so this seems fine. I don't know how SciPy does it, but the implementation I've used flagged some edge endpoints as 'infinite' and provided the edge direction, so I could clip them at some distance myself. You'll need to check the exact data you get from SciPy.
In the 3D world you’ll almost always want to remove such infinite areas to get any meaningful rendering, or at least remove the area that contains your camera.
Point generation:
The Poisson disc is fine as some sample data or for early R&D but it’s also the most boring one :). You’ll need more ways to generate input points.
I tried to imagine the input needed for your ball-like example and I came up with something like this:
Create two spheres of points, with the same center but different radius.
When you create a Voronoi diagram out of it and remove the infinite areas you should end up with something like a football.
If you create both spheres randomly you'll get very irregular boundaries of the 'ball', but if you scale the points of one sphere to use as the second one, you should get a regular, ball-like mesh (see the sketch after these steps). You can also use similar points but add some random offset to control the level of surface irregularity.
Take your computed diagram and, for each edge, create a few points along that edge - this will give you small areas building up the edges of the bigger areas. Play with random offsets again. Try to ignore edges that don't touch any infinite region, to get a result similar to your image.
Get the points from both stages and compute the diagram once more.
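A minimal sketch of the two-sphere point generation described in the steps above (radii, counts and jitter are arbitrary assumptions):

import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(0)

def sphere_points(n, radius, jitter=0.0):
    v = rng.normal(size=(n, 3))                    # random directions
    v /= np.linalg.norm(v, axis=1, keepdims=True)  # project onto the unit sphere
    return v * radius + rng.normal(scale=jitter, size=(n, 3))

inner = sphere_points(200, radius=1.0)
outer = inner * 1.2       # scaled copy of the inner sphere -> regular, ball-like cells
points = np.vstack([inner, outer])
vor = Voronoi(points)     # then drop ridges touching infinity, as in the question's code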
Mesh generation:
Up to this point the result still doesn't look like your target images. In fact it may be really hard to achieve with production quality (for a Maya plugin), but I see some tricks that may help.
What I would try first is to take all the edges and extrude a circle along each of them. You may modulate the circle size to make it slightly bigger at the ends. Then do a Boolean 'OR' (union) between all those meshes and apply some mesh smoothing at the end.
This approach may give you similar results, but you'll need to be careful at mesh intersections; they can get ugly and need some special treatment.
How do you detect the location of an image within a larger image? I have an unmodified copy of the image. This image is then changed to an arbitrary resolution and placed randomly within a much larger image which is of an arbitrary size. No other transformations are conducted on the resulting image. Python code would be ideal, and it would probably require libgd. If you know of a good approach to this problem you'll get a +1.
There is a quick and dirty solution: simply slide a window over the target image and compute some measure of similarity at each location, then pick the location with the highest similarity. Then you compare that similarity to a threshold: if the score is above the threshold, you conclude the image is there and that's the location; if the score is below the threshold, the image isn't there.
As a similarity measure, you can use normalized correlation or sum of squared differences (aka L2 norm). As people mentioned, this will not deal with scale changes. So you also rescale your original image multiple times and repeat the process above with each scaled version. Depending on the size of your input image and the range of possible scales, this may be good enough, and it's easy to implement.
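A sketch of that multi-scale search with OpenCV's normalized correlation (the file names, scale range and 0.8 threshold are placeholders):

import cv2
import numpy as np

target = cv2.imread("large.png", cv2.IMREAD_GRAYSCALE)
original = cv2.imread("small.png", cv2.IMREAD_GRAYSCALE)

best = (-1.0, None, None)   # (score, location, scale)
for scale in np.linspace(0.5, 2.0, 16):
    resized = cv2.resize(original, None, fx=scale, fy=scale)
    if resized.shape[0] > target.shape[0] or resized.shape[1] > target.shape[1]:
        continue   # template must fit inside the target
    result = cv2.matchTemplate(target, resized, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val > best[0]:
        best = (max_val, max_loc, scale)

score, location, scale = best
found = score > 0.8         # threshold to tune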
A proper solution is to use affine invariants. Try looking up "wide-baseline stereo matching", people looked at that problem in that context. The methods that are used are generally something like this:
Preprocessing of the original image
Run an "interest point detector". This will find a few points in the image which are easily localizable, e.g. corners. There are many detectors; one called "Harris-affine" works well and is pretty popular (so implementations probably exist). Another option is the Difference-of-Gaussians (DoG) detector; it was developed for SIFT and works well too.
At each interest point, extract a small sub-image (e.g. 30x30 pixels)
For each sub-image, compute a "descriptor", some representation of the image content in that window. Again, many descriptors exist. Things to look at are how well the descriptor describes the image content (you want two descriptors to match only if they are similar) and how invariant it is (you want it to be the same even after scaling). In your case, I'd recommend using SIFT. It is not as invariant as some other descriptors, but can cope with scale well, and in your case scale is the only thing that changes.
At the end of this stage, you will have a set of descriptors.
Testing (with the new test image).
First, you run the same interest point detector as in step 1 and get a set of interest points. You compute the same descriptor for each point, as above. Now you have a set of descriptors for the target image as well.
Next, you look for matches. Ideally, to each descriptor from your original image, there will be some pretty similar descriptor in the target image. (Since the target image is larger, there will also be "leftover" descriptors, i.e. points that don't correspond to anything in the original image.) So if enough of the original descriptors match with enough similarity, then you know the target is there. Moreover, since the descriptors are location-specific, you will also know where in the target image the original image is.
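A sketch of that pipeline with OpenCV's SIFT and a ratio test (the file names and 0.75 ratio are the usual defaults, not requirements; requires an OpenCV build that includes SIFT):

import cv2

original = cv2.imread("small.png", cv2.IMREAD_GRAYSCALE)
target = cv2.imread("large.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, desc1 = sift.detectAndCompute(original, None)
kp2, desc2 = sift.detectAndCompute(target, None)

# Match descriptors and keep only unambiguous matches (Lowe's ratio test).
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(desc1, desc2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

# If enough descriptors match, the matched keypoints in the target image tell
# you where the original image lies.
print(len(good), "good matches")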
You probably want cross-correlation. (Autocorrelation is correlating a signal with itself; cross correlating is correlating two different signals.)
What correlation does for you, over simply checking for exact matches, is that it will tell you where the best matches are, and how good they are. Flip side is that, for a 2-D picture, it's something like O(N^3), and it's not that simple an algorithm. But it's magic once you get it to work.
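A small FFT-based sketch (the arrays are stand-ins); note that plain, unnormalized correlation is easily dominated by bright regions, so in practice you would normalize it:

import numpy as np
from scipy.signal import fftconvolve

target = np.random.rand(512, 512)    # stand-in for the large image
template = target[100:164, 200:264]  # stand-in for the smaller image

# Correlation = convolution with a flipped template; the FFT keeps it fast.
corr = fftconvolve(target, template[::-1, ::-1], mode="valid")
y, x = np.unravel_index(np.argmax(corr), corr.shape)   # best-match offset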
EDIT: Aargh, you specified an arbitrary resize. That's going to break any correlation-based algorithm. Sorry, you're outside my experience now and SO won't let me delete this answer.
http://en.wikipedia.org/wiki/Autocorrelation is my first instinct.
Take a look at Scale-Invariant Feature Transforms; there are many different flavors that may be more or less tailored to the type of images you happen to be working with.