Calculating object labelling consensus area - python

Scenario: four users are annotating images with one of four labels each. These are stored in a fairly complex format - either as polygons or as centre-radius circles. I'm interested in quantifying, for each class, the area of agreement between individual raters. In other words, I'm looking to get an n x n matrix (one row and column per annotator), where M_i,j is some metric, such as the IoU (intersection over union), between annotator i's and annotator j's ratings (with ones on the diagonal, obviously). There are two problems I'm facing.
One, I don't know what works best in Python for this. Shapely doesn't implement circles too well, for instance.
Two, is there a more efficient way to do this than comparing the annotations annotator by annotator?
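For concreteness, the most direct route I can think of is to approximate the circles with Shapely's buffer (which sidesteps the circle issue at the cost of a polygonal approximation) and build the pairwise matrix explicitly; a rough sketch, with the annotation format and field names made up for illustration:

import numpy as np
from shapely.geometry import Point, Polygon
from shapely.ops import unary_union

def to_geometry(ann):
    # hypothetical annotation record: either {'polygon': [(x, y), ...]}
    # or {'center': (x, y), 'radius': r}
    if 'polygon' in ann:
        return Polygon(ann['polygon'])
    return Point(ann['center']).buffer(ann['radius'])  # polygonal circle approximation

def iou_matrix(annotations_per_rater):
    # annotations_per_rater: one list of annotation dicts per rater,
    # already filtered down to a single class
    merged = [unary_union([to_geometry(a) for a in anns]) for anns in annotations_per_rater]
    n = len(merged)
    m = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            union = merged[i].union(merged[j]).area
            inter = merged[i].intersection(merged[j]).area
            # define IoU as 1 when neither rater drew anything for this class
            m[i, j] = m[j, i] = inter / union if union > 0 else 1.0
    return m

With four raters there are only six pairs per class, so the pairwise comparison itself shouldn't be the bottleneck.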

IMO the simplest is to fill the shapes using polygon filling / circle filling (this is simple, you can roll your own) / path filling (from a seed). Then finding the area of overlap is an easy matter.
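As a rough sketch of that idea, assuming everything can be drawn onto a common pixel grid (PIL is used here just for the filling; the grid size and shape encoding are made up):

import numpy as np
from PIL import Image, ImageDraw

def rasterize(shapes, size=(1024, 1024)):
    # shapes: list of ('polygon', [(x, y), ...]) or ('circle', (cx, cy), r) tuples
    img = Image.new('1', size, 0)
    draw = ImageDraw.Draw(img)
    for shape in shapes:
        if shape[0] == 'polygon':
            draw.polygon(shape[1], fill=1)
        else:
            (cx, cy), r = shape[1], shape[2]
            draw.ellipse([cx - r, cy - r, cx + r, cy + r], fill=1)
    return np.array(img, dtype=bool)

def mask_iou(a, b):
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

Accuracy then depends only on the grid resolution, and circles are no harder than polygons.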

Related

How to select the minimal set of circles that covers another circle?

I'm looking for solutions that, given a set S of circles with 2D center points and radii, return a minimal subset M of S that entirely covers a specific circle with a 2D center point and radius. This last circle is not in S.
I've chosen circles, but it doesn't matter if we change them to squares, hexagons, etc.
You have two distinct problems: you need to turn the geometric problem into a combinatoric problem, and then you need to solve the combinatoric problem. For the latter, you are looking at a minimum set cover problem, and there should be plenty of literature on that. Personally I like Knuth's Dancing Links approach to enumerate all solutions of a set cover, but I guess for a single minimal solution you can do better. A CPLEX formulation (to match your tag) would use a binary variable for each row, and a ≥1 constraint for each column.
So now about turning geometry into combinatorics. All the lines of all your circles divide the plane into a bunch of areas. The areas are delimited by lines. Of particular relevance are the points where two or more circles meet. The exact shape of the line between these points is less relevant, and you might imagine pulling those arcs straight to come up with a more classical planar graph representation. So compute all the pair-wise intersections between all your circles. Order all intersections of a single circle by angle and connect them with graph edges in that order. Do so for all circles. Then you can do a kind of bucket fill to determine for each circle which graph faces are within and which are outside.
Now you have your matrix for the set cover: every graph face which is inside the big circle is a column you need to cover. Every circle is a row and covers some of these faces, and you know which.
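Given that incidence structure, a quick baseline before reaching for CPLEX or Dancing Links is a greedy cover (it is an approximation, not guaranteed minimal):

def greedy_cover(faces_by_circle, faces_to_cover):
    # faces_by_circle: dict mapping circle id -> set of face ids that circle covers
    # faces_to_cover: set of face ids lying inside the big circle
    uncovered = set(faces_to_cover)
    chosen = []
    while uncovered:
        best = max(faces_by_circle, key=lambda c: len(faces_by_circle[c] & uncovered))
        if not faces_by_circle[best] & uncovered:
            raise ValueError('remaining faces cannot be covered')
        chosen.append(best)
        uncovered -= faces_by_circle[best]
    return chosen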

Measurement for intersection of 2 irregularly shaped 3D objects

I am trying to implement an objective function that minimizes the overlap of two irregularly shaped 3D objects. While the most accurate measurement of the overlap is the intersection volume, it's too computationally expensive, as I am dealing with complex objects with 1000+ faces that are not convex.
I am wondering if there are other measurements of intersection between 3D objects that are much faster to compute? Two requirements for the measurement are: 1. when the measurement is 0, there should be no overlap; 2. the measurement should be a scalar (not a boolean value) indicating the degree of overlap, but this value doesn't need to be very accurate.
Possible measurements I am considering include some sort of 2D surface area of intersection, or a 1D penetration depth. Alternatively, I could estimate the volume with a sample-based method that samples points inside one object and tests what percentage of those points also lie inside the other object. But I don't know how computationally expensive it is to sample points inside a complex 3D shape, or to test whether a point is enclosed by such a shape.
I would really appreciate any advice, code, or equations on this matter. Also, if you can suggest any libraries (preferably a Python library) that accept .obj, .ply, etc. files and perform 3D geometry computation, that would be great! I will also post here if I find a good method.
Update:
I found a good Python library called Trimesh that performs all the computations mentioned by me and others in this post. It computes the exact intersection volume with the Blender backend; it can voxelize meshes and compute the volume of the co-occupied voxels; it can also perform surface and volumetric point sampling within one mesh and test point containment within another mesh. I found surface point sampling plus containment testing (a sort of surface-intersection check) and the grid approach to be the fastest.
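To give an idea of the sample-based variant, a rough sketch with trimesh (assuming both meshes are watertight, which contains() needs; file names are placeholders):

import trimesh

a = trimesh.load('object_a.obj')
b = trimesh.load('object_b.obj')

# sample points on each surface and count how many land inside the other mesh;
# checking both directions also catches the case where one object fully contains the other
pts_a = a.sample(5000)
pts_b = b.sample(5000)
score = max(b.contains(pts_a).mean(), a.contains(pts_b).mean())
print(score)   # 0 when neither surface enters the other object's interior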
By straight voxelization:
If the faces are of similar size (if needed triangulate the large ones), you can use a gridding approach: define a regular 3D grid with a spacing size larger than the longest edge and store one bit per voxel.
Then, for every vertex of the mesh, set the bit of the cell it falls in (this just takes a truncation of the coordinates). By doing this, you will obtain the boundary of the object as a connected surface. You can then estimate the volume by means of a 3D flood-filling algorithm, seeded either from an inside or an outside voxel. (Outside will be easier, but be sure to leave a one-voxel margin around the object.)
Estimating the volumes of both objects as well as intersection or union is straightforward with this machinery. The cost will depend on the number of faces and the number of voxels.
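A rough numpy sketch of that bookkeeping, with scipy's binary_fill_holes standing in for the flood fill (this assumes the vertex shell comes out closed at the chosen spacing, as described above):

import numpy as np
from scipy.ndimage import binary_fill_holes

def voxelize(vertices, origin, spacing, shape):
    # vertices: (N, 3) array; origin and spacing define the grid, shape its size in voxels
    occ = np.zeros(shape, dtype=bool)
    idx = np.floor((vertices - origin) / spacing).astype(int)   # truncate coordinates to cells
    idx = np.clip(idx, 0, np.array(shape) - 1)
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = True                 # mark the boundary voxels
    return binary_fill_holes(occ)                               # fill the interior

# with both objects voxelized on the same grid:
# vol_a = voxelize(verts_a, origin, spacing, shape)
# vol_b = voxelize(verts_b, origin, spacing, shape)
# intersection_volume = np.logical_and(vol_a, vol_b).sum() * spacing ** 3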
A sample-based approach is what I'd try first. Generate a bunch of points in the AABB that bounds the union of A and B, and divide the number of points that fall in both A and B by the number of points that fall in A or B. (You can adapt this measure to your use case -- it doesn't work very well when A and B have very different volumes.) To check whether a given point is in a given volume, use a crossing number test, which you can look up. There are acceleration structures that can help with this test, but my guess is that the number of samples that'll give you reasonable accuracy is lower than the number of samples necessary to benefit overall from building the acceleration structure.
As a variant of this, you can check line intersection instead of point intersection: Generate a random (axis-aligned, for efficiency) line, and measure how much of it is contained in A, in B, and in both A and B. This requires more bookkeeping than point-in-polyhedron, but will give you better per-sample information and thus reduce the number of times you end up iterating through all the faces.
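A short sketch of the point-sampling version, with the point-in-volume test left as a callable (a crossing-number / ray test, or a library routine):

import numpy as np

def overlap_estimate(contains_a, contains_b, bbox_min, bbox_max, n=10000, rng=None):
    # contains_a, contains_b: functions mapping an (n, 3) point array to a boolean array
    # bbox_min, bbox_max: corners of an AABB enclosing both objects
    rng = np.random.default_rng() if rng is None else rng
    pts = rng.uniform(bbox_min, bbox_max, size=(n, 3))
    in_a, in_b = contains_a(pts), contains_b(pts)
    union = np.logical_or(in_a, in_b).sum()
    return np.logical_and(in_a, in_b).sum() / union if union else 0.0   # IoU-style score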

Corner detection in an array of points

I get a point cloud from my lidar, which is basically a numpy array of points in 2D cartesian coordinates. Is there any efficient way to detect corners formed by such 2D points?
What I tried until now was to detect clusters, then apply RANSAC on each cluster to detect two lines, and then estimate the intersection point of those two lines. This method works well when I know how many clusters I have (in this case I put 3 boxes in front of my robot) and when the robot's surroundings are free and no other objects are detected.
What I would like to do is run a general corner detection, then take the points surrounding each corner and check if the lines are orthogonal. If that is the case, I can consider this corner a feature. This would make my algorithm more flexible when it comes to the surrounding environment.
Here is a visualization of the data I get:
There are many, many ways to do this. The first thing I'd try in your case would be to chain the points, using the natural lidar scan ordering, with a reasonable distance threshold for discontinuities. Then it becomes a problem of either estimating local curvature or, as you have done, growing and merging linear segments.
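A minimal sketch of that chaining step, plus a crude corner test per segment (the thresholds are arbitrary):

import numpy as np

def split_scan(points, gap=0.2):
    # points: (N, 2) array in the lidar's natural scan order
    d = np.linalg.norm(np.diff(points, axis=0), axis=1)   # distances between consecutive points
    breaks = np.where(d > gap)[0] + 1                      # indices where a discontinuity starts
    return np.split(points, breaks)                        # one array per chained segment

def corner_angle(segment, i, window=5):
    # angle (degrees) between the incoming and outgoing directions around point i
    a = segment[i] - segment[max(i - window, 0)]
    b = segment[min(i + window, len(segment) - 1)] - segment[i]
    cosang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

Points where corner_angle comes out close to 90 degrees are candidates for the orthogonal corners you describe.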

Image registration using python and cross-correlation

I got two images showing exactly the same content: 2D Gaussian-shaped spots. I call these two 16-bit png files "left.png" and "right.png". But as they are obtained through slightly different optical setups, the corresponding spots (physically the same) appear at slightly different positions. Meaning the right one is slightly stretched, distorted, or so, in a non-linear way. Therefore I would like to get the transformation from left to right.
So for every pixel on the left side with its x- and y-coordinate I want a function giving me the components of the displacement-vector that points to the corresponding pixel on the right side.
In a former approach I tried to get the positions of the corresponding spots to obtain the relative distances deltaX and deltaY. I then fitted these distances to the Taylor expansion of T(x,y) up to second order, giving me the x- and y-components of the displacement vector for every pixel (x,y) on the left, pointing to the corresponding pixel (x',y') on the right.
To get a more general result I would like to use normalized cross-correlation. For this I multiply every pixel value from left with a corresponding pixel value from right and sum over these products. The transformation I am looking for should connect the pixels that maximize the sum. So when the sum is maximized, I know that I multiplied the corresponding pixels.
I really tried a lot with this, but didn't manage to get it working. My question is whether somebody of you has an idea or has ever done something similar.
import numpy as np
from PIL import Image  # Pillow; the bare "import Image" from old PIL no longer works

left = np.array(Image.open('left.png'))
right = np.array(Image.open('right.png'))

# normalize both images (http://en.wikipedia.org/wiki/Cross-correlation#Normalized_cross-correlation)
left = (left - left.mean()) / left.std()
right = (right - right.mean()) / right.std()
Please let me know if I can make this question clearer. I still have to check out how to post questions using LaTeX.
Thank you very much for input.
[left.png] http://i.stack.imgur.com/oSTER.png
[right.png] http://i.stack.imgur.com/Njahj.png
I'm afraid, in most cases 16-bit images appear just black (at least on systems I use) :( but of course there is data in there.
UPDATE 1
I'll try to clarify my question. I am looking for a vector field of displacement vectors that point from every pixel in left.png to the corresponding pixel in right.png. My problem is that I am not sure about the constraints I have. What I have in mind is
r' = r + d(r),
where the vector r (components x and y) points to a pixel in left.png and the vector r' (components x' and y') points to the corresponding pixel in right.png; for every r there is a displacement vector d(r).
What I did earlier was to find components of the vector field d manually and fit them to second-degree polynomials. So I fitted
delta-x(x,y) = a0 + a1*x + a2*y + a3*x^2 + a4*x*y + a5*y^2
and
delta-y(x,y) = b0 + b1*x + b2*y + b3*x^2 + b4*x*y + b5*y^2.
Does this make sense to you? Is it possible to get all the delta-x(x,y) and delta-y(x,y) with cross-correlation? The cross-correlation should be maximized if the corresponding pixels are linked together through the displacement vectors, right?
UPDATE 2
So the algorithm I was thinking of is as follows:
Deform right.png
Get the value of cross-correlation
Deform right.png further
Get the value of cross-correlation and compare to value before
If it's greater, good deformation, if not, redo deformation and do something else
After the cross-correlation value is maximized, I know what deformation there is :)
About deformation: could one first do a shift along the x- and y-directions to maximize the cross-correlation, then in a second step stretch or compress x- and y-dependently, and in a third step deform quadratically in x and y, and repeat this procedure iteratively?? I really have a problem doing this with integer coordinates. Do you think I would have to interpolate the picture to obtain a continuous distribution?? I have to think about this again :( Thanks to everybody for taking part :)
OpenCV (and with it the Python OpenCV bindings) has a StarDetector class which implements this algorithm.
As an alternative you might have a look at the OpenCV SIFT class, where SIFT stands for Scale-Invariant Feature Transform.
Update
Regarding your comment, I understand that the "right" transformation will maximize the cross-correlation between the images, but I don't understand how you choose the set of transformations over which to maximize. Maybe if you know the coordinates of three matching points (either by some heuristics or by choosing them by hand), and if you expect affinity, you could use something like cv2.getAffineTransform to have a good initial transformation for your maximization process. From there you could use small additional transformations to have a set over which to maximize. But this approach seems to me like re-inventing something which SIFT could take care of.
To actually transform your test image you can use cv2.warpAffine, which also can take care of border values (e.g. pad with 0). To calculate the cross-correlation you could use scipy.signal.correlate2d.
Update
Your latest update did indeed clarify some points for me. But I think that a vector field of displacements is not the most natural thing to look for, and this is also where the misunderstanding came from. I was thinking more along the lines of a global transformation T, which applied to any point (x,y) of the left image gives (x',y')=T(x,y) on the right side, but T has the same analytical form for every pixel. For example, this could be a combination of a displacement, rotation, scaling, maybe some perspective transformation. I cannot say whether it is realistic or not to hope to find such a transformation, this depends on your setup, but if the scene is physically the same on both sides I would say it is reasonable to expect some affine transformation. This is why I suggested cv2.getAffineTransform. It is of course trivial to calculate your displacement Vector field from such a T, as this is just T(x,y)-(x,y).
The big advantage would be that you have only very few degrees of freedom for your transformation, instead of, I would argue, 2N degrees of freedom in the displacement vector field, where N is the number of bright spots.
If it is indeed an affine transformation, I would suggest some algorithm like this:
identify three bright and well isolated spots on the left
for each of these three spots, define a bounding box so that you can hope to identify the corresponding spot within it in the right image
find the coordinates of the corresponding spots, e.g. with some correlation method as implemented in cv2.matchTemplate or by also just finding the brightest spot within the bounding box.
once you have three matching pairs of coordinates, calculate the affine transformation which transforms one set into the other with cv2.getAffineTransform.
apply this affine transformation to the left image, as a check if you found the right one you could calculate if the overall normalized cross-correlation is above some threshold or drops significantly if you displace one image with respect to the other.
if you wish and still need it, calculate the displacement vector field trivially from your transformation T.
Update
It seems cv2.getAffineTransform expects an awkward input data type 'float32'. Let's assume the source coordinates are (sxi,syi) and destination (dxi,dyi) with i=0,1,2, then what you need is
src = np.array( ((sx0,sy0),(sx1,sy1),(sx2,sy2)), dtype='float32' )
dst = np.array( ((dx0,dy0),(dx1,dy1),(dx2,dy2)), dtype='float32' )
result = cv2.getAffineTransform(src,dst)
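Continuing that snippet, a rough sketch of the last two steps of the recipe above (warp the left image and use a zero-shift normalized cross-correlation as the check; left and right are the image arrays from the question):

import cv2
import numpy as np

rows, cols = right.shape
left32 = left.astype(np.float32)                                      # cast for warpAffine
warped = cv2.warpAffine(left32, result, (cols, rows), borderValue=0)  # map left onto right's frame

# zero-shift normalized cross-correlation as a goodness-of-fit check
a = (warped - warped.mean()) / warped.std()
b = (right - right.mean()) / right.std()
print((a * b).mean())   # values close to 1 mean the affine transform lines the spots up well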
I don't think a cross correlation is going to help here, as it only gives you a single best shift for the whole image. There are three alternatives I would consider:
Do a cross-correlation on sub-clusters of dots. Take, for example, the three dots in the top right and find the optimal x-y shift through cross-correlation. This gives you the rough transform for that corner of the image. Repeat for as many clusters as you can to obtain a reasonable map of your transformations. Fit this with your Taylor expansion and you might get reasonably close. However, for your cross-correlation to work at all, the difference in displacement between spots must be less than the extent of the spot, else you can never get all spots in a cluster to overlap simultaneously with a single displacement. Under these conditions, option 2 might be more suitable.
If the displacements are relatively small (which I think is a condition for option 1), then we might assume that for a given spot in the left image, the closest spot in the right image is the corresponding spot. Thus, for every spot in the left image, we find the nearest spot in the right image and use that as the displacement in that location. From the 40-something well distributed displacement vectors we can obtain a reasonable approximation of the actual displacement by fitting your Taylor expansion.
This is probably the slowest method, but might be the most robust if you have large displacements (and option 2 thus doesn't work): use something like an evolutionary algorithm to find the displacement. Apply a random transformation, compute the remaining error (you might need to define this as sum of the smallest distance between spots in your original and transformed image), and improve your transformation with those results. If your displacements are rather large you might need a very broad search as you'll probably get lots of local minima in your landscape.
I would try option 2 as it seems your displacements might be small enough to easily associate a spot in the left image with a spot in the right image.
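A rough sketch of option 2, assuming the spot centres have already been extracted into two (N, 2) arrays (e.g. with a peak finder); a KD-tree does the nearest-spot matching and a plain least-squares fit gives the second-order polynomial for each displacement component:

import numpy as np
from scipy.spatial import cKDTree

def fit_displacement(left_spots, right_spots):
    # left_spots: (N, 2) array of spot centres in the left image
    # right_spots: (M, 2) array of spot centres in the right image
    tree = cKDTree(right_spots)
    _, idx = tree.query(left_spots)            # nearest right spot for every left spot
    d = right_spots[idx] - left_spots          # measured displacement vectors
    x, y = left_spots[:, 0], left_spots[:, 1]
    A = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
    cx, *_ = np.linalg.lstsq(A, d[:, 0], rcond=None)   # coefficients for delta-x(x, y)
    cy, *_ = np.linalg.lstsq(A, d[:, 1], rcond=None)   # coefficients for delta-y(x, y)
    return cx, cy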
Update
I assume your optics induce non-linear distortions, and having two separate beam paths (different filters in each?) will make the relationship between the two images even more non-linear. The affine transformation PiQuer suggests might give a reasonable approach but can probably never completely cover the actual distortions.
I think your approach of fitting to a low order Taylor polynomial is fine. This works for all my applications with similar conditions. Highest orders probably should be something like xy^2 and x^2y; anything higher than that you won't notice.
Alternatively, you might be able to calibrate the distortions for each image first, and then do your experiments. This way you are not dependent on the distribution of your dots, but can use a high-resolution reference image to get the best description of your transformation.
Option 2 above still stands as my suggestion for getting the two images to overlap. This can be fully automated and I'm not sure what you mean when you want a more general result.
Update 2
You comment that you have trouble matching dots in the two images. If this is the case, I think your iterative cross-correlation approach may not be very robust either. You have very small dots, so overlap between them will only occur if the difference between the two images is small.
In principle there is nothing wrong with your proposed solution, but whether it works or not strongly depends on the size of your deformations and the robustness of your optimization algorithm. If you start off with very little overlap, then it may be hard to find a good starting point for your optimization. Yet if you have sufficient overlap to begin with, then you should have been able to find the deformation per dot first, but in a comment you indicate that this doesn't work.
Perhaps you can go for a mixed solution: find the cross-correlation of clusters of dots to get a starting point for your optimization, and then tweak the deformation using something like the procedure you describe in your update. Thus (a rough sketch of the first step follows the list below):
For an NxN pixel segment, find the shift between the left and right images
Repeat for, say, 16 of those segments
Compute an approximation of the deformation using those 16 points
Use this as the starting point of your optimization approach
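A rough sketch of step 1 (per-block shift via plain FFT cross-correlation; the block extraction and the choice of 16 segments are left out):

import numpy as np

def block_shift(block_left, block_right):
    # returns the (dy, dx) integer shift that best aligns block_right with block_left
    a = block_left - block_left.mean()
    b = block_right - block_right.mean()
    corr = np.fft.irfft2(np.fft.rfft2(a) * np.conj(np.fft.rfft2(b)), s=a.shape)
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # peaks past the halfway point correspond to negative shifts (circular correlation)
    return tuple(p - s if p > s // 2 else p for p, s in zip(peak, a.shape))

If you have a recent scikit-image, skimage.registration.phase_cross_correlation does the same job with sub-pixel refinement.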
You might want to have a look at bunwarpj which already does what you're trying to do. It's not python but I use it in exactly this context. You can export a plain text spline transformation and use it if you wish to do so.

python: sorting two lists of polygons for intersections

I have two big lists of polygons.
Using python, I want to take each polygon in list 1, and find the results of its geometric intersection with the polygons in list 2 (I'm using shapely to do this).
So for polygon i in list 1, there may be several polygons in list 2 that would intersect with it.
The problem is that both lists are big, and if I simply nest two loops and run the intersection command for every possible pair of polygons, it takes a really long time. I'm not sure if preceding the intersection with a boolean test would speed this up significantly (e.g. if intersects: return intersection).
What would be a good way for me to sort or organize these two lists of polygons in order to make the intersections more efficient? Is there a sorting algorithm that would be appropriate to this situation, and which I could implement with Python?
I am relatively new to programming and have no background in discrete mathematics, so if you know an existing algorithm that I should use (which I assume exists for these kinds of situations), please link to it or give some explanation that could assist me in actually implementing it in Python.
Also, if there's a better StackExchange site for this question, let me know. I feel like it kind of bridges general python programming, gis, and geometry, so I wasn't really sure.
Quadtrees are often used for the purpose of narrowing down the sets of polygons that need to be checked against each other - two polygons only need to be checked against each other if they both occupy at least one of the same regions in the quadtree. How deep you make your quadtree (in the case of polygons, as opposed to points) is up to you.
Even just dividing your space up into smaller constant-size areas would speed up the intersection detection (if your polygons are small and sparse enough). You make a grid and mark each polygon as belonging to some cells in the grid. Then find cells that have more than one polygon in them and do the intersection calculations for those polygons only. This optimization is the easiest to code, but the least effective. The second easiest and more effective way would be quadtrees. Then there are BSP trees, KD trees, and BVH trees, which are probably the most effective, but the hardest to code.
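A minimal sketch of that grid idea with Shapely (the cell size is arbitrary; it should be on the order of a typical polygon's bounding box):

from collections import defaultdict
from itertools import product

def grid_cells(poly, cell):
    # all grid cells touched by the polygon's bounding box
    minx, miny, maxx, maxy = poly.bounds
    xs = range(int(minx // cell), int(maxx // cell) + 1)
    ys = range(int(miny // cell), int(maxy // cell) + 1)
    return product(xs, ys)

def intersecting_pairs(list1, list2, cell=100.0):
    buckets = defaultdict(list)
    for j, poly in enumerate(list2):
        for c in grid_cells(poly, cell):
            buckets[c].append(j)
    for i, poly in enumerate(list1):
        candidates = {j for c in grid_cells(poly, cell) for j in buckets.get(c, [])}
        for j in candidates:
            if poly.intersects(list2[j]):          # cheap test before the full intersection
                yield i, j, poly.intersection(list2[j])

If your Shapely version ships shapely.strtree.STRtree, that gives you the same kind of pruning with less code (note that query() returns geometries in Shapely 1.x and integer indices in 2.x).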
Edit:
Another optimization would be the following: find the left-most and right-most vertices of each polygon and put them in a list. Sort the list, then sweep it from left to right to find polygons whose bounding boxes overlap in x, and make the intersection calculations for those polygons only.
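A short sketch of that sweep over bounding-box x-intervals (an event list plus an 'active' set per input list):

def sweep_pairs(list1, list2):
    # events: (x, is_end, which_list, index), built from each polygon's x-extent
    events = []
    for lid, polys in ((0, list1), (1, list2)):
        for i, p in enumerate(polys):
            minx, _, maxx, _ = p.bounds
            events.append((minx, 0, lid, i))
            events.append((maxx, 1, lid, i))
    events.sort()
    active = {0: set(), 1: set()}
    for _, is_end, lid, i in events:
        if is_end:
            active[lid].discard(i)
            continue
        for j in active[1 - lid]:                  # x-ranges overlap: candidate pair
            a, b = (i, j) if lid == 0 else (j, i)
            if list1[a].intersects(list2[b]):
                yield a, b
        active[lid].add(i)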
