Fastest polygon union when all polygons are rectangles? - python

At my job we have to union many polygons together for some spatial aggregation. One problem we have is limited runtime and memory (AWS Lambda), so for larger feature collections our current GeoPandas implementation hits its limits.
My main question is: is there a faster polygon union algorithm I could be using other than Shapely's unary union (which I assume is what GeoPandas dissolve uses) that could take advantage of the fact that all polygons are rectangles with no holes? (I.e., I'm hoping that unary union, since it must handle arbitrary shapes, leaves some performance on the table.)

The following algorithm doesn't really use the rectangular property of the items.
The initial step is to create additional points for each intersection: where two rectangle sides intersect, each side may be split into two or more segments.
The next step is to find the top-left point among all polygons and
assume that the virtual previous point lies to the left of it.
To find the next point of the envelope, choose, among all polygons sharing the current point, the vector CN (current point to next point) whose angle CP^CN is maximum.
When the next point is the top-left point again, the envelope is complete.
Then process all points and remove those that lie inside the envelope(s).
If points remain, use the same algorithm to find further envelopes (the archipelago case).
For performance, I recommend maintaining a mapping from each point to the list of [polygon, point index] pairs it belongs to.
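As a side note on exploiting the rectangular property: if what you ultimately need is the *area* of the union rather than the merged geometry itself, axis-aligned rectangles allow a much simpler approach than general polygon clipping. A minimal sketch using coordinate compression (pure Python, quadratic in the number of distinct coordinates; a sweep line would be faster, and this does not produce the union polygons themselves):

```python
def rect_union_area(rects):
    """Area of the union of axis-aligned rectangles (xmin, ymin, xmax, ymax)."""
    xs = sorted({x for r in rects for x in (r[0], r[2])})
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    area = 0.0
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            # each grid cell is fully inside or fully outside every rectangle,
            # so testing the cell center is enough
            cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
            if any(r[0] <= cx <= r[2] and r[1] <= cy <= r[3] for r in rects):
                area += (x1 - x0) * (y1 - y0)
    return area

rect_union_area([(0, 0, 2, 2), (1, 1, 3, 3)])  # 4 + 4 - 1 = 7.0
```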

Related

Using Python to get unblocked area

I have a rather complicated problem. Suppose I have the shape below. You can think of the red dot as a person and the pointy shape inside the big polygon as an obstacle. My goal is to compute the total unblocked vision of the person inside the big polygon, which is the area of the polygon minus the red shaded area.
I want to write a function that takes in the coordinates of the person, the coordinates of the ordered vertices of the obstacle(s) and those of the ordered vertices of the big polygon, and returns the area of the unblocked vision.
I have tried multiple things, and I'm aware of the shoelace algorithm, but the only approach I can come up with is Monte Carlo. Can I get a hint on a more intelligent and efficient way to compute the area in closed form?
I think your problem is solved by finding the "Visibility polygon".
https://en.wikipedia.org/wiki/Visibility_polygon
You can use https://karlobermeyer.github.io/VisiLibity1/ library to compute the visibility area.
The first task is to get the two extreme lines of sight from the person.
Simple brute-force checking. I doubt there's a better method unless you need this calculation at every frame (see (a) below).
Calculate the angle (relative to X-axis, or whatever) of the line person-to-obstacle_vertex for every vertex.
Find the lowest and highest values. This can be tricky if the obstacle somehow wraps around the person.
In that case you can calculate the angle of each pair of sight lines (a combinatorial issue) and take the pair with the maximum angle. For this job use the dot product.
The second task is to get the area of the shaded region.
You need to get the two intersections of the sight lines and the outer polygon. And then build a list of vertices of the polygon between the two intersections. Add these intersections to that list.
The area can be calculated as the sum of the areas of the triangles from the person to each edge (two consecutive points in that list). Since you have all the coordinates, an easier way is to use the shoelace algorithm.
(a) If the obstacle has thousands of vertices and the person moves continuously, I'd try to reduce the number of pairs to check. You can maintain a list of shown/hidden vertices, and when the person moves, check the last two used vertices and their neighbours until you get a new pair of ending vertices.
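The shoelace computation mentioned above can be sketched as follows (the vertices must be given in boundary order, either clockwise or counter-clockwise):

```python
def shoelace_area(vertices):
    """Area of a simple polygon from its ordered (x, y) vertices."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

shoelace_area([(0, 0), (1, 0), (1, 1), (0, 1)])  # unit square -> 1.0
```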

Calculating object labelling consensus area

Scenario: four users are annotating images with one of four labels each. These are stored in a fairly complex format - either as polygons or as centre-radius circles. I'm interested in quantifying, for each class, the area of agreement between individual raters – in other words, I'm looking to get an m x n matrix, where M_i,j will be some metric, such as the IoU (intersection over union), between i's and j's ratings (with a 1 diagonal, obviously). There are two problems I'm facing.
One, I don't know what works best in Python for this. Shapely doesn't implement circles too well, for instance.
Two, is there a more efficient way for this than comparing it annotator-by-annotator?
IMO the simplest is to fill the shapes using polygon filling / circle filling (both simple enough to roll your own) / path filling (from a seed). Then finding the area of overlap is an easy matter.
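A minimal sketch of the fill-and-compare idea, using NumPy boolean masks (grid resolution is a trade-off between speed and accuracy; circles are trivial to fill this way, and a polygon filler could produce masks in the same format):

```python
import numpy as np

def circle_mask(cx, cy, r, shape):
    """Boolean mask of a filled circle of radius r centered at (cx, cy)."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    return (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2

def mask_iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both empty: perfect agreement by convention
    return np.logical_and(a, b).sum() / union
```

With masks for every rater's annotations per class, filling the m x n agreement matrix is then one `mask_iou` call per rater pair.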

Measurement for intersection of 2 irregular shaped 3d object

I am trying to implement an objective function that minimizes the overlap of two irregularly shaped 3D objects. While the most accurate measurement of the overlap is the intersection volume, it's too computationally expensive, as I am dealing with complex, non-convex objects with 1000+ faces.
I am wondering if there are other measurements of intersection between 3D objects that are much faster to compute. Two requirements for the measurement: 1. when the measurement is 0, there should be no overlap; 2. the measurement should be a scalar (not a boolean value) indicating the degree of overlap, but this value doesn't need to be very accurate.
Possible measurements I am considering include some sort of 2D surface area of intersection, or 1D penetration depth. Alternatively, I could estimate the volume with a sample-based method that samples points inside one object and tests the percentage of points that also lie in the other object. But I don't know how computationally expensive it is to sample points inside a complex 3D shape, or to test whether a point is enclosed by such a shape.
I would really appreciate any advice, code, or equations on this matter. Also, if you can suggest any libraries (preferably Python libraries) that accept .obj, .ply, etc. files and perform 3D geometry computation, that would be great! I will also post here if I find a good method.
Update:
I found a good Python library called Trimesh that performs all the computations mentioned by me and others in this post. It computes the exact intersection volume with the Blender backend; it can voxelize meshes and compute the volume of the co-occupied voxels; and it can perform surface and volumetric point sampling within one mesh and test point containment within another. I found surface point sampling with containment testing (a sort of surface intersection) and the grid approach to be the fastest.
By straight voxelization:
If the faces are of similar size (if needed triangulate the large ones), you can use a gridding approach: define a regular 3D grid with a spacing size larger than the longest edge and store one bit per voxel.
Then for every vertex of the mesh, set the bit of the cell it is included in (this just takes a truncation of the coordinates). By doing this, you will obtain the boundary of the object as a connected surface. You will obtain an estimate of the volume by means of a 3D flood filling algorithm, either from an inside or an outside pixel. (Outside will be easier but be sure to leave a one voxel margin around the object.)
Estimating the volumes of both objects as well as intersection or union is straightforward with this machinery. The cost will depend on the number of faces and the number of voxels.
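The gridding step can be sketched as follows. Note this only marks boundary voxels from the mesh vertices, as described above; the flood fill needed to recover solid volume is omitted, so the co-occupancy measure below is only a rough boundary-overlap estimate:

```python
def voxelize(vertices, spacing):
    """Set of voxel indices occupied by at least one (x, y, z) vertex."""
    return {(int(x // spacing), int(y // spacing), int(z // spacing))
            for x, y, z in vertices}

def voxel_overlap(verts_a, verts_b, spacing):
    """Rough overlap measure: total volume of voxels touched by both meshes."""
    a, b = voxelize(verts_a, spacing), voxelize(verts_b, spacing)
    return len(a & b) * spacing ** 3
```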
A sample-based approach is what I'd try first. Generate a bunch of points in the union's bounding AABB, and divide the number of points in both A and B by the number of points in A or B. (You can adapt this measure to your use case; it doesn't work very well when A and B have very different volumes.) To check whether a given point is in a given volume, use a crossing-number test, which you can Google. There are acceleration structures that can help with this test, but my guess is that the number of samples that'll give you reasonable accuracy is lower than the number of samples needed to benefit overall from building the acceleration structure.
As a variant of this, you can check line intersection instead of point intersection: Generate a random (axis-aligned, for efficiency) line, and measure how much of it is contained in A, in B, and in both A and B. This requires more bookkeeping than point-in-polyhedron, but will give you better per-sample information and thus reduce the number of times you end up iterating through all the faces.
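The point-sampling variant can be sketched as below. The containment tests are passed in as callables; for a real mesh each would be a crossing-number test, but here hypothetical spheres stand in, since their containment check is a one-liner:

```python
import random

def overlap_score(contains_a, contains_b, bbox, n_samples=20000, seed=0):
    """Monte-Carlo overlap: |in A and B| / |in A or B| over samples in the AABB."""
    rng = random.Random(seed)
    (x0, y0, z0), (x1, y1, z1) = bbox
    both = either = 0
    for _ in range(n_samples):
        p = (rng.uniform(x0, x1), rng.uniform(y0, y1), rng.uniform(z0, z1))
        a, b = contains_a(p), contains_b(p)
        both += a and b
        either += a or b
    return both / either if either else 0.0

def sphere(cx, cy, cz, r):
    """Hypothetical containment test: point inside a sphere."""
    return lambda p: (p[0]-cx)**2 + (p[1]-cy)**2 + (p[2]-cz)**2 <= r * r
```

Identical shapes score 1.0 and disjoint shapes score 0.0, satisfying both requirements stated in the question.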

Corner detection in an array of points

I get a point cloud from my lidar, which is basically a NumPy array of points in 2D Cartesian coordinates. Is there an efficient way to detect corners formed by such 2D points?
What I tried until now was to detect clusters, then apply RANSAC on each cluster to detect two lines and then estimate the intersection point of those two lines. This method works well when I know how many clusters I have (in this case I put 3 boxes in front of my robot) and when the surrounding of the robot is free and no other objects are detected.
What I would like to do is run a general corner detection, then take the points surrounding each corner and check if lines are orthogonal. If it is the case then I can consider this corner as feature. This would make my algorithm more flexible when it comes to the surrounding environment.
Here is a visualization of the data I get:
There are many, many ways to do this. The first thing I'd try in your case would be to chain points with a reasonable distance threshold for discontinuities, using the natural lidar scan ordering. Then it becomes a problem of either estimating local curvature or, as you have done, growing and merging linear segments.
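The chaining step can be sketched as below: walk the scan in its natural order and cut whenever the gap between consecutive points exceeds a threshold (the threshold value is an assumption to tune for your sensor):

```python
import math

def split_scan(points, gap_threshold):
    """Split an ordered lidar scan into clusters at range discontinuities."""
    clusters = [[points[0]]]
    for prev, cur in zip(points, points[1:]):
        if math.dist(prev, cur) > gap_threshold:
            clusters.append([])  # discontinuity: start a new cluster
        clusters[-1].append(cur)
    return clusters
```

Corner detection (curvature estimation or line growing) can then run per cluster, which removes the need to know the number of clusters in advance.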

Smallest circles enclosing points with minimized cost

I am trying to find the smallest circles enclosing points using a hierarchical search (in a tree). I searched a lot and I seem to only find smallest enclosing circle (singular) algorithms online. This is for a university class so I am asking for possible solutions and ideas more than actual code.
My problem is that I have a formula involving two constants and the radius of a circle that gives its cost, and I need to minimise the total cost. This means that for a set of points (x, y), I could find one circle enclosing all points, or multiple circles, each enclosing some of the points, depending on the cost of each circle.
As an example, if the formula is 1 + 2*radius**2, my answer will surely have multiple small circles. All points must be in a circle at the end.
My goal is to use a graph search algorithm like A*, branch and bound, or breadth-first search, building a tree from a state and its possible actions.
I am currently trying to write my possible actions as adding a circle, removing a circle and change a circle's radius. To limit compute time, I decided to only try those actions on positions that are between two points or between two sets of points (where the center of my circles could be). But this algorithm seems to be far from optimal. If you have any ideas, it would really help me.
Thanks anyway for your help.
If the question is unclear, please tell me.
I'm going to focus on finding optimal solutions. You have a lot more options if you're open to approximate solutions, and I'm sure there will be other answers.
I would approach this problem by formulating it as an integer program. Abstractly, the program looks like
variable x_C: 1 if circle C is chosen; 0 if circle C is not chosen
minimize sum_C cost(C) * x_C
subject to
for all points p, sum_{C containing p} x_C >= 1
for all circles C, x_C in {0, 1}.
Now, there are of course infinitely many circles, but assuming that one circle that contains strictly more area than another costs more, there are O(n^3) circles that can reasonably be chosen, where n is the number of points. These are the degenerate circles covering exactly one point; the circles with two points forming a diameter; and the circles that pass through three points. You'll write code to expand the abstract integer program into a concrete one in a format accepted by an integer program solver (e.g., GLPK) and then run the solver.
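The O(n^3) candidate enumeration described above can be sketched as follows (degenerate one-point circles, two-point diameters, and three-point circumcircles; function names are illustrative):

```python
import math
from itertools import combinations

def circumcircle(a, b, c):
    """Center and radius of the circle through three points; None if collinear."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    ux = ((ax*ax + ay*ay) * (by - cy) + (bx*bx + by*by) * (cy - ay)
          + (cx*cx + cy*cy) * (ay - by)) / d
    uy = ((ax*ax + ay*ay) * (cx - bx) + (bx*bx + by*by) * (ax - cx)
          + (cx*cx + cy*cy) * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def candidate_circles(points):
    """All O(n^3) circles that can appear in an optimal cover, as (center, r)."""
    cands = [((x, y), 0.0) for (x, y) in points]        # one point each
    for a, b in combinations(points, 2):                # two points as diameter
        center = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        cands.append((center, math.dist(a, b) / 2))
    for a, b, c in combinations(points, 3):             # circle through three points
        cc = circumcircle(a, b, c)
        if cc is not None:
            cands.append(cc)
    return cands
```

Each candidate then becomes one x_C variable in the integer program, with cost(C) computed from its radius.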
The size of the integer program is O(n^4), which is prohibitively expensive for your larger instances. To get the cost down, you'll want to do column generation. This is where you'll need to figure out your solver's programmatic interface. You'll be looking for an option that, when solving the linear relaxation of the integer program, calls your code back with the current price of each point and expects an optional circle whose cost is less than the sum of the prices of the points that it encloses.
The naive algorithm to generate columns is still O(n^4), but if you switch to a sweep algorithm, the cost will be O(n^3 log n). Given a pair of points, imagine all of the circles passing through those points. All of their centers lie on the perpendicular bisector. For each other point, there is an interval of centers for which the circle encloses this point. Compute all of these event points, sort them, and then process the events in order, updating the current total price of the enclosed points as you go. (Since the circles are closed, process arrivals before departures.)
If you want to push this even further, look into branch and price. The high-level branching variables would be the decision to cover two points with the same circle or not.
