Polygon of $n$ points: Area is easy, perimeter is hard? - python

I am in the situation that I have $n$ boundary points of a polygon in the plane.
Then there is an explicit formula, the so-called Shoelace formula, to compute the area of the polygon (see "Fastest way to Shoelace formula").
The nice property is that the boundary points do not have to be ordered.
However, I am wondering whether there is a similarly simple algorithmic way to compute the perimeter of the polygon just from the set of (possibly unordered) boundary points.


Finding the center of mass of a convex hull

I'm trying to find the center of mass of a convex hull. This convex hull is constructed by the triangulation of a set of scattered data points, in particular a Delaunay triangulation, and each point has a value, i.e. w = f(x,y,z). The mass of the convex hull is given by the function f, which is treated as the density of the solid. This function is unknown, so it has to be interpolated from the values w at each point.
I'm a beginner with Python, so I was wondering what the best way of finding this center of mass would be. I was trying with 2D surfaces first. scipy.interpolate.griddata interpolates the data, but then I don't know how to integrate the function in order to compute the center of mass (I need to integrate the interpolated function f over the domain of the convex hull). Any help will be much appreciated! Thanks in advance.

Fastest algorithm to find the max distance within a set of points [duplicate]

This is a question that I was asked in a job interview some time ago, and I still can't figure out a sensible answer.
The question is:
You are given a set of points (x,y). Find the 2 points that are most distant from each other.
For example, for points: (0,0), (1,1), (-8, 5) - the most distant are: (1,1) and (-8,5) because the distance between them is larger than both (0,0)-(1,1) and (0,0)-(-8,5).
The obvious approach is to calculate all distances between all points, and find the maximum. The problem is that it is O(n^2), which makes it prohibitively expensive for large datasets.
There is an approach that first finds the points on the boundary and then calculates distances only for them, on the premise that there will be fewer points on the boundary than "inside", but it's still expensive, and will fail in the worst-case scenario.
I tried to search the web, but didn't find any sensible answer - although this might simply be my lack of search skills.
For this specific problem, with just a list of Euclidean points, one way is to find the convex hull of the set of points. The two most distant points can then be found by traversing the hull once with the rotating calipers method.
Here is an O(N log N) implementation:
If the list of points is already sorted, you can remove the sort to get the optimal O(N) complexity.
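The implementation referred to above is not reproduced in this thread. As a stand-in, here is a minimal sketch of the same idea, assuming points are (x, y) tuples. It uses Andrew's monotone chain for the hull (a swapped-in choice, not necessarily what the original answer used) followed by one rotating-calipers pass. The names cross, convex_hull, dist2 and farthest_pair are illustrative, not taken from the original answer.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))          # the O(N log N) step
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def dist2(a, b):
    """Squared Euclidean distance (avoids the square root)."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def farthest_pair(points):
    """Return the two most distant points using rotating calipers on the hull."""
    hull = convex_hull(points)
    n = len(hull)
    if n < 2:
        raise ValueError("need at least two distinct points")
    if n == 2:
        return hull[0], hull[1]
    best = (0, hull[0], hull[0])
    j = 1
    for i in range(n):
        ni = (i + 1) % n
        # Advance j while the next vertex is farther from the edge hull[i] -> hull[ni].
        while abs(cross(hull[i], hull[ni], hull[(j + 1) % n])) > abs(cross(hull[i], hull[ni], hull[j])):
            j = (j + 1) % n
        # The vertex farthest from this edge is a candidate partner for both endpoints.
        for p in (hull[i], hull[ni]):
            d = dist2(p, hull[j])
            if d > best[0]:
                best = (d, p, hull[j])
    return best[1], best[2]
```

For the example in the question, farthest_pair([(0, 0), (1, 1), (-8, 5)]) picks (1, 1) and (-8, 5). The pointer j only ever moves forward, so the pass over the hull is linear and the sort dominates the cost.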
For a more general problem of finding most distant points in a graph:
Algorithm to find two points furthest away from each other
The accepted answer works in O(N^2).
Boundary point algorithms abound (look for convex hull algorithms). From there, it should take O(N) time to find the most-distant opposite points.
From the author's comment: first find any pair of opposite points on the hull, and then walk around it in semi-lock-step fashion. Depending on the angles between edges, you will have to advance either one walker or the other, but it will always take O(N) to circumnavigate the hull.
You are looking for an algorithm to compute the diameter of a set of points, Diam(S). It can be shown that this is the same as the diameter of the convex hull of S, Diam(S) = Diam(CH(S)). So first compute the convex hull of the set.
Now you have to find all the antipodal pairs of points on the convex hull and pick the pair with maximum distance. There are O(n) antipodal pairs on a convex polygon. So this gives an O(n lg n) algorithm for finding the farthest points.
This technique is known as Rotating Calipers. This is what Marcelo Cantos describes in his answer.
If you write the algorithm carefully, you can do without computing angles. For details, check this URL.
A stochastic algorithm to find the most distant pair would be:
1) Choose a random point
2) Get the point most distant to it
3) Repeat a few times
4) Remove all visited points
5) Choose another random point and repeat a few times.
You are in O(n) as long as you predetermine "a few times", but are not guaranteed to actually find the most distant pair. But depending on your set of points the result should be pretty good. =)
This problem is discussed in Introduction to Algorithms. It describes two steps: 1) compute the convex hull in O(N lg N); 2) if there are M vertices on the convex hull, O(M) time is then enough to find the farthest pair.
I found this link helpful. It includes an analysis of the algorithm details and a program.
Hope this helps.
Find the mean of all the points, measure the difference between all points and the mean, take the point the largest distance from the mean and find the point farthest from it. Those points will be the absolute corners of the convex hull and the two most distant points.
I recently did this for a project that needed convex hulls confined to randomly directed infinite planes. It worked great.
See the comments: this solution isn't guaranteed to produce the correct answer.
Just a few thoughts:
You might look at only the points that define the convex hull of your set of points to reduce the number,... but it still looks a bit "not optimal".
Otherwise there might be a recursive quad/oct-tree approach to rapidly bound some distances between sets of points and eliminate large parts of your data.
This seems easy if the points are given in Cartesian coordinates. So easy that I'm pretty sure that I'm overlooking something. Feel free to point out what I'm missing!
Find the points with the max and min values of their x, y, and z coordinates (6 points total). These should be the most "remote" of all the boundary points.
Compute all the pairwise distances (6 points give at most 15 unique distances)
Find the max distance
The two points that correspond to this max distance are the ones you're looking for.
Here's a good solution, which works in O(n log n). It's called the rotating calipers method.
First, find the convex hull, which you can do in O(n log n) with Graham's scan. Only points on the convex hull can give you the maximal distance. This algorithm arranges the points of the convex hull in clockwise order. This property will be used later.
Second, for each point on the convex hull, you need to find the most distant point on the hull (called the antipodal point here). You don't have to find all the antipodal points separately (which would take quadratic time). Say the points of the convex hull are called p_1, ..., p_n, and their order corresponds to the clockwise traversal. There is a property of convex polygons that when you iterate through points p_j on the hull in clockwise order and calculate the distances d(p_i, p_j), these distances first don't decrease (and maybe increase) and then don't increase (and maybe decrease). So you can find the maximum distance easily in this case. Once you've found the correct antipodal point p_j* for p_i, you can start the search for p_{i+1} with the candidate points starting from that p_j*. You don't need to check all previously seen points. In total, p_i iterates through the points p_1, ..., p_n once, and p_j iterates through them at most twice, because p_j can never catch up with p_i (that would give zero distance) and we stop when the distance starts decreasing.
A solution that has runtime complexity O(N) is a combination of the above answers. In detail:
(1) One can compute the convex hull with runtime complexity O(N) if you use counting sort as the internal polar angle sort and are willing to use angles rounded to the nearest integer degree in [0, 359], inclusive.
(2) Note that the number of points on the convex hull is then N_H, which is usually less than N.
We can speculate about the size of the hull from information in Cormen et al., Introduction to Algorithms, Exercise 33-5. For sparse-hulled distributions, the expected hull size for n points drawn uniformly from a unit-radius disk, uniformly from a convex polygon with k sides, and from a 2-D normal distribution is on the order of n^(1/3), log_2(n), and sqrt(log_2(n)), respectively.
The furthest-pair problem then reduces to comparing points on the hull. This is N_H^2 comparisons, but each leading point's search for its most distant point can be truncated when the distances start to decrease, provided the points are traversed in the order of the convex hull (those points are ordered CCW from the first point). The runtime complexity for this part is then O(N_H^2).
Because N_H^2 is usually less than N, the total runtime complexity for the furthest pair is O(N), with the caveat of using integer-degree angles to reduce the sort in the convex hull to linear time.
Given a set of points {(x1,y1), (x2,y2) ... (xn,yn)} find 2 most distant points.
My approach:
1). You need a reference point (xa,ya), and it will be:
xa = ( x1 + x2 +...+ xn )/n
ya = ( y1 + y2 +...+ yn )/n
2). Calculate all distances from point (xa,ya) to (x1,y1), (x2,y2),...(xn,yn)
The first "most distant point" (xb,yb) is the one with the maximum distance.
3). Calculate all distances from point (xb,yb) to (x1,y1), (x2,y2),...(xn,yn)
The other "most distant point" (xc,yc) is the one with the maximum distance.
So you get your most distant points (xb,yb) and (xc,yc) in O(n)
For example, for points: (0,0), (1,1), (-8, 5)
1). Reference point (xa,ya) = (-2.333, 2)
2). Calculate distances:
from (-2.333, 2) to (0,0) : 3.073
from (-2.333, 2) to (1,1) : 3.480
from (-2.333, 2) to (-8, 5) : 6.411
So the first most distant point is (-8, 5)
3). Calculate distances:
from (-8, 5) to (0,0) : 9.434
from (-8, 5) to (1,1) : 9.849
from (-8, 5) to (-8, 5) : 0
So the other most distant point is (1, 1)
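The caveat raised earlier in this thread for the mean-based approach applies here too: the two passes are O(n), but the result is not guaranteed to be the true farthest pair. The sketch below implements the three steps above and compares them with a brute-force check. The function names and the six-point counterexample are illustrative additions, not part of the original answer.

```python
from itertools import combinations

def dist2(p, q):
    """Squared Euclidean distance between two (x, y) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def centroid_heuristic(points):
    """Steps 1-3 above: farthest point from the mean, then farthest point from that one."""
    n = len(points)
    ref = (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)
    b = max(points, key=lambda p: dist2(p, ref))
    c = max(points, key=lambda p: dist2(p, b))
    return b, c

def brute_force(points):
    """Exact O(n^2) answer, used here only as a reference."""
    return max(combinations(points, 2), key=lambda pq: dist2(pq[0], pq[1]))

# The worked example from the answer: the heuristic agrees with brute force.
example = [(0, 0), (1, 1), (-8, 5)]

# A hypothetical counterexample: the true farthest pair is (-10, 0) and (10, 0),
# but the point farthest from the mean is (0, 17), which belongs to neither.
counter = [(-10, 0), (10, 0), (0, 17), (0, -2), (1, -2), (-1, -2)]
```

On counter, the heuristic returns a pair at squared distance 389, while the brute-force pair has squared distance 400, so the approach is best treated as a fast approximation.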

Point in Spherical Polygon using Python [duplicate]

Say I have an arbitrary set of latitude and longitude pairs representing points on some simple, closed curve. In Cartesian space I could easily calculate the area enclosed by such a curve using Green's Theorem. What is the analogous approach to calculating the area on the surface of a sphere? I guess what I am after is (even some approximation of) the algorithm behind Matlab's areaint function.
There are several ways to do this.
1) Integrate the contributions from latitudinal strips. Here the area of each strip will be (R cos(A) (B1 - B0)) (R dA), where A is the latitude, B0 and B1 are the starting and ending longitudes, and all angles are in radians.
2) Break the surface into spherical triangles, and calculate the area using Girard's Theorem, and add these up.
3) As suggested here by James Schek, in GIS work they use an area preserving projection onto a flat space and calculate the area in there.
From the description of your data, it sounds like the first method might be the easiest. (Of course, there may be other easier methods I don't know of.)
Edit – comparing these two methods:
On first inspection, it may seem that the spherical triangle approach is easiest, but, in general, this is not the case. The problem is that one not only needs to break the region up into triangles, but into spherical triangles, that is, triangles whose sides are great circle arcs. For example, latitudinal boundaries don't qualify, so these boundaries need to be broken up into edges that better approximate great circle arcs. And this becomes more difficult to do for arbitrary edges where the great circles require specific combinations of spherical angles. Consider, for example, how one would break up a middle band around a sphere, say all the area between lat 0 and 45deg into spherical triangles.
In the end, if one is to do this properly with similar errors for each method, method 2 will give fewer triangles, but they will be harder to determine. Method 1 gives more strips, but they are trivial to determine. Therefore, I suggest method 1 as the better approach.
I rewrote MATLAB's "areaint" function in Java, which gives exactly the same result.
"areaint" calculates the "surface per unit", so I multiplied the answer by Earth's surface area (5.10072e14 sq m).
private double area(ArrayList<Double> lats, ArrayList<Double> lons) {
    double sum = 0;
    double prevcolat = 0;
    double prevaz = 0;
    double colat0 = 0;
    double az0 = 0;
    for (int i = 0; i < lats.size(); i++) {
        double lat = lats.get(i) * Math.PI / 180;  // degrees to radians
        double lon = lons.get(i) * Math.PI / 180;
        // haversine term for the colatitude relative to (0,0)
        double a = Math.pow(Math.sin(lat / 2), 2) + Math.cos(lat) * Math.pow(Math.sin(lon / 2), 2);
        double colat = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
        double az = 0;
        if (lats.get(i) >= 90) {
            az = 0;  // pole case: body missing in the source, value assumed
        } else if (lats.get(i) <= -90) {
            az = Math.PI;  // pole case: body missing in the source, value assumed
        } else {
            az = Math.atan2(Math.cos(lat) * Math.sin(lon), Math.sin(lat)) % (2 * Math.PI);
        }
        if (i == 0) { colat0 = colat; az0 = az; }  // remember the first vertex to close the polygon
        if (i > 0) {
            sum = sum + (1 - Math.cos(prevcolat + (colat - prevcolat) / 2)) * Math.PI * ((Math.abs(az - prevaz) / Math.PI) - 2 * Math.ceil(((Math.abs(az - prevaz) / Math.PI) - 1) / 2)) * Math.signum(az - prevaz);
        }
        prevcolat = colat;
        prevaz = az;
    }
    // closing segment from the last vertex back to the first
    sum = sum + (1 - Math.cos(prevcolat + (colat0 - prevcolat) / 2)) * (az0 - prevaz);
    return 5.10072E14 * Math.min(Math.abs(sum) / 4 / Math.PI, 1 - Math.abs(sum) / 4 / Math.PI);
}
You mention "geography" in one of your tags so I can only assume you are after the area of a polygon on the surface of a geoid. Normally, this is done using a projected coordinate system rather than a geographic coordinate system (i.e. lon/lat). If you were to do it in lon/lat, then I would assume the unit-of-measure returned would be percent of sphere surface.
If you want to do this with a more "GIS" flavor, then you need to select a unit of measure for your area and find an appropriate projection that preserves area (not all do). Since you are talking about calculating an arbitrary polygon, I would use something like a Lambert Azimuthal Equal Area projection. Set the origin/center of the projection to be the center of your polygon, project the polygon to the new coordinate system, then calculate the area using standard planar techniques.
If you needed to do many polygons in a geographic area, there are likely other projections that will work (or will be close enough). UTM, for example, is an excellent approximation if all of your polygons are clustered around a single meridian.
I am not sure if any of this has anything to do with how Matlab's areaint function works.
I don't know anything about Matlab's function, but here we go. Consider splitting your spherical polygon into spherical triangles, say by drawing diagonals from a vertex. The surface area of a spherical triangle is given by
R^2 * ( A + B + C - \pi)
where R is the radius of the sphere, and A, B, and C are the interior angles of the triangle (in radians). The quantity in the parentheses is known as the "spherical excess".
Your n-sided polygon will be split into n-2 triangles. Summing over all the triangles, extracting the common factor of R^2, and bringing all of the \pi together, the area of your polygon is
R^2 * ( S - (n-2)\pi )
where S is the angle sum of your polygon. The quantity in parentheses is again the spherical excess of the polygon.
[edit] This is true whether or not the polygon is convex. All that matters is that it can be dissected into triangles.
You can determine the angles from a bit of vector math. Suppose you have three vertices A,B,C and are interested in the angle at B. We must therefore find two tangent vectors (their magnitudes are irrelevant) to the sphere from point B along the great circle segments (the polygon edges). Let's work it out for BA. The great circle lies in the plane defined by OA and OB, where O is the center of the sphere, so the tangent vector should be perpendicular to the normal vector OA x OB. It should also be perpendicular to OB since it's tangent there. Such a vector is therefore given by OB x (OA x OB). You can use the right-hand rule to verify that this is in the appropriate direction. Note also that this simplifies to OA * (OB.OB) - OB * (OB.OA) = OA * |OB|^2 - OB * (OB.OA).
You can then use the good ol' dot product to find the angle between sides: BA'.BC' = |BA'|*|BC'|*cos(B), where BA' and BC' are the tangent vectors from B along sides to A and C.
[edited to be clear that these are tangent vectors, not literal vectors between the points]
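The angle-sum recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: vertices are given as 3-D vectors from the sphere's center (not lat/lon), the polygon is convex so every interior angle is below pi and the plain arccos is enough, and the helper names are made up for this example.

```python
import math

def _dot(u, v):
    """Dot product of two 3-D vectors given as tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def _tangent(b, a):
    """Tangent at B along the great circle toward A: OA (OB.OB) - OB (OB.OA)."""
    bb, ba = _dot(b, b), _dot(b, a)
    return tuple(ai * bb - bi * ba for ai, bi in zip(a, b))

def spherical_polygon_area(vertices, radius=1.0):
    """Area R^2 * (S - (n - 2) * pi) of a convex spherical polygon."""
    n = len(vertices)
    angle_sum = 0.0
    for i in range(n):
        a, b, c = vertices[i - 1], vertices[i], vertices[(i + 1) % n]
        ta, tc = _tangent(b, a), _tangent(b, c)
        cos_b = _dot(ta, tc) / math.sqrt(_dot(ta, ta) * _dot(tc, tc))
        # Clamp to guard against rounding just outside [-1, 1].
        angle_sum += math.acos(max(-1.0, min(1.0, cos_b)))
    return radius ** 2 * (angle_sum - (n - 2) * math.pi)

# One octant of the unit sphere: three right angles, so the excess is pi/2.
octant = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
```

The octant covers one eighth of the sphere, and 4*pi/8 = pi/2 matches the spherical excess. Handling reflex vertices would need a signed angle rather than the arccos used here.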
Here is a Python 3 implementation, loosely inspired by the above answers:
def polygon_area(lats, lons, algorithm=0, radius=6378137):
    """
    Computes area of spherical polygon, assuming spherical Earth.
    Returns result in the (squared) units of the provided radius if the radius is specified.
    Otherwise, as a ratio of the sphere's area.
    lats and lons are in degrees.
    """
    from numpy import arctan2, cos, sin, sqrt, pi, append, diff, deg2rad
    lats = deg2rad(lats)
    lons = deg2rad(lons)
    # Line integral based on Green's Theorem, assumes spherical Earth
    # close polygon
    if lats[0] != lats[-1] or lons[0] != lons[-1]:
        lats = append(lats, lats[0])
        lons = append(lons, lons[0])
    # colatitudes relative to (0,0)
    a = sin(lats/2)**2 + cos(lats) * sin(lons/2)**2
    colat = 2*arctan2(sqrt(a), sqrt(1-a))
    # azimuths relative to (0,0)
    az = arctan2(cos(lats) * sin(lons), sin(lats)) % (2*pi)
    # Calculate diffs, wrapped into [-pi, pi)
    daz = diff(az)
    daz = (daz + pi) % (2 * pi) - pi
    # Colatitude at the midpoint of each segment, so it has the same length as daz
    deltas = diff(colat) / 2
    colat = colat[0:-1] + deltas
    # Perform integral
    integrands = (1 - cos(colat)) * daz
    # Integrate
    area = abs(sum(integrands)) / (4*pi)
    area = min(area, 1 - area)
    if radius is not None:  # return in units of radius
        return area * 4*pi*radius**2
    else:  # return in ratio of sphere total area
        return area
Please find a somewhat more explicit version (and with many more references and TODOs...) here.
You could also have a look at this code of the spherical_geometry package: Here and here. It does provide two different methods for calculating the area of a spherical polygon.

Area of polygon with list of (x,y) coordinates

It might seem a bit odd that I am asking for Python code to calculate the area of a polygon from a list of (x,y) coordinates, given that solutions have been offered on Stack Overflow in the past. However, I have found that all the solutions provided are sensitive to the order of the list of (x,y) coordinates given. For example, with the code below to find the area of a polygon:
def area(p):
    # Shoelace formula over consecutive vertex pairs
    return 0.5 * abs(sum(x0*y1 - x1*y0
                         for ((x0, y0), (x1, y1)) in segments(p)))

def segments(p):
    # Pair each vertex with the next one, wrapping around to the first
    return zip(p, p[1:] + [p[0]])

coordinates1 = [(0.5,0.5), (1.5,0.5), (0.5,1.5), (1.5,1.5)]
coordinates2 = [(0.5,0.5), (1.5,0.5), (1.5,1.5), (0.5,1.5)]
print("coordinates1", area(coordinates1))
print("coordinates2", area(coordinates2))
This returns
coordinates1 0.0
coordinates2 1.0 #This is the correct area
That is, the same set of coordinates gives different results for a different order. How would I correct this in order to get the area of the non-intersecting full polygon with a list of random (x,y) coordinates that I want to make into a non-intersecting polygon?
EDIT: I realise now that there can be multiple non-intersecting polygons from a set of coordinates. Basically I am using scipy.spatial.Voronoi to create Voronoi cells and I wish to calculate the area of the cells once I've fed the coordinates to the scipy Voronoi function - unfortunately the function doesn't always output the coordinates in the order that will allow me to calculate the correct area.
Several non-intersecting polygons can be created from a random list of coordinates (depending on its order), and each polygon will have a different area, so it is essential that you specify the order of the coordinates to build the polygon (see attached picture for an example).
The Voronoi cells are convex, so that the polygon is unambiguously defined.
You can compute the convex hull of the points, but as there are no reflex vertices to be removed, the procedure is simpler.
1) sort the points by increasing abscissa; in case of ties, sort on ordinates (this is a lexicographical ordering);
2) consider the straight line from the first point to the last and split the point sequence in a left and a right subsequence (with respect to the line);
3) the requested polygon is the concatenation of the left subsequence and the right one, reversed.
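The three steps above can be sketched as follows, assuming the input points are the vertices of a convex polygon given as (x, y) tuples in arbitrary order. The names order_convex and shoelace_area are illustrative. The two extreme points on the splitting line are placed in the right subsequence here, which is one possible reading of step 2.

```python
def order_convex(points):
    """Order the vertices of a convex polygon so that they trace its boundary."""
    pts = sorted(points)                      # step 1: lexicographic sort
    first, last = pts[0], pts[-1]

    def side(p):
        # > 0 if p lies to the left of the directed line first -> last
        return ((last[0] - first[0]) * (p[1] - first[1])
                - (last[1] - first[1]) * (p[0] - first[0]))

    left = [p for p in pts if side(p) > 0]    # step 2: split by the line
    right = [p for p in pts if side(p) <= 0]  # includes first and last
    return left + right[::-1]                 # step 3: left, then right reversed

def shoelace_area(polygon):
    """Shoelace formula for an ordered polygon."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

coordinates1 = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
ordered = order_convex(coordinates1)
```

With the self-intersecting order from the question, shoelace_area(order_convex(coordinates1)) gives the area of the unit square instead of 0.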

Width of an arbitrary polygon

I need a way to characterize the size of sets of 2-D points, so I can determine whether to render them as individual points in a space or as representative polygons, dependent on the scale of the viewport. I already have an algorithm to calculate the convex hull of the set to produce the representative polygon, but I need a way to characterize its size. One obvious measure is the maximum distance between points on the convex hull, which is the diameter of the set. But I'm really more interested in the size of its cross-section perpendicular to its diameter, to figure out how narrow the bounding polygon is. Is there a simple way to do this, given the sorted list of vertices and the indices of the furthest points (ideally in Python)?
Or alternatively, is there an easy way to calculate the radii of the minimal area bounding ellipse of a set of points? I have seen some approaches to this problem, but nothing that I can readily convert to Python, so I'm really looking for something that's turnkey.
You can compute "the size of its cross-section perpendicular to its diameter" with the following steps:
1) Find the convex hull
2) Find the two points a and b which are furthest apart
3) Find the direction vector d = (a - b).normalized() between those two
4) Rotate your axes so that this direction vector lies horizontal, using the matrix:
[ d.x, d.y]
[-d.y, d.x]
5) Find the minimum and maximum y value of points in this new coordinate system. The difference is your "width"
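A minimal Python sketch of these steps, assuming the hull is already available as a list of (x, y) tuples. The function name is made up for this example, and the farthest pair is found by brute force over the hull vertices for brevity rather than with rotating calipers.

```python
import math
from itertools import combinations

def width_across_diameter(hull):
    """Extent of the hull measured perpendicular to its diameter."""
    # Farthest pair a, b by brute force over hull vertices.
    a, b = max(combinations(hull, 2),
               key=lambda pq: math.hypot(pq[0][0] - pq[1][0], pq[0][1] - pq[1][1]))
    length = math.hypot(a[0] - b[0], a[1] - b[1])
    dx, dy = (a[0] - b[0]) / length, (a[1] - b[1]) / length   # d = (a - b).normalized()
    # Second row of the rotation matrix [-d.y, d.x] gives the new y coordinate.
    ys = [-dy * x + dx * y for x, y in hull]
    return max(ys) - min(ys)

# A thin kite: the diameter runs from (0, 0) to (10, 0), and the width across it is 2.
kite = [(0, 0), (5, -1), (10, 0), (5, 1)]
```

Only the rotated y coordinates are needed, so the first row of the matrix is never evaluated.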
Note that this is not a particularly good definition of "width" - a better one is:
The minimal perpendicular distance between two distinct parallel lines each having at least one point in common with the polygon's boundary but none with the polygon's interior
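For a convex polygon, this minimal width is attained with one of the two parallel lines lying along an edge, so it can be computed by checking each edge against the farthest vertex. The sketch below does this in O(n^2) for simplicity, assuming the hull vertices are given in boundary order as (x, y) tuples with no repeated points. The name min_width is illustrative.

```python
import math

def min_width(hull):
    """Minimum distance between parallel supporting lines of a convex polygon."""
    n = len(hull)
    best = math.inf
    for i in range(n):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % n]
        edge_len = math.hypot(x1 - x0, y1 - y0)
        # Largest perpendicular distance from any vertex to the line through this edge.
        far = max(abs((x1 - x0) * (y - y0) - (y1 - y0) * (x - x0))
                  for x, y in hull) / edge_len
        best = min(best, far)
    return best
```

For the right triangle (0, 0), (4, 0), (0, 3), the smallest such distance is the height onto the hypotenuse, 12/5 = 2.4. A rotating-calipers pass would bring this down to linear time on the hull.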
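Another useful definition of size might be twice the average distance between points on the hull and the center (the snippet below assumes each point is a complex number x + 1j*y, so that abs gives the Euclidean distance):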
center = sum(convexhullpoints) / len(convexhullpoints)
size = 2 * sum(abs(p - center) for p in convexhullpoints) / len(convexhullpoints)
