How to calculate 3D distance (including altitude) between two points in GeoDjango - python

Prologue:
This is a question arising often in SO:
3d distance calculations with GeoDjango
Calculating distance between two points using latitude longitude and altitude (elevation)
Distance between two 3D point in geodjango (postgis)
I wanted to compose an example on SO Documentation but the geodjango chapter never took off and since the Documentation got shut down on August 8, 2017, I will follow the suggestion of this widely upvoted and discussed meta answer and write my example as a self-answered post.
Of course, I would be more than happy to see any different approach as well!!
Question:
Assume the model:
class MyModel(models.Model):
    name = models.CharField()
    coordinates = models.PointField()
Where I store the point in the coordinates field as a lng, lat, alt point:
MyModel.objects.create(
    name='point_name',
    coordinates='SRID=3857;POINT Z (100.00 10.00 150)')
I am trying to calculate the 3D distance between two such points:
p1 = MyModel.objects.get(name='point_1').coordinates
p2 = MyModel.objects.get(name='point_2').coordinates
d = Distance(m=p1.distance(p2))
Now d=X in meters.
If I change only the altitude of one of the points in question:
For example:
p1.coordinates = 'SRID=3857;POINT Z (100.00 10.00 200)'
from 150 previously, the calculation:
d = Distance(m=p1.distance(p2))
returns d=X again, like the elevation is ignored.
How can I calculate the 3D distance between my points?

Reading from the documentation on the GEOSGeometry.distance method:
Returns the distance between the closest points on this geometry and the given geom (another GEOSGeometry object).
Note
GEOS distance calculations are linear – in other words, GEOS does not perform a spherical calculation even if the SRID specifies a geographic coordinate system.
Therefore we need to implement a method to calculate a more accurate 2D distance between 2 points and then we can try to apply the altitude (Z) difference between those points.
1. Great-Circle 2D distance calculation (Take a look at the 2022 UPDATE below the explanation for a better approach using geopy):
The most common way to calculate the distance between 2 points on the surface of a sphere (as the Earth is simplistically but usually modeled) is the Haversine formula:
The haversine formula determines the great-circle distance between two points on a sphere given their longitudes and latitudes.
Although from the great-circle distance wiki page we read:
Although this formula is accurate for most distances on a sphere, it too suffers from rounding errors for the special (and somewhat unusual) case of antipodal points (on opposite ends of the sphere). A formula that is accurate for all distances is the following special case of the Vincenty formula for an ellipsoid with equal major and minor axes.
We can create our own implementation of the Haversine or the Vincenty formula (as shown here for Haversine: Haversine Formula in Python (Bearing and Distance between two GPS points)) or we can use one of the already implemented methods contained in geopy:
geopy.distance.great_circle (Haversine):
from geopy.distance import great_circle
newport_ri = (41.49008, -71.312796)
cleveland_oh = (41.499498, -81.695391)
# This call will result in 536.997990696 miles
great_circle(newport_ri, cleveland_oh).miles
geopy.distance.vincenty (Vincenty):
from geopy.distance import vincenty
newport_ri = (41.49008, -71.312796)
cleveland_oh = (41.499498, -81.695391)
# This call will result in approximately 538.39 miles (ellipsoidal, so it differs from great_circle)
vincenty(newport_ri, cleveland_oh).miles
!!!2022 UPDATE: On 2D distance calculation using geopy:
GeoPy discourages the use of Vincenty as of version 1.14.0. Changelog states:
CHANGED: Vincenty usage now issues a warning. Geodesic should be used instead. Vincenty is planned to be removed in geopy 2.0. (#293)
So (especially if we are going to apply the calculation on a WGS84 ellipsoid) we should use geodesic distance instead:
from geopy.distance import geodesic
newport_ri = (41.49008, -71.312796)
cleveland_oh = (41.499498, -81.695391)
# This call will result in 538.390445368 miles
geodesic(newport_ri, cleveland_oh).miles
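For a dependency-free cross-check of the spherical result, the Haversine formula mentioned above can be sketched in plain Python. This is only an illustration: the function name haversine_km and the mean Earth radius of 6371.009 km are assumptions of this sketch, not part of geopy's API.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.009  # assumed mean radius of a spherical Earth


def haversine_km(point_a, point_b, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(radians, point_a)
    lat2, lon2 = map(radians, point_b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine of the central angle between the two points
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))


newport_ri = (41.49008, -71.312796)
cleveland_oh = (41.499498, -81.695391)
distance_km = haversine_km(newport_ri, cleveland_oh)
print(distance_km, "km")
```

Because it assumes a sphere, the result should be close to the great_circle value rather than to the geodesic one.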
2. Adding altitude to the mix:
As mentioned, each of the above calculations yields a great circle distance between 2 points. That distance is also called "as the crow flies", assuming that the "crow" flies without changing altitude and as straight as possible from point A to point B.
We can have a better estimation of the "walking/driving" ("as the crow walks"??) distance by combining the result of one of the previous methods with the difference (delta) in altitude between point A and point B, inside the Euclidean Formula for distance calculation:
acw_dist = sqrt(great_circle(p1, p2).m**2 + (p1.z - p2.z)**2)
The previous solution is prone to errors especially the longer the real distance between the points is. I leave it here for comment continuation reasons.
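For short distances, the combination of a surface distance with an altitude difference described above can be sketched as follows. The helper name slant_distance_m is hypothetical, and the surface distance is assumed to come from one of the 2D methods, already converted to meters.

```python
from math import hypot


def slant_distance_m(surface_distance_m, alt_a_m, alt_b_m):
    """Combine a 2D surface distance with an altitude difference (all in meters)."""
    # Pythagoras: treats the surface distance as the flat leg of a right triangle
    return hypot(surface_distance_m, alt_b_m - alt_a_m)


# Example: 3000 m apart on the surface, 4000 m apart in altitude
example = slant_distance_m(3000.0, 150.0, 4150.0)
print(example)
```

The flat-triangle assumption is what makes this estimate degrade as the points get farther apart.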
GeoDjango Distance calculates the 2D distance between two points and doesn't take into consideration the altitude differences.
In order to get the 3D calculation, we need to create a distance function that will consider altitude differences in the calculation:
Theory:
The latitude, longitude and altitude are spherical (polar) coordinates, and we need to translate them to Cartesian coordinates (x, y, z) in order to apply the Euclidean formula to them and calculate their 3D distance.
Assume:
polar_point_1 = (long_1, lat_1, alt_1)
and polar_point_2 = (long_2, lat_2, alt_2)
Translate each point to its Cartesian equivalent by utilizing this formula, where r = R + alt is the distance from the center of the Earth (R being the Earth's radius, in the same unit as alt) and lat, long are expressed in radians:
x = r * cos(lat) * sin(long)
y = r * sin(lat)
z = r * cos(lat) * cos(long)
and you will have p_1 = (x_1, y_1, z_1) and p_2 = (x_2, y_2, z_2) points respectively.
Finally use the Euclidean formula:
dist = sqrt((x_2-x_1)**2 + (y_2-y_1)**2 + (z_2-z_1)**2)
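The conversion and the Euclidean formula can be put together in a short sketch. It assumes a perfectly spherical Earth with a mean radius of 6371009 m and takes the radial coordinate to be that radius plus the altitude. The function names are illustrative only.

```python
from math import cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371009.0  # assumed mean radius of a spherical Earth


def to_cartesian(lon_deg, lat_deg, alt_m, radius_m=EARTH_RADIUS_M):
    """Convert (lon, lat) in degrees and altitude in meters to Cartesian x, y, z."""
    lon, lat = radians(lon_deg), radians(lat_deg)
    r = radius_m + alt_m  # distance from the center of the sphere
    x = r * cos(lat) * sin(lon)
    y = r * sin(lat)
    z = r * cos(lat) * cos(lon)
    return x, y, z


def distance_3d_m(point_1, point_2):
    """Straight-line (chord) distance between two (lon, lat, alt) points."""
    x1, y1, z1 = to_cartesian(*point_1)
    x2, y2, z2 = to_cartesian(*point_2)
    return sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)


# Same position, altitude changed from 150 m to 200 m
print(distance_3d_m((100.0, 10.0, 150.0), (100.0, 10.0, 200.0)))
```

Note that this is the straight line through space between the two points, so for far-apart points it cuts through the Earth rather than following the surface.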

Using geopy is probably the easiest solution. Note that geopy's Point expects latitude first, then longitude, then altitude (see also the lonlat helper):
https://geopy.readthedocs.io/en/stable/#geopy.distance.lonlat
>>> from geopy.distance import distance
>>> from geopy.point import Point
>>> a = Point(41.49008, -71.312796, 0)
>>> b = Point(41.499498, -81.695391, 0)
>>> print(distance(a, b).miles)
538.3904453677203

Once converted into Cartesian coordinates, you can compute the norm with numpy:
np.linalg.norm(point_1 - point_2)

Related

What unit is the distance in when using cKDTree?

I am trying to calculate the distance between the closest points in two geodataframes.
I used the function created by jHUW here. The function is as follows:
import numpy as np
import pandas as pd
from scipy.spatial import cKDTree

def ckdnearest(gdA, gdB):
    nA = np.array(list(gdA.geometry.apply(lambda x: (x.x, x.y))))
    nB = np.array(list(gdB.geometry.apply(lambda x: (x.x, x.y))))
    btree = cKDTree(nB)
    dist, idx = btree.query(nA, k=1)
    gdB_nearest = gdB.iloc[idx].drop(columns="geometry").reset_index(drop=True)
    gdf = pd.concat(
        [
            gdA.reset_index(drop=True),
            gdB_nearest,
            pd.Series(dist, name='dist')
        ],
        axis=1)
    return gdf
It's working fine between my datasets, but I was wondering what unit the returned distance is in. I did some research and found that the unit will be the same as the unit of the array used. I used an array of lat-lons, like so:
array([[-122.3295182, 47.6202074],
[-122.296276 , 37.8789939],
[-122.6857603, 45.5289172],
[-118.3804073, 33.9017057],
[ -93.2911788, 44.860997 ]])
I tried to find out what the units of lat-lons would be, but was unsuccessful. I also checked the distance between some of the point pairs on Google Maps to get some insight, but couldn't make sense of them. For instance, Google Maps shows a distance of 1.5 miles for my first pair, but the distance returned by the function is 0.0087466. I understand that cKDTree calculates the Euclidean distance, but even then the difference seems quite large. Please provide some insight if you have any.
The result from SciPy is indeed an L2 norm (i.e. the Euclidean distance). The meaning of this distance depends on the chosen coordinate system. In your case, you appear to use a geographic coordinate system (which is a spherical coordinate system). As a result, the coordinates are angles and cannot be linearly converted to meters (for example, in Antarctica changing the longitude angle barely changes the distance in meters). Additionally, one needs to consider the distortion of the space when computing the distance: a straight line on the Earth's surface is a geodesic in your geographic space. The L2 norm computed by SciPy does not account for this. In fact, using this metric will probably give wrong results: the computed L2 norm overestimates the actual distance (along the geodesic, whether in meters or radians) in Antarctica compared to the equator. This means two points near the North Pole can appear as far apart as two points located in Japan and Europe respectively... Thus, you certainly need to use a better metric. As for the unit of the distance, it does not make much sense, mainly because of this issue. One relatively good metric would be the length of the geodesic (possibly in meters) or the angle between the two points. Unfortunately, AFAIK this is not possible with SciPy... Using a GIS library (like GDAL) may help.
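To illustrate why a difference in degrees has no fixed length, here is a small sketch of how many kilometers one degree covers at different latitudes. It assumes a spherical Earth of radius 6371 km, and the helper name km_per_degree is made up for this example.

```python
from math import cos, pi, radians

EARTH_RADIUS_KM = 6371.0  # assumed radius of a spherical Earth


def km_per_degree(lat_deg, radius_km=EARTH_RADIUS_KM):
    """Return (km per degree of latitude, km per degree of longitude) at a latitude."""
    per_degree_lat = pi * radius_km / 180.0
    # Meridians converge towards the poles, so a degree of longitude shrinks
    per_degree_lon = per_degree_lat * cos(radians(lat_deg))
    return per_degree_lat, per_degree_lon


for lat in (0, 47.62, 80):
    north_south, east_west = km_per_degree(lat)
    print(f"lat {lat:>5}: 1 deg lat = {north_south:.1f} km, 1 deg lon = {east_west:.1f} km")
```

A Euclidean distance computed on raw lon/lat values mixes these two scales, which is why it cannot be converted to meters with a single factor.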

Point in Spherical Polygon using Python [duplicate]

Say I have an arbitrary set of latitude and longitude pairs representing points on some simple, closed curve. In Cartesian space I could easily calculate the area enclosed by such a curve using Green's Theorem. What is the analogous approach to calculating the area on the surface of a sphere? I guess what I am after is (even some approximation of) the algorithm behind Matlab's areaint function.
There are several ways to do this.
1) Integrate the contributions from latitudinal strips. Here the area of each strip will be (R*cos(A)*(B1-B0)) * (R*dA), where A is the latitude, B1 and B0 are the starting and ending longitudes, and all angles are in radians.
2) Break the surface into spherical triangles, calculate the area of each using Girard's Theorem, and add these up.
3) As suggested here by James Schek, in GIS work they use an area-preserving projection onto a flat space and calculate the area there.
From the description of your data, it sounds like the first method might be the easiest. (Of course, there may be other easier methods I don't know of.)
Edit – comparing these two methods:
On first inspection, it may seem that the spherical triangle approach is easiest, but, in general, this is not the case. The problem is that one not only needs to break the region up into triangles, but into spherical triangles, that is, triangles whose sides are great circle arcs. For example, latitudinal boundaries don't qualify, so these boundaries need to be broken up into edges that better approximate great circle arcs. And this becomes more difficult to do for arbitrary edges where the great circles require specific combinations of spherical angles. Consider, for example, how one would break up a middle band around a sphere, say all the area between lat 0 and 45deg into spherical triangles.
In the end, if one is to do this properly with similar errors for each method, method 2 will give fewer triangles, but they will be harder to determine. Method 1 gives more strips, but they are trivial to determine. Therefore, I suggest method 1 as the better approach.
I rewrote MATLAB's "areaint" function in Java, which gives exactly the same result.
"areaint" calculates the "surface per unit", so I multiplied the answer by the Earth's surface area (5.10072e14 sq m).
private double area(ArrayList<Double> lats,ArrayList<Double> lons)
{
double sum=0;
double prevcolat=0;
double prevaz=0;
double colat0=0;
double az0=0;
for (int i=0;i<lats.size();i++)
{
double colat=2*Math.atan2(Math.sqrt(Math.pow(Math.sin(lats.get(i)*Math.PI/180/2), 2)+ Math.cos(lats.get(i)*Math.PI/180)*Math.pow(Math.sin(lons.get(i)*Math.PI/180/2), 2)),Math.sqrt(1- Math.pow(Math.sin(lats.get(i)*Math.PI/180/2), 2)- Math.cos(lats.get(i)*Math.PI/180)*Math.pow(Math.sin(lons.get(i)*Math.PI/180/2), 2)));
double az=0;
if (lats.get(i)>=90)
{
az=0;
}
else if (lats.get(i)<=-90)
{
az=Math.PI;
}
else
{
az=Math.atan2(Math.cos(lats.get(i)*Math.PI/180) * Math.sin(lons.get(i)*Math.PI/180),Math.sin(lats.get(i)*Math.PI/180))% (2*Math.PI);
}
if(i==0)
{
colat0=colat;
az0=az;
}
if(i>0 && i<lats.size())
{
sum=sum+(1-Math.cos(prevcolat + (colat-prevcolat)/2))*Math.PI*((Math.abs(az-prevaz)/Math.PI)-2*Math.ceil(((Math.abs(az-prevaz)/Math.PI)-1)/2))* Math.signum(az-prevaz);
}
prevcolat=colat;
prevaz=az;
}
sum=sum+(1-Math.cos(prevcolat + (colat0-prevcolat)/2))*(az0-prevaz);
return 5.10072E14* Math.min(Math.abs(sum)/4/Math.PI,1-Math.abs(sum)/4/Math.PI);
}
You mention "geography" in one of your tags so I can only assume you are after the area of a polygon on the surface of a geoid. Normally, this is done using a projected coordinate system rather than a geographic coordinate system (i.e. lon/lat). If you were to do it in lon/lat, then I would assume the unit-of-measure returned would be percent of sphere surface.
If you want to do this with a more "GIS" flavor, then you need to select an unit-of-measure for your area and find an appropriate projection that preserves area (not all do). Since you are talking about calculating an arbitrary polygon, I would use something like a Lambert Azimuthal Equal Area projection. Set the origin/center of the projection to be the center of your polygon, project the polygon to the new coordinate system, then calculate the area using standard planar techniques.
If you needed to do many polygons in a geographic area, there are likely other projections that will work (or will be close enough). UTM, for example, is an excellent approximation if all of your polygons are clustered around a single meridian.
I am not sure if any of this has anything to do with how Matlab's areaint function works.
I don't know anything about Matlab's function, but here we go. Consider splitting your spherical polygon into spherical triangles, say by drawing diagonals from a vertex. The surface area of a spherical triangle is given by
R^2 * ( A + B + C - \pi)
where R is the radius of the sphere, and A, B, and C are the interior angles of the triangle (in radians). The quantity in the parentheses is known as the "spherical excess".
Your n-sided polygon will be split into n-2 triangles. Summing over all the triangles, extracting the common factor of R^2, and bringing all of the \pi together, the area of your polygon is
R^2 * ( S - (n-2)\pi )
where S is the angle sum of your polygon. The quantity in parentheses is again the spherical excess of the polygon.
[edit] This is true whether or not the polygon is convex. All that matters is that it can be dissected into triangles.
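As a minimal sketch of the spherical excess formula above, assuming the interior angles are already known and given in radians (the function name is illustrative):

```python
from math import pi


def spherical_polygon_area(interior_angles_rad, radius=1.0):
    """Area of a spherical polygon from its interior angles: R^2 * (S - (n-2)*pi)."""
    n = len(interior_angles_rad)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    excess = sum(interior_angles_rad) - (n - 2) * pi  # spherical excess
    return radius ** 2 * excess


# One octant of the unit sphere: a triangle with three right angles
octant = spherical_polygon_area([pi / 2, pi / 2, pi / 2])
print(octant, 4 * pi / 8)
```

The octant is a convenient sanity check because its area must be one eighth of the sphere's total surface.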
You can determine the angles from a bit of vector math. Suppose you have three vertices A,B,C and are interested in the angle at B. We must therefore find two tangent vectors (their magnitudes are irrelevant) to the sphere from point B along the great circle segments (the polygon edges). Let's work it out for BA. The great circle lies in the plane defined by OA and OB, where O is the center of the sphere, so it should be perpendicular to the normal vector OA x OB. It should also be perpendicular to OB since it's tangent there. Such a vector is therefore given by OB x (OA x OB). You can use the right-hand rule to verify that this is in the appropriate direction. Note also that this simplifies to OA * (OB.OB) - OB * (OB.OA) = OA * |OB|^2 - OB * (OB.OA).
You can then use the good ol' dot product to find the angle between sides: BA'.BC' = |BA'|*|BC'|*cos(B), where BA' and BC' are the tangent vectors from B along sides to A and C.
[edited to be clear that these are tangent vectors, not literal between the points]
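The tangent-vector construction can be sketched in plain Python as follows. Vertices are assumed to be 3D position vectors from the center of the sphere, and the helper names are hypothetical. The returned angle lies between 0 and pi, so this sketch does not detect reflex angles of non-convex polygons.

```python
from math import acos, sqrt


def cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3D vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def tangent_at(b, a):
    """Tangent at b along the great circle towards a: OB x (OA x OB)."""
    return cross(b, cross(a, b))


def interior_angle(a, b, c):
    """Angle at vertex b between the great-circle edges b->a and b->c."""
    t_ba = tangent_at(b, a)
    t_bc = tangent_at(b, c)
    cos_angle = dot(t_ba, t_bc) / sqrt(dot(t_ba, t_ba) * dot(t_bc, t_bc))
    # Clamp to guard against rounding slightly outside [-1, 1]
    return acos(max(-1.0, min(1.0, cos_angle)))


# Octant triangle on the unit sphere: every interior angle is a right angle
print(interior_angle((1, 0, 0), (0, 1, 0), (0, 0, 1)))
```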
Here is a Python 3 implementation, loosely inspired by the above answers:
def polygon_area(lats, lons, algorithm = 0, radius = 6378137):
    """
    Computes area of spherical polygon, assuming spherical Earth.
    Returns result in the units of the provided radius (squared) if the radius is specified.
    Otherwise, as a ratio of the sphere's area.
    lats and lons are in degrees.
    """
    import numpy as np
    from numpy import arctan2, cos, sin, sqrt, pi, append, diff
    lats = np.deg2rad(lats)
    lons = np.deg2rad(lons)
    # Line integral based on Green's Theorem, assumes spherical Earth
    #close polygon
    if lats[0]!=lats[-1]:
        lats = append(lats, lats[0])
        lons = append(lons, lons[0])
    #colatitudes relative to (0,0)
    a = sin(lats/2)**2 + cos(lats)* sin(lons/2)**2
    colat = 2*arctan2( sqrt(a), sqrt(1-a) )
    #azimuths relative to (0,0)
    az = arctan2(cos(lats) * sin(lons), sin(lats)) % (2*pi)
    # Calculate diffs
    # daz = diff(az) % (2*pi)
    daz = diff(az)
    daz = (daz + pi) % (2 * pi) - pi
    deltas=diff(colat)/2
    colat=colat[0:-1]+deltas
    # Perform integral
    integrands = (1-cos(colat)) * daz
    # Integrate
    area = abs(sum(integrands))/(4*pi)
    area = min(area,1-area)
    if radius is not None: #return in units of radius
        return area * 4*pi*radius**2
    else: #return in ratio of sphere total area
        return area
Please find a somewhat more explicit version (and with many more references and TODOs...) here.
You could also have a look at this code of the spherical_geometry package: Here and here. It does provide two different methods for calculating the area of a spherical polygon.

Scipy: how to convert KD-Tree distance from query to kilometers (Python/Pandas)

This post builds upon this one.
I got a Pandas dataframe containing cities with their geo-coordinates (geodetic) as longitude and latitude.
import pandas as pd
df = pd.DataFrame([{'city':"Berlin", 'lat':52.5243700, 'lng':13.4105300},
{'city':"Potsdam", 'lat':52.3988600, 'lng':13.0656600},
{'city':"Hamburg", 'lat':53.5753200, 'lng':10.0153400}]);
For each city I'm trying to find two other cities that are closest. Therefore I tried the scipy.spatial.KDTree. To do so, I had to convert the geodetic coordinates into 3D Cartesian coordinates (ECEF = earth-centered, earth-fixed):
from math import *
def to_Cartesian(lat, lng):
    R = 6367 # radius of the Earth in kilometers
    x = R * cos(lat) * cos(lng)
    y = R * cos(lat) * sin(lng)
    z = R * sin(lat)
    return x, y, z
df['x'], df['y'], df['z'] = zip(*map(to_Cartesian, df['lat'], df['lng']))
df
This gives me a dataframe with the additional columns x, y and z.
With this I can create the KDTree:
coordinates = list(zip(df['x'], df['y'], df['z']))
from scipy import spatial
tree = spatial.KDTree(coordinates)
tree.data
Now I'm testing it with Berlin,
tree.query(coordinates[0], 2)
which correctly gives me Berlin (itself) and Potsdam as the two cities from my list that are closest to Berlin.
Question: But I wonder what to do with the distance from that query? It says 1501 - but how can I convert this to meters or kilometers? The real distance between Berlin and Potsdam is 27km and not 1501km.
Remark: I know I could get longitude/latitude for both cities and calculate the haversine distance. But it would be cool to use the output from KDTree instead.
(array([ 0. , 1501.59637685]), array([0, 1]))
Any help is appreciated.
The KDTree is computing the euclidean distance between the two points (cities). The two cities and the center of the earth form an isosceles triangle.
The German wikipedia entry contains a nice overview of the geometric properties which the English entry lacks. You can use this to compute the distance.
import numpy as np
def deg2rad(degree):
    rad = degree * 2*np.pi / 360
    return(rad)
def distToKM(x):
    R = 6367 # earth radius
    gamma = 2*np.arcsin(deg2rad(x/(2*R))) # compute the angle of the isosceles triangle
    dist = 2*R*np.sin(gamma/2) # compute the side of the triangle
    return(dist)
distToKM(1501.59637685)
# 26.207800812050056
Update
After the comment about obtaining the opposite I re-read the question and realised that while it seems that one can use the proposed function above, the real problem lies somewhere else.
cos and sin in your function to_Cartesian expect the input to be in radians (documentation), whereas you are handing them the angles in degrees. You can use the function deg2rad defined above to transform the latitude and longitude to radians. This should give you the distance in km directly from the KDTree.
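A sketch of the corrected conversion, using the same radius as the question. The lowercase function name to_cartesian and the helper chord_to_arc_km are additions for illustration: a KD-tree over these coordinates returns the straight-line chord through the Earth, which the helper converts to the distance along the surface.

```python
from math import asin, cos, radians, sin, sqrt

R_KM = 6367.0  # Earth radius in kilometers, same value as in the question


def to_cartesian(lat_deg, lng_deg, radius_km=R_KM):
    """Convert latitude/longitude in degrees to 3D Cartesian coordinates in km."""
    lat, lng = radians(lat_deg), radians(lng_deg)  # trig functions need radians
    return (radius_km * cos(lat) * cos(lng),
            radius_km * cos(lat) * sin(lng),
            radius_km * sin(lat))


def chord_to_arc_km(chord_km, radius_km=R_KM):
    """Convert a straight-line chord length to the great-circle arc length."""
    return 2 * radius_km * asin(chord_km / (2 * radius_km))


berlin = to_cartesian(52.52437, 13.41053)
potsdam = to_cartesian(52.39886, 13.06566)
# Euclidean distance, i.e. what a KD-tree query would report for this pair
chord = sqrt(sum((p - q) ** 2 for p, q in zip(berlin, potsdam)))
arc = chord_to_arc_km(chord)
print(chord, arc)
```

For nearby cities the chord and the arc are almost identical, so the conversion only matters over long distances.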

Efficiently finding the closest coordinate pair from a set in Python

The Problem
Imagine I am stood in an airport. Given a geographic coordinate pair, how can one efficiently determine which airport I am stood in?
Inputs
A coordinate pair (x,y) representing the location I am stood at.
A set of coordinate pairs [(a1,b1), (a2,b2)...] where each coordinate pair represents one airport.
Desired Output
A coordinate pair (a,b) from the set of airport coordinate pairs representing the closest airport to the point (x,y).
Inefficient Solution
Here is my inefficient attempt at solving this problem. It is clearly linear in the length of the set of airports.
shortest_distance = None
shortest_distance_coordinates = None
point = (50.776435, -0.146834)
for airport in airports:
    distance = compute_distance(point, airport)
    # Check for None first: comparing a number with None raises TypeError in Python 3
    if shortest_distance is None or distance < shortest_distance:
        shortest_distance = distance
        shortest_distance_coordinates = airport
The Question
How can this solution be improved? This might involve some way of pre-filtering the list of airports based on the coordinates of the location we are currently stood at, or sorting them in a certain order beforehand.
Using a k-dimensional tree:
>>> from scipy import spatial
>>> airports = [(10,10),(20,20),(30,30),(40,40)]
>>> tree = spatial.KDTree(airports)
>>> tree.query([(21,21)])
(array([ 1.41421356]), array([1]))
Where 1.41421356 is the distance between the queried point and the nearest neighbour and 1 is the index of the neighbour.
See: http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.query.html#scipy.spatial.KDTree.query
If your coordinates are unsorted, the search can only be improved slightly, assuming they are (latitude, longitude) pairs, by filtering on latitude first, since for the Earth
1 degree of latitude on the sphere is 111.2 km or 69 miles
but that would not give a huge speedup.
If you sort the airports by latitude first then you can use a binary search for finding the first airport that could match (airport_lat >= point_lat-tolerance) and then only compare up to the last one that could match (airport_lat <= point_lat+tolerance) - but take care of 0 degrees equaling 360. While you cannot use that library directly, the sources of bisect are a good start for implementing a binary search.
While technically this way the search is still O(n), you have much fewer actual distance calculations (depending on tolerance) and few latitude comparisons. So you will have a huge speedup.
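The latitude pre-filter described above could look roughly like this with the standard bisect module. The airport list, the tolerance and the function name are made up for illustration, and the planar distance in degrees is only a stand-in for a proper metric such as the haversine distance. Longitude wrap-around is not handled here.

```python
from bisect import bisect_left, bisect_right
from math import hypot

# Hypothetical airport list as (latitude, longitude), sorted once by latitude.
airports = sorted([(10, 10), (40, 40), (20, 20), (30, 30)])
latitudes = [lat for lat, _ in airports]


def nearest_in_band(point, airports, latitudes, tolerance_deg):
    """Return the closest airport within +/- tolerance_deg of latitude, or None."""
    lo = bisect_left(latitudes, point[0] - tolerance_deg)
    hi = bisect_right(latitudes, point[0] + tolerance_deg)
    candidates = airports[lo:hi]
    if not candidates:
        return None
    # Planar distance in degrees is only a placeholder metric for this sketch.
    return min(candidates, key=lambda a: hypot(a[0] - point[0], a[1] - point[1]))


closest = nearest_in_band((21, 21), airports, latitudes, tolerance_deg=5)
print(closest)
```

If the band turns out to be empty, the tolerance has to be widened and the search repeated.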
From this SO question:
import numpy as np
def closest_node(node, nodes):
    nodes = np.asarray(nodes)
    deltas = nodes - node
    dist_2 = np.einsum('ij,ij->i', deltas, deltas)
    return np.argmin(dist_2)
where node is a tuple with two values (x, y) and nodes is an array of tuples with two values ([(x_1, y_1), (x_2, y_2),])
The answer of @Juddling is great, but KDTree does not support haversine distance, which is better suited for latitude/longitude coordinates.
For the haversine distance you can use BallTree. Please note, that you need to convert your coordinates to radians first.
from math import radians
from sklearn.neighbors import BallTree
import numpy as np
airports = [(10,10),(20,20),(30,30),(40,40)]
airports_rad = np.array([[radians(x[0]), radians(x[1])] for x in airports ])
tree = BallTree(airports_rad , metric = 'haversine')
result = tree.query([(radians(21),radians(21))])
print(result)
gives
(array([[0.02391369]]), array([[1]], dtype=int64))
To convert the distance to meters you need to multiply by the earth radius (in meters).
earth_radius = 6371000 # meters in earth
print(result[0][0] * earth_radius)
[152354.11114795]

How to calculate the area of a polygon on the earth's surface using python?

The title basically says it all. I need to calculate the area inside a polygon on the Earth's surface using Python. Calculating area enclosed by arbitrary polygon on Earth's surface says something about it, but remains vague on the technical details:
If you want to do this with a more
"GIS" flavor, then you need to select
an unit-of-measure for your area and
find an appropriate projection that
preserves area (not all do). Since you
are talking about calculating an
arbitrary polygon, I would use
something like a Lambert Azimuthal
Equal Area projection. Set the
origin/center of the projection to be
the center of your polygon, project
the polygon to the new coordinate
system, then calculate the area using
standard planar techniques.
So, how do I do this in Python?
Let's say you have a representation of the state of Colorado in GeoJSON format
{"type": "Polygon",
"coordinates": [[
[-102.05, 41.0],
[-102.05, 37.0],
[-109.05, 37.0],
[-109.05, 41.0]
]]}
All coordinates are longitude, latitude. You can use pyproj to project the coordinates and Shapely to find the area of any projected polygon:
co = {"type": "Polygon", "coordinates": [
[(-102.05, 41.0),
(-102.05, 37.0),
(-109.05, 37.0),
(-109.05, 41.0)]]}
lon, lat = zip(*co['coordinates'][0])
from pyproj import Proj
pa = Proj("+proj=aea +lat_1=37.0 +lat_2=41.0 +lat_0=39.0 +lon_0=-106.55")
That's an equal area projection centered on and bracketing the area of interest. Now make new projected GeoJSON representation, turn into a Shapely geometric object, and take the area:
x, y = pa(lon, lat)
cop = {"type": "Polygon", "coordinates": [list(zip(x, y))]}  # list() is needed in Python 3, where zip returns an iterator
from shapely.geometry import shape
shape(cop).area # 268952044107.43506
It's a very close approximation to the surveyed area. For more complex features, you'll need to sample along the edges, between the vertices, to get accurate values. All caveats above about datelines, etc, apply. If you're only interested in area, you can translate your feature away from the dateline before projecting.
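Sampling along the edges before projecting could be sketched like this. The function name densify is an assumption, and the interpolation is linear in longitude/latitude, which follows parallels and meridians (as the Colorado border does) rather than great circles.

```python
def densify(ring, segments_per_edge):
    """Insert evenly spaced points along each edge of a ring of (lon, lat) tuples.

    The ring is given without a repeated closing vertex; the last edge
    runs from the final vertex back to the first one.
    """
    dense = []
    n = len(ring)
    for i in range(n):
        (x0, y0), (x1, y1) = ring[i], ring[(i + 1) % n]
        for k in range(segments_per_edge):
            t = k / segments_per_edge  # fraction of the way along the edge
            dense.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return dense


colorado = [(-102.05, 41.0), (-102.05, 37.0), (-109.05, 37.0), (-109.05, 41.0)]
dense_ring = densify(colorado, 4)
print(len(dense_ring))
```

The densified ring can then be projected point by point in place of the original four corners.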
The easiest way to do this (in my opinion), is to project things into (a very simple) equal-area projection and use one of the usual planar techniques for calculating area.
First off, I'm going to assume that a spherical earth is close enough for your purposes, if you're asking this question. If not, then you need to reproject your data using an appropriate ellipsoid, in which case you're going to want to use an actual projection library (everything uses proj4 behind the scenes, these days) such as the python bindings to GDAL/OGR or (the much more friendly) pyproj.
However, if you're okay with a spherical earth, it is quite simple to do this without any specialized libraries.
The simplest equal-area projection to calculate is a sinusoidal projection. Basically, you just multiply the latitude by the length of one degree of latitude, and the longitude by the length of a degree of latitude and the cosine of the latitude.
def reproject(latitude, longitude):
    """Returns the x & y coordinates in meters using a sinusoidal projection"""
    from math import pi, cos, radians
    earth_radius = 6371009 # in meters
    lat_dist = pi * earth_radius / 180.0
    y = [lat * lat_dist for lat in latitude]
    x = [long * lat_dist * cos(radians(lat))
         for lat, long in zip(latitude, longitude)]
    return x, y
Okay... Now all we have to do is to calculate the area of an arbitrary polygon in a plane.
There are a number of ways to do this. I'm going to use what is probably the most common one here.
def area_of_polygon(x, y):
    """Calculates the area of an arbitrary polygon given its vertices"""
    area = 0.0
    for i in range(-1, len(x)-1):
        area += x[i] * (y[i+1] - y[i-1])
    return abs(area) / 2.0
Hopefully that will point you in the right direction, anyway...
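Putting the two steps together for the Colorado rectangle used elsewhere on this page might look like the sketch below. It re-implements both helpers so that it runs on its own, and it uses the common cross-product form of the shoelace formula, which is a substitution for the variant shown above.

```python
from math import cos, pi, radians

EARTH_RADIUS_M = 6371009  # mean Earth radius in meters, as in the answer above


def reproject(latitude, longitude):
    """Sinusoidal projection: returns x and y lists in meters."""
    lat_dist = pi * EARTH_RADIUS_M / 180.0  # meters per degree of latitude
    y = [lat * lat_dist for lat in latitude]
    x = [lon * lat_dist * cos(radians(lat)) for lat, lon in zip(latitude, longitude)]
    return x, y


def area_of_polygon(x, y):
    """Shoelace formula for a planar polygon given by its vertices."""
    n = len(x)
    twice_area = sum(x[i] * y[(i + 1) % n] - x[(i + 1) % n] * y[i] for i in range(n))
    return abs(twice_area) / 2.0


lons = [-102.05, -102.05, -109.05, -109.05]
lats = [41.0, 37.0, 37.0, 41.0]
x, y = reproject(lats, lons)
colorado_m2 = area_of_polygon(x, y)
print(colorado_m2)  # area in square meters
```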
A bit late perhaps, but here is a different method, using Girard's theorem. It states that the area of a polygon whose edges are great-circle arcs is R**2 times (the sum of the interior angles of the polygon minus (N-2)*pi), where N is the number of corners.
I thought this would be worth posting, since it doesn't rely on any other libraries than numpy, and it is a quite different method than the others. Of course, this only works on a sphere, so there will be some inaccuracy when applying it to the Earth.
First, I define a function to compute the bearing angle from point 1 along a great circle to point 2:
import numpy as np
from numpy import cos, sin, arctan2
d2r = np.pi/180
def greatCircleBearing(lon1, lat1, lon2, lat2):
    dLong = lon1 - lon2
    s = cos(d2r*lat2)*sin(d2r*dLong)
    c = cos(d2r*lat1)*sin(d2r*lat2) - sin(lat1*d2r)*cos(d2r*lat2)*cos(d2r*dLong)
    return np.arctan2(s, c)
Now I can use this to find the angles, and then the area (In the following, lons and lats should of course be specified, and they should be in the right order. Also, the radius of the sphere should be specified.)
N = len(lons)
angles = np.empty(N)
for i in range(N):
    phiB1, phiA, phiB2 = np.roll(lats, i)[:3]
    LB1, LA, LB2 = np.roll(lons, i)[:3]
    # calculate angle with north (eastward)
    beta1 = greatCircleBearing(LA, phiA, LB1, phiB1)
    beta2 = greatCircleBearing(LA, phiA, LB2, phiB2)
    # calculate angle between the polygons and add to angle array
    angles[i] = np.arccos(cos(-beta1)*cos(-beta2) + sin(-beta1)*sin(-beta2))
area = (sum(angles) - (N-2)*np.pi)*R**2
With the Colorado coordinates given in another reply, and with Earth radius 6371 km, I get that the area is 268930758560.74808
Or simply use a library: https://github.com/scisco/area
>>> from area import area
>>> obj = {'type':'Polygon','coordinates':[[[-180,-90],[-180,90],[180,90],[180,-90],[-180,-90]]]}
>>> area(obj)
511207893395811.06
...returns the area in square meters.
You can compute the area directly on the sphere, instead of using an equal-area projection.
Moreover, according to this discussion, it seems that Girard's theorem (sulkeh's answer) does not give accurate results in certain cases, for example "the area enclosed by a 30º lune from pole to pole and bounded by the prime meridian and 30ºE" (see here).
A more precise solution would be to perform line integral directly on the sphere. The comparison below shows this method is more precise.
Like all other answers, I should mention the caveat that we assume a spherical earth, but I assume that for non-critical purposes this is enough.
Python implementation
Here is a Python 3 implementation which uses line integral and Green's theorem:
def polygon_area(lats, lons, radius = 6378137):
    """
    Computes area of spherical polygon, assuming spherical Earth.
    Returns result in the units of the provided radius (squared) if the radius is specified.
    Otherwise, as a ratio of the sphere's area.
    lats and lons are in degrees.
    """
    import numpy as np
    from numpy import arctan2, cos, sin, sqrt, pi, append, diff
    lats = np.deg2rad(lats)
    lons = np.deg2rad(lons)
    # Line integral based on Green's Theorem, assumes spherical Earth
    #close polygon
    if lats[0]!=lats[-1]:
        lats = append(lats, lats[0])
        lons = append(lons, lons[0])
    #colatitudes relative to (0,0)
    a = sin(lats/2)**2 + cos(lats)* sin(lons/2)**2
    colat = 2*arctan2( sqrt(a), sqrt(1-a) )
    #azimuths relative to (0,0)
    az = arctan2(cos(lats) * sin(lons), sin(lats)) % (2*pi)
    # Calculate diffs
    # daz = diff(az) % (2*pi)
    daz = diff(az)
    daz = (daz + pi) % (2 * pi) - pi
    deltas=diff(colat)/2
    colat=colat[0:-1]+deltas
    # Perform integral
    integrands = (1-cos(colat)) * daz
    # Integrate
    area = abs(sum(integrands))/(4*pi)
    area = min(area,1-area)
    if radius is not None: #return in units of radius
        return area * 4*pi*radius**2
    else: #return in ratio of sphere total area
        return area
I wrote a somewhat more explicit version (and with many more references and TODOs...) in the sphericalgeometry package there.
Numerical Comparison
Colorado will be the reference, since all previous answers were evaluated on its area. Its precise total area is 104,093.67 square miles (from the US Census Bureau, p. 89, see also here), or 269601367661 square meters. I found no source for the actual methodology of the USCB, but I assume it is based on summing actual measurements on ground, or precise computations using WGS84/EGM2008.
Method | Author | Result | Variation from ground truth
--------------------------------------------------------------------------------
Albers Equal Area | sgillies | 268952044107 | -0.24%
Sinusoidal | J. Kington | 268885360163 | -0.26%
Girard's theorem | sulkeh | 268930758560 | -0.25%
Equal Area Cylindrical | Jason | 268993609651 | -0.22%
Line integral | Yellows | 269397764066 | **-0.07%**
Conclusion: using direct integral is more precise.
Performance
I have not benchmarked the different methods, and comparing pure Python code with compiled PROJ projections would not be meaningful. Intuitively, the line integral needs fewer computations. On the other hand, its trigonometric functions may be computationally expensive.
Here is a solution that uses basemap, instead of pyproj and shapely, for the coordinate conversion. The idea is the same as the one suggested by @sgillies, though. Note that I've added a 5th point so that the path is a closed loop.
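import numpy
from mpl_toolkits.basemap import Basemap
# (lon, lat) pairs; the first point is repeated to close the loop
coordinates = numpy.array([
    [-102.05, 41.0],
    [-102.05, 37.0],
    [-109.05, 37.0],
    [-109.05, 41.0],
    [-102.05, 41.0]])
lats = coordinates[:, 1]
lons = coordinates[:, 0]
lat1 = numpy.min(lats)
lat2 = numpy.max(lats)
lon1 = numpy.min(lons)
lon2 = numpy.max(lons)
# Cylindrical equal-area projection bounded by the polygon's extent
bmap = Basemap(projection='cea', llcrnrlat=lat1, llcrnrlon=lon1, urcrnrlat=lat2, urcrnrlon=lon2)
xs, ys = bmap(lons, lats)
# Shoelace formula on the projected coordinates (square meters)
area = numpy.abs(0.5*numpy.sum(ys[:-1]*numpy.diff(xs) - xs[:-1]*numpy.diff(ys)))
area = area/1e6  # convert to square kilometers
print(area)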
The result is 268993.609651 in km^2.
UPDATE: Basemap has been deprecated, so you may want to consider alternative solutions first.
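The area expression in the snippet above is the shoelace formula applied to the projected x/y coordinates of a closed ring. A minimal pure-Python sketch of the same expression on a planar square may make it easier to follow (the function name is illustrative and not part of basemap):

```python
def shoelace_area(xs, ys):
    """Planar area of a closed ring (first point repeated at the end)."""
    total = 0.0
    for i in range(len(xs) - 1):
        dx = xs[i + 1] - xs[i]
        dy = ys[i + 1] - ys[i]
        # Same term as ys[:-1]*diff(xs) - xs[:-1]*diff(ys) in the numpy version
        total += ys[i] * dx - xs[i] * dy
    return abs(0.5 * total)


# A 2 x 2 square, closed by repeating the first vertex
square_x = [0.0, 2.0, 2.0, 0.0, 0.0]
square_y = [0.0, 0.0, 2.0, 2.0, 0.0]
print(shoelace_area(square_x, square_y))  # 4.0
```

The result is only meaningful as a surface area when the x/y coordinates come from an equal-area projection, which is why the answer uses 'cea'.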
Because the earth is a closed surface a closed polygon drawn on its surface creates TWO polygonal areas. You also need to define which one is inside and which is outside!
Most times people will be dealing with small polygons, and so it's 'obvious' but once you have things the size of oceans or continents, you better make sure you get this the right way round.
Also, remember that lines can go from (-179,0) to (+179,0) in two different ways. One is very much longer than the other. Again, mostly you'll make the assumption that this is a line that goes from (-179,0) to (-180,0) which is (+180,0) and then to (+179,0), but one day... it won't.
Treating lat-long like a simple (x,y) coordinate system, or even neglecting the fact that any coordinate projection is going to have distortions and breaks, can make you fail big-time on spheres.
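To illustrate the two ways around the sphere: a naive longitude difference picks the long path, while wrapping the difference into [-180, 180) picks the short one. This is a sketch with an illustrative helper name. It only encodes the "shortest way" assumption described above, which is not always the edge that was intended.

```python
def shortest_lon_delta(lon1, lon2):
    """Signed longitude change from lon1 to lon2 along the shorter way, in degrees."""
    return (lon2 - lon1 + 180.0) % 360.0 - 180.0


naive = 179.0 - (-179.0)                     # 358 degrees, the long way round
wrapped = shortest_lon_delta(-179.0, 179.0)  # -2 degrees, across the antimeridian
print(naive, wrapped)
```

The line-integral implementation earlier on this page applies the same kind of wrap, in radians, to its azimuth differences.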
I know that answering 10 years later has some advantages, but to somebody that looks today at this question it seems fair to provide an updated answer.
pyproj directly calculates areas, without need of calling shapely:
# Modules:
from pyproj import Geod
import numpy as np
# Define WGS84 as CRS:
geod = Geod('+a=6378137 +f=0.0033528106647475126')
# Data for Colorado (no need to close the polygon):
coordinates = np.array([
    [-102.05, 41.0],
    [-102.05, 37.0],
    [-109.05, 37.0],
    [-109.05, 41.0]])
lats = coordinates[:,1]
lons = coordinates[:,0]
# Compute:
area, perim = geod.polygon_area_perimeter(lons, lats)
print(abs(area)) # Positive is counterclockwise, the data is clockwise.
The result is 269154.54988400977 km² (polygon_area_perimeter returns square meters here, so the printed value has to be divided by 1e6), or -0.17% off the reported correct value (269601.367661 km²).
According to Yellows' assertion, the direct integral is more precise.
But Yellows uses an Earth radius of 6378137 m, which is the semi-major axis of the WGS-84 ellipsoid, while sulkeh uses 6371000 m.
Using a radius of 6378137 m in sulkeh's method gives 269533625893 square meters.
Assuming that the true value of Colorado's area (from the US Census Bureau) is 269601367661 square meters, the variation of sulkeh's method from the ground truth is -0.025%, better than the -0.07% of the line integral method.
So sulkeh's proposal seems to be the most precise so far.
To compare the solutions numerically under the assumption of a spherical Earth, all calculations must use the same terrestrial radius.
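Because the area of a spherical polygon scales with the square of the radius, a result computed with one radius can be rescaled to another without rerunning the method. A short sketch using the figures quoted above (the function names are illustrative):

```python
def rescale_area(area, radius_used, radius_target):
    """Rescale a spherical area from one sphere radius to another."""
    return area * (radius_target / radius_used) ** 2


def percent_deviation(value, reference):
    """Signed deviation of value from reference, in percent."""
    return (value - reference) / reference * 100.0


GROUND_TRUTH = 269601367661    # square meters, US Census Bureau figure
girard_6371000 = 268930758560  # sulkeh's result with R = 6371000 m

# Same result expressed for R = 6378137 m (WGS-84 semi-major axis)
girard_6378137 = rescale_area(girard_6371000, 6371000.0, 6378137.0)
print(round(girard_6378137))
print(round(percent_deviation(girard_6378137, GROUND_TRUTH), 3))
```

The rescaled value should land very close to the 269533625893 square meters quoted above, with a deviation of about -0.025%.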
Here is a Python 3 implementation where the function takes a list of (lon, lat) tuples and returns the area enclosed by the projected polygon. It uses pyproj to project the coordinates and then Shapely to find the area of the projected polygon:
def calc_area(lis_lats_lons):
    import numpy as np
    from pyproj import Proj
    from shapely.geometry import shape
    # Each tuple is (lon, lat), despite the parameter name
    lons, lats = zip(*lis_lats_lons)
    # The distinct latitudes become the standard parallels lat_1, lat_2, ...
    ll = list(set(lats))[::-1]
    var = []
    for i in range(len(ll)):
        var.append('lat_' + str(i+1))
    st = ""
    for v, l in zip(var, ll):
        st = st + str(v) + "=" + str(l) + " " + "+"
    st = st + "lat_0=" + str(np.mean(ll)) + " " + "+" + "lon_0" + "=" + str(np.mean(lons))
    # Albers Equal Area projection centered on the polygon
    tx = "+proj=aea +" + st
    pa = Proj(tx)
    x, y = pa(lons, lats)
    # list() is needed in Python 3, where zip returns an iterator
    cop = {"type": "Polygon", "coordinates": [list(zip(x, y))]}
    return shape(cop).area
For a sample set of lons/lats, it gives an area close to the surveyed value:
calc_area(lis_lats_lons = [(-102.05, 41.0),
                           (-102.05, 37.0),
                           (-109.05, 37.0),
                           (-109.05, 41.0)])
This outputs an area of 268952044107.4342 square meters.
