Background
For an algorithm I'm working on, I currently use a 3D sphere as a binary mask: an NxNxN array in which the voxels inside a sphere of radius N//2 are True. Further processing does a computation for each voxel set to True.
This proved computationally intensive for my specific task as N grew large, since the number of voxels grows as O(N^3), so I now want to reduce my binary mask to a subsample of lines radiating from the array center within that radius.
Objective
I want a 3D binary mask of the lines in gray in the image.
To have a bit of control over the number of voxels, I would have a parameter (say l) regulating the number of lines sampled in each 2D circle, and maybe a second one (k ?) for the number of z-rotation.
What I tried
I am using numpy and scipy, and I thought that I could use the scipy.ndimage.interpolation.rotate method to rotate a single line around on a plane, then use that complete 2D mask to rotate around the z-axis.
This proved difficult, as rotate uses some deep magic regarding splines that discards my True values on rotation.
I am thinking that I could compute mathematically which voxel should be set to True by following some line-equations, but I'm at a loss to find them.
Any idea how to get there ?
Update : Solution !
Thanks to jkalden who helped me think this through and gave code samples, I have this :
rmax is the radius of the sphere, n_theta and n_phi the number of polar and azimuthal lines to use.
out_mask = np.zeros((rmax*2,) * 3, dtype=bool)
# for each phi = one circle in azimuthal circles
for phi in np.linspace(0, np.deg2rad(360), n_phi, endpoint=False):
    # for all lines in polar circle of this azimuthal circle
    for theta in np.linspace(0, np.deg2rad(360), n_theta, endpoint=False):
        # for all distances (0-rmax) in these lines
        for r in range(rmax):
            coords = spherical_to_cartesian([r, theta, phi]) + rmax
            out_mask[tuple(coords)] = True
With the spherical_to_cartesian from this code sample.
Which gives me this (with rmax = 50 and n_theta = n_phi = 8) :
(Center area tuned out of my function by choice)
I propose to change the coordinate system to spherical coordinates. Thus, you will choose your 2D circle by an azimuthal angle, and a line is then defined by additionally choosing a polar angle. The variable along the line is then just the radius, and you can use numpy.linspace to discretize it. Doing so might also save time during calculation.
You can switch your coordinate system any time by using the bijective relation which is implemented e.g. here or here.
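A vectorized variant of this spherical-coordinate idea might look like the minimal sketch below. The function name radial_line_mask is hypothetical, and the sketch assumes the physics convention (theta as polar angle, phi as azimuthal angle), one radial sample per voxel step, and rounding to the nearest voxel.

```python
import numpy as np

def radial_line_mask(rmax, n_theta, n_phi):
    """Boolean mask of shape (2*rmax,)*3 with lines radiating from the center."""
    size = 2 * rmax
    mask = np.zeros((size, size, size), dtype=bool)
    r = np.arange(rmax)                                          # radial samples
    theta = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)   # polar angles
    phi = np.linspace(0, 2 * np.pi, n_phi, endpoint=False)       # azimuthal angles
    # Build every (r, theta, phi) combination at once.
    r, theta, phi = np.meshgrid(r, theta, phi, indexing="ij")
    x = r * np.sin(theta) * np.cos(phi)
    y = r * np.sin(theta) * np.sin(phi)
    z = r * np.cos(theta)
    # Round to the nearest voxel and shift the origin to the array center.
    idx = np.rint(np.stack([x, y, z])).astype(int) + rmax
    mask[idx[0], idx[1], idx[2]] = True
    return mask

mask = radial_line_mask(rmax=50, n_theta=8, n_phi=8)
```

Because the largest sampled radius is rmax - 1, every index stays inside the array. Several (theta, phi) pairs map to the same direction, so the number of distinct lines is smaller than n_theta * n_phi.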
I'm trying to draw ellipses around points of a group on a graph, with matplotlib. I would like to obtain something like this:
A dataset for a group (the red one for example) could look like this:
[[-23.88315146 -3.26328266] # first point
[-25.94906669 -1.47440904] # second point
[-26.52423229 -4.84947907]] # third point
I can easily draw the points on a graph, but I encounter problems to draw the ellipses.
Each ellipse has diameters of 2 * standard deviation, and its center has the coordinates (x_mean, y_mean). The width of an ellipse equals the x standard deviation * 2; its height equals the y standard deviation * 2.
However, I don't know how to calculate the angle of the ellipses (you can see on the picture the ellipses are not perfectly vertical).
Do you have an idea about how to do that ?
Note:
This question is a simplification of LDA problem (Linear Discriminant Analysis). I'm trying to simplify the problem to its most basic expression.
This is a well-studied problem. First take the convex hull of the set of points
you wish to enclose. Then perform computations as described in the literature.
I provide two sources below.
"Smallest Enclosing Ellipses--An Exact and Generic Implementation in C++" (abstract link).
Charles F. Van Loan. "Using the Ellipse to Fit and Enclose Data Points."
(PDF download).
This has a lot more to do with mathematics than programming ;)
Since you already have the dimensions and only want to find the angle, here is what I would do (based on my instinct):
Try to find the line that best fits the given set of points (trendline), this is also called Linear Regression. There are several methods to do this but the Least Squares method is a relatively easy one (see below).
Once you found the best fitting line, you could use the slope as your angle.
Least Squares Linear Regression
The least squares linear regression method is used to find the slope of the trendline, exactly what we want.
Here is a video explaining how it works
Let's assume you have a data set: data = [(x1, y1), (x2, y2), ...]
Using the least square method, your slope would be:
# I see in your example that you already have x_mean and y_mean
# No need to calculate them again, skip the two following lines
# and use your values in the rest of the example
avg_x = sum(element[0] for element in data)/len(data)
avg_y = sum(element[1] for element in data)/len(data)
x_diff = [element[0] - avg_x for element in data]
y_diff = [element[1] - avg_y for element in data]
x_diff_squared = [element**2 for element in x_diff]
slope = sum(x * y for x,y in zip(x_diff, y_diff)) / sum(x_diff_squared)
Once you have that, you are almost done. The slope is equal to the tangent of the angle: slope = tan(angle).
Use Python's math module: angle = math.atan(slope) returns the angle in radians. If you want it in degrees, convert it using math.degrees(angle).
Combine this with the dimensions and position you already have and you got yourself an ellipse ;)
This is how I would solve this particular problem, but there are probably a thousand different methods that would have worked too
and may eventually be better (and more complex) than what I propose.
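Put together, the steps above could be sketched as a single helper. The function name regression_angle_degrees is made up for illustration, and returning 90 degrees for a perfectly vertical cloud (where the slope is undefined) is an assumption of this sketch, not part of the method described above.

```python
import math

def regression_angle_degrees(points):
    """Angle of the least-squares trendline through `points`, in degrees."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        # All x values are equal: vertical line (assumed convention: 90 degrees).
        return 90.0
    return math.degrees(math.atan(sxy / sxx))

# The three points of the red group from the question.
red_group = [(-23.88315146, -3.26328266),
             (-25.94906669, -1.47440904),
             (-26.52423229, -4.84947907)]
angle = regression_angle_degrees(red_group)
```

The result is already in degrees, which is the unit matplotlib.patches.Ellipse expects for its angle argument.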
I wrote a simple function to implement Mathieu David's solution. I'm sure there are many ways to do this, but this worked for my application.
# Module-level imports needed by the method below
import numpy as np
from sklearn.linear_model import LinearRegression

def get_ellipse_params(self, points):
    ''' Calculate the parameters needed to graph an ellipse around a cluster of points in 2D.

        Calculate the height, width and angle of an ellipse to enclose the points in a cluster.
        Calculate the width by finding the maximum distance between the x-coordinates of points
        in the cluster, and the height by finding the maximum distance between the y-coordinates
        in the cluster. Multiply both by a scale factor to give padding around the points when
        constructing the ellipse. Calculate the angle by taking the inverse tangent of the
        gradient of the regression line. Note that tangent solutions repeat every 180 degrees,
        and so to ensure the correct solution has been found for plotting, add a correction
        factor of +/- 90 degrees if the magnitude of the angle exceeds 45 degrees.

        Args:
            points (ndarray): The points in a cluster to enclose with an ellipse, containing n
                              ndarray elements representing each point, each with d elements
                              representing the coordinates for the point.

        Returns:
            width (float):  The width of the ellipse.
            height (float): The height of the ellipse.
            angle (float):  The angle of the ellipse in degrees.
    '''
    if points.ndim == 1:
        # A single point: return a small default ellipse
        width, height, angle = 0.1, 0.1, 0
        return width, height, angle
    else:
        SCALE = 2.5
        width = np.amax(points[:,0]) - np.amin(points[:,0])
        height = np.amax(points[:,1]) - np.amin(points[:,1])

        # Calculate angle
        x_reg, y_reg = [[p[0]] for p in points], [[p[1]] for p in points]
        grad = LinearRegression().fit(x_reg, y_reg).coef_[0][0]
        angle = np.degrees(np.arctan(grad))

        # Account for multiple solutions of arctan
        if angle < -45: angle += 90
        elif angle > 45: angle -= 90
        return width*SCALE, height*SCALE, angle
Is there even such a thing as a 3D centroid? Let me be perfectly clear: I've been reading and reading about centroids for the last 2 days both on this site and across the web, so I'm perfectly aware of the existing posts on the topic, including Wikipedia.
That said, let me explain what I'm trying to do. Basically, I want to take a selection of edges and/or vertices, but NOT faces. Then, I want to place an object at the 3D centroid position.
I'll tell you what I don't want:
The vertices average, which would pull too far in any direction that has a more high-detailed mesh.
The bounding box center, because I already have something working for this scenario.
I'm open to suggestions about center of mass, but I don't see how this would work, because vertices or edges alone don't define any sort of mass, especially when I just have an edge loop selected.
For kicks, I'll show you some PyMEL that I worked up, using @Emile's code as reference, but I don't think it's working the way it should:
from pymel.core import ls, spaceLocator
from pymel.core.datatypes import Vector
from pymel.core.nodetypes import NurbsCurve
def get_centroid(node):
    if not isinstance(node, NurbsCurve):
        raise TypeError("Requires NurbsCurve.")
    centroid = Vector(0, 0, 0)
    signed_area = 0.0
    cvs = node.getCVs(space='world')
    v0 = cvs[len(cvs) - 1]
    for i, cv in enumerate(cvs[:-1]):
        v1 = cv
        a = v0.x * v1.y - v1.x * v0.y
        signed_area += a
        centroid += sum([v0, v1]) * a
        v0 = v1
    signed_area *= 0.5
    centroid /= 6 * signed_area
    return centroid
texas = ls(selection=True)[0]
centroid = get_centroid(texas)
print(centroid)
spaceLocator(position=centroid)
In theory centroid = SUM(pos*volume)/SUM(volume) when you split the part into finite volumes each with a location pos and volume value volume.
This is precisely the calculation done for finding the center of gravity of a composite part.
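As a minimal sketch of that weighted sum, assuming the shape has already been split into parts with known positions and volumes (the function name composite_centroid and the sample data are made up for illustration):

```python
def composite_centroid(parts):
    """Volume-weighted centroid of parts given as (position, volume) pairs."""
    total_volume = sum(volume for _, volume in parts)
    dims = len(parts[0][0])
    return tuple(
        sum(pos[k] * volume for pos, volume in parts) / total_volume
        for k in range(dims)
    )

# A small part at the origin and a part three times larger at x = 4.
parts = [((0.0, 0.0, 0.0), 1.0), ((4.0, 0.0, 0.0), 3.0)]
print(composite_centroid(parts))  # (3.0, 0.0, 0.0)
```

The larger part pulls the centroid three quarters of the way toward it, which is what distinguishes this from a plain average of positions.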
There is not just a 3D centroid, there is an n-dimensional centroid, and the formula for it is given in the "By integral formula" section of the Wikipedia article you cite.
Perhaps you are having trouble setting up this integral? You have not defined your shape.
[Edit] I'll beef up this answer in response to your comment. Since you have described your shape in terms of edges and vertices, I'll assume it is a polyhedron. You can partition a polyhedron into pyramids, find the centroids of the pyramids, and then the centroid of your shape is the centroid of the centroids (this last calculation is done using ja72's formula).
I'll assume your shape is convex (no hollow parts---if this is not the case then break it into convex chunks). You can partition it into pyramids (triangulate it) by picking a point in the interior and drawing edges to the vertices. Then each face of your shape is the base of a pyramid. There are formulas for the centroid of a pyramid (you can look this up, it's 1/4 the way from the centroid of the face to your interior point). Then as was said, the centroid of your shape is the centroid of the centroids---ja72's finite calculation, not an integral---as given in the other answer.
This is the same algorithm as in Hugh Bothwell's answer, however I believe that 1/4 is correct instead of 1/3. Perhaps you can find some code for it lurking around somewhere using the search terms in this description.
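The pyramid decomposition could be sketched as follows. This is an illustration under stated assumptions, not code from any of the answers: the shape is convex and closed, its faces are already split into triangles (so each pyramid is a tetrahedron), and the vertex average is used as the interior point. The name polyhedron_centroid is hypothetical.

```python
def polyhedron_centroid(vertices, triangles):
    """Centroid of a closed convex polyhedron with triangular faces.

    vertices:  list of (x, y, z) tuples
    triangles: list of (i, j, k) index triples into `vertices`
    """
    n = len(vertices)
    # Any interior point works for a convex shape; the vertex average is one.
    p = tuple(sum(v[m] for v in vertices) / n for m in range(3))

    total_volume = 0.0
    weighted = [0.0, 0.0, 0.0]
    for i, j, k in triangles:
        a, b, c = vertices[i], vertices[j], vertices[k]
        # Vectors from the interior point to the three face corners.
        u = [a[m] - p[m] for m in range(3)]
        v = [b[m] - p[m] for m in range(3)]
        w = [c[m] - p[m] for m in range(3)]
        # Tetrahedron volume = |u . (v x w)| / 6.
        triple = (u[0] * (v[1] * w[2] - v[2] * w[1])
                  - u[1] * (v[0] * w[2] - v[2] * w[0])
                  + u[2] * (v[0] * w[1] - v[1] * w[0]))
        volume = abs(triple) / 6.0
        # The centroid of a tetrahedron is the mean of its four corners, i.e.
        # 1/4 of the way from the face centroid to the interior point.
        for m in range(3):
            weighted[m] += volume * (a[m] + b[m] + c[m] + p[m]) / 4.0
        total_volume += volume
    return tuple(s / total_volume for s in weighted)

# Square pyramid: 2 x 2 base in the z = 0 plane, apex at height 4.
verts = [(0, 0, 0), (2, 0, 0), (2, 2, 0), (0, 2, 0), (1, 1, 4)]
tris = [(0, 1, 2), (0, 2, 3), (0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
print(polyhedron_centroid(verts, tris))  # approximately (1.0, 1.0, 1.0)
```

The sample pyramid has its centroid one quarter of its height above the base, so the output also serves as a check of the 1/4 rule. Taking the absolute value of each volume is only valid because the reference point lies inside a convex shape.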
I like the question. Centre of mass sounds right, but the question then becomes, what mass for each vertex?
Why not use the average length of each edge that includes the vertex? This should compensate nicely areas with a dense mesh.
You will have to recreate face information from the vertices (essentially a Delaunay triangulation).
If your vertices define a convex hull, you can pick any arbitrary point A inside the object. Treat your object as a collection of pyramidal prisms having apex A and each face as a base.
For each face, find the area Fa and the 2d centroid Fc; then the prism's mass is proportional to the volume (== 1/3 base * height (component of Fc-A perpendicular to the face)) and you can disregard the constant of proportionality so long as you do the same for all prisms; the center of mass is (2/3 A + 1/3 Fc), or a third of the way from the apex to the 2d centroid of the base.
You can then do a mass-weighted average of the center-of-mass points to find the 3d centroid of the object as a whole.
The same process should work for non-convex hulls - or even for A outside the hull - but the face-calculation may be a problem; you will need to be careful about the handedness of your faces.
The title basically says it all. I need to calculate the area inside a polygon on the Earth's surface using Python. Calculating area enclosed by arbitrary polygon on Earth's surface says something about it, but remains vague on the technical details:
If you want to do this with a more
"GIS" flavor, then you need to select
an unit-of-measure for your area and
find an appropriate projection that
preserves area (not all do). Since you
are talking about calculating an
arbitrary polygon, I would use
something like a Lambert Azimuthal
Equal Area projection. Set the
origin/center of the projection to be
the center of your polygon, project
the polygon to the new coordinate
system, then calculate the area using
standard planar techniques.
So, how do I do this in Python?
Let's say you have a representation of the state of Colorado in GeoJSON format
{"type": "Polygon",
"coordinates": [[
[-102.05, 41.0],
[-102.05, 37.0],
[-109.05, 37.0],
[-109.05, 41.0]
]]}
All coordinates are longitude, latitude. You can use pyproj to project the coordinates and Shapely to find the area of any projected polygon:
co = {"type": "Polygon", "coordinates": [
[(-102.05, 41.0),
(-102.05, 37.0),
(-109.05, 37.0),
(-109.05, 41.0)]]}
lon, lat = zip(*co['coordinates'][0])
from pyproj import Proj
pa = Proj("+proj=aea +lat_1=37.0 +lat_2=41.0 +lat_0=39.0 +lon_0=-106.55")
That's an equal area projection centered on and bracketing the area of interest. Now make new projected GeoJSON representation, turn into a Shapely geometric object, and take the area:
x, y = pa(lon, lat)
cop = {"type": "Polygon", "coordinates": [list(zip(x, y))]}  # list() is needed on Python 3, where zip returns an iterator
from shapely.geometry import shape
shape(cop).area # 268952044107.43506
It's a very close approximation to the surveyed area. For more complex features, you'll need to sample along the edges, between the vertices, to get accurate values. All caveats above about datelines, etc, apply. If you're only interested in area, you can translate your feature away from the dateline before projecting.
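Sampling along the edges could be done before projecting, for instance with a helper like the one sketched here. The name densify is hypothetical, and the sketch interpolates linearly in longitude/latitude, which is an assumption that suits boundaries following parallels and meridians (as Colorado's do) but not edges meant as great-circle arcs.

```python
def densify(ring, n):
    """Split every edge of a closed ring into n segments.

    ring: list of (lon, lat) vertices, without repeating the first one
    n:    number of segments per edge (n >= 1)
    """
    dense = []
    for i, (lon0, lat0) in enumerate(ring):
        lon1, lat1 = ring[(i + 1) % len(ring)]  # wrap around to close the ring
        for step in range(n):
            t = step / n
            # Linear interpolation in lon/lat space (assumption, see above).
            dense.append((lon0 + t * (lon1 - lon0), lat0 + t * (lat1 - lat0)))
    return dense

colorado = [(-102.05, 41.0), (-102.05, 37.0), (-109.05, 37.0), (-109.05, 41.0)]
dense = densify(colorado, 10)  # 40 vertices instead of 4
```

The densified ring can then be unzipped into lon and lat sequences and projected exactly like the original four corners.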
The easiest way to do this (in my opinion), is to project things into (a very simple) equal-area projection and use one of the usual planar techniques for calculating area.
First off, I'm going to assume that a spherical earth is close enough for your purposes, if you're asking this question. If not, then you need to reproject your data using an appropriate ellipsoid, in which case you're going to want to use an actual projection library (everything uses proj4 behind the scenes, these days) such as the python bindings to GDAL/OGR or (the much more friendly) pyproj.
However, if you're okay with a spherical earth, it is quite simple to do this without any specialized libraries.
The simplest equal-area projection to calculate is a sinusoidal projection. Basically, you just multiply the latitude by the length of one degree of latitude, and the longitude by the length of a degree of latitude and the cosine of the latitude.
def reproject(latitude, longitude):
    """Returns the x & y coordinates in meters using a sinusoidal projection"""
    from math import pi, cos, radians
    earth_radius = 6371009 # in meters
    lat_dist = pi * earth_radius / 180.0

    y = [lat * lat_dist for lat in latitude]
    x = [long * lat_dist * cos(radians(lat))
                for lat, long in zip(latitude, longitude)]
    return x, y
Okay... Now all we have to do is to calculate the area of an arbitrary polygon in a plane.
There are a number of ways to do this. I'm going to use what is probably the most common one here.
def area_of_polygon(x, y):
    """Calculates the area of an arbitrary polygon given its vertices"""
    area = 0.0
    for i in range(-1, len(x)-1):
        area += x[i] * (y[i+1] - y[i-1])
    return abs(area) / 2.0
Hopefully that will point you in the right direction, anyway...
A bit late perhaps, but here is a different method, using Girard's theorem. It states that the area of a polygon bounded by great circles is R**2 times its angular excess, i.e. the sum of the interior angles of the polygon minus (N-2)*pi, where N is the number of corners.
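Written out, with \theta_i denoting the interior angle at corner i (a symbol introduced here for illustration), the theorem reads:

```latex
A = R^{2}\left(\sum_{i=1}^{N}\theta_i - (N-2)\,\pi\right)
```

As a quick check, the spherical triangle covering one octant of the sphere has three right angles, so A = R^2 (3\pi/2 - \pi) = \pi R^2 / 2, which is one eighth of the total surface 4\pi R^2.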
I thought this would be worth posting, since it doesn't rely on any other libraries than numpy, and it is a quite different method than the others. Of course, this only works on a sphere, so there will be some inaccuracy when applying it to the Earth.
First, I define a function to compute the bearing angle from point 1 along a great circle to point 2:
import numpy as np
from numpy import cos, sin, arctan2
d2r = np.pi/180
def greatCircleBearing(lon1, lat1, lon2, lat2):
    dLong = lon1 - lon2

    s = cos(d2r*lat2)*sin(d2r*dLong)
    c = cos(d2r*lat1)*sin(d2r*lat2) - sin(lat1*d2r)*cos(d2r*lat2)*cos(d2r*dLong)

    return np.arctan2(s, c)
Now I can use this to find the angles, and then the area (In the following, lons and lats should of course be specified, and they should be in the right order. Also, the radius of the sphere should be specified.)
N = len(lons)

angles = np.empty(N)
for i in range(N):

    phiB1, phiA, phiB2 = np.roll(lats, i)[:3]
    LB1, LA, LB2 = np.roll(lons, i)[:3]

    # calculate angle with north (eastward)
    beta1 = greatCircleBearing(LA, phiA, LB1, phiB1)
    beta2 = greatCircleBearing(LA, phiA, LB2, phiB2)

    # calculate angle between the two edges meeting at this corner and add to angle array
    angles[i] = np.arccos(cos(-beta1)*cos(-beta2) + sin(-beta1)*sin(-beta2))

area = (sum(angles) - (N-2)*np.pi)*R**2
With the Colorado coordinates given in another reply, and with Earth radius 6371 km, I get that the area is 268930758560.74808
Or simply use a library: https://github.com/scisco/area
from area import area
>>> obj = {'type':'Polygon','coordinates':[[[-180,-90],[-180,90],[180,90],[180,-90],[-180,-90]]]}
>>> area(obj)
511207893395811.06
...returns the area in square meters.
You can compute the area directly on the sphere, instead of using an equal-area projection.
Moreover, according to this discussion, it seems that Girard's theorem (sulkeh's answer) does not give accurate results in certain cases, for example "the area enclosed by a 30º lune from pole to pole and bounded by the prime meridian and 30ºE" (see here).
A more precise solution would be to perform a line integral directly on the sphere. The comparison below shows this method is more precise.
Like all other answers, I should mention the caveat that we assume a spherical earth, but I assume that for non-critical purposes this is enough.
Python implementation
Here is a Python 3 implementation which uses line integral and Green's theorem:
def polygon_area(lats, lons, radius = 6378137):
    """
    Computes area of spherical polygon, assuming spherical Earth.
    Returns result in the (squared) units of the provided radius if the radius is specified.
    Otherwise, as a ratio of the sphere's total area.
    lats and lons are in degrees.
    """
    from numpy import arctan2, cos, sin, sqrt, pi, power, append, diff, deg2rad
    lats = deg2rad(lats)
    lons = deg2rad(lons)

    # Line integral based on Green's Theorem, assumes spherical Earth

    # close polygon (compare both coordinates of the first and last vertex)
    if lats[0] != lats[-1] or lons[0] != lons[-1]:
        lats = append(lats, lats[0])
        lons = append(lons, lons[0])

    # colatitudes relative to (0,0)
    a = sin(lats/2)**2 + cos(lats)* sin(lons/2)**2
    colat = 2*arctan2( sqrt(a), sqrt(1-a) )

    # azimuths relative to (0,0)
    az = arctan2(cos(lats) * sin(lons), sin(lats)) % (2*pi)

    # Calculate diffs, wrapped into [-pi, pi)
    # daz = diff(az) % (2*pi)
    daz = diff(az)
    daz = (daz + pi) % (2 * pi) - pi

    deltas = diff(colat)/2
    colat = colat[0:-1] + deltas

    # Perform integral
    integrands = (1-cos(colat)) * daz

    # Integrate
    area = abs(sum(integrands))/(4*pi)

    # Keep the smaller of the two regions bounded by the polygon
    area = min(area, 1-area)
    if radius is not None: # return in units of radius
        return area * 4*pi*radius**2
    else: # return in ratio of sphere total area
        return area
I wrote a somewhat more explicit version (and with many more references and TODOs...) in the sphericalgeometry package there.
Numerical Comparison
Colorado will be the reference, since all previous answers were evaluated on its area. Its precise total area is 104,093.67 square miles (from the US Census Bureau, p. 89, see also here), or 269601367661 square meters. I found no source for the actual methodology of the USCB, but I assume it is based on summing actual measurements on ground, or precise computations using WGS84/EGM2008.
Method | Author | Result | Variation from ground truth
--------------------------------------------------------------------------------
Albers Equal Area | sgillies | 268952044107 | -0.24%
Sinusoidal | J. Kington | 268885360163 | -0.26%
Girard's theorem | sulkeh | 268930758560 | -0.25%
Equal Area Cylindrical | Jason | 268993609651 | -0.22%
Line integral | Yellows | 269397764066 | **-0.07%**
Conclusion: using direct integral is more precise.
Performance
I have not benchmarked the different methods, and comparing pure Python code with compiled PROJ projections would not be meaningful. Intuitively, fewer computations are needed. On the other hand, trigonometric functions may be computationally intensive.
Here is a solution that uses basemap, instead of pyproj and shapely, for the coordinate conversion. The idea is the same as the one suggested by @sgillies, though. Note that I've added a 5th point so that the path is a closed loop.
import numpy
from mpl_toolkits.basemap import Basemap
coordinates=numpy.array([
[-102.05, 41.0],
[-102.05, 37.0],
[-109.05, 37.0],
[-109.05, 41.0],
[-102.05, 41.0]])
lats=coordinates[:,1]
lons=coordinates[:,0]
lat1=numpy.min(lats)
lat2=numpy.max(lats)
lon1=numpy.min(lons)
lon2=numpy.max(lons)
bmap=Basemap(projection='cea',llcrnrlat=lat1,llcrnrlon=lon1,urcrnrlat=lat2,urcrnrlon=lon2)
xs,ys=bmap(lons,lats)
area=numpy.abs(0.5*numpy.sum(ys[:-1]*numpy.diff(xs)-xs[:-1]*numpy.diff(ys)))
area=area/1e6
print(area)
The result is 268993.609651 in km^2.
UPDATE: Basemap has been deprecated, so you may want to consider alternative solutions first.
Because the earth is a closed surface a closed polygon drawn on its surface creates TWO polygonal areas. You also need to define which one is inside and which is outside!
Most times people will be dealing with small polygons, and so it's 'obvious' but once you have things the size of oceans or continents, you better make sure you get this the right way round.
Also, remember that lines can go from (-179,0) to (+179,0) in two different ways. One is very much longer than the other. Again, mostly you'll make the assumption that this is a line that goes from (-179,0) to (-180,0) which is (+180,0) and then to (+179,0), but one day... it won't.
Treating lat-long like a simple (x,y) coordinate system, or even neglecting the fact that any coordinate projection is going to have distortions and breaks, can make you fail big-time on spheres.
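One way to make the dateline assumption explicit is to unwrap the longitudes before doing any planar computation. The sketch below (the helper name unwrap_longitudes is made up) assumes every edge takes the shorter way around, which is precisely the assumption described above and will be wrong for edges that really span more than 180 degrees.

```python
def unwrap_longitudes(lons):
    """Shift longitudes by multiples of 360 so consecutive vertices are
    never more than 180 degrees apart."""
    unwrapped = [lons[0]]
    for lon in lons[1:]:
        prev = unwrapped[-1]
        # Bring the step into [-180, 180): always take the shorter way around.
        step = (lon - prev + 180.0) % 360.0 - 180.0
        unwrapped.append(prev + step)
    return unwrapped

print(unwrap_longitudes([179.0, -179.0, -178.0, 178.0]))
# [179.0, 181.0, 182.0, 178.0]
```

After unwrapping, a polygon straddling the dateline has continuous longitudes (here 178 to 182) instead of jumping between +179 and -179.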
I know that answering 10 years later has some advantages, but to somebody that looks today at this question it seems fair to provide an updated answer.
pyproj directly calculates areas, without need of calling shapely:
# Modules:
from pyproj import Geod
import numpy as np
# Define WGS84 as CRS:
geod = Geod('+a=6378137 +f=0.0033528106647475126')
# Data for Colorado (no need to close the polygon):
coordinates = np.array([
[-102.05, 41.0],
[-102.05, 37.0],
[-109.05, 37.0],
[-109.05, 41.0]])
lats = coordinates[:,1]
lons = coordinates[:,0]
# Compute:
area, perim = geod.polygon_area_perimeter(lons, lats)
print(abs(area)) # Positive is counterclockwise, the data is clockwise.
The result is: 269154.54988400977 km2, or -0.17% of the reported correct value (269601.367661 km2).
According to Yellows' assertion, direct integral is more precise.
But Yellows uses an Earth radius of 6378137 m, which is the semi-major axis of the WGS-84 ellipsoid, while Sulkeh uses 6371000 m.
Using a radius of 6378137 m in Sulkeh's method gives 269533625893 square meters.
Assuming that the true value of Colorado's area (from the US Census Bureau) is 269601367661 square meters, the variation from the ground truth of Sulkeh's method is -0.025%, better than the -0.07% of the line integral method.
So Sulkeh's proposal seems to be the most precise so far.
In order to be able to make a numerical comparison of the solutions, with the assumption of a spherical Earth, all calculations must use the same terrestrial radius.
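Since every spherical-Earth area is proportional to the square of the radius, a result obtained with one radius can be converted to another without rerunning the method. A small sketch (the helper name rescale_area is hypothetical):

```python
def rescale_area(area, old_radius, new_radius):
    """Convert a spherical-Earth area computed with one radius to another."""
    return area * (new_radius / old_radius) ** 2

girard_6371km = 268930758560.74808  # Girard's theorem result quoted above
girard_wgs84 = rescale_area(girard_6371km, 6371000.0, 6378137.0)
print(girard_wgs84)
```

This reproduces, to within rounding, the 269533625893 square meters obtained by rerunning Sulkeh's method with the larger radius.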
Here is a Python 3 implementation where the function takes a list of (lon, lat) tuples and returns the area enclosed by the projected polygon. It uses pyproj to project the coordinates and then Shapely to find the area of the projected polygon:
def calc_area(lis_lats_lons):

    import numpy as np
    from pyproj import Proj
    from shapely.geometry import shape

    lons, lats = zip(*lis_lats_lons)
    ll = list(set(lats))[::-1]
    var = []
    for i in range(len(ll)):
        var.append('lat_' + str(i+1))
    # Build the "+lat_1=... +lat_2=..." part of the projection string
    st = ""
    for v, l in zip(var,ll):
        st = st + str(v) + "=" + str(l) +" "+ "+"
    st = st +"lat_0="+ str(np.mean(ll)) + " "+ "+" + "lon_0" +"=" + str(np.mean(lons))
    tx = "+proj=aea +" + st
    pa = Proj(tx)

    x, y = pa(lons, lats)
    # list() is needed on Python 3, where zip returns an iterator
    cop = {"type": "Polygon", "coordinates": [list(zip(x, y))]}

    return shape(cop).area
For a sample set of lats/longs, it gives an area value close to the surveyed value:
calc_area(lis_lats_lons = [(-102.05, 41.0),
                           (-102.05, 37.0),
                           (-109.05, 37.0),
                           (-109.05, 41.0)])
This outputs an area of 268952044107.4342 square meters.