I have implemented GeoDjango using postgis.
Here is my model:
...
geometria = models.PolygonField(srid=4326, null=True)
...
When I call data.area it returns a float, but I have no clue about its measurement units, which is a problem because I want to test whether it is larger than a pre-set area in square meters.
Can you help me?
If you are dealing with large areas on the map, you should set
geometria = models.PolygonField(srid=4326, null=True, geography=True)
As mentioned in GeoDjango's documentation: https://docs.djangoproject.com/en/dev/ref/contrib/gis/model-api/#geography
Geography Type
In PostGIS 1.5, the geography type was introduced -- it
provides native support for spatial features represented with
geographic coordinates (e.g., WGS84 longitude/latitude). [7] Unlike
the plane used by a geometry type, the geography type uses a spherical
representation of its data. Distance and measurement operations
performed on a geography column automatically employ great circle arc
calculations and return linear units. In other words, when ST_Distance
is called on two geographies, a value in meters is returned (as
opposed to degrees if called on a geometry column in WGS84).
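With geography=True, a hedged sketch of reading the area in square meters via GeoDjango's Area function (MyModel is a stand-in for your model; this assumes PostGIS and that Area on a geography column yields square meters):
from django.contrib.gis.db.models.functions import Area

# MyModel is a hypothetical stand-in for the model holding the geometria field
obj = MyModel.objects.annotate(area_m=Area('geometria')).first()
if obj.area_m.sq_m > 10_000:  # Area measures expose sq_m; geography columns report square meters
    print("bigger than the pre-set limit")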
If you do not have geography=True, the field is stored as a plain geometry, so you will need to convert from square degrees (the floating-point result you are getting) into the unit of measure you prefer, because a meaningful area cannot be computed directly from geographic coordinates. Instead, you can add a helper method that transforms the geometry into a projected coordinate system before measuring:
def get_acres(self):
    """
    Returns the area in acres.
    """
    # self.polygon is the PolygonField on the model (geometria in the question).
    # Convert our geographic polygon (in WGS84) into a local projection for
    # New York (here EPSG:32118); transform() modifies the geometry in place.
    self.polygon.transform(32118)
    meters_sq = self.polygon.area  # square meters once the geometry is in a metric CRS
    acres = meters_sq * 0.000247105381  # square meters to acres
    return acres
Which projection to use depends on the extent of the data and on how accurate the results need to be: here I've illustrated with a projection specific to part of New York, but if your data isn't particularly accurate, you could easily substitute a global projection or just use a simple formula.
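To answer the original question directly, here is a minimal sketch (my own illustration: the threshold, the helper name, and the choice of EPSG:6933 as a global equal-area projection are assumptions, not from the answer above) that compares the transformed area against a limit in square meters:
MAX_AREA_M2 = 10_000  # hypothetical pre-set limit in square meters

def is_larger_than_limit(obj):
    geom = obj.geometria.clone()    # work on a copy so the stored WGS84 geometry is untouched
    geom.transform(6933)            # World Cylindrical Equal Area; pick a local CRS for better accuracy
    return geom.area > MAX_AREA_M2  # area is in square meters after the transform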
Here are the docs:
osmnx.distance.nearest_edges(G, X, Y, interpolate=None, return_dist=False)
Find the nearest edge to a point or to each of several points.
If X and Y are single coordinate values, this will return the nearest edge to that point. If X and Y are lists of coordinate values, this will return the nearest edge to each point.
If interpolate is None, search for the nearest edge to each point, one at a time, using an r-tree and minimizing the euclidean distances from the point to the possible matches. For accuracy, use a projected graph and points. This method is precise and also fastest if searching for few points relative to the graph’s size.
For a faster method if searching for many points relative to the graph’s size, use the interpolate argument to interpolate points along the edges and index them. If the graph is projected, this uses a k-d tree for euclidean nearest neighbor search, which requires that scipy is installed as an optional dependency. If graph is unprojected, this uses a ball tree for haversine nearest neighbor search, which requires that scikit-learn is installed as an optional dependency.
Parameters:
G (networkx.MultiDiGraph) – graph in which to find nearest edges
X (float or list) – points’ x (longitude) coordinates, in same CRS/units as graph and containing no nulls
Y (float or list) – points’ y (latitude) coordinates, in same CRS/units as graph and containing no nulls
interpolate (float) – spacing distance between interpolated points, in same units as graph. smaller values generate more points.
return_dist (bool) – optionally also return distance between points and nearest edges
Returns:
ne or (ne, dist) – nearest edges as (u, v, key) or optionally a tuple where dist contains distances between the points and their nearest edges
Return type:
tuple or list
Here is the question:
But what is CRS? Why can't I use normal longitude and latitude here? My points look something like 6467474 (dtype: float64). I am new to GIS.
u v key
What are CRS/units in OSMnx (Python)?
Are you asking what these terms mean? Or what their default values are? If it is the former, you can refer to any introductory GIS textbook. If it is the latter, as the OSMnx documentation states, the default CRS is EPSG:4326. Regarding distance units, it depends on what you did in your code. You did not provide a complete, minimal, reproducible example. If you projected your graph, then distances are measured in whatever units your projection is in. If you did not, then distances are measured in meters by default.
Why can't I use normal longitude and latitude here?
As the documentation states, you can pass in latitude and longitude to find the nearest edge(s) to point(s). I would strongly encourage you to work through the OSMnx usage examples to learn how the package works and to practice with some demonstration code (including finding the nearest edges to lat-long points).
My points look something like 6467474 (dtype: float64). I am new to GIS. u v key
I don't know what you mean. Again, you need to provide a complete, minimal, reproducible code snippet so we can diagnose and troubleshoot. If you do, I can edit this answer if your code snippet provides enough info for me to give further information.
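For reference, a minimal sketch of calling nearest_edges on an unprojected graph, following the documentation quoted above (the place name and coordinates are made up for illustration, and a recent OSMnx 1.x release is assumed):
import osmnx as ox

# Build a small drive network; its CRS is EPSG:4326 (lat-long) unless you project it
G = ox.graph_from_place("Piedmont, California, USA", network_type="drive")

lng, lat = -122.2314, 37.8245                    # X is longitude, Y is latitude
u, v, key = ox.distance.nearest_edges(G, X=lng, Y=lat)
print(u, v, key)                                 # the (u, v, key) tuple identifying the nearest edge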
I am trying to calculate the distance between the closest points in two geodataframes.
I used the function created by jHUW here. The function is as follows:
import numpy as np
import pandas as pd
from scipy.spatial import cKDTree

def ckdnearest(gdA, gdB):
    nA = np.array(list(gdA.geometry.apply(lambda x: (x.x, x.y))))
    nB = np.array(list(gdB.geometry.apply(lambda x: (x.x, x.y))))
    btree = cKDTree(nB)
    dist, idx = btree.query(nA, k=1)
    gdB_nearest = gdB.iloc[idx].drop(columns="geometry").reset_index(drop=True)
    gdf = pd.concat(
        [
            gdA.reset_index(drop=True),
            gdB_nearest,
            pd.Series(dist, name='dist')
        ],
        axis=1)
    return gdf
It's working fine between my datasets, but I was wondering what unit the returned distance is in. I did some research and found that the unit will be the same as the unit of the array used. I used an array of lat-lons, like so:
array([[-122.3295182, 47.6202074],
[-122.296276 , 37.8789939],
[-122.6857603, 45.5289172],
[-118.3804073, 33.9017057],
[ -93.2911788, 44.860997 ]])
I tried to find out what the units of lat-lons would be, but was unsuccessful. I also checked the distance between some of the point pairs on Google Maps to get some insight, but couldn't make sense of the numbers. For instance, Google Maps shows a distance of 1.5 miles for my first pair, but the distance returned by the function is 0.0087466. I understand that cKDTree calculates the Euclidean distance, but even then the difference seems quite large. Please provide some insight if you have any.
The result from SciPy is indeed an L2 norm (i.e. a Euclidean distance). What that distance means depends on the chosen coordinate system. In your case you appear to be using a geographic coordinate system (a spherical coordinate system), so the coordinates are angles and cannot be linearly converted to meters (for example, near the poles a change in longitude corresponds to a much smaller distance in meters than at the equator).
Additionally, one needs to consider the curvature of the space when computing the distance: what looks like a straight line to us on Earth is a geodesic in geographic coordinates, and the L2 norm computed by SciPy does not account for this. In fact, using this metric will probably give wrong results: the L2 norm over-estimates the actual (geodesic) distance far more near the poles than near the equator, so two points near the North Pole can appear as far apart as one point in Japan and one in Europe.
Thus, you certainly need a better metric. As for the unit of the returned distance, it does not make much sense, mainly because of this issue. A reasonably good metric would be the length of the geodesic (possibly in meters) or the angle between the two points. Unfortunately, AFAIK this is not possible with SciPy's cKDTree; using a GIS library (like GDAL) may help.
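One workaround (my own suggestion, not part of the answer above) is to swap the cKDTree Euclidean query for a scikit-learn BallTree with the haversine metric, which returns great-circle distances; this assumes scikit-learn is installed and treats the Earth as a sphere of mean radius:
import numpy as np
from sklearn.neighbors import BallTree

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def nearest_in_meters(gdA, gdB):
    # The haversine metric expects (lat, lon) pairs in radians
    A = np.radians([(p.y, p.x) for p in gdA.geometry])
    B = np.radians([(p.y, p.x) for p in gdB.geometry])
    tree = BallTree(B, metric="haversine")
    dist, idx = tree.query(A, k=1)                 # distances in radians on the unit sphere
    return dist[:, 0] * EARTH_RADIUS_M, idx[:, 0]  # great-circle distance in meters, index into gdB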
So, I am doing some work with data from an INS unit, in order to calculate the errors in its readings by integrating its velocity data over time to get a change in position, and then comparing that to its actual recorded change in position. The problem is that it gives its position with Latitude and Longitude in degrees (to 11 decimal places), and its documentation indicates that these are using the WGS84 standard, while its velocities are given in meters/second (to 10 decimal places).
I found this other question, but its answers assumed that the Earth is a sphere, while the WGS84 standard uses an ellipsoid, and it seems possible that calculations assuming a spherical Earth might introduce errors into my results.
I'm intending to use Python to perform my data analysis with, so ideally answers should use Python as well, but using another language to do the data cleaning would work as long as I can save the cleaned data into a text file that Python can read.
Perhaps you could use LatLon (or, for Python 3, LatLon23), which does allow treating the Earth as an ellipsoid.
See this example code using LatLon23 for Python 3:
from LatLon23 import LatLon, Latitude, Longitude
palmyra = LatLon(Latitude(5.8833), Longitude(-162.0833)) # Location of Palmyra Atoll
honolulu = LatLon(Latitude(21.3), Longitude(-157.8167)) # Location of Honolulu, HI
distance = palmyra.distance(honolulu) # WGS84 distance in km
print(distance)
print(palmyra.distance(honolulu, ellipse = 'sphere')) # FAI distance in km
initial_heading = palmyra.heading_initial(honolulu) # Heading from Palmyra to Honolulu on WGS84 ellipsoid
print(initial_heading)
hnl = palmyra.offset(initial_heading, distance) # Reconstruct Honolulu based on offset from Palmyra
print(hnl.to_string('D')) # Coordinates of Honolulu
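As an alternative (not part of the original answer), pyproj's Geod solves the same inverse problem on the WGS84 ellipsoid; the coordinates below are the same Palmyra/Honolulu pair used above:
from pyproj import Geod

geod = Geod(ellps="WGS84")
# inv() takes lon1, lat1, lon2, lat2 and returns forward azimuth, back azimuth, distance in meters
az12, az21, dist_m = geod.inv(-162.0833, 5.8833, -157.8167, 21.3)
print(dist_m / 1000)   # ellipsoidal distance in km
print(az12)            # initial heading from Palmyra to Honolulu in degrees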
How can one design a simple piece of code to automatically and geometrically quantify a 2D rough surface from given scatter points? For example, use a single number r: r=0 for a smooth surface, r=1 for a very rough surface, and something in between when 0 < r < 1.
To illustrate the question more explicitly, the attached figure shows several sketches of 2D rough surfaces. The dots are the scatter points with given coordinates. Every two adjacent dots can be connected, and a normal vector of each segment can be computed (marked with an arrow). I would like to design a function like
def roughness(x, y):
    ...
    return r
where x and y are sequences of coordinates of the scatter points. For example, in case (a), x=[0,1,2,3,4,5,6], y=[0,1,0,1,0,1,0]; in case (b), x=[0,1,2,3,4,5], y=[0,0,0,0,0,0]. When we call roughness(x, y), we should get r=1 (very rough) for case (a) and r=0 (smooth) for case (b), and maybe r=0.5 (medium) for case (d). The question, then, comes down to: what should go inside the function roughness?
Some initial thoughts:
Roughness of a surface is a local concept, considered only within a specific neighborhood, i.e. using only a few points around the location of interest. Use the mean of the local normal vectors? This can fail: (a) and (b) have the same mean, (0, 1), yet (a) is rough and (b) is smooth. Use the variance of the local normal vectors? This can also fail: (c) and (d) have the same variance, but (c) is rougher than (d).
maybe something like this:
import numpy as np
def roughness(x, y):
    # angle of each successive segment
    t = np.arctan2(np.diff(y), np.diff(x))
    # sine of the turning angle between successive segments:
    # dt[i] = sin(t[i+1] - t[i])
    ts = np.sin(t)
    tc = np.cos(t)
    dt = ts[1:] * tc[:-1] - tc[1:] * ts[:-1]
    # mean of the squared turning-angle sines
    return np.sum(dt**2) / len(dt)
This would give you something like what you're asking for.
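As a quick sanity check with the cases from the question (my own computation, not from the original answer), the zig-zag profile (a) returns 1.0 and the flat profile (b) returns 0.0:
print(roughness([0, 1, 2, 3, 4, 5, 6], [0, 1, 0, 1, 0, 1, 0]))  # case (a): 1.0
print(roughness([0, 1, 2, 3, 4, 5], [0, 0, 0, 0, 0, 0]))        # case (b): 0.0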
Maybe you should consider a protocol definition:
1) geometric definition of the surface first
2) grant unto that geometric surface intrinsic properties.
2.a) A step function can be based on a quadratic curve between two peaks or two troughs, with their concatenated point as the focus of the 'roughness quadratic', using the slope to define roughness in analogy to the science behind road speed bumps.
2.b) Elliptical objects can be defined by combining deformation analysis with circles centered on the incongruity within the body. This can be solved in many ways analogous to step functions.
2.c) Flat lines: select points that deviate from the mean and apply a Newtonian approach around them with a window of 5-20 concatenated points, or whatever is clever.
3) Define a proper threshold that fits whatever intuition you are calling "roughness", or apply the conventions of any professional field to your liking.
This branched approach might be quicker to program, but I am certain this solution can be refactored into a Euclidean construct of 3-point ellipticals, if someone is up for a geometry problem.
The mathematical definitions of many surface parameters can be found here, which can be easily put into numpy:
https://www.keyence.com/ss/products/microscope/roughness/surface/parameters.jsp
Image (d) shows a challenge: basically you want to flatten the shape before doing the calculation. This requires prior knowledge of the type of geometry you want to fit. I found an app, Gwyddion, that can do this in 3D, but it can only interface with Python 2.7, not 3.
If you know which base shape lies underneath:
1) fit the known shape,
2) calculate the arc distance between each two points,
3) remap the numbers by subtracting 1) from the original data and assigning new coordinates according to 2),
4) perform normal 2D/3D roughness calculations.
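A rough sketch of the "flatten first" idea for the 2D case (my own simplification: it assumes a low-order polynomial as the base shape, skips the arc-length remapping, and reuses the roughness() helper from the earlier answer purely for illustration):
import numpy as np

def flattened_roughness(x, y, base_degree=2):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    base = np.polyval(np.polyfit(x, y, base_degree), x)  # fitted base shape
    return roughness(x, y - base)                        # roughness of the residual profile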
TL;DR:
The exact question is: how can I generate semi-equal polygons of a desired size, e.g. roughly 100x100 m, 1000x1000 m, or 5000x5000 m, forming a grid that covers the Earth?
Background story:
I'm building a Python-based microservice that, for a given LAT, LON (WGS84), will return a JSON with some data, e.g. the matched country/city or the selected polygon of a grid.
The country/city/clutter part works fine so far, as I'm using a shapefile and an R-tree for a quick check whether a point is within an area.
I'm struggling with the following case: imagine I have a large number of GPS-based samples with some data that I would like to, e.g., average over some geo-bins (a grid).
I'm trying to divide the Earth into semi-rectangular areas (in the Mercator projection) that I could later use with "contains" or "within" functions.
Currently this is done with an SQL query and GROUP BY using SIN/COS and rounding.
Samples grouped into bins (links to pictures omitted).
Since the shapefiles and the incoming data from the requests are in WGS84, my idea was to jump into Mercator (or Web Mercator), generate GeoPandas polygons there, and use the to_crs function to jump back to WGS84.
import geopandas as gp
from shapely.geometry import Polygon

# Load a world layer (e.g. the Natural Earth sample dataset bundled with geopandas)
world = gp.read_file(gp.datasets.get_path('naturalearth_lowres'))
world = world[(world.name != "Antarctica") & (world.name != "Fr. S. Antarctic Lands")]
world = world.to_crs({'init': 'epsg:3857'})
plotworld = world.plot(figsize=(20, 10))
plotworld.set_title("Mercator")
# Keep the map proportionate
plotworld.axis('equal')

# Draw a sample polygon rectangle (in Mercator)
x_point_list = [0.5*1e7, 0.75*1e7, 0.75*1e7, 0.5*1e7]
y_point_list = [-0*1e7, 0*1e7, 0.25*1e7, 0.25*1e7]
polygon_geom = Polygon(zip(x_point_list, y_point_list))
crs = {'init': 'epsg:3857'}
polygon = gp.GeoDataFrame(index=[0], crs=crs, geometry=[polygon_geom])
polygon.plot(ax=plotworld, color='red')

# Transform back to WGS84
world = world.to_crs({'init': 'epsg:4326'})
polygon = polygon.to_crs({'init': 'epsg:4326'})
plotworld2 = world.plot(figsize=(20, 10))
polygon.plot(ax=plotworld2, color='red')
My question is: how do I generate semi-equal polygons of a desired size, e.g. roughly 100x100 m, 1000x1000 m, or 5000x5000 m, forming a grid that covers the Earth?
I've gone through a number of geopandas/shapely sites that show tutorials about shapes and bins, but none mentions how to draw/generate bins of a desired size.
I fully understand that the dimensions of the polygons will vary a bit, but that does not bother me much.
Any help appreciated!
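For what it's worth, a minimal sketch of the Mercator round-trip idea described above (my own illustration, not an accepted answer: the helper name, cell size, extent, and CRS choices are arbitrary, and Web Mercator cells are only nominally square meters away from the equator):
import numpy as np
import geopandas as gp
from shapely.geometry import box

def make_grid(xmin, ymin, xmax, ymax, cell_size_m, crs='epsg:3857'):
    """Build square bins of cell_size_m in a projected CRS."""
    xs = np.arange(xmin, xmax, cell_size_m)
    ys = np.arange(ymin, ymax, cell_size_m)
    cells = [box(x, y, x + cell_size_m, y + cell_size_m) for x in xs for y in ys]
    return gp.GeoDataFrame(geometry=cells, crs=crs)

# 5000 m x 5000 m bins over a small test window, then back to lat/lon
grid = make_grid(0.5e7, 0, 0.55e7, 0.05e7, 5000)
grid_wgs84 = grid.to_crs('epsg:4326')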