Get nodes within 1km walking distance of a GPS location OSMNX - python

Given a (lat, lng) point and an OSMnx network built with network_type='all', how can I find which nodes in the graph are within 1 km walking distance from the point?
I was thinking about calculating the great-circle distance between each node and the point and checking whether it is at most 1 km, but I do not believe this will be very accurate, since the topology of the network would be ignored.

This OSMnx usage example demonstrates how.

I have never used OSMnx before, but the documentation seems to be very good. And yes, you are right that calculating the Haversine (great-circle) distance or the Euclidean distance will not give you the actual walking distance. The whole point of OSMnx is that it takes the real-life street network into account.
One of the functions that does appear to be based on the actual network is
osmnx.distance.shortest_path(G, orig, dest, weight='length').
You might use this function to calculate the shortest distance between each node and your point ... and then select those whose shortest distance is below 1 km.
I do not know, however, how walking paths, cycling paths and streets for cars are differentiated in OSMnx. You might need to consult the documentation for more details or open an issue in the OSMnx GitHub repo.
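For example, one rough sketch of that idea (assuming a recent OSMnx, building the graph with network_type='walk' so only walkable ways are included, and using networkx's single-source Dijkstra with a cutoff instead of calling shortest_path for every node pair) might look like this:

```python
import networkx as nx
import osmnx as ox

lat, lng = 52.5200, 13.4050   # hypothetical GPS location
point = (lat, lng)

# Walkable network around the point; 1500 m is just a buffer comfortably larger than 1 km.
G = ox.graph_from_point(point, dist=1500, network_type="walk")

# Snap the GPS location to its nearest graph node (nearest_nodes expects x=lng, y=lat).
origin = ox.distance.nearest_nodes(G, X=lng, Y=lat)

# Walking (network) distance from the origin node to every node reachable within 1 km.
lengths = nx.single_source_dijkstra_path_length(G, origin, cutoff=1000, weight="length")
nodes_within_1km = list(lengths)
```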

Related

What algorithm does GeoPandas use for calculating distances between polygons?

I'm using ArcGIS Pro and GeoPandas for spatial analysis operations. I noticed that the distance operations in ArcGIS and GeoPandas don't align. I wonder which algorithm GeoPandas uses for its distance calculations (the distance function).
In my example I selected polygons within a distance of 10 km from another polygon. One polygon is selected in ArcGIS but not in GeoPandas, as the distance there is > 10 km. The data is projected to the same CRS in both cases.
It's not surprising that different distance algorithms are used; I just can't find any information on which algorithm GeoPandas uses. I have already checked the documentation and the code in the Git repository.
ArcGIS uses vertex distances for polygons (ArcGIS documentation here).
Has anyone background information on the GeoPandas distance tool algorithm?
Help is greatly appreciated!
GeoPandas relies on a lot of dependencies. For distance computations between point objects, it uses Shapely's Euclidean distance, as can be traced here: https://github.com/Toblerity/Shapely/blob/master/shapely/geometry/base.py
If you have a geodataframe df5 like this
a b geometry
0 0 1 POINT (0.00000 0.00000)
1 0 2 POINT (1.00000 0.00000)
2 1 3 POINT (0.00000 2.00000)
3 5 4 POINT (1.00000 1.00000)
You can check the computation with
df5.geometry.values[0].distance(df5.geometry.values[3])
The result is 1.4142135623730951, which is the Euclidean distance between (0, 0) and (1, 1).
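Here is a self-contained version of that check (the column values are copied from the example above):

```python
import geopandas as gpd
from shapely.geometry import Point

df5 = gpd.GeoDataFrame(
    {"a": [0, 0, 1, 5], "b": [1, 2, 3, 4]},
    geometry=[Point(0, 0), Point(1, 0), Point(0, 2), Point(1, 1)],
)

# Distance between the first and last point: sqrt(1**2 + 1**2)
print(df5.geometry.values[0].distance(df5.geometry.values[3]))  # 1.4142135623730951
```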
There is no exact explanation of this in the Shapely documentation. However, it seems likely that shapely.ops.nearest_points is used:
https://shapely.readthedocs.io/en/latest/manual.html#shapely.ops.nearest_points
So I think it will represent the minimum distance between the two polygons.
However, this is just my opinion, and I think you will have to test it yourself in the end. In my experience Shapely does not produce closest points other than the vertices already present on the geometry. (You can see this in an example where Shapely calculates the minimum distance between a point and a line: the minimum distance is not the perpendicular distance to the line, but the distance to the closest of the line's vertices.)
Since it is based only on vertices that already exist in the polygons, I think there may be a difference between the two values if ArcGIS finds and uses the actual contact point with the minimum distance.
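If you want to test this, a small snippet like the following (with made-up coordinates) prints both the distance Shapely reports and the point on the line that it considers closest, so you can compare it against the perpendicular foot and the line's vertices:

```python
from shapely.geometry import Point, LineString
from shapely.ops import nearest_points

line = LineString([(0, 0), (10, 0)])   # one segment, vertices only at its two ends
pt = Point(5, 3)                       # perpendicular foot would be (5, 0); nearest vertex is an endpoint

print(pt.distance(line))               # minimum distance reported by Shapely
print(nearest_points(pt, line)[1])     # point on the line that Shapely reports as closest
```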

Shortest distance between random points on a circle?

I am trying to test the accuracy of a few optimization algorithms on the traveling salesman problem.
I wanted to create a system where I would always know what the optimal solution is. My logic was that I would create a bunch of random points on a unit circle and thus would always know the shortest path, because it would just be the order of the points on the circle. And how would I find the order? Well, just iterate through each point and find its closest neighbor. It turns out that this works most of the time, but sometimes... it doesn't!
Are there any suggestions for an algorithm that would find the optimal solution of random points on a unit circle 100% of the time? I like being able to just randomly create points on the circle.
You can compute a 100% accurate solution using a convex hull algorithm. This solution is exact as long as the points are in convex position, which is the case for points on a circle. The monotone chain algorithm is very interesting because it is both very fast and trivial to understand (not to mention that the implementation is also provided by Wikipedia in many languages).
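For example, a sketch using SciPy's ConvexHull (a Qhull wrapper rather than a hand-written monotone chain, but it returns the hull vertices in counter-clockwise order, which is the tour you want):

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(42)
angles = rng.uniform(0.0, 2.0 * np.pi, size=20)
points = np.column_stack([np.cos(angles), np.sin(angles)])  # random points on the unit circle

hull = ConvexHull(points)
tour = points[hull.vertices]   # for 2-D input, hull.vertices is in counter-clockwise order

# Optimal tour length: consecutive edges plus the closing edge back to the start.
closed = np.vstack([tour, tour[:1]])
tour_length = np.linalg.norm(np.diff(closed, axis=0), axis=1).sum()
print(tour_length)
```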
Jérôme Richard's answer works for this case and for all cases involving points on the boundary of a convex polygon, but there is an even simpler algorithm that also works for all these cases: For each point, just find the angle that a line through that point and the circle centre makes with a horizontal line through the circle centre, and sort the points by that.
If the origin (0, 0) is inside your circle, and your language has the atan2() function (most do -- it's a standard trig function), you can just apply atan2() to each point and sort them by that.
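A minimal sketch of that sort, assuming the points are (x, y) tuples and the circle centre is known (the origin here):

```python
import math

def tour_order(points, centre=(0.0, 0.0)):
    """Sort points by the angle they make around the circle centre."""
    cx, cy = centre
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
```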

How to snap coordinates to road and calculate distance

Maybe somebody knows something, since I am not able to find anything that makes sense to me.
I have a dataset of positions (lon, lat), and I want to snap them to the nearest road and calculate the distance between them.
So far I have discovered OSM; however, I can't find a working example of how to use the API from Python.
If any of you could help, I would be thankful for every little detail.
I will try to figure it out myself in the meantime and publish the answer if successful (I couldn't find any similar question, so maybe it will help someone in the future).
Welcome! OSM is a wonderful resource, but it is essentially a raw dataset that you have to download and process yourself. There are a number of ways to do this; if you need a relatively small extract of the data (as opposed to the full planet file), the Overpass API is the place to look. Overpass turbo (docs) is a useful tool to help with this API.
Once you have the road network data you need, you can use a library like Shapely to snap your points to the road network geometry, and then either calculate the distance between them (if you need "as the crow flies" distance), or split the road geometry by the snapped points and calculate the length of the line. If you need real-world distance that takes the curvature of the earth into consideration (as opposed to the distance as it appears on a projected map), you can use something like Geopy.
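As a rough sketch of the Shapely snapping step (the road geometry and GPS fixes below are made up, and the coordinates are assumed to already be in a projected, metric CRS):

```python
from shapely.geometry import Point, LineString

# Hypothetical road geometry and GPS fixes; in practice these come from the OSM extract.
road = LineString([(0, 0), (50, 0), (120, 30)])
fix_a = Point(20, 8)
fix_b = Point(100, 35)

# Snap each fix to the nearest point on the road.
snapped_a = road.interpolate(road.project(fix_a))
snapped_b = road.interpolate(road.project(fix_b))

# Straight-line distance between the snapped points ...
crow_flies = snapped_a.distance(snapped_b)

# ... or distance measured along the road between them.
along_road = abs(road.project(fix_b) - road.project(fix_a))
```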
You may also want to look into the Map Matching API from Mapbox (full disclosure, I work there), which takes a set of coordinates, snaps them to the road network, and returns the snapped geometry as well as information about the route, including distance.
You might use sklearn's KDTree for this. Fill an array with the coordinates of candidate roads (I downloaded these from OpenStreetMap). Then use KDTree to build a tree from this array. Finally, use KDTree.query(your_point, k=1) to get the nearest point of the tree (which is the nearest node of the road coordinates). Since searching the tree is very fast (essentially log(N) for N points in the tree), you can query lots of points.
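A rough sketch of that idea (the road-node coordinates are made up; note that querying raw lon/lat gives distances in degrees, so use it to pick the nearest candidate node, or project the coordinates first):

```python
import numpy as np
from sklearn.neighbors import KDTree

# Hypothetical (lon, lat) coordinates of road nodes taken from an OSM extract.
road_nodes = np.array([
    [13.400, 52.520],
    [13.410, 52.521],
    [13.422, 52.527],
])

tree = KDTree(road_nodes)

gps_points = np.array([[13.405, 52.522]])
dist, idx = tree.query(gps_points, k=1)   # nearest road node for each GPS point
nearest_node = road_nodes[idx[0, 0]]
```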

Euclidean Minimal Spanning Tree and Delaunay Triangulation

I want to calculate the minimal spanning tree based on the euclidean distance between a set of points on a 2D-plane. My current code stores all the edges, and then performs Prim's algorithm in order to get the minimal spanning tree. However, I am aware that doing this takes O(n^2) space for all the edges.
After doing some research, it becomes clear that the memory and runtime can be optimized if I calculate the delaunay triangulation first on this set of points, then obtain the minimal spanning tree via running Prim's or Kruskal's algorithm on the edges of the triangulation.
This is part of a programming contest (https://prologin.org/train/2017/qualification/taxi_des_neiges), so I doubt I'd be able to use scipy.spatial. Are there any alternatives for simply getting the edges contained in the Delaunay triangulation?
Thanks in advance.
Would a module help? Here are a few that might work:
delaunator
poly2tri
triangle
matplotlib? (see the sketch below)
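If matplotlib is an option, its triangulation helper exposes the edge list directly; a minimal sketch (with placeholder points):

```python
import numpy as np
from matplotlib.tri import Triangulation

pts = np.random.rand(100, 2)               # placeholder point set
tri = Triangulation(pts[:, 0], pts[:, 1])  # computes a Delaunay triangulation under the hood

edges = tri.edges                          # (n_edges, 2) array of point-index pairs
lengths = np.linalg.norm(pts[edges[:, 0]] - pts[edges[:, 1]], axis=1)
# Feed (edges, lengths) into Prim's or Kruskal's algorithm to get the Euclidean MST.
```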
Roll your own? Both of these describe the incremental algorithm, which Wikipedia seems to say is O(n log n):
http://www.geom.uiuc.edu/~samuelp/del_project.html
http://graphics.stanford.edu/courses/cs368-06-spring/handouts/Delaunay_1.pdf
Here's an ActiveState recipe that might help to get started, but it looks like it's not finished.
It looks like SciPy uses Qhull, which (somewhere in this folder) has the code for performing the Delaunay triangulation and getting the edges (albeit implemented in C):
https://github.com/scipy/scipy/tree/master/scipy/spatial/qhull/src

Minimum bottleneck spanning tree vs MST

An MST (minimum spanning tree) is necessarily an MBST (minimum bottleneck spanning tree).
Given a set of points in a 2D plane, with edges weighted by the Euclidean distance between each pair of points, I want to find the minimum bottleneck edge of a spanning tree (its maximum-weight edge).
According to my limited knowledge of this field and my research on the internet, the best way to calculate this using the MST approach is to do a Delaunay triangulation of the given points, use the fact that the MST of these points is a subset of the triangulation, and then run Kruskal's or Prim's algorithm on the triangulation's edges. Then all I need to do is find the maximum weight in the MST.
I am wondering if I can do this more efficiently by building an MBST directly, as I am only looking for the bottleneck edge.
Thanks in advance.
Edit: my current way of finding it, whether via MST or MBST, calculates the weights of all V * (V - 1) / 2 edges first, which I consider quite inefficient. I'd like to know if there is a way around this.
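For reference, here is a sketch of the Delaunay-then-MST route described above, using SciPy's Delaunay and minimum_spanning_tree (just one possible implementation); since an MST is also an MBST, the heaviest edge of the MST is the bottleneck value:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial import Delaunay

pts = np.random.rand(200, 2)   # placeholder point set
tri = Delaunay(pts)

# Collect each Delaunay edge once, as an ordered pair of point indices.
edges = set()
for a, b, c in tri.simplices:
    edges.update({tuple(sorted((a, b))), tuple(sorted((b, c))), tuple(sorted((a, c)))})

rows, cols = zip(*edges)
weights = np.linalg.norm(pts[list(rows)] - pts[list(cols)], axis=1)

# The MST restricted to Delaunay edges equals the Euclidean MST of the points.
graph = csr_matrix((weights, (rows, cols)), shape=(len(pts), len(pts)))
mst = minimum_spanning_tree(graph)

bottleneck = mst.data.max()    # heaviest MST edge = the bottleneck value
```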
