I have a line segment AB (2D) from point A to point B. To represent a coastline (a closed polygon with 3*10^3 vertices), I have a 2D NumPy array of points which starts and ends at the same point. I want to know whether the connection between points A and B intersects the coastline.
My first approach was to iterate over each line segment of the closed polygon and check if it intersects with AB. Here is the underlying method.
Even if I do this with NumPy arrays or compile the function with Cython, it is not fast enough, because I have to repeat it many times for different As and Bs.
I thought this might be a conceptual problem, and I was wondering whether there is a smarter way to check only whether at least one intersection exists (True/False).
I also tried Shapely, but it was rather slow.
import numpy as np
from shapely.geometry import LineString
# x_values and y_values are the 1-D coordinate arrays of the coastline
coastline = LineString(np.column_stack((x_values, y_values)))
def intersection(A, B, coastline):
    AB = LineString([(A[0], A[1]), (B[0], B[1])])
    if AB.intersection(coastline).is_empty:
        return False
    return True
This is a collision detection problem.
So in your case the best approach is to put your coastline inside a spatial data structure such as a BSP tree, quadtree, AABB tree, etc.
Then perform intersection between your line segment and the tree-structure.
See for instance CGAL AABB_tree:
https://doc.cgal.org/latest/AABB_tree/index.html
That library is for 3D, but the same idea works in 2D. You can embed almost any geometry inside an AABB tree and query line intersections very fast.
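To make the bounding-box idea concrete, here is a minimal NumPy sketch. It is not a tree and not the CGAL API: it only applies the axis-aligned bounding-box rejection that an AABB tree organises hierarchically, then runs the usual orientation test on the surviving edges. The function name `segment_hits_polyline` is made up for this illustration, and touching counts as an intersection.

```python
import numpy as np

def _cross(o, u, v):
    # z-component of (u - o) x (v - o); broadcasts over arrays of points
    return ((u[..., 0] - o[..., 0]) * (v[..., 1] - o[..., 1])
            - (u[..., 1] - o[..., 1]) * (v[..., 0] - o[..., 0]))

def segment_hits_polyline(a, b, poly):
    """Return True if segment a-b crosses or touches any edge of poly (K x 2)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    poly = np.asarray(poly, dtype=float)
    p, q = poly[:-1], poly[1:]              # edge start and end points
    # Step 1: keep only edges whose bounding box overlaps the box of a-b
    lo, hi = np.minimum(a, b), np.maximum(a, b)
    elo, ehi = np.minimum(p, q), np.maximum(p, q)
    cand = np.all((elo <= hi) & (ehi >= lo), axis=1)
    if not cand.any():
        return False
    p, q = p[cand], q[cand]
    # Step 2: orientation test on the remaining candidate edges
    d1, d2 = _cross(p, q, a), _cross(p, q, b)   # a and b relative to edge
    d3, d4 = _cross(a, b, p), _cross(a, b, q)   # edge ends relative to a-b
    return bool(np.any((d1 * d2 <= 0) & (d3 * d4 <= 0)))

# toy "coastline": a closed square
square = np.array([[0, 0], [4, 0], [4, 4], [0, 4], [0, 0]], dtype=float)
print(segment_hits_polyline((1, 1), (2, 2), square))   # segment fully inside
print(segment_hits_polyline((1, 1), (5, 1), square))   # segment leaves the square
```

This is still linear in the number of edges per query; a real tree would replace step 1 with a logarithmic search, which is where the speed-up for many A/B pairs comes from.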
There is a boundary inside China which divides the country into North and South. I have drawn this boundary as a polyline shapefile (download link).
I want to divide the points in the following figures into "North" and "South". Is there a function in Python that can achieve this?
Shapely has the point.within function to test whether points lie inside or outside a polygon, but I have not found a suitable function to divide multiple points by a polyline.
Any advice or tips would be appreciated!
updated
Following the valuable suggestion made by Prune, I worked it out. The code is as follows:
import fiona
from shapely.geometry import LineString, shape
# load the boundary layer (first feature of the shapefile)
fname = './N-S_boundary.shp'
with fiona.open(fname) as src:
    line1 = shape(next(iter(src))['geometry'])
# set an end point, which is the southernmost of all stations
southernmost = dy.iloc[dy['lat'].values.argmin()]
end_point = (southernmost['lon'], southernmost['lat'])
# loop over all monitoring stations for classification
labels = []
for lon, lat in zip(dy['lon'], dy['lat']):
    line2 = LineString([(lon, lat), end_point])
    # no crossing of the boundary: same side as the southernmost station
    labels.append('S' if line1.intersection(line2).is_empty else 'N')
dy['NS'] = labels
color_dict = {'N': 'steelblue', 'S': 'r'}
dy['site_color'] = dy['NS'].map(color_dict)
You can apply a simple property from topology.
First, make sure that your boundary partitions the universe (all available points you're dealing with). You may need to extend the boundary through the ocean to finish this.
Now, pick any reference point that is labeled as to the region -- to define "North" and "South", you must have at least one such point. w.l.o.g. assume it's a "South" point called Z.
Now, for each point A you want to classify, draw a continuous path (a straight one is usually easiest, but not required) from A to Z. Find the intersections of this path with the boundary. If you have an even number of intersections, then A is in the same class ("South") as Z; otherwise, it's in the other class ("North").
Note that this requires a topological property of "partition" -- there are no tangents to the boundary line: if your path touches the boundary, it must cross completely.
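The parity rule above can be sketched with Shapely as follows. This is only an illustration under stated assumptions: the boundary is a single LineString that really partitions the area of interest, the path is a straight segment, and `count_crossings` and `classify` are hypothetical helper names. Paths that run along the boundary are rejected rather than guessed at.

```python
from shapely.geometry import LineString

def count_crossings(path, boundary):
    """Number of isolated points where path meets boundary."""
    inter = path.intersection(boundary)
    if inter.is_empty:
        return 0
    if inter.geom_type == "Point":
        return 1
    if inter.geom_type == "MultiPoint":
        return len(inter.geoms)
    # overlap along a stretch of the boundary: parity is not well defined
    raise ValueError("path runs along the boundary; pick another reference point")

def classify(point, reference, boundary, same="South", other="North"):
    """Label point by the parity of crossings on the way to the reference."""
    path = LineString([point, reference])
    return same if count_crossings(path, boundary) % 2 == 0 else other

# toy boundary with a bump, reference point Z known to be "South"
boundary = LineString([(-10, 0), (0, 0), (0, 4), (2, 4), (2, 0), (10, 0)])
Z = (1, -5)
print(classify((1, 2), Z, boundary))    # under the bump, no crossing
print(classify((1, 6), Z, boundary))    # above the bump, one crossing
```

Note that a path that merely touches the boundary at a vertex is counted as one crossing here, which is exactly the tangency case the answer warns about.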
I'm trying to replicate an N dimensional Delaunay triangulation that is performed by the Matlab delaunayn function in Python using the scipy.spatial.Delaunay function. However, while the Matlab function gives me the result I want and expect, scipy is giving me something different. I find this odd considering both are wrappers of the QHull library. I assume Matlab is implicitly setting different parameters in its call. The situation I'm trying to replicate between the two of them is found in Matlab's documentation.
The set up is to have a cube with a point in the center as below. The blue lines I provided to help visualize the shape, but they serve no purpose or meaning for this problem.
The triangulation I expect from this results in 12 simplices (listed in the Matlab example) and looks like the following.
However, this Python equivalent produces "extra" simplices.
import numpy as np
import scipy.spatial
x = np.array([[-1,-1,-1],[-1,-1,1],[-1,1,-1],[1,-1,-1],[1,1,1],[1,1,-1],[1,-1,1],[-1,1,1],[0,0,0]])
simp = scipy.spatial.Delaunay(x).simplices
The returned variable simp should be an M x N array where M is the number of simplices found (should be 12 for my case) and N is the number of points in the simplex. In this case, each simplex should be a tetrahedron meaning N is 4.
What I'm finding though is that M is actually 18 and that the extra 6 simplices are not tetrahedrons, but rather the 6 faces of the cube.
What's going on here? How can I limit the returned simplices to only be tetrahedrons? I used this simple case to demonstrate the problem so I'd like a solution that isn't tailored to this problem.
EDIT
Thanks to an answer by Amro, I was able to figure this out and I can get a match in simplices between Matlab and Scipy. There were two factors in play. First, as pointed out, Matlab and Scipy use different QHull options. Second, QHull returns simplices with zero volume. Matlab removes these, Scipy doesn't. That was obvious in the example above because all 6 extra simplices were the zero-volume coplanar faces of the cube. These can be removed, in N dimensions, with the following bit of code.
import numpy as np
import scipy.spatial
N = 3 # The dimensions of our points; points is the (n_points, N) input array
options = 'Qt Qbb Qc' if N <= 3 else 'Qt Qbb Qc Qx' # Set the QHull options
tri = scipy.spatial.Delaunay(points, qhull_options = options).simplices
keep = np.ones(len(tri), dtype = bool)
for i, t in enumerate(tri):
    if abs(np.linalg.det(np.hstack((points[t], np.ones([1,N+1]).T)))) < 1E-15:
        keep[i] = False # Simplex has zero volume (its vertices are coplanar), we don't want to keep it
tri = tri[keep]
I suppose the other conditions should be addressed, but I'm guaranteed that my points contain no duplicates already, and the orientation condition appears to have no effect on the outputs that I can discern.
Some notes comparing MATLAB and SciPy functions:
According to MATLAB docs, by default it uses Qt Qbb Qc Qhull options for 3-dimensional input, while SciPy uses Qt Qbb Qc Qz.
Not sure if it matters, but your NumPy array is not in the same order as the points created with ndgrid in MATLAB.
In fact if you look at the MATLAB code in edit delaunayn.m, you can see three extra steps performed:
first it merges duplicate points mergeDuplicatePoints (this is not an issue in your case)
then it enforces an orientation convention for the points (see the code)
finally after getting the result from Qhull (implemented as a MEX-function qhullmx), there is the following comment above a few lines of code:
Strip the zero volume simplices that may have been created by the presence of degeneracy.
Since the file is copyrighted, I won't post the code here, but you can check it on your end.
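For what it's worth, the zero-volume stripping step can also be done without a Python loop. The sketch below is not MATLAB's code: it uses SciPy's default Qhull options, an absolute tolerance I picked arbitrarily, and a hypothetical helper name `nondegenerate_simplices`. The volume of each simplex is taken from the determinant of its edge vectors.

```python
import numpy as np
from scipy.spatial import Delaunay

def nondegenerate_simplices(points, tol=1e-12):
    """Delaunay simplices of points (n x N) with |det of edge matrix| > tol."""
    points = np.asarray(points, dtype=float)
    tri = Delaunay(points).simplices            # (M, N+1) vertex indices
    verts = points[tri]                         # (M, N+1, N) coordinates
    edges = verts[:, 1:, :] - verts[:, :1, :]   # (M, N, N) edge vectors
    dets = np.abs(np.linalg.det(edges))         # N! times the simplex volume
    return tri[dets > tol]

# cube corners plus the centre point, as in the question
x = np.array([[-1, -1, -1], [-1, -1, 1], [-1, 1, -1], [1, -1, -1],
              [1, 1, 1], [1, 1, -1], [1, -1, 1], [-1, 1, 1], [0, 0, 0]])
simp = nondegenerate_simplices(x)
print(simp.shape)
```

The tolerance is absolute, so for coordinates on a very different scale it would need to be adjusted.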
I have made a three way venn diagram. I have three issues with it that I can't seem to solve.
What is the code to move the circle labels (i.e."Set1","Set2","Set3") because right now one is too far away from the circle.
What is the code to make the circles be three equal sizes/change the circle size?
What is the code to move the circles around the plot. Right now, set2 is within set3 (but coloured differently), I would like the diagram to look more like the "standard" way of showing a venn diagram (i.e. 3 separate circles with some overlap in the middle).
On another note, I found it difficult to find out what commands such as "set_x" and "set_alpha" do; if anyone knows of a manual that would answer my questions above, I would appreciate it. I couldn't find one place with all the information I needed.
import sys
import numpy
import scipy
from matplotlib_venn import venn3,venn3_circles
from matplotlib import pyplot as plt
#Build three lists to make 3 way venn diagram with
list_line = lambda x: set([line.strip() for line in open(sys.argv[x])])
set1,set2,set3 = list_line(1),list_line(2),list_line(3)
#Make venn diagram
vd = venn3([set1,set2,set3],set_labels=("Set1","Set2","Set3"))
#Colours: get the HTML codes from the net
vd.get_patch_by_id("100").set_color("#FF8000")
vd.get_patch_by_id("001").set_color("#5858FA")
vd.get_patch_by_id("011").set_color("#01DF3A")
#Move the numbers in the circles
vd.get_label_by_id("100").set_x(-0.55)
vd.get_label_by_id("011").set_x(0.1)
#Strength of color: alpha ranges from 0.0 (transparent) to 1.0 (opaque)
vd.get_patch_by_id("100").set_alpha(0.8)
vd.get_patch_by_id("001").set_alpha(0.6)
vd.get_patch_by_id("011").set_alpha(0.8)
plt.title("Venn Diagram",fontsize=14)
plt.savefig("output",format="pdf")
What is the code to move the circle labels (i.e."Set1","Set2","Set3") because right now one is too far away from the circle.
Something like this:
lbl = vd.get_label_by_id("A")
x, y = lbl.get_position()
lbl.set_position((x+0.1, y-0.2)) # Or whatever
The "A", "B", and "C" are predefined identifiers, denoting the three sets.
What is the code to make the circles be three equal sizes/change the circle size?
If you do not want the circle/region sizes to correspond to your data (not necessarily a good idea), you can get an unweighted ("classical") Venn diagram using the function venn3_unweighted:
from matplotlib_venn import venn3_unweighted
venn3_unweighted(...same parameters you used in venn3...)
You can further cheat and tune the result by providing a subset_areas parameter to venn3_unweighted - this is a seven-element vector specifying the desired relative size of each region. In this case the diagram will be drawn as if the region areas were subset_areas, yet the numbers will be shown from the actual subsets. Try, for example:
venn3_unweighted(...., subset_areas=(10,1,1,1,1,1,1))
What is the code to move the circles around the plot.
The need to "move the circles around" is somewhat unusual: normally you would either want the circles to be positioned so that their intersection sizes correspond to your data, or use the "default" positioning. The functions venn3 and venn3_unweighted cater to those two requirements. Moving circles around arbitrarily is possible, but would require some lower-level coding and I'd advise against that.
I found it difficult to find what the commands such as "set_x", "set_alpha" should be
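The object you get when you call vd.get_label_by_id is a Matplotlib Text object; its methods and properties are listed in the Matplotlib documentation for matplotlib.text.Text. The object returned by vd.get_patch_by_id is a PathPatch; see the documentation for matplotlib.patches.PathPatch and its base class matplotlib.patches.Patch for reference.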
I have a numpy array points of shape [N,2] which contains the (x,y) coordinates of N points. I'd like to compute the mean distance of every point to all other points using an existing function (which we'll call cmp_dist and which I just use as a black box).
First a verbose solution in "normal" python to illustrate what I want to do (written from the top of my head):
mean_dist = []
for i,(x0,y0) in enumerate(points):
    dist = []
    for j,(x1,y1) in enumerate(points):
        if i==j: continue
        dist.append(cmp_dist(x0,y0,x1,y1))
    mean_dist.append(np.array(dist).mean())
I already found a "better" solution using list comprehensions (assuming list comprehensions are usually better) which seems to work just fine:
mean_dist = [np.array([cmp_dist(x0,y0,x1,y1) for j,(x1,y1) in enumerate(points) if not i==j]).mean()
             for i,(x0,y0) in enumerate(points)]
However, I'm sure there's a much better solution for this in pure numpy, hopefully some function that allows to do an operation for every element using all other elements.
How can I write this code in pure numpy/scipy?
I tried to find something myself, but this is quite hard to google without knowing how such operations are called (my respective math classes are quite a while back).
Edit: Not a duplicate of Fastest pairwise distance metric in python
The author of that question has a 1D array r and is satisfied with what scipy.spatial.distance.pdist(r, 'cityblock') returns (an array containing the distances between all points). However, pdist returns a flat array, that is, it is not clear which of the distances belong to which point (see my answer).
(Although, as explained in that answer, pdist is what I was ultimately looking for, it doesn't solve the problem as I've specified it in the question.)
Based on @ali_m's comment to the question ("Take a look at scipy.spatial.distance.pdist"), I found a "pure" numpy/scipy solution:
from scipy.spatial.distance import cdist
...
fct = lambda p0,p1: great_circle_distance(p0[0],p0[1],p1[0],p1[1])
mean_dist = np.sort(cdist(points,points,fct))[:,1:].mean(1)
That's definitely an improvement over my list comprehension "solution".
What I don't really like about this, though, is that I have to sort and slice the array to remove the 0.0 values which are the result of computing the distance between identical points (so basically that's my way of removing the diagonal entries of the matrix I get back from cdist).
Note two things about the above solution:
I'm using cdist, not pdist as suggested by @ali_m.
I'm getting back an array of the same size as points, which contains the mean distance from every point to all other points, just as specified in the original question.
pdist unfortunately just returns all pairwise distances in one flat array, that is, the distances are unlinked from the points they refer to, and that link is necessary for the problem as I've described it in the original question.
However, since in the actual problem at hand I only need the mean over the means of all points (which I did not mention in the question), pdist serves me just fine:
from scipy.spatial.distance import pdist
...
fct = lambda p0,p1: great_circle_distance(p0[0],p0[1],p1[0],p1[1])
mean_dist_overall = pdist(points,fct).mean()
This would certainly be the definitive answer if I had asked for the mean of the means, but I purposely asked for the array of means for all points. Because I think there's still room for improvement in the above cdist solution, I won't accept this as THE answer.
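One possible way around the sort-and-slice step, sketched here under the assumption that the distance from a point to itself should simply be ignored: sum each row of the cdist matrix with the diagonal zeroed and divide by N-1. The helper name `mean_dist_to_others` is made up, and the Euclidean metric stands in for the black-box distance function; cdist also accepts a callable taking two 1-D arrays.

```python
import numpy as np
from scipy.spatial.distance import cdist

def mean_dist_to_others(points, metric="euclidean"):
    """Mean distance from every point to all other points (length-N array)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    d = cdist(points, points, metric)   # full N x N distance matrix
    np.fill_diagonal(d, 0.0)            # ignore each point's distance to itself
    return d.sum(axis=1) / (n - 1)      # average over the N-1 other points

pts = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 8.0]])
print(mean_dist_to_others(pts))         # one mean per input point, same order
```

Unlike sorting each row and dropping the first column, this keeps working when two different points coincide, because only the diagonal entry is excluded.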
I have 2 shapefiles, 1 containing a lot of lines that make up a road network, and another with many GPS points.
So far I've managed to open both shapefiles and do an intersection() using Shapely and Fiona, using the code found here - https://gis.stackexchange.com/a/128210/52590
Here's a copy of my code getting the intersecting points:
from shapely.geometry import shape, MultiLineString
import fiona
Multilines = MultiLineString([shape(line['geometry']) for line in fiona.open("shapefiles/edges.shp")])
Poly = shape(next(iter(fiona.open("shapefiles/testBuffer.shp")))['geometry'])
intersecciones = Multilines.intersection(Poly)
And this is what 'intersecciones' looks like when printed:
> MULTILINESTRING ((339395.1489003573 6295646.564306445,
> 339510.1820952367 6295721.782758819), (339391.2927481248 6295686.99659219, 339410.0625 6295699), (339404.4651918385 6295630.405294137, 339520.18020253 6295708.663279793))
So this means there are 3 intersections between the lines shapefile and the first polygon of the polygons shapefile.
What I need though is to get two attributes ('Nombre' and 'Sentido') from every line in the lines shapefile that intersects the polygons, in addition to the exact point where they intersect, so I can get the distance from the center of the polygon to the intersecting point after.
So my question is if there's any way to get those attributes, using Shapely or any other Python library there is. Also, what would be the best way to iterate through each polygon and save the data? I'm thinking maybe of a dictionary that contains every polygon with the attributes of the intersecting lines and distance. And last, is there any more efficient way to find the intersections? It's taking around 1 minute to process a single polygon and I'll probably need it to be faster in the future.
If there's any information I'm missing please tell me so I can edit the question.
Thank you very much in advance,
Felipe.
You should have a look at GeoPandas http://geopandas.org/ which uses Fiona and Shapely whilst giving you also direct access to the attributes in a nice tabular format. Together with some pandas operations (such as in this post) you should be able to do what you want with a couple of lines of code.
Probably not the best code, but I solved it by loading the points shapefile (where the point attributes were), the lines shapefile (where the line attributes were) and the polygons (buffered points). Then I used two nested 'for' loops to check whether each buffered point intersected each line. If it did, I retrieved the attributes inside the same loops.
In the end I have "listaCalles", which is a list that contains every intersection of polygon with line, with many attributes.
import fiona
from shapely.geometry import MultiLineString, MultiPoint, shape
red = fiona.open("shapefiles/miniRedVial.shp") # loads road network
puntos = fiona.open("shapefiles/datosgps.shp") # loads points
# open points with shapely and fiona
Multipoints = MultiPoint([shape(point['geometry']) for point in fiona.open("shapefiles/datosgps.shp")])
# open road network with shapely and fiona
Multilines = MultiLineString([shape(line['geometry']) for line in fiona.open("shapefiles/miniRedVial.shp")])
# open buffered points with shapely and fiona (only the polygon outlines are kept)
Polygons = MultiLineString([list(shape(pol['geometry']).exterior.coords) for pol in fiona.open("shapefiles/testBuffer.shp")])
# create list where I'll save the intersections
listaCalles = []
# .geoms gives indexable access to the parts of a multi-geometry
for i in range(len(Polygons.geoms)):
    for j in range(len(Multilines.geoms)):
        if Polygons.geoms[i].intersects(Multilines.geoms[j]):
            idPunto = puntos[i].get("id")
            latPunto = puntos[i].get("properties").get("LATITUDE")
            lonPunto = puntos[i].get("properties").get("LONGITUDE")
            idCalle = red[j].get("id")
            nombreCalle = red[j].get("properties").get("NOMBRE")
            distPuntoCalle = Multilines.geoms[j].distance(Multipoints.geoms[i])
            listaCalles.append((idPunto, latPunto, lonPunto, idCalle, nombreCalle, distPuntoCalle))
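A slightly more compact variant of the same idea is to keep each geometry paired with its attribute record, so nothing has to be looked up by index afterwards. This is only a sketch: the features below are hand-written GeoJSON-like dicts standing in for what fiona yields, the helper `lines_hitting_polygons` is a made-up name, and the distance is measured from the polygon centroid to the line rather than to the exact crossing point.

```python
from shapely.geometry import shape

def lines_hitting_polygons(line_features, polygon_features):
    """Map polygon index -> list of attribute dicts of intersecting lines."""
    # pair every line geometry with its attribute record once, up front
    lines = [(shape(f["geometry"]), f["properties"]) for f in line_features]
    results = {}
    for k, pol in enumerate(polygon_features):
        poly = shape(pol["geometry"])
        centre = poly.centroid
        hits = []
        for geom, props in lines:
            if geom.intersects(poly):
                hits.append({
                    "Nombre": props.get("Nombre"),
                    "Sentido": props.get("Sentido"),
                    "distance": centre.distance(geom),
                })
        results[k] = hits
    return results

# hypothetical in-memory features with the same layout as fiona records
roads = [
    {"geometry": {"type": "LineString", "coordinates": [(-1, 1.5), (3, 1.5)]},
     "properties": {"Nombre": "A", "Sentido": "E-W"}},
    {"geometry": {"type": "LineString", "coordinates": [(10, 10), (12, 10)]},
     "properties": {"Nombre": "B", "Sentido": "N-S"}},
]
buffers = [
    {"geometry": {"type": "Polygon",
                  "coordinates": [[(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]]},
     "properties": {}},
]
print(lines_hitting_polygons(roads, buffers))
```

The double loop is still quadratic; for large road networks a spatial index or the GeoPandas route suggested above would be the place to look for speed.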