Getting coordinates of surface nodes using pyvista - python

I'm wondering if anyone could help me figure out how to apply pyvista to extract the surface nodes of a 3D object. For example, suppose I have a collection of points that builds out a sphere, including 'interior' and 'surface' points:
import numpy as np
import matplotlib.pyplot as plt
N = 50
max_rad = 1
thetavec = np.linspace(0,np.pi,N)
phivec = np.linspace(0,2*np.pi,2*N)
[th, ph] = np.meshgrid(thetavec,phivec)
R = np.random.rand(*th.shape) * max_rad
x = R*np.sin(th)*np.cos(ph)
y = R*np.sin(th)*np.sin(ph)
z = R*np.cos(th)
ax = plt.axes(projection='3d')
ax.plot(x.flatten(), y.flatten(), z.flatten(), '*')
Now I'd like to apply pyvista's extract_surface to locate the 'nodes' that live on the surface, together with their coordinates. That is, I'd like for extract_surface to return an array or dataframe of the coordinates of the surface points. I've tried to build a polydata object just with the vertices above (see link and section 'Initialize with just vertices')
Any help is much appreciated. Thanks!

Since you've confirmed in a comment that you're looking for a convex hull, you can do this using the delaunay_3d() filter. The output of the triangulation is an UnstructuredGrid containing tetrahedra that fill the convex hull of your mesh. Calling extract_surface() on this space-filling mesh will give you the actual exterior, i.e. the convex hull:
import numpy as np
import pyvista as pv
# your example data
N = 50
max_rad = 1
thetavec = np.linspace(0,np.pi,N)
phivec = np.linspace(0,2*np.pi,2*N)
[th, ph] = np.meshgrid(thetavec,phivec)
R = np.random.rand(*th.shape) * max_rad
x = R*np.sin(th)*np.cos(ph)
y = R*np.sin(th)*np.sin(ph)
z = R*np.cos(th)
# create a PyVista point cloud (in a PolyData)
points = np.array([x, y, z]).reshape(3, -1).T # shape (n_points, 3)
cloud = pv.PolyData(points)
# extract surface by Delaunay triangulation to get the convex hull
convex_hull = cloud.delaunay_3d().extract_surface() # contains faces
surface_points = convex_hull.cast_to_pointset() # only points
# check what we've got
surface_points.plot(
    render_points_as_spheres=True,
    point_size=10,
    background='paleturquoise',
    scalar_bar_args={'color': 'black'},
)
(On older PyVista versions where PolyData.cast_to_pointset() is not available, one can use convex_hull.extract_points(range(convex_hull.n_points)) instead.)
The result looks like this:
Playing around with this interactively it's obvious that it only contains points from the convex hull (i.e. it doesn't contain interior points).
Also note the colouring: the scalars used are called 'vtkOriginalPointIds', and they are what you would guess from the name: the index of each point in the original point cloud. So we can use these scalars to extract the indices, in the original point cloud, of the points making up the convex hull:
# grab original point indices
surface_point_inds = surface_points.point_data['vtkOriginalPointIds']
# confirm that the indices are correct
print(np.array_equal(surface_points.points, cloud.points[surface_point_inds, :]))
# True
Of course if you don't need to identify the surface points in the original point cloud then you can just use surface_points.points or even convex_hull.points to get a standalone array of convex hull point coordinates.
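If only the hull coordinates are needed, the same idea can probably be expressed without PyVista. The following is a minimal sketch, assuming SciPy is installed. It swaps in scipy.spatial.ConvexHull for the delaunay_3d() approach above, and it uses uniformly random points as a stand-in for the example data:

```python
import numpy as np
from scipy.spatial import ConvexHull

# stand-in point cloud (an assumption, not the sphere data from the question)
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(500, 3))  # shape (n_points, 3)

# compute the convex hull of the cloud
hull = ConvexHull(points)

# hull.vertices holds the indices (into `points`) of the hull's corner points
surface_point_inds = hull.vertices
surface_coords = points[surface_point_inds]  # shape (n_surface_points, 3)

print(surface_coords.shape)
```

Here hull.vertices plays the same role as the 'vtkOriginalPointIds' scalars: it indexes back into the original point array.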

Related

Calculate the area enclosed by a 2D array of unordered points in python

I am trying to calculate the area of a shape enclosed by a large set of unordered points in python. I have a 2D array of points which I can plot as a scatterplot like this.
There are several ways to calculate the area enclosed by points, but these all assume ordered points, such as here and here. This method calculates the area of unordered points, but it doesn't appear to work for complex shapes, as seen here. How would I calculate this area from unordered points in python?
Sample data looks like this:
[[225.93459 -27.25677 ]
[226.98128 -32.001945]
[223.3623 -34.119724]
[225.84741 -34.416553]]
From pen and paper one can see that this shape contains an area of ~12 (unitless) but putting these coordinates into one of the algorithms linked to previously returns an area of ~0.78.
First, note that in the question 'How would I calculate this area from unordered points in python?' the phrase 'unordered points', in the context of an area calculation, usually means points on a contour that encloses the area to be calculated.
But the data sample provided in the question is not a set of contour points; it is just a cloud of points which, when visualized as a scatterplot, forms a visually perceivable area.
This is why the algorithms linked in the question, which calculate areas from 'unordered points', don't apply to what the question is actually about.
In other words, the actual title of the question I will answer below is:
Calculate the visually perceivable area a cloud of (x,y) points is forming when visualized as a scatterplot
One of the possible options is mentioned in a comment to the question:
Honestly, you might consider taking THAT graph as a bitmap, and counting the number of non-white pixels in it. That is probably as close as you can get. – Tim Roberts
If the image exactly covers all the non-white pixels (without any margin), you can calculate the area TA of the image rectangle, in the units of the underlying (x,y) data, from the list of points P ( P = [(x1,y1), (x2,y2), ...] ) as follows:
X = [x for x,y in P]
Y = [y for x,y in P]
TA = (max(X)-min(X))*(max(Y)-min(Y))
Assuming N_white is the number of white pixels in an image with N pixels in total, the actual area A covered by non-white pixels, expressed in the units used in the list of points P, will be:
A = TA*(N-N_white)/N
Another approach, which uses only the list of points P with (x,y) coordinates (without creating an image), consists of the following steps:
decide which area Ap a single point covers and calculate half the side length h2 of a square with this area around that point ( h2 = 0.5*sqrt(Ap) )
create a list R with squares around all points in the list P: R = [(x-h2, y+h2, x+h2, y-h2) for x,y in P]
use the code provided through a link listed in the stackoverflow question 'Area of Union Of Rectangles using Segment Trees' to calculate the total area covered by the rectangles in the list R.
The above approach has the advantage over the graphical one based on the scatterplot that the choice of the area covered by a point directly controls the precision/resolution/granularity of the area calculation.
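The steps above can be sketched roughly as follows. Note that this is not the segment-tree code referred to in the answer: it swaps in a simpler coordinate-compression grid, which is only practical for small point sets. The function name union_area_of_squares is hypothetical:

```python
import numpy as np

def union_area_of_squares(points, h2):
    """Area of the union of axis-aligned squares of half-size h2
    centred on each point (coordinate compression, not a segment tree)."""
    pts = np.asarray(points, dtype=float)
    x0, x1 = pts[:, 0] - h2, pts[:, 0] + h2
    y0, y1 = pts[:, 1] - h2, pts[:, 1] + h2

    # all distinct square edges define an irregular grid of cells
    xs = np.unique(np.concatenate([x0, x1]))
    ys = np.unique(np.concatenate([y0, y1]))
    covered = np.zeros((len(xs) - 1, len(ys) - 1), dtype=bool)

    # mark every grid cell that lies inside at least one square
    for a, b, c, d in zip(x0, x1, y0, y1):
        i0, i1 = np.searchsorted(xs, [a, b])
        j0, j1 = np.searchsorted(ys, [c, d])
        covered[i0:i1, j0:j1] = True

    # sum the areas of the covered cells
    cell_areas = np.outer(np.diff(xs), np.diff(ys))
    return float(cell_areas[covered].sum())

# two unit squares overlapping by half
print(union_area_of_squares([(0.0, 0.0), (0.5, 0.0)], h2=0.5))
```

The grid has up to (2n-1) x (2n-1) cells for n points, so for large clouds the segment-tree approach from the linked question, or the histogram approach, is the better fit.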
Given a 2D array of points, the area covered by the points can be calculated from the return value of the hist2d() function of the matplotlib module (matplotlib.pyplot.hist2d()), the same function that is used to show the scatterplot.
The 'trick' is to set the cmin parameter of the function to 1 ( cmin=1 ) and then count the numpy.nan values in the array returned by the function, relating that count to the total number of array values.
In other words, everything needed to calculate the area is already available once the scatterplot is created, because the histogram function returns all the necessary data.
Below is a ready-to-use function for the area calculation, along with a demonstration of its usage:
def area_of_points(points, grid_size = [1000, 1000]):
    """
    Returns the area covered by N 2D-points provided in a 'points' array
        points = [ (x1,y1), (x2,y2), ... , (xN, yN) ]
    'grid_size' gives the number of grid cells in x and y direction the
    'points' bounding box is divided into for calculation of the area.
    Larger 'grid_size' values mean smaller grid cells, higher precision
    of the area calculation and longer runtime.
    area_of_points() requires installed matplotlib module. """
    import matplotlib.pyplot as plt
    import numpy as np
    pts_x = [x for x,y in points]
    pts_y = [y for x,y in points]
    pts_bb_area = (max(pts_x)-min(pts_x))*(max(pts_y)-min(pts_y))
    h2D,_,_,_ = plt.hist2d( pts_x, pts_y, bins = grid_size, cmin=1)
    numberOfWhiteBins = np.count_nonzero(np.isnan(h2D))
    numberOfAll2Dbins = h2D.shape[0]*h2D.shape[1]
    areaFactor = 1.0 - numberOfWhiteBins/numberOfAll2Dbins
    pts_pts_area = areaFactor * pts_bb_area
    print(f'Areas: b-box = {pts_bb_area:8.4f}, points = {pts_pts_area:8.4f}')
    plt.show()
    return pts_pts_area
#:def area_of_points(points, grid_size = [1000, 1000])
import numpy as np
np.random.seed(12345)
x = np.random.normal(size=100000)
y = x + np.random.normal(size=100000)
pts = [[xi,yi] for xi,yi in zip(x,y)]
print(area_of_points(pts))
# ^-- prints: Areas: b-box = 114.5797, points = 7.8001
# ^-- prints: 7.800126875291629
The above code creates following scatterplot:
Notice that the printed output Areas: b-box = 114.5797, points = 7.8001 and the area value 7.800126875291629 returned by the function give the area in the units in which the x,y coordinates in the array of points are specified.
Instead of using a function, once you know the trick you can play around with the parameters of the scatterplot and calculate the area of what can be seen in it.
Below is code that changes the displayed scatterplot using the same underlying point data:
import numpy as np
np.random.seed(12345)
x = np.random.normal(size=100000)
y = x + np.random.normal(size=100000)
pts = [[xi,yi] for xi,yi in zip(x,y)]
pts_values_example = \
[[0.53005, 2.79209],
[0.73751, 0.18978],
... ,
[-0.6633, -2.0404],
[1.51470, 0.86644]]
# ---
pts_x = [x for x,y in pts]
pts_y = [y for x,y in pts]
pts_bb_area = (max(pts_x)-min(pts_x))*(max(pts_y)-min(pts_y))
# ---
import matplotlib.pyplot as plt
bins = [320, 300] # resolution of the grid (for the scatter plot)
# ^-- resolution of precision for the calculation of area
pltRetVal = plt.hist2d( pts_x, pts_y, bins = bins, cmin=1, cmax=15 )
plt.colorbar() # display the colorbar (for a 2d density histogram)
plt.show()
# ---
h2D, xedges1D, yedges1D, h2DhistogramObject = pltRetVal
numberOfWhiteBins = np.count_nonzero(np.isnan(h2D))
numberOfAll2Dbins = (len(xedges1D)-1)*(len(yedges1D)-1)
areaFactor = 1.0 - numberOfWhiteBins/numberOfAll2Dbins
area = areaFactor * pts_bb_area
print(f'Areas: b-box = {pts_bb_area:8.4f}, points = {area:8.4f}')
# prints "Areas: b-box = 114.5797, points = 20.7174"
which creates the following scatterplot:
Notice that the calculated area is now larger, because the coarser grid resolution (fewer, larger bins) results in more of the area being colored.
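If no plot is needed at all, the same bin-counting idea can likely be expressed with numpy.histogram2d, which returns plain counts instead of NaN-masked values. This is a minimal sketch under that assumption; the function name area_of_points_np is hypothetical and not part of the answer above:

```python
import numpy as np

def area_of_points_np(points, bins=(1000, 1000)):
    """Estimate the area covered by 2D points as
    (bounding-box area) * (fraction of histogram bins that are occupied)."""
    pts = np.asarray(points, dtype=float)
    counts, xedges, yedges = np.histogram2d(pts[:, 0], pts[:, 1], bins=bins)
    # the outer bin edges span exactly the bounding box of the points
    bb_area = (xedges[-1] - xedges[0]) * (yedges[-1] - yedges[0])
    occupied_fraction = np.count_nonzero(counts) / counts.size
    return bb_area * occupied_fraction

# four corners of a unit square, 2 x 2 bins: every bin holds one point
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(area_of_points_np(corners, bins=(2, 2)))
```

As with hist2d(), the bins argument sets the resolution: coarser bins make each point "cover" more area.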

How to calculate in which site of a Voronoi diagram a new point is?

I wrote a small script for showing voronoi diagram of M points from this tutorial. I use scipy.spatial.
I want to give a new point in the plane and determine which site (cell) of the Voronoi diagram it falls in. Is it possible?
This is my code:
import random
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d
N = 70
M = 10
Matrix = [(random.random()*100,random.random()*100) for x in range(M)]
points = np.array(Matrix)
vor = Voronoi(points)
print(vor.ridge_vertices)
voronoi_plot_2d(vor)
plt.show()
By the concept of Voronoi diagram, the cell that new point P belongs to is generated by the closest point to P among the original points. Finding this point is straightforward minimization of distance:
point_index = np.argmin(np.sum((points - new_point)**2, axis=1))
However, you want to find the region. And the regions in vor.regions are not in the same order as vor.points, unfortunately (I don't really understand why since there should be a region for each point).
So I used the following approach:
Find all ridges around the point I want, using vor.ridge_points
Take all of the ridge vertices from these ridges, as a set
Look for the (unique) region with the same set of vertices.
Result:
M = 15
points = np.random.uniform(0, 100, size=(M, 2))
vor = Voronoi(points)
voronoi_plot_2d(vor)
new_point = [50, 50]
plt.plot(new_point[0], new_point[1], 'ro')
point_index = np.argmin(np.sum((points - new_point)**2, axis=1))
ridges = np.where(vor.ridge_points == point_index)[0]
vertex_set = set(np.array(vor.ridge_vertices)[ridges, :].ravel())
region = [x for x in vor.regions if set(x) == vertex_set][0]
polygon = vor.vertices[region]
plt.fill(*zip(*polygon), color='yellow')
plt.show()
Here is a demo:
Note that the coloring of the region will be incorrect if it is unbounded; this is a flaw of the simple-minded coloring approach, not of the region-finding algorithm. See Colorize Voronoi Diagram for the correct way to color unbounded regions.
Aside: I used NumPy to generate random numbers, which is simpler than what you did.
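As a possible shortcut, scipy.spatial.Voronoi also exposes a point_region attribute that maps each input point to the index of its region in vor.regions, so the ridge lookup may not be needed. A minimal sketch, assuming that attribute is available in your SciPy version; the helper name find_region is hypothetical, and a cKDTree is swapped in for the argmin search:

```python
import numpy as np
from scipy.spatial import Voronoi, cKDTree

rng = np.random.default_rng(0)
points = rng.uniform(0, 100, size=(15, 2))
vor = Voronoi(points)
tree = cKDTree(points)  # fast nearest-site lookup

def find_region(new_point):
    """Return (index of nearest site, list of vertex indices of its region)."""
    _, point_index = tree.query(new_point)
    region_index = vor.point_region[point_index]
    return int(point_index), vor.regions[region_index]

point_index, region = find_region([50, 50])
print(point_index, region)
```

A -1 in the returned region marks a vertex at infinity, i.e. an unbounded cell, which needs the same special handling when colouring as described above.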

speedup geolocation algorithm in python

I have a set of 100k geo locations (lat/lon) and a hexagonal grid (4k polygons). My goal is to calculate the total number of points which are located within each polygon.
My current algorithm uses 2 for loops to loop over all geo points and all polygons, which is really slow if I increase the number of polygons... How would you speed up the algorithm? I have uploaded a minimal example which creates 100k random geo points and uses 561 cells in the grid...
I also saw that reading the geo json file (with 4k polygons) takes some time, maybe I should export the polygons into a csv?
hexagon_grid.geojson file:
https://gist.github.com/Arnold1/9e41454e6eea910a4f6cd68ff1901db1
minimal python example:
https://gist.github.com/Arnold1/ee37a2e4b2dfbfdca9bfae7c7c3a3755
You don't need to explicitly test each hexagon to see whether a given point is located inside it.
Let's assume, for the moment, that all of your points fall somewhere within the bounds of your hexagonal grid. Because your hexagons form a regular lattice, you only really need to know which of the hexagon centers is closest to each point.
This can be computed very efficiently using a scipy.spatial.cKDTree:
import numpy as np
from scipy.spatial import cKDTree
import json
with open('/tmp/grid.geojson', 'r') as f:
    data = json.load(f)
verts = []
centroids = []
for hexagon in data['features']:
    # a (7, 2) array of xy coordinates specifying the vertices of the hexagon.
    # we ignore the last vertex since it's equal to the first
    xy = np.array(hexagon['geometry']['coordinates'][0][:6])
    verts.append(xy)
    # compute the centroid by taking the average of the vertex coordinates
    centroids.append(xy.mean(0))
verts = np.array(verts)
centroids = np.array(centroids)
# construct a k-D tree from the centroid coordinates of the hexagons
tree = cKDTree(centroids)
# generate 10000 normally distributed xy coordinates
sigma = 0.5 * centroids.std(0, keepdims=True)
mu = centroids.mean(0, keepdims=True)
gen = np.random.RandomState(0)
xy = (gen.randn(10000, 2) * sigma) + mu
# query the k-D tree to find which hexagon centroid is nearest to each point
distance, idx = tree.query(xy, 1)
# count the number of points that are closest to each hexagon centroid
counts = np.bincount(idx, minlength=centroids.shape[0])
Plotting the output:
from matplotlib import pyplot as plt
fig, ax = plt.subplots(1, 1, subplot_kw={'aspect': 'equal'})
# note: ax.hold(True) was removed in Matplotlib 3.0; axes now always "hold"
ax.scatter(xy[:, 0], xy[:, 1], 10, c='b', alpha=0.25, edgecolors='none')
ax.scatter(centroids[:, 0], centroids[:, 1], marker='h', s=(counts + 5),
           c=counts, cmap='Reds')
ax.margins(0.01)
plt.show()
I can think of several different ways you could handle points that fall outside your grid depending on how much accuracy you need:
You could exclude points that fall outside the outer bounding rectangle of your hexagon vertices (i.e. x < xmin, x > xmax etc.). However, this will fail to exclude points that fall within the 'gaps' along the edges of your grid.
Another straightforward option would be to set a cut-off on distance according to the spacing of your hexagon centers, which is equivalent to using a circular approximation for your outer hexagons.
If accuracy is crucial then you could define a matplotlib.path.Path corresponding to the outer vertices of your hexagonal grid, then use its .contains_points() method to test whether your points are contained within it. Compared to the other two methods, this would probably be slower and more fiddly to code.
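The distance cut-off option can be sketched with the distance_upper_bound argument of cKDTree.query(). This is a minimal illustration under assumptions: the three centroids and the unit spacing are made-up stand-ins for the real grid, and the cut-off is taken as the circumradius of a regular hexagon (spacing / sqrt(3)):

```python
import numpy as np
from scipy.spatial import cKDTree

# made-up stand-in for the hexagon centres (unit spacing between neighbours)
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.866]])
tree = cKDTree(centroids)

spacing = 1.0
cutoff = spacing / np.sqrt(3)  # circumradius of a hexagon with this spacing

xy = np.array([[0.1, 0.0], [0.9, 0.1], [5.0, 5.0]])  # last point is far outside

# points with no centroid within `cutoff` get distance inf and idx == len(centroids)
distance, idx = tree.query(xy, 1, distance_upper_bound=cutoff)
inside = np.isfinite(distance)

# count only the points that were matched to a hexagon
counts = np.bincount(idx[inside], minlength=centroids.shape[0])
print(counts)
```

Using the inradius (spacing / 2) instead would make the test stricter and drop points near the hexagon corners.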

Take data from a circle in python

I am looking into how the intensity of a ring changes depending on angle. Here is an example of an image:
What I would like to do is take a circle of values from within the center of that doughnut and plot them vs angle. What I'm currently doing is using scipy.ndimage.interpolation.rotate and taking slices radially through the ring, and extracting the maximum of the two peaks and plotting those vs angle.
crop = np.ones((width,width)) #this is my image
slices = np.arange(0,width,1)
stack = np.zeros((2*width,len(slices)))
angles = np.linspace(0,2*np.pi,len(crop2))
for j in range(len(slices2)): # take slices
    stack[:,j] = rotate(crop,slices[j],reshape=False)[:,width]
However I don't think this is doing what I'm actually looking for. I'm mostly struggling with how to extract the data I want. I have also tried applying a mask which looks like this;
to the image, but then I don't know how to get the values within that mask in the correct order (ie. in order of increasing angle 0 - 2pi)
Any other ideas would be of great help!
I made a different input image to help verifying correctness:
import numpy as np
import scipy as sp
import scipy.interpolate
import matplotlib.pyplot as plt
# Mock up an image.
W = 100
x = np.arange(W)
y = np.arange(W)
xx,yy = np.meshgrid(x,y)
image = xx//5*5 + yy//5*5
image = image / np.max(image) # scale into [0,1]
plt.imshow(image, interpolation='nearest', cmap='gray')
plt.show()
To sample values from circular paths in the image, we first build an interpolator because we want to access arbitrary locations. We also vectorize it to be faster.
Then, we generate the coordinates of N points on the circle's circumference using the parametric definition of the circle x(t) = sin(t), y(t) = cos(t).
N should be at least twice the circumference (Nyquist–Shannon sampling theorem).
# note: interp2d was removed in SciPy 1.14; on newer versions use
# scipy.interpolate.RegularGridInterpolator instead
interp = sp.interpolate.interp2d(x, y, image)
vinterp = np.vectorize(interp)
for r in (15, 30, 45): # radii for circles around image's center
    xcenter = len(x)/2
    ycenter = len(y)/2
    arclen = 2*np.pi*r
    # the number of samples must be an integer
    angle = np.linspace(0, 2*np.pi, int(arclen*2), endpoint=False)
    value = vinterp(xcenter + r*np.sin(angle),
                    ycenter + r*np.cos(angle))
    plt.plot(angle, value, label='r={}'.format(r))
plt.legend()
plt.show()
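On SciPy versions where interp2d is no longer available, the same circular sampling can probably be done with RegularGridInterpolator. This is a sketch under that assumption, not the original answer's code; the helper name sample_circle is hypothetical, and the mock image is rebuilt here so the block is self-contained:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# same mock image as above
W = 100
x = np.arange(W)
y = np.arange(W)
xx, yy = np.meshgrid(x, y)
image = (xx//5*5 + yy//5*5).astype(float)
image /= image.max()  # scale into [0, 1]

# the grid axes follow the array layout: first axis = rows (y), second = columns (x)
interp = RegularGridInterpolator((y, x), image)

def sample_circle(r, xcenter=W/2, ycenter=W/2):
    """Sample the image along a circle of radius r, ordered by angle."""
    n = int(np.ceil(2 * 2*np.pi*r))  # about two samples per pixel of arc length
    angle = np.linspace(0, 2*np.pi, n, endpoint=False)
    xs = xcenter + r*np.sin(angle)
    ys = ycenter + r*np.cos(angle)
    # query points are given as (y, x) pairs to match the grid axes
    return angle, interp(np.column_stack([ys, xs]))

angle, value = sample_circle(30)
print(len(angle), value[:3])
```

By default RegularGridInterpolator raises an error for points outside the grid, so the circle has to stay inside the image.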

Extract coordinates enclosed by a matplotlib patch.

I have created an ellipse using matplotlib.patches.ellipse as shown below:
patch = mpatches.Ellipse(center, major_ax, minor_ax, angle_deg, fc='none', ls='solid', ec='g', lw='3.')
What I want is a list of all the integer coordinates enclosed inside this patch.
I.e. If I was to plot this ellipse along with every integer point on the same grid, how many of those points are enclosed in the ellipse?
I have tried seeing if I can extract the equation of the ellipse so I can loop through each point and see whether it falls within the line but I can't seem to find an obvious way to do this, it becomes more complicated as the major axis of the ellipse can be orientated at any angle. The information to do this must be stored in patches somewhere, but I can't seem to find it.
Any advice on this would be much appreciated.
Ellipse objects have a method contains_point which returns True if the point is inside the ellipse and False otherwise.
Stealing from @DrV's answer:
import matplotlib.pyplot as plt
import matplotlib.patches
import numpy as np
# create an ellipse
el = matplotlib.patches.Ellipse((50,-23), 10, 13.7, angle=30, facecolor=(1,0,0,.2), edgecolor='none')
# calculate the x and y points possibly within the ellipse
y_int = np.arange(-30, -15)
x_int = np.arange(40, 60)
# create a list of possible coordinates
g = np.meshgrid(x_int, y_int)
coords = list(zip(*(c.flat for c in g)))
# create the list of valid coordinates (from untransformed)
ellipsepoints = np.vstack([p for p in coords if el.contains_point(p, radius=0)])
# just to see if this works
fig = plt.figure()
ax = fig.add_subplot(111)
ax.add_artist(el)
ep = np.array(ellipsepoints)
ax.plot(ellipsepoints[:,0], ellipsepoints[:,1], 'ko')
plt.show()
This will give you the result as below:
If you really want to use the methods offered by matplotlib, then:
import matplotlib.pyplot as plt
import matplotlib.patches
import numpy as np
# create an ellipse
el = matplotlib.patches.Ellipse((50,-23), 10, 13.7, angle=30, facecolor=(1,0,0,.2), edgecolor='none')
# find the bounding box of the ellipse
bb = el.get_window_extent()
# calculate the x and y points possibly within the ellipse
x_int = np.arange(np.ceil(bb.x0), np.floor(bb.x1) + 1, dtype='int')
y_int = np.arange(np.ceil(bb.y0), np.floor(bb.y1) + 1, dtype='int')
# create a list of possible coordinates
g = np.meshgrid(x_int, y_int)
coords = np.array(list(zip(*(c.flat for c in g))))  # list() is needed on Python 3, where zip is lazy
# create a list of transformed points (transformed so that the ellipse is a unit circle)
transcoords = el.get_transform().inverted().transform(coords)
# find the transformed coordinates which are within a unit circle
validcoords = transcoords[:,0]**2 + transcoords[:,1]**2 < 1.0
# create the list of valid coordinates (from untransformed)
ellipsepoints = coords[validcoords]
# just to see if this works
fig = plt.figure()
ax = fig.add_subplot(111)
ax.add_artist(el)
ep = np.array(ellipsepoints)
ax.plot(ellipsepoints[:,0], ellipsepoints[:,1], 'ko')
Seems to work:
(Zooming in reveals that even the points hanging on the edge are inside.)
The point here is that matplotlib handles ellipses as transformed circles (translate, rotate, scale, anything affine). If the transform is applied in reverse, the result is a unit circle at origin, and it is very simple to check if a point is within that.
Just a word of warning: The get_window_extent may not be extremely reliable, as it seems to use the spline approximation of a circle. Also, see tcaswell's comment on the renderer-dependency.
In order to find a more reliable bounding box, you may:
create a horizontal and vertical vector into the plot coordinates (their position is not important, ([0,0],[1,0]) and ([0,0], [0,1]) will do)
transform these vectors into the ellipse coordinates (the get_transform, etc.)
find in the ellipse coordinate system (i.e. the system where the ellipse is a unit circle around the origin) the four tangents of the circle which are parallel to these two vectors
find the intersection points of the vectors (4 intersections, but 2 diagonal will be enough)
transform the intersection points back to the plot coordinates
This will give an accurate (but of course limited by the numerical precision) square bounding box.
However, you may use a simple approximation:
all possible points are within a circle whose center is the same as that of the ellipse and whose diameter is the same as that of the major axis of the ellipse
In other words, all possible points are within a square bounding box which is between x0+-m/2, y0+-m/2, where (x0, y0) is the center of the ellipse and m the major axis.
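The simple approximation can be combined with the ellipse equation itself, avoiding the renderer-dependent bounding box entirely. This is a minimal sketch that does not use matplotlib; the function name integer_points_in_ellipse is hypothetical, and it assumes the same conventions as matplotlib's Ellipse (width and height are full axis lengths, angle is counter-clockwise in degrees):

```python
import numpy as np

def integer_points_in_ellipse(center, width, height, angle_deg):
    """Integer grid points inside (or on) a rotated ellipse."""
    x0, y0 = center
    m = max(width, height)  # the major axis bounds the search square

    # candidate integer points within x0 +- m/2, y0 +- m/2
    xs = np.arange(np.ceil(x0 - m/2), np.floor(x0 + m/2) + 1)
    ys = np.arange(np.ceil(y0 - m/2), np.floor(y0 + m/2) + 1)
    gx, gy = np.meshgrid(xs, ys)

    # rotate the offsets into the ellipse's own axes
    t = np.deg2rad(angle_deg)
    dx, dy = gx - x0, gy - y0
    u = dx*np.cos(t) + dy*np.sin(t)
    v = -dx*np.sin(t) + dy*np.cos(t)

    # standard ellipse equation with semi-axes width/2 and height/2
    inside = (u / (width/2))**2 + (v / (height/2))**2 <= 1.0
    return np.column_stack([gx[inside], gy[inside]]).astype(int)

print(len(integer_points_in_ellipse((50, -23), 10, 13.7, 30)))
```

Points exactly on the boundary are sensitive to floating-point rounding once the ellipse is rotated, so treat the <= comparison as approximate there.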
I'd like to offer another solution that uses the Path object's contains_points() method instead of contains_point():
First get the coordinates of the ellipse and make it into a Path object:
elpath=Path(el.get_verts())
(NOTE that el.get_paths() won't work for some reason.)
Then call the path's contains_points():
validcoords=elpath.contains_points(coords)
Below I'm comparing @tacaswell's solution (method 1), @DrV's (method 2) and my own (method 3) (I've enlarged the ellipse by ~5 times):
import numpy
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
from matplotlib.path import Path
import time
#----------------Create an ellipse----------------
el=Ellipse((50,-23),50,70,angle=30,facecolor=(1,0,0,.2), edgecolor='none')
#---------------------Method 1---------------------
t1=time.time()
for ii in range(50):
    y=numpy.arange(-100,50)
    x=numpy.arange(-30,130)
    g=numpy.meshgrid(x,y)
    # list() is needed on Python 3, where zip is lazy
    coords=numpy.array(list(zip(*(c.flat for c in g))))
    ellipsepoints = numpy.vstack([p for p in coords if el.contains_point(p, radius=0)])
t2=time.time()
print('time of method 1',t2-t1)
#---------------------Method 2---------------------
t2=time.time()
for ii in range(50):
    y=numpy.arange(-100,50)
    x=numpy.arange(-30,130)
    g=numpy.meshgrid(x,y)
    coords=numpy.array(list(zip(*(c.flat for c in g))))
    invtrans=el.get_transform().inverted()
    transcoords=invtrans.transform(coords)
    validcoords=transcoords[:,0]**2+transcoords[:,1]**2<=1.0
    ellipsepoints=coords[validcoords]
t3=time.time()
print('time of method 2',t3-t2)
#---------------------Method 3---------------------
t3=time.time()
for ii in range(50):
    y=numpy.arange(-100,50)
    x=numpy.arange(-30,130)
    g=numpy.meshgrid(x,y)
    coords=numpy.array(list(zip(*(c.flat for c in g))))
    #------Create a path from ellipse's vertices------
    elpath=Path(el.get_verts())
    # call contains_points()
    validcoords=elpath.contains_points(coords)
    ellipsepoints=coords[validcoords]
t4=time.time()
print('time of method 3',t4-t3)
#---------------------Plot it ---------------------
fig,ax=plt.subplots()
ax.add_artist(el)
ep=numpy.array(ellipsepoints)
ax.plot(ellipsepoints[:,0],ellipsepoints[:,1],'ko')
plt.show(block=False)
I got these execution times:
time of method 1 62.2502269745
time of method 2 0.488734006882
time of method 3 0.588987112045
So the contains_point() approach is way slower. The coordinate-transformation method is faster than mine, but the Path-based method would still work when you get irregularly shaped contours/polygons.
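To illustrate the point about irregular shapes, here is a small sketch applying Path.contains_points() to a concave polygon. The polygon and the half-integer test grid are made up for this example (half-integers are used so that no test point lies exactly on an edge, where the result is ambiguous):

```python
import numpy as np
from matplotlib.path import Path

# a concave "M"-shaped polygon; the first vertex is repeated to close it
verts = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4), (0, 0)]
polygon = Path(verts, closed=True)

# candidate points at half-integer positions inside the bounding box
xs, ys = np.meshgrid(np.arange(0.5, 4.0, 1.0), np.arange(0.5, 4.0, 1.0))
coords = np.column_stack([xs.ravel(), ys.ravel()])

# one vectorised call tests all candidates at once
inside = polygon.contains_points(coords)
points_inside = coords[inside]
print(points_inside)
```

The coordinate-transformation trick has no equivalent for a shape like this, since it is not an affine image of a circle.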
Finally the result plot:
