Related
I have a list of (x,y)-coordinates that represent a line skeleton.
The list is obtained directly from a binary image:
import numpy as np
list=np.where(img_skeleton>0)
Now the points in the list are sorted according to their position in the image along one of the axes.
I would like to sort the list such that the order represents a smooth path along the line. (This is currently not the case where the line curves back).
Subsequently, I want to fit a spline to these points.
A similar problem has been described and solved using arcPy here. Is there a convenient way to achieve this using python, numpy, scipy, openCV (or another library?)
below is an example image. it results in a list of 59 (x,y)-coordinates.
when I send the list to scipy's spline fitting routine, I am running into a problem because the points aren't 'ordered' on the line:
I apologize for the long answer in advance :P (the problem is not that simple).
Lets start by rewording the problem. Finding a line that connects all the points, can be reformulated as a shortest path problem in a graph, where (1) the graph nodes are the points in the space, (2) each node is connected to its 2 nearest neighbors, and (3) the shortest path passes through each of the nodes only once. That last constrain is a very important (and quite hard one to optimize). Essentially, the problem is to find a permutation of length N, where the permutation refers to the order of each of the nodes (N is the total number of nodes) in the path.
Finding all the possible permutations and evaluating their cost is too expensive (there are N! permutations if I'm not wrong, which is too big for problems). Bellow I propose an approach that finds the N best permutations (the optimal permutation for each of the N points) and then find the permutation (from those N) that minimizes the error/cost.
1. Create a random problem with unordered points
Now, lets start to create a sample problem:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)
plt.plot(x, y)
plt.show()
And here, the unsorted version of the points [x, y] to simulate a random points in space connected in a line:
idx = np.random.permutation(x.size)
x = x[idx]
y = y[idx]
plt.plot(x, y)
plt.show()
The problem is then to order those points to recover their original order so that the line is plotted properly.
2. Create 2-NN graph between nodes
We can first rearrange the points in a [N, 2] array:
points = np.c_[x, y]
Then, we can start by creating a nearest neighbour graph to connect each of the nodes to its 2 nearest neighbors:
from sklearn.neighbors import NearestNeighbors
clf = NearestNeighbors(2).fit(points)
G = clf.kneighbors_graph()
G is a sparse N x N matrix, where each row represents a node, and the non-zero elements of the columns the euclidean distance to those points.
We can then use networkx to construct a graph from this sparse matrix:
import networkx as nx
T = nx.from_scipy_sparse_matrix(G)
3. Find shortest path from source
And, here begins the magic: we can extract the paths using dfs_preorder_nodes, which will essentially create a path through all the nodes (passing through each of them exactly once) given a starting node (if not given, the 0 node will be selected).
order = list(nx.dfs_preorder_nodes(T, 0))
xx = x[order]
yy = y[order]
plt.plot(xx, yy)
plt.show()
Well, is not too bad, but we can notice that the reconstruction is not optimal. This is because the point 0 in the unordered list lays in the middle of the line, that is way it first goes in one direction, and then comes back and finishes in the other direction.
4. Find the path with smallest cost from all sources
So, in order to obtain the optimal order, we can just get the best order for all the nodes:
paths = [list(nx.dfs_preorder_nodes(T, i)) for i in range(len(points))]
Now that we have the optimal path starting from each of the N = 100 nodes, we can discard them and find the one that minimizes the distances between the connections (optimization problem):
mindist = np.inf
minidx = 0
for i in range(len(points)):
p = paths[i] # order of nodes
ordered = points[p] # ordered nodes
# find cost of that order by the sum of euclidean distances between points (i) and (i+1)
cost = (((ordered[:-1] - ordered[1:])**2).sum(1)).sum()
if cost < mindist:
mindist = cost
minidx = i
The points are ordered for each of the optimal paths, and then a cost is computed (by calculating the euclidean distance between all pairs of points i and i+1). If the path starts at the start or end point, it will have the smallest cost as all the nodes will be consecutive. On the other hand, if the path starts at a node that lies in the middle of the line, the cost will be very high at some point, as it will need to travel from the end (or beginning) of the line to the initial position to explore the other direction. The path that minimizes that cost, is the path starting in an optimal point.
opt_order = paths[minidx]
Now, we can reconstruct the order properly:
xx = x[opt_order]
yy = y[opt_order]
plt.plot(xx, yy)
plt.show()
One possible solution is to use a nearest neighbours approach, possible by using a KDTree. Scikit-learn has an nice interface. This can then be used to build a graph representation using networkx. This will only really work if the line to be drawn should go through the nearest neighbours:
from sklearn.neighbors import KDTree
import numpy as np
import networkx as nx
G = nx.Graph() # A graph to hold the nearest neighbours
X = [(0, 1), (1, 1), (3, 2), (5, 4)] # Some list of points in 2D
tree = KDTree(X, leaf_size=2, metric='euclidean') # Create a distance tree
# Now loop over your points and find the two nearest neighbours
# If the first and last points are also the start and end points of the line you can use X[1:-1]
for p in X
dist, ind = tree.query(p, k=3)
print ind
# ind Indexes represent nodes on a graph
# Two nearest points are at indexes 1 and 2.
# Use these to form edges on graph
# p is the current point in the list
G.add_node(p)
n1, l1 = X[ind[0][1]], dist[0][1] # The next nearest point
n2, l2 = X[ind[0][2]], dist[0][2] # The following nearest point
G.add_edge(p, n1)
G.add_edge(p, n2)
print G.edges() # A list of all the connections between points
print nx.shortest_path(G, source=(0,1), target=(5,4))
>>> [(0, 1), (1, 1), (3, 2), (5, 4)] # A list of ordered points
Update: If the start and end points are unknown and your data is reasonably well separated, you can find the ends by looking for cliques in the graph. The start and end points will form a clique. If the longest edge is removed from the clique it will create a free end in the graph which can be used as a start and end point. For example, the start and end points in this list appear in the middle:
X = [(0, 1), (0, 0), (2, 1), (3, 2), (9, 4), (5, 4)]
After building the graph, now its a case of removing the longest edge from the cliques to find the free ends of the graph:
def find_longest_edge(l):
e1 = G[l[0]][l[1]]['weight']
e2 = G[l[0]][l[2]]['weight']
e3 = G[l[1]][l[2]]['weight']
if e2 < e1 > e3:
return (l[0], l[1])
elif e1 < e2 > e3:
return (l[0], l[2])
elif e1 < e3 > e2:
return (l[1], l[2])
end_cliques = [i for i in list(nx.find_cliques(G)) if len(i) == 3]
edge_lengths = [find_longest_edge(i) for i in end_cliques]
G.remove_edges_from(edge_lengths)
edges = G.edges()
start_end = [n for n,nbrs in G.adjacency_iter() if len(nbrs.keys()) == 1]
print nx.shortest_path(G, source=start_end[0], target=start_end[1])
>>> [(0, 0), (0, 1), (2, 1), (3, 2), (5, 4), (9, 4)] # The correct path
I had the exact same problem. If you have two arrays of scattered x and y values that are not too curvy, then you can transform the points into PCA space, sort them in PCA space, and then transform them back. (I've also added in some bonus smoothing functionality).
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
def XYclean(x,y):
xy = np.concatenate((x.reshape(-1,1), y.reshape(-1,1)), axis=1)
# make PCA object
pca = PCA(2)
# fit on data
pca.fit(xy)
#transform into pca space
xypca = pca.transform(xy)
newx = xypca[:,0]
newy = xypca[:,1]
#sort
indexSort = np.argsort(x)
newx = newx[indexSort]
newy = newy[indexSort]
#add some more points (optional)
f = interpolate.interp1d(newx, newy, kind='linear')
newX=np.linspace(np.min(newx), np.max(newx), 100)
newY = f(newX)
#smooth with a filter (optional)
window = 43
newY = savgol_filter(newY, window, 2)
#return back to old coordinates
xyclean = pca.inverse_transform(np.concatenate((newX.reshape(-1,1), newY.reshape(-1,1)), axis=1) )
xc=xyclean[:,0]
yc = xyclean[:,1]
return xc, yc
I agree with Imanol_Luengo Imanol Luengo's solution, but if you know the index of the first point, then there is a considerably easier solution that uses only NumPy:
def order_points(points, ind):
points_new = [ points.pop(ind) ] # initialize a new list of points with the known first point
pcurr = points_new[-1] # initialize the current point (as the known point)
while len(points)>0:
d = np.linalg.norm(np.array(points) - np.array(pcurr), axis=1) # distances between pcurr and all other remaining points
ind = d.argmin() # index of the closest point
points_new.append( points.pop(ind) ) # append the closest point to points_new
pcurr = points_new[-1] # update the current point
return points_new
This approach appears to work well with the sine curve example, especially because it is easy to define the first point as either the leftmost or rightmost point.
For the img_skeleton data cited in the question, it would be similarly easy to algorithmically obtain the first point, for example as the topmost point.
# create sine curve:
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)
# shuffle the order of the x and y coordinates:
idx = np.random.permutation(x.size)
xs,ys = x[idx], y[idx] # shuffled points
# find the leftmost point:
ind = xs.argmin()
# assemble the x and y coordinates into a list of (x,y) tuples:
points = [(xx,yy) for xx,yy in zip(xs,ys)]
# order the points based on the known first point:
points_new = order_points(points, ind)
# plot:
fig,ax = plt.subplots(1, 2, figsize=(10,4))
xn,yn = np.array(points_new).T
ax[0].plot(xs, ys) # original (shuffled) points
ax[1].plot(xn, yn) # new (ordered) points
ax[0].set_title('Original')
ax[1].set_title('Ordered')
plt.tight_layout()
plt.show()
I am working on a similar problem, but it has an important constraint (much like the example given by the OP) which is that each pixel has either one or two neighboring pixel, in the 8-connected sense. With this constraint, there is a very simple solution.
def sort_to_form_line(unsorted_list):
"""
Given a list of neighboring points which forms a line, but in random order,
sort them to the correct order.
IMPORTANT: Each point must be a neighbor (8-point sense)
to a least one other point!
"""
sorted_list = [unsorted_list.pop(0)]
while len(unsorted_list) > 0:
i = 0
while i < len(unsorted_list):
if are_neighbours(sorted_list[0], unsorted_list[i]):
#neighbours at front of list
sorted_list.insert(0, unsorted_list.pop(i))
elif are_neighbours(sorted_list[-1], unsorted_list[i]):
#neighbours at rear of list
sorted_list.append(unsorted_list.pop(i))
else:
i = i+1
return sorted_list
def are_neighbours(pt1, pt2):
"""
Check if pt1 and pt2 are neighbours, in the 8-point sense
pt1 and pt2 has integer coordinates
"""
return (np.abs(pt1[0]-pt2[0]) < 2) and (np.abs(pt1[1]-pt2[1]) < 2)
Modifying upon Toddp's answer , you can find end-points of arbitrarily shaped lines using this code and then order the points as Toddp stated, this is much faster than Imanol Luengo's answer, the only constraint is that the line must have only 2 end-points :
def order_points(points):
if isinstance(points,np.ndarray):
assert points.shape[1]==2
points = points.tolist()
exts = get_end_points(points)
assert len(exts) ==2
ind = points.index(exts[0])
points_new = [ points.pop(ind) ] # initialize a new list of points with the known first point
pcurr = points_new[-1] # initialize the current point (as the known point)
while len(points)>0:
d = np.linalg.norm(np.array(points) - np.array(pcurr), axis=1) # distances between pcurr and all other remaining points
ind = d.argmin() # index of the closest point
points_new.append( points.pop(ind) ) # append the closest point to points_new
pcurr = points_new[-1] # update the current point
return points_new
def get_end_points(ptsxy):
#source : https://stackoverflow.com/a/67145008/10998081
if isinstance(ptsxy,list): ptsxy = np.array(ptsxy)
assert ptsxy.shape[1]==2
#translate to (0,0)for faster excution
xx,yy,w,h = cv2.boundingRect(ptsxy)
pts_translated = ptsxy -(xx,yy)
bim = np.zeros((h+1,w+1))
bim[[*np.flip(pts_translated).T]]=255
extremes = []
for p in pts_translated:
x = p[0]
y = p[1]
n = 0
n += bim[y - 1,x]
n += bim[y - 1,x - 1]
n += bim[y - 1,x + 1]
n += bim[y,x - 1]
n += bim[y,x + 1]
n += bim[y + 1,x]
n += bim[y + 1,x - 1]
n += bim[y + 1,x + 1]
n /= 255
if n == 1:
extremes.append(p)
extremes = np.array(extremes)+(xx,yy)
return extremes.tolist()
I need to integrate over the arcs that are resulted from the intersection of a circle with a rectangle and fall inside the rectangle. I can find the intersection points using the shapely package. However, I don't know how to obtain integration intervals. For example, in the below figure my code returns [-2.1562, 2.1562] in radians (with respect to the center of the circle), while it should be able to automatically understand that the integration intervals that falls inside the rectangle are [[2.1562, 3.1415],[-3.1415, -2.1562]] (assuming pi = 3.1415).
Here is another example:
My code returns [-0.45036, -0.29576, 0.29576, 0.45036] and the expected intervals will be [[0.29576, 0.45036], [-0.45036, -0.29576]].
The code should also work for any other location that the circle is located (with any radius), whether its center is outside or inside the rectangle.
Here is my code, written using iPython:
import matplotlib.pyplot as plt
import math
import numpy as np
from shapely.geometry import LineString, MultiPoint
from shapely.geometry import Polygon
from shapely.geometry import Point
# Utilities
def cart2pol(xy, center):
x,y = xy
x_0,y_0 = center
rho = np.sqrt((x-x_0)**2 + (y-y_0)**2)
phi = np.arctan2(y-y_0, x-x_0)
return(rho, phi)
def pol2cart(rho, phi, center):
x_0,y_0 = center
x = rho * np.cos(phi)+x_0
y = rho * np.sin(phi)+y_0
return(x, y)
def distance(A,B):
return math.sqrt((A[0]-B[0])**2+(A[1]-B[1])**2)
#######################
rad = 6
center = (-1,5)
p = Point(center)
c = p.buffer(rad).boundary
A = (10,0)
B = (0,0)
C = (0,10)
D = (10,10)
coords = [Point(A), Point(B), Point(C), Point(D)]
poly = MultiPoint(coords).convex_hull
i=c.intersection(poly)
lines = [LineString([A, D]), LineString([D, C]),
LineString([C, B]), LineString([B, A])]
points = []
for l in lines:
i = c.intersection(l)
if not i.is_empty:
if i.geom_type == 'MultiPoint':
for j in range(len(i.geoms)):
points.append(i.geoms[j].coords[0])
else:
points.append(i.coords[0])
# Repeat the tangential points
for k, point in enumerate(points.copy()):
if abs(distance(center, point)**2 + distance(point, B)**2 - distance(B, center)**2) < 1e-4:
points.insert(k+1,point)
elif abs(distance(center, point)**2 + distance(point, D)**2 -distance(D, center)**2) < 1e-4:
points.insert(k+1,point)
# Sort points in polar coordinates
phis = [cart2pol(point,center)[1] for point in points]
phis.sort()
print(phis)
# Plot the shapes
x,y = c.xy
plt.plot(*c.xy)
for l in lines:
plt.plot(*l.xy, 'b')
plt.gca().set_aspect('equal', adjustable='box')
I tried to sort the intersection points according to their angle in a way that each two adjacent items in the list of intersection points corresponds to an arc. The problem is that there will be a jump in the angles from -pi to pi when rotating along the unit circle. Also I don't know how to find that whether an arc is inside the rectangle or not given its 2 end points.
Dealing with angle ranges is not straightforward.
1) select a non-ambiguous representation range, such as [-π, π) radians.
2) write a function that finds the intersections of the circle with a (h/v) half-plane and returns an angle interval. It the interval straddles the ±π border, split it in two.
3) write a function that finds the intersection between two lists of intervals (this is a modified merging problem).
4) process the four edges and intersect the resulting intervals.
5) possibly merge intervals that straddle the ±π border.
Ray Casting Algorithm
MandelBulb Ray Casting Algorithm Python Example
So, if I understand correctly, the ray casting algorithm requires that an observer be located external to the 3D fractal at which point vectors are drawn from the observer toward a point on the plane normal to the vector and intersecting the origin.
It would seem to me that this would either severely limit the rendered view of the fractal or require stereoscopic 3D reconstruction of the fractal using multiple observer positions (which seems difficult to me). Additionally, no information can be gathered regarding the internal structure of the fractal.
Other Algorithms
Alternatively, Direct Volume Rendering seems intuitive enough however, computationally expensive and potentially inefficient in and of itself. Indirect Volume Rendering using an algorithm such as marching cubes might also employ a bit of a learning curve it seems.
Somewhere in the pdf of the 2nd link it talks about cut plane views in order to see slices of the fractal.
Question:
Why not use cut planes as a rendering method?
1) Using a modified ray tracing algorithm, say we put the observer at point Q at the origin (0, 0, 0).
2) Let us then emit rays from the origin toward the incident plane spanned by y & z point combinations that is slicing the fractal.
3) Calculate the distance to the fractal surface using the algorithm in the 1st link. If the x component of computed distance is within a certain tolerance, dx of the slicing plane, then the y & z coordinates along with the x value of the slicing plane are stored as the x, y, z coordinates. These coordinates are now representative of the surface at that specific slice in x.
4) Let us say that the slicing plane has one degree of freedom in the x direction. By moving the plane in its degree of freedom, we can receive yet another set of x, y, z coordinates for a given slice.
5) The final result is a calculable surface generated by the point cloud created in the previous steps.
6) Additionally, the degree of freedom of the slicing plane can be altered to create an another point cloud which can then be verified against the previous as a means of post-processing.
Please see the image below as a visual aid (the sphere represents the MandelBulb).
Below is my Python code so far, adapted from the first link. I successfully generate the plane of points and am able to get the directions from the origin to the points on the plane. There must be something fundamentally flawed in the distance estimator function because thats where everything breaks down and I get nans for the total distances
def get_plane_points(x, y_res=500, z_res=500, y_min=-10, y_max=10, z_min=-10, z_max=10):
y = np.linspace(y_min, y_max, y_res)
z = np.linspace(z_min, z_max, z_res)
x, y, z = np.meshgrid(x, y, z)
x, y, z = x.reshape(-1), y.reshape(-1) , z.reshape(-1)
P = np.vstack((x, y, z)).T
return P
def get_directions(P):
v = np.array(P - 0)
v = v/np.linalg.norm(v, axis=1)[:, np.newaxis]
return v
#jit
def DistanceEstimator(positions, plane_loc, iterations, degree):
m = positions.shape[0]
x, y, z = np.zeros(m), np.zeros(m), np.zeros(m)
x0, y0, z0 = positions[:, 0], positions[:, 1], positions[:, 2]
dr = np.zeros(m) + 1
r = np.zeros(m)
theta = np.zeros(m)
phi = np.zeros(m)
zr = np.zeros(m)
for _ in range(iterations):
r = np.sqrt(x * x + y * y + z * z)
dx = .01
x_loc = plane_loc
idx = (x < x_loc + dx) & (x > x_loc - dx)
dr[idx] = np.power(r[idx], degree - 1) * degree * dr[idx] + 1.0
theta[idx] = np.arctan2(np.sqrt(x[idx] * x[idx] + y[idx] * y[idx]), z[idx])
phi[idx] = np.arctan2(y[idx], x[idx])
zr[idx] = r[idx] ** degree
theta[idx] = theta[idx] * degree
phi[idx] = phi[idx] * degree
x[idx] = zr[idx] * np.sin(theta[idx]) * np.cos(phi[idx]) + x0[idx]
y[idx] = zr[idx] * np.sin(theta[idx]) * np.sin(phi[idx]) + y0[idx]
z[idx] = zr[idx] * np.cos(theta[idx]) + z0[idx]
return 0.5 * np.log(r) * r / dr
def trace(directions, plane_location, max_steps=50, iterations=50, degree=8):
total_distance = np.zeros(directions.shape[0])
keep_iterations = np.ones_like(total_distance)
steps = np.zeros_like(total_distance)
for _ in range(max_steps):
positions = total_distance[:, np.newaxis] * directions
distance = DistanceEstimator(positions, plane_location, iterations, degree)
total_distance += distance * keep_iterations
steps += keep_iterations
# return 1 - (steps / max_steps) ** power
return total_distance
def run():
plane_location = 2
plane_points = get_plane_points(x=plane_location)
directions = get_directions(plane_points)
distance = trace(directions, plane_location)
return distance
I am eager to hear thoughts on this and what limitations/issues I may encounter. Thanks in advance for the help!
If I am not mistaken, it is not impossible for this algorithm to work. There is inherent potential for problems with any assumptions made about the internal structure of the MandelBulb and what positions an observer is allowed to occupy. That is, if the observer is known to initially be in a zone of convergence then the ray tracing algorithm with return nothing meaningful since the furthest distance that could be measured is 0. This is due to the fact that the current ray tracing algorithm terminates upon first contact with the surface. It is likely this could be altered, however.
Rather than slicing the fractal with plane P, it might make more sense to prevent the termination of the ray upon first contact and instead, terminate based on a distance thats known to exist past the surface of the MandelBulb.
I have code to expand the polygon, it works by multiplying the xs and ys by a factor then re centering the resultant polyon at the center of the original.
I also have code to find the value for the expansion factor, given a point that the polygon needs to reach:
import numpy as np
import itertools as IT
import copy
from shapely.geometry import LineString, Point
def getPolyCenter(points):
"""
http://stackoverflow.com/a/14115494/190597 (mgamba)
"""
area = area_of_polygon(*zip(*points))
result_x = 0
result_y = 0
N = len(points)
points = IT.cycle(points)
x1, y1 = next(points)
for i in range(N):
x0, y0 = x1, y1
x1, y1 = next(points)
cross = (x0 * y1) - (x1 * y0)
result_x += (x0 + x1) * cross
result_y += (y0 + y1) * cross
result_x /= (area * 6.0)
result_y /= (area * 6.0)
return (result_x, result_y)
def expandPoly(points, factor):
points = np.array(points, dtype=np.float64)
expandedPoly = points*factor
expandedPoly -= getPolyCenter(expandedPoly)
expandedPoly += getPolyCenter(points)
return np.array(expandedPoly, dtype=np.int64)
def distanceLine2Point(points, point):
points = np.array(points, dtype=np.float64)
point = np.array(point, dtype=np.float64)
points = LineString(points)
point = Point(point)
return points.distance(point)
def distancePolygon2Point(points, point):
distances = []
for i in range(len(points)):
if i==len(points)-1:
j = 0
else:
j = i+1
line = [points[i], points[j]]
distances.append(distanceLine2Point(line, point))
minDistance = np.min(distances)
#index = np.where(distances==minDistance)[0][0]
return minDistance
"""
Returns the distance from a point to the nearest line of the polygon,
AND the distance from where the normal to the line (to reach the point)
intersets the line to the center of the polygon.
"""
def distancePolygon2PointAndCenter(points, point):
distances = []
for i in range(len(points)):
if i==len(points)-1:
j = 0
else:
j = i+1
line = [points[i], points[j]]
distances.append(distanceLine2Point(line, point))
minDistance = np.min(distances)
i = np.where(distances==minDistance)[0][0]
if i==len(points)-1:
j = 0
else:
j = i+1
line = copy.deepcopy([points[i], points[j]])
centerDistance = distanceLine2Point(line, getPolyCenter(points))
return minDistance, centerDistance
minDistance, centerDistance = distancePolygon2PointAndCenter(points, point)
expandedPoly = expandPoly(points, 1+minDistance/centerDistance)
This code only works when the point is directly opposing one of the polygons lines.
Modify your method distancePolygon2PointAndCenter to instead of
Returns the distance from a point to the nearest line of the polygon
To return the distance from a point to the segment intersected by a ray from the center to the point. This is the line that will intersect the point once the polygon is fully expanded. To get this segment, take both endpoints of each segment of your polygon, and plug them into the equation for the line parallel & intersecting the ray mentioned earlier. That is y = ((centerY-pointY)/(centerX-pointX)) * (x - centerX) + centerY. You want to want to find endpoints where either one of them intersect the line, or the two are on opposite sides of the line.
Then, the only thing left to do is make sure that we pick the segment intersecting the right "side" of the line. To do this, there are a few options. The fail-safe method would be to use the formula cos(theta) = sqrt((centerX**2 + centerY**2)*(pointX**2 + pointY**2)) / (centerX * pointX + centerY * pointY) however, you could use methods such as comparing x and y values, taking the arctan2(), and such to figure out which segment is on the correct "side" of center. You'll just have lots of edge cases to cover. After all this is said and done, your two (unless its not convex, in which case take the segment farthest from you center) endpoints makeup the segment to expand off of.
Determine what is "polygon center" as central point C of expanding. Perhaps it is centroid (or some point with another properties?).
Make a segment from your point P to C. Find intersection point I between PC and polygon edges. If polygon is concave and there are some intersection points, choose the closest one to P.
Calculate coefficient of expanding:
E = Length(PC) / Length(CI)
Calculate new vertex coordinates. For i-th vertex of polygon:
V'[i].X = C.X + (V[i].X - C.X) * E
V'[i].Y = C.Y + (V[i].Y - C.Y) * E
Decide which point you want to reach, then calculate how much % your polygon needs to expand to reach that point and use the shapely.affinity.scale function. For example, in my case I just needed to make the polygon 5% bigger:
region = shapely.affinity.scale(myPolygon,
xfact=1.05, yfact=1.05 )
I have a list of x and y values for two curves, both having weird shapes, and I don't have a function for any of them. I need to do two things:
Plot it and shade the area between the curves like the image below.
Find the total area of this shaded region between the curves.
I'm able to plot and shade the area between those curves with fill_between and fill_betweenx in matplotlib, but I have no idea on how to calculate the exact area between them, specially because I don't have a function for any of those curves.
Any ideas?
I looked everywhere and can't find a simple solution for this. I'm quite desperate, so any help is much appreciated.
Thank you very much!
EDIT: For future reference (in case anyone runs into the same problem), here is how I've solved this: connected the first and last node/point of each curve together, resulting in a big weird-shaped polygon, then used shapely to calculate the polygon's area automatically, which is the exact area between the curves, no matter which way they go or how nonlinear they are. Works like a charm! :)
Here is my code:
from shapely.geometry import Polygon
x_y_curve1 = [(0.121,0.232),(2.898,4.554),(7.865,9.987)] #these are your points for curve 1 (I just put some random numbers)
x_y_curve2 = [(1.221,1.232),(3.898,5.554),(8.865,7.987)] #these are your points for curve 2 (I just put some random numbers)
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in x_y_curve1:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in x_y_curve2[::-1]:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in x_y_curve1[0:1]:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, to it "closes" the polygon
polygon = Polygon(polygon_points)
area = polygon.area
print(area)
EDIT 2: Thank you for the answers. Like Kyle explained, this only works for positive values. If your curves go below 0 (which is not my case, as showed in the example chart), then you would have to work with absolute numbers.
The area calculation is straightforward in blocks where the two curves don't intersect: thats the trapezium as has been pointed out above. If they intersect, then you create two triangles between x[i] and x[i+1], and you should add the area of the two. If you want to do it directly, you should handle the two cases separately. Here's a basic working example to solve your problem. First, I will start with some fake data:
#!/usr/bin/python
import numpy as np
# let us generate fake test data
x = np.arange(10)
y1 = np.random.rand(10) * 20
y2 = np.random.rand(10) * 20
Now, the main code. Based on your plot, looks like you have y1 and y2 defined at the same X points. Then we define,
z = y1-y2
dx = x[1:] - x[:-1]
cross_test = np.sign(z[:-1] * z[1:])
cross_test will be negative whenever the two graphs cross. At these points, we want to calculate the x coordinate of the crossover. For simplicity, I will calculate x coordinates of the intersection of all segments of y. For places where the two curves don't intersect, they will be useless values, and we won't use them anywhere. This just keeps the code easier to understand.
Suppose you have z1 and z2 at x1 and x2, then we are solving for x0 such that z = 0:
# (z2 - z1)/(x2 - x1) = (z0 - z1) / (x0 - x1) = -z1/(x0 - x1)
# x0 = x1 - (x2 - x1) / (z2 - z1) * z1
x_intersect = x[:-1] - dx / (z[1:] - z[:-1]) * z[:-1]
dx_intersect = - dx / (z[1:] - z[:-1]) * z[:-1]
Where the curves don't intersect, area is simply given by:
areas_pos = abs(z[:-1] + z[1:]) * 0.5 * dx # signs of both z are same
Where they intersect, we add areas of both triangles:
areas_neg = 0.5 * dx_intersect * abs(z[:-1]) + 0.5 * (dx - dx_intersect) * abs(z[1:])
Now, the area in each block x[i] to x[i+1] is to be selected, for which I use np.where:
areas = np.where(cross_test < 0, areas_neg, areas_pos)
total_area = np.sum(areas)
That is your desired answer. As has been pointed out above, this will get more complicated if the both the y graphs were defined at different x points. If you want to test this, you can simply plot it (in my test case, y range will be -20 to 20)
negatives = np.where(cross_test < 0)
positives = np.where(cross_test >= 0)
plot(x, y1)
plot(x, y2)
plot(x, z)
plt.vlines(x_intersect[negatives], -20, 20)
Define your two curves as functions f and g that are linear by segment, e.g. between x1 and x2, f(x) = f(x1) + ((x-x1)/(x2-x1))*(f(x2)-f(x1)).
Define h(x)=abs(g(x)-f(x)). Then use scipy.integrate.quad to integrate h.
That way you don't need to bother about the intersections. It will do the "trapeze summing" suggested by ch41rmn automatically.
Your set of data is quite "nice" in the sense that the two sets of data share the same set of x-coordinates. You can therefore calculate the area using a series of trapezoids.
e.g. define the two functions as f(x) and g(x), then, between any two consecutive points in x, you have four points of data:
(x1, f(x1))-->(x2, f(x2))
(x1, g(x1))-->(x2, g(x2))
Then, the area of the trapezoid is
A(x1-->x2) = ( f(x1)-g(x1) + f(x2)-g(x2) ) * (x2-x1)/2 (1)
A complication arises that equation (1) only works for simply-connected regions, i.e. there must not be a cross-over within this region:
|\ |\/|
|_| vs |/\|
The area of the two sides of the intersection must be evaluated separately. You will need to go through your data to find all points of intersections, then insert their coordinates into your list of coordinates. The correct order of x must be maintained. Then, you can loop through your list of simply connected regions and obtain a sum of the area of trapezoids.
EDIT:
For curiosity's sake, if the x-coordinates for the two lists are different, you can instead construct triangles. e.g.
.____.
| / \
| / \
| / \
|/ \
._________.
Overlap between triangles must be avoided, so you will again need to find points of intersections and insert them into your ordered list. The lengths of each side of the triangle can be calculated using Pythagoras' formula, and the area of the triangles can be calculated using Heron's formula.
The area_between_two_curves function in pypi library similaritymeasures (released in 2018) might give you what you need. I tried a trivial example on my side, comparing the area between a function and a constant value and got pretty close tie-back to Excel (within 2%). Not sure why it doesn't give me 100% tie-back, maybe I am doing something wrong. Worth considering though.
I had the same problem.The answer below is based on an attempt by the question author. However, shapely will not directly give the area of the polygon in purple. You need to edit the code to break it up into its component polygons and then get the area of each. After-which you simply add them up.
Area Between two lines
Consider the lines below:
Sample Two lines
If you run the code below you will get zero for area because it takes the clockwise and subtracts the anti clockwise area:
from shapely.geometry import Polygon
x_y_curve1 = [(1,1),(2,1),(3,3),(4,3)] #these are your points for curve 1
x_y_curve2 = [(1,3),(2,3),(3,1),(4,1)] #these are your points for curve 2
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in x_y_curve1:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in x_y_curve2[::-1]:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in x_y_curve1[0:1]:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, to it "closes" the polygon
polygon = Polygon(polygon_points)
area = polygon.area
print(area)
The solution is therefore to split the polygon into smaller pieces based on where the lines intersect. Then use a for loop to add these up:
from shapely.geometry import Polygon
x_y_curve1 = [(1,1),(2,1),(3,3),(4,3)] #these are your points for curve 1
x_y_curve2 = [(1,3),(2,3),(3,1),(4,1)] #these are your points for curve 2
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in x_y_curve1:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in x_y_curve2[::-1]:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in x_y_curve1[0:1]:
polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, to it "closes" the polygon
polygon = Polygon(polygon_points)
area = polygon.area
x,y = polygon.exterior.xy
# original data
ls = LineString(np.c_[x, y])
# closed, non-simple
lr = LineString(ls.coords[:] + ls.coords[0:1])
lr.is_simple # False
mls = unary_union(lr)
mls.geom_type # MultiLineString'
Area_cal =[]
for polygon in polygonize(mls):
Area_cal.append(polygon.area)
Area_poly = (np.asarray(Area_cal).sum())
print(Area_poly)
A straightforward application of the area of a general polygon (see Shoelace formula) makes for a super-simple and fast, vectorized calculation:
def area(p):
# for p: 2D vertices of a polygon:
# area = 1/2 abs(sum(p0 ^ p1 + p1 ^ p2 + ... + pn-1 ^ p0))
# where ^ is the cross product
return np.abs(np.cross(p, np.roll(p, 1, axis=0)).sum()) / 2
Application to area between two curves. In this example, we don't even have matching x coordinates!
np.random.seed(0)
n0 = 10
n1 = 15
xy0 = np.c_[np.linspace(0, 10, n0), np.random.uniform(0, 10, n0)]
xy1 = np.c_[np.linspace(0, 10, n1), np.random.uniform(0, 10, n1)]
p = np.r_[xy0, xy1[::-1]]
>>> area(p)
4.9786...
Plot:
plt.plot(*xy0.T, 'b-')
plt.plot(*xy1.T, 'r-')
p = np.r_[xy0, xy1[::-1]]
plt.fill(*p.T, alpha=.2)
Speed
For both curves having 1 million points:
n = 1_000_000
xy0 = np.c_[np.linspace(0, 10, n), np.random.uniform(0, 10, n)]
xy1 = np.c_[np.linspace(0, 10, n), np.random.uniform(0, 10, n)]
%timeit area(np.r_[xy0, xy1[::-1]])
# 42.9 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Simple viz of polygon area calculation
# say:
p = np.array([[0, 3], [1, 0], [3, 3], [1, 3], [1, 2]])
p_closed = np.r_[p, p[:1]]
fig, axes = plt.subplots(ncols=2, figsize=(10, 5), subplot_kw=dict(box_aspect=1), sharex=True)
ax = axes[0]
ax.set_aspect('equal')
ax.plot(*p_closed.T, '.-')
ax.fill(*p_closed.T, alpha=0.6)
center = p.mean(0)
txtkwargs = dict(ha='center', va='center')
ax.text(*center, f'{area(p):.2f}', **txtkwargs)
ax = axes[1]
ax.set_aspect('equal')
for a, b in zip(p_closed, p_closed[1:]):
ar = 1/2 * np.cross(a, b)
pos = ar >= 0
tri = np.c_[(0,0), a, b, (0,0)].T
# shrink a bit to make individual triangles easier to visually identify
center = tri.mean(0)
tri = (tri - center)*0.95 + center
c = 'b' if pos else 'r'
ax.plot(*tri.T, 'k')
ax.fill(*tri.T, c, alpha=0.2, zorder=2 - pos)
t = ax.text(*center, f'{ar:.1f}', color=c, fontsize=8, **txtkwargs)
t.set_bbox(dict(facecolor='white', alpha=0.8, edgecolor='none'))
plt.tight_layout()