Find closest point in 2D meshed array - python

To give y'all some context, I'm doing this inversion technique where I am trying to reproduce a profile using the integrated values. To do that I need to find the value within an array along a certain line(s). To exemplify my issue I have the following code:
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, figsize = (10,10))
#Create the grid (different grid spacing):
X = np.arange(0,10.01,0.25)
Y = np.arange(0,10.01,1.00)
#Create the 2D array to be plotted
Z = []
for i in range(np.size(X)):
    Zaux = []
    for j in range(np.size(Y)):
        Zaux.append(i*j + j)
        ax.scatter(X[i],Y[j], color = 'red', s = 0.25)
    Z.append(Zaux)
#Mesh the 1D grids:
Ymesh, Xmesh = np.meshgrid(Y, X)
#Plot the color plot:
ax.pcolor(Y,X, Z, cmap='viridis', vmin=np.nanmin(Z), vmax=np.nanmax(Z))
#Plot the points in the grid of the color plot:
for i in range(np.size(X)):
    for j in range(np.size(Y)):
        ax.scatter(Y[j],X[i], color = 'red', s = 3)
#Create a set of lines:
for i in np.linspace(0,2,5):
    X_line = np.linspace(0,10,256)
    Y_line = i*X_line*3.1415-4
    #Plot each line:
    ax.plot(X_line,Y_line, color = 'blue')
ax.set_xlim(0,10)
ax.set_ylim(0,10)
plt.show()
That outputs this graph:
I need to find the closest points in Z that are crossed by each of the lines. The idea is to integrate the values in Z that are crossed by the blue lines and plot that as a function of the slope of the lines. Does anyone have a good solution for this? I've tried a set of for loops, but I think it's kind of clunky.
Anyway, thanks for your time...

I am not sure about the closest-points approach. That seems "clunky" too. What if a line passes exactly midway between two points? I had also already written code for another project that weights the four neighboring pixels by their closeness, so I am going with that. I also take the liberty of not rescaling the picture.
import itertools
import numpy as np

i,j = np.meshgrid(np.arange(41),np.arange(11))
Z = i*j + j
class Image_knn():
    def fit(self, image):
        self.image = image.astype('float')
    def predict(self, x, y):
        image = self.image
        # weights of the four neighboring pixels (bilinear interpolation)
        weights_x = [1-(x % 1), x % 1]
        weights_y = [1-(y % 1), y % 1]
        start_x = np.floor(x).astype('int')
        start_y = np.floor(y).astype('int')
        # dx, dy select one of the four neighbors; indices are clipped at the border
        return sum([image[np.clip(np.floor(start_x + dx), 0, image.shape[0]-1).astype('int'),
                          np.clip(np.floor(start_y + dy), 0, image.shape[1]-1).astype('int')] * weights_x[dx]*weights_y[dy]
                    for dx,dy in itertools.product(range(2),range(2))])
And a little sanity check: it returns the picture if we give it its coordinates.
image_model = Image_knn()
image_model.fit(Z)
assert np.allclose(image_model.predict(*np.where(np.ones(Z.shape, dtype='bool'))).reshape((11,41)), Z)
I generate m=100 lines and scale the points on them so that they are evenly spaced. Here is a plot of every 10th of them.
n = 1000
m = 100
slopes = np.linspace(1e-10,10,m)
t, slope = np.meshgrid(np.linspace(0,1,n), slopes)
x_max, y_max = Z.shape[0]-1, Z.shape[1]-1
lines_x = t
lines_y = t*slope
scales = np.broadcast_to(np.stack([x_max/lines_x[:,-1], y_max/lines_y[:,-1]]).min(axis=0), (n,m)).T
lines_x *= scales
lines_y *= scales
And finally I can get the "points", each consisting of a slope and an "integral", and plot them. You should probably take a closer look at the "integral"; it's just a rough guess of mine.
%%timeit
points = np.array([(slope, np.mean(image_model.predict(lines_x[i],lines_y[i]))
*np.linalg.norm(np.array((lines_x[i,-1],lines_y[i,-1]))))
for i,slope in enumerate(slopes)])
plt.scatter(points[:,0],points[:,1])
Notice the %%timeit in the last block. This takes ~38.3 ms on my machine, so I did not optimize it. As Donald Knuth puts it, "premature optimization is the root of all evil". If you were to optimize this, you would remove the for loop, pass the coordinates of all line points to the model at once (reshaping before and after), and then pair the results with the slopes. But I saw no reason to put myself through that for a few ms.
And finally we get a nice cusp as a reward. Notice that it makes sense that the maximum is at 4 since the diagonal is at a slope of 4 for our 40 by 10 picture. The intuition for the cusp is a bit harder to explain but I guess you probably have that already. For the length it comes down to the function (x,y) -> sqrt(x^2+y^2) having different directional differentials when going up and when going left on the rectangle.
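For reference, here is a minimal self-contained sketch of the same idea: bilinear interpolation of the four neighboring pixels, then "integral" taken as mean sample value times segment length. The names bilinear_sample and line_integral are made up for this sketch and are not part of the answer above, and mean times length is the same rough estimate of the integral, not an exact quadrature.

```python
import numpy as np

def bilinear_sample(image, x, y):
    """Bilinearly interpolate image at fractional (row, col) positions x, y."""
    image = np.asarray(image, dtype=float)
    # Clamp to the image so that border samples reuse the edge pixels
    x = np.clip(np.asarray(x, dtype=float), 0, image.shape[0] - 1)
    y = np.clip(np.asarray(y, dtype=float), 0, image.shape[1] - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, image.shape[0] - 1)
    y1 = np.minimum(y0 + 1, image.shape[1] - 1)
    fx, fy = x - x0, y - y0  # fractional offsets act as weights
    return (image[x0, y0] * (1 - fx) * (1 - fy) + image[x1, y0] * fx * (1 - fy)
            + image[x0, y1] * (1 - fx) * fy + image[x1, y1] * fx * fy)

def line_integral(image, start, end, n=1000):
    """Rough integral of image along the segment start -> end (mean * length)."""
    t = np.linspace(0.0, 1.0, n)
    x = start[0] + t * (end[0] - start[0])
    y = start[1] + t * (end[1] - start[1])
    length = np.hypot(end[0] - start[0], end[1] - start[1])
    return bilinear_sample(image, x, y).mean() * length

i, j = np.meshgrid(np.arange(41), np.arange(11))
Z = i * j + j                      # same test image as above, shape (11, 41)
ones = np.ones_like(Z, dtype=float)
diag = line_integral(ones, (0, 0), (10, 40))  # constant image: integral == length
```

On a constant image the result is just the length of the segment, which is a quick way to check the scaling before trusting it on real data.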

Related

Questions about details of the filter back projection

Lately I've been studying filtered back projection, and I downloaded some code from github.com. I am confused by how the back projection step works. Here is part of the code:
def backproject(sinogram, theta):
    """Backprojection function.
    inputs: sinogram - [n x m] numpy array where n is the number of projections and m the number of angles
            theta - vector of length m denoting the angles represented in the sinogram
    output: backprojArray - [n x n] backprojected 2-D numpy array"""
    imageLen = sinogram.shape[0] #sinogram : [n x m] , so imageLen = n(height)
    reconMatrix = np.zeros((imageLen, imageLen))
    x = np.arange(imageLen)-imageLen/2
    y = x.copy()
    X, Y = np.meshgrid(x, y)
    plt.ion()
    fig2, ax = plt.subplots()
    im = plt.imshow(reconMatrix, cmap='gray')
    theta = theta*np.pi/180
    numAngles = len(theta)
    for n in range(numAngles):
        Xrot = X*np.sin(theta[n])-Y*np.cos(theta[n])
        XrotCor = np.round(Xrot+imageLen/2)
        XrotCor = XrotCor.astype('int')
        projMatrix = np.zeros((imageLen, imageLen))
        m0, m1 = np.where((XrotCor >= 0) & (XrotCor <= (imageLen-1)))
        s = sinogram[:,n]
        projMatrix[m0, m1] = s[XrotCor[m0, m1]]
        reconMatrix += projMatrix
        im.set_data(Image.fromarray((reconMatrix-np.min(reconMatrix))/np.ptp(reconMatrix)*255))
        ax.set_title('Theta = %.2f degrees' % (theta[n]*180/np.pi))
        fig2.canvas.draw()
        fig2.canvas.flush_events()
    plt.close()
    plt.ioff()
    backprojArray = np.flipud(reconMatrix)
    return backprojArray
The for loop has confused me for two weeks.
Firstly, I really don't understand the following code:
Xrot = X*np.sin(theta[n])-Y*np.cos(theta[n])
XrotCor = np.round(Xrot+imageLen/2)
I don't understand how it works geometrically. I have drawn the matrix and so on, but I still don't understand the principle.
Lastly, what does the line im.set_data(Image.fromarray((reconMatrix-np.min(reconMatrix))/np.ptp(reconMatrix)*255)) mean? I only know direct back projection, and I really don't know why there's a 255.
Xrot = X*np.sin(theta[n])-Y*np.cos(theta[n])
This is the simple back projection algorithm. I am also learning it, so I will try to keep this as simple and concise as possible.
There are some steps for FBP:
Input the sinogram image (Radon transform)
Create a filter (the Ram-Lak ramp filter works best, but you can try other high-pass filters as well)
Forward Fourier transform (dft function)
Apply the filter
Inverse Fourier transform
Backprojection (basically reversing the sinogram technique)
Backprojection simply smears the values of each projection back across the image and adds them up over all projections to recover the original image.
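To make the geometry of the Xrot line concrete, here is a small sketch under my own assumptions (a 5x5 grid and the hypothetical helper detector_index, neither of which is in the original code). For each pixel with centered coordinates (X, Y), X*sin(theta) - Y*cos(theta) is its position along the detector for that angle; adding imageLen/2 shifts it back to an array index, and rounding picks the nearest detector bin.

```python
import numpy as np

image_len = 5
x = np.arange(image_len) - image_len / 2   # centered pixel coordinates
X, Y = np.meshgrid(x, x)

def detector_index(theta_deg):
    """Detector bin that each pixel projects onto for one angle."""
    theta = np.deg2rad(theta_deg)
    t = X * np.sin(theta) - Y * np.cos(theta)   # signed position along the detector
    return np.round(t + image_len / 2).astype(int)

idx_90 = detector_index(90.0)   # depends only on the column
idx_0 = detector_index(0.0)     # depends only on the row; one row falls off the detector

# Smear one projection back over the image, skipping out-of-range bins
s = np.arange(image_len, dtype=float)           # made-up projection values
valid = (idx_90 >= 0) & (idx_90 <= image_len - 1)
smear = np.zeros((image_len, image_len))
smear[valid] = s[idx_90[valid]]
```

At 90 degrees every pixel in a column reads the same detector bin, so the projection is smeared along the columns; at 0 degrees the same happens along the rows. The range check mirrors the np.where mask in the loop.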
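im.set_data(Image.fromarray((reconMatrix-np.min(reconMatrix))/np.ptp(reconMatrix)*255))
I believe this line only normalizes the image for display, nothing else: subtracting the minimum and dividing by the range (np.ptp) maps the values to [0, 1], and multiplying by 255 stretches them to the 0-255 range of an 8-bit grayscale image.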
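A tiny numeric illustration of that normalization, using a made-up 2x2 array in place of reconMatrix:

```python
import numpy as np

recon = np.array([[-2.0, 0.0],
                  [2.0, 6.0]])            # stand-in for reconMatrix

# (value - min) / (max - min) lands in [0, 1]; * 255 gives the 8-bit range
scaled = (recon - np.min(recon)) / np.ptp(recon) * 255
as_uint8 = scaled.astype(np.uint8)        # truncates the fractional part
```

The smallest value always maps to 0 and the largest to 255, regardless of how large the accumulated sums in the reconstruction have become.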

Plotting a surface contour for a given value in a 3D numpy matrix

I have three 3D mesh matrices (X, Y, Z) corresponding to the xyz coordinate space.
I also have a 3D Numpy matrix A where A[i,j,k] contains a float that is associated with the point (x,y,z) where x=X[i,j,k], y=Y[i,j,k], and z=Z[i,j,k]. The float values are continuous within A (i.e. the change in value between adjacent elements of A is typically small).
Is there a way to plot the surface that corresponds to a given float value in A using Matplotlib or any other Python-based graphics package? For example, if given a value 2.34, I am interested in getting a plotted contour surface of the matrix A wherever 2.34 (plus or minus some tolerance) shows up?
So far, I have been able to recover the xyz coordinates of all values in A that are within some tolerance of the target value and then make a 3D scatter plot using this (code below). Perhaps there is also a way of plotting a surface from these points?
def clean (A, t, dt):
    # function for making A binary for t+-dt
    # t is the target value I want in the matrix A with tolerance dt
    new_A = np.copy(A)
    new_A[np.logical_and(new_A > t-dt, new_A < t+dt)] = -1
    new_A[new_A != -1] = 0
    new_A[new_A == -1] = 1
    return (new_A)
def get_surface (X, Y, Z, new_A):
    x_vals = []
    y_vals = []
    z_vals = []
    # Retrieve (x,y,z) coordinates of surface
    for i in range(new_A.shape[0]):
        for j in range(new_A.shape[1]):
            for k in range(new_A.shape[2]):
                if new_A[i,j,k] == 1.0:
                    x_vals.append(X[i,j,k])
                    y_vals.append(Y[i,j,k])
                    z_vals.append(Z[i,j,k])
    return (np.array(x_vals), np.array(y_vals), np.array(z_vals))
cleaned_A = clean (A, t=2.5, dt=0.001)
x_f, y_f, z_f = get_surface (X, Y, Z, cleaned_A )
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d', aspect='equal')
ax.scatter(x_f, y_f, z_f, color='g', s=1)
I have also tried ax.plot_trisurf(x_f,y_f,z_f), but this gives me a poorly connected plot. I'm guessing that the ordering of values in my arrays might be affecting this, in which case is there a package that can do some kind of 3D interpolated surface plot with random ordering of points (e.g. through minimizing the surface area or something like that)?
The object that I am interested in is roughly spherical (i.e. two z's per (x,y)). I can't seem to find any working examples of someone triangulating over a closed 3D surface, but maybe I'm not looking in the right places.
After a lot of digging around, I think I've arrived at a solution that works (for a sphere at least--will update my answer when I try out deformations of a sphere). Many thanks to the comments which helped me think down the right path. I am basically using a ConvexHull for the triangulation from scipy.spatial:
from matplotlib.tri import Triangulation
from scipy.spatial import ConvexHull
def clean (A, t, dt):
    # function for making A binary for t+-dt
    # t is the target value I want in the matrix A with tolerance dt
    new_A = np.copy(A)
    new_A[np.logical_and(new_A > t-dt, new_A < t+dt)] = -1
    new_A[new_A != -1] = 0
    new_A[new_A == -1] = 1
    return (new_A)
def get_surface (X, Y, Z, new_A):
    x_vals = []
    y_vals = []
    z_vals = []
    # Retrieve (x,y,z) coordinates of surface
    for i in range(new_A.shape[0]):
        for j in range(new_A.shape[1]):
            for k in range(new_A.shape[2]):
                if new_A[i,j,k] == 1.0:
                    x_vals.append(X[i,j,k])
                    y_vals.append(Y[i,j,k])
                    z_vals.append(Z[i,j,k])
    return (np.array(x_vals), np.array(y_vals), np.array(z_vals))
cleaned_A = clean (A, t=2.5, dt=0.001)
x_f, y_f, z_f = get_surface (X, Y, Z, cleaned_A )
Xs = np.vstack((x_f, y_f, z_f)).T
hull = ConvexHull(Xs)
x, y, z = Xs.T
tri = Triangulation(x, y, triangles=hull.simplices)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d', aspect='equal')
ax.plot_trisurf(tri, z, color='g', alpha=0.1)
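As a side note, the clean/get_surface pair can probably be collapsed into a single boolean mask, which avoids the triple Python loop. This is only a sketch: get_surface_points is a name I made up, and the test volume (distance from the origin on a 21x21x21 grid) is an assumption chosen so that the level set is a sphere.

```python
import numpy as np

def get_surface_points(X, Y, Z, A, t, dt):
    """Coordinates of all grid points where A is within dt of the target t."""
    mask = np.abs(A - t) < dt          # same test as t-dt < A < t+dt
    return X[mask], Y[mask], Z[mask]

# Made-up test volume: A is the distance from the origin
axis = np.linspace(-1.0, 1.0, 21)
X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
A = np.sqrt(X**2 + Y**2 + Z**2)

x_f, y_f, z_f = get_surface_points(X, Y, Z, A, t=0.5, dt=0.05)
radii = np.sqrt(x_f**2 + y_f**2 + z_f**2)   # should all be close to 0.5
```

Keep in mind that a convex hull only triangulates the surface correctly while every selected point lies on the hull, which holds for a sphere but not for a shape with dents.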

Generate profiles through a 2D array at an angle without altering pixels

I'd like to plot two profiles through the highest intensity point in a 2D numpy array, which is an image of a blob (i.e. a line through the semi-major axis, and another line through the semi-minor axis). The blob is rotated at an angle theta counterclockwise from the standard x-axis and is asymmetric.
It is a 600x600 array with a max intensity of 1 (at only one pixel) that is located right at the center at (300, 300). The angle rotation from the x-axis (which then gives the location of the semi-major axis when rotated by that angle) is theta = 89.54 degrees. I do not want to use scipy.ndimage.rotate because it uses spline interpolation, and I do not want to change any of my pixel values. But I suppose a nearest-neighbor interpolation method would be okay.
I tried generating lines corresponding to the major and minor axes across the image, but the result was not right at all (the peak was far less than 1), so maybe I did something wrong. The code for this is below:
import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage
def profiles_at_angle(image, axis, theta):
    theta = np.deg2rad(theta)
    if axis == 'major':
        x_0, y_0 = 0, 300-300*np.tan(theta)
        x_1, y_1 = 599, 300+300*np.tan(theta)
    elif axis=='minor':
        x_0, y_0 = 300-300*np.tan(theta), 599
        x_1, y_1 = 300+300*np.tan(theta), -599
    num = 600
    x, y = np.linspace(x_0, x_1, num), np.linspace(y_0, y_1, num)
    z = ndimage.map_coordinates(image, np.vstack((x,y)))
    fig, axes = plt.subplots(nrows=2)
    axes[0].imshow(image, cmap='gray')
    axes[0].axis('image')
    axes[1].plot(z)
    plt.xlim(250,350)
    plt.show()
profiles_at_angle(image, 'major', theta)
Did I do something obviously wrong in my code above? Or how else can I accomplish this? Thank you.
Edit: Here are some example images. Sorry for the bad quality; my browser crashed every time I tried uploading them anywhere so I had to take photos of the screen.
Figure 1: This is the result of my code above, which is clearly wrong since the peak should be at 1. I'm not sure what I did wrong though.
Figure 2: I made this plot below by just taking the profiles through the standard x and y axes, ignoring any rotation (this only looks good coincidentally because the real angle of rotation is so close to 90 degrees, so I was able to just switch the labels and get this). I want my result to look something like this, but taking the correct rotation angle into account.
Edit: It could be useful to run tests on this method using data very much like my own (it's a 2D Gaussian with nearly the same parameters):
image = np.random.random((600,600))
def generate(data_set):
    xvec = np.arange(0, np.shape(data_set)[1], 1)
    yvec = np.arange(0, np.shape(data_set)[0], 1)
    X, Y = np.meshgrid(xvec, yvec)
    return X, Y
def gaussian_func(xy, x0, y0, sigma_x, sigma_y, amp, theta, offset):
    x, y = xy
    a = (np.cos(theta))**2/(2*sigma_x**2) + (np.sin(theta))**2/(2*sigma_y**2)
    b = -np.sin(2*theta)/(4*sigma_x**2) + np.sin(2*theta)/(4*sigma_y**2)
    c = (np.sin(theta))**2/(2*sigma_x**2) + (np.cos(theta))**2/(2*sigma_y**2)
    inner = a * (x-x0)**2
    inner += 2*b*(x-x0)*(y-y0)
    inner += c * (y-y0)**2
    return (offset + amp * np.exp(-inner)).ravel()
xx, yy = generate(image)
image = gaussian_func((xx.ravel(), yy.ravel()), 300, 300, 5, 4, 1, 1.56, 0)
image = np.reshape(image, (600, 600))
This should do it for you. You just did not properly compute your lines.
import scipy.ndimage
theta = 65
peak = np.argwhere(image==1)[0]
x = np.linspace(peak[0]-100,peak[0]+100,1000)
y = lambda x: (x-peak[1])*np.tan(np.deg2rad(theta))+peak[0]
y_maj = np.linspace(y(peak[1]-100),y(peak[1]+100),1000)
y = lambda x: -(x-peak[1])/np.tan(np.deg2rad(theta))+peak[0]
y_min = np.linspace(y(peak[1]-100),y(peak[1]+100),1000)
del y
z_min = scipy.ndimage.map_coordinates(image, np.vstack((x,y_min)))
z_maj = scipy.ndimage.map_coordinates(image, np.vstack((x,y_maj)))
fig, axes = plt.subplots(nrows=2)
axes[0].imshow(image)
axes[0].plot(x,y_maj)
axes[0].plot(x,y_min)
axes[0].axis('image')
axes[1].plot(z_min)
axes[1].plot(z_maj)
plt.show()
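Since the question asks for unaltered pixel values, it may be worth noting that map_coordinates uses cubic spline interpolation by default (order=3); passing order=0 switches to nearest-neighbor lookup. It also expects coordinates in (row, column) order. Below is a sketch under my own assumptions: profile_through_peak is a hypothetical helper, the test blob is an axis-aligned Gaussian rather than the rotated one above, and "counterclockwise" is taken as seen in an imshow plot where rows grow downward.

```python
import numpy as np
from scipy import ndimage

def profile_through_peak(image, theta_deg, half_length=100, num=201):
    """Nearest-neighbor profile through the brightest pixel at angle theta_deg."""
    row0, col0 = np.unravel_index(np.argmax(image), image.shape)
    theta = np.deg2rad(theta_deg)
    s = np.linspace(-half_length, half_length, num)   # distance along the line
    cols = col0 + s * np.cos(theta)
    rows = row0 - s * np.sin(theta)   # minus: rows increase downward (assumption)
    # order=0 picks the nearest pixel, so no new values are created
    return ndimage.map_coordinates(image, np.vstack((rows, cols)),
                                   order=0, mode="nearest")

# Made-up test blob with its peak (value 1) at row 300, column 300
yy, xx = np.mgrid[0:600, 0:600]
image = np.exp(-((xx - 300) ** 2 / 50.0 + (yy - 300) ** 2 / 32.0))

major = profile_through_peak(image, 89.54)
minor = profile_through_peak(image, 89.54 + 90.0)
```

With order=0 the middle sample of each profile is exactly the peak pixel, so the profiles should top out at 1 instead of a smoothed lower value.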

Missing coordinates when dividing image into grid

I am trying to partition a set of provided coordinates into several buckets in Python 3 with numpy. I have a grid of buckets. See below:
def partition(image, num_tiles):
    """Divide an image into a (num_tiles x num_tiles) grid and return the
    partitioned input."""
    # The object to return. Ignore - I am just trying to test 'draw' works currently.
    partitioned_image = np.empty((num_tiles, num_tiles), dtype=object)
    draw = []
    # The input array contains coordinates of the form [xMin, xMax, yMin, yMax].
    # This is because these are coordinates for bounding boxes around biological cells.
    # When I say 'point(s)', I refer to a [xMn, xMx, yMn, yMx] array(s).
    xMin = image[:,0]
    xMax = image[:,1]
    yMin = image[:,2]
    yMax = image[:,3]
    # The base to start searching from (not 0,0).
    x_base = min(xMin)
    y_base = min(yMin)
    # max(?Max) - min(?Min) defines the entire range for the variable. Divide this
    # range by the number of tiles, which is the number of ticks of the grid.
    # E.g. range is 100, want a 10x10 grid, so we step along in steps of 10.
    x_step = (max(xMax) - min(xMin)) // num_tiles
    y_step = (max(yMax) - min(yMin)) // num_tiles
    for i in range(num_tiles):
        for j in range(num_tiles):
            # Define the bottom-left point of the region of interest (a tile)
            x_left = x_base + x_step * i
            y_low = y_base + y_step * j
            # Define the upper-right point of the region of interest
            x_right = x_base + x_step * (i + 1)
            y_high = y_base + y_step * (j + 1)
            # Every point in image that is within the region gets added to the
            # draw list. Remember, each point is of the form [xMn, xMx, yMn, yMx]
            result = ((yMin >= y_low) & (yMax < y_high) &
                      (xMin >= x_left) & (xMax < x_right)).nonzero()[0]
            for coordinates in image[result]:
                draw.append(coordinates)
            # I would want to add the actual points to my partitioned_input array
            # here, in the corresponding tile. The above code for draw is *JUST TESTING*.
    # Convert draw list to numpy array and check to see we got all the points.
    draw = np.asarray(draw)
    print(draw.shape == image.shape) # We do not. This is annoying.
    # Below is the code for plotting. I just take the average of
    # the xMin/yMin and xMax/yMax values for this.
    draw_xAvg = np.mean(np.array([draw[:,0], draw[:,1]]), axis=0)
    draw_yAvg = np.mean(np.array([draw[:,2], draw[:,3]]), axis=0)
    image_xAvg = np.mean(np.array([image[:,0], image[:,1]]), axis=0)
    image_yAvg = np.mean(np.array([image[:,2], image[:,3]]), axis=0)
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(30, 10))
    ax1.set_title('Test', fontsize=30)
    ax1.scatter(draw_xAvg, draw_yAvg, s=0.1, c='b')
    ax2.set_title('Image', fontsize=30)
    ax2.scatter(image_xAvg, image_yAvg, s=0.1, c='r')
    ax3.set_title('Overlay (Image)', fontsize=30)
    ax3.scatter(image_xAvg, image_yAvg, s=0.1, c='r')
    ax3.scatter(draw_xAvg, draw_yAvg, s=0.1, c='b')
    # Would return this once I partitioned the input correctly.
    # The idea is to have a list per tile of all the points found in that tile.
    # All I am doing is checking that I get the right number of points in total.
    return partitioned_image
Calling code:
partitioned_cells = partition(cells, 20)
As you can see I step along the input in steps proportionate to the size of the input. This should be completely fine and I do get the vast majority of points so clearly the code is not completely wrong and my logic is fine. However, I expect perfect overlap in the third figure below:
If you look closely at the righthand figure you can see a distinct grid-like red coming up which the blue is not overlapping, particularly on the right of that figure - the sizes of the resulting numpy arrays (12948 v 13804) also confirm there is a mismatch, with the red outnumbering the blue. I am missing some coordinates in my partitioning.
I have no idea why this is - even when my boundaries are inclusive (>= or <=) they still do not get all of the points. I don't understand why. Can someone explain or have a guess as to why this might be?
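Two things in the code above look like they would drop boxes, though I can't confirm that they account for every missing point. First, a box is only kept if both its min and max corners fall inside the same tile, so any box straddling a tile boundary matches no tile at all, which would produce exactly the grid-like pattern described. Second, the floor division in x_step and y_step can leave a strip at the far edge that no tile covers. One possible workaround is to assign each box to the tile that contains its center. The sketch below does that; partition_by_center and the random test boxes are made up for illustration.

```python
import numpy as np

def partition_by_center(boxes, num_tiles):
    """Assign each [xMin, xMax, yMin, yMax] box to the tile holding its center."""
    boxes = np.asarray(boxes, dtype=float)
    x_c = (boxes[:, 0] + boxes[:, 1]) / 2
    y_c = (boxes[:, 2] + boxes[:, 3]) / 2
    # Tile edges span the full data range, so nothing is left uncovered
    x_edges = np.linspace(boxes[:, 0].min(), boxes[:, 1].max(), num_tiles + 1)
    y_edges = np.linspace(boxes[:, 2].min(), boxes[:, 3].max(), num_tiles + 1)
    # Comparing against the interior edges yields tile indices 0..num_tiles-1
    ix = np.digitize(x_c, x_edges[1:-1])
    iy = np.digitize(y_c, y_edges[1:-1])
    return [[boxes[(ix == i) & (iy == j)] for j in range(num_tiles)]
            for i in range(num_tiles)]

# Made-up bounding boxes for testing
rng = np.random.default_rng(0)
lo = rng.uniform(0, 95, size=(500, 2))
size = rng.uniform(1, 5, size=(500, 2))
cells = np.column_stack([lo[:, 0], lo[:, 0] + size[:, 0],
                         lo[:, 1], lo[:, 1] + size[:, 1]])

tiles = partition_by_center(cells, 20)
total = sum(len(t) for row in tiles for t in row)   # every box lands in one tile
```

Because each center belongs to exactly one tile, the per-tile counts add up to the number of input boxes.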

Find the area between two curves plotted in matplotlib (fill_between area)

I have a list of x and y values for two curves, both having weird shapes, and I don't have a function for any of them. I need to do two things:
Plot it and shade the area between the curves like the image below.
Find the total area of this shaded region between the curves.
I'm able to plot and shade the area between those curves with fill_between and fill_betweenx in matplotlib, but I have no idea how to calculate the exact area between them, especially because I don't have a function for either of those curves.
Any ideas?
I looked everywhere and can't find a simple solution for this. I'm quite desperate, so any help is much appreciated.
Thank you very much!
EDIT: For future reference (in case anyone runs into the same problem), here is how I've solved this: connected the first and last node/point of each curve together, resulting in a big weird-shaped polygon, then used shapely to calculate the polygon's area automatically, which is the exact area between the curves, no matter which way they go or how nonlinear they are. Works like a charm! :)
Here is my code:
from shapely.geometry import Polygon
x_y_curve1 = [(0.121,0.232),(2.898,4.554),(7.865,9.987)] #these are your points for curve 1 (I just put some random numbers)
x_y_curve2 = [(1.221,1.232),(3.898,5.554),(8.865,7.987)] #these are your points for curve 2 (I just put some random numbers)
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in x_y_curve1:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in x_y_curve2[::-1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in x_y_curve1[0:1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, so it "closes" the polygon
polygon = Polygon(polygon_points)
area = polygon.area
print(area)
EDIT 2: Thank you for the answers. As Kyle explained, this only works for positive values. If your curves go below 0 (which is not my case, as shown in the example chart), then you would have to work with absolute values.
The area calculation is straightforward in blocks where the two curves don't intersect: that's the trapezium, as has been pointed out above. If they intersect, then you create two triangles between x[i] and x[i+1], and you should add the areas of the two. If you want to do it directly, you should handle the two cases separately. Here's a basic working example to solve your problem. First, I will start with some fake data:
#!/usr/bin/python
import numpy as np
# let us generate fake test data
x = np.arange(10)
y1 = np.random.rand(10) * 20
y2 = np.random.rand(10) * 20
Now, the main code. Based on your plot, looks like you have y1 and y2 defined at the same X points. Then we define,
z = y1-y2
dx = x[1:] - x[:-1]
cross_test = np.sign(z[:-1] * z[1:])
cross_test will be negative whenever the two graphs cross. At these points, we want to calculate the x coordinate of the crossover. For simplicity, I will calculate x coordinates of the intersection of all segments of y. For places where the two curves don't intersect, they will be useless values, and we won't use them anywhere. This just keeps the code easier to understand.
Suppose you have z1 and z2 at x1 and x2, then we are solving for x0 such that z = 0:
# (z2 - z1)/(x2 - x1) = (z0 - z1) / (x0 - x1) = -z1/(x0 - x1)
# x0 = x1 - (x2 - x1) / (z2 - z1) * z1
x_intersect = x[:-1] - dx / (z[1:] - z[:-1]) * z[:-1]
dx_intersect = - dx / (z[1:] - z[:-1]) * z[:-1]
Where the curves don't intersect, area is simply given by:
areas_pos = abs(z[:-1] + z[1:]) * 0.5 * dx # signs of both z are same
Where they intersect, we add areas of both triangles:
areas_neg = 0.5 * dx_intersect * abs(z[:-1]) + 0.5 * (dx - dx_intersect) * abs(z[1:])
Now, the area in each block x[i] to x[i+1] is to be selected, for which I use np.where:
areas = np.where(cross_test < 0, areas_neg, areas_pos)
total_area = np.sum(areas)
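The steps above can be collected into one function, which makes them easier to check on small hand-computed cases. This is a sketch: area_between is my own name, and the np.errstate guard is an addition to silence the division by zero in segments where z does not change (those segments never cross, so their values are discarded by np.where anyway).

```python
import numpy as np

def area_between(x, y1, y2):
    """Unsigned area between two curves sampled at the same x points."""
    x, y1, y2 = (np.asarray(a, dtype=float) for a in (x, y1, y2))
    z = y1 - y2
    dx = np.diff(x)
    z0, z1 = z[:-1], z[1:]
    crosses = z0 * z1 < 0                          # sign change inside the segment
    areas_same = np.abs(z0 + z1) * 0.5 * dx        # plain trapezoid
    with np.errstate(divide="ignore", invalid="ignore"):
        dx_int = -dx / (z1 - z0) * z0              # distance from x[i] to the crossing
        areas_cross = (0.5 * dx_int * np.abs(z0)
                       + 0.5 * (dx - dx_int) * np.abs(z1))   # two triangles
    return float(np.sum(np.where(crosses, areas_cross, areas_same)))

crossing = area_between([0, 2], [0, 2], [2, 0])   # curves cross at x = 1
parallel = area_between([0, 1], [1, 1], [0, 0])   # constant gap of 1
```

In the first case the two curves form two triangles of area 1 each, and in the second a unit square, which gives a quick sanity check.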
That is your desired answer. As has been pointed out above, this gets more complicated if the two y graphs are defined at different x points. If you want to test this, you can simply plot it (in my test case, the y range will be -20 to 20):
import matplotlib.pyplot as plt
negatives = np.where(cross_test < 0)
positives = np.where(cross_test >= 0)
plt.plot(x, y1)
plt.plot(x, y2)
plt.plot(x, z)
plt.vlines(x_intersect[negatives], -20, 20)
Define your two curves as functions f and g that are linear by segment, e.g. between x1 and x2, f(x) = f(x1) + ((x-x1)/(x2-x1))*(f(x2)-f(x1)).
Define h(x)=abs(g(x)-f(x)). Then use scipy.integrate.quad to integrate h.
That way you don't need to bother about the intersections. It will do the "trapeze summing" suggested by ch41rmn automatically.
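A minimal sketch of that approach, assuming np.interp as the piecewise-linear f and g and some made-up sample points. Passing the data x values as break points via the points argument is my own addition; it should help quad, though kinks at the crossings themselves can still cost some accuracy.

```python
import numpy as np
from scipy.integrate import quad

# Made-up sample data; both curves share the same x points here
x = np.array([0.0, 1.0, 2.0])
y1 = np.array([0.0, 2.0, 0.0])
y2 = np.array([1.0, 1.0, 1.0])

def h(t):
    """Absolute gap between the two piecewise-linear curves at t."""
    return abs(np.interp(t, x, y1) - np.interp(t, x, y2))

# Interior data points are passed as break points for the integrator
area, abs_err = quad(h, x[0], x[-1], points=x[1:-1], limit=200)
```

For this data the curves cross at x = 0.5 and x = 1.5, forming four triangles of area 0.25 each, so the result should come out very close to 1.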
Your set of data is quite "nice" in the sense that the two sets of data share the same set of x-coordinates. You can therefore calculate the area using a series of trapezoids.
e.g. define the two functions as f(x) and g(x), then, between any two consecutive points in x, you have four points of data:
(x1, f(x1))-->(x2, f(x2))
(x1, g(x1))-->(x2, g(x2))
Then, the area of the trapezoid is
A(x1-->x2) = ( f(x1)-g(x1) + f(x2)-g(x2) ) * (x2-x1)/2 (1)
A complication arises that equation (1) only works for simply-connected regions, i.e. there must not be a cross-over within this region:
|\ |\/|
|_| vs |/\|
The area of the two sides of the intersection must be evaluated separately. You will need to go through your data to find all points of intersections, then insert their coordinates into your list of coordinates. The correct order of x must be maintained. Then, you can loop through your list of simply connected regions and obtain a sum of the area of trapezoids.
EDIT:
For curiosity's sake, if the x-coordinates for the two lists are different, you can instead construct triangles. e.g.
.____.
| / \
| / \
| / \
|/ \
._________.
Overlap between triangles must be avoided, so you will again need to find points of intersections and insert them into your ordered list. The lengths of each side of the triangle can be calculated using Pythagoras' formula, and the area of the triangles can be calculated using Heron's formula.
The area_between_two_curves function in pypi library similaritymeasures (released in 2018) might give you what you need. I tried a trivial example on my side, comparing the area between a function and a constant value and got pretty close tie-back to Excel (within 2%). Not sure why it doesn't give me 100% tie-back, maybe I am doing something wrong. Worth considering though.
I had the same problem. The answer below is based on the question author's attempt. However, shapely will not directly give the area of the polygon in purple. You need to edit the code to break the polygon up into its component polygons and get the area of each, after which you simply add them up.
Area Between two lines
Consider the lines below:
Sample Two lines
If you run the code below you will get zero for area because it takes the clockwise and subtracts the anti clockwise area:
from shapely.geometry import Polygon
x_y_curve1 = [(1,1),(2,1),(3,3),(4,3)] #these are your points for curve 1
x_y_curve2 = [(1,3),(2,3),(3,1),(4,1)] #these are your points for curve 2
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in x_y_curve1:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in x_y_curve2[::-1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in x_y_curve1[0:1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, so it "closes" the polygon
polygon = Polygon(polygon_points)
area = polygon.area
print(area)
The solution is therefore to split the polygon into smaller pieces based on where the lines intersect. Then use a for loop to add these up:
import numpy as np
from shapely.geometry import Polygon, LineString
from shapely.ops import unary_union, polygonize
x_y_curve1 = [(1,1),(2,1),(3,3),(4,3)] #these are your points for curve 1
x_y_curve2 = [(1,3),(2,3),(3,1),(4,1)] #these are your points for curve 2
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in x_y_curve1:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in x_y_curve2[::-1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in x_y_curve1[0:1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, so it "closes" the polygon
polygon = Polygon(polygon_points)
area = polygon.area
x,y = polygon.exterior.xy
# original data
ls = LineString(np.c_[x, y])
# closed, non-simple
lr = LineString(ls.coords[:] + ls.coords[0:1])
lr.is_simple # False
# the union splits the ring at its self-intersections
mls = unary_union(lr)
mls.geom_type # 'MultiLineString'
Area_cal =[]
for polygon in polygonize(mls):
    Area_cal.append(polygon.area)
Area_poly = (np.asarray(Area_cal).sum())
print(Area_poly)
A straightforward application of the area of a general polygon (see Shoelace formula) makes for a super-simple and fast, vectorized calculation:
def area(p):
    # for p: 2D vertices of a polygon:
    # area = 1/2 abs(sum(p0 ^ p1 + p1 ^ p2 + ... + pn-1 ^ p0))
    # where ^ is the cross product
    return np.abs(np.cross(p, np.roll(p, 1, axis=0)).sum()) / 2
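Two caveats, with a small sketch to go with them. As far as I know, recent NumPy releases (2.0 onward) emit a deprecation warning when np.cross is given 2-element vectors, so the same sum can be written with plain dot products instead. Also, like Polygon.area on a self-crossing ring, the shoelace sum is a signed sum: lobes traversed in opposite directions cancel, so for curves that cross each other it gives the net area rather than the total shaded area. The name shoelace_area is made up here, and the bow-tie points are taken from the shapely example above.

```python
import numpy as np

def shoelace_area(p):
    """Absolute value of the signed (shoelace) area of a polygon."""
    p = np.asarray(p, dtype=float)
    x, y = p[:, 0], p[:, 1]
    # sum over edges of x_i * y_(i+1) - y_i * x_(i+1), closing edge included
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
# curve 1 followed by curve 2 reversed; the two curves cross once
bowtie = [(1, 1), (2, 1), (3, 3), (4, 3), (4, 1), (3, 1), (2, 3), (1, 3)]

square_area = shoelace_area(square)
bowtie_area = shoelace_area(bowtie)   # the two lobes cancel each other
```

For the bow-tie the result is zero even though the shaded region clearly is not empty, so with crossing curves the polygon has to be split at the intersections first.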
Application to area between two curves. In this example, we don't even have matching x coordinates!
np.random.seed(0)
n0 = 10
n1 = 15
xy0 = np.c_[np.linspace(0, 10, n0), np.random.uniform(0, 10, n0)]
xy1 = np.c_[np.linspace(0, 10, n1), np.random.uniform(0, 10, n1)]
p = np.r_[xy0, xy1[::-1]]
>>> area(p)
4.9786...
Plot:
plt.plot(*xy0.T, 'b-')
plt.plot(*xy1.T, 'r-')
p = np.r_[xy0, xy1[::-1]]
plt.fill(*p.T, alpha=.2)
Speed
For both curves having 1 million points:
n = 1_000_000
xy0 = np.c_[np.linspace(0, 10, n), np.random.uniform(0, 10, n)]
xy1 = np.c_[np.linspace(0, 10, n), np.random.uniform(0, 10, n)]
%timeit area(np.r_[xy0, xy1[::-1]])
# 42.9 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Simple viz of polygon area calculation
# say:
p = np.array([[0, 3], [1, 0], [3, 3], [1, 3], [1, 2]])
p_closed = np.r_[p, p[:1]]
fig, axes = plt.subplots(ncols=2, figsize=(10, 5), subplot_kw=dict(box_aspect=1), sharex=True)
ax = axes[0]
ax.set_aspect('equal')
ax.plot(*p_closed.T, '.-')
ax.fill(*p_closed.T, alpha=0.6)
center = p.mean(0)
txtkwargs = dict(ha='center', va='center')
ax.text(*center, f'{area(p):.2f}', **txtkwargs)
ax = axes[1]
ax.set_aspect('equal')
for a, b in zip(p_closed, p_closed[1:]):
    ar = 1/2 * np.cross(a, b)
    pos = ar >= 0
    tri = np.c_[(0,0), a, b, (0,0)].T
    # shrink a bit to make individual triangles easier to visually identify
    center = tri.mean(0)
    tri = (tri - center)*0.95 + center
    c = 'b' if pos else 'r'
    ax.plot(*tri.T, 'k')
    ax.fill(*tri.T, c, alpha=0.2, zorder=2 - pos)
    t = ax.text(*center, f'{ar:.1f}', color=c, fontsize=8, **txtkwargs)
    t.set_bbox(dict(facecolor='white', alpha=0.8, edgecolor='none'))
plt.tight_layout()
