How to find the intersection of a horizontal line and a function in Python?

I have created a plot like the one in the figure below: the red curve is the original data, the blue curve is a fitted function, and the horizontal lines mark different levels. I need to find both intersection points with each line. Do you have any suggestions? Thanks in advance.

An easy way to do this numerically is to subtract the y-value of each horizontal line from your fit and then to solve the equation
fit(x) - y = 0 for x.
For this, scipy.optimize.fsolve can be used as follows:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fsolve # To find the zeros
from scipy.stats import cauchy
def my_fit(x):
    # dummy function for this example
    return cauchy().pdf(x)
horizontal_lines = [0.05, 0.15, 0.25]
colors = ['r', 'g', 'm']
x = np.linspace(-5, 5, 1000)
plt.plot(x, my_fit(x))
plt.hlines(horizontal_lines, -5, 5, ls="--", color=colors)
for i, y in enumerate(horizontal_lines):
    x0 = fsolve(lambda x: my_fit(x) - y, -1)
    x1 = fsolve(lambda x: my_fit(x) - y, 1)
    print(f"Intersection points for {y=}: {x0=} {x1=}")
    plt.scatter(x0, my_fit(x0), color=colors[i])
    plt.scatter(x1, my_fit(x1), color=colors[i])
Intersection points for y=0.05: x0=array([-2.3165055]) x1=array([2.3165055])
Intersection points for y=0.15: x0=array([-1.05927612]) x1=array([1.05927612])
Intersection points for y=0.25: x0=array([-0.5227232]) x1=array([0.5227232])

A simple solution to this would be to just calculate the intersection points knowing the functions of the lines.
Line 1 : y = 2x
Line 2 : y = x^2
Intersection points:
2x = x^2
0 = x^2 - 2x
x1 = 2
x2 = 0
For y, just substitute x into one of the functions:
y1 = 2x = 2*(2) = 4
y2 = 2x = 2*(0) = 0
Intersection points of line 1 and line 2:
Intersection point 1: (2, 4)
Intersection point 2: (0,0)
Hope this helps.
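As a quick numerical cross-check of the algebra above, the equation 2x = x^2 can be rewritten as x^2 - 2x = 0 and handed to np.roots. This is only a sketch for this particular pair of functions; the variable names xs and points are made up for the illustration.

```python
import numpy as np

# 2x = x^2  <=>  x^2 - 2x + 0 = 0; coefficients are listed highest power first
xs = np.sort(np.roots([1.0, -2.0, 0.0]).real)

# y follows from substituting x into Line 1 (y = 2x)
points = [(float(xv), float(2 * xv)) for xv in xs]
print(points)
```

The same idea only works when both curves are known polynomials; for a fitted curve without a closed form, a numerical root finder is the more general route.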

The simplest non-mathematical solution would be:
take all points with y > y0
take the largest and smallest x from the remaining list.
The result is approximately correct for high point density.
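The two steps above can be sketched in a few lines of NumPy. The helper name crossing_bounds is hypothetical, and the test curve is a hand-written Cauchy-shaped peak chosen only for illustration; the approach assumes a single peak that rises above the level y0.

```python
import numpy as np

def crossing_bounds(x, y, y0):
    """Return (x_left, x_right): smallest and largest sampled x where y > y0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    above = x[y > y0]            # keep only samples lying above the level
    if above.size == 0:
        return None              # the curve never exceeds this level
    return above.min(), above.max()

# Dummy single-peak curve (Cauchy-shaped), sampled densely
x = np.linspace(-5, 5, 100001)
y = 1.0 / (np.pi * (1.0 + x**2))

left, right = crossing_bounds(x, y, 0.15)
print(left, right)
```

The accuracy is limited by the sample spacing, so the result is at best within one grid step of the true crossing.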


Make a parabola steeper at both sides while keeping both ends

I have a parabola with both axes ranging from 0 to 1, as follows:
The parabola is created and normalized with the following code:
import matplotlib.pyplot as plt
import numpy as np
# normalize array
def min_max_scale_array(arr):
    arr = np.array(arr)
    return (arr - arr.min())/(arr.max()-arr.min())
x = np.linspace(-50,50,100)
y = x**2
x = min_max_scale_array(x)
y = min_max_scale_array(y)
fig, ax = plt.subplots()
ax.plot(x, y)
I want to create another one with both ends being the same but both sides become steeper like this:
I thought of joining an exponential curve and its reflection, but that would make the resulting parabola look pointy at the bottom.
Can you show me how to achieve this? Thank you!
If you want to modify any arbitrary curve, you can change the x values, for example taking a power of it:
# x and y are defined
for factor in [1.1, 1.5, 2, 3, 4]:
    x2 = 2*x-1
    x3 = (abs(x2)**(1/factor))*np.sign(x2)/2+0.5
    ax.plot(x3, y, label=f'{factor=}')
You can change the exponent to get a steeper curve with the same value at the extremes. You need to pick a larger value that is an even integer (odd numbers won't give a parabola).
y = x**4
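To see that a larger even exponent keeps the end points and the minimum fixed while pulling the rest of the curve down, here is a small sketch that reuses the min-max scaling from the question. The exponents 2, 4 and 8 and the 101-point grid are arbitrary choices for the illustration.

```python
import numpy as np

def min_max_scale_array(arr):
    arr = np.asarray(arr, dtype=float)
    return (arr - arr.min()) / (arr.max() - arr.min())

x_raw = np.linspace(-50, 50, 101)
x = min_max_scale_array(x_raw)

# even exponents only, so the curve stays symmetric and non-negative
curves = {n: min_max_scale_array(x_raw**n) for n in (2, 4, 8)}

for n, y in curves.items():
    # left end, bottom, right end, and a point three quarters of the way across
    print(n, y[0], y[50], y[-1], y[75])
```

All three curves share the values at both ends and at the bottom; the higher the exponent, the flatter the bottom and the steeper the sides.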

Generate profiles through a 2D array at an angle without altering pixels

I'd like to plot two profiles through the highest intensity point in a 2D numpy array, which is an image of a blob (i.e. a line through the semi-major axis, and another line through the semi-minor axis). The blob is rotated at an angle theta counterclockwise from the standard x-axis and is asymmetric.
It is a 600x600 array with a max intensity of 1 (at only one pixel) that is located right at the center at (300, 300). The angle rotation from the x-axis (which then gives the location of the semi-major axis when rotated by that angle) is theta = 89.54 degrees. I do not want to use scipy.ndimage.rotate because it uses spline interpolation, and I do not want to change any of my pixel values. But I suppose a nearest-neighbor interpolation method would be okay.
I tried generating lines corresponding to the major and minor axes across the image, but the result was not right at all (the peak was far less than 1), so maybe I did something wrong. The code for this is below:
import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage
def profiles_at_angle(image, axis, theta):
    theta = np.deg2rad(theta)
    if axis == 'major':
        x_0, y_0 = 0, 300-300*np.tan(theta)
        x_1, y_1 = 599, 300+300*np.tan(theta)
    elif axis=='minor':
        x_0, y_0 = 300-300*np.tan(theta), 599
        x_1, y_1 = 300+300*np.tan(theta), -599
    num = 600
    x, y = np.linspace(x_0, x_1, num), np.linspace(y_0, y_1, num)
    z = ndimage.map_coordinates(image, np.vstack((x,y)))
    fig, axes = plt.subplots(nrows=2)
    axes[0].imshow(image, cmap='gray')
profiles_at_angle(image, 'major', theta)
Did I do something obviously wrong in my code above? Or how else can I accomplish this? Thank you.
Edit: Here are some example images. Sorry for the bad quality; my browser crashed every time I tried uploading them anywhere so I had to take photos of the screen.
Figure 1: This is the result of my code above, which is clearly wrong since the peak should be at 1. I'm not sure what I did wrong though.
Figure 2: I made this plot below by just taking the profiles through the standard x and y axes, ignoring any rotation (this only looks good coincidentally because the real angle of rotation is so close to 90 degrees, so I was able to just switch the labels and get this). I want my result to look something like this, but taking the correct rotation angle into account.
Edit: It could be useful to run tests on this method using data very much like my own (it's a 2D Gaussian with nearly the same parameters):
image = np.random.random((600,600))
def generate(data_set):
    xvec = np.arange(0, np.shape(data_set)[1], 1)
    yvec = np.arange(0, np.shape(data_set)[0], 1)
    X, Y = np.meshgrid(xvec, yvec)
    return X, Y
def gaussian_func(xy, x0, y0, sigma_x, sigma_y, amp, theta, offset):
    x, y = xy
    a = (np.cos(theta))**2/(2*sigma_x**2) + (np.sin(theta))**2/(2*sigma_y**2)
    b = -np.sin(2*theta)/(4*sigma_x**2) + np.sin(2*theta)/(4*sigma_y**2)
    c = (np.sin(theta))**2/(2*sigma_x**2) + (np.cos(theta))**2/(2*sigma_y**2)
    inner = a * (x-x0)**2
    inner += 2*b*(x-x0)*(y-y0)
    inner += c * (y-y0)**2
    return (offset + amp * np.exp(-inner)).ravel()
xx, yy = generate(image)
image = gaussian_func((xx.ravel(), yy.ravel()), 300, 300, 5, 4, 1, 1.56, 0)
image = np.reshape(image, (600, 600))
This should do it for you. You just did not properly compute your lines.
theta = 65
peak = np.argwhere(image==1)[0]
x = np.linspace(peak[0]-100,peak[0]+100,1000)
y = lambda x: (x-peak[1])*np.tan(np.deg2rad(theta))+peak[0]
y_maj = np.linspace(y(peak[1]-100),y(peak[1]+100),1000)
y = lambda x: -(x-peak[1])/np.tan(np.deg2rad(theta))+peak[0]
y_min = np.linspace(y(peak[1]-100),y(peak[1]+100),1000)
del y
# ndimage is the module imported in the question (from scipy import ndimage)
z_min = ndimage.map_coordinates(image, np.vstack((x,y_min)))
z_maj = ndimage.map_coordinates(image, np.vstack((x,y_maj)))
fig, axes = plt.subplots(nrows=2)
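Since the question asks for profiles that do not alter pixel values, one option is to pass order=0 to map_coordinates, which selects nearest-neighbour lookup instead of the default cubic spline. The sketch below is an illustration under assumptions, not the answer's exact method: the helper line_profile and its parameters are hypothetical, the test image is a simple unrotated Gaussian, and the angle is measured from the column (x) axis toward increasing row index, so flip the sign of the row term if your display convention differs.

```python
import numpy as np
from scipy import ndimage

def line_profile(image, center, theta_deg, half_length=100, num=201):
    """Sample image along a line through center=(row, col) at angle theta_deg.

    order=0 means nearest-neighbour lookup, so every returned value is an
    existing pixel value (no spline interpolation).
    """
    t = np.linspace(-half_length, half_length, num)   # signed distance from the center
    theta = np.deg2rad(theta_deg)
    rows = center[0] + t * np.sin(theta)
    cols = center[1] + t * np.cos(theta)
    return ndimage.map_coordinates(image, np.vstack((rows, cols)), order=0)

# Hypothetical test image: a Gaussian blob whose single brightest pixel equals 1
yy, xx = np.mgrid[0:600, 0:600]
image = np.exp(-((xx - 300) ** 2 / (2 * 5.0 ** 2) + (yy - 300) ** 2 / (2 * 4.0 ** 2)))

peak = np.unravel_index(np.argmax(image), image.shape)   # (row, col) of the maximum
theta = 89.54
major = line_profile(image, peak, theta)
minor = line_profile(image, peak, theta + 90)             # perpendicular direction
print(major.max(), minor.max())
```

Because the line passes exactly through the peak pixel at t = 0, both profiles contain the maximum value of the image.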

Solve nonlinear equation in python

I am trying to find the fundamental TE mode of a dielectric waveguide. My approach is to compute two functions and find their intersection on a graph. However, I am having trouble getting the intersection point on the plot.
My code:
def LHS(w):
    theta = 2*np.pi*1.455*10*10**(-6)*np.cos(np.deg2rad(w))/(900*10**(-9))
    if(theta>(np.pi/2) or theta < 0):
        y1 = 0
    else:
        y1 = np.tan(theta)
    return y1
def RHS(w):
    y = ((np.sin(np.deg2rad(w)))**2-(1.440/1.455)**2)**0.5/np.cos(np.deg2rad(w))
    return y
x = np.linspace(80, 90, 500)
LHS_vals = [LHS(v) for v in x]
RHS_vals = [RHS(v) for v in x]
# plot
fig, ax = plt.subplots(1, 1, figsize=(6,3))
ax.plot(x, LHS_vals)
ax.plot(x, RHS_vals)
ax.legend(['LHS','RHS'],loc='center left', bbox_to_anchor=(1, 0.5))
I got plot like this:
The intersection point is around 89 degrees, but I am having trouble computing the exact value of x. I have tried fsolve and solve to find the solution, but in vain. It seems unable to print a solution when there is more than one. Is it possible to find only the solution for which x lies in a certain range? Could someone give me a suggestion here? Thanks!
the equation is like this (m=0):
and I am trying to solve theta here by finding the intersection point
One of the ways I tried is this:
from scipy.optimize import fsolve
def f(wy):
    w, y = wy
    z = np.array([y - LHS(w), y - RHS(w)])
    return z
fsolve(f,[85, 90])
However it gives me the wrong answer.
I also tried something like this:
import matplotlib.pyplot as plt
x = np.arange(85, 90, 0.1)
f = LHS(x)
g = RHS(x)
plt.plot(x, f, '-')
plt.plot(x, g, '-')
idx = np.argwhere(np.diff(np.sign(f - g)) != 0).reshape(-1) + 0
plt.plot(x[idx], f[idx], 'ro')
But it shows:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
First, you need to make sure that the function can actually handle numpy arrays. Several options for defining piecewise functions are shown in
Plot Discrete Distribution using np.linspace(). E.g.
def LHS(w):
    theta = 2*np.pi*1.455*10e-6*np.cos(np.deg2rad(w))/(900e-9)
    y1 = np.select([theta < 0, theta <= np.pi/2, theta>np.pi/2], [np.nan, np.tan(theta), np.nan])
    return y1
This already allows using the second approach: plotting a point at the index where the difference between the two curves changes sign.
import numpy as np
import matplotlib.pyplot as plt
def LHS(w):
    theta = 2*np.pi*1.455*10e-6*np.cos(np.deg2rad(w))/(900e-9)
    y1 = np.select([theta < 0, theta <= np.pi/2, theta>np.pi/2], [np.nan, np.tan(theta), np.nan])
    return y1
def RHS(w):
    y = ((np.sin(np.deg2rad(w)))**2-(1.440/1.455)**2)**0.5/np.cos(np.deg2rad(w))
    return y
x = np.linspace(82.1, 89.8, 5000)
f = LHS(x)
g = RHS(x)
plt.plot(x, f, '-')
plt.plot(x, g, '-')
idx = np.argwhere(np.diff(np.sign(f - g)) != 0).reshape(-1) + 0
plt.plot(x[idx], f[idx], 'ro')
One may then also use scipy.optimize.fsolve to get the actual solution.
idx = np.argwhere(np.diff(np.sign(f - g)) != 0)[-1]
from scipy.optimize import fsolve
h = lambda x: LHS(x)-RHS(x)
sol = fsolve(h,x[idx])
plt.plot(sol, LHS(sol), 'ro')
Something quick and (very) dirty that seems to work (at least it gave theta value of ~89 for your parameters) - add the following to your code before the figure, after RHS_vals = [RHS(v) for v in x] line:
# build a list of differences between the values of RHS and LHS
diff = [abs(r_val- l_val) for r_val, l_val in zip(RHS_vals, LHS_vals)]
# find the minimum of absolute values of the differences
# find the index of this minimum difference, then find at which angle it occurred
min_diff = min(diff)
print("Minimum difference {}".format(min_diff))
print("Theta = {}".format(x[diff.index(min_diff)]))
I looked in range of 85-90:
x = np.linspace(85, 90, 500)
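To restrict the search to a given range, as the question asks, a bracketing root finder such as scipy.optimize.brentq is one alternative to fsolve (it is a different technique from the answers above). The sketch below assumes the m = 0 branch and the constants from the question; the names K, h and w_pole are introduced here for illustration only. It scans the interval between the pole of tan() and 90 degrees, where LHS - RHS is continuous, and refines the first sign change.

```python
import numpy as np
from scipy.optimize import brentq

# Prefactor of cos(w) inside tan(), built from the question's constants
K = 2 * np.pi * 1.455 * 10e-6 / 900e-9

def lhs(w):
    return np.tan(K * np.cos(np.deg2rad(w)))

def rhs(w):
    s = np.sin(np.deg2rad(w))
    return np.sqrt(s**2 - (1.440 / 1.455) ** 2) / np.cos(np.deg2rad(w))

def h(w):
    return lhs(w) - rhs(w)

# Angle at which K*cos(w) = pi/2, i.e. where tan() has its pole
w_pole = np.rad2deg(np.arccos((np.pi / 2) / K))

# Scan strictly between the pole and 90 degrees, then bracket the sign change
w = np.linspace(w_pole + 1e-3, 90 - 1e-3, 2000)
vals = h(w)
idx = np.flatnonzero(np.sign(vals[:-1]) != np.sign(vals[1:]))
root = brentq(h, w[idx[0]], w[idx[0] + 1])
print(root, h(root))
```

brentq needs a bracket whose end points give opposite signs, which is why the scan excludes the pole itself: a sign flip across a pole is not a root.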

Find the area between two curves plotted in matplotlib (fill_between area)

I have a list of x and y values for two curves, both having weird shapes, and I don't have a function for any of them. I need to do two things:
Plot it and shade the area between the curves like the image below.
Find the total area of this shaded region between the curves.
I'm able to plot and shade the area between those curves with fill_between and fill_betweenx in matplotlib, but I have no idea how to calculate the exact area between them, especially because I don't have a function for either curve.
Any ideas?
I looked everywhere and can't find a simple solution for this. I'm quite desperate, so any help is much appreciated.
Thank you very much!
EDIT: For future reference (in case anyone runs into the same problem), here is how I've solved this: connected the first and last node/point of each curve together, resulting in a big weird-shaped polygon, then used shapely to calculate the polygon's area automatically, which is the exact area between the curves, no matter which way they go or how nonlinear they are. Works like a charm! :)
Here is my code:
from shapely.geometry import Polygon
x_y_curve1 = [(0.121,0.232),(2.898,4.554),(7.865,9.987)] #these are your points for curve 1 (I just put some random numbers)
x_y_curve2 = [(1.221,1.232),(3.898,5.554),(8.865,7.987)] #these are your points for curve 2 (I just put some random numbers)
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in x_y_curve1:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in x_y_curve2[::-1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in x_y_curve1[0:1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, so it "closes" the polygon
polygon = Polygon(polygon_points)
area = polygon.area
EDIT 2: Thank you for the answers. Like Kyle explained, this only works for positive values. If your curves go below 0 (which is not my case, as shown in the example chart), then you would have to work with absolute numbers.
The area calculation is straightforward in blocks where the two curves don't intersect: that's the trapezium, as has been pointed out above. If they intersect, then you create two triangles between x[i] and x[i+1], and you should add the areas of the two. If you want to do it directly, you should handle the two cases separately. Here's a basic working example to solve your problem. First, I will start with some fake data:
import numpy as np
# let us generate fake test data
x = np.arange(10)
y1 = np.random.rand(10) * 20
y2 = np.random.rand(10) * 20
Now, the main code. Based on your plot, looks like you have y1 and y2 defined at the same X points. Then we define,
z = y1-y2
dx = x[1:] - x[:-1]
cross_test = np.sign(z[:-1] * z[1:])
cross_test will be negative whenever the two graphs cross. At these points, we want to calculate the x coordinate of the crossover. For simplicity, I will calculate x coordinates of the intersection of all segments of y. For places where the two curves don't intersect, they will be useless values, and we won't use them anywhere. This just keeps the code easier to understand.
Suppose you have z1 and z2 at x1 and x2, then we are solving for x0 such that z = 0:
# (z2 - z1)/(x2 - x1) = (z0 - z1) / (x0 - x1) = -z1/(x0 - x1)
# x0 = x1 - (x2 - x1) / (z2 - z1) * z1
x_intersect = x[:-1] - dx / (z[1:] - z[:-1]) * z[:-1]
dx_intersect = - dx / (z[1:] - z[:-1]) * z[:-1]
Where the curves don't intersect, area is simply given by:
areas_pos = abs(z[:-1] + z[1:]) * 0.5 * dx # signs of both z are same
Where they intersect, we add areas of both triangles:
areas_neg = 0.5 * dx_intersect * abs(z[:-1]) + 0.5 * (dx - dx_intersect) * abs(z[1:])
Now, the area in each block x[i] to x[i+1] is to be selected, for which I use np.where:
areas = np.where(cross_test < 0, areas_neg, areas_pos)
total_area = np.sum(areas)
That is your desired answer. As has been pointed out above, this will get more complicated if both y graphs are defined at different x points. If you want to test this, you can simply plot it (in my test case, the y range will be -20 to 20):
import matplotlib.pyplot as plt
negatives = np.where(cross_test < 0)
positives = np.where(cross_test >= 0)
plt.plot(x, y1)
plt.plot(x, y2)
plt.plot(x, z)
plt.vlines(x_intersect[negatives], -20, 20)
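The segment-wise procedure above can be wrapped into one function and checked against cases with a known answer. This is a minimal sketch assuming both curves are sampled at the same x points; the function name area_between is made up for the illustration.

```python
import numpy as np

def area_between(x, y1, y2):
    """Area between two piecewise-linear curves sampled at the same x points."""
    x, y1, y2 = (np.asarray(a, dtype=float) for a in (x, y1, y2))
    z = y1 - y2
    dx = np.diff(x)
    z0, z1 = z[:-1], z[1:]
    crossing = z0 * z1 < 0                       # sign change inside the segment
    trapezoid = 0.5 * np.abs(z0 + z1) * dx       # no crossing: one trapezoid
    with np.errstate(divide="ignore", invalid="ignore"):
        dx_left = -dx * z0 / (z1 - z0)           # distance from x[i] to the crossing
        triangles = 0.5 * dx_left * np.abs(z0) + 0.5 * (dx - dx_left) * np.abs(z1)
    return float(np.sum(np.where(crossing, triangles, trapezoid)))

# Two straight lines crossing at x = 0.5: two triangles of area 0.25 each
print(area_between([0, 1], [0, 1], [1, 0]))
```

Segments without a crossing may produce a division by zero in the unused triangle branch, which is why that part is computed with the warnings suppressed and then discarded by np.where.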
Define your two curves as functions f and g that are linear by segment, e.g. between x1 and x2, f(x) = f(x1) + ((x-x1)/(x2-x1))*(f(x2)-f(x1)).
Define h(x)=abs(g(x)-f(x)). Then use scipy.integrate.quad to integrate h.
That way you don't need to bother about the intersections. It will do the "trapeze summing" suggested by ch41rmn automatically.
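A minimal sketch of this idea, assuming np.interp as the piecewise-linear interpolant (the answer does not name one) and a hypothetical helper area_by_quad. The sample points are passed to quad through its points argument so the integrator knows where the kinks of the interpolants are.

```python
import numpy as np
from scipy.integrate import quad

def area_by_quad(x1, y1, x2, y2):
    """Integrate |f - g| for two piecewise-linear curves given as samples."""
    x1, y1, x2, y2 = (np.asarray(a, dtype=float) for a in (x1, y1, x2, y2))
    lo, hi = max(x1[0], x2[0]), min(x1[-1], x2[-1])     # common x-range

    def h(t):
        return abs(np.interp(t, x1, y1) - np.interp(t, x2, y2))

    # interior sample points, where the interpolants change slope
    knots = np.unique(np.concatenate([x1, x2]))
    knots = knots[(knots > lo) & (knots < hi)]
    area, _ = quad(h, lo, hi,
                   points=knots if knots.size else None,
                   limit=max(50, 2 * knots.size + 10))
    return area

# Lines y = x and y = 3 - x on [0, 3] cross at x = 1.5
print(area_by_quad([0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3], [3, 2, 1, 0]))
```

The two curves do not need to share x coordinates here, since each is interpolated separately; the result is accurate to quad's tolerance rather than exact.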
Your set of data is quite "nice" in the sense that the two sets of data share the same set of x-coordinates. You can therefore calculate the area using a series of trapezoids.
e.g. define the two functions as f(x) and g(x), then, between any two consecutive points in x, you have four points of data:
(x1, f(x1))-->(x2, f(x2))
(x1, g(x1))-->(x2, g(x2))
Then, the area of the trapezoid is
A(x1-->x2) = ( f(x1)-g(x1) + f(x2)-g(x2) ) * (x2-x1)/2 (1)
A complication arises that equation (1) only works for simply-connected regions, i.e. there must not be a cross-over within this region:
|\ |\/|
|_| vs |/\|
The area of the two sides of the intersection must be evaluated separately. You will need to go through your data to find all points of intersections, then insert their coordinates into your list of coordinates. The correct order of x must be maintained. Then, you can loop through your list of simply connected regions and obtain a sum of the area of trapezoids.
For curiosity's sake, if the x-coordinates for the two lists are different, you can instead construct triangles. e.g.
|   / \
|  /   \
| /     \
|/       \
Overlap between triangles must be avoided, so you will again need to find points of intersections and insert them into your ordered list. The lengths of each side of the triangle can be calculated using Pythagoras' formula, and the area of the triangles can be calculated using Heron's formula.
The area_between_two_curves function in pypi library similaritymeasures (released in 2018) might give you what you need. I tried a trivial example on my side, comparing the area between a function and a constant value and got pretty close tie-back to Excel (within 2%). Not sure why it doesn't give me 100% tie-back, maybe I am doing something wrong. Worth considering though.
I had the same problem. The answer below is based on an attempt by the question author. However, shapely will not directly give the area of the polygon in purple. You need to edit the code to break it up into its component polygons and then get the area of each, after which you simply add them up.
Area Between two lines
Consider the lines below:
Sample Two lines
If you run the code below you will get zero for area because it takes the clockwise and subtracts the anti clockwise area:
from shapely.geometry import Polygon
x_y_curve1 = [(1,1),(2,1),(3,3),(4,3)] #these are your points for curve 1
x_y_curve2 = [(1,3),(2,3),(3,1),(4,1)] #these are your points for curve 2
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in x_y_curve1:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in x_y_curve2[::-1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in x_y_curve1[0:1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, so it "closes" the polygon
polygon = Polygon(polygon_points)
area = polygon.area
The solution is therefore to split the polygon into smaller pieces based on where the lines intersect. Then use a for loop to add these up:
import numpy as np
from shapely.geometry import Polygon, LineString
from shapely.ops import unary_union, polygonize
x_y_curve1 = [(1,1),(2,1),(3,3),(4,3)] #these are your points for curve 1
x_y_curve2 = [(1,3),(2,3),(3,1),(4,1)] #these are your points for curve 2
polygon_points = [] #creates a empty list where we will append the points to create the polygon
for xyvalue in x_y_curve1:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 1
for xyvalue in x_y_curve2[::-1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append all xy points for curve 2 in the reverse order (from last point to first point)
for xyvalue in x_y_curve1[0:1]:
    polygon_points.append([xyvalue[0],xyvalue[1]]) #append the first point in curve 1 again, so it "closes" the polygon
polygon = Polygon(polygon_points)
area = polygon.area
x,y = polygon.exterior.xy
# original data
ls = LineString(np.c_[x, y])
# closed, non-simple
lr = LineString(ls.coords[:] + ls.coords[0:1])
lr.is_simple # False
mls = unary_union(lr)
mls.geom_type # 'MultiLineString'
Area_cal =[]
for polygon in polygonize(mls):
    Area_cal.append(polygon.area) # area of each simple sub-polygon
Area_poly = (np.asarray(Area_cal).sum())
A straightforward application of the area of a general polygon (see Shoelace formula) makes for a super-simple and fast, vectorized calculation:
def area(p):
    # for p: 2D vertices of a polygon:
    # area = 1/2 abs(sum(p0 ^ p1 + p1 ^ p2 + ... + pn-1 ^ p0))
    # where ^ is the cross product
    return np.abs(np.cross(p, np.roll(p, 1, axis=0)).sum()) / 2
Application to area between two curves. In this example, we don't even have matching x coordinates!
n0 = 10
n1 = 15
xy0 = np.c_[np.linspace(0, 10, n0), np.random.uniform(0, 10, n0)]
xy1 = np.c_[np.linspace(0, 10, n1), np.random.uniform(0, 10, n1)]
p = np.r_[xy0, xy1[::-1]]
>>> area(p)
plt.plot(*xy0.T, 'b-')
plt.plot(*xy1.T, 'r-')
p = np.r_[xy0, xy1[::-1]]
plt.fill(*p.T, alpha=.2)
For both curves having 1 million points:
n = 1_000_000
xy0 = np.c_[np.linspace(0, 10, n), np.random.uniform(0, 10, n)]
xy1 = np.c_[np.linspace(0, 10, n), np.random.uniform(0, 10, n)]
%timeit area(np.r_[xy0, xy1[::-1]])
# 42.9 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
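For a sanity check of the shoelace formula on polygons with a known area, the sketch below writes the 2D cross product out explicitly (x_i*y_{i+1} - x_{i+1}*y_i) instead of calling np.cross. The function name shoelace_area is made up here. The second test polygon is built from the two crossing example curves used earlier in this thread; it shows that the signed contributions cancel when the curves cross, so in that case the formula does not return the shaded area.

```python
import numpy as np

def shoelace_area(p):
    """Absolute value of the signed polygon area for vertices p of shape (n, 2)."""
    p = np.asarray(p, dtype=float)
    x, y = p[:, 0], p[:, 1]
    # sum of x_i*y_{i+1} - x_{i+1}*y_i, with wrap-around from the last vertex to the first
    return 0.5 * abs(np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]

curve1 = [(1, 1), (2, 1), (3, 3), (4, 3)]
curve2 = [(1, 3), (2, 3), (3, 1), (4, 1)]
crossing_polygon = curve1 + curve2[::-1]   # curves cross between x = 2 and x = 3

print(shoelace_area(square), shoelace_area(crossing_polygon))
```

When the curves may cross, splitting the region at the intersections (or integrating the absolute difference) avoids the cancellation.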
Simple viz of polygon area calculation
# say:
p = np.array([[0, 3], [1, 0], [3, 3], [1, 3], [1, 2]])
p_closed = np.r_[p, p[:1]]
fig, axes = plt.subplots(ncols=2, figsize=(10, 5), subplot_kw=dict(box_aspect=1), sharex=True)
ax = axes[0]
ax.plot(*p_closed.T, '.-')
ax.fill(*p_closed.T, alpha=0.6)
center = p.mean(0)
txtkwargs = dict(ha='center', va='center')
ax.text(*center, f'{area(p):.2f}', **txtkwargs)
ax = axes[1]
for a, b in zip(p_closed, p_closed[1:]):
    ar = 1/2 * np.cross(a, b)
    pos = ar >= 0
    tri = np.c_[(0,0), a, b, (0,0)].T
    # shrink a bit to make individual triangles easier to visually identify
    center = tri.mean(0)
    tri = (tri - center)*0.95 + center
    c = 'b' if pos else 'r'
    ax.plot(*tri.T, 'k')
    ax.fill(*tri.T, c, alpha=0.2, zorder=2 - pos)
    t = ax.text(*center, f'{ar:.1f}', color=c, fontsize=8, **txtkwargs)
    t.set_bbox(dict(facecolor='white', alpha=0.8, edgecolor='none'))

Plane fitting to 4 (or more) XYZ points

I have 4 points that lie very nearly in one plane; they belong to a 1,4-dihydropyridine ring.
I need to calculate the distance from C3 and N1 to the plane defined by C1-C2-C4-C5.
Calculating the distance is fine, but fitting the plane is quite difficult for me.
1,4-DHP cycle:
1,4-DHP cycle, another view:
from array import *
from numpy import *
from scipy import *
# coordinates (XYZ) of C1, C2, C4 and C5
x = [0.274791784, -1.001679346, -1.851320839, 0.365840754]
y = [-1.155674199, -1.215133985, 0.053119249, 1.162878076]
z = [1.216239624, 0.764265677, 0.956099579, 1.198231236]
# plane equation Ax + By + Cz = D
# non-fitted plane
abcd = [0.506645455682, -0.185724560275, -1.43998120646, 1.37626378129]
# creating distance variable
distance = zeros(4, float)
# calculating distance from point to plane
for i in range(4):
    distance[i] = (x[i]*abcd[0]+y[i]*abcd[1]+z[i]*abcd[2]+abcd[3])/sqrt(abcd[0]**2 + abcd[1]**2 + abcd[2]**2)
print(distance)
# calculating squares
squares = distance**2
print(squares)
How to make sum(squares) minimized? I have tried least squares, but it is too hard for me.
That sounds about right, but you should replace the nonlinear optimization with an SVD. The following creates the moment of inertia tensor, M, and then SVD's it to get the normal to the plane. This should be a close approximation to the least-squares fit and be much faster and more predictable. It returns the point-cloud center and the normal.
def planeFit(points):
    """
    p, n = planeFit(points)
    Given an array, points, of shape (d,...)
    representing points in d-dimensional space,
    fit an d-dimensional plane to the points.
    Return a point, p, on the plane (the point-cloud centroid),
    and the normal, n.
    """
    import numpy as np
    from numpy.linalg import svd
    points = np.reshape(points, (np.shape(points)[0], -1)) # Collapse trailing dimensions
    assert points.shape[0] <= points.shape[1], "There are only {} points in {} dimensions.".format(points.shape[1], points.shape[0])
    ctr = points.mean(axis=1)
    x = points - ctr[:,np.newaxis]
    M = np.dot(x, x.T) # Could also use np.cov(x) here.
    return ctr, svd(M)[0][:,-1]
For example: Construct a 2D cloud at (10, 100) that is thin in the x direction and 100 times bigger in the y direction:
>>> pts = np.diag((.1, 10)).dot(np.random.randn(2,1000)) + np.reshape((10, 100),(2,-1))
The fit plane is very nearly at (10, 100) with a normal very nearly along the x axis.
>>> planeFit(pts)
(array([ 10.00382471, 99.48404676]),
array([ 9.99999881e-01, 4.88824145e-04]))
Least squares should fit a plane easily. The equation for a plane is: ax + by + c = z. So set up matrices like this with all your data:
A = [[x_0, y_0, 1],
     [x_1, y_1, 1],
     ...
     [x_n, y_n, 1]]
x = [a, b, c]^T
B = [z_0, z_1, ..., z_n]^T
In other words: Ax = B. Now solve for x, which holds your coefficients. But since you have more than 3 points, the system is over-determined, so you need to use the left pseudo-inverse. So the answer is:
x = [a, b, c]^T = (A^T A)^-1 A^T B
And here is some simple Python code with an example:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
# synthetic-data parameters (not given in this snippet; values assumed for illustration)
N_POINTS = 10
TARGET_X_SLOPE = 2
TARGET_y_SLOPE = 3
TARGET_OFFSET = 5
EXTENTS = 5
NOISE = 5
# create random data
xs = [np.random.uniform(2*EXTENTS)-EXTENTS for i in range(N_POINTS)]
ys = [np.random.uniform(2*EXTENTS)-EXTENTS for i in range(N_POINTS)]
zs = []
for i in range(N_POINTS):
    zs.append(xs[i]*TARGET_X_SLOPE + \
              ys[i]*TARGET_y_SLOPE + \
              TARGET_OFFSET + np.random.normal(scale=NOISE))
# plot raw data
ax = plt.subplot(111, projection='3d')
ax.scatter(xs, ys, zs, color='b')
# do fit
tmp_A = []
tmp_b = []
for i in range(len(xs)):
    tmp_A.append([xs[i], ys[i], 1])
    tmp_b.append(zs[i])
b = np.matrix(tmp_b).T
A = np.matrix(tmp_A)
fit = (A.T * A).I * A.T * b
errors = b - A * fit
residual = np.linalg.norm(errors)
print("solution: %f x + %f y + %f = z" % (fit[0], fit[1], fit[2]))
print("residual: {}".format(residual))
# plot plane
xlim = ax.get_xlim()
ylim = ax.get_ylim()
X,Y = np.meshgrid(np.arange(xlim[0], xlim[1]),
                  np.arange(ylim[0], ylim[1]))
Z = np.zeros(X.shape)
for r in range(X.shape[0]):
    for c in range(X.shape[1]):
        Z[r,c] = fit[0] * X[r,c] + fit[1] * Y[r,c] + fit[2]
ax.plot_wireframe(X,Y,Z, color='k')
The solution for your points:
0.143509 x + 0.057196 y + 1.129595 = z
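The same least-squares problem can also be handed to np.linalg.lstsq, which avoids forming the pseudo-inverse by hand. This is a sketch of an equivalent route, not the code used for the result above; the helper name fit_plane_lstsq is hypothetical. It is checked on synthetic points that lie exactly on a known plane and then applied to the four coordinates from the question.

```python
import numpy as np

def fit_plane_lstsq(x, y, z):
    """Least-squares fit of z = a*x + b*y + c; returns the array [a, b, c]."""
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    A = np.column_stack([x, y, np.ones_like(x)])   # one row [x_i, y_i, 1] per point
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs

# Synthetic check: points lying exactly on z = 2x + 3y + 5
xs = np.array([0.0, 1.0, 0.0, 1.0, 2.0])
ys = np.array([0.0, 0.0, 1.0, 1.0, 3.0])
exact = fit_plane_lstsq(xs, ys, 2 * xs + 3 * ys + 5)

# Coordinates of C1, C2, C4 and C5 from the question
x = [0.274791784, -1.001679346, -1.851320839, 0.365840754]
y = [-1.155674199, -1.215133985, 0.053119249, 1.162878076]
z = [1.216239624, 0.764265677, 0.956099579, 1.198231236]
a, b, c = fit_plane_lstsq(x, y, z)
print("%f x + %f y + %f = z" % (a, b, c))
```

Note that this form minimizes the residuals along z only, not the perpendicular distances to the plane, so it differs slightly from the SVD-based fits when the plane is steep.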
The fact that you are fitting to a plane is only slightly relevant here. What you are trying to do is minimize a particular function starting from a guess. For that, use scipy.optimize. Note that there is no guarantee that this is the globally optimal solution, only a locally optimal one. A different initial condition may converge to a different result; this works well if you start close to the local minimum you are seeking.
I've taken the liberty to clean up your code by taking advantage of numpy's broadcasting:
import numpy as np
# coordinates (XYZ) of C1, C2, C4 and C5
XYZ = np.array([
[0.274791784, -1.001679346, -1.851320839, 0.365840754],
[-1.155674199, -1.215133985, 0.053119249, 1.162878076],
[1.216239624, 0.764265677, 0.956099579, 1.198231236]])
# Initial guess of the plane
p0 = [0.506645455682, -0.185724560275, -1.43998120646, 1.37626378129]
def f_min(X,p):
    plane_xyz = p[0:3]
    distance = (plane_xyz*X.T).sum(axis=1) + p[3]
    return distance / np.linalg.norm(plane_xyz)
def residuals(params, signal, X):
    return f_min(X, params)
from scipy.optimize import leastsq
sol = leastsq(residuals, p0, args=(None, XYZ))[0]
print("Solution: ", sol)
print("Old Error: ", (f_min(XYZ, p0)**2).sum())
print("New Error: ", (f_min(XYZ, sol)**2).sum())
This gives:
Solution: [ 14.74286241 5.84070802 -101.4155017 114.6745077 ]
Old Error: 0.441513295404
New Error: 0.0453564286112
This returns the 3D plane coefficients along with the RMSE of the fit.
The plane is provided in a homogeneous coordinate representation, meaning its dot product with the homogeneous coordinates of a point produces the distance between the two.
import numpy as np
def fit_plane(points):
    assert points.shape[1] == 3
    centroid = points.mean(axis=0)
    x = points - centroid[None, :]
    U, S, Vt = np.linalg.svd(x.T @ x)
    normal = U[:, -1]
    origin_distance = normal @ centroid
    rmse = np.sqrt(S[-1] / len(points))
    return np.hstack([normal, -origin_distance]), rmse
Minor note: the SVD can also be directly applied to the points instead of the outer product matrix, but I found it to be slower with NumPy's SVD implementation.
U, S, Vt = np.linalg.svd(x.T, full_matrices=False)
rmse = S[-1] / np.sqrt(len(points))
Another way, aside from SVD, to quickly reach a solution while dealing with outliers (when you have a large data set) is RANSAC:
def fit_plane(voxels, iterations=50, inlier_thresh=10): # voxels : x,y,z
    inliers, planes = [], []
    xy1 = np.concatenate([voxels[:, :-1], np.ones((voxels.shape[0], 1))], axis=1)
    z = voxels[:, -1].reshape(-1, 1)
    for _ in range(iterations):
        random_pts = voxels[np.random.choice(voxels.shape[0], voxels.shape[1] * 10, replace=False), :]
        plane_transformation, residual = fit_pts_to_plane(random_pts)
        inliers.append(((z - np.matmul(xy1, plane_transformation)) <= inlier_thresh).sum())
        planes.append(plane_transformation) # keep each candidate so the best one can be returned
    return planes[np.array(inliers).argmax()]
def fit_pts_to_plane(voxels): # x y z (m x 3)
    xy1 = np.concatenate([voxels[:, :-1], np.ones((voxels.shape[0], 1))], axis=1)
    z = voxels[:, -1].reshape(-1, 1)
    fit = np.matmul(np.matmul(np.linalg.inv(np.matmul(xy1.T, xy1)), xy1.T), z)
    errors = z - np.matmul(xy1, fit)
    residual = np.linalg.norm(errors)
    return fit, residual
Here's one way. If your points are P[1]..P[n], then compute the mean M of these and subtract it from each, getting points p[1]..p[n]. Then compute C = Sum{ p[i]*p[i]'} (the "covariance" matrix of the points). Next diagonalise C, that is, find orthogonal U and diagonal E so that C = U*E*U'. If your points are indeed on a plane, then one of the eigenvalues (i.e. the diagonal entries of E) will be very small (with perfect arithmetic it would be 0). In any case, if the j'th one of these is the smallest, then let N = (A,B,C) be the j'th column of U and compute D = -M'*N. These parameters define the "best" plane Ax + By + Cz + D = 0, the one such that the sum of the squares of the distances from the P[] to the plane is least.
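The recipe above can be sketched with np.linalg.eigh, which returns the eigenvalues of a symmetric matrix in ascending order, so the first eigenvector is the plane normal. The function names plane_from_points and signed_distance are made up for this illustration, and the test points are a synthetic square in the plane z = 1 rather than the molecule's coordinates.

```python
import numpy as np

def plane_from_points(P):
    """Best-fit plane N . r + D = 0 through points P of shape (n, 3)."""
    P = np.asarray(P, dtype=float)
    M = P.mean(axis=0)                 # centroid
    p = P - M                          # centred points
    C = p.T @ p                        # 3x3 "covariance" matrix
    evals, U = np.linalg.eigh(C)       # eigenvalues in ascending order
    N = U[:, 0]                        # eigenvector of the smallest eigenvalue (unit length)
    D = -M @ N
    return N, D

def signed_distance(point, N, D):
    """Signed distance from a point to the plane; N is assumed to be a unit vector."""
    return float(np.dot(N, point) + D)

# Four points lying exactly in the plane z = 1
pts = [(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
N, D = plane_from_points(pts)
print(N, D, signed_distance((0.3, 0.7, 3.0), N, D))
```

The sign of N (and therefore of the distance) is arbitrary, so take the absolute value when only the distance of an atom from the plane matters.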
