Finding the Closest Points between Two Cubic Splines with Python and Numpy

I'm looking for a way to analyze two cubic splines and find the point where they come the closest to each other. I've seen a lot of solutions and posts but I've been unable to implement the methods suggested. I know that the closest point will be one of the end-points of the two curves or a point where the first derivative of both curves is equal. Checking the end points is easy. Finding the points where the first derivatives match is hard.
Given:
Curve 0 is B(t) (red)
Curve 1 is C(s) (blue)
A candidate for closest point is where:
B'(t) = C'(s)
The first derivative of each curve takes the following form:
B'(t) = 3a(1-t)^2 + 6b(1-t)t + 3ct^2
where the a, b, c coefficients are formed from the control points of the curve:
a = P1 - P0
b = P2 - P1
c = P3 - P2
Taking the 4 control points of each cubic spline, I can put each curve's derivative coefficients into a matrix form that can be expressed with NumPy, using the following Python code:
import numpy as np

def test_closest_points():
    # Control points for the two cubic splines.
    spline_0 = [(1, 28), (58, 93), (113, 95), (239, 32)]
    spline_1 = [(58, 241), (26, 76), (225, 83), (211, 205)]

    first_derivative_matrix = np.array([[3, -6, 3], [-6, 6, 0], [3, 0, 0]])

    spline_0_x_A = spline_0[1][0] - spline_0[0][0]
    spline_0_x_B = spline_0[2][0] - spline_0[1][0]
    spline_0_x_C = spline_0[3][0] - spline_0[2][0]
    spline_0_y_A = spline_0[1][1] - spline_0[0][1]
    spline_0_y_B = spline_0[2][1] - spline_0[1][1]
    spline_0_y_C = spline_0[3][1] - spline_0[2][1]

    spline_1_x_A = spline_1[1][0] - spline_1[0][0]
    spline_1_x_B = spline_1[2][0] - spline_1[1][0]
    spline_1_x_C = spline_1[3][0] - spline_1[2][0]
    spline_1_y_A = spline_1[1][1] - spline_1[0][1]
    spline_1_y_B = spline_1[2][1] - spline_1[1][1]
    spline_1_y_C = spline_1[3][1] - spline_1[2][1]

    spline_0_first_derivative_x_coefficients = np.array([[spline_0_x_A], [spline_0_x_B], [spline_0_x_C]])
    spline_0_first_derivative_y_coefficients = np.array([[spline_0_y_A], [spline_0_y_B], [spline_0_y_C]])
    spline_1_first_derivative_x_coefficients = np.array([[spline_1_x_A], [spline_1_x_B], [spline_1_x_C]])
    spline_1_first_derivative_y_coefficients = np.array([[spline_1_y_A], [spline_1_y_B], [spline_1_y_C]])

    # Show all the matrix values.
    print('first_derivative_matrix:')
    print(first_derivative_matrix)
    print()
    print('spline_0_first_derivative_x_coefficients:')
    print(spline_0_first_derivative_x_coefficients)
    print()
    print('spline_0_first_derivative_y_coefficients:')
    print(spline_0_first_derivative_y_coefficients)
    print()
    print('spline_1_first_derivative_x_coefficients:')
    print(spline_1_first_derivative_x_coefficients)
    print()
    print('spline_1_first_derivative_y_coefficients:')
    print(spline_1_first_derivative_y_coefficients)
    print()

    # Now taking B(t) as spline_0 and C(s) as spline_1, I need to find
    # the values of t and s where B'(t) = C'(s).
This post has some good high-level advice, but I'm unsure how to implement a solution in Python that can find the correct values of t and s with matching first derivatives (slopes). The B'(t) - C'(s) = 0 problem seems like a matter of finding roots. Any advice on how to do it with Python and NumPy would be greatly appreciated.

Using NumPy suggests that the problem should be solved numerically. Without loss of generality we can assume that 0 < s <= 1 and 0 < t <= 1. You can use the SciPy package to solve the problem numerically, e.g.
from scipy.optimize import minimize
import numpy as np

def B(t):
    """Assumed for simplicity: 0 < t <= 1"""
    return np.sin(6.28 * t), np.cos(6.28 * t)

def C(s):
    """0 < s <= 1"""
    return 10 + np.sin(3.14 * s), 10 + np.cos(3.14 * s)

def Q(x):
    """Squared distance function to be minimized"""
    b = B(x[0])
    c = C(x[1])
    return (b[0] - c[0]) ** 2 + (b[1] - c[1]) ** 2

res = minimize(Q, (0.5, 0.5))
print("B-Point: ", B(res.x[0]))
print("C-Point: ", C(res.x[1]))
B-Point: (0.7071067518175205, 0.7071068105555733)
C-Point: (9.292893243165555, 9.29289319446135)
This example uses two circles (one full circle and one arc). The same approach will likely work with splines.
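As a minimal sketch of that claim (my addition, not part of the original answer), the same minimize call can be pointed at the question's two cubic Bezier curves, with bounds keeping both parameters in [0, 1]:

import numpy as np
from scipy.optimize import minimize

def bezier(t, P):
    """Evaluate a cubic Bezier curve with control points P (4 x 2) at parameter t."""
    P = np.asarray(P, dtype=float)
    return ((1 - t)**3 * P[0] + 3 * (1 - t)**2 * t * P[1]
            + 3 * (1 - t) * t**2 * P[2] + t**3 * P[3])

spline_0 = [(1, 28), (58, 93), (113, 95), (239, 32)]
spline_1 = [(58, 241), (26, 76), (225, 83), (211, 205)]

def Q(x):
    # Squared distance between B(t) and C(s) for x = (t, s).
    d = bezier(x[0], spline_0) - bezier(x[1], spline_1)
    return np.dot(d, d)

res = minimize(Q, (0.5, 0.5), bounds=[(0, 1), (0, 1)])
print("t, s:", res.x)
print("closest points:", bezier(res.x[0], spline_0), bezier(res.x[1], spline_1))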

Your assumption of B'(t) = C'(s) is too strong.
Derivatives have direction and magnitude. Directions must coincide in the candidate points, but magnitudes might differ.
To find points with the same tangent slopes and the closest distance you can solve a system of equations (of high degree, unfortunately):
yb'(t) * xc'(u) - yc'(u) * xb'(t) = 0   // cross product of (anti)collinear vectors is zero
((xb(t) - xc(u))^2 + (yb(t) - yc(u))^2)' = 0   // derivative of the squared distance is zero
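A minimal sketch (my addition; the helper names bezier and bezier_d are assumptions, not from the original posts) of solving such a system numerically with scipy.optimize.fsolve. It uses the equivalent stationarity conditions (B - C) . B' = 0 and (B - C) . C' = 0 of the squared distance; in 2D, when the curves do not touch, these force B - C to be perpendicular to both tangents, which makes the tangents (anti)collinear as stated above:

import numpy as np
from scipy.optimize import fsolve

def bezier(t, P):
    P = np.asarray(P, dtype=float)
    return ((1 - t)**3 * P[0] + 3 * (1 - t)**2 * t * P[1]
            + 3 * (1 - t) * t**2 * P[2] + t**3 * P[3])

def bezier_d(t, P):
    # Derivative of a cubic Bezier: 3(1-t)^2 a + 6(1-t)t b + 3t^2 c
    P = np.asarray(P, dtype=float)
    a, b, c = P[1] - P[0], P[2] - P[1], P[3] - P[2]
    return 3 * (1 - t)**2 * a + 6 * (1 - t) * t * b + 3 * t**2 * c

spline_0 = [(1, 28), (58, 93), (113, 95), (239, 32)]
spline_1 = [(58, 241), (26, 76), (225, 83), (211, 205)]

def equations(x):
    t, u = x
    diff = bezier(t, spline_0) - bezier(u, spline_1)
    return [np.dot(diff, bezier_d(t, spline_0)),
            np.dot(diff, bezier_d(u, spline_1))]

t, u = fsolve(equations, (0.5, 0.5))   # a candidate stationary point, not guaranteed global
print(t, u, bezier(t, spline_0), bezier(u, spline_1))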

You can also use the function fmin:
import numpy as np
import matplotlib.pylab as plt
from scipy.optimize import fmin

def BCubic(t, P0, P1, P2, P3):
    a = P1 - P0
    b = P2 - P1
    c = P3 - P2
    return a * 3 * (1 - t)**2 + b * 6 * (1 - t) * t + c * 3 * t**2

def B(t):
    return BCubic(t, 4, 2, 3, 1)

def C(t):
    return BCubic(t, 1, 4, 3, 4)

def f(t):
    # L1 or Manhattan distance
    return abs(B(t) - C(t))

init = 0  # 2

tmin = fmin(f, np.array([init]))
# Optimization terminated successfully.
#          Current function value: 2.750000
#          Iterations: 23
#          Function evaluations: 46
print(tmin)
# [0.5833125]
tmin = tmin[0]

t = np.linspace(0, 2, 100)
plt.plot(t, B(t), label='B')
plt.plot(t, C(t), label='C')
plt.plot(t, abs(B(t) - C(t)), label='|B-C|')
plt.plot(tmin, B(tmin), 'r.', markersize=12, label='min')
plt.axvline(x=tmin, linestyle='--', color='k')
plt.legend()
plt.show()
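Note that fmin is unconstrained; if t must stay within [0, 1], a bounded scalar minimizer is one option (a sketch I've added, reusing f from the snippet above):

from scipy.optimize import minimize_scalar

res = minimize_scalar(f, bounds=(0, 1), method='bounded')  # f as defined above
print(res.x, res.fun)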

Related

How to correctly display the integral on the graph?

I computed the integral and want to display it on the graph, but I am not sure how it should be placed there correctly. It seems to me that plt.plot() alone is not enough, but maybe I am wrong; I would like to know the correct way to display this result on a graph.
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import quad

def integral(x, a, b):
    return a * np.log(x + b)

a = 3
b = 2
I = quad(integral, 1, 5, args=(a, b))
print(I)
plt.plot()
plt.show()
I assume you know calculus but not so much about programming.
matplotlib's plot only plots data, so you have to construct an array with the data points you want to plot. Also, the result of quad is a pair of numbers: the definite integral approximation and an estimated bound on the numerical error.
If you want to plot the antiderivative of a function, you will have to compute the integral for each of the points you want to display.
Here is an example in which I create an array x and compute the integral between each pair of consecutive elements x[i] and x[i+1], then use a cumulative sum to get the curve.
For reference, I also plotted the analytic integral.
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import quad

def integral(x, a, b):
    return a * np.log(x + b)

def II(a, b, x0, x1=None):
    if x1 is None:
        # indefinite integral (antiderivative)
        return a * ((x0 + b) * np.log(x0 + b) - x0)
    else:
        # definite integral
        return II(a, b, x1) - II(a, b, x0)

a = 3
b = 2
# plot 100 points equally spaced for 1 < x < 5
x = np.linspace(1, 5, 100)
# The first column of I is the value of the integral, the second is the accumulated error.
# I[i, 0] is the integral from x[0] to x[i+1].
I = np.cumsum([quad(integral, x[i], x[i+1], args=(a, b)) for i in range(len(x) - 1)], axis=0)
# Now you can plot I
plt.plot(x[1:], I[:, 0])
plt.plot(x[1:], II(a, b, x[0], x[1:]), '--r')
plt.show()
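As an aside, a sketch of an alternative I'd suggest (assuming SciPy >= 1.6): scipy.integrate.cumulative_trapezoid builds the same cumulative curve without an explicit Python loop:

import numpy as np
from scipy.integrate import cumulative_trapezoid

a, b = 3, 2
x = np.linspace(1, 5, 100)
# Cumulative integral of a*log(x + b) from x[0] to each x[i];
# plt.plot(x, F) then gives the antiderivative curve up to the integration constant.
F = cumulative_trapezoid(a * np.log(x + b), x, initial=0)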
Also, take the tour if you haven't already, to learn how to classify the answers you receive.
I hope this answers your question.

Plotting orthogonal distances in python

Given a set of points and a line in 2D, I would like to plot the orthogonal distance between each point and the line. Any suggestions?
Find the equation of your given line in the form y = m*x + b, where m is the slope and b is the y-intercept. The slope of the perpendicular line is the negative inverse of your known slope (i.e. m2 = -1/m). Use the given point and the new slope m2 to get the equation of the line perpendicular to the given line which goes through your point. Set the second line equal to the first and solve for x and y; this is where the two lines intersect. Finally, take the difference between the intersection and the given point and find its magnitude to determine the distance between the given line and the given point:
distance = ((x2 - x)**2 + (y2 - y)**2)**0.5
where [x2, y2] is the given point and [x, y] is the intersection.
The sample code below illustrates this technique (the original answer included a generated figure):
import matplotlib.pyplot as plt
import numpy as np

# points follow [x, y] format
line_point1 = [2, 3]
line_point2 = [6, 8]
random_point = [-6, 5]

def line(x, get_eq=False):
    m = (line_point1[1] - line_point2[1]) / (line_point1[0] - line_point2[0])
    b = line_point1[1] - m * line_point1[0]
    if get_eq:
        return m, b
    else:
        return m * x + b

def perpendicular_line(x, get_eq=False):
    m, b = line(0, True)
    m2 = -1 / m
    b2 = random_point[1] - m2 * random_point[0]
    if get_eq:
        return m2, b2
    else:
        return m2 * x + b2

def get_intersection():
    m, b = line(0, True)
    m2, b2 = perpendicular_line(0, True)
    x = (b2 - b) / (m - m2)
    y = line(x)
    return [x, y]

domain = np.linspace(-10, 10)
plt.figure(figsize=(8, 9))
plt.plot(domain, [line(x) for x in domain], label='given line')
plt.plot(random_point[0], random_point[1], 'ro', label='given point')
plt.plot(domain, [perpendicular_line(x) for x in domain], '--', color='orange', label='perpendicular line')
intersection = get_intersection()
plt.plot(intersection[0], intersection[1], 'go', label='intersection')
plt.plot([intersection[0], random_point[0]], [intersection[1], random_point[1]], color='black', label='distance')
plt.legend()
plt.grid()
plt.show()

distance = ((random_point[0] - intersection[0])**2 + (random_point[1] - intersection[1])**2)**0.5
print(distance)
This is more like a math question.
What @jacob says is a perfect solution using coordinate geometry. If you prefer to use vector math (and hence numpy arrays as vectors), you can go about it like this:
Consider the vector equation of a line: L = A + ql
(q is a free parameter, the one we wish to find)
The position vector of your point: P
The orthogonality condition (the vector from P to the foot of the perpendicular is perpendicular to the line direction, i.e. their dot product is zero): (L - P) . l = 0
Hence, (A + ql - P) . l = 0
or, q = (P - A) . l (since l is a unit vector, l . l = 1)
(Bold denotes a vector, bold-small denotes a unit vector, all else are scalars)
We've found q; substituting q into the vector equation of the line yields the position vector of the foot of the perpendicular dropped from the point onto the line. Now just find the distance between the two points, which is the magnitude of the difference vector:
d = |P - L(q)|
A numpy implementation is pretty straightforward:
(Define A, l and P as numpy arrays first)
...
L = lambda q: A + q*l               # define the line as a function of q
q = numpy.dot(P - A, l)             # foot-of-perpendicular parameter (l is a unit vector)
d = numpy.linalg.norm(P - L(q))     # find the norm of the difference vector
The advantage to this method is, it works in N-dimensions as well.
Here is a resource to refer to, on vector equations of lines.
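For completeness, here is a self-contained sketch of the above (my addition; the numeric values for A, l and P are made up for illustration):

import numpy as np

A = np.array([2.0, 3.0])            # a point on the line
l = np.array([4.0, 5.0])
l = l / np.linalg.norm(l)           # unit direction vector of the line
P = np.array([-6.0, 5.0])           # the given point

L = lambda q: A + q * l             # the line as a function of q
q = np.dot(P - A, l)                # foot-of-perpendicular parameter (l is unit-length)
d = np.linalg.norm(P - L(q))        # orthogonal distance
print(d)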

fmin_slsqp returns initial guess finding the minimum of cubic spline

I am trying to find the minimum of a natural cubic spline. I have written the following code to find the natural cubic spline. (I have been given test data and have confirmed this method is correct.) Now I cannot figure out how to find the minimum of this function.
This is the data
xdata = np.linspace(0.25, 2, 8)
ydata = 10**(-12) * np.array([1,2,1,2,3,1,1,2])
This is the function
import scipy as sp
import numpy as np
import math
from numpy.linalg import inv
from scipy.optimize import fmin_slsqp
from scipy.optimize import minimize, rosen, rosen_der

def phi(x, xd, yd):
    n = len(xd)
    h = np.array(xd[1:n] - xd[0:n-1])
    f = np.divide(yd[1:n] - yd[0:n-1], h)
    q = [0] * (n - 2)
    for i in range(n - 2):
        q[i] = 3 * (f[i+1] - f[i])
    A = np.zeros((n-2, n-2))
    # define A for j = 0
    A[0, 0] = 2 * (h[0] + h[1])
    A[0, 1] = h[1]
    # define A for j = n-2
    A[-1, -2] = h[-2]
    A[-1, -1] = 2 * (h[-2] + h[-1])
    # define A for the rows in the middle
    for j in range(1, n-3):
        A[j, j-1] = h[j]
        A[j, j] = 2 * (h[j] + h[j+1])
        A[j, j+1] = h[j+1]
    Ainv = inv(A)
    B = Ainv.dot(q)
    b = n * [0]
    b[1:n-1] = B
    # now we find a, b, c and d
    a = [0] * (n - 1)
    c = [0] * (n - 1)
    d = [0] * (n - 1)
    s = [0] * (n - 1)
    for r in range(n - 1):
        a[r] = 1 / (3 * h[r]) * (b[r+1] - b[r])
        c[r] = f[r] - h[r] * ((2 * b[r] + b[r+1]) / 3)
        d[r] = yd[r]
    # solution 1 start
    for m in range(n - 1):
        if xd[m] <= x <= xd[m+1]:
            s = a[m] * (x - xd[m])**3 + b[m] * (x - xd[m])**2 + c[m] * (x - xd[m]) + d[m]
    return s
    # solution 1 end
I want to find the minimum on the domain of my xdata, so fmin didn't work, as you cannot define bounds there. I tried both fmin_slsqp and minimize. They are not compatible with the phi function I wrote, so I rewrote phi(x, xd, yd) with an extra variable, making it phi(x, xd, yd, m). Here m indicates in which piece of the spline we are calculating a solution (from x_m to x_(m+1)). In the code, #solution 1 is replaced by the following:
    # solution 2 start
    return a[m] * (x - xd[m])**3 + b[m] * (x - xd[m])**2 + c[m] * (x - xd[m]) + d[m]
    # solution 2 end
To find the minimum in the domain x_m to x_(m+1) we use the following code (here m=0, so x runs from 0.25 to 0.5, and the initial guess is 0.3):
fmin_slsqp(phi, x0 = 0.3, bounds=([(0.25,0.5)]), args=(xdata, ydata, 0))
What I would then do (I know it's crude), is iterate this with a for loop to find the minimum on all subdomains and then take the overall minimum. However, the function fmin_slsqp constantly returns the initial guess as the minimum. So there is something wrong, which I do not know how to fix. If you could help me this would be greatly appreciated. Thanks for reading this far.
When I plot your function phi and the data you feed in, I see that its range is of the order of 1e-12. However, fmin_slsqp is unable to handle that level of precision and fails to find any change in your objective.
The solution I propose is scaling the return value of your objective by the same order of magnitude, like so:
    return s * 1e12
Then you get good results.
>>> sol = fmin_slsqp(phi, x0=0.3, bounds=([(0.25, 0.5)]), args=(xdata, ydata))
>>> print(sol)
Optimization terminated successfully. (Exit mode 0)
Current function value: 1.0
Iterations: 2
Function evaluations: 6
Gradient evaluations: 2
[ 0.25]
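As an aside, here is a sketch of an alternative (my suggestion, not part of the original answer) that avoids hand-rolling the spline: scipy.interpolate.CubicSpline with natural boundary conditions plus a bounded scalar minimizer. This finds a local minimum; for the global one you can still minimize per interval and take the smallest, as in the question:

import numpy as np
from scipy.interpolate import CubicSpline
from scipy.optimize import minimize_scalar

xdata = np.linspace(0.25, 2, 8)
ydata = 1e-12 * np.array([1, 2, 1, 2, 3, 1, 1, 2])

spline = CubicSpline(xdata, ydata, bc_type='natural')  # natural cubic spline
res = minimize_scalar(spline, bounds=(xdata[0], xdata[-1]), method='bounded')
print(res.x, res.fun)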

Getting spline equation from UnivariateSpline object

I'm using UnivariateSpline to construct piecewise polynomials for some data that I have. I would then like to use these splines in other programs (either in C or FORTRAN) and so I would like to understand the equation behind the generated spline.
Here is my code:
import numpy as np
import scipy as sp
from scipy.interpolate import UnivariateSpline
import matplotlib.pyplot as plt
import bisect

data = np.loadtxt('test_C12H26.dat')
Tmid = 800.0
print("Tmid", Tmid)
nmid = bisect.bisect(data[:,0], Tmid)

fig = plt.figure()
plt.plot(data[:,0], data[:,7], ls='', marker='o', markevery=20)

npts = len(data[:,0])
#print("npts", npts)
w = np.ones(npts)
w[0] = 100
w[nmid] = 100
w[npts-1] = 100

spline1 = UnivariateSpline(data[:nmid,0], data[:nmid,7], s=1, w=w[:nmid])
coeffs = spline1.get_coeffs()
print(coeffs)
print(spline1.get_knots())
print(spline1.get_residual())
print(coeffs[0] + coeffs[1] * (data[0,0] - data[0,0])
      + coeffs[2] * (data[0,0] - data[0,0])**2
      + coeffs[3] * (data[0,0] - data[0,0])**3,
      data[0,7])
print(coeffs[0] + coeffs[1] * (data[nmid,0] - data[0,0])
      + coeffs[2] * (data[nmid,0] - data[0,0])**2
      + coeffs[3] * (data[nmid,0] - data[0,0])**3,
      data[nmid,7])
print(Tmid, data[-1,0])

spline2 = UnivariateSpline(data[nmid-1:,0], data[nmid-1:,7], s=1, w=w[nmid-1:])
print(spline2.get_coeffs())
print(spline2.get_knots())
print(spline2.get_residual())

plt.plot(data[:,0], spline1(data[:,0]))
plt.plot(data[:,0], spline2(data[:,0]))
plt.savefig('test.png')
The resulting plot (not reproduced here) suggests I have valid splines for each interval, but it looks like my spline equation is not correct... I can't find any reference to what it is supposed to be in the scipy documentation. Does anybody know? Thanks!
The scipy documentation does not have anything to say about how one can take the coefficients and manually generate the spline curve. However, it is possible to figure out how to do this from the existing literature on B-splines. The following function bspleval shows how to construct the B-spline basis functions (the matrix B in the code), from which one can easily generate the spline curve by multiplying the coefficients with the highest-order basis functions and summing:
import numpy as np
import matplotlib.pyplot as plt

def bspleval(x, knots, coeffs, order, debug=False):
    '''
    Evaluate a B-spline at a set of points.

    Parameters
    ----------
    x : ndarray
        The set of points at which to evaluate the spline.
    knots : list or ndarray
        The set of knots used to define the spline.
    coeffs : list or ndarray
        The set of spline coefficients.
    order : int
        The order of the spline.

    Returns
    -------
    y : ndarray
        The value of the spline at each point in x.
    '''
    k = order
    t = knots
    m = len(t)
    npts = len(x)
    B = np.zeros((m-1, k+1, npts))

    if debug:
        print('k=%i, m=%i, npts=%i' % (k, m, npts))
        print('t=', t)
        print('coeffs=', coeffs)

    ## Create the zero-order B-spline basis functions.
    for i in range(m-1):
        B[i, 0, :] = np.logical_and(x >= t[i], x < t[i+1]).astype(np.float64)
    if k == 0:
        B[m-2, 0, -1] = 1.0

    ## Next iteratively define the higher-order basis functions, working from lower order to higher.
    for j in range(1, k+1):
        for i in range(m-j-1):
            if t[i+j] - t[i] == 0.0:
                first_term = 0.0
            else:
                first_term = ((x - t[i]) / (t[i+j] - t[i])) * B[i, j-1, :]
            if t[i+j+1] - t[i+1] == 0.0:
                second_term = 0.0
            else:
                second_term = ((t[i+j+1] - x) / (t[i+j+1] - t[i+1])) * B[i+1, j-1, :]
            B[i, j, :] = first_term + second_term
        B[m-j-2, j, -1] = 1.0

    if debug:
        plt.figure()
        for i in range(m-1):
            plt.plot(x, B[i, k, :])
        plt.title('B-spline basis functions')

    ## Evaluate the spline by multiplying the coefficients with the highest-order basis functions.
    y = np.zeros(npts)
    for i in range(m-k-1):
        y += coeffs[i] * B[i, k, :]

    if debug:
        plt.figure()
        plt.plot(x, y)
        plt.title('spline curve')
        plt.show()

    return y
To give an example of how this can be used with Scipy's existing univariate spline functions, the following is an example script. This takes the input data and uses Scipy's functional and also its object-oriented approach to spline fitting. Taking the coefficients and knot points from either of the two and using these as inputs to our manually-calculated routine bspleval, we reproduce the same curve that they do. Note that the difference between the manually evaluated curve and Scipy's evaluation method is so small that it is almost certainly floating-point noise.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import splrep, splev, UnivariateSpline

x = np.array([-273.0, -176.4, -79.8, 16.9, 113.5, 210.1, 306.8, 403.4, 500.0])
y = np.array([2.25927498e-53, 2.56028619e-03, 8.64512988e-01, 6.27456769e+00, 1.73894734e+01,
              3.29052124e+01, 5.14612316e+01, 7.20531200e+01, 9.40718450e+01])
x_nodes = np.array([-273.0, -263.5, -234.8, -187.1, -120.3, -34.4, 70.6, 194.6, 337.8, 500.0])
y_nodes = np.array([2.25927498e-53, 3.83520726e-46, 8.46685318e-11, 6.10568083e-04, 1.82380809e-01,
                    2.66344008e+00, 1.18164677e+01, 3.01811501e+01, 5.78812583e+01, 9.40718450e+01])

## Now get scipy's spline fit.
k = 3
tck = splrep(x_nodes, y_nodes, k=k, s=0)
knots = tck[0]
coeffs = tck[1]
print('knot points=', knots)
print('coefficients=', coeffs)

## Now try scipy's object-oriented version. The result is exactly the same as "tck": the knots are the
## same and the coeffs are the same, they are just queried in a different way.
uspline = UnivariateSpline(x_nodes, y_nodes, s=0)
uspline_knots = uspline.get_knots()
uspline_coeffs = uspline.get_coeffs()

## Here are scipy's native spline evaluation methods. Again, "ytck" and "y_uspline" are exactly equal.
ytck = splev(x, tck)
y_uspline = uspline(x)
y_knots = uspline(knots)

## Now let's try our manually-calculated evaluation function.
y_eval = bspleval(x, knots, coeffs, k, debug=False)

plt.plot(x, ytck, label='tck')
plt.plot(x, y_uspline, label='uspline')
plt.plot(x, y_eval, label='manual')

## Next plot the knots and nodes.
plt.plot(x_nodes, y_nodes, 'ko', markersize=7, label='input nodes')    ## nodes
plt.plot(knots, y_knots, 'mo', markersize=5, label='tck knots')        ## knots
plt.xlim((-300.0, 530.0))
plt.legend(loc='best', prop={'size': 14})

plt.figure()
plt.title('difference')
plt.plot(x, ytck - y_uspline, label='tck-uspl')
plt.plot(x, ytck - y_eval, label='tck-manual')
plt.legend(loc='best', prop={'size': 14})
plt.show()
The coefficients given by get_coeffs are B-spline (basis spline) coefficients, described in the Wikipedia article on B-splines.
Probably whatever other program/language you will be using has an implementation. Supply the knot locations and coefficients, and you should be all set.
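As a quick sanity check within SciPy itself (a sketch I've added, not from the original answer), the knots and coefficients can be fed back into scipy.interpolate.BSpline and compared against the data:

import numpy as np
from scipy.interpolate import splrep, BSpline

x = np.linspace(0, 10, 50)
y = np.sin(x)
tck = splrep(x, y, k=3, s=0)    # tck = (knots, coefficients, degree)
spl = BSpline(*tck)             # rebuild the spline from its raw pieces
assert np.allclose(spl(x), y, atol=1e-8)  # s=0 interpolates the data exactly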

Plane fitting to 4 (or more) XYZ points

I have 4 points which lie very nearly in one plane; they are part of the 1,4-dihydropyridine cycle.
I need to calculate the distance from C3 and N1 to the plane defined by C1-C2-C4-C5.
Calculating the distance is OK, but fitting the plane is quite difficult for me.
(The original post included two figure views of the 1,4-DHP cycle.)
import numpy as np

# coordinates (XYZ) of C1, C2, C4 and C5
x = [0.274791784, -1.001679346, -1.851320839, 0.365840754]
y = [-1.155674199, -1.215133985, 0.053119249, 1.162878076]
z = [1.216239624, 0.764265677, 0.956099579, 1.198231236]

# plane equation Ax + By + Cz + D = 0
# non-fitted plane
abcd = [0.506645455682, -0.185724560275, -1.43998120646, 1.37626378129]

# creating distance variable
distance = np.zeros(4, float)

# calculating distance from point to plane
for i in range(4):
    distance[i] = (x[i]*abcd[0] + y[i]*abcd[1] + z[i]*abcd[2] + abcd[3]) / np.sqrt(abcd[0]**2 + abcd[1]**2 + abcd[2]**2)
print(distance)

# calculating squares
squares = distance**2
print(squares)
How can I minimize sum(squares)? I have tried least squares, but it is too hard for me.
That sounds about right, but you should replace the nonlinear optimization with an SVD. The following creates the moment of inertia tensor, M, and then SVD's it to get the normal to the plane. This should be a close approximation to the least-squares fit and be much faster and more predictable. It returns the point-cloud center and the normal.
import numpy as np
from numpy.linalg import svd

def planeFit(points):
    """
    p, n = planeFit(points)

    Given an array, points, of shape (d, ...) representing points in
    d-dimensional space, fit a hyperplane to the points.
    Return a point, p, on the plane (the point-cloud centroid),
    and the normal, n.
    """
    points = np.reshape(points, (np.shape(points)[0], -1))  # Collapse trailing dimensions
    assert points.shape[0] <= points.shape[1], \
        "There are only {} points in {} dimensions.".format(points.shape[1], points.shape[0])
    ctr = points.mean(axis=1)
    x = points - ctr[:, np.newaxis]
    M = np.dot(x, x.T)  # Could also use np.cov(x) here.
    return ctr, svd(M)[0][:, -1]
For example: Construct a 2D cloud at (10, 100) that is thin in the x direction and 100 times bigger in the y direction:
>>> pts = np.diag((.1, 10)).dot(np.random.randn(2, 1000)) + np.reshape((10, 100), (2, -1))
The fit plane is very nearly at (10, 100) with a normal very nearly along the x axis.
>>> planeFit(pts)
(array([ 10.00382471, 99.48404676]),
array([ 9.99999881e-01, 4.88824145e-04]))
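To connect this back to the original question (a sketch I've added, with a placeholder point since the C3/N1 coordinates are not given in the question): because the returned normal n is unit-length, the distance from any extra point q to the fitted plane is |(q - ctr) . n|:

import numpy as np

# The four ring atoms from the question as a (3, 4) array (dimensions x points).
pts = np.array([
    [0.274791784, -1.001679346, -1.851320839, 0.365840754],
    [-1.155674199, -1.215133985, 0.053119249, 1.162878076],
    [1.216239624, 0.764265677, 0.956099579, 1.198231236]])

ctr, n = planeFit(pts)
q = np.array([0.0, 0.0, 1.0])   # placeholder coordinates for C3 or N1
d = abs(np.dot(q - ctr, n))     # orthogonal distance from q to the fitted plane
print(d)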
Least squares should fit a plane easily. The equation for a plane is: ax + by + c = z. So set up matrices like this with all your data:
        | x_0  y_0  1 |
    A = | x_1  y_1  1 |
        |     ...     |
        | x_n  y_n  1 |

and

        | a |
    x = | b |
        | c |

and

        | z_0 |
    B = | z_1 |
        | ... |
        | z_n |

In other words: Ax = B. Now solve for x, which contains your coefficients. But since you have more than 3 points, the system is over-determined, so you need to use the left pseudo-inverse. So the answer is:

    | a |
    | b | = (A^T A)^-1 A^T B
    | c |
And here is some simple Python code with an example:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np

N_POINTS = 10
TARGET_X_SLOPE = 2
TARGET_y_SLOPE = 3
TARGET_OFFSET = 5
EXTENTS = 5
NOISE = 5

# create random data
xs = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
ys = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
zs = []
for i in range(N_POINTS):
    zs.append(xs[i]*TARGET_X_SLOPE +
              ys[i]*TARGET_y_SLOPE +
              TARGET_OFFSET + np.random.normal(scale=NOISE))

# plot raw data
plt.figure()
ax = plt.subplot(111, projection='3d')
ax.scatter(xs, ys, zs, color='b')

# do fit
tmp_A = []
tmp_b = []
for i in range(len(xs)):
    tmp_A.append([xs[i], ys[i], 1])
    tmp_b.append(zs[i])
b = np.matrix(tmp_b).T
A = np.matrix(tmp_A)
fit = (A.T * A).I * A.T * b
errors = b - A * fit
residual = np.linalg.norm(errors)

print("solution: %f x + %f y + %f = z" % (fit[0], fit[1], fit[2]))
print("errors:")
print(errors)
print("residual: {}".format(residual))

# plot plane
xlim = ax.get_xlim()
ylim = ax.get_ylim()
X, Y = np.meshgrid(np.arange(xlim[0], xlim[1]),
                   np.arange(ylim[0], ylim[1]))
Z = np.zeros(X.shape)
for r in range(X.shape[0]):
    for c in range(X.shape[1]):
        Z[r, c] = fit[0] * X[r, c] + fit[1] * Y[r, c] + fit[2]
ax.plot_wireframe(X, Y, Z, color='k')

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
The solution for your points:
0.143509 x + 0.057196 y + 1.129595 = z
The fact that you are fitting to a plane is only slightly relevant here. What you are trying to do is minimize a particular function starting from a guess. For that use scipy.optimize. Note that there is no guarantee that this is the globally optimal solution, only a locally optimal one. A different initial condition may converge to a different result; this works well if you start close to the local minimum you are seeking.
I've taken the liberty to clean up your code by taking advantage of numpy's broadcasting:
import numpy as np
from scipy.optimize import leastsq

# coordinates (XYZ) of C1, C2, C4 and C5
XYZ = np.array([
    [0.274791784, -1.001679346, -1.851320839, 0.365840754],
    [-1.155674199, -1.215133985, 0.053119249, 1.162878076],
    [1.216239624, 0.764265677, 0.956099579, 1.198231236]])

# Initial guess of the plane
p0 = [0.506645455682, -0.185724560275, -1.43998120646, 1.37626378129]

def f_min(X, p):
    plane_xyz = p[0:3]
    distance = (plane_xyz * X.T).sum(axis=1) + p[3]
    return distance / np.linalg.norm(plane_xyz)

def residuals(params, signal, X):
    return f_min(X, params)

sol = leastsq(residuals, p0, args=(None, XYZ))[0]
print("Solution: ", sol)
print("Old Error: ", (f_min(XYZ, p0)**2).sum())
print("New Error: ", (f_min(XYZ, sol)**2).sum())
This gives:
Solution: [ 14.74286241 5.84070802 -101.4155017 114.6745077 ]
Old Error: 0.441513295404
New Error: 0.0453564286112
This returns the 3D plane coefficients along with the RMSE of the fit.
The plane is provided in a homogeneous coordinate representation, meaning its dot product with the homogeneous coordinates of a point produces the distance between the two.
import numpy as np

def fit_plane(points):
    assert points.shape[1] == 3
    centroid = points.mean(axis=0)
    x = points - centroid[None, :]
    U, S, Vt = np.linalg.svd(x.T @ x)
    normal = U[:, -1]
    origin_distance = normal @ centroid
    rmse = np.sqrt(S[-1] / len(points))
    return np.hstack([normal, -origin_distance]), rmse
Minor note: the SVD can also be directly applied to the points instead of the outer product matrix, but I found it to be slower with NumPy's SVD implementation.
U, S, Vt = np.linalg.svd(x.T, full_matrices=False)
rmse = S[-1] / np.sqrt(len(points))
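A quick usage sketch (my addition) on the four ring-atom coordinates from the question, using the homogeneous-coordinate property stated above to recover per-point signed distances:

import numpy as np

points = np.array([
    [0.274791784, -1.155674199, 1.216239624],
    [-1.001679346, -1.215133985, 0.764265677],
    [-1.851320839, 0.053119249, 0.956099579],
    [0.365840754, 1.162878076, 1.198231236]])  # C1, C2, C4, C5 as rows

plane, rmse = fit_plane(points)
homogeneous = np.hstack([points, np.ones((len(points), 1))])
distances = homogeneous @ plane     # signed point-plane distances
print(distances, rmse)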
Another way besides SVD to quickly reach a solution while dealing with outliers (when you have a large data set) is RANSAC:
import numpy as np

def fit_plane(voxels, iterations=50, inlier_thresh=10):  # voxels: x, y, z (m x 3)
    inliers, planes = [], []
    xy1 = np.concatenate([voxels[:, :-1], np.ones((voxels.shape[0], 1))], axis=1)
    z = voxels[:, -1].reshape(-1, 1)
    for _ in range(iterations):
        random_pts = voxels[np.random.choice(voxels.shape[0], voxels.shape[1] * 10, replace=False), :]
        plane_transformation, residual = fit_pts_to_plane(random_pts)
        # abs() added: count points within the threshold on either side of the plane
        inliers.append((np.abs(z - np.matmul(xy1, plane_transformation)) <= inlier_thresh).sum())
        planes.append(plane_transformation)
    return planes[np.array(inliers).argmax()]

def fit_pts_to_plane(voxels):  # x y z (m x 3)
    # https://math.stackexchange.com/questions/99299/best-fitting-plane-given-a-set-of-points
    xy1 = np.concatenate([voxels[:, :-1], np.ones((voxels.shape[0], 1))], axis=1)
    z = voxels[:, -1].reshape(-1, 1)
    fit = np.matmul(np.matmul(np.linalg.inv(np.matmul(xy1.T, xy1)), xy1.T), z)
    errors = z - np.matmul(xy1, fit)
    residual = np.linalg.norm(errors)
    return fit, residual
Here's one way. If your points are P[1]..P[n], compute their mean M and subtract it from each, getting centered points p[1]..p[n]. Then compute C = Sum{ p[i]*p[i]' } (the "covariance" matrix of the points). Next diagonalise C, that is, find an orthogonal U and a diagonal E so that C = U*E*U'. If your points are indeed on a plane then one of the eigenvalues (i.e. the diagonal entries of E) will be very small (with perfect arithmetic it would be 0). In any case, if the j-th of these is the smallest, then let N = (A, B, C) be the j-th column of U and compute D = -M'*N. These parameters define the "best" plane, the one such that the sum of the squares of the distances from the P[] to the plane is least.
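A direct transcription of this recipe (a sketch I've added, using numpy.linalg.eigh on the scatter matrix; the names follow the paragraph above):

import numpy as np

def best_plane(P):
    """P: (n, d) array of points. Returns (N, D) with N . x + D ~ 0 on the plane."""
    M = P.mean(axis=0)            # mean of the points
    p = P - M                     # centered points
    C = p.T @ p                   # "covariance" matrix, C = Sum{ p[i]*p[i]' }
    E, U = np.linalg.eigh(C)      # eigh returns eigenvalues in ascending order
    N = U[:, 0]                   # eigenvector of the smallest eigenvalue = plane normal
    D = -M @ N
    return N, D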
