I want to fit a plane to a set of points (x, y, z) in Python. I found various answers on how to perform the fit when the error is measured along the z-axis, but I want to minimise the errors in the orthogonal direction. I found the following question (Best fit plane by minimizing orthogonal distances), which addresses the same problem, but it's not clear to me how to implement this in Python (likely with NumPy/SciPy). Further details regarding the mathematical derivation can also be found here: http://www.ncorr.com/download/publications/eberlyleastsquares.pdf (section 2).
The first link you gave does describe the algorithm for orthogonal distance fitting, but rather tersely. Here, in case it helps, is a more prolix description:
I suppose you have points (in your case 3d, but the dimension makes no odds to the algorithm) P[i], i=1..N
You want to find a (hyper-)plane that is of minimal orthogonal distance from your points.
A hyper-plane can be described by a unit vector n and a scalar d. The set of points on the plane is
{ P | n.P + d = 0 }
and the (signed, orthogonal) distance of a point P from the plane is
n.P + d
So we want to find n and d to minimise
Q(n,d) = Sum{ i | (n.P[i]+d)*(n.P[i]+d) } /N
(The division by N isn't essential, and makes no difference to the values of n and d that are found, but to my mind makes the algebra neater)
The first thing to notice is that if we knew n, the d that minimises Q will be
d = -n.Pbar where
Pbar = Sum{ i | P[i]}/N, the mean of the P[]
We may as well use this value of d, so that, after a little algebra the problem reduces to minimising Q^:
Q^(n) = Sum{ i | (n.P[i]-n.Pbar)*(n.P[i]-n.Pbar) } /N
= n' * C * n
where
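C = Sum{ i | (P[i]-Pbar)*(P[i]-Pbar)' } /N
Here (P[i]-Pbar)*(P[i]-Pbar)' is an outer product, so C is a square matrix. Since n is a unit vector, the form of Q^ tells us that the value of n to minimise Q^ will be an eigenvector of C corresponding to a minimal eigenvalue.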
So (sorry I can't give code but my python is contemptible):
a/ compute
Pbar = Sum{ i | P[i]}/N, the mean of the points
b/ compute
C = Sum{ i | (P[i]-Pbar)*(P[i]-Pbar)' } /N, the covariance matrix of the points
c/ diagonalise C, and pick out a minimal eigenvalue and the corresponding eigenvector n
d/ compute
d = -Pbar.n
Then n, d define the hyperplane you want.
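Steps a/ to d/ can be sketched in NumPy as follows. This is only a minimal illustration, not the answerer's code: the function name fit_plane_orthogonal and the test plane z = 2x + 3y + 1 are made up for the example, and np.linalg.eigh is used on the assumption that C is symmetric (it is, by construction).

```python
import numpy as np

def fit_plane_orthogonal(points):
    """Return (n, d) such that n.P + d = 0 is the orthogonal least-squares plane."""
    P = np.asarray(points, dtype=float)
    pbar = P.mean(axis=0)                  # step a: mean of the points
    centered = P - pbar
    C = centered.T @ centered / len(P)     # step b: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)   # step c: eigh sorts eigenvalues ascending
    n = eigvecs[:, 0]                      # eigenvector of the smallest eigenvalue
    d = -pbar @ n                          # step d
    return n, d

# Hypothetical check: noise-free points on the plane z = 2x + 3y + 1
rng = np.random.default_rng(0)
xy = rng.random((50, 2))
pts = np.column_stack([xy, 2 * xy[:, 0] + 3 * xy[:, 1] + 1])
n, d = fit_plane_orthogonal(pts)
residuals = pts @ n + d   # signed orthogonal distances, all close to zero here
```

The sign of n is arbitrary (an eigenvector can be flipped), so n and d may come out negated relative to what you expect; the plane is the same.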
I've also had to deal with this situation and at first the mathematical notation can be overwhelming, but in the end the solution is fairly simple.
Once you get the intuition that the vector (A,B,C) that defines the best fitting plane Ax+By+Cz+D=0 is the one that explains the minimum variance of your set of coordinates, then the solution is straightforward.
First thing to do is center your coordinates (this way D will be 0 in your plane equation)
coords -= coords.mean(axis=0)
Then you have 2 options to get the vector you are interested in: (1) use the PCA implementation from sklearn or scipy to get the vector that explains minimal variance
from sklearn.decomposition import PCA
pca = PCA(n_components=3)
pca.fit(coords)
# The last component/vector is the one with minimal variance, see PCA documentation
normal_vector = pca.components_[-1]
(2) re-implement the procedure described in the Geometric Tool reference you've linked.
import numpy as np
from numba import njit

@njit
def get_best_fitting_plane_vector(coords):
    # Calculate the covariance matrix of the coordinates
    covariance_matrix = np.cov(coords, rowvar=False)  # Variables = columns
    # Calculate the eigenvalues & eigenvectors of the covariance matrix
    e_val, e_vect = np.linalg.eig(covariance_matrix)
    # The normal vector to the plane is the eigenvector associated to the minimum eigenvalue
    min_eval = np.argmin(e_val)
    normal_vector = e_vect[:, min_eval]
    return normal_vector
In terms of speed, the re-implemented procedure is faster than using PCA, and can be a lot faster if you use numba (just decorate the function with @njit).
Based on your second reference:
Say you have n samples (x,y,z)
I'll call the 3 terms M*A=V, and define the column arrays
X=[ x_0, x_1 .. x_{n-1} ]'
Y=[ y_0, y_1 .. y_{n-1} ]'
Z=[ z_0, z_1 .. z_{n-1} ]'
Define the (n by 3) matrix XY1=[X,Y,1n], where 1n is a column of n ones:
XY1 = [[x_0, y_0, 1],
       [x_1, y_1, 1],
       ...
       [x_{n-1}, y_{n-1}, 1]]
The matrix M can be obtained as
M = XY1' * XY1
Where apostrophe (') is the transposition operator and (*) the matrix product.
And the array V is
V = XY1'*Z
The least-squares solution can be obtained through the Moore-Penrose pseudoinverse: [(M'*M)^-1 * M']
~A = [(M'*M)^-1 * M'] * V
Sample code:
import numpy as np
# Input your values
A = 3
B = 2
C = 1
# Make random data: n (x, y) tuples, with a column of ones appended
n = 30  # samples (must be defined before the array is allocated)
xy1 = np.ones([n, 3])
xy1[:, :2] = np.random.rand(n, 2)
# plane: A*x + B*y + C = z, the z coord is calculated from random x, y
z = xy1.dot(np.array([[A, B, C]]).transpose())
# add noise
xy1[:, :2] += np.random.normal(scale=0.05, size=[n, 2])
z += np.random.normal(scale=0.05, size=[n, 1])
# calculate M and V
M = xy1.transpose().dot(xy1)
V = xy1.transpose().dot(z)
# pseudoinverse:
Mp = np.linalg.inv(M.transpose().dot(M)).dot(M.transpose())
# Least-squares solution
ABC = Mp.dot(V)
Output
In [24]: ABC
Out[24]:
array([[3.11395111],
[2.02909874],
[1.01340411]])
I'm trying to find the distance between a fitted hyperplane and five points. Most of the responses I've read use SVM, but I'm not trying to do a classification problem. I know there are probably multiple ways to do this in Python, but I'm a little stumped.
As an example here are my points:
[[ 163.3828172 169.65537306 144.69201418]
[-212.50951396 -167.06555958 56.69388025]
[-164.65129832 -163.42420063 -149.97008725]
[ 41.8704004 52.2538316 14.0683657 ]
[-128.38386078 -102.76840542 -303.4960438 ]]
To find the equation of a fitted plane I use SVD to compute the coefficients a, b, c, d of ax + by + cz + d = 0.
import numpy as np

def fit_plane(points):
    assert points.shape[1] == 3
    centroid = points.mean(axis=0)
    x = points - centroid[None, :]
    U, S, Vt = np.linalg.svd(x.T @ x)
    # normal vector of the best-fitting plane is the left
    # singular vector corresponding to the least singular value
    normal = U[:, -1]
    # calculate the distance from the origin
    origin_distance = normal @ centroid
    return np.hstack([normal, -origin_distance])

fit_plane(X)
Giving the equation:
-0.67449074x + 0.73767288y -0.03001614z -10.75632119 = 0
Now how do I calculate the distance between the points and the hyperplane? The answer I've seen used in conjunction with SVMs is d = |w^Tx +b|/||w||, but I don't know how to go from the equation I have already.
You can find the distance between a plane π and a point P by dropping a perpendicular N from P to π and finding the point A where N and π intersect. The distance you are looking for is the distance between A and P.
This video explains the math of finding A (although it is about finding the reflection, finding A is part of it).
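For a plane already written as ax + by + cz + d = 0, the SVM formula from the question applies directly, with w = (a, b, c) and the constant term d playing the role of b. A minimal sketch follows; the function name point_plane_distances and the example plane are hypothetical, chosen only so the result is easy to check by hand.

```python
import numpy as np

def point_plane_distances(points, coeffs):
    """Orthogonal distances from points to the plane a*x + b*y + c*z + d = 0."""
    points = np.asarray(points, dtype=float)
    w = np.asarray(coeffs[:3], dtype=float)   # normal vector (a, b, c)
    d = float(coeffs[3])                      # constant term
    return np.abs(points @ w + d) / np.linalg.norm(w)

# Hypothetical check: the plane z = 2, written as 0x + 0y + 2z - 4 = 0
example = point_plane_distances([[0, 0, 5], [1, 1, 2]], [0, 0, 2, -4])
# example is [3.0, 0.0]: the first point is 3 above the plane, the second lies on it
```

With the four coefficients returned by a fit such as the one in the question, the call would be point_plane_distances(X, fit_plane(X)). If the normal is already of unit length, the division by the norm changes nothing.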
First of all, I know that these threads exist! So bear with me, my question is not fully answered by them.
As an example assume we are in a 4-dimensional vector space, i.e R^4. We are looking at the two linear equations:
3*x1 - 2* x2 + 7*x3 - 2*x4 = 6
1*x1 + 3* x2 - 2*x3 + 5*x4 = -2
The actual question is: Is there a way to generate a number N of points that solve both of these equations, making use of the linear solvers from NumPy etc.?
The main problem with all Python libraries I have tried so far is that they need n equations for an n-dimensional space.
Solving the problem is very easy for one equation, since you can simply use n-1 randomly generated values and adapt the last one such that the vector solves the equation.
My expected result would be a list of N "randomly" generated points that solve k linear equations in an n-dimensional space, where k<n.
A system of linear equations with more variables than equations is known as an underdetermined system.
An underdetermined linear system has either no solution or infinitely many solutions.
...
There are algorithms to decide whether an underdetermined system has solutions, and if it has any, to express all solutions as linear functions of k of the variables (same k as above). The simplest one is Gaussian elimination.
As you say, many functions available in libraries (e.g. np.linalg.solve) require a square matrix (i.e. n equations for n unknowns), what you are looking for is an implementation of Gaussian elimination for non square linear systems.
This isn't 'random', but np.linalg.lstsq (least squares) will solve systems with non-square matrices:
Return the least-squares solution to a linear matrix equation.
Solves the equation a x = b by computing a vector x that minimizes the Euclidean 2-norm || b - a x ||^2. The equation may be under-, well-, or over- determined (i.e., the number of linearly independent rows of a can be less than, equal to, or greater than its number of linearly independent columns). If a is square and of full rank, then x (but for round-off error) is the “exact” solution of the equation.
For more info, see:
solving Ax =b for a non-square matrix A using python
Since you have an underdetermined system of equations (too few constraints for your solutions, or fewer equations than variables) you can just pick some arbitrary values for x3 and x4 and solve the system in x1, x2 (this has 2 variables/2 equations).
You will just need to check that the resulting system is consistent (i.e. that it admits a solution) and that there are no duplicate solutions.
You could for instance fix x3=0 and choosing random values of x4 generate solutions for your equations in x1, x2
Here's an example generating 10 "random" solutions
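import numpy as np
# coefficients of x1, x2 in the two equations (x3, x4 are moved to the right-hand side)
a = np.array([[3, -2], [1, 3]])
n = 10
x3 = 0
X = []
for x4 in np.random.choice(1000, n):
    b = np.array([[6-7*x3+2*x4],[-2+2*x3-5*x4]])
    x = np.linalg.solve(a, b)
    X.append(np.append(x,[x3,x4]))
# check solution nr. 3
[x1, x2, x3, x4] = X[3]
3*x1 - 2* x2 + 7*x3 - 2*x4
# output: 6.0 (up to floating-point rounding)
1*x1 + 3* x2 - 2*x3 + 5*x4
# output: -2.0 (up to floating-point rounding)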
Thanks for the answers, which both helped me and pointed me in the right direction.
I now have an easy step-by-step solution to my problem for arbitrary k<n.
1. Find one solution to all equations given. This can be done by using
solution_vec = numpy.linalg.lstsq(A, b, rcond=None)[0]
this gives a solution as seen in ukemi's answer (lstsq returns a tuple; its first element is the solution vector). In my example above, the matrix A holds the coefficients on the left-hand side of the equations, and b is the vector of right-hand sides.
2. Determine the null space of your matrix A.
These are all vectors v such that the scalar product v*A_i = 0 for every(!) row A_i of A. The following function, adapted from the one found in this thread, can be used to get representatives of the null space of A:
import numpy as np
import scipy.linalg

def nullSpaceOfMatrix(A, eps=1e-15):
    # Full SVD: the rows of vh beyond the numerical rank of A span its null space
    u, s, vh = scipy.linalg.svd(A)
    rank = int(np.sum(s > eps))
    return vh[rank:].T
3. Generate as many (N) "random" linear combinations (meaning with random coefficients) of solution_vec and resulting vectors of the nullspace of the matrix as you want! This works because the scalar product is additive and nullspace vectors have a scalar product of 0 to the vectors of the equations. Those linear combinations always must contain solution_vec, as in:
linear_combination = solution_vec + a*nullspacevec_1 + b*nullspacevec_2...
where a and b can be randomly chosen.
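The three steps can be put together as in the sketch below. It is one possible reading of the recipe, not the poster's actual code: the function name random_solutions, the relative rank tolerance 1e-12 and the normally distributed coefficients are assumptions, and the system is assumed to be consistent.

```python
import numpy as np

def random_solutions(A, b, num_points, seed=None):
    """Return num_points random solutions of the underdetermined system A x = b."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: one particular solution (the minimum-norm one)
    particular = np.linalg.lstsq(A, b, rcond=None)[0]
    # Step 2: null space from the full SVD; rows of vh past the rank span it
    _, s, vh = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-12 * s.max()))
    null_basis = vh[rank:]
    # Step 3: particular solution plus random combinations of null-space vectors
    coeffs = rng.normal(size=(num_points, len(null_basis)))
    return particular + coeffs @ null_basis

# The two equations from the question, in R^4
A = [[3, -2, 7, -2], [1, 3, -2, 5]]
b = [6, -2]
points = random_solutions(A, b, num_points=5, seed=0)
```

Each row of points is one solution; multiplying it by the rows of A reproduces b up to rounding.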
Let's look at m points in n-d space- (A solution for 4 points in 3-d space is here: minimize distance from sets of points)
a= (x1, y1, z1, ..)
b= (x2, y2 ,z2, ..)
c= (x3, y3, z3, ..)
.
.
p= (x , y , z, ..)
Find point q = c1* a + c2* b + c3* c + ..
where c1 + c2 + c3 + .. = 1
and c1, c2, c3, .. >= 0
s.t.
euclidean distance pq is minimized.
What algorithms can be used ? Idea or pseudocode is enough.
(Optimizing performance is a big issue here. Monte Carlo method with all vertices and changing coefficients would also give a solution.)
We can assume p = 0 by subtracting p from all the other points. Then the question is one of minimizing the norm over a convex hull of a finite set of points, i.e., a polytope.
There are a few papers on this problem. It looks like "A recursive algorithm for finding the minimum norm point in a polytope and a pair of closest points in two polytopes" by Kazuyuki Sekitani and Yoshitsugu Yamamoto is a good one, with a short survey of prior solutions to the problem. It is behind a paywall but if you have access to a university library you may be able to download a copy.
The algorithm they give is fairly simple, once you get past the notation. P is the finite set of points. C(P) is its convex hull. Nr(C(P)) is the unique point of minimum norm, which is what you want to find.
Step 0: Choose a point x_0 from the convex hull C(P) of your finite set of points P. They recommend choosing x_0 to be the point in P with minimum norm. Let k=1.
Now loop:
Step 1: Let a_k = min {x^t_{k-1} p | p is in P}. Here x^t_{k-1} is the transpose of x_{k-1} (so the function being minimized is just a dot product as p ranges over your finite set P). If |x_{k-1}|^2 <= a_k, then the answer is x_{k-1}, stop.
Step 2: P_k = {p | p in P and x^t_{k-1} p = a_k}. P_k is the subset of P that minimizes the expression in Step 1. Call the algorithm recursively on this set P_k, and let the result be y_k = Nr(C(P_k)).
Step 3: b_k = min{y^t_k p | p in P\P_k}, the minimum of the dot product of y_k with points in the complement set P\P_k. If |y_k|^2 <= b_k then y_k is the answer, stop.
Step 4: s_k = max{s| [(1-s)x_{k-1} + sy_k]^t y_k <= [(1-s)x_{k-1} + sy_k]^t p for every p in P\P_k}. Let x_k = (1-s_k) x_{k-1} + s_k y_k, let k=k+1, and go back to Step 1.
There is an explicit formula for s_k in Step 4:
s_k = min{ [x^t_{k-1} (p-y_k)]/[(y_k-x_{k-1})^t (y_k-p)] | p in P\P_k and (y_k - x_{k-1})^t (y_k-p) > 0 }
There is a proof in the paper that s_k has the necessary properties, that the algorithm terminates after a finite number of operations, and that the result is indeed optimal.
Note that you should add some tolerance into your comparisons, otherwise rounding errors may cause the algorithm to fail. There is a lot of discussion about numerical stability, see the paper for details.
They do not give a complete analysis of the computational complexity of the algorithm, but they do prove it is at most O(m^2) in the two-dimensional case (m is the number of points in P), and they have done numerical experiments which give the impression that it is sublinear in time as a function of m, with dimension fixed. I'm skeptical of that claim. In the absence of a detailed analysis, I suggest you try some experiments with typical data to see how well the algorithm performs for you.
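If you only want something to experiment with before implementing the recursive algorithm, here is a much simpler alternative. Note that this is NOT the Sekitani-Yamamoto algorithm described above: it is a plain Frank-Wolfe (Gilbert-style) iteration with exact line search, which can converge slowly on hard inputs. The function name closest_point_in_hull, the tolerance and the iteration cap are assumptions made for this sketch.

```python
import numpy as np

def closest_point_in_hull(points, p, max_iter=10000, tol=1e-10):
    """Approximate the point q of conv(points) closest to p, plus its convex weights."""
    p = np.asarray(p, dtype=float)
    V = np.asarray(points, dtype=float) - p       # shift so that p is the origin
    weights = np.zeros(len(V))
    start = int(np.argmin(np.einsum("ij,ij->i", V, V)))
    weights[start] = 1.0                          # start at the vertex of minimum norm
    x = V[start].copy()
    for _ in range(max_iter):
        j = int(np.argmin(V @ x))                 # vertex minimising the dot product with x
        direction = V[j] - x
        gap = -(x @ direction)                    # equals |x|^2 - x.V[j], always >= 0
        if gap <= tol:                            # optimality test, with a tolerance
            break
        step = min(1.0, gap / (direction @ direction))  # exact line search on [0, 1]
        weights *= 1.0 - step
        weights[j] += step
        x = x + step * direction
    return x + p, weights

# Triangle (1,0), (0,1), (1,1); the closest point to the origin is (0.5, 0.5)
q, w = closest_point_in_hull([[1, 0], [0, 1], [1, 1]], [0, 0])
```

The stopping test is the same inequality as in Step 1 above (|x|^2 <= min x^t p, up to a tolerance). The returned weights are the coefficients c1, c2, ... from the question.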
Stated a simpler way, you have a set of points {a}i, and you are considering all points which are some weighted average thereof. This set of points is exactly the convex hull of those points; it's a polytope (polygon, polyhedron, etc.) that just happens to be convex, where the corners are a subset of the {a}i points.
You are just asking which point on a polytope(~hedron) is closest to a point. (your query point p)
The closest point must be on the boundary of the polytope (unless p itself lies inside it, in which case q = p). One algorithm would be to brute-force search all (N-1)-dimensional faces. Do this in the usual way you would find the closest point on a line or surface or N-dimensional surface to a query point.
(If the points are not all linearly independent, you will have multiple ways (multiple weight vectors) which can give you the same weighted-average point q. You can worry about reconstructing the answer q from the basis vectors after you find it geometrically.)
I've seen several posts on this subject, but I need a pure Python (no Numpy or any other imports) solution that accepts a list of points (x,y,z coordinates) and calculates a normal for the plane that is closest to those points.
I'm following one of the working Numpy examples from here: Fit points to a plane algorithms, how to iterpret results?
import numpy as np

def fitPLaneLTSQ(XYZ):
    # Fits a plane to a point cloud,
    # Where Z = aX + bY + c ----Eqn #1
    # Rearranging Eqn1: aX + bY -Z +c =0
    # Gives normal (a,b,-1)
    # Normal = (a,b,-1)
    [rows,cols] = XYZ.shape
    G = np.ones((rows,3))
    G[:,0] = XYZ[:,0] #X
    G[:,1] = XYZ[:,1] #Y
    Z = XYZ[:,2]
    (a,b,c),resid,rank,s = np.linalg.lstsq(G,Z)
    normal = (a,b,-1)
    nn = np.linalg.norm(normal)
    normal = normal / nn
    return normal
XYZ = np.array([
[0,0,1],
[0,1,2],
[0,2,3],
[1,0,1],
[1,1,2],
[1,2,3],
[2,0,1],
[2,1,2],
[2,2,3]
])
print(fitPLaneLTSQ(XYZ))
[ -8.10792259e-17 7.07106781e-01 -7.07106781e-01]
I'm trying to adapt this code: Basic ordinary least squares calculation to replace np.linalg.lstsq
Here is what I have so far without using Numpy using the same coords as above:
import math
xvals = [0,0,0,1,1,1,2,2,2]
yvals = [0,1,2,0,1,2,0,1,2]
zvals = [1,2,3,1,2,3,1,2,3]
""" Basic ordinary least squares calculation. """
sumx, sumy = map(sum, [xvals, yvals])
sumxy = sum(map(lambda x, y: x*y, xvals, yvals))
sumxsq = sum(map(lambda x: x**2, xvals))
Nsamp = len(xvals)
# y = a*x + b
# a (slope)
slope = (Nsamp*sumxy - sumx*sumy) / ((Nsamp*sumxsq - sumx**2))
# b (intercept)
intercept = (sumy - slope*sumx) / (Nsamp)
a = slope
b = intercept
normal = (a,b,-1)
mag = lambda x : math.sqrt(sum(i**2 for i in x))
nn = mag(normal)
normal = [i/nn for i in normal]
print(normal)
[0.0, 0.7071067811865475, -0.7071067811865475]
As you can see, the answers come out the same, but that is only because of this particular example. In other examples, they don't match. If you look closely you'll see that in the Numpy example the 'z' values are fed into 'np.linalg.lstsq', but in the non-Numpy version the 'z' values are ignored. How do I work in the 'z' values to the least-squares code?
Thanks
I do not think you can get away without implementing some basic matrix operations. As this is a multivariate linear regression problem, you will definitely need dot product, transpose and norm. These are easy. The difficult part is that you also need matrix inverse or QR decomposition or something similar. People usually use BLAS for these for good reasons, implementing them is not easy - but not impossible either.
With QR decomposition
I would start by creating a Matrix class that has the following methods
dot(m1, m2) (or __matmul__(m1, m2) if you have python 3.5): it is just the sum of products, should be straightforward
transpose(self): swapping matrix elements, should be easy
norm(self): square root of sum of squares (should be only used on vectors)
qr_decomp(self): this one is tricky. For an almost pure python implementation see this rosetta code solution (disclaimer: I have not thoroughly checked this code). It uses some numpy functions, but these are basic functions you can implement for your matrix class (shape, eye, dot, copysign, norm).
leastsqr_ut(R, A): solve the equation Rx = A if R is an upper triangular matrix. Not trivial, but should be easy enough as you can solve it equation by equation from the bottom.
With these, the solution is easy:
Generate the matrix G as detailed in your numpy example
Find the QR decomposition of G
Solve Rb = Q'z for b using that R is an upper triangular matrix
Then the normal vector you are looking for is (b[0], b[1], -1) (or that vector divided by its norm if you want a unit-length normal vector).
With matrix inverse
The inverse of a 3x3 matrix is relatively easy to calculate, but this method is much less numerically stable than doing QR decomposition. If it is not an important concern, then you can do the following: implement
dot(m1, m2) (or __matmul__(m1, m2) if you have python 3.5): it is just the sum of products, should be straightforward
transpose(self): swapping matrix elements, should be easy
norm(self): square root of sum of squares (should be only used on vectors)
det(self): determinant, but it is enough if it works on 2x2 and 3x3 matrices, and for those simple formulas are available
inv(self): matrix inverse. It is enough if it works on 3x3 matrices, there is a simple formula for example here
Then the formula for b is b = inv(G'G) * (G'z) and your normal vector is again (b[0], b[1], -1).
As you can see, none of these are simple, and most of it is replicating some numpy functionality while making it a lot slower. So make sure you have absolutely no other choice.
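For the matrix-inverse route, the 3x3 case is small enough to write out by hand. The sketch below is an illustration rather than a tested library: it builds the normal equations (G'G) b = G'z directly from sums and solves them with Cramer's rule instead of forming an explicit inverse. The names det3 and fit_plane_normal and the degeneracy threshold are assumptions, and the numerical-stability caveat above still applies.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane_normal(points):
    """Fit z = a*x + b*y + c by least squares; return the unit vector along (a, b, -1)."""
    # Accumulate G'G and G'z, where each row of G is (x, y, 1)
    sxx = sxy = sx = syy = sy = sxz = syz = sz = 0.0
    for x, y, z in points:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        sxz += x * z
        syz += y * z
        sz += z
    gtg = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    gtz = [sxz, syz, sz]
    det = det3(gtg)
    if abs(det) < 1e-12:
        raise ValueError("degenerate input: the x, y values do not determine a plane")
    # Cramer's rule: replace one column at a time by the right-hand side
    beta = []
    for col in range(3):
        m = [row[:] for row in gtg]
        for r in range(3):
            m[r][col] = gtz[r]
        beta.append(det3(m) / det)
    a, b, _c = beta
    norm = math.sqrt(a * a + b * b + 1.0)
    return [a / norm, b / norm, -1.0 / norm]

# Same coordinates as in the question
points = [(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 1), (1, 1, 2),
          (1, 2, 3), (2, 0, 1), (2, 1, 2), (2, 2, 3)]
normal = fit_plane_normal(points)   # approximately [0.0, 0.7071, -0.7071]
```

Unlike the attempt in the question, the z values enter here through the right-hand side G'z (the sums sxz, syz and sz).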
I wrote some code with a similar purpose (see the "tangentplane_3D" function in the linked code).
In my case I had a scattered cloud of points that defines a 3D ellipsoid. For each point I wanted to determine the plane tangent to the ellipsoid at that point --> Goal: Determination of a 3D plane.
The problem can be seen in the following way: A plane is defined by its normal, and the normal can be seen as the eigenvector associated with the smallest eigenvalue of the covariance matrix of a set of n points.
What I did, and you can check it in the code I posted, is to select the k points closest to the point of interest at which I wanted to calculate the tangent plane. Then, I performed a Singular Value Decomposition on these k points. Finally, from this SVD I selected the smallest singular value and its associated singular vector, which is, in fact, the normal of the plane best fitting my set of points, and thus, in my case, the normal of the plane tangent to the ellipsoid. With the normal vector and the point you can subsequently calculate the complete plane equation.
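The idea can be sketched as below. This is not the linked "tangentplane_3D" function, which is not reproduced here; the name local_plane_normal, the brute-force neighbour search and the unit-sphere test cloud are assumptions made for illustration.

```python
import numpy as np

def local_plane_normal(cloud, query, k=10):
    """Estimate the normal of the plane fitted to the k points of cloud nearest to query."""
    cloud = np.asarray(cloud, dtype=float)
    dist2 = np.sum((cloud - query) ** 2, axis=1)    # squared distances to the query
    nearest = cloud[np.argsort(dist2)[:k]]          # k nearest neighbours (brute force)
    centered = nearest - nearest.mean(axis=0)
    # Right singular vector of the smallest singular value = direction of least spread
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1]

# Hypothetical test cloud: random points on the unit sphere, where the true
# normal at a point is parallel to the point itself
rng = np.random.default_rng(0)
cloud = rng.normal(size=(2000, 3))
cloud /= np.linalg.norm(cloud, axis=1, keepdims=True)
normal = local_plane_normal(cloud, cloud[0], k=15)
alignment = abs(normal @ cloud[0])   # close to 1 when the estimate is good
```

The plane through the query point Q with this normal is then normal.(P - Q) = 0. For large clouds a spatial index (for example a k-d tree) would replace the brute-force search.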
I hope it helps!!
Best wishes.
I have used numpy's polyfit and obtained a very good fit (using a 7th order polynomial) for two arrays, x and y. My relationship is thus;
y(x) = p[0]* x^7 + p[1]*x^6 + p[2]*x^5 + p[3]*x^4 + p[4]*x^3 + p[5]*x^2 + p[6]*x^1 + p[7]
where p is the polynomial array output by polyfit.
Is there a way to reverse this method easily, so I have a solution in the form of,
x(y) = p[0]*y^n + p[1]*y^n-1 + .... + p[n]*y^0
No, there is no easy way in general. Closed-form solutions in radicals do not exist for general polynomials of degree five or higher (the Abel-Ruffini theorem), so a seventh-order polynomial cannot in general be inverted symbolically.
Doing the fit in the reverse direction is possible, but only on monotonically varying regions of the original polynomial. If the original polynomial has minima or maxima on the domain you are interested in, then even though y is a function of x, x cannot be a function of y because there is no 1-to-1 relation between them.
If you are (i) OK with redoing the fitting procedure, and (ii) OK with working piecewise on single monotonic regions of your fit at a time, then you could do something like this:
import numpy as np
# generate a random coefficient vector a
degree = 1
a = 2 * np.random.random(degree+1) - 1
# an assumed true polynomial y(x)
def y_of_x(x, coeff_vector):
    """
    Evaluate a polynomial with coeff_vector and degree len(coeff_vector)-1 using Horner's method.
    Coefficients are ordered by increasing degree, from the constant term at coeff_vector[0],
    to the linear term at coeff_vector[1], to the n-th degree term at coeff_vector[n]
    """
    coeff_rev = coeff_vector[::-1]
    b = 0
    for c in coeff_rev:
        b = b * x + c
    return b
# generate some data
my_x = np.arange(-1, 1, 0.01)
my_y = y_of_x(my_x, a)
# verify that polyfit in the "traditional" direction gives the correct result
# [::-1] b/c polyfit returns coeffs in backwards order rel. to y_of_x()
p_test = np.polyfit(my_x, my_y, deg=degree)[::-1]
print(p_test, a)
# fit the data using polyfit but with y as the independent var, x as the dependent var
p = np.polyfit(my_y, my_x, deg=degree)[::-1]
# define x as a function of y
def x_of_y(yy, a):
    return y_of_x(yy, a)
# compare results
import matplotlib.pyplot as plt
# the next line is an IPython/Jupyter magic; omit it in a plain script
%matplotlib inline
plt.plot(my_x, my_y, '-b', x_of_y(my_y, p), my_y, '-r')
Note: this code does not check for monotonicity but simply assumes it.
By playing around with the value of degree, you should see that the code only works well for all random values of a when degree=1. It occasionally does OK for other degrees, but not when there are lots of minima / maxima. It never does perfectly for degree > 1 because approximating parabolas with square-root functions doesn't always work, etc.
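If a polynomial expression for x(y) is not strictly required, another option is to invert pointwise: for a given y, solve p(x) - y = 0 numerically and keep the real roots inside the interval of interest. This is a different technique from the refit above, sketched under the assumption that the coefficients are ordered highest degree first (as np.polyfit returns them); the function name invert_poly and the imaginary-part tolerance are made up for the example.

```python
import numpy as np

def invert_poly(p, y, x_min, x_max):
    """Return the real solutions x of polyval(p, x) = y that lie in [x_min, x_max]."""
    q = np.array(p, dtype=float)
    q[-1] -= y                                   # p(x) - y = 0
    roots = np.roots(q)
    real = roots[np.abs(roots.imag) < 1e-9].real # discard complex roots
    return np.sort(real[(real >= x_min) & (real <= x_max)])

# y = x^3 + x is monotonic, so y = 2 has the single real solution x = 1
solutions = invert_poly([1, 0, 1, 0], 2.0, -5.0, 5.0)
```

On a monotonic region the result has exactly one entry; several entries mean that the polynomial is not one-to-one on the chosen interval.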