integrating 2D samples on a rectangular grid using SciPy - python

SciPy has three methods for doing 1D integrals over samples (trapz, simps, and romb) and one way to do a 2D integral over a function (dblquad), but it doesn't seem to have methods for doing a 2D integral over samples -- even ones on a rectangular grid.
The closest thing I see is scipy.interpolate.RectBivariateSpline.integral -- you can create a RectBivariateSpline from data on a rectangular grid and then integrate it. However, that isn't terribly fast.
I want something more accurate than the rectangle method (i.e. just summing everything up). I could, say, use a 2D Simpson's rule by making an array with the correct weights, multiplying that by the array I want to integrate, and then summing up the result.
However, I don't want to reinvent the wheel if there's already something better out there. Is there?

Use the 1D rule twice. (In recent SciPy releases simps has been replaced by scipy.integrate.simpson.)
>>> from scipy.integrate import simps
>>> import numpy as np
>>> x = np.linspace(0, 1, 20)
>>> y = np.linspace(0, 1, 30)
>>> z = np.cos(x[:,None])**4 + np.sin(y)**2
>>> simps(simps(z, y), x)
0.85134099743259539
>>> import sympy
>>> xx, yy = sympy.symbols('x y')
>>> sympy.integrate(sympy.cos(xx)**4 + sympy.sin(yy)**2, (xx, 0, 1), (yy, 0, 1)).evalf()
0.851349922021627
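The explicit-weights idea from the question and the nested 1D rule are the same computation. Here is a minimal sketch using only NumPy, assuming equally spaced samples and an odd number of points along each axis. The helper name simpson_weights is made up for illustration, and the grid uses 21 and 31 points instead of 20 and 30 so that the classical composite rule applies:

```python
import numpy as np

def simpson_weights(n, h):
    """Composite Simpson weights for n equally spaced points (n odd, n >= 3)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("need an odd number of points >= 3")
    w = np.ones(n)
    w[1:-1:2] = 4.0   # odd interior indices
    w[2:-1:2] = 2.0   # even interior indices
    return w * h / 3.0

x = np.linspace(0, 1, 21)
y = np.linspace(0, 1, 31)
z = np.cos(x[:, None])**4 + np.sin(y)**2   # z[i, j] = f(x[i], y[j])

wx = simpson_weights(x.size, x[1] - x[0])
wy = simpson_weights(y.size, y[1] - y[0])

# Weighted sum over both axes; equivalent to np.sum(np.outer(wx, wy) * z)
integral = wx @ z @ wy
print(integral)
```

The printed value should be close to the sympy result above.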

If you are dealing with a true two-dimensional integral over a rectangle, you would have something like this:
>>> import numpy as np
>>> from scipy.integrate import simps
>>> x_min,x_max,n_points_x = (0,1,50)
>>> y_min,y_max,n_points_y = (0,5,50)
>>> x = np.linspace(x_min,x_max,n_points_x)
>>> y = np.linspace(y_min,y_max,n_points_y)
>>> def F(x,y):
...     return x**4 * y
# We reshape to use broadcasting
>>> zz = F(x.reshape(-1,1),y.reshape(1,-1))
>>> zz.shape
(50, 50)
# zz[i, j] = F(x[i], y[j]): integrate each row over y, then the results over x
>>> simps([simps(zz_row, y) for zz_row in zz], x)
2.50005233
You can compare this with the exact result, which is (1/5) * (25/2) = 2.5.

The trapezoid rule (trapz) can be applied in 2D in the following way. Picture the samples as a rectangular grid of points.
The integral over the whole grid is equal to the sum of the integrals over small areas dS. The trapezoid rule approximates the integral over a small rectangle dS as the area dS multiplied by the average of the function values in the corners of dS, which are the grid points:
∫ f(x,y) dx dy over dS ≈ dS * (f1 + f2 + f3 + f4)/4
where f1, f2, f3, f4 are the array values in the corners of the rectangle dS.
Observe that each internal grid point enters the formula for the whole integral four times as it is common for four rectangles. Each point on the side that is not in the corner, enters twice as it is common for two rectangles, and each corner point enters only once. Therefore, the integral is calculated in numpy via the following function:
def double_Integral(xmin, xmax, ymin, ymax, nx, ny, A):
    dS = ((xmax-xmin)/(nx-1)) * ((ymax-ymin)/(ny-1))
    A_Internal = A[1:-1, 1:-1]
    # sides: up, down, left, right
    (A_u, A_d, A_l, A_r) = (A[0, 1:-1], A[-1, 1:-1], A[1:-1, 0], A[1:-1, -1])
    # corners
    (A_ul, A_ur, A_dl, A_dr) = (A[0, 0], A[0, -1], A[-1, 0], A[-1, -1])
    return dS * (np.sum(A_Internal)
                 + 0.5 * (np.sum(A_u) + np.sum(A_d) + np.sum(A_l) + np.sum(A_r))
                 + 0.25 * (A_ul + A_ur + A_dl + A_dr))
Testing it on the function given by David GG:
x_min,x_max,n_points_x = (0,1,50)
y_min,y_max,n_points_y = (0,5,50)
x = np.linspace(x_min,x_max,n_points_x)
y = np.linspace(y_min,y_max,n_points_y)
def F(x,y):
    return x**4 * y
zz = F(x.reshape(-1,1),y.reshape(1,-1))
print(double_Integral(x_min, x_max, y_min, y_max, n_points_x, n_points_y, zz))
2.5017353157550444
Other methods (Simpson, Romberg, etc) can be derived similarly.
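As a cross-check, applying a 1D trapezoid rule along one axis and then along the other should reproduce the number printed above. Here is a small sketch using only NumPy, assuming equally spaced samples (the helper trapezoid_along is hypothetical, not part of any library):

```python
import numpy as np

def trapezoid_along(values, spacing, axis):
    """Composite trapezoid rule along one axis for equally spaced samples."""
    values = np.moveaxis(np.asarray(values, dtype=float), axis, -1)
    return spacing * (values[..., 1:] + values[..., :-1]).sum(axis=-1) / 2.0

x = np.linspace(0, 1, 50)
y = np.linspace(0, 5, 50)
zz = x[:, None] ** 4 * y[None, :]          # zz[i, j] = F(x[i], y[j])

over_y = trapezoid_along(zz, y[1] - y[0], axis=1)      # one value per x[i]
nested = trapezoid_along(over_y, x[1] - x[0], axis=0)  # scalar
print(nested)
```

Both routes weight interior points by 1, edge points by 1/2 and corner points by 1/4, so they agree up to rounding.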

Related

Evaluate function in points inside half sphere and plot slides in Python

I am trying to evaluate a function that depends on the radius from the center of a sphere to any point inside half a sphere.
I start by defining three arrays corresponding to the points along the radius, the elevation and azimuthal angles. In a for loop I compute the x, y and z coordinates to evaluate the function.
I am not sure if I am doing the mapping properly. I need to store the values of the evaluated function in a 3D matrix corresponding to the x, y, and z coordinates to plot slices in a postprocessing step, but I am stuck identifying how I can define the size of my function matrix.
In Cartesian coordinates this is really easy, since one can link every coordinate with a dimension of the matrix. That's why I need some guidance on how I can slice the matrix, since I don't have a 3D matrix with the Cartesian coordinates. How can I construct this matrix from the spherical coordinates?
Any help will be more than appreciated!
Here is my (unfruitful) attempt:
import numpy as np
beta = 1
rho = np.linspace(0, 1, 20)
phi = np.linspace(0, 2*np.pi, 20)
theta = np.linspace(0, np.pi/2, 10)
f = np.empty([len(theta), len(theta), len(phi)], dtype=complex)
for i in range(len(rho)):
    for j in range(len(phi)):
        for k in range(len(theta)):
            x = rho[i] * np.sin(theta[k]) * np.cos(phi[j])
            y = rho[i] * np.sin(theta[k]) * np.sin(phi[j])
            z = rho[i] * np.cos(theta[k])
            R = np.sqrt(x**2 + y**2 + z**2)
            f[k, i, j] = -1j*((z/R)/(z/R + beta)) * (np.exp(1j*k*R)/R)
You just have a typo: the second dimension is again len(theta) instead of len(rho). It should be
f = np.empty([len(theta), len(rho), len(phi)], dtype=complex)
Note also that, if I am not mistaken, you don't need R at all, it's just rho[i].
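For what it's worth, the three nested loops can be replaced by broadcasting. The sketch below only builds the coordinate arrays and checks the remark about R. It assumes the same (theta, rho, phi) axis order as f[k, i, j] in the question:

```python
import numpy as np

rho = np.linspace(0, 1, 20)
phi = np.linspace(0, 2 * np.pi, 20)
theta = np.linspace(0, np.pi / 2, 10)

# Axis order (theta, rho, phi) matches f[k, i, j] in the loop version.
T, R, P = np.meshgrid(theta, rho, phi, indexing="ij")

x = R * np.sin(T) * np.cos(P)
y = R * np.sin(T) * np.sin(P)
z = R * np.cos(T)

print(x.shape)                       # (10, 20, 20)
radius = np.sqrt(x**2 + y**2 + z**2)
print(np.allclose(radius, R))        # True: the radius is just rho
```

Any function of x, y and z evaluated on these arrays then has the same (len(theta), len(rho), len(phi)) shape.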

Python function to find the numeric volume integral?

Goal
I would like to compute the 3D volume integral of a numeric scalar field.
Code
For this post, I will use an example whose integral can be computed exactly. I have therefore chosen the following function (shown as an image in the original post; according to the answers below it is f(x, y, z) = x*y*z):
In Python, I define the function, and a set of points in 3D, and then generate the discrete values at these points:
import numpy as np
# Make data.
def function(x, y, z):
    return x**y**z
N = 5
grid = np.meshgrid(
    np.linspace(0, 1, N),
    np.linspace(0, 1, N),
    np.linspace(0, 1, N)
)
points = np.vstack(list(map(np.ravel, grid))).T
x = points[:, 0]
y = points[:, 1]
z = points[:, 2]
values = [function(points[i, 0], points[i, 1], points[i, 2])
          for i in range(len(points))]
Question
How can I find the integral, if I don't know the underlying function, i.e. if I only have the coordinates (x, y, z) and the values?
A nice way to go about this would be using scipy's tplquad integration. However, to use that, we need a function and not a cloud point.
An easy way around that is to use an interpolator, to get a function approximating our cloud point - we can for example use scipy's RegularGridInterpolator if the data is on a regular grid:
import numpy as np
from scipy import integrate
from scipy.interpolate import RegularGridInterpolator
# Make data.
def function(x,y,z):
    return x*y*z
N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1
x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)
values = function(*np.meshgrid(x,y,z, indexing='ij'))
# Interpolate:
function_interpolated = RegularGridInterpolator((x, y, z), values)
# tplquad calls func(z, y, x); the interpolator expects points as (x, y, z)
f = lambda z, y, x: function_interpolated([x, y, z]).item()
result, error = integrate.tplquad(f, xmin, xmax, lambda _: ymin, lambda _:ymax,lambda *_: zmin, lambda *_: zmax)
In the example above, we get result = 0.12499999999999999 - close enough!
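If the samples really do sit on a regular grid, another option is to skip the interpolator and apply a 1D rule along each axis of the value array. Below is a minimal sketch with a hand-written trapezoid rule, NumPy only. The function name integrate_on_grid is invented for this example:

```python
import numpy as np

def integrate_on_grid(values, coords):
    """Trapezoid rule applied along each axis in turn, last axis first."""
    result = np.asarray(values, dtype=float)
    for axis_coords in reversed(coords):
        widths = np.diff(axis_coords)
        result = ((result[..., 1:] + result[..., :-1]) * widths).sum(axis=-1) / 2.0
    return float(result)

N = 5
x = np.linspace(0, 1, N)
y = np.linspace(0, 1, N)
z = np.linspace(0, 1, N)
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
values = X * Y * Z                    # stand-in for the measured samples

total = integrate_on_grid(values, (x, y, z))
print(total)
```

For x*y*z the trapezoid rule is exact up to rounding, because the integrand is linear in each variable. For other data the error shrinks as the grid is refined.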
The easiest way to achieve what you are looking for is probably scipy's integration function. Here your example:
from scipy import integrate
# Make data.
def func(x,y,z):
    return x**y**z
ranges = [[0,1], [0,1], [0,1]]
result, error = integrate.nquad(func, ranges)
Are you aware that the function you created is different from the one you show in the image? The one you created is an exponential (x^y^z), while the one you are showing is just a product. If you want to represent the function in the image, use
def func(x,y,z):
    return x*y*z
Hope this answers your question, otherwise just write a comment!
Edit:
Misread your post. If you only have the results, and they are not regularly spaced, you would have to figure out some form of interpolation (e.g. linear) and a lookup table. If you do not know how to create that, let me know. The rest of the answer can still be used if you define func to return interpolated values from your original data.
The first answer explains nicely the principal approach to handle this. Just wanted to illustrate an alternative way by showing the power of sklearn package and machine learning regression.
Doing the meshgrid in 3D gives a very large numpy array,
import numpy as np
N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1
x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)
grid = np.array(np.meshgrid(x,y,z, indexing='ij'))
grid.shape  # (3, 5, 5, 5), i.e. 3*5*5*5 = 375 numbers
With 375 numbers this is not very intuitive to inspect, and the layout depends on the indexing convention ('ij' or 'xy'). Using regression we can get the same result with few input points (15-20).
# building random combinations from (x,y,z)
X = np.random.choice(x, 20)[:,None]
Y = np.random.choice(y, 20)[:,None]
Z = np.random.choice(z, 20)[:,None]
xyz = np.concatenate((X,Y,Z), axis = 1)
data = np.multiply.reduce(xyz, axis = 1)
So the input (grid) is just a 2D numpy array,
xyz.shape
(20, 3)
With the corresponding data,
data.shape
(20,)
Now the regression function and integration,
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from scipy import integrate
pipe=Pipeline([('polynomial',PolynomialFeatures(degree=3)),('modal',LinearRegression())])
pipe.fit(xyz, data)
def func(x,y,z):
    # predict returns a length-1 array; nquad expects a scalar
    return pipe.predict([[x, y, z]])[0]
ranges = [[0,1], [0,1], [0,1]]
result, error = integrate.nquad(func, ranges)
print(result)
0.1257
This approach is useful with limited number of points.
Based on your requirements, it sounds like the most appropriate technique would be Monte Carlo integration:
import numpy as np
# Step 0 start with some empirical data
observed_points = np.random.uniform(0,1,size=(10000,3))
unknown_fn = lambda x: np.prod(x) # just used to generate fake values
observed_values = np.apply_along_axis(unknown_fn, 1, observed_points)
K = 1000000
# Step 1 - assume that f(x,y,z) can be approximated by an interpolation
# of the data we have (you could get really fancy with the
# selection of interpolation method - we'll stick with straight lines here)
from scipy.interpolate import LinearNDInterpolator
f_interpolate = LinearNDInterpolator(observed_points, observed_values)
# Step 2 randomly sample from within convex hull of observed data
# Step 2a - Uniformly sample from bounding 3D-box of data
lower_bounds = observed_points.min(axis=0)
upper_bounds = observed_points.max(axis=0)
sampled_points = np.random.uniform(lower_bounds, upper_bounds,size=(K, 3))
# Step 2b - Reject points outside of convex hull...
# Luckily, we get a np.nan from LinearNDInterpolator in this case
sampled_values = f_interpolate(sampled_points)
rejected_idxs = np.argwhere(np.isnan(sampled_values))
# Step 2c - Remember accepted values of estimated f(x_i, y_i, z_i)
final_sampled_values = np.delete(sampled_values, rejected_idxs, axis=0)
# Step 3 - Calculate estimate of volume of observed data domain
# Since we sampled uniformly from the convex hull of data domain,
# each point was selected with P(x,y,z)= 1 / Volume of convex hull
from scipy.spatial import ConvexHull
volume = ConvexHull(observed_points).volume
# Step 4 - Multiply estimated volume of domain by average sampled value
I_hat = volume * final_sampled_values.mean()
print(I_hat)
For a derivation of why this works see this: https://cs.dartmouth.edu/wjarosz/publications/dissertation/appendixA.pdf
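To see the "volume times average value" estimator in isolation, here is a stripped-down sketch on the unit cube with a known integrand. The integrand x*y*z, the sample count and the seed are just assumptions for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
lower = np.array([0.0, 0.0, 0.0])
upper = np.array([1.0, 1.0, 1.0])

# Uniform samples inside the box and the integrand evaluated at each one
samples = rng.uniform(lower, upper, size=(200_000, 3))
values = samples.prod(axis=1)          # stand-in integrand f(x, y, z) = x*y*z

volume = np.prod(upper - lower)
estimate = volume * values.mean()
std_error = volume * values.std(ddof=1) / np.sqrt(len(values))
print(estimate, "+/-", std_error)
```

The exact integral here is 1/8, and the standard error falls off like 1/sqrt(K) regardless of dimension.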

How to convert a cartesian problem in a cylindrical problem?

I display a gyroid structure (TPMS) in a Cartesian system using PyVista. I am now trying to display the structure in cylindrical coordinates. PyVista does display something cylindrical, but the unit cell length does not seem to be uniform, although there is no reason for it to change since my parameter "a" is constant. The change seems to appear especially along z, but I don't understand why (see image).
I have this:
Here is a part of my code.
Thank you for your help.
import pyvista as pv
import numpy as np
from numpy import cos, sin, pi
from random import uniform
lattice_par = 1.0 # Unit cell length
a = (2*pi)/lattice_par
res = 200j
r, theta, z = np.mgrid[0:2:res, 0:2*pi:res, 0:4:res]
# consider using non-equidistant r for uniformity
def GyroidCyl(r, theta, z, b=0.8):
    return (sin(a*(r*cos(theta) - 1))*cos(a*(r*sin(theta) - 1))
            + sin(a*(r*sin(theta) - 1))*cos(a*(z - 1))
            + sin(a*(z - 1))*cos(a*(r*cos(theta) - 1))
            - b)
vol3 = GyroidCyl(r, theta, z)
# compute Cartesian coordinates for grid points
x = r * cos(theta)
y = r * sin(theta)
grid = pv.StructuredGrid(x, y, z)
grid["vol3"] = vol3.flatten()
contours3 = grid.contour([0]) # Isosurface = 0
pv.set_plot_theme('document')
p = pv.Plotter()
p.add_mesh(contours3, scalars=contours3.points[:, 2], show_scalar_bar=False, interpolate_before_map=True,
           show_edges=False, smooth_shading=False, render=True)
p.show_axes_all()
p.add_floor()
p.show_grid()
p.add_title('Gyroid in cylindrical coordinates')
p.add_text('Volume Fraction Parameter = ' + str(b))
p.show(window_size=[2040, 1500])
So you've noted in comments that you're trying to replicate something like the strategy explained in this paper. What they do is take a regular gyroid unit cell, and then transform it to build a cylindrical shell. If igloos were cylindrical, then a gyroid cell would be a single piece of snow brick. Put them next to one another and stack them in a column, and you get a cylinder.
Since I can't use figures from the paper we'll have to recreate some ourselves. So you have to start from a regular gyroid defined by the implicit function
cos(x) sin(y) + cos(y) sin(z) + cos(z) sin(x) = 0
(or some variation thereof). Here's how a single unit cell looks:
import pyvista as pv
import numpy as np
res = 100j
a = 2*np.pi
x, y, z = np.mgrid[0:a:res, 0:a:res, 0:a:res]
def Gyroid(x, y, z):
    return np.cos(x)*np.sin(y) + np.cos(y)*np.sin(z) + np.cos(z)*np.sin(x)
# compute implicit function
fun_values = Gyroid(x, y, z)
# create grid for contouring
grid = pv.StructuredGrid(x, y, z)
grid["vol3"] = fun_values.ravel('F')
contours3 = grid.contour([0]) # isosurface for 0
# plot the contour, i.e. the gyroid
pv.set_plot_theme('document')
plotter = pv.Plotter()
plotter.add_mesh(contours3, scalars=contours3.points[:, -1],
                 show_scalar_bar=False)
plotter.add_bounding_box()
plotter.enable_terrain_style()
plotter.show_axes()
plotter.show()
Using the "unit cell" term implies there's an underlying infinite lattice, which can be built by stacking these (rectangular) unit cells neatly next to one another. With some imagination we can convince ourselves that this is true. Or we can look at the formula and note that due to the trigonometric functions the function is periodic in x, y and z, with period 2*pi. This also tells us that we can transform the unit cell to have arbitrary rectangular dimensions by introducing lattice parameters a, b and c:
cos(kx x) sin(ky y) + cos(ky y) sin(kz z) + cos(kz z) sin(kx x) = 0, where
kx = 2 pi/a
ky = 2 pi/b
kz = 2 pi/c
(These kx, ky and kz quantities are called wave vectors in solid state physics.)
The corresponding change only affects the header:
res = 100j
a, b, c = lattice_params = 1, 2, 3
kx, ky, kz = [2*np.pi/lattice_param for lattice_param in lattice_params]
x, y, z = np.mgrid[0:a:res, 0:b:res, 0:c:res]
def Gyroid(x, y, z):
    return (  np.cos(kx*x)*np.sin(ky*y)
            + np.cos(ky*y)*np.sin(kz*z)
            + np.cos(kz*z)*np.sin(kx*x))
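The periodicity argument can be checked numerically. The sketch below is NumPy only, with arbitrary lattice parameters and random test points chosen purely for illustration. It compares the function values at some points with the values at the same points shifted by whole multiples of a, b and c:

```python
import numpy as np

a, b, c = 1.0, 2.0, 3.0
kx, ky, kz = (2 * np.pi / p for p in (a, b, c))

def gyroid(x, y, z):
    return (np.cos(kx * x) * np.sin(ky * y)
            + np.cos(ky * y) * np.sin(kz * z)
            + np.cos(kz * z) * np.sin(kx * x))

rng = np.random.default_rng(0)
x, y, z = rng.uniform(-5, 5, size=(3, 1000))

# Shift by integer multiples of the lattice parameters along each axis
periodic = np.allclose(gyroid(x, y, z), gyroid(x + a, y - 2 * b, z + 3 * c))
print(periodic)  # True
```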
This is where we start. What we have to do is take this unit cell, bend it so that it corresponds to a 30-degree circular arc on a cylinder, and stack the cylinder using this unit. According to the paper, they used 12 unit cells to create a circle in a plane (hence the 30-degree magic number), and stacked three such circular bands on top of each other to build the cylinder.
The actual mapping is also fairly clearly explained in the paper. Whereas your original x, y and z parameters of the function essentially interpolated between [0, a], [0, b] and [0, c], respectively, in the new setup x interpolates in the radius range [r1, r2], y interpolates in the angular range [0, pi/6] and z is just z. (In the paper x and y seem to be reversed with respect to this convention, but this shouldn't matter. If it matters, that's left as an exercise to the reader.)
So what we need to do is more or less keep the current grid points, but transform the corresponding x, y and z grid points so that they lie on a cylinder instead. Here's one take:
import pyvista as pv
import numpy as np
res = 100j
a, b, c = lattice_params = 1, 1, 1
kx, ky, kz = [2*np.pi/lattice_param for lattice_param in lattice_params]
r_aux, phi, z = np.mgrid[0:a:res, 0:b:res, 0:3*c:res]
# convert r_aux range to actual radii
r1, r2 = 1.5, 2
r = r2/a*r_aux + r1/a*(1 - r_aux)
def Gyroid(x, y, z):
    return (  np.cos(kx*x)*np.sin(ky*y)
            + np.cos(ky*y)*np.sin(kz*z)
            + np.cos(kz*z)*np.sin(kx*x))
# compute data for cylindrical gyroid
# r_aux is x, phi * 12 is y (12 unit cells around the circle) and z is z
fun_values = Gyroid(r_aux, phi * 12, z)
# compute Cartesian coordinates for grid points
x = r * np.cos(phi*ky)
y = r * np.sin(phi*ky)
grid = pv.StructuredGrid(x, y, z)
grid["vol3"] = fun_values.ravel('F')
contours3 = grid.contour([0])
# plot cylindrical gyroid
pv.set_plot_theme('document')
plotter = pv.Plotter()
plotter.add_mesh(contours3, scalars=contours3.points[:, -1],
                 show_scalar_bar=False)
plotter.add_bounding_box()
plotter.show_axes()
plotter.enable_terrain_style()
plotter.show()
If you want to look at a single transformed unit cell in the cylindrical setting, use a single domain of phi and z for the function and only convert to 1/12 a full circle for the grid points:
fun_values = Gyroid(r_aux, phi, z/3)
# compute Cartesian coordinates for grid points
x = r * np.cos(phi*ky/12)
y = r * np.sin(phi*ky/12)
grid = pv.StructuredGrid(x, y, z/3)
But it's not easy to see the curvature in the (no longer a) unit cell:

Python: Scipy's curve_fit for NxM arrays?

Usually I use scipy.optimize.curve_fit to fit custom functions to data.
Data in this case was always a 1 dimensional array.
Is there a similar function for a two dimensional array?
So, for example, I have a 10x10 numpy array. Then I have a function that does some stuff and creates a 10x10 numpy array, and I want to fit the function, so that the resulting 10x10 array has the best fit to the input array.
Maybe an example is better :)
data = pyfits.getdata('data.fits') #fits is an image format, this gives me a NxM numpy array
mod1 = pyfits.getdata('mod1.fits')
mod2 = pyfits.getdata('mod2.fits')
mod3 = pyfits.getdata('mod3.fits')
mod1_1D = numpy.ravel(mod1)
mod2_1D = numpy.ravel(mod2)
mod3_1D = numpy.ravel(mod3)
def dostuff(a,b): #originally this is a function for 2D arrays
    newdata = (mod1_1D*12)+(mod2_1D)**a - mod3_1D/b
    return newdata
Now a and b should be fitted, so that newdata is as close as possible to data.
What I got so far:
data1D = numpy.ravel(data)
data_X = numpy.arange(data1D.size)
fit = curve_fit(dostuff,data_X,data1D)
But print fit only gives me
(array([ 1.]), inf)
I do have some NaNs in the arrays, maybe that's a problem?
The goal is to express the 2D function as a 1D function: g(x, y, ...) --> f(xy, ...)
Converting the coordinate pair (x, y) into a single number xy may seem tricky at first, but it's actually quite simple. Just enumerate all data points and you have a single number that uniquely defines each coordinate pair. The fitted function simply has to reconstruct the original coordinates, do its calculations and return the result.
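To make the enumeration concrete, here is a tiny sketch of the round trip between the flat index and the (row, column) pair. It assumes row-major order, which is what np.ravel produces by default:

```python
import numpy as np

n, m = 10, 20                      # rows, columns
xy = np.arange(n * m)              # one integer per pixel, row-major order

i, j = np.divmod(xy, m)            # row (y) and column (x) indices
i2, j2 = np.unravel_index(xy, (n, m))

print(np.array_equal(i, i2) and np.array_equal(j, j2))  # True
print(i[-1], j[-1])                                      # 9 19
```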
Example that fits a 2D linear gradient in a 20x10 image:
import scipy as sp
import scipy.optimize  # make sure the optimize submodule is loaded
import numpy as np
import matplotlib.pyplot as plt
n, m = 10, 20
# noisy example data
x = np.arange(m).reshape(1, m)
y = np.arange(n).reshape(n, 1)
z = x + y * 2 + np.random.randn(n, m) * 3
def f(xy, a, b):
    i = xy // m  # reconstruct y coordinates
    j = xy % m  # reconstruct x coordinates
    out = i * a + j * b
    return out
xy = np.arange(z.size)  # 0 is the top left pixel and 199 is the bottom right pixel
res = sp.optimize.curve_fit(f, xy, np.ravel(z))
z_est = f(xy, *res[0])
z_est2d = z_est.reshape(n, m)
plt.subplot(2, 1, 1)
plt.plot(np.ravel(z), label='original')
plt.plot(z_est, label='fitted')
plt.legend()
plt.subplot(2, 2, 3)
plt.imshow(z)
plt.xlabel('original')
plt.subplot(2, 2, 4)
plt.imshow(z_est2d)
plt.xlabel('fitted')
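A variation on the same idea: as far as I can tell from the curve_fit documentation, xdata may also be a (k, M)-shaped array for functions of k predictors, so the coordinates can be passed in directly instead of being reconstructed from a flat index. Here is a rough sketch under that assumption, with synthetic data and a fixed seed chosen just for the demo:

```python
import numpy as np
from scipy.optimize import curve_fit

n, m = 10, 20
rng = np.random.default_rng(0)

# jj holds the column (x) index and ii the row (y) index of every pixel
jj, ii = np.meshgrid(np.arange(m), np.arange(n))
z = jj + 2 * ii + rng.normal(scale=0.1, size=(n, m))

def plane(coords, a, b):
    i, j = coords                  # coords has shape (2, n*m)
    return a * i + b * j

coords = np.vstack([ii.ravel(), jj.ravel()])
popt, pcov = curve_fit(plane, coords, z.ravel())
print(popt)                        # roughly [2, 1]

z_est2d = plane(coords, *popt).reshape(n, m)
```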
I would recommend using symfit for this, I wrote that to take care of all of the magic for you automatically.
In symfit you would just write the equation pretty much as you would on paper, and then you can run the fit.
I would do something like this:
from symfit import parameters, variables, Fit
# Assuming all this data is in the form of NxM arrays
data = pyfits.getdata('data.fits')
mod1 = pyfits.getdata('mod1.fits')
mod2 = pyfits.getdata('mod2.fits')
mod3 = pyfits.getdata('mod3.fits')
a, b = parameters('a, b')
x, y, z, u = variables('x, y, z, u')
model = {u: (x * 12) + y**a - z / b}
fit = Fit(model, x=mod1, y=mod2, z=mod3, u=data)
fit_result = fit.execute()
print(fit_result)
Unfortunately, I have not yet included examples of the kind you need in the docs, but if you look at the docs I think you can figure it out in case this doesn't work out of the box.

Plane fitting to 4 (or more) XYZ points

I have 4 points that lie very nearly in one plane: they belong to the 1,4-dihydropyridine cycle.
I need to calculate the distance from C3 and N1 to the plane defined by C1-C2-C4-C5.
Calculating the distance is OK, but fitting the plane is quite difficult for me.
[Images: the 1,4-DHP cycle, shown from two views]
from array import *
from numpy import *
from scipy import *
# coordinates (XYZ) of C1, C2, C4 and C5
x = [0.274791784, -1.001679346, -1.851320839, 0.365840754]
y = [-1.155674199, -1.215133985, 0.053119249, 1.162878076]
z = [1.216239624, 0.764265677, 0.956099579, 1.198231236]
# plane equation Ax + By + Cz = D
# non-fitted plane
abcd = [0.506645455682, -0.185724560275, -1.43998120646, 1.37626378129]
# creating distance variable
distance = zeros(4, float)
# calculating distance from point to plane
for i in range(4):
    distance[i] = (x[i]*abcd[0]+y[i]*abcd[1]+z[i]*abcd[2]+abcd[3])/sqrt(abcd[0]**2 + abcd[1]**2 + abcd[2]**2)
print(distance)
# calculating squares
squares = distance**2
print(squares)
How can I minimize sum(squares)? I have tried least squares, but it is too hard for me.
That sounds about right, but you should replace the nonlinear optimization with an SVD. The following creates the moment of inertia tensor, M, and then SVD's it to get the normal to the plane. This should be a close approximation to the least-squares fit and be much faster and more predictable. It returns the point-cloud center and the normal.
def planeFit(points):
    """
    p, n = planeFit(points)
    Given an array, points, of shape (d,...)
    representing points in d-dimensional space,
    fit a d-dimensional plane to the points.
    Return a point, p, on the plane (the point-cloud centroid),
    and the normal, n.
    """
    import numpy as np
    from numpy.linalg import svd
    points = np.reshape(points, (np.shape(points)[0], -1)) # Collapse trailing dimensions
    assert points.shape[0] <= points.shape[1], "There are only {} points in {} dimensions.".format(points.shape[1], points.shape[0])
    ctr = points.mean(axis=1)
    x = points - ctr[:,np.newaxis]
    M = np.dot(x, x.T) # Could also use np.cov(x) here.
    return ctr, svd(M)[0][:,-1]
For example: Construct a 2D cloud at (10, 100) that is thin in the x direction and 100 times bigger in the y direction:
>>> pts = np.diag((.1, 10)).dot(np.random.randn(2,1000)) + np.reshape((10, 100),(2,-1))
The fit plane is very nearly at (10, 100) with a normal very nearly along the x axis.
>>> planeFit(pts)
(array([ 10.00382471, 99.48404676]),
array([ 9.99999881e-01, 4.88824145e-04]))
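Applied to the four atoms from the question, the same idea looks roughly like this. It is a sketch that runs the SVD directly on the centered points, stored one point per row (a different layout from planeFit above), and then reads off the signed distances:

```python
import numpy as np

# C1, C2, C4, C5 from the question, one point per row
pts = np.array([
    [0.274791784, -1.155674199, 1.216239624],
    [-1.001679346, -1.215133985, 0.764265677],
    [-1.851320839, 0.053119249, 0.956099579],
    [0.365840754, 1.162878076, 1.198231236],
])

centroid = pts.mean(axis=0)
_, s, vt = np.linalg.svd(pts - centroid)
normal = vt[-1]                            # direction of least variance
distances = (pts - centroid) @ normal      # signed point-to-plane distances
sum_sq = np.sum(distances**2)

print(normal, distances, sum_sq)
```

The sum of squared distances equals the square of the smallest singular value. The distance of any other atom (C3, N1) to the plane is (atom - centroid) @ normal.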
Least squares should fit a plane easily. The equation for a plane is: ax + by + c = z. So set up matrices like this with all your data:
    [ x_0  y_0  1 ]
A = [ x_1  y_1  1 ]
    [     ...     ]
    [ x_n  y_n  1 ]
And
    [ a ]
x = [ b ]
    [ c ]
And
    [ z_0 ]
B = [ z_1 ]
    [ ... ]
    [ z_n ]
In other words: Ax = B. Now solve for x which are your coefficients. But since you have more than 3 points, the system is over-determined so you need to use the left pseudo inverse. So the answer is:
[ a ]
[ b ] = (A^T A)^-1 A^T B
[ c ]
And here is some simple Python code with an example:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
N_POINTS = 10
TARGET_X_SLOPE = 2
TARGET_y_SLOPE = 3
TARGET_OFFSET = 5
EXTENTS = 5
NOISE = 5
# create random data
xs = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
ys = [np.random.uniform(-EXTENTS, EXTENTS) for i in range(N_POINTS)]
zs = []
for i in range(N_POINTS):
    zs.append(xs[i]*TARGET_X_SLOPE + \
              ys[i]*TARGET_y_SLOPE + \
              TARGET_OFFSET + np.random.normal(scale=NOISE))
# plot raw data
plt.figure()
ax = plt.subplot(111, projection='3d')
ax.scatter(xs, ys, zs, color='b')
# do fit
tmp_A = []
tmp_b = []
for i in range(len(xs)):
    tmp_A.append([xs[i], ys[i], 1])
    tmp_b.append(zs[i])
b = np.matrix(tmp_b).T
A = np.matrix(tmp_A)
fit = (A.T * A).I * A.T * b
errors = b - A * fit
residual = np.linalg.norm(errors)
print("solution: %f x + %f y + %f = z" % (fit[0], fit[1], fit[2]))
print("errors:")
print(errors)
print("residual: {}".format(residual))
# plot plane
xlim = ax.get_xlim()
ylim = ax.get_ylim()
X,Y = np.meshgrid(np.arange(xlim[0], xlim[1]),
                  np.arange(ylim[0], ylim[1]))
Z = np.zeros(X.shape)
for r in range(X.shape[0]):
    for c in range(X.shape[1]):
        Z[r,c] = fit[0] * X[r,c] + fit[1] * Y[r,c] + fit[2]
ax.plot_wireframe(X,Y,Z, color='k')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
The solution for your points:
0.143509 x + 0.057196 y + 1.129595 = z
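The same fit can also be written with np.linalg.lstsq, which solves the over-determined system without forming (A^T A)^-1 explicitly. Here is a short sketch using the four points from the question:

```python
import numpy as np

x = np.array([0.274791784, -1.001679346, -1.851320839, 0.365840754])
y = np.array([-1.155674199, -1.215133985, 0.053119249, 1.162878076])
z = np.array([1.216239624, 0.764265677, 0.956099579, 1.198231236])

# One row [x_i, y_i, 1] per point, as in the matrix A above
A = np.column_stack([x, y, np.ones_like(x)])
coeffs, residuals, rank, _ = np.linalg.lstsq(A, z, rcond=None)
a, b, c = coeffs
print("solution: %f x + %f y + %f = z" % (a, b, c))
```

Note that this minimizes the vertical (z) residuals, not the perpendicular distances to the plane, which is a slightly different criterion from the SVD-based fits.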
The fact that you are fitting a plane is only slightly relevant here. What you are trying to do is minimize a particular function starting from a guess. For that, use scipy.optimize. Note that there is no guarantee that the result is the globally optimal solution, only a locally optimal one: a different initial condition may converge to a different result. This works well if you start close to the local minimum you are seeking.
I've taken the liberty to clean up your code by taking advantage of numpy's broadcasting:
import numpy as np
# coordinates (XYZ) of C1, C2, C4 and C5
XYZ = np.array([
[0.274791784, -1.001679346, -1.851320839, 0.365840754],
[-1.155674199, -1.215133985, 0.053119249, 1.162878076],
[1.216239624, 0.764265677, 0.956099579, 1.198231236]])
# Initial guess of the plane
p0 = [0.506645455682, -0.185724560275, -1.43998120646, 1.37626378129]
def f_min(X,p):
    plane_xyz = p[0:3]
    distance = (plane_xyz*X.T).sum(axis=1) + p[3]
    return distance / np.linalg.norm(plane_xyz)
def residuals(params, signal, X):
    return f_min(X, params)
from scipy.optimize import leastsq
sol = leastsq(residuals, p0, args=(None, XYZ))[0]
print("Solution: ", sol)
print("Old Error: ", (f_min(XYZ, p0)**2).sum())
print("New Error: ", (f_min(XYZ, sol)**2).sum())
This gives:
Solution: [ 14.74286241 5.84070802 -101.4155017 114.6745077 ]
Old Error: 0.441513295404
New Error: 0.0453564286112
This returns the 3D plane coefficients along with the RMSE of the fit.
The plane is provided in a homogeneous coordinate representation, meaning its dot product with the homogeneous coordinates of a point produces the distance between the two.
import numpy as np
def fit_plane(points):
    assert points.shape[1] == 3
    centroid = points.mean(axis=0)
    x = points - centroid[None, :]
    U, S, Vt = np.linalg.svd(x.T @ x)
    normal = U[:, -1]
    origin_distance = normal @ centroid
    rmse = np.sqrt(S[-1] / len(points))
    return np.hstack([normal, -origin_distance]), rmse
Minor note: the SVD can also be directly applied to the points instead of the outer product matrix, but I found it to be slower with NumPy's SVD implementation.
U, S, Vt = np.linalg.svd(x.T, full_matrices=False)
rmse = S[-1] / np.sqrt(len(points))
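To illustrate the homogeneous representation: with a unit normal, appending a 1 to each point and taking the dot product with the 4-vector gives the signed distance. Here is a small sketch with a hand-picked plane z = 1 (the plane and the test points are made up for the example):

```python
import numpy as np

# Plane z = 1 as [normal, -origin_distance] with unit normal (0, 0, 1)
plane = np.array([0.0, 0.0, 1.0, -1.0])

points = np.array([
    [3.0, 4.0, 5.0],
    [0.0, 0.0, 1.0],
    [1.0, 2.0, -2.0],
])
homogeneous = np.hstack([points, np.ones((len(points), 1))])

distances = homogeneous @ plane
print(distances)  # signed distances: 4 above, 0 on the plane, 3 below
```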
Another way aside from svd to quickly reach a solution while dealing with outliers ( when you have a large data set ) is ransac :
def fit_plane(voxels, iterations=50, inlier_thresh=10):  # voxels : x,y,z
    inliers, planes = [], []
    xy1 = np.concatenate([voxels[:, :-1], np.ones((voxels.shape[0], 1))], axis=1)
    z = voxels[:, -1].reshape(-1, 1)
    for _ in range(iterations):
        random_pts = voxels[np.random.choice(voxels.shape[0], voxels.shape[1] * 10, replace=False), :]
        plane_transformation, residual = fit_pts_to_plane(random_pts)
        # count points whose absolute z residual is within the threshold
        inliers.append((np.abs(z - np.matmul(xy1, plane_transformation)) <= inlier_thresh).sum())
        planes.append(plane_transformation)
    return planes[np.array(inliers).argmax()]
def fit_pts_to_plane(voxels):  # x y z (m x 3)
    # https://math.stackexchange.com/questions/99299/best-fitting-plane-given-a-set-of-points
    xy1 = np.concatenate([voxels[:, :-1], np.ones((voxels.shape[0], 1))], axis=1)
    z = voxels[:, -1].reshape(-1, 1)
    fit = np.matmul(np.matmul(np.linalg.inv(np.matmul(xy1.T, xy1)), xy1.T), z)
    errors = z - np.matmul(xy1, fit)
    residual = np.linalg.norm(errors)
    return fit, residual
Here's one way. If your points are P[1]..P[n] then compute the mean M of these and subtract it from each, getting points p[1]..p[n]. Then compute C = Sum{ p[i]*p[i]'} (the "covariance" matrix of the points). Next diagonalise C, that is find orthogonal U and diagonal E so that C = U*E*U'. If your points are indeed on a plane then one of the eigenvalues (i.e. the diagonal entries of E) will be very small (with perfect arithmetic it would be 0). In any case, if the j'th one of these is the smallest, then let the j'th column of U be N = (A, B, C) and compute D = -M'*N. These parameters define the "best" plane Ax + By + Cz + D = 0, the one such that the sum of the squares of the distances from the P[] to the plane is least.
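In NumPy the recipe above might look like the following sketch. The test points are synthetic, generated exactly on the plane z = 2x - y + 3 so the result is easy to check. np.linalg.eigh is used because C is symmetric and it returns the eigenvalues in ascending order:

```python
import numpy as np

rng = np.random.default_rng(1)
xy = rng.uniform(-1, 1, size=(50, 2))
P = np.column_stack([xy, 2 * xy[:, 0] - xy[:, 1] + 3])   # points on z = 2x - y + 3

M = P.mean(axis=0)                  # mean of the points
p = P - M                           # centered points
C = p.T @ p                         # 3x3 "covariance" (scatter) matrix
eigvals, U = np.linalg.eigh(C)      # eigenvalues in ascending order
N = U[:, 0]                         # eigenvector of the smallest eigenvalue
D = -M @ N

residuals = P @ N + D               # signed distances to the fitted plane
print(N, D, np.abs(residuals).max())
```

Here N comes out parallel to (2, -1, -1) up to sign, and the residuals vanish to rounding error because the points are exactly coplanar.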
