My input is a 2D (x, y) time series of a dot moving on a screen, recorded for tracker software. It has some noise that I want to remove using a Kalman filter. Can someone point me to Python code for a 2D Kalman filter?
In scipy cookbook I found only a 1d example:
http://www.scipy.org/Cookbook/KalmanFiltering
I saw that there is an implementation of the Kalman filter in OpenCV, but couldn't find code examples.
Thanks!
Here is my implementation of the Kalman filter based on the equations given on Wikipedia. Please be aware that my understanding of Kalman filters is very rudimentary, so there are most likely ways to improve this code. (For example, it suffers from the numerical instability problem discussed here. As I understand it, this only affects the numerical stability when Q, the motion noise, is very small. In real life the noise is usually not small, so fortunately, at least for my implementation, the numerical instability does not show up in practice.)
In the example below, kalman_xy assumes the state vector is a 4-tuple: 2 numbers for the location, and 2 numbers for the velocity.
The F and H matrices have been defined specifically for this state vector: If x is a 4-tuple state, then
new_x = F * x
position = H * x
It then calls kalman, which is the generalized Kalman filter. It is general in the sense it is still useful if you wish to define a different state vector -- perhaps a 6-tuple representing location, velocity and acceleration. You just have to define the equations of motion by supplying the appropriate F and H.
import numpy as np
import matplotlib.pyplot as plt

def kalman_xy(x, P, measurement, R,
              motion = np.matrix('0. 0. 0. 0.').T,
              Q = np.matrix(np.eye(4))):
    """
    Parameters:
    x: initial state 4-tuple of location and velocity: (x0, x1, x0_dot, x1_dot)
    P: initial uncertainty covariance matrix
    measurement: observed position
    R: measurement noise
    motion: external motion added to state vector x
    Q: motion noise (same shape as P)
    """
    return kalman(x, P, measurement, R, motion, Q,
                  F = np.matrix('''
                      1. 0. 1. 0.;
                      0. 1. 0. 1.;
                      0. 0. 1. 0.;
                      0. 0. 0. 1.
                      '''),
                  H = np.matrix('''
                      1. 0. 0. 0.;
                      0. 1. 0. 0.'''))

def kalman(x, P, measurement, R, motion, Q, F, H):
    '''
    Parameters:
    x: initial state
    P: initial uncertainty covariance matrix
    measurement: observed position (same shape as H*x)
    R: measurement noise (same shape as H)
    motion: external motion added to state vector x
    Q: motion noise (same shape as P)
    F: next state function: x_prime = F*x
    H: measurement function: position = H*x

    Return: the updated and predicted new values for (x, P)

    See also http://en.wikipedia.org/wiki/Kalman_filter

    This version of kalman can be applied to many different situations by
    appropriately defining F and H
    '''
    # UPDATE x, P based on measurement m
    # distance between measured and current position-belief
    y = np.matrix(measurement).T - H * x
    S = H * P * H.T + R  # residual covariance
    K = P * H.T * S.I    # Kalman gain
    x = x + K*y
    I = np.matrix(np.eye(F.shape[0]))  # identity matrix
    P = (I - K*H)*P

    # PREDICT x, P based on motion
    x = F*x + motion
    P = F*P*F.T + Q

    return x, P

def demo_kalman_xy():
    x = np.matrix('0. 0. 0. 0.').T
    P = np.matrix(np.eye(4))*1000  # initial uncertainty

    N = 20
    true_x = np.linspace(0.0, 10.0, N)
    true_y = true_x**2
    observed_x = true_x + 0.05*np.random.random(N)*true_x
    observed_y = true_y + 0.05*np.random.random(N)*true_y
    plt.plot(observed_x, observed_y, 'ro')
    result = []
    R = 0.01**2
    for meas in zip(observed_x, observed_y):
        x, P = kalman_xy(x, P, meas, R)
        result.append((x[:2]).tolist())
    kalman_x, kalman_y = zip(*result)
    plt.plot(kalman_x, kalman_y, 'g-')
    plt.show()

demo_kalman_xy()
The red dots show the noisy position measurements, the green line shows the Kalman predicted positions.
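The remark above about a 6-tuple state can be made concrete. The following is a minimal sketch rather than part of the implementation above: it assumes a constant-acceleration motion model with time step `dt` and the state ordering (x, y, vx, vy, ax, ay). The helper name `constant_acceleration_model` is made up for illustration, and plain NumPy arrays are used instead of `np.matrix`.

```python
import numpy as np

def constant_acceleration_model(dt=1.0):
    """Build F and H for the assumed state (x, y, vx, vy, ax, ay)."""
    I2 = np.eye(2)
    Z2 = np.zeros((2, 2))
    # Position gains v*dt + a*dt^2/2, velocity gains a*dt, acceleration is kept.
    F = np.block([
        [I2, dt * I2, 0.5 * dt**2 * I2],
        [Z2, I2, dt * I2],
        [Z2, Z2, I2],
    ])
    # Only the position is measured.
    H = np.hstack([I2, Z2, Z2])
    return F, H

F, H = constant_acceleration_model(dt=1.0)
state = np.array([0.0, 0.0, 1.0, 2.0, 0.5, 0.0])
next_state = F @ state      # one prediction step without noise
position = H @ next_state   # what a noiseless sensor would report
print(next_state, position)
```

To use these with `kalman`, one would presumably wrap them in `np.matrix` and supply a 6x1 `x` and `motion` together with 6x6 `P` and `Q`.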
For a project of mine, I needed to create intervals for time-series modeling, and to make the procedure more efficient I created tsmoothie: A python library for time-series smoothing and outlier detection in a vectorized way.
It provides different smoothing algorithms together with the possibility to compute intervals.
In the case of KalmanSmoother, you can smooth a curve by combining different components: level, trend, seasonality, long seasonality
import numpy as np
import matplotlib.pyplot as plt
from tsmoothie.smoother import *
from tsmoothie.utils_func import sim_randomwalk
# generate 3 random-walk time series of length 100
np.random.seed(123)
data = sim_randomwalk(n_series=3, timesteps=100,
                      process_noise=10, measure_noise=30)
# operate smoothing
smoother = KalmanSmoother(component='level_trend',
                          component_noise={'level':0.1, 'trend':0.1})
smoother.smooth(data)
# generate intervals
low, up = smoother.get_intervals('kalman_interval', confidence=0.05)
# plot the first smoothed timeseries with intervals
plt.figure(figsize=(11,6))
plt.plot(smoother.smooth_data[0], linewidth=3, color='blue')
plt.plot(smoother.data[0], '.k')
plt.fill_between(range(len(smoother.data[0])), low[0], up[0], alpha=0.3)
I point out also that tsmoothie can carry out the smoothing of multiple timeseries in a vectorized way
I am interested in integrating in Fourier space after using scipy to take an fft of some data. I have been following along with this stack exchange post numerical integration in Fourier space with numpy.fft but it does not properly integrate a few test cases I have been working with. I have added a few lines to address this issue but still am not recovering the correct integrals. Below is the code I have been using to integrate my test cases. At the top of the code are the 3 test cases I have been using.
import numpy as np
import scipy.special as sp
from scipy.fft import fft, ifft, fftfreq
import matplotlib.pyplot as plt

# set number of points in array
Ns = 2**16
# create array in space
x = np.linspace(-np.pi, np.pi, Ns)

# test case 1 from stack exchange post
# y = np.exp(-x**2) # function f(x)
# ys = np.exp(-x**2) * (-2 *x) # derivative f'(x)
# test case 2
# y = np.exp(-x**2) * x - 1/2 *np.sqrt(np.pi)*sp.erf(x)
# ys = np.exp(-x**2) * -2*x**2
# test case 3
y = np.sin(x**2) + (1/4)*np.exp(x)
ys = 1/4*(np.exp(x) + 8*x*np.cos(x**2))

# find spacing in space array
ss = x[1]-x[0]

# define fft integration function
def fft_int(N,s,dydt):
    # create frequency array
    f = fftfreq(N,s)
    # integration step ignoring divide by 0 errors
    Fys = fft(dydt)
    with np.errstate(divide="ignore", invalid="ignore"):
        modFys = Fys / (2*np.pi*1j*f)
    # set DC term to 0, was a nan since we divided by 0
    modFys[0] = 0
    # take inverse fft and subtract by integration constant
    fourier = ifft(modFys)
    fourier = fourier-fourier[0]
    # tilt correction if function doesn't approach 0 at its ends
    tilt = np.sum(dydt)*s*(np.arange(0,N)/(N-1) - 1/2)
    fourier = fourier + tilt
    return fourier
Test case 1 was from the stack exchange post from above. If you copy paste the code from the top answer and plot you'll get something like this:
with the solid blue line being the fft integration method and the dashed orange as the analytic solution. I account for this offset with the following line of code:
fourier = fourier-fourier[0]
since I don't believe the code was setting the constant of integration.
Next for test case 2 I get a plot like this:
again with the solid blue line being the fft integration method and the dashed orange as the analytic solution. I account for this tilt in the solution using the following lines of code
tilt = np.sum(dydt)*s*(np.arange(0,N)/(N-1) - 1/2)
fourier = fourier + tilt
Finally we arrive at test case 3. Which results in the following plot:
again with the solid blue line being the fft integration method and the dashed orange as the analytic solution. This is where I'm stuck, this offset has appeared again and I'm not sure why.
TLDR: How do I correctly integrate a function in fourier space using scipy.fft?
The tilt component makes no sense. It fixes one function, but it's not a generic solution of the problem.
The problem is that the FFT induces periodicity in the signal, meaning you compute the integral of a different function. Multiplying the FFT of the signal by 1/(2*np.pi*1j*f) is equivalent to a circular convolution of the signal with ifft(1/(2*np.pi*1j*f)). "Circular" is the key here. This is just a boundary problem.
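As an illustration of this point, here is a small sketch with an assumed test function (cos(x) sampled over exactly one period, which is not one of the test cases in the question). Because the sampled derivative really is periodic, dividing by 2*pi*1j*f recovers the antiderivative sin(x) to machine precision, with no padding, tilt or offset fix needed. The grid size `N = 256` is an arbitrary choice.

```python
import numpy as np
from scipy.fft import fft, ifft, fftfreq

N = 256
# endpoint=False so that the samples cover exactly one period
x = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
s = x[1] - x[0]
dy = np.cos(x)  # derivative of sin(x), periodic on this grid

f = fftfreq(N, s)
F = fft(dy)
F[1:] = F[1:] / (2j * np.pi * f[1:])  # skip the DC bin to avoid 0/0
F[0] = 0.0                            # zero-mean antiderivative
y = np.real(ifft(F))

max_error = np.max(np.abs(y - np.sin(x)))
print(max_error)
```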
Padding the function with zeros is one way to attempt to fix this:
import numpy as np
import scipy.special as sp
from scipy.fft import fft, ifft, fftfreq
import matplotlib.pyplot as plt

def fft_int(s, dydt, N=0):
    dydt_padded = np.pad(dydt, (0, N))
    f = fftfreq(dydt_padded.shape[0], s)
    F = fft(dydt_padded)
    with np.errstate(divide="ignore", invalid="ignore"):
        F = F / (2*np.pi*1j*f)
    F[0] = 0
    y_padded = np.real(ifft(F))
    y = y_padded[0:dydt.shape[0]]
    return y - np.mean(y)

N = 2**16
x = np.linspace(-np.pi, np.pi, N)
s = x[1] - x[0]

# Test case 3
y = np.sin(x**2) + (1/4)*np.exp(x)
dy = 1/4*(np.exp(x) + 8*x*np.cos(x**2))

plt.plot(y - np.mean(y))
plt.plot(fft_int(s, dy))
plt.plot(fft_int(s, dy, N))
plt.plot(fft_int(s, dy, 10*N))
plt.show()
(Blue is expected output, computed solution without padding is orange, and with increasing amount of padding, green and red.)
Here I've solved the "offset" problem by plotting all functions with their mean removed. Setting the DC component to 0 is equal to subtracting the mean. But after cropping off the padding the mean changes, so fft_int subtracts the mean again after cropping.
Anyway, note how we get an increasingly better approximation as the padding increases. To get the exact result, one would need an infinite amount of padding, which of course is unrealistic.
Test case #1 doesn't need padding, the function reaches zero at the edges of the sampled domain. We can impose such a behavior on the other cases too. In Discrete Fourier analysis this is called windowing. This would look something like this:
def fft_int(s, dydt):
    dydt_windowed = dydt * np.hanning(dydt.shape[0])
    f = fftfreq(dydt.shape[0], s)
    F = fft(dydt_windowed)
    with np.errstate(divide="ignore", invalid="ignore"):
        F = F / (2*np.pi*1j*f)
    F[0] = 0
    y = np.real(ifft(F))
    return y
However, here we get correct integration results only in the middle of the domain, with increasingly suppressed values towards the ends. So this is not a practical solution either.
My conclusion is that no, this is not possible to do. It is much easier to compute the integral with np.cumsum:
yp = np.cumsum(dy) * s
plt.plot(y - np.mean(y))
plt.plot(yp - np.mean(yp))
plt.show()
(not showing output: the two plots overlap perfectly.)
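A possible refinement of the cumulative sum, offered here as a sketch rather than as part of the answer above: SciPy's `cumulative_trapezoid` applies the trapezoidal rule and, with `initial=0.0`, returns an array of the same length as the input that starts at zero. This assumes SciPy 1.6 or newer (older releases call the function `cumtrapz`). Test case 3 from the question is reused, and the result is compared against `y - y[0]` instead of removing the means.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

N = 2**16
x = np.linspace(-np.pi, np.pi, N)

# Test case 3: analytic antiderivative y and its derivative dy
y = np.sin(x**2) + 0.25 * np.exp(x)
dy = 0.25 * (np.exp(x) + 8 * x * np.cos(x**2))

# Running integral of dy, starting from 0 at x[0]
yp = cumulative_trapezoid(dy, x, initial=0.0)

max_error = np.max(np.abs(yp - (y - y[0])))
print(max_error)
```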
Since learning about point charges in my physics II class this semester, I want to be able to investigate not only the static force and field distributions but the actual trajectories of movement of electrically charged particles. The first stage in doing this is to build a naive engine for simulating the dynamics of n individual point particles. I've implemented the solution using matrices in python and was hoping someone could comment on whether I've done so correctly. As I don't know what kind of dynamics to expect, I can't tell directly from the videos that my implementation of my equations is correct.
My Particular Problem
In particular, I cannot tell whether my calculation of the force magnitude computes the 1/r^(3/2) factor correctly. Why? Because when I simulate a dipole and use 2/2 as the exponent, the particles start moving in an elliptical orbit, which is what I would expect. However, when I use the correct exponent, I get this: Where is my code going wrong? What am I supposed to expect?
I'll first write down the equations I'm using:
Given $n$ charges $q_1, q_2, \dots, q_n$ with masses $m_1, m_2, \dots, m_n$, located at initial positions $r_1, r_2, \dots, r_n$ with velocities $\frac{d}{dt}r_1, \frac{d}{dt}r_2, \dots, \frac{d}{dt}r_n$, the force induced on $q_i$ by $q_j$ is given by
$$F_{j \to i} = k \frac{q_i q_j}{\lVert r_i - r_j \rVert^{3/2}} (r_i - r_j)$$
Now, the net marginal force on particle $q_i$ is given as the sum of the pairwise forces
$$F_{N,i} = \sum_{j \ne i} F_{j \to i}$$
And then the net acceleration of particle $q_i$ just normalizes the force by the mass of the particle:
$$\frac{d^2}{dt^2} r_i = \frac{F_{N,i}}{m_i}$$
In total, for $n$ particles, we have a system of $n$ second-order differential equations. We will also need to specify $n$ initial particle velocities and $n$ initial positions.
To implement this in python, I need to be able to compute pairwise point distances and pairwise charge multiples. To do this I tile the q vector of charges and the r vector of positions and take, respectively, their product and difference with their transpose.
import numpy as np
import numpy.ma as ma
from scipy.integrate import odeint

def integrator_func(y, t, q, m, n, d, k):
    y = np.copy(y.reshape((n*2,d)))

    # rj across, ri down
    rs_from = np.tile(y[:n], (n,1,1))

    # ri across, rj down
    rs_to = np.transpose(rs_from, axes=(1,0,2))

    # directional distance between each r_i and r_j
    # dr_ij is the force from j onto i, i.e. r_i - r_j
    dr = rs_to - rs_from

    # Used as a mask to ignore divides by zero between r_i and r_i
    nd_identity = np.eye(n).reshape((n,n,1))

    # WHAT I AM UNSURE ABOUT
    drmag = ma.array(
        np.power(
            np.sum(np.power(dr, 2), 2)
        ,3./2)
    ,mask=nd_identity)

    # Pairwise q_i*q_j for force equation
    qsa = np.tile(q, (n,1))
    qsb = np.tile(q, (n,1)).T
    qs = qsa*qsb

    # Directional forces
    Fs = (k*qs/drmag).reshape((n,n,1))

    # Dividing by m to obtain acceleration vectors
    a = np.sum(Fs*dr, 1)

    # Setting velocities
    y[:n] = np.copy(y[n:])

    # Entering the acceleration into the velocity slot
    y[n:] = np.copy(a)

    # Flattening it out for scipy.odeint to work properly
    return np.array(y).reshape(n*2*d)

def sim_particles(t, r, v, q, m, k=1.):
    """
    With n particles in d dimensions:

    t: timepoints to integrate over
    r: n*d matrix. The d-dimensional initial positions of n particles
    v: n*d matrix of initial particle velocities
    q: n*1 matrix of particle charges
    m: n*1 matrix of particle masses
    k: electric constant.
    """
    d = r.shape[-1]
    n = r.shape[0]
    y0 = np.zeros((n*2,d))
    y0[:n] = r
    y0[n:] = v
    y0 = y0.reshape(n*2*d)
    yf = odeint(
        integrator_func,
        y0,
        t,
        args=(q,m,n,d,k)).reshape(t.shape[0],n*2,d)
    return yf
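The tiling step and the exponent can be checked in isolation. The sketch below uses three made-up 2D positions and charges, with `q` assumed to be a 1-D array. It builds the pairwise differences with broadcasting and the charge products with `np.outer`, and it shows that raising the sum of squared components to the power 3/2 gives the cube of the distance, so the denominator computed in `integrator_func` is |r_i - r_j|^3.

```python
import numpy as np

# Made-up example: n = 3 particles in d = 2 dimensions
r = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
q = np.array([1.0, -1.0, 2.0])

# dr[i, j] = r_i - r_j, same layout as rs_to - rs_from
dr = r[:, None, :] - r[None, :, :]

# qs[i, j] = q_i * q_j, same as the product of the tiled arrays
qs = np.outer(q, q)

# Euclidean distance between every pair of particles
dist = np.linalg.norm(dr, axis=2)

# The expression used for drmag (without the mask)
cubed_via_power = np.power(np.sum(dr**2, axis=2), 3.0 / 2.0)

print(dist[0, 1], cubed_via_power[0, 1])
print(np.allclose(cubed_via_power, dist**3))
```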
I have been working to implement a Kalman filter to search for anomalies in a two-dimensional data set, very similar to the excellent post that I found here. As a next step, I'd like to predict confidence intervals (for example, 95% confidence floor and ceiling values) that I expect the next values to fall in. So in addition to the line below, I'd like to be able to generate two additional lines which represent a 95% confidence that the next value will be above the floor or below the ceiling.
I assume that I'll want to use the uncertainty covariance matrix (P) that is returned with each prediction generated by the Kalman filter but I'm not sure if it's right. Any guidance or reference to how to do this would be much appreciated!
kalman 2d filter in python
The code in the post above generates a set of measurements over time and uses a Kalman filter to smooth the results.
import numpy as np
import matplotlib.pyplot as plt

def kalman_xy(x, P, measurement, R,
              motion = np.matrix('0. 0. 0. 0.').T,
              Q = np.matrix(np.eye(4))):
    """
    Parameters:
    x: initial state 4-tuple of location and velocity: (x0, x1, x0_dot, x1_dot)
    P: initial uncertainty covariance matrix
    measurement: observed position
    R: measurement noise
    motion: external motion added to state vector x
    Q: motion noise (same shape as P)
    """
    return kalman(x, P, measurement, R, motion, Q,
                  F = np.matrix('''
                      1. 0. 1. 0.;
                      0. 1. 0. 1.;
                      0. 0. 1. 0.;
                      0. 0. 0. 1.
                      '''),
                  H = np.matrix('''
                      1. 0. 0. 0.;
                      0. 1. 0. 0.'''))

def kalman(x, P, measurement, R, motion, Q, F, H):
    '''
    Parameters:
    x: initial state
    P: initial uncertainty covariance matrix
    measurement: observed position (same shape as H*x)
    R: measurement noise (same shape as H)
    motion: external motion added to state vector x
    Q: motion noise (same shape as P)
    F: next state function: x_prime = F*x
    H: measurement function: position = H*x

    Return: the updated and predicted new values for (x, P)

    See also http://en.wikipedia.org/wiki/Kalman_filter

    This version of kalman can be applied to many different situations by
    appropriately defining F and H
    '''
    # UPDATE x, P based on measurement m
    # distance between measured and current position-belief
    y = np.matrix(measurement).T - H * x
    S = H * P * H.T + R  # residual covariance
    K = P * H.T * S.I    # Kalman gain
    x = x + K*y
    I = np.matrix(np.eye(F.shape[0]))  # identity matrix
    P = (I - K*H)*P

    # PREDICT x, P based on motion
    x = F*x + motion
    P = F*P*F.T + Q

    return x, P

def demo_kalman_xy():
    x = np.matrix('0. 0. 0. 0.').T
    P = np.matrix(np.eye(4))*1000  # initial uncertainty

    N = 20
    true_x = np.linspace(0.0, 10.0, N)
    true_y = true_x**2
    observed_x = true_x + 0.05*np.random.random(N)*true_x
    observed_y = true_y + 0.05*np.random.random(N)*true_y
    plt.plot(observed_x, observed_y, 'ro')
    result = []
    R = 0.01**2
    for meas in zip(observed_x, observed_y):
        x, P = kalman_xy(x, P, meas, R)
        result.append((x[:2]).tolist())
    kalman_x, kalman_y = zip(*result)
    plt.plot(kalman_x, kalman_y, 'g-')
    plt.show()

demo_kalman_xy()
The 2D generalization of the 1-sigma interval is the confidence ellipse, which is characterized by the equation (x-mx).T P^{-1} (x-mx) = 1, with x being the 2D parameter vector, mx the 2D mean (the ellipse center) and P^{-1} the inverse covariance matrix. See this answer on how to draw one. Like the sigma intervals, the ellipse's area corresponds to a fixed probability that the true value lies within it. By scaling with a factor n (scaling the interval length or the ellipse radii), a higher confidence can be reached. Note that the factors n correspond to different probabilities in one and two dimensions:
| `n` | 1D interval | 2D ellipse |
|-----|-------------|------------|
| 1   | 68.27%      | 39.35%     |
| 2   | 95.45%      | 86.47%     |
| 3   | 99.73%      | 98.89%     |
Calculating these values in 2D is a bit involved and unfortunately I don't have a public reference to it.
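For what it is worth, under the assumption of Gaussian errors both columns have closed forms, so the table can be reproduced with the standard library. In 1D the coverage of an n-sigma interval is erf(n/sqrt(2)). In 2D the squared Mahalanobis distance follows a chi-square distribution with 2 degrees of freedom, whose CDF is 1 - exp(-x/2), which gives 1 - exp(-n^2/2). The function names below are made up for illustration.

```python
import math

def coverage_1d(n):
    """Probability that a 1D Gaussian value lies within n sigma of the mean."""
    return math.erf(n / math.sqrt(2.0))

def coverage_2d(n):
    """Probability that a 2D Gaussian vector lies inside the n-sigma ellipse."""
    return 1.0 - math.exp(-0.5 * n**2)

table = [(n, coverage_1d(n), coverage_2d(n)) for n in (1, 2, 3)]
for n, p1, p2 in table:
    print(f"n={n}: 1D {100 * p1:.2f}%  2D {100 * p2:.2f}%")
```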
If you want a 95% interval to predict the next values will fall in, then you want a prediction interval and not a confidence interval (http://en.wikipedia.org/wiki/Prediction_interval).
For 2-D (3-D) data, the semi-axes of the ellipse (ellipsoid) can be found by calculating the eigenvalues of the covariance matrix of the data and adjusting the size of the semi-axes to account for the necessary prediction probability.
See Prediction ellipse and prediction ellipsoid for Python code to calculate the 95% prediction ellipse or ellipsoid.
This might help you to calculate the prediction ellipse for your data.
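The eigenvalue recipe above can be sketched as follows. This is a simplified illustration, not the code from the linked page: it assumes the 2x2 covariance is known exactly (or estimated from a large sample), so that the chi-square quantile with 2 degrees of freedom, -2*ln(1 - p), sets the scale. For a covariance estimated from a small sample a larger factor would be needed. The function name and the example matrix are made up.

```python
import math
import numpy as np

def ellipse_from_covariance(cov, probability=0.95):
    """Semi-axes (ascending) and major-axis angle in degrees of the ellipse."""
    # Chi-square quantile for 2 degrees of freedom has a closed form.
    scale = -2.0 * math.log(1.0 - probability)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    semi_axes = np.sqrt(scale * eigenvalues)
    major = eigenvectors[:, -1]  # direction of largest variance
    angle = math.degrees(math.atan2(major[1], major[0]))
    return semi_axes, angle

# Made-up position covariance, e.g. np.asarray(P[:2, :2]) from a Kalman filter
P_position = np.array([[4.0, 0.0], [0.0, 1.0]])
semi_axes, angle = ellipse_from_covariance(P_position)
print(semi_axes, angle)
```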
Because your statistic is of course derived from a sample, the probability that the population statistic is greater than the sample's 2-sigma value is 0.5. I would therefore question whether you really have a good prediction of the value that the next measurement will fall below with probability 0.95 unless you apply an upper confidence factor to the 2-sigma value. The magnitude of that factor depends on the sample size used to derive the covariance matrix: the smaller the sample, the larger the factor needed to reach a 0.95 probability that the population statistic is less than the scaled-up sample statistic.
Update: I have modified the Optimize, Eigen and Solve methods to reflect changes. All now return the "same" vector, allowing for machine precision. I am still stumped on the Eigen method. Specifically, how and why I select a slice of the eigenvector does not make sense. It was just trial and error until the normal matched the other solutions. If anyone can correct or explain what I really should do, or why what I have done works, I would appreciate it.
Thanks to Alexander Kramer for explaining why I take a slice; I am only allowed to select one correct answer.
I have a depth image. I want to calculate a crude surface normal for a pixel in the depth image. I consider the surrounding pixels, in the simplest case a 3x3 matrix, fit a plane to these points, and calculate the unit normal vector of this plane.
Sounds easy, but I thought it best to verify the plane fitting algorithms first. Searching SO and various other sites, I see methods using least squares, singular value decomposition, eigenvectors/values etc.
Although I don't fully understand the maths I have been able to get the various fragments/example to work. The problem I am having, is that I am getting different answers for each method. I was expecting the various answers would be similar (not exact), but they seem significantly different. Perhaps some methods are not suited to my data, but not sure why I am getting different results. Any ideas why?
Here is the Updated output of the code:
LTSQ: [ -8.10792259e-17 7.07106781e-01 -7.07106781e-01]
SVD: [ 0. 0.70710678 -0.70710678]
Eigen: [ 0. 0.70710678 -0.70710678]
Solve: [ 0. 0.70710678 0.70710678]
Optim: [ -1.56069661e-09 7.07106781e-01 7.07106782e-01]
The following code implements five different methods to calculate the surface normal of a plane. The algorithms/code were sourced from various forums on the internet.
import numpy as np
import scipy.optimize

def fitPLaneLTSQ(XYZ):
    # Fits a plane to a point cloud,
    # Where Z = aX + bY + c ----Eqn #1
    # Rearranging Eqn1: aX + bY -Z +c =0
    # Gives normal (a,b,-1)
    # Normal = (a,b,-1)
    [rows,cols] = XYZ.shape
    G = np.ones((rows,3))
    G[:,0] = XYZ[:,0]  #X
    G[:,1] = XYZ[:,1]  #Y
    Z = XYZ[:,2]
    (a,b,c),resid,rank,s = np.linalg.lstsq(G,Z)
    normal = (a,b,-1)
    nn = np.linalg.norm(normal)
    normal = normal / nn
    return normal

def fitPlaneSVD(XYZ):
    [rows,cols] = XYZ.shape
    # Set up constraint equations of the form AB = 0,
    # where B is a column vector of the plane coefficients
    # in the form b(1)*X + b(2)*Y +b(3)*Z + b(4) = 0.
    p = (np.ones((rows,1)))
    AB = np.hstack([XYZ,p])
    [u, d, v] = np.linalg.svd(AB,0)
    B = v[3,:]  # Solution is last column of v.
    nn = np.linalg.norm(B[0:3])
    B = B / nn
    return B[0:3]

def fitPlaneEigen(XYZ):
    # Works, in this case but don't understand!
    average=sum(XYZ)/XYZ.shape[0]
    covariant=np.cov(XYZ - average)
    eigenvalues,eigenvectors = np.linalg.eig(covariant)
    want_max = eigenvectors[:,eigenvalues.argmax()]
    (c,a,b) = want_max[3:6]  # Do not understand! Why 3:6? Why (c,a,b)?
    normal = np.array([a,b,c])
    nn = np.linalg.norm(normal)
    return normal / nn

def fitPlaneSolve(XYZ):
    X = XYZ[:,0]
    Y = XYZ[:,1]
    Z = XYZ[:,2]
    npts = len(X)
    A = np.array([ [sum(X*X), sum(X*Y), sum(X)],
                   [sum(X*Y), sum(Y*Y), sum(Y)],
                   [sum(X),   sum(Y), npts] ])
    B = np.array([ [sum(X*Z), sum(Y*Z), sum(Z)] ])
    normal = np.linalg.solve(A,B.T)
    nn = np.linalg.norm(normal)
    normal = normal / nn
    return normal.ravel()

def fitPlaneOptimize(XYZ):
    def residiuals(parameter,f,x,y):
        return [(f[i] - model(parameter,x[i],y[i])) for i in range(len(f))]

    def model(parameter, x, y):
        a, b, c = parameter
        return a*x + b*y + c

    X = XYZ[:,0]
    Y = XYZ[:,1]
    Z = XYZ[:,2]
    p0 = [1., 1.,1.]  # initial guess
    result = scipy.optimize.leastsq(residiuals, p0, args=(Z,X,Y))[0]
    normal = result[0:3]
    nn = np.linalg.norm(normal)
    normal = normal / nn
    return normal

if __name__=="__main__":
    XYZ = np.array([
        [0,0,1],
        [0,1,2],
        [0,2,3],
        [1,0,1],
        [1,1,2],
        [1,2,3],
        [2,0,1],
        [2,1,2],
        [2,2,3]
        ])
    print("Solve: ", fitPlaneSolve(XYZ))
    print("Optim: ", fitPlaneOptimize(XYZ))
    print("SVD: ", fitPlaneSVD(XYZ))
    print("LTSQ: ", fitPLaneLTSQ(XYZ))
    print("Eigen: ", fitPlaneEigen(XYZ))
Optimize
The normal vector of a plane a*x + b*y +c*z = 0, equals (a,b,c)
The optimize method finds values for a and b such that a*x+b*y~z (~ denotes "approximates"). It omits the value of c from the calculation altogether. I don't have numpy installed on this machine, but I expect that changing the model to (a*x+b*y)/c should fix this method. It will not give the same result for all data sets. This method will always assume a plane that goes through the origin.
SVD and LTSQ
produce the same results. (The difference is about the size of machine precision).
Eigen
The wrong eigenvector is chosen. The eigenvector corresponding to the greatest eigenvalue (lambda = 1.50) is x=[0, sqrt(2)/2, sqrt(2)/2] just as in the SVD and LTSQ.
Solve
I have no clue how this is supposed to work.
In the Eigen solution, the normal vector of the plane is the eigenvector for the smallest eigenvalue. Some eigen implementations sort the eigenvalues and eigenvectors, others don't. So in some implementations it's sufficient to take the first (or last) eigenvector as the normal, while in others you have to sort them first. On the other hand, the majority of SVD implementations provide sorted values, so it's simply the first (or last) vector.
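A minimal sketch of that approach, under a few assumptions: the 3x3 covariance of the point coordinates is used (`rowvar=False`, so columns are treated as variables), and `np.linalg.eigh` is used because it returns the eigenvalues of a symmetric matrix in ascending order, which makes the first eigenvector the normal. The function name is made up, and the sign of the returned normal is arbitrary.

```python
import numpy as np

def fit_plane_normal_eigen(XYZ):
    """Unit normal of the best-fit plane through the rows of XYZ."""
    centered = XYZ - XYZ.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)  # 3x3 matrix over x, y, z
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # ascending order
    return eigenvectors[:, 0]  # eigenvector of the smallest eigenvalue

# The 3x3 patch from the question, lying on the plane z = y + 1
XYZ = np.array([
    [0, 0, 1], [0, 1, 2], [0, 2, 3],
    [1, 0, 1], [1, 1, 2], [1, 2, 3],
    [2, 0, 1], [2, 1, 2], [2, 2, 3],
], dtype=float)

normal = fit_plane_normal_eigen(XYZ)
print(normal)
```

Up to sign, this should agree with the SVD and LTSQ results listed in the question.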
I wanted to compute the volume of the intersection of a sphere and an infinite cylinder at some distance b, and I figured I would do it using a quick and dirty Python script. My requirements are a <1s computation with >3 significant digits.
My thinking was as such:
We place the sphere, with radius R, such that its center is at the origin, and we place the cylinder, with radius R', such that its axis is spanned in z from (b,0,0). We integrate over the sphere, using a step function that returns 1 if we are inside the cylinder, and 0 if not, thus integrating 1 over the set constrained by being inside both sphere and cylinder, i.e. the intersect.
I tried this using scipy.integrate.tplquad. It did not work out. I think it's because of the discontinuity of the step function, as I get warnings such as the following. Of course, I might just be doing this wrong. Assuming I have not made some stupid mistake, I could attempt to formulate the ranges of the intersection, thus removing the need for the step function, but I figured I might try and get some feedback first. Can anyone spot any mistake, or point towards some simple solution?
Warning: The maximum number of
subdivisions (50) has been achieved.
If increasing the limit yields no
improvement it is advised to analyze
the integrand in order to determine
the difficulties. If the position of
a local difficulty can be
determined (singularity,
discontinuity) one will probably
gain from splitting up the interval
and calling the integrator on the
subranges. Perhaps a special-purpose
integrator should be used.
Code:
from scipy.integrate import tplquad
from math import sqrt

def integrand(z, y, x):
    if Rprim >= (x - b)**2 + y**2:
        return 1.
    else:
        return 0.

def integral():
    return tplquad(integrand, -R, R,
                   lambda x: -sqrt(R**2 - x**2),            # lower y
                   lambda x: sqrt(R**2 - x**2),             # upper y
                   lambda x,y: -sqrt(R**2 - x**2 - y**2),   # lower z
                   lambda x,y: sqrt(R**2 - x**2 - y**2),    # upper z
                   epsabs=1.e-01, epsrel=1.e-01
                   )

R=1
Rprim=1
b=0.5
print(integral())
Assuming you are able to translate and scale your data in such a way that the origin of the sphere is at [0, 0, 0] and its radius is 1, then a simple stochastic approximation may give you a reasonable answer fast enough. So, something along these lines could be a good starting point:
import numpy as np

def in_sphere(p, r=1.):
    return np.sqrt((p**2).sum(0)) <= r

def in_cylinder(p, c, r=1.):
    m = np.mean(c, 1)[:, None]
    pm = p - m
    d = np.diff(c - m)
    d = d / np.sqrt((d**2).sum())  # unit vector along the cylinder axis
    pp = np.dot(np.dot(d, d.T), pm)  # projection of the points onto the axis
    return np.sqrt(((pp - pm)**2).sum(0)) <= r

def in_sac(p, c, r_c):
    return np.logical_and(in_sphere(p), in_cylinder(p, c, r_c))

if __name__ == '__main__':
    n, c = 10**6, np.array([[0, 1], [0, 1], [0, 1]])
    p = 2 * np.random.rand(3, n) - 1  # uniform samples in the cube [-1, 1]^3
    print((in_sac(p, c, 1).sum() / n) * 2**3)
Performing a triple adaptive numerical integration on a discontinuous function that is constant over two domains is a terribly poor idea, especially if you wish to see either speed or accuracy.
I would suggest a far better idea is to reduce the problem analytically.
Align the cylinder with an axis, by transformation. This translates the sphere to some point that is not at the origin.
Now, find the limits of intersection of the sphere with the cylinder along that axis.
Integrate over that axis variable. The area of intersection at any fixed value along the axis is simply the area of intersection of two circles, which in turn is simply computable using trigonometry and a little effort.
In the end, you will have an exact result, with almost no computation time needed.
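The steps above might be sketched as follows, using the geometry from the question (sphere of radius `R` at the origin, cylinder of radius `Rc` with its axis parallel to z at distance `b`). This is one possible reading of the suggestion: the area of intersection of two circles is taken from the standard lens formula, and the remaining integral over z is done numerically with `quad` instead of in closed form. The function names are made up for illustration.

```python
from math import acos, pi, sqrt
from scipy.integrate import quad

def circle_overlap_area(r1, r2, d):
    """Area of intersection of two circles whose centers are d apart."""
    if d >= r1 + r2:
        return 0.0                      # disjoint circles
    if d <= abs(r1 - r2):
        return pi * min(r1, r2) ** 2    # one circle lies inside the other
    # Clamp the cosines to guard against rounding just outside [-1, 1].
    c1 = max(-1.0, min(1.0, (d*d + r1*r1 - r2*r2) / (2.0 * d * r1)))
    c2 = max(-1.0, min(1.0, (d*d + r2*r2 - r1*r1) / (2.0 * d * r2)))
    kite = 0.5 * sqrt(max(0.0, (-d + r1 + r2) * (d + r1 - r2)
                               * (d - r1 + r2) * (d + r1 + r2)))
    return r1*r1 * acos(c1) + r2*r2 * acos(c2) - kite

def sphere_cylinder_volume(R, Rc, b):
    """Volume common to a sphere (radius R) and a cylinder (radius Rc, offset b)."""
    def slice_area(z):
        r = sqrt(max(0.0, R*R - z*z))   # radius of the sphere's slice at height z
        return circle_overlap_area(r, Rc, b)
    half, _ = quad(slice_area, 0.0, R)
    return 2.0 * half                   # the intersection is symmetric in z

print(sphere_cylinder_volume(1.0, 1.0, 0.5))
```

For b = 0 the result can be checked against the closed form 4*pi/3 * (R^3 - (R^2 - Rc^2)^(3/2)), valid for Rc <= R.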
I solved it using a simple MC integration, as suggested by eat, but my implementation was too slow. My requirements had increased. I therefore reformulated the problem mathematically, as suggested by woodchips.
Basically, I formulated the limits of x as a function of z and y, and the limits of y as a function of z. Then I, in essence, integrated f(x,y,z)=1 over the intersection, using these limits. I did this because of the speed increase, which allows me to plot volume vs b, and because it allows me to integrate more complex functions with relatively minor modification.
I include my code in case anyone is interested.
from scipy.integrate import quad
from math import sqrt
from math import pi

def x_max(y,r):
    return sqrt(r**2-y**2)

def x_min(y,r):
    return max(-sqrt(r**2 - y**2), -sqrt(R**2 - y**2) + b)

def y_max(r):
    if (R<b and b-R<r) or (R>b and b-R>r):
        return sqrt( R**2 - (R**2-r**2+b**2)**2/(4.*b**2) )
    elif r+R<b:
        return 0.
    else: #r+b<R
        return r

def z_max():
    if R>b:
        return R
    else:
        return sqrt(2.*b*R - b**2)

def delta_x(y, r):
    return x_max(y,r) - x_min(y,r)

def int_xy(z):
    r = sqrt(R**2 - z**2)
    return quad(delta_x, 0., y_max(r), args=(r,))

def int_xyz():
    return quad(lambda z: int_xy(z)[0], 0., z_max())

R=1.
Rprim=1.
b=0.5
print(4*int_xyz()[0])
First off: You can calculate the volume of the intersection by hand. If you don't want to (or can't) do that, here's an alternative:
I'd generate a tetrahedral mesh for the domain and then add up the cell volumes. An example with pygalmesh and meshplex (both authored by myself):
import pygalmesh
import meshplex
import numpy
ball = pygalmesh.Ball([0, 0, 0], 1.0)
cyl = pygalmesh.Cylinder(-1, 1, 0.7, 0.1)
u = pygalmesh.Intersection([ball, cyl])
mesh = pygalmesh.generate_mesh(u, cell_size=0.05, edge_size=0.1)
points = mesh.points
cells = mesh.cells["tetra"]
# kick out unused vertices
uvertices, uidx = numpy.unique(cells, return_inverse=True)
cells = uidx.reshape(cells.shape)
points = points[uvertices]
mp = meshplex.MeshTetra(points, cells)
print(sum(mp.cell_volumes))
This gives you
and prints 2.6567890958740463 as volume. Decrease cell or edge sizes for higher precision.