Gradient descent optimization for multivariate scalar functions - python

I attempted to test my gradient descent program on the Rosenbrock function. But no matter how I adjusted my learning rate (step argument), precision (precision argument) and number of iterations (iterations argument), I couldn't get a very close result.
import numpy as np

def minimize(f, f_grad, x, step=1e-3, iterations=1e3, precision=1e-3):
    count = 0
    while True:
        last_x = x
        x = x - step * f_grad(x)
        count += 1
        if count > iterations or np.linalg.norm(x - last_x) < precision:
            break
    return x

def rosenbrock(x):
    """The Rosenbrock function"""
    return sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0)

def rosenbrock_grad(x):
    """Gradient of Rosenbrock function"""
    xm = x[1:-1]
    xm_m1 = x[:-2]
    xm_p1 = x[2:]
    der = np.zeros_like(x)
    der[1:-1] = 200*(xm-xm_m1**2) - 400*(xm_p1 - xm**2)*xm - 2*(1-xm)
    der[0] = -400*x[0]*(x[1]-x[0]**2) - 2*(1-x[0])
    der[-1] = 200*(x[-1]-x[-2]**2)
    return der

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
minimize(rosenbrock, rosenbrock_grad, x0, step=1e-6, iterations=1e4, precision=1e-6)
For example, the code above gives me array([ 1.01723267, 1.03694999, 1.07870143, 1.16693184, 1.36404334]). But if I use any of the built-in optimization methods in scipy.optimize, I get an answer that is very close to, or exactly equal to, array([ 1., 1., 1., 1., 1.]) (the true answer).
However, if I use a very small step, a very small precision and a very large number of iterations in my program, the calculation just takes forever on my computer.
I wonder if this is due to
1) any bugs in my program, or
2) gradient descent simply being inefficient here, demanding a very small step, a very small precision and a very large number of iterations to yield a very close solution, or
3) the need for some special feature scaling.
(P.S. I also tried to plot a two-dimensional graph with the value of the function on the y axis and the number of iterations on the x axis to "debug" the gradient descent, but even though I get a nice-looking downward-sloping graph, the solution is still not very close.)

To quote the Rosenbrock Wikipedia page:
The global minimum is inside a long, narrow, parabolic shaped flat valley. To find the valley is trivial. To converge to the global minimum, however, is difficult.
Gradient descent is a simple algorithm, so it is probably no surprise that it cannot find the minimum. Let's see what happens in 2D for different starting points:
Just as Wikipedia says: it easily finds the valley but then fails to converge further. The gradient along the valley is very flat compared to the rest of the function.
I would conclude that your implementation works correctly, but perhaps the Rosenbrock function is not the most appropriate function to test it on.
Contrary to other answers, I further argue that the step size is too small rather than too large. The problem is not overshooting but that the algorithm gets stuck. If I set the step size to 1e-3 without changing the other settings, the algorithm converges to the minimum within two digits. This happens despite overshooting the valley from some starting positions in the 2D case - but you need that speed not to get stuck later on, so to say.
Here is the modified code to reproduce above figure:
import numpy as np
import matplotlib.pyplot as plt

def minimize(f, f_grad, x, step=1e-3, iterations=1e3, precision=1e-3):
    count = 0
    while True:
        last_x = x
        x_hist.append(x)
        x = x - step * f_grad(x)
        count += 1
        if count > iterations or np.linalg.norm(x - last_x) < precision:
            x_hist.append(x)
            break
    return x

def rosenbrock(x):
    """The Rosenbrock function"""
    return sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0)

def rosenbrock_grad(x):
    """Gradient of Rosenbrock function"""
    xm = x[1:-1]
    xm_m1 = x[:-2]
    xm_p1 = x[2:]
    der = np.zeros_like(x)
    der[1:-1] = 200*(xm-xm_m1**2) - 400*(xm_p1 - xm**2)*xm - 2*(1-xm)
    der[0] = -400*x[0]*(x[1]-x[0]**2) - 2*(1-x[0])
    der[-1] = 200*(x[-1]-x[-2]**2)
    return der

k = np.linspace(0, 2, 101)
f = np.empty((k.shape[0], k.shape[0]))
for i, y in enumerate(k):
    for j, x in enumerate(k):
        f[i, j] = rosenbrock(np.array([x, y]))
plt.imshow(np.log10(f), extent=[k[0], k[-1], k[-1], k[0]], cmap='autumn')

for start in [[0.5, 0.5], [1.0, 0.5], [1.5, 0.5],
              [0.5, 1.0], [1.0, 1.0], [1.5, 1.0],
              [0.5, 1.5], [1.0, 1.5], [1.5, 1.5]]:
    x0 = np.array(start)
    x_hist = []
    minimize(rosenbrock, rosenbrock_grad, x0, step=1e-6, iterations=1e4, precision=1e-9)
    x_hist = np.array(x_hist)
    plt.plot(x_hist[:, 0], x_hist[:, 1], 'k')
    plt.plot(x0[0], x0[1], 'ok')

Your method is vulnerable to overshoot. In a case with an instantaneously high gradient, your solution will jump very far. It is often appropriate in optimization to refuse to take a step when it fails to reduce the cost.
Linesearch
Once you have chosen a direction by computing the gradient, search along that direction until you reduce cost by some fraction of the norm of the gradient.
I.e., start with x[n+1] = x[n] - α * gradient, and vary α from 1.0 towards 0.0, accepting a value for x if it has reduced the cost by some fraction of the norm of the gradient. This is a nice convergence rule termed the Armijo rule.
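For concreteness, here is a minimal backtracking sketch of that idea, plugged into the question's minimize() interface; the sufficient-decrease constant c and the shrink factor beta are conventional choices rather than values from this answer, and the test used is the common Armijo form f(x - α·g) ≤ f(x) - c·α·‖g‖²:

import numpy as np

def minimize_armijo(f, f_grad, x, alpha0=1.0, c=1e-4, beta=0.5,
                    iterations=1e4, precision=1e-9):
    count = 0
    while count < iterations:
        count += 1
        g = f_grad(x)
        alpha = alpha0
        # shrink alpha until the step gives a sufficient decrease in f
        while f(x - alpha * g) > f(x) - c * alpha * np.dot(g, g):
            alpha *= beta
            if alpha < 1e-12:   # no usable decrease found; take a tiny step and continue
                break
        new_x = x - alpha * g
        if np.linalg.norm(new_x - x) < precision:
            return new_x
        x = new_x
    return x

# used like the original: minimize_armijo(rosenbrock, rosenbrock_grad, x0)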
Other advice
Consider optimizing the 2D Rosenbrock function first, and plotting your path over that cost field.
Consider numerically verifying that your gradient implementation is correct. More often than not, this is the problem.
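For example, a quick central-difference check of the question's rosenbrock_grad might look like this (a sketch; eps is an arbitrary small step):

import numpy as np

def numeric_grad(f, x, eps=1e-6):
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
# should print a very small number if rosenbrock_grad is correct
print(np.max(np.abs(numeric_grad(rosenbrock, x0) - rosenbrock_grad(x0))))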

Imagine you're hiking along a knife-edge mountain path that's getting narrower and narrower. A constant step-size will take you over the edge, aieeeee; you want to take shorter, more careful steps as you climb. Similarly, to follow a Rosenbrock valley, a program must take shorter, more careful steps as the valley narrows.
Decreasing the step-size as step0 / t^0.5 or step0 / t^0.25 helps GD on Rosenbrock a bit, but the result is still very sensitive to step0.
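A minimal sketch of such a schedule, dropped into the question's minimize() loop (step0 and power are assumed names for the knobs):

import numpy as np

def minimize_decay(f_grad, x, step0=1e-3, power=0.5, iterations=1e4, precision=1e-9):
    count = 0
    while True:
        count += 1
        last_x = x
        x = x - (step0 / count**power) * f_grad(x)   # step shrinks as count grows
        if count > iterations or np.linalg.norm(x - last_x) < precision:
            break
    return x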
Real step-sizes, a.k.a. learning rates, must adapt to the problem terrain, e.g. line search for smooth problems, Ada* methods for SGD.
By the way, the Rosenbrock function is a sum of squares, and there are powerful methods for that; see scipy.optimize.least_squares.
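For instance, splitting the Rosenbrock sum of squares into its individual residuals (10·(x[i+1] - x[i]²) and (1 - x[i]), which follows directly from the function's definition) lets scipy.optimize.least_squares exploit that structure; a sketch:

import numpy as np
from scipy.optimize import least_squares

def rosenbrock_residuals(x):
    return np.concatenate([10.0*(x[1:] - x[:-1]**2), 1.0 - x[:-1]])

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
res = least_squares(rosenbrock_residuals, x0)
print(res.x)   # should land very close to all ones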

Related

How to approximate function for geometrically growing sequence?

I have the function to create x0:
x0 = []
for i in range(0, N):
    if i == 0:
        a = 0.4
    else:
        a = round(0.4 + 0.3*2**(i-1), 1)
    print(i, a)
    x0.append(a)
which gives me a growing sequence: [0.4, 0.7, 1.0, 1.6, 2.8, 5.2, 10.0, 19.6, 38.8, 77.2, ...]. I want to find a function that fits these points. I don't want to use polynomials because N can be different; I need a single parametric function, and the projection needs to be very general.
My approach is using the code:
def fun(x, a, b):
    return np.cos(x**(2/3) * a + b)

# Make fit #
y0 = [0]*len(x0)
p, c = curve_fit(f=fun, xdata=np.array(x0), ydata=y0)
x = np.linspace(0, max(x0), 10000)
plt.plot(x, fun(x, *p))
plt.scatter(x0, y0)
That function's oscillations seem to be too wide for the starting points, while they fit the last ones quite well. I also tried to lower the initial oscillations by multiplying the function by x, but the period is still too wide at the beginning. Is it possible to find a good oscillating function that goes through (almost) all those points? I don't know how to set the exponent inside x**(...), because making it a fit parameter causes the fit to estimate it as close to 1, which is not what I need. Can I set the power in sin(x**b) that way? If not, what family of functions should I try?
Below is the plot for the function multiplied by b*x. The oscillations at the first points should be much denser.
Thanks to the suggestions I've found the best fit and I think it can't be better.
The solution is:
def fun(x, a, b, c):
    return np.cos(np.pi*(np.log2((x-a)/b) + c))
and the fit method looks like
p, c = curve_fit(f=fun, xdata=np.array(x0), ydata=y0, bounds=([0, -np.inf, -np.inf], [x0[0], np.inf, np.inf]))
It's important to set bounds for a to avoid convergence failures or "Residuals are not finite in the initial point" errors. Now each point has its own zero crossing, despite the wild behavior of the function close to 0 in the domain. The parameters come out close to 0 or 1, not tending to infinity.

Numeric Gradient Descent in python

I wrote this code to run gradient descent on a multivariate function, where f is the function, X0 is the starting point and eta is the step size.
It is essentially made up of two parts: the first obtains the gradient of the function, and the second iterates over x, subtracting the gradient.
The problem is that it usually has trouble converging on some functions. For example, if we take f(X) = (X[0]-20)**4 + (X[1]-25)**4, the gradient descent does not converge to [20, 25].
Is there something I need to change or add?
def descenso_grad(f,X0,eta):
    def grad(f,X):
        import numpy as np
        def partial(g,k,X):
            h=1e-9
            Y=np.copy(X)
            X[k-1]=X[k-1]+h
            dp=(g(X)-g(Y))/h
            return dp
        grd=[]
        for i in np.arange(0,len(X)):
            ai=partial(f,i+1,X)
            grd.append(ai)
        return grd
    #iterations
    i=0
    while True:
        i=i+1
        X0=X0-eta*np.array(grad(f,X0))
        if np.linalg.norm(grad(f,X0))<10e-8 or i>400: break
    return X0
Your gradient descent implementation is a good basic implementation, but your gradient sometimes oscillates and explodes. First, we should note that your gradient descent does not always diverge. For some combinations of eta and X0, it actually converges.
But first let me suggest a few edits to the code:
The import numpy as np statement should be at the top of your file, not within a function. In general, any import statement should be at the beginning of the code so that it is executed only once.
It is better not to nest functions but to separate them: you can write the partial function outside of the grad function, and the grad function outside of the descenso_grad function. It is better for debugging.
I strongly recommend passing parameters such as the learning rate (eta), the number of steps (steps) and the tolerance (set to 10e-8, i.e. 1e-7, in your code) as arguments to the descenso_grad function. This way you will be able to compare their influence on the result.
Anyway, here is the implementation of your code I will use in this answer:
import numpy as np

def partial(g, k, X):
    h = 1e-9
    Y = np.copy(X)
    X[k - 1] = X[k - 1] + h
    dp = (g(X) - g(Y)) / h
    return dp

def grad(f, X):
    grd = []
    for i in np.arange(0, len(X)):
        ai = partial(f, i + 1, X)
        grd.append(ai)
    return grd

def descenso_grad(f, X0, eta, steps, tolerance=1e-7):
    # iterations
    i = 0
    while True:
        i = i + 1
        X0 = X0 - eta * np.array(grad(f, X0))
        if np.linalg.norm(grad(f, X0)) < tolerance or i > steps: break
    return X0

def f(X):
    return (X[0]-20)**4 + (X[1]-25)**4
Now, about the convergence. I said that your implementation didn't always diverge. Indeed:
X0 = [2, 30]
eta = 0.001
steps = 400
xmin = descenso_grad(f, X0, eta, steps)
print(xmin)
Will print [20.55359068 25.55258024]
But:
X0 = [2, 0]
eta = 0.001
steps = 400
xmin = descenso_grad(f, X0, eta, steps)
print(xmin)
Will actually diverge to [ 2.42462695e+01 -3.54879793e+10]
1) What happened
Your gradient is actually oscillating around the y axis. Let's compute the gradient of f at X0 = [2, 0]:
print(grad(f, X0))
We get grad(f, X0) = [-23328.00067961216, -62500.01024454831], which is quite high but in the right direction.
Now let's compute the next step of gradient descent:
eta = 0.001
X1=X0-eta*np.array(grad(f,X0))
print(X1)
We get X1 = [25.32800068 62.50001025]. We can see that on the x axis we actually get closer to the minimum, but on the y axis the gradient descent jumped to the other side of the minimum and went even further from it. Indeed, X0[1] was at a distance of 25 from the minimum (Xmin[1] - X0[1] = 25), to its left, while X1[1] is now at a distance of 62.5 - 25 = 37.5, to its right. Since the curve drawn by f has a simple U shape around the y axis, the value taken by f at X1 will be higher than before (to simplify, we ignore the influence of the x coordinate).
If we look at the next steps, we can clearly see the exploding oscillations around the minimum:
X0 = [2, 0]
eta = 0.001
steps = 10
#record values taken by X[1] during gradient descent
curve_y = [X0[1]]
i = 0
while True:
    i = i + 1
    X0 = X0 - eta * np.array(grad(f, X0))
    curve_y.append(X0[1])
    if np.linalg.norm(grad(f, X0)) < 10e-8 or i > steps: break
print(curve_y)
We get [0, 62.50001024554831, -148.43710232226067, 20719.6258707022, -35487979280.37413]. We can see that X[1] gets further and further from the minimum while oscillating around it.
In order to illustrate this, let's assume that the value along the x axis is fixed, and look only at what happens on the y axis. The picture shows in black the oscillations of the function's values taken at each step of the gradient descent (synthetic data for the purpose of illustration only). The gradient descent takes us further from the minimum at each step because the update value is too large:
Note that the gradient descent we gave as an example makes only 5 steps while we programmed 10 steps. This is because when the values taken by the function are too high, Python cannot distinguish between f(X[1]) and f(X[1] + h), so it computes a gradient equal to zero:
h = 1e-9  # same finite-difference step as in partial()
x = 24    # for the example
y = -35487979280.37413
z = f([x, y+h]) - f([x, y])
print(z)
We get 0.0. This is a floating-point precision issue; we will come back to it later.
So, these oscillations are due to the combination of:
the very large value of the partial derivative with respect to the y axis
a value of eta that is too large to compensate for the exploding gradient in the update.
If this is true, we might converge if we use a smaller learning rate. Let's check:
X0 = [2, 0]
# divide eta by 100
eta = 0.0001
steps = 400
xmin = descenso_grad(f, X0, eta, steps)
print(xmin)
We will get [18.25061287 23.24796497]. We might need more steps but we are converging this time!!
2) How to avoid that?
A) In your specific case
Since the function's shape is simple, with no local minima or saddle points, we can avoid this issue by simply clipping the gradient values. This means we cap the magnitude of each gradient component:
def grad_clipped(f, X, clip):
    grd = []
    for i in np.arange(0, len(X)):
        ai = partial(f, i + 1, X)
        if ai < 0:
            ai = max(ai, -1*clip)
        else:
            ai = min(ai, clip)
        grd.append(ai)
    return grd

def descenso_grad_clipped(f, X0, eta, steps, clip=100, tolerance=10e-8):
    # iterations
    i = 0
    while True:
        i = i + 1
        X0 = X0 - eta * np.array(grad_clipped(f, X0, clip))
        if np.linalg.norm(grad_clipped(f, X0, clip)) < tolerance or i > steps: break
    return X0
Let's test it using the diverging example:
X0 = [2, 0]
eta = 0.001
steps = 400
clip=100
xmin = descenso_grad_clipped(f, X0, eta, steps, clip)
print(xmin)
This time we are converging: [19.31583901 24.20307188]. Note that this can slow down the process, since the gradient descent takes smaller steps. Here we can get closer to the real minimum by increasing the number of steps.
Note that this technique also solves the numerical precision issue we faced when the function's value was too high.
B) In general
In general, there are many pitfalls that gradient descent algorithms try to avoid (exploding or vanishing gradients, saddle points, local minima...). Optimizers such as Adam, RMSprop and Adagrad try to avoid these pitfalls.
I am not going to delve into the details because this would deserve a whole article; however, here are two resources you can use (I suggest reading them in the given order) to deepen your understanding of the topic:
A good article on towardsdatascience.com explaining the basics of gradient descent and its most common flaws
An overview of gradient descent algorithms
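To give a flavour of what those adaptive methods do, here is a minimal Adagrad-style variant of descenso_grad (a sketch, not taken from the linked resources): each coordinate's effective step shrinks as its squared gradients accumulate, which tames the exploding y-updates discussed above.

import numpy as np

def descenso_grad_adagrad(f, X0, eta=1.0, steps=4000, eps=1e-8):
    X = np.array(X0, dtype=float)
    acc = np.zeros_like(X)            # running sum of squared gradient components
    for _ in range(steps):
        g = np.array(grad(f, X))      # reuses the grad() defined earlier in this answer
        acc += g**2
        X = X - eta * g / (np.sqrt(acc) + eps)
    return X

print(descenso_grad_adagrad(f, [2, 0]))  # heads towards [20, 25]; accuracy depends on eta and steps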

Error propagation in a linear fit using python

Let's say I take multiple measurements of some dependent variable y relative to some independent variable x. I also record the uncertainty dy in each measurement. As an example this may look like:
import numpy as np
x = np.array([1, 2, 3, 4])
y = np.array([4.1, 5.8, 8.1, 9.7])
dy = np.array([0.2, 0.3, 0.2, 0.4])
Now assume I expect the measured values to obey a linear relationship y = mx + b and I want to determine the y value y_unm for some unmeasured x value x_unm. I can perform a linear fit in Python pretty easily if I don't consider the error:
fit_params, residuals, rank, s_values, rcond = np.polyfit(x, y, 1, full=True)
poly_func = np.poly1d(fit_params)
x_unm # The unmeasured x value
y_unm = poly_func(x_unm) # The predicted y value at x_unm
I have two problems with this approach. First is that np.polyfit does not incorporate the error on each point. Second is that I have no idea what the uncertainty on y_unm is.
Does anyone know how to fit data with uncertainties in a way that would allow me to determine the uncertainty in y_unm?
This is a problem that can be solved analytically, but it is perhaps better suited to a math/statistics discussion. For example see (among many sources):
https://www.che.udel.edu/pdf/FittingData.pdf
The error in the fit can be calculated analytically. It is important to note though that the fit itself is different when accounting for errors in the measurements.
In Python I am not sure of a built-in function that handles errors, but here is an example of doing a chi-squared minimization using scipy.optimize.fmin:
#Calculate Chi^2 function to minimize
def chi_2(params, x, y, sigy):
    m, c = params
    return sum(((y - m*x - c)/sigy)**2)

data_in = (x, y, dy)
params0 = [1, 0]
q = fmin(chi_2, params0, args=data_in)
For comparison I used this, your polyfit solution, and the analytic solution, and plotted the results for the data you gave.
The results for the parameters from the given techniques:
Weighted Chi-squared with fmin:
m=1.94609996
b=2.1312239
Analytic:
m=1.94609929078014
b=2.1312056737588647
Polyfit:
m=1.91
b=2.15
Linear fits to given data
Here is the full code:
import numpy as np
from scipy.optimize import fmin
import matplotlib.pyplot as plt
x = np.array([1, 2, 3, 4])
y = np.array([4.1, 5.8, 8.1, 9.7])
dy = np.array([0.2, 0.3, 0.2, 0.4])
#Calculate Chi^2 function to minimize
def chi_2(params, x, y, sigy):
    m, c = params
    return sum(((y - m*x - c)/sigy)**2)
data_in=(x,y,dy)
params0=[1,0]
q=fmin(chi_2,params0,args=data_in)
#Unweighted fit to compare
a=np.polyfit(x,y,deg=1)
#Analytic solution
sx=sum(x/dy**2)
sx2=sum(x**2/dy**2)
s1=sum(1./dy**2)
sy=sum(y/dy**2)
sxy=sum(x*y/dy**2)
ma=(s1*sxy-sx*sy)/(s1*sx2-sx**2)
ba=(sx2*sy-sx*sxy)/(sx2*s1-sx**2)
xplt=np.linspace(0,5,100)
yplt1=xplt*q[0]+q[1]
yplt2=xplt*a[0]+a[1]
yplt3=xplt*ma+ba
plt.figure()
plt.plot(xplt,yplt1,label='Error Weighted',color='black')
plt.plot(xplt,yplt2,label='Non-Error Weighted',color='blue')
plt.plot(xplt,yplt3,label='Error Weighted Analytic',linestyle='--',color='red')
plt.errorbar(x,y,yerr=dy,fmt='ko')
plt.legend()
plt.show()
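The question also asked for the uncertainty on a predicted y_unm. One common approach, not shown above, is to take the parameter covariance from a weighted fit (for example scipy.optimize.curve_fit with sigma and absolute_sigma=True) and propagate it to the prediction; a sketch, where x_unm = 5.0 is just an illustrative point:

import numpy as np
from scipy.optimize import curve_fit

x = np.array([1, 2, 3, 4])
y = np.array([4.1, 5.8, 8.1, 9.7])
dy = np.array([0.2, 0.3, 0.2, 0.4])

def line(x, m, b):
    return m*x + b

pars, cov = curve_fit(line, x, y, sigma=dy, absolute_sigma=True)
x_unm = 5.0                                  # an example unmeasured point
y_unm = line(x_unm, *pars)
# var(y) = x^2*var(m) + var(b) + 2*x*cov(m, b) for y = m*x + b
dy_unm = np.sqrt(x_unm**2*cov[0, 0] + cov[1, 1] + 2*x_unm*cov[0, 1])
print(y_unm, dy_unm)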

using undetermined number of parameters in scipy function curve_fit

First question:
I'm trying to fit experimental data with a function of the following form:
f(x) = m_o*(1-exp(-t_o*x)) + ... + m_j*(1-exp(-t_j*x))
Currently, I can't find a way to have an undetermined number of parameters m_j, t_j; I'm forced to do something like this:
def fitting_function(x, m_1, t_1, m_2, t_2):
    return m_1*(1.-numpy.exp(-t_1*x)) + m_2*(1.-numpy.exp(-t_2*x))
parameters, covariance = curve_fit(fitting_function, xExp, yExp, maxfev = 100000)
(xExp and yExp are my experimental points)
Is there a way to write my fitting function like this:
def fitting_function(x, li):
    res = 0.
    for idx in range(len(li) / 2):
        res += li[2*idx]*(1-numpy.exp(-li[2*idx+1]*x))
    return res
where li is the list of fitting parameters, and then do a curve fit? I don't know how to tell curve_fit what the number of fitting parameters is.
When I try this kind of form for fitting_function, I get errors like "ValueError: Unable to determine number of fit parameters."
Second question:
Is there any way to force my fitting parameters to be positive?
Any help appreciated :)
See my question and answer here. I've also made a minimal working example demonstrating how it could be done for your application. I make no claims that this is the best way - I am muddling through all this myself, so any critiques or simplifications are appreciated.
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as pl

def wrapper(x, *args): #take a list of arguments and break it down into two lists for the fit function to understand
    N = len(args)/2
    amplitudes = list(args[0:N])
    timeconstants = list(args[N:2*N])
    return fit_func(x, amplitudes, timeconstants)

def fit_func(x, amplitudes, timeconstants): #the actual fit function
    fit = np.zeros(len(x))
    for m, t in zip(amplitudes, timeconstants):
        fit += m*(1.0-np.exp(-t*x))
    return fit

def gen_data(x, amplitudes, timeconstants, noise=0.1): #generate some fake data
    y = np.zeros(len(x))
    for m, t in zip(amplitudes, timeconstants):
        y += m*(1.0-np.exp(-t*x))
    if noise:
        y += np.random.normal(0, noise, size=len(x))
    return y

def main():
    x = np.arange(0, 100)
    amplitudes = [1, 2, 3]
    timeconstants = [0.5, 0.2, 0.1]
    y = gen_data(x, amplitudes, timeconstants, noise=0.01)
    p0 = [1, 2, 3, 0.5, 0.2, 0.1]
    popt, pcov = curve_fit(lambda x, *p0: wrapper(x, *p0), x, y, p0=p0) #call with lambda function
    yfit = gen_data(x, popt[0:3], popt[3:6], noise=0)
    pl.plot(x, y, x, yfit)
    pl.show()
    print popt
    print pcov

if __name__=="__main__":
    main()
A word of warning, though. A linear sum of exponentials is going to make the fit EXTREMELY sensitive to any noise, particularly for a large number of parameters. You can test that by adding even a small amount of noise to the data generated in the script - even small deviations cause it to get the wrong answer entirely while the fit still looks perfectly valid by eye (test with noise=0, 0.01, and 0.1). Be very careful interpreting your results even if the fit looks good. It's also a form that allows for variable swapping: the best fit solution is the same even if you swap any pair (m_i, t_i) with (m_j, t_j), meaning your chi-square has multiple identical local minima, so your variables may get swapped around during fitting depending on your initial conditions. This is unlikely to be a numerically robust way to extract these parameters.
To your second question, yes, you can, by defining your exponentials like so:
m_0**2*(1.0-np.exp(-t_0**2*x)) + ...
Basically, square them all in your actual fit function, fit them, and then square the results (which could be negative or positive) to get your actual parameters. You can also constrain variables to lie within a certain range by using different proxy forms.
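As a side note, newer SciPy versions (0.17 and later) let you enforce positivity directly instead of the squaring trick: curve_fit accepts a bounds argument. Assuming the same wrapper, x, y and p0 as in the script above, the call inside main() could become, for example:

popt, pcov = curve_fit(lambda x, *p: wrapper(x, *p), x, y, p0=p0,
                       bounds=(0, np.inf))   # all amplitudes and time constants kept non-negative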

3d integral, python, integration set constrained

I wanted to compute the volume of the intersection of a sphere and an infinite cylinder whose axis is at some distance b, and I figured I would do it using a quick and dirty Python script. My requirements are a <1 s computation with >3 significant digits.
My thinking was as follows:
We place the sphere, with radius R, such that its center is at the origin, and we place the cylinder, with radius R', such that its axis runs along z through (b, 0, 0). We integrate over the sphere, using a step function that returns 1 if we are inside the cylinder and 0 if not, thus integrating 1 over the set constrained by being inside both the sphere and the cylinder, i.e. the intersection.
I tried this using scipy.integrate.tplquad. It did not work out. I think it's because of the discontinuity of the step function, as I get warnings such as the following. Of course, I might just be doing this wrong. Assuming I have not made some stupid mistake, I could attempt to formulate the ranges of the intersection, thus removing the need for the step function, but I figured I might try to get some feedback first. Can anyone spot a mistake, or point towards some simpler solution?
Warning: The maximum number of subdivisions (50) has been achieved. If increasing the limit yields no improvement it is advised to analyze the integrand in order to determine the difficulties. If the position of a local difficulty can be determined (singularity, discontinuity) one will probably gain from splitting up the interval and calling the integrator on the subranges. Perhaps a special-purpose integrator should be used.
Code:
from scipy.integrate import tplquad
from math import sqrt

def integrand(z, y, x):
    # inside the cylinder of radius Rprim centred on the axis through (b, 0, 0)
    if Rprim**2 >= (x - b)**2 + y**2:
        return 1.
    else:
        return 0.

def integral():
    return tplquad(integrand, -R, R,
                   lambda x: -sqrt(R**2 - x**2),            # lower y
                   lambda x: sqrt(R**2 - x**2),             # upper y
                   lambda x, y: -sqrt(R**2 - x**2 - y**2),  # lower z
                   lambda x, y: sqrt(R**2 - x**2 - y**2),   # upper z
                   epsabs=1.e-01, epsrel=1.e-01)

R = 1
Rprim = 1
b = 0.5
print integral()
Assuming you are able to translate and scale your data in such a way that the origin of the sphere is at [0, 0, 0] and its radius is 1, a simple stochastic approximation may give you a reasonable answer fast enough. Something along these lines could be a good starting point:
import numpy as np

def in_sphere(p, r=1.):
    return np.sqrt((p**2).sum(0)) <= r

def in_cylinder(p, c, r=1.):
    m = np.mean(c, 1)[:, None]
    pm = p - m
    d = np.diff(c - m)
    d = d / np.sqrt((d**2).sum())        # unit vector along the cylinder axis
    pp = np.dot(np.dot(d, d.T), pm)
    return np.sqrt(((pp - pm)**2).sum(0)) <= r

def in_sac(p, c, r_c):
    return np.logical_and(in_sphere(p), in_cylinder(p, c, r_c))

if __name__ == '__main__':
    n, c = 1e6, [[0, 1], [0, 1], [0, 1]]
    p = 2*np.random.rand(3, int(n)) - 1  # sample the cube [-1, 1]^3 enclosing the unit sphere
    print (in_sac(p, c, 1).sum() / n) * 2**3
Performing a triple adaptive numerical integration on a discontinuous function that is constant over two domains is a terribly poor idea, especially if you care about either speed or accuracy.
I would suggest a far better idea is to reduce the problem analytically.
Align the cylinder with an axis, by transformation. This translates the sphere to some point that is not at the origin.
Now, find the limits of intersection of the sphere with the cylinder along that axis.
Integrate over that axis variable. The area of intersection at any fixed value along the axis is simply the area of intersection of two circles, which in turn is simply computable using trigonometry and a little effort.
In the end, you will have an exact result, with almost no computation time needed.
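A sketch of that reduction for the setup in the question (sphere of radius R at the origin, cylinder of radius Rprim whose axis runs along z through (b, 0, 0)): the cross-section at height z is the overlap of two circles, whose area is the standard circular-lens formula, so only a one-dimensional quadrature remains. It should agree with the quad-based reformulation below.

import numpy as np
from scipy.integrate import quad

def circle_overlap(r1, r2, d):
    """Area of the intersection of two circles with radii r1, r2 and centres d apart."""
    if d >= r1 + r2:                 # disjoint
        return 0.0
    if d <= abs(r1 - r2):            # one circle entirely inside the other
        return np.pi*min(r1, r2)**2
    a1 = r1**2*np.arccos((d**2 + r1**2 - r2**2)/(2*d*r1))
    a2 = r2**2*np.arccos((d**2 + r2**2 - r1**2)/(2*d*r2))
    a3 = 0.5*np.sqrt((-d + r1 + r2)*(d + r1 - r2)*(d - r1 + r2)*(d + r1 + r2))
    return a1 + a2 - a3

R, Rprim, b = 1.0, 1.0, 0.5
volume, err = quad(lambda z: circle_overlap(np.sqrt(R**2 - z**2), Rprim, b), -R, R)
print(volume)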
I solved it using a simple MC integration, as suggested by eat, but my implementation was too slow. My requirements had increased. I therefore reformulated the problem mathematically, as suggested by woodchips.
Basically I formulated the limits of x as a function of z and y, and of y as a function of z. Then I, in essence, integrated f(x,y,z)=1 over the intersection, using those limits. I did this because of the speed increase, allowing me to plot volume vs b, and because it allows me to integrate more complex functions with relatively minor modification.
I include my code in case anyone is interested.
from scipy.integrate import quad
from math import sqrt
from math import pi

def x_max(y, r):
    return sqrt(r**2 - y**2)

def x_min(y, r):
    return max(-sqrt(r**2 - y**2), -sqrt(R**2 - y**2) + b)

def y_max(r):
    if (R < b and b - R < r) or (R > b and b - R > r):
        return sqrt(R**2 - (R**2 - r**2 + b**2)**2/(4.*b**2))
    elif r + R < b:
        return 0.
    else:  # r + b < R
        return r

def z_max():
    if R > b:
        return R
    else:
        return sqrt(2.*b*R - b**2)

def delta_x(y, r):
    return x_max(y, r) - x_min(y, r)

def int_xy(z):
    r = sqrt(R**2 - z**2)
    return quad(delta_x, 0., y_max(r), args=(r))

def int_xyz():
    return quad(lambda z: int_xy(z)[0], 0., z_max())

R = 1.
Rprim = 1.
b = 0.5
print 4*int_xyz()[0]
First off: You can calculate the volume of the intersection by hand. If you don't want to (or can't) do that, here's an alternative:
I'd generate a tetrahedral mesh for the domain and then add up the cell volumes. An example with pygalmesh and meshplex (both authored by myself):
import pygalmesh
import meshplex
import numpy
ball = pygalmesh.Ball([0, 0, 0], 1.0)
cyl = pygalmesh.Cylinder(-1, 1, 0.7, 0.1)
u = pygalmesh.Intersection([ball, cyl])
mesh = pygalmesh.generate_mesh(u, cell_size=0.05, edge_size=0.1)
points = mesh.points
cells = mesh.cells["tetra"]
# kick out unused vertices
uvertices, uidx = numpy.unique(cells, return_inverse=True)
cells = uidx.reshape(cells.shape)
points = points[uvertices]
mp = meshplex.MeshTetra(points, cells)
print(sum(mp.cell_volumes))
This produces a tetrahedral mesh of the intersection and prints 2.6567890958740463 as the volume. Decrease the cell or edge sizes for higher precision.
