I have two functions:
rho(u) = np.exp((-2.0 / 0.2) * (u**0.2 - 1.0))
psi(w*(x - u)) = (1 / (4.0 * math.sqrt(np.pi))) * np.exp(-((w * (x - u))**2) / 4.0) * (2.0 - (w * (x - u))**2)
I want to integrate rho(u) * psi(w*(x - u)) with respect to u, so that the result of the integral is a function of w and x.
Here's the Python code snippet I wrote to compute this integral numerically.
import math

import numpy as np
import matplotlib.pyplot as plt
from scipy import integrate

x = np.linspace(0, 10, 1000)
w = np.linspace(0, 10, 500)
u = np.linspace(0, 10, 1000)

rho = np.exp((-2.0 / 0.2) * (u**0.2 - 1.0))
value = np.zeros((500, 1000), dtype="float32")

# Integrate rho(u) * psi(w[i] * (x[j] - u)) over u for every (w[i], x[j]) pair
for i in range(len(w)):
    for j in range(len(x)):
        value[i, j] += integrate.simps(
            rho
            * (1 / (4.0 * math.sqrt(np.pi)))
            * np.exp(-((w[i] * (x[j] - u))**2) / 4.0)
            * (2.0 - (w[i] * (x[j] - u))**2),
            u,
        )

plt.imshow(value, origin='lower')
plt.colorbar()
As shown above, I used nested for loops for the integration, which we all know is inefficient. So I want to ask whether there is a way to do this without for loops.
Here is a possibility using scipy.integrate.quad_vec. It executes in 6 seconds on my machine, which I believe is acceptable. I did, however, use a step of only 0.1 for both x and w, but that resolution seems to be a good compromise on a single core.
from functools import partial
from time import perf_counter

import matplotlib.pyplot as plt
from numpy import empty, exp, linspace, pi, sqrt
from scipy.integrate import quad_vec

def func(u, x):
    # w is the full array here, so the integrand is vector-valued in w
    rho = exp(-10 * (u**0.2 - 1))
    var = w * (x - u)
    psi = exp(-var**2 / 4) * (2 - var**2) / 4 / sqrt(pi)
    return rho * psi

begin = perf_counter()
x = linspace(0, 10, 101)
w = linspace(0, 10, 101)
res = empty((x.size, w.size))

# one quad_vec call per x value, integrating over u for all w at once
for i, xVal in enumerate(x):
    res[i], err = quad_vec(partial(func, x=xVal), 0, 10)

print(f'{perf_counter() - begin} s')

plt.contourf(w, x, res)
plt.colorbar()
plt.xlabel('w')
plt.ylabel('x')
plt.show()
UPDATE
I had not realised it, but quad_vec can also work with a multi-dimensional array. The updated approach below makes it possible to double the resolution in both x and w while keeping the execution time at around 7 seconds. Additionally, there are no more visible for loops.
from time import perf_counter

import matplotlib.pyplot as plt
from numpy import exp, mgrid, pi, sqrt
from scipy.integrate import quad_vec

def func(u):
    # x and w are 2-D grids here, so quad_vec integrates a 2-D array-valued function
    rho = exp(-10 * (u**0.2 - 1))
    var = w * (x - u)
    psi = exp(-var**2 / 4) * (2 - var**2) / 4 / sqrt(pi)
    return rho * psi

begin = perf_counter()
x, w = mgrid[0:10:201j, 0:10:201j]
res, err = quad_vec(func, 0, 10)
print(f'{perf_counter() - begin} s')

plt.contourf(w, x, res)
plt.colorbar()
plt.xlabel('w')
plt.ylabel('x')
plt.show()
Addressing the comment
Just add the following lines before plt.show() to have both axes scale logarithmically.
plt.gca().set_xlim(0.05, 10)
plt.gca().set_ylim(0.05, 10)
plt.gca().set_xscale('log')
plt.gca().set_yscale('log')
Related
How do I prevent the OverflowError? It only happens once I set my x higher than 400e4, and I need it to go to at least 1000e5.
I need the derivative of the function f, and after that I would like to plot it for different gamma. It works fine when I set x low enough. How do I get rid of the overflow, though?
The results I am expecting are in the range of 1e-8 to 1e-9, and I only need numbers with about 20 digits.
import sympy as smp
import matplotlib.pyplot as plt
import numpy as np

x, k_1, k_2, k_3, T_0, T_S, T, f_p, gamma = smp.symbols('x k_1 k_2 k_3 T_0 T_S T f_p gamma', real=True)

aT = 10**((8.86 * (T_0 - T_S) / (101.6 + (T_0 - T_S))) - (8.86 * (T - T_S - f_p * x) / (101.6 + (T - T_S))))
f = smp.log((k_1 * aT) / ((1 + gamma * k_2 * aT) ** k_3))
dfdx = smp.diff(f, x)
dfdx_f = smp.lambdify((gamma, x, k_1, k_2, k_3, T_0, T_S, T, f_p), dfdx)

gamma = np.linspace(0, 10000, 1000)
y = dfdx_f(gamma, x=400e5, k_1=1000, k_2=0.01, k_3=0.94, T_0=243.9243, T_S=178.73, T=250.0, f_p=0.001001412)
plt.plot(gamma, y)
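One way around the overflow, as a minimal sketch (untested against this exact expression): with x = 400e5 the exponent inside aT reaches roughly 2000, far beyond the ~1e308 ceiling of float64, so the float evaluation is bound to overflow. Lambdifying with mpmath as the backend keeps every intermediate in arbitrary precision. The mp.mpf conversions below matter, because arithmetic on plain Python floats would still overflow inside 10**(...); gamma_sym just recreates the symbol, since the name gamma was rebound to the numpy array above.

import mpmath as mp

mp.mp.dps = 30  # ~30 significant digits, comfortably above the ~20 requested
gamma_sym = smp.symbols('gamma', real=True)  # same symbol as the one inside dfdx
# hypothetical mpmath-backed variant of dfdx_f from above
dfdx_mp = smp.lambdify((gamma_sym, x, k_1, k_2, k_3, T_0, T_S, T, f_p), dfdx, modules='mpmath')

gammas = np.linspace(0, 10000, 1000)
# convert every input to mpf so all intermediates stay arbitrary-precision
y_mp = [dfdx_mp(mp.mpf(g), mp.mpf('400e5'), mp.mpf(1000), mp.mpf('0.01'), mp.mpf('0.94'),
                mp.mpf('243.9243'), mp.mpf('178.73'), mp.mpf(250), mp.mpf('0.001001412'))
        for g in gammas]
plt.plot(gammas, [float(v) for v in y_mp])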
I'm trying to integrate an equation using sympy, but the evaluation keeps erroring out with:
TypeError: cannot add <class 'sympy.matrices.immutable.ImmutableDenseMatrix'> and <class 'sympy.core.numbers.Zero'>
However, when I integrate the same equation in Mathematica, I get the result I'm looking for. I'm hoping someone can explain what is happening under the hood with sympy, and how its integration differs from Mathematica's.
Because there are a lot of equations involved, I will only post the Python representation of the equations that make up the variables used in the integration. To be certain that the integration is the issue, I have independently checked each equation involved and made sure the results match between Python and Mathematica. I can confirm that only the integration fails.
Integration variable setup in Python:
import numpy as np
import sympy as sym
from sympy import I, Matrix, integrate
from sympy.integrals import Integral
from sympy.functions import exp
from sympy.physics.vector import cross, dot
c_speed = 299792458
lambda_u = 0.01
k_u = (2 * np.pi)/lambda_u
K_0 = 0.2
gamma_0 = 2.0
beta_0 = sym.sqrt(1 - 1 / gamma_0**2)
t = sym.symbols('t')
t_start = (-2 * lambda_u) / c_speed
t_end = (3 * lambda_u) / c_speed
beta_x_of_t = sym.Piecewise((0.0, sym.Or(t < t_start, t > t_end)),
                            ((-1 * K_0) / gamma_0 * sym.sin(k_u * c_speed * t), sym.And(t_start <= t, t <= t_end)))
beta_z_of_t = sym.Piecewise((beta_0, sym.Or(t < t_start, t > t_end)),
                            (beta_0 * (1 - K_0**2 / (4 * beta_0 * gamma_0**2)) + K_0**2 / (4 * beta_0 * gamma_0**2) * sym.sin(2 * k_u * c_speed * t), sym.And(t_start <= t, t <= t_end)))
beta_xp_of_t = sym.diff(beta_x_of_t, t)
beta_zp_of_t = sym.diff(beta_z_of_t, t)
Python Integration:
n = Matrix([(0, 0, 1)])
beta = Matrix([(beta_x_of_t, 0, beta_z_of_t)])
betap = Matrix([(beta_xp_of_t, 0, beta_zp_of_t)])
def rad(n, beta, betap):
    return integrate(n.cross((n - beta).cross(betap)), (t, t_start, t_end))

rad(n, beta, betap)
# Output is the error above
Mathematica integration:
rad[n_, beta_, betap_] :=
NIntegrate[
Cross[n, Cross[n - beta, betap]]
, {t, tstart, tend}, AccuracyGoal -> 3]
rad[{0, 0, 1}, {betax[t], 0, betaz[t]}, {betaxp[t], 0, betazp[t]}]
# Output is {0.00150421, 0., 0.}
I did see similar questions, such as this one and this one, but I'm not entirely sure they are relevant here (the second link seems to be a closer match, but not quite a fit).
Because you're working with imprecise floats (you even use np.pi instead of sym.pi) while sympy tries to find exact symbolic solutions, sympy's expressions get rather wild, with tiny constants like 6.67e-11 mixed with much larger values.
Here is an attempt to use more symbolic expressions, and only integrate the x-coordinate of the function.
import sympy as sym
from sympy import I, Matrix, integrate, S
from sympy.integrals import Integral
from sympy.functions import exp
from sympy.physics.vector import cross, dot
c_speed = 299792458
lambda_u = S(1) / 100
k_u = (2 * sym.pi) / lambda_u
K_0 = S(2) / 100
gamma_0 = 2
beta_0 = sym.sqrt(1 - S(1) / gamma_0 ** 2)
t = sym.symbols('t')
t_start = (-2 * lambda_u) / c_speed
t_end = (3 * lambda_u) / c_speed
beta_x_of_t = sym.Piecewise((0, sym.Or(t < t_start, t > t_end)),
                            ((-1 * K_0) / gamma_0 * sym.sin(k_u * c_speed * t), sym.And(t_start <= t, t <= t_end)))
beta_z_of_t = sym.Piecewise((beta_0, sym.Or(t < t_start, t > t_end)),
                            (beta_0 * (1 - K_0**2 / (4 * beta_0 * gamma_0**2))
                             + K_0**2 / (4 * beta_0 * gamma_0**2) * sym.sin(2 * k_u * c_speed * t),
                             sym.And(t_start <= t, t <= t_end)))
beta_xp_of_t = sym.diff(beta_x_of_t, t)
beta_zp_of_t = sym.diff(beta_z_of_t, t)
n = Matrix([(0, 0, 1)])
beta = Matrix([(beta_x_of_t, 0, beta_z_of_t)])
betap = Matrix([(beta_xp_of_t, 0, beta_zp_of_t)])
func = n.cross((n - beta).cross(betap))[0]
print(integrate(func, (t, t_start, t_end)))
This doesn't give an error, and it outputs just zero.
lambdify can be used to convert the function to numpy, plot it, and integrate it numerically.
func_np = sym.lambdify(t, func)
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import quad
t_start_np = float(t_start.evalf())
t_end_np = float(t_end.evalf())
ts = np.linspace(t_start_np, t_end_np, 100000)
plt.plot(ts, func_np(ts))
print("approximate integral (np.trapz):", np.trapz(ts, func_np(ts)))
print("approximate integral (scipy's quad):", quad(func_np, t_start_np, t_end_np))
This outputs:
approximate integral (scipy's quad): (-2.3852447794681098e-18, 5.516333374450447e-10)
The np.trapz estimate comes out near zero as well. Note the huge values on the y-axis and the small values on the x-axis of the plot. These values, as well as Mathematica's, could well be rounding errors.
I have a function of the following form that I want to integrate:

def f(z, t, q):
    return t * 0.5 * (erf((t - z) / 3) - 1) * j0(q * t) * np.exp(-0.5 * ((z - 40) / 2) ** 2)
I have taken this function as an example to understand and show the differences between integration methods.
Based on earlier knowledge available through various answers on this platform, as well as in the documentation, I have tried different methods to increase the integration speed. I will explain and compare them briefly here:
1. Using just Python and scipy's dblquad (takes 202.76 s)
import time

import numpy as np
from scipy import integrate
from scipy.special import erf
from scipy.special import j0

q = np.linspace(0.03, 1.0, 1000)

start = time.time()

def f(q, z, t):
    return t * 0.5 * (erf((t - z) / 3) - 1) * j0(q * t) * np.exp(-0.5 * ((z - 40) / 2) ** 2)

y = np.empty([len(q)])
for n in range(len(q)):
    y[n] = integrate.dblquad(lambda t, z: f(q[n], z, t), 0, 50, lambda z: 10, lambda z: 60)[0]

end = time.time()
print(end - start)
# 202.76 s
As expected this is not very fast.
2. Using a low-level callable and compiling the functions with Numba (takes 6.46 s)
import time

import numpy as np
import numba as nb
from numba import cfunc
from numba.types import intc, CPointer, float64
from scipy import integrate
from scipy import LowLevelCallable
from scipy.special import erf
from scipy.special import j0

q = np.linspace(0.03, 1.0, 1000)

start = time.time()

def jit_integrand_function(integrand_function):
    jitted_function = nb.njit(integrand_function)
    # error_model="numpy" -> don't check for division by zero
    @cfunc(float64(intc, CPointer(float64)), error_model="numpy", fastmath=True)
    def wrapped(n, xx):
        ar = nb.carray(xx, n)
        return jitted_function(ar[0], ar[1], ar[2])
    return LowLevelCallable(wrapped.ctypes)

@jit_integrand_function
def f(t, z, q):
    return t * 0.5 * (erf((t - z) / 3) - 1) * j0(q * t) * (1 / (np.sqrt(2 * np.pi) * 2)) * np.exp(
        -0.5 * ((z - 40) / 2) ** 2)

def lower_inner(z):
    return 10.

def upper_inner(z):
    return 60.

y = np.empty(len(q))
for n in range(len(q)):
    y[n] = integrate.dblquad(f, 0, 50, lower_inner, upper_inner, args=(q[n],))[0]

end = time.time()
print(end - start)
# 6.46 s
This is faster than the earlier approach.
3. Using scipy.integrate.quad_vec (takes 2.87 s)
import time

import numpy as np
from scipy import integrate
from scipy.special import erf
from scipy.special import j0

q = np.linspace(0.03, 1.0, 1000)

start = time.time()

def f(z, t, q):
    return t * 0.5 * (erf((t - z) / 3) - 1) * j0(q * t) * np.exp(-0.5 * ((z - 40) / 2) ** 2)

ans2 = integrate.quad_vec(lambda t: integrate.quad_vec(lambda z: f(z, t, q), 10, 60)[0], 0, 50)[0]

end = time.time()
print(end - start)
# 2.87 s
The method involving scipy.quad_vec is the fastest of all. I was wondering how to make it even faster. Unfortunately, scipy.quad_vec doesn't accept low-level callable functions; is there any way to vectorize scipy.quad or scipy.dblquad as efficiently as scipy.quad_vec? Or is there any other approach I can use? Any kind of help will be very much appreciated.
Here is a roughly 10x faster version:
import time

import numpy as np
from scipy import integrate
from scipy.special import erf
from scipy.special import j0

q = np.linspace(0.03, 1.0, 1000)

start = time.time()

def fa(z, t, q):
    # only the z-dependent factors remain in the inner integrand
    return (erf((t - z) / 3) - 1) * np.exp(-0.5 * ((z - 40) / 2) ** 2)

ans3 = 0.5 * integrate.quad_vec(lambda t: t * j0(q * t) *
                                integrate.quad_vec(lambda z: fa(z, t, q), 10, 60)[0], 0, 50)[0]

end = time.time()
print(end - start)
What makes the difference here is taking j0(q * t) (together with the factors t and 0.5) out of the inner integral, since none of them depend on z.
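As a quick sanity check (assuming ans2 from the plain quad_vec version above is still in scope), the refactoring only moves z-independent factors out of the inner integral, so both results should agree to quadrature tolerance:

print(np.allclose(ans2, ans3))  # True if both formulations agree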
I am doing a simple integration; the only thing is that I want to keep n as a variable. How can I do this while still integrating over t?
import numpy as np
from scipy.integrate import quad

y = lambda t: 3 * t
T = 4  # period
n = 1
w = 2 * np.pi * n / T

# for odd functions
def integrand(t):
    # return y * (2 / T) * np.sin(w * t)
    return y(t) * np.sin(n * w * t)

Bn = (2 / T) * quad(integrand, -T / 2, T / 2)[0]
print(Bn)
Using quad, you cannot. Perhaps you're looking for symbolic integration like you would do with pen and paper; sympy can help you here:
import sympy
x = sympy.Symbol("x")
t = sympy.Symbol("t")
T = sympy.Symbol("T")
n = sympy.Symbol("n", positive=True)
w = 2 * sympy.pi * n / T
y = 3 * t
out = 2 / T * sympy.integrate(y * sympy.sin(n * w * t), (t, -T/2, T/2))
print(out)
2*(-3*T**2*cos(pi*n**2)/(2*pi*n**2) + 3*T**2*sin(pi*n**2)/(2*pi**2*n**4))/T
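To get a number out of this symbolic result, substitute concrete values; for example, with the question's T = 4 and n = 1:

print(out.subs({T: 4, n: 1}))  # 12/pi, roughly 3.8197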
If you want to evaluate the integral for many n, quadpy can help:
import numpy as np
from quadpy import quad

y = lambda t: 3 * t
T = 4
n = np.linspace(0.5, 4.5, 20)
w = 2 * np.pi * n / T

# for odd functions
def integrand(t):
    return y(t) * np.sin(np.multiply.outer(n * w, t))

Bn = (2 / T) * quad(integrand, -T / 2, T / 2)[0]
print(Bn)
[ 2.95202424 4.88513496 4.77595051 1.32599514 -1.93954768 -0.23784853
1.11558278 -0.95397681 0.63709387 -0.4752673 0.45818227 -0.4740128
0.35943759 -0.01510463 -0.30348511 0.09861289 0.25428048 0.10030723
-0.06099483 -0.13128359]
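If you'd rather stay with scipy, scipy.integrate.quad_vec handles array-valued integrands too; here is a sketch under the same setup, which should reproduce the numbers above (quad_vec passes a scalar t, so plain broadcasting over n replaces the outer product):

import numpy as np
from scipy.integrate import quad_vec

y = lambda t: 3 * t
T = 4
n = np.linspace(0.5, 4.5, 20)
w = 2 * np.pi * n / T

# the integrand returns one value per n for each scalar t
Bn = (2 / T) * quad_vec(lambda t: y(t) * np.sin(n * w * t), -T / 2, T / 2)[0]
print(Bn)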
I already opened a question on this topic, but I wasn't sure if I should post it there, so I opened a new question here.
I am having trouble again when fitting two or more peaks. The first problem occurs with a calculated example function.
import numpy as np
import matplotlib.pyplot as plt

xg = np.random.uniform(0, 1000, 500)
mu1 = 200
sigma1 = 20
I1 = -2
mu2 = 800
sigma2 = 20
I2 = -1
yg3 = 0.0001 * xg
yg1 = (I1 / (sigma1 * np.sqrt(2 * np.pi))) * np.exp(-(xg - mu1)**2 / (2 * sigma1**2))
yg2 = (I2 / (sigma2 * np.sqrt(2 * np.pi))) * np.exp(-(xg - mu2)**2 / (2 * sigma2**2))
yg = yg1 + yg2 + yg3

plt.figure(0, figsize=(8, 8))
plt.plot(xg, yg, 'r.')
I tried two different approaches that I found in the documentation (modified for my data), which are shown below, but both give me wrong fitting parameters and a messy chaos of graphs (I guess one line per fitting step).
1st attempt:
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import PseudoVoigtModel, LinearModel

gauss1 = PseudoVoigtModel(prefix='g1_')
pars = gauss1.make_params()  # create the Parameters object before updating it
pars['g1_center'].set(200)
pars['g1_sigma'].set(15, min=3)
pars['g1_amplitude'].set(-0.5)
pars['g1_fwhm'].set(20, vary=True)
# pars['g1_fraction'].set(0, vary=True)

gauss2 = PseudoVoigtModel(prefix='g2_')
pars.update(gauss2.make_params())
pars['g2_center'].set(800)
pars['g2_sigma'].set(15)
pars['g2_amplitude'].set(-0.4)
pars['g2_fwhm'].set(20, vary=True)
# pars['g2_fraction'].set(0, vary=True)

mod = gauss1 + gauss2 + LinearModel()
pars.add('intercept', value=0, vary=True)
pars.add('slope', value=0.0001, vary=True)

init = mod.eval(pars, x=xg)
out = mod.fit(yg, pars, x=xg)

print(out.fit_report(min_correl=0.5))
plt.figure(5, figsize=(8, 8))
out.plot_fit()
When I include the 'fraction' parameter, I often get
NameError: name 'pv1_fraction' is not defined in expr='<_ast.Module object at 0x00000000165E03C8>'
although it should be defined. I get this error with real data using this approach, too.
2nd attempt:
import numpy as np
import matplotlib.pyplot as plt
import lmfit

def gauss(x, sigma, mu, A):
    return A * np.exp(-(x - mu)**2 / (2 * sigma**2))

def linear(x, m, n):
    return m * x + n

peak1 = lmfit.model.Model(gauss, prefix='p1_')
peak2 = lmfit.model.Model(gauss, prefix='p2_')
lin = lmfit.model.Model(linear, prefix='l_')
model = peak1 + lin + peak2

params = model.make_params()
params['p1_mu'] = lmfit.Parameter(value=200, min=100, max=250)
params['p2_mu'] = lmfit.Parameter(value=800, min=100, max=1000)
params['p1_sigma'] = lmfit.Parameter(value=15, min=0.01)
params['p2_sigma'] = lmfit.Parameter(value=20, min=0.01)
params['p1_A'] = lmfit.Parameter(value=-2, min=-3)
params['p2_A'] = lmfit.Parameter(value=-2, min=-3)
params['l_m'] = lmfit.Parameter(value=0)
params['l_n'] = lmfit.Parameter(value=0)

out = model.fit(yg, params, x=xg)
print(out.fit_report())
plt.figure(8, figsize=(8, 8))
out.plot_fit()
So the result looks like this in both cases. It seems to plot every fitting attempt, but it never solves the problem correctly, even though the best-fit parameters it reports are within the ranges I gave it.
Does anyone know this type of error, or have a solution for it? And does anyone know how to avoid the NameError when calling a model function from lmfit with these approaches?
I have a somewhat tolerable solution for you. Since I don't know how variable your data is, I cannot say that it will work in a general sense, but it should get you started. If your data runs along 0-1000 and has two peaks or dips along a line, as you showed, then it should work.
I used scipy's curve_fit and put all of the components of the function together into one function. Starting locations can be passed to curve_fit via p0 (you can probably do this with the library you're using, but I'm not familiar with it). There is a loop within a loop where I vary the mu parameters to find the ones with the lowest squared error. If you need to fit your data many times or in some real-time scenario, then this is not for you, but if you just need to fit some data, launch this code and grab a coffee.
import time

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm as cm
from scipy.optimize import curve_fit

def my_function_big(x, m, n,          # linear vars
                    sigma1, mu1, I1,  # gaussian 1
                    sigma2, mu2, I2): # gaussian 2
    y = (m * x + n
         + (I1 / (sigma1 * np.sqrt(2 * np.pi))) * np.exp(-(x - mu1)**2 / (2 * sigma1**2))
         + (I2 / (sigma2 * np.sqrt(2 * np.pi))) * np.exp(-(x - mu2)**2 / (2 * sigma2**2)))
    return y

# make some data
xs = np.random.uniform(0, 1000, 500)
mu1 = 200
sigma1 = 20
I1 = -2
mu2 = 800
sigma2 = 20
I2 = -1
yg3 = 0.0001 * xs
yg1 = (I1 / (sigma1 * np.sqrt(2 * np.pi))) * np.exp(-(xs - mu1)**2 / (2 * sigma1**2))
yg2 = (I2 / (sigma2 * np.sqrt(2 * np.pi))) * np.exp(-(xs - mu2)**2 / (2 * sigma2**2))
ys = yg1 + yg2 + yg3
xs = np.array(xs)
ys = np.array(ys)
# done making data

# start a double loop... very expensive, but this is quick and dirty;
# the regular optimizer has trouble finding the minima, so having
# near-proper mu values helps it zero in much better
start = time.time()
serr = []
_x = []
_y = []
for x in np.linspace(0, 1000, 61):
    for y in np.linspace(0, 1000, 61):
        cfiti = curve_fit(my_function_big, xs, ys, p0=[0, 0, 1, x, 1, 1, y, 1], maxfev=20000000)
        serr.append(np.sum((my_function_big(xs, *cfiti[0]) - ys) ** 2))
        _x.append(x)
        _y.append(y)
serr = np.array(serr)
_x = np.array(_x)
_y = np.array(_y)
print('done loop in loop fitting')
print('time: %0.1f' % (time.time() - start))

gridsize = 20
plt.subplot(111)
plt.hexbin(_x, _y, C=serr, gridsize=gridsize, cmap=cm.jet, bins=None)
plt.axis([_x.min(), _x.max(), _y.min(), _y.max()])
cb = plt.colorbar()
cb.set_label('SE')
plt.show()

ix = np.argmin(serr.ravel())
mustart1 = _x.ravel()[ix]
mustart2 = _y.ravel()[ix]
print(mustart1)
print(mustart2)

cfit = curve_fit(my_function_big, xs, ys, p0=[0, 0, 1, mustart1, 1, 1, mustart2, 1], maxfev=2000000000)
xp = np.linspace(0, 1000, 1001)

plt.figure()
plt.scatter(xs, ys)  # plot the synthetic data
plt.plot(xp, my_function_big(xp, *cfit[0]), '-', label='fit function')  # evaluate the fit along 0-1000
plt.legend(loc=3, numpoints=1, prop={'size': 12})
plt.show()
plt.close()
Good luck!
In your first attempt:
pars['g1_fraction'].set(0, vary=True)
The fraction must be a value between 0 and 1, but I believe it cannot be exactly zero. Try something like 0.000001 instead, and it will work.
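For example, the commented-out line in the first attempt would become something like:

pars['g1_fraction'].set(0.000001, vary=True)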