Curve fitting with SciPy's least_squares() - python

I'm doing least squares curve fitting with Python and getting decent results, but would like it to be a bit more robust.
I have data from a first-order LTI system, more specifically the speed of a motor as read by a tachometer. I'm trying to fit the step response of the motor so I can deduce its transfer function.
The speed (v(t)) has the following form:
v(t) = K * (1 - exp(-t/T))
I have some outliers in my data, though, and would like to mitigate them. They mostly occur once the speed becomes constant. Say the speed is 10000 units; I sometimes get outliers at 10000 +/- 400. How should I set the f_scale parameter, given that I want my data points to stay within +/- 400 of the "actual" (mean) speed? Should I set f_scale to 400 or 800? I'm not sure what exactly I should set there.
Thanks
EDIT: Some data.

I have constructed a minimal example for a curve similar to yours. If you had posted actual data instead of a picture, this would have gone a bit faster. The two key things to understand about robust fitting with least_squares are that you have to use a value other than 'linear' for the loss parameter, and that f_scale is used as a scaling parameter for the loss function.
Basically, from the docs, least_squares tries to
minimize F(x) = 0.5 * sum(rho(f_i(x)**2), i = 0, ..., m - 1)
and setting the loss parameter changes rho in the above formula. For loss='linear', rho is just the identity function. For loss='soft_l1', rho(z) = 2 * ((1 + z)**0.5 - 1). f_scale scales the loss function such that rho_(f**2) = C**2 * rho(f**2 / C**2), where C is f_scale. So it doesn't have quite the meaning you are asking about above; it's more like a way of penalising larger errors less.
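To make the scaling concrete, here is a minimal sketch of the soft_l1 loss written out by hand with NumPy. The helper names soft_l1 and scaled_loss are mine, not part of SciPy; the formulas are the ones quoted above. For residuals much smaller than f_scale the loss is close to the squared residual, and for residuals much larger than f_scale it grows only about linearly.

```python
import numpy as np

def soft_l1(z):
    """rho(z) for loss='soft_l1', where z is the squared (scaled) residual."""
    return 2.0 * (np.sqrt(1.0 + z) - 1.0)

def scaled_loss(residual, f_scale):
    """C**2 * rho(f**2 / C**2) with C = f_scale (hypothetical helper)."""
    z = (np.asarray(residual, dtype=float) / f_scale) ** 2
    return f_scale ** 2 * soft_l1(z)

residuals = np.array([1.0, 400.0, 4000.0])
for C in (40.0, 400.0):
    # Compare against the plain squared residual used by loss='linear'
    print("f_scale =", C)
    print("  soft_l1:", scaled_loss(residuals, C))
    print("  squared:", residuals ** 2)
```

Read this way, f_scale acts as a soft boundary between residuals treated as inliers and residuals treated as outliers, rather than as a hard limit on where the data may lie.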
In this particular case it doesn't appear to make much difference though.
import numpy
import matplotlib.pyplot as plt
import scipy.optimize

tmax = 6000
N = 100
K = 6000
T = 200
smootht = numpy.linspace(0, tmax, 1000)
tm = numpy.linspace(0, tmax, N)

def f(t, K, T):
    return K * (1 - numpy.exp(-t/T))

v = f(smootht, K, T)
vm = f(tm, K, T) + numpy.random.randn(N)*400

def error(pars):
    K, T = pars
    vp = f(tm, K, T)
    return vm - vp

f_scales = [0.01, 1, 100]
plt.scatter(tm, vm)
for f_scale in f_scales:
    r = scipy.optimize.least_squares(error, [10, 10], loss='soft_l1', f_scale=f_scale)
    vp = f(smootht, *r.x)
    plt.plot(smootht, vp, label=f_scale)
plt.legend()
The resulting plot looks like this:
My suggestion is to start by just experimenting with the different loss functions before playing with f_scale.
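For what it's worth, here is a small self-contained sketch closer to the situation in the question: a plateau around 10000 with occasional readings that are off by 400. All numbers (noise level of 20, 10% outliers, f_scale=60, the starting guess and the bounds) are assumptions I picked for illustration, not values taken from the asker's data. The idea is to choose f_scale near the size of a "normal" residual, so that the 400-unit excursions fall in the part of the loss that grows only linearly.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)

# Assumed "true" system and sampling grid (illustrative values only)
K_true, T_true = 10000.0, 300.0
t = np.linspace(0.0, 3000.0, 200)

def model(t, K, T):
    return K * (1.0 - np.exp(-t / T))

# Ordinary measurement noise plus a handful of one-sided +400 outliers
v = model(t, K_true, T_true) + rng.normal(0.0, 20.0, t.size)
outlier_idx = rng.choice(t.size, size=20, replace=False)
v[outlier_idx] += 400.0

def residuals(pars):
    K, T = pars
    return v - model(t, K, T)

x0 = [8000.0, 500.0]
bounds = ([0.0, 1.0], [np.inf, np.inf])  # keep T positive during the search

fit_linear = least_squares(residuals, x0, bounds=bounds)
fit_robust = least_squares(residuals, x0, bounds=bounds,
                           loss='soft_l1', f_scale=60.0)

print("plain least squares K, T:", fit_linear.x)
print("soft_l1, f_scale=60 K, T:", fit_robust.x)
```

Comparing the two printed parameter pairs against K_true and T_true gives a quick feel for how much the outliers pull the plain fit on your own data.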

Related

Unable to reproduce simple figure from textbook (possible numerical instability)

I am trying to reproduce figure 5.6 (attached) from the textbook "Modeling Infectious Diseases in Humans and Animals (official code repo)" (Keeling 2008) to verify whether my implementation of a seasonally forced SEIR (epidemiological model) is correct. An official program from the textbook that implements seasonal forcing indicates that large values of Beta 1 can lead to numerical errors, but if the figure has Beta 1 values that did not lead to numerical errors, then in principle this should not be the cause of the problem. My implementation correctly produces the graphs in row 0, column 0 and row 1, column 0 of figure 5.6 but there is no output in my figure for the remaining cells due to the numerical solution for the fraction of infected (see code at bottom) producing 0 (and the ln(0) --> -inf).
I do receive the following warnings:
ODEintWarning: Excess work done on this call
C:\Users\jared\AppData\Local\Temp\ipykernel_24972\2802449019.py:68: RuntimeWarning: divide by zero encountered in log
  infected = np.log(odeint(
C:\Users\jared\AppData\Local\Temp\ipykernel_24972\2802449019.py:68: RuntimeWarning: invalid value encountered in log
  infected = np.log(odeint(
Here is the textbook figure:
My figure within the same time range (990 - 1000 years). Natural log taken of fraction infected:
My figure but with a shorter time range (0 - 100 years). Natural log taken of fraction infected. The numerical solution for the infected population seems to fail between the 5 and 20 year mark for most of the seasonal parameters (Beta 1 and R0):
My figure with a shorter time range as above, but with no natural log taken of fraction infected.
Code to reproduce my figure:
# Code to minimally reproduce figure
import itertools
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
def seir(y, t, mu, sigma, gamma, omega, beta_zero, beta_one):
    """System of diff eqs for epidemiological model.
    SEIR stands for susceptibles, exposed, infectious, and
    recovered populations.
    References:
    [SEIR Python Program from Textbook](http://homepages.warwick.ac.uk/~masfz/ModelingInfectiousDiseases/Chapter2/Program_2.6/Program_2_6.py)
    [Seasonally Forced SIR Program from Textbook](http://homepages.warwick.ac.uk/~masfz/ModelingInfectiousDiseases/Chapter5/Program_5.1/Program_5_1.py)
    """
    s, e, i = y
    beta = beta_zero * (1 + beta_one * np.cos(omega * t))
    sdot = mu - (beta*i + mu)*s
    edot = beta*s*i - (mu + sigma)*e
    idot = sigma*e - (mu + gamma)*i
    return sdot, edot, idot

def solve_beta_zero(basic_reproductive_rate, gamma):
    """Defined in the last paragraph of pg. 159 of textbook Keeling 2008."""
    return gamma * basic_reproductive_rate
# Model parameters (see Figure 5.6 description)
mu = 0.02 / 365
sigma = 1/8
gamma = 1/5
omega = 2 * np.pi / 365 # angular frequency of the forcing: one cycle per year, in rad/day
# Seasonal forcing parameters
r0s = [17, 10, 3]
b1s = [0.02, 0.1, 0.225]
# Permutes params to get tuples matching row i column j params in figure
# e.g., [(0.02, 17), (0.02, 10) ... ]
seasonal_params = [p for p in itertools.product(*(b1s, r0s))]
# Initial Conditions: I assume these are proportions of some total population
s0 = 6e-2
e0 = i0 = 1e-3
initial_conditions = [s0, e0, i0]
# Timesteps
nyears = 1000
days_per_year = 365
ndays = nyears * days_per_year
timesteps = np.arange(1, ndays+1, 1)
# Range to slice data to reproduce my figures
# NOTE: Change the min slice or max slice for different ranges
min_slice = 990 # or 0
max_slice = 1000 # or 100
sliced = slice(min_slice * days_per_year, max_slice * days_per_year)
x_ticks = timesteps[sliced]/days_per_year
# Define figure
nrows = 3
ncols = 3
fig, ax = plt.subplots(nrows, ncols, sharex=True, figsize=(15, 8))
# Iterate through parameters and recreate figure
for i in range(nrows):
    for j in range(ncols):
        # Get seasonal parameters for this subplot
        beta_one = seasonal_params[i * nrows + j][0]
        basic_reproductive_rate = seasonal_params[i * nrows + j][1]
        # Compute beta zero given the reproductive rate
        beta_zero = solve_beta_zero(
            basic_reproductive_rate=basic_reproductive_rate,
            gamma=gamma)
        # Numerically solve the model, extract only the infected solutions,
        # slice those solutions to the desired time range, and then natural
        # log scale them
        solutions = odeint(
            seir,
            initial_conditions,
            timesteps,
            args=(mu, sigma, gamma, omega, beta_zero, beta_one))
        infected_solutions = solutions[:, 2]
        log_infected = np.log(infected_solutions[sliced])
        # NOTE: To inspect results without natural log, uncomment the
        # below line
        # log_infected = infected_solutions[sliced]
        # DEBUG: For shape and parameter printing
        # print(
        #     infected_solutions.shape, 'R0=', basic_reproductive_rate, 'B1=', beta_one)
        # Plot results
        ax[i,j].plot(x_ticks, log_infected)
        # label subplot
        ax[i,j].title.set_text(rf'$(R_0=${basic_reproductive_rate}, $\beta_{1}=${beta_one})')
fig.supylabel('NaturalLog(Fraction Infected)')
fig.supxlabel('Time (years)')
Disclaimer:
My short term solution is to simply change the list of seasonal parameters to values that will produce data for that range, and this adequately illustrates the effects of seasonal forcing. The point is to reproduce the figure, though, and if the author was able to do it, others should be able to as well.
Your first (and possibly main) problem is one of scale. This diagnosis is also consistent with the observations in your later experiments. The system is such that if it starts with positive values, it should stay positive. Negative values can only be reached if the step errors of the numerical method are too large.
As you can see in the original graphs, the range of values goes from exp(-7) ~= 9e-4 to exp(-12) ~= 6e-6. The value of the absolute tolerance should force at least 3 digits to be exact, so atol = 1e-10 or smaller. The relative tolerance should be adapted similarly. Viewing all components together shows that the first component has values around exp(-2.5) ~= 5e-2, so per-component tolerances should provide better results. The corresponding call is
solutions = odeint(
    seir,
    initial_conditions,
    timesteps,
    args=(mu, sigma, gamma, omega, beta_zero, beta_one),
    atol=[1e-9, 1e-13, 1e-13], rtol=1e-11)
With these parameters I get the plots below
The first row and first column are as in the cited graphic, the others look different.
As a test and a general method to integrate in the range of small positive solutions, reformulate for the integration of the logarithms of the components. This can be done with a simple wrapper
def seir_log(log_y, t, *args):
    y = np.exp(log_y)
    dy = np.array(seir(y, t, *args))
    return dy/y # = d(log(y))
Now the expected values have scale 1 to 10, so that the tolerances are no longer so critical, default tolerances should be sufficient, but it is better to work with documented tolerances.
log_solution = odeint(
    seir_log,
    np.log(initial_conditions),
    timesteps,
    args=(mu, sigma, gamma, omega, beta_zero, beta_one),
    atol=1e-8, rtol=1e-9)
log_infected = log_solution[sliced, 2]
The bottom-left plot is still sensitive to atol; with 1e-7 one gets a wavier picture. Bounding the step size with hmax=5 also stabilizes it. With the code as above the plots are
The center plot is still different than the reference. It might be that there are different stable cycles.
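For readers who want to try the log-variable idea in isolation, the following is a self-contained sketch for a single parameter set (R0 = 17, Beta 1 = 0.02) over 20 years instead of 1000. The shortened time span is my choice to keep the run quick; the equations, initial conditions and tolerances are the ones used above. Since d(log y)/dt = (dy/dt)/y, the solver works on values of ordinary magnitude and the exponentiated solution cannot become negative.

```python
import numpy as np
from scipy.integrate import odeint

def seir(y, t, mu, sigma, gamma, omega, beta_zero, beta_one):
    """Seasonally forced SEIR right-hand side (s, e, i as fractions)."""
    s, e, i = y
    beta = beta_zero * (1 + beta_one * np.cos(omega * t))
    sdot = mu - (beta * i + mu) * s
    edot = beta * s * i - (mu + sigma) * e
    idot = sigma * e - (mu + gamma) * i
    return sdot, edot, idot

def seir_log(log_y, t, *args):
    """Same system written for log(s), log(e), log(i)."""
    y = np.exp(log_y)
    return np.array(seir(y, t, *args)) / y

mu, sigma, gamma = 0.02 / 365, 1 / 8, 1 / 5
omega = 2 * np.pi / 365
beta_zero, beta_one = gamma * 17, 0.02      # R0 = 17
y0 = np.array([6e-2, 1e-3, 1e-3])
t = np.arange(1, 20 * 365 + 1)              # 20 years, daily output (shortened)

log_solution = odeint(seir_log, np.log(y0), t,
                      args=(mu, sigma, gamma, omega, beta_zero, beta_one),
                      atol=1e-8, rtol=1e-9)
log_infected = log_solution[:, 2]
print("min/max of log(fraction infected):", log_infected.min(), log_infected.max())
```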

Integrate a 2D vectorfield-array (reversing np.gradient)

I have the following problem:
I want to integrate a 2D array, i.e. basically reverse a gradient operator.
Assume I have a very simple array as follows:
shape = (60, 60)
sampling = 1
k_mesh = np.meshgrid(np.fft.fftfreq(shape[0], sampling), np.fft.fftfreq(shape[1], sampling))
Then I construct my vector field as a complex-valued array (x-vector = real part, y-vector = imaginary part):
k = k_mesh[0] + 1j * k_mesh[1]
So the real part for example looks like this
Now I take the gradient:
k_grad = np.gradient(k, sampling)
I then use Fourier transforms to reverse it, using the following function:
def freq_array(shape, sampling):
    f_freq_1d_y = np.fft.fftfreq(shape[0], sampling[0])
    f_freq_1d_x = np.fft.fftfreq(shape[1], sampling[1])
    f_freq_mesh = np.meshgrid(f_freq_1d_x, f_freq_1d_y)
    f_freq = np.hypot(f_freq_mesh[0], f_freq_mesh[1])
    return f_freq

def int_2d_fourier(arr, sampling):
    freqs = freq_array(arr.shape, sampling)
    k_sq = np.where(freqs != 0, freqs**2, 0.0001)
    k = np.meshgrid(np.fft.fftfreq(arr.shape[0], sampling), np.fft.fftfreq(arr.shape[1], sampling))
    v_int_x = np.real(np.fft.ifft2((np.fft.fft2(arr[1]) * k[0]) / (2*np.pi * 1j * k_sq)))
    v_int_y = np.real(np.fft.ifft2((np.fft.fft2(arr[0]) * k[0]) / (2*np.pi * 1j * k_sq)))
    v_int_fs = v_int_x + v_int_y
    return v_int_fs

k_int = int_2d_fourier(k, sampling)
Unfortunately, the result is not very accurate at the position where k changes abruptly, as can be seen in the plot below, which displays a horizontal line profile of k and k_int.
Any ideas how to improve the accuracy? Is there a way to make it exactly the same?
I actually found a solution. The integration itself yields very accurate results.
However, the gradient function from numpy calculates second order accurate central differences, which means that the gradient itself already is an approximation.
When you replace the problem above with an analytical formula such as a 2D Gaussian, one can calculate the derivative analytically. When integrating this analytically derived function, the error is on the order of 10^-10 (depending on the width of the Gaussian, which can lead to aliasing effects).
So long story short: The integration function proposed above works as intended!
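The same effect can be seen in one dimension with a few lines of NumPy. This is only a sketch under my own assumptions (a 1D Gaussian of width 10 on 256 samples, and a hypothetical helper integrate_fft that divides the spectrum by 2*pi*i*f); it is not the 2D function from the question. Integrating the analytic derivative recovers the Gaussian almost to machine precision, while integrating the output of np.gradient carries the finite-difference error along.

```python
import numpy as np

n, dx = 256, 1.0
x = np.arange(n) * dx
x0, width = x[n // 2], 10.0

g = np.exp(-(x - x0) ** 2 / (2 * width ** 2))   # test function
dg_exact = -(x - x0) / width ** 2 * g           # analytic derivative
dg_fd = np.gradient(g, dx)                      # central differences

def integrate_fft(deriv, dx):
    """Spectral antiderivative: divide by 2*pi*i*f, leave the f = 0 bin at zero."""
    freqs = np.fft.fftfreq(deriv.size, dx)
    spectrum = np.fft.fft(deriv)
    out = np.zeros_like(spectrum)
    nonzero = freqs != 0
    out[nonzero] = spectrum[nonzero] / (2j * np.pi * freqs[nonzero])
    return np.real(np.fft.ifft(out))

def max_error(deriv):
    rec = integrate_fft(deriv, dx)
    rec += g[0] - rec[0]        # fix the unknown integration constant
    return np.max(np.abs(rec - g))

err_exact = max_error(dg_exact)
err_fd = max_error(dg_fd)
print("error using analytic derivative:", err_exact)
print("error using np.gradient        :", err_fd)
```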

numpy fit coefficients to linear combination of polynomials

I have data that I want to fit with polynomials. I have 200,000 data points, so I want an efficient algorithm. I want to use the numpy.polynomial package so that I can try different families and degrees of polynomials. Is there some way I can formulate this as a system of equations like Ax=b? Is there a better way to solve this than with scipy.optimize.minimize?
import numpy as np
from scipy.optimize import minimize as mini
x1 = np.random.random(2000)
x2 = np.random.random(2000)
y = 20 * np.sin(x1) + x2 - np.sin(30 * x1 - x2 / 10)

def fitness(x, degree=5):
    poly1 = np.polynomial.polynomial.polyval(x1, x[:degree])
    poly2 = np.polynomial.polynomial.polyval(x2, x[degree:])
    return np.sum((y - (poly1 + poly2)) ** 2)

# It seems like I should be able to solve this as a system of equations
# x = np.linalg.solve(np.concatenate([x1, x2]), y)
# minimize the sum of the squared residuals to find the optimal polynomial coefficients
x = mini(fitness, np.ones(10))
print(fitness(x.x))
Your intuition is right. You can solve this as a system of equations of the form Ax = b.
However:
The system is overdetermined and you want to get the least-squares solution, so you need to use np.linalg.lstsq instead of np.linalg.solve.
You can't use polyval because you need to separate the coefficients and powers of the independent variable.
This is how to construct the system of equations and solve it:
A = np.stack([x1**0, x1**1, x1**2, x1**3, x1**4, x2**0, x2**1, x2**2, x2**3, x2**4]).T
xx = np.linalg.lstsq(A, y)[0]
print(fitness(xx)) # test the result with original fitness function
Of course you can generalize over the degree:
A = np.stack([x1**p for p in range(degree)] + [x2**p for p in range(degree)]).T
With the example data, the least squares solution runs much faster than the minimize solution (800µs vs 35ms on my laptop). However, A can become quite large, so if memory is an issue minimize might still be an option.
Update:
Without any knowledge about the internals of the polynomial function things become tricky, but it is possible to separate terms and coefficients. Here is a somewhat ugly way to construct the system matrix A from a function like polyval:
def construct_A(valfunc, degree):
    columns1 = []
    columns2 = []
    for p in range(degree):
        c = np.zeros(degree)
        c[p] = 1
        columns1.append(valfunc(x1, c))
        columns2.append(valfunc(x2, c))
    return np.stack(columns1 + columns2).T
A = construct_A(np.polynomial.polynomial.polyval, 5)
xx = np.linalg.lstsq(A, y)[0]
print(fitness(xx)) # test the result with original fitness function
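As a possible alternative to building the columns one basis function at a time, the numpy.polynomial modules ship Vandermonde-style helpers (polyvander, chebvander, legvander, ...) that return the design matrix for a given family directly. Below is a minimal sketch; the helper design_matrix and the synthetic coefficients are mine. I drop the constant column of the second block, which is an assumption on my part: with two constant columns the system is rank deficient, and although lstsq still returns a minimum-norm solution, the two intercepts are not separately identifiable.

```python
import numpy as np
from numpy.polynomial import polynomial as P
from numpy.polynomial import chebyshev as C

def design_matrix(x1, x2, vander, deg):
    """Columns: basis in x1 (degree 0..deg), then basis in x2 (degree 1..deg)."""
    v1 = vander(x1, deg)
    v2 = vander(x2, deg)[:, 1:]   # skip the duplicate constant column
    return np.hstack([v1, v2])

rng = np.random.default_rng(0)
x1 = rng.random(2000)
x2 = rng.random(2000)

# Synthetic target with known power-basis coefficients (illustration only)
true_coef = np.array([1.0, 2.0, -3.0, 0.5, 4.0])
A_power = design_matrix(x1, x2, P.polyvander, 2)
y = A_power @ true_coef

coef_power, *_ = np.linalg.lstsq(A_power, y, rcond=None)

# Switching family only means switching the vander function
A_cheb = design_matrix(x1, x2, C.chebvander, 2)
coef_cheb, *_ = np.linalg.lstsq(A_cheb, y, rcond=None)

print("power basis coefficients    :", coef_power)
print("Chebyshev basis coefficients:", coef_cheb)
```

The Chebyshev coefficients differ from the power-basis ones, but both describe the same fitted function, so the predictions A @ coef agree.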

python- convolution with step response

I want to compute the integral $\frac{1}{L}\int_{-\infty}^{t}H(t^{'})\exp(-\frac{R}{L}(t-t^{'}))dt^{'}$ using numpy.convolve, where $H(t)$ is the Heaviside function. The result is supposed to equal $\exp(-\frac{R}{L}t)H(t)$.
Below is what I did.
I changed the limits of the integral to -inf to +inf by a change of variable, multiplying by a different H(t), and then used this as the function to convolve with H(t) (the one inside the integral). But the output plot is definitely not an exponential, and I can't find any mistake in my code. Please help; any hints or suggestions will be appreciated!
import numpy as np
import matplotlib.pyplot as plt
R = 1e3
L = 3.
delta = 1
Nf = 100
Nw = 200
k = np.arange(0,Nw,delta)
dt = 0.1e-3
tk = k*dt
Ng = Nf + Nw -2
n = np.arange(0,Nf+Nw-1,delta)
tn = n*dt
#define H
def H(n):
    H = np.ones(n)
    H[0] = 0.5
    return H
#build ftns that get convoluted
f = H(Nf)
w = np.exp((-R/L)*tk)*H(Nw)
#return the value of I
It = np.convolve(w,f)/L
#return the value of Voutput, b(t)
b = H(Ng+1) - R*It
plt.plot(tn,b,'o')
plt.show()
The issue with your code is not so much programming as it is conceptual. Rewrite the convolution as Integral[HeavisideTheta[t-t']*Exp[-R/L * t'], -Inf, t] (that's Mathematica code) and upon inspection you find that H(t-t') is always 1 within the limits (except for at t'=t which is the integration limit... but that's not important). So in reality you're not actually performing a complete convolution... you're basically just taking half (or a third) of the convolution.
If you think of a convolution as inverting one sequence and then going one shift at the time and adding it all up (see http://en.wikipedia.org/wiki/Convolution#Derivations - Visual Explanation of Convolution) then what you want is the middle half... i.e. only when they're overlapping. You don't want the lead-in (4-th graph down: http://en.wikipedia.org/wiki/File:Convolution3.svg). You do want the lead-out.
Now the easiest way to fix your code is as such:
#build ftns that get convoluted
f = H(Nf)
w = np.exp((-R/L)*tk)*H(Nw)
#return the value of I
It = np.convolve(w,f)/L
max_ind = np.argmax(It)
print(max_ind)
It1 = It[max_ind:]
The lead-in is the only time when the convolution integral (technically sum in our case) increases... thus after the lead-in is finished the convolution integral follows Exp[-x]... so you tell python to only take values after the maximum is achieved.
#return the value of Voutput, b(t) works perfectly now!
Note: Since you need the lead-out you can't use np.convolve(a,b, mode = 'valid').
So It1 looks like:
b(t) using It1 looks like:
There is no way you can ever get exp(-x) as the general form because the equation for b(t) is given by 1 - R*exp(-x)... It can't mathematically follow an exp(-x) form. At this point there are 3 things:
The units don't really make sense... check them. The Heaviside function is 1 and R*It1 is about 10,000. I'm not sure this is an issue but just in case, the normalized curve looks as such:
You can get an exp(-x) form if you use b(t) = R*It1 - H(t)... the code for that is here (You might have to normalize depending on your needs):
b = R*It1 - H(len(It1))
# print len(tn)
plt.plot(tn[:len(b)], b,'o')
plt.show()
And the plot looks like:
Your question might still not be resolved in which case you need to explain what exactly you think was wrong. With the info you've given me... b(t) can never have an Exp[-x] form unless the equation for b(t) is messed with. As it stands in your original code It1 follows Exp[-x] in form but b(t) cannot.
I think there's a bit of confusion here about convolution. We use convolution in the time domain to calculate the response of a linear system to an arbitrary input. To do this, we need to know the impulse response of the system. Be careful switching between continuous and discrete systems - see e.g. http://en.wikipedia.org/wiki/Impulse_invariance.
The (continuous) impulse response of your system (which I assume to be for the resistor voltage of an L-R circuit) I have defined for convenience as a function of time t: IR = lambda t: (R/L)*np.exp(-(R/L)*t) * H.
I have also assumed that your input is the Heaviside step function, which I've defined on the time interval [0, 1], for a timestep of 0.001 s.
When we convolve (discretely), we effectively flip one function around and slide it along the other one, multiplying corresponding values and then taking the sum. To use the continuous impulse response with a step function that actually consists of a sequence of Dirac delta functions, we need to multiply the continuous impulse response by the time step dt, as described in the Wikipedia link above on impulse invariance. NB - setting H[0] = 0.5 is also important.
We can visualise this operation below. Any given red marker represents the response at a given time t, and is the "sum-product" of the green input and a flipped impulse response shifted to the right by t. I've tried to show this with a few grey impulse responses.
The code to do the calculation is here.
import numpy as np
import matplotlib.pyplot as plt
R = 1e3 # Resistance
L = 3. #Inductance
dt = 0.001 # Millisecond timestep
# Define interval 1 second long, interval dt
t = np.arange(0, 1, dt)
# Define step function
H = np.ones_like(t)
H[0] = 0.5 # Correction for impulse invariance (cf http://en.wikipedia.org/wiki/Impulse_invariance)
# RL circuit - resistor voltage impulse response (cf http://en.wikipedia.org/wiki/RL_circuit)
IR = lambda t: (R/L)*np.exp(-(R/L)*t) * H # Don't really need to multiply by H as w is zero for t < 0
# Response of resistor voltage
response = np.convolve(H, IR(t)*dt, 'full')
The extra code to make the plot is here:
# Define new, longer, time array for plotting response - must be same length as response, with step dt
tp = np.arange(len(response))* dt
plt.plot(0-t, IR(t), '-', label='Impulse response (flipped)')
for q in np.arange(0.01, 0.1, 0.01):
    plt.plot(q-t, IR(t), 'o-', markersize=3, color=str(10*q))
t = np.arange(-1, 1, dt)
H = np.ones_like(t)
H[t<0] = 0.
plt.plot(t, H, 's', label='Unit step function')
plt.plot(tp, response, '-o', label='Response')
plt.tight_layout()
plt.grid()
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
plt.legend()
plt.show()
Finally, if you still have some confusion about convolution, I strongly recommend "Digital Signal Processing: A Practical Guide for Engineers and Scientists" by Steven W. Smith.
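As a quick sanity check of the dt scaling, the discrete convolution can be compared with the analytic step response of the resistor voltage, 1 - exp(-(R/L) t). The sketch below uses a finer timestep (1e-5 s) and a shorter window (0.05 s) than the code above; both are my choices to keep the discretisation error and the run time small. The last lines also form 1 minus the response, which, if I read the question's b(t) correctly as the step minus R times the current, is the exp(-(R/L) t) curve the asker expected.

```python
import numpy as np

R, L = 1e3, 3.0
a = R / L                      # inverse time constant, 1/s
dt = 1e-5                      # finer step than above (assumption)
t = np.arange(0.0, 0.05, dt)   # about 16 time constants

step = np.ones_like(t)
step[0] = 0.5                  # half-sample at t = 0, as above

impulse_response = a * np.exp(-a * t) * step

# Scale by dt so the sum approximates the convolution integral;
# keep only the first len(t) samples (the rest is the lead-out)
response = np.convolve(step, impulse_response * dt, 'full')[:t.size]

expected = 1.0 - np.exp(-a * t)
decay = 1.0 - response         # compare with exp(-(R/L) t)

print("max |response - (1 - exp(-at))|:", np.max(np.abs(response - expected)))
print("max |decay - exp(-at)|         :", np.max(np.abs(decay - np.exp(-a * t))))
```

Without the factor dt the values are larger by 1/dt, which may be part of why the original plot did not look like the expected curve.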

Slow scipy double quadrature integration

I'm trying to obtain the function expected_W or H that is the result of an integration:
where:
theta is a vector with two elements: theta_0 and theta_1
f(beta | theta) is a normal density for beta with mean theta_0 and variance theta_1
q(epsilon) is a normal density for epsilon with mean zero and variance sigma_epsilon (set to 1 by default).
w(p, theta, eps, beta) is a function I take as input, so I cannot predict exactly how it looks. It will likely be non-linear, but not particularly nasty.
This is the way I implement the problem. I'm sure the wrapper functions I make are a mess, so I'd be happy to receive any help on that too.
from __future__ import division
from scipy import integrate
from scipy.stats import norm
import math
import numpy as np
def exp_w(w_B, sigma_eps = 1, **kwargs):
    '''
    Integrates the w_B function
    Input:
    + w_B : the function to be integrated.
    + sigma_eps : variance of the epsilon term. Set to 1 by default
    '''
    #The integrand function gives everything under the integral:
    # w(B(p, \theta, \epsilon, \beta)) f(\beta | \theta ) q(\epsilon)
    def integrand(eps, beta, p, theta_0, theta_1, sigma_eps=sigma_eps):
        q_e = norm.pdf(eps, loc=0, scale=math.sqrt(sigma_eps))
        f_beta = norm.pdf(beta, loc=theta_0, scale=math.sqrt(theta_1))
        return w_B(p = p,
                   theta_0 = theta_0, theta_1 = theta_1,
                   eps = eps, beta=beta)* q_e *f_beta
    #limits of integration. Using limited support for now.
    eps_inf = lambda beta : -10 # otherwise: -np.inf
    eps_sup = lambda beta : 10 # otherwise: np.inf
    beta_inf = -10
    beta_sup = 10
    def integrated_f(p, theta_0, theta_1):
        return integrate.dblquad(integrand, beta_inf, beta_sup,
                                 eps_inf, eps_sup,
                                 args = (p, theta_0, theta_1))
    # this integrated_f is the H referenced at the top of the question
    return integrated_f
I tested this function with a simple w function for which I know the analytic solution (this won't usually be the case).
def test_exp_w():
    def w_B(p, theta_0, theta_1, eps, beta):
        return 3*(p*eps + p*(theta_0 + theta_1) - beta)
    # Function that I get
    integrated = exp_w(w_B, sigma_eps = 1)
    # Function that I should get
    def exp_result(p, theta_0, theta_1):
        return 3*p*(theta_0 + theta_1) - 3*theta_0
    args = np.random.rand(3)
    d_args = {'p' : args[0], 'theta_0' : args[1], 'theta_1' : args[2]}
    if not (np.allclose(
            integrated(**d_args)[0], exp_result(**d_args)) ):
        raise Exception("Integration procedure isn't working!")
Hence, my implementation seems to be working, but it's very slow for my purpose. I need to repeat this process tens or hundreds of thousands of times (this is a step in a value function iteration; I can give more info if people think it's relevant).
With scipy version 0.14.0 and numpy version 1.8.1, this integral takes 15 seconds to compute.
Does anybody have any suggestion on how to go about this?
To start with, it would probably help to get bounded domains of integration, but I haven't figured out how to do that, or whether the Gaussian quadrature in SciPy takes care of it in a good way (does it use Gauss-Hermite?).
Thanks for your time.
---- Edit: adding profiling times -----
%lprun results show that most of the time is spent in
_distn_infrastructure.py:1529(pdf) and
_continuous_distns.py:97(_norm_pdf)
each with a whopping 83244 calls.
The time taken to integrate your function sounds very long if the function is not a nasty one.
First thing I suggest you do is to profile where the time is spent. Is it spent in dblquad or elsewhere? How many calls are made to w_B during the integration? If the time is spent in dblquad and the number of calls is very high, could you use looser tolerances in the integration?
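One way to answer these questions is to count integrand evaluations and vary the tolerances. The sketch below is only an illustration: it uses the test function w_B from the question, a hand-written normal density instead of scipy.stats.norm.pdf (my substitution, to avoid its per-call overhead), and tolerance values I picked arbitrarily. epsabs and epsrel are existing keyword arguments of dblquad.

```python
import math
from scipy import integrate

def normal_pdf(x, mean, var):
    """Scalar normal density, parameterised by variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def w_B(p, theta_0, theta_1, eps, beta):
    return 3 * (p * eps + p * (theta_0 + theta_1) - beta)

def expected_w(p, theta_0, theta_1, sigma_eps=1.0, **quad_opts):
    """Return (integral, number of integrand calls); hypothetical helper."""
    calls = 0

    def integrand(eps, beta):
        nonlocal calls
        calls += 1
        return (w_B(p, theta_0, theta_1, eps, beta)
                * normal_pdf(eps, 0.0, sigma_eps)
                * normal_pdf(beta, theta_0, theta_1))

    value, _ = integrate.dblquad(integrand, -10, 10,
                                 lambda beta: -10, lambda beta: 10,
                                 **quad_opts)
    return value, calls

p, theta_0, theta_1 = 0.3, 0.5, 0.7
exact = 3 * p * (theta_0 + theta_1) - 3 * theta_0

value_default, calls_default = expected_w(p, theta_0, theta_1)
value_loose, calls_loose = expected_w(p, theta_0, theta_1,
                                      epsabs=1e-4, epsrel=1e-4)

print("exact            :", exact)
print("default tolerance:", value_default, "calls:", calls_default)
print("loose tolerance  :", value_loose, "calls:", calls_loose)
```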
It seems that the multiplication by the gaussians actually enables you to limit the integration limits a great deal, as most of the energy of the gaussian is within a very small area. You might want to try and calculate reasonable tighter bounds. You have already limited the area into -10..10; is there any significant performance change between -100..100, -10..10, and -1..1?
If you know your functions are relatively smooth, then there is a Mickey-Mouse version of the integration:
1. determine reasonable upper and lower limits in both axes (by the gaussians)
2. calculate a reasonable grid density (e.g. 100 points in each direction)
3. calculate the w_B for each of these points (and this will be much faster, if it is possible to require a vectorized version of w_B)
4. sum it all together
This is very low-tech but also very fast. Whether or not it gives you results which are good enough for the outer iteration is an interesting question. It just might.
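The steps above can be sketched roughly as follows. Everything here is an assumption for illustration: limits of 8 standard deviations around each mean, 201 grid points per axis, a plain Riemann sum, and a w_B that accepts NumPy arrays. The test function is the one from the question, for which the exact answer is known.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Vectorised normal density, parameterised by variance."""
    return np.exp(-(x - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def grid_expected_w(w_B, p, theta_0, theta_1, sigma_eps=1.0, n=201, n_sd=8.0):
    """Grid approximation of the double integral (hypothetical helper)."""
    sd_eps, sd_beta = np.sqrt(sigma_eps), np.sqrt(theta_1)
    # Step 1: limits from the gaussians; step 2: grid density
    eps = np.linspace(-n_sd * sd_eps, n_sd * sd_eps, n)
    beta = np.linspace(theta_0 - n_sd * sd_beta, theta_0 + n_sd * sd_beta, n)
    d_eps, d_beta = eps[1] - eps[0], beta[1] - beta[0]
    E, B = np.meshgrid(eps, beta, indexing='ij')
    # Step 3: evaluate everything on the grid in one vectorised call
    values = (w_B(p=p, theta_0=theta_0, theta_1=theta_1, eps=E, beta=B)
              * normal_pdf(E, 0.0, sigma_eps)
              * normal_pdf(B, theta_0, theta_1))
    # Step 4: sum it all together
    return values.sum() * d_eps * d_beta

def w_B(p, theta_0, theta_1, eps, beta):
    return 3 * (p * eps + p * (theta_0 + theta_1) - beta)

p, theta_0, theta_1 = 0.3, 0.5, 0.7
approx = grid_expected_w(w_B, p, theta_0, theta_1)
exact = 3 * p * (theta_0 + theta_1) - 3 * theta_0
print("grid:", approx, "exact:", exact)
```

For a w that is less smooth than this test case, the grid density and limits would need checking against dblquad on a few parameter values before trusting it inside the outer iteration.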
