ODE with non-analytical time-dependent parameters in PyMC3 - python

I'm working on solving the following ODE with PyMC3:
def production( y, t, p ):
    return p[0]*getBeam( t ) - p[1]*y[0]
The getBeam( t ) is my time dependent coefficient. Those coefficients are given by an array of data which is accessed by the time index as follows:
def getBeam( t ):
    nBeam = I[int(t/10)]*pow( 10, -6 )/q_e
    return nBeam
I have successfully implemented it using scipy.integrate.odeint, but I am having a hard time doing it with pymc3.ode. In fact, using the following:
ode_model = DifferentialEquation(func=production, times=x, n_states=1, n_theta=3, t0=0)
with pm.Model() as model:
    a = pm.Uniform( "S-Factor", lower=0.01, upper=100 )
    ode_solution = ode_model(y0=[0], theta=[a, Yield, lambd])
I obviously get the error TypeError: __trunc__ returned non-Integral (type TensorVariable), since t is a TensorVariable and therefore cannot be used to index the array in which the coefficients are stored.
Is there a way to overcome this difficulty? I thought about using theano.function, but I cannot get it working since, unfortunately, the coefficients cannot be expressed by any analytical function: they are just stored inside the array I, whose index represents the time variable t.
Thanks

Since you already have a working implementation with scipy.integrate.odeint, you could use theano.compile.ops.as_op, though it comes with some inconveniences (see "how to fit a method belonging to an instance with pymc3?" and "How to write a custom Deterministic or Stochastic in pymc3 with theano.op?").
Using your exact definitions for production and getBeam, the following code seems to work for me:
from scipy.integrate import odeint
from theano.compile.ops import as_op
import theano.tensor as tt
import pymc3 as pm

def ScipySolveODE(a):
    return odeint(production, y0=[0], t=x, args=([a, Yield, lambd],)).flatten()

@as_op(itypes=[tt.dscalar], otypes=[tt.dvector])
def TheanoSolveODE(a):
    return ScipySolveODE(a)

with pm.Model() as model:
    a = pm.Uniform( "S-Factor", lower=0.01, upper=100 )
    ode_solution = TheanoSolveODE(a)
Sorry I know this is more of a workaround than an actual solution...
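One caveat worth adding (my note, not part of the original answer): since as_op wraps a black-box Python function, Theano cannot differentiate through it, so gradient-based samplers such as NUTS cannot be used for a. A minimal sketch of how sampling could then look, where y_obs and sigma_obs are hypothetical placeholders for the measured data and its uncertainty:

with model:
    # y_obs and sigma_obs are hypothetical placeholders, not from the original post
    like = pm.Normal("like", mu=ode_solution, sigma=sigma_obs, observed=y_obs)
    # gradient-free step method, since the wrapped ODE solver provides no gradient
    trace = pm.sample(2000, step=pm.Slice())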

Related

The covariance of the parameters cannot be estimated during curve fitting

I'm trying to solve for two unknown parameters of my function using scipy.optimize.curve_fit. The equation I used is as follows:
[equation image: amp(f) = omiga * exp(-pi * f * tstar) / (1 + (f / 18.15)^2), as implemented in the code below]
My code is as follows:
p_freqs =np.array(0.,8.19672131,16.39344262,24.59016393,32.78688525,
40.98360656,49.18032787,57.37704918,65.57377049,73.7704918,
81.96721311,90.16393443,98.36065574,106.55737705,114.75409836,
122.95081967,131.14754098,139.3442623, 147.54098361,155.73770492,
163.93442623,172.13114754,180.32786885,188.52459016,196.72131148,
204.91803279,213.1147541, 221.31147541,229.50819672,237.70491803,
245.90163934)
p_fft_amp1 = np.array(3.34278536e-08,5.73549829e-08,1.94897033e-08,1.59088184e-08,
9.23948302e-09,3.71198908e-09,1.85535722e-09,1.86064653e-09,
1.52149363e-09,1.33626573e-09,1.19468040e-09,1.08304535e-09,
9.96594475e-10,9.25671797e-10,8.66775330e-10,8.17287132e-10,
7.75342888e-10,7.39541296e-10,7.08843676e-10,6.82440637e-10,
6.59712650e-10,6.40169517e-10,6.23422124e-10,6.09159901e-10,
5.97134297e-10,5.87146816e-10,5.79040074e-10,5.72691200e-10,
5.68006964e-10,5.64920239e-10,5.63387557e-10)
def cal_omiga_tstar(omiga,tstar,f):
    return omiga*np.exp(-np.pi*f*tstar)/(1+(f/18.15)**2)

omiga,tstar = optimize.curve_fit(cal_omiga_tstar,p_freqs,p_fft_amp1)[0]
When I run the code I get the following prompt:
OptimizeWarning: Covariance of the parameters could not be estimated
  warnings.warn('Covariance of the parameters could not be estimated',
I couldn't exactly pinpoint the cause of your error message because your code had some errors prior to that. First, the construction of the two arrays has invalid syntax (the values need to be wrapped in a list), and second, your definition of cal_omiga_tstar has the wrong argument order: curve_fit expects the independent variable to be the first argument of the model function. While fixing these problems I did get your error message once, but I haven't been able to reproduce it, weirdly enough. However, I did manage to fit your function. You should supply initial guesses for the parameters, especially since your y has so many low values. There's no magic here: just plot your model and data until the fit is relatively close, then let the algorithm take the wheel.
Here's my code:
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
# Changed here, was "np.array(0.,..."
p_freqs =np.array([0.,8.19672131,16.39344262,24.59016393,32.78688525,
40.98360656,49.18032787,57.37704918,65.57377049,73.7704918,
81.96721311,90.16393443,98.36065574,106.55737705,114.75409836,
122.95081967,131.14754098,139.3442623, 147.54098361,155.73770492,
163.93442623,172.13114754,180.32786885,188.52459016,196.72131148,
204.91803279,213.1147541, 221.31147541,229.50819672,237.70491803,
245.90163934])
p_fft_amp1 = np.array([3.34278536e-08,5.73549829e-08,1.94897033e-08,1.59088184e-08,
9.23948302e-09,3.71198908e-09,1.85535722e-09,1.86064653e-09,
1.52149363e-09,1.33626573e-09,1.19468040e-09,1.08304535e-09,
9.96594475e-10,9.25671797e-10,8.66775330e-10,8.17287132e-10,
7.75342888e-10,7.39541296e-10,7.08843676e-10,6.82440637e-10,
6.59712650e-10,6.40169517e-10,6.23422124e-10,6.09159901e-10,
5.97134297e-10,5.87146816e-10,5.79040074e-10,5.72691200e-10,
5.68006964e-10,5.64920239e-10,5.63387557e-10])
# Changed sequence from "omiga, tstar, f" to "f, omiga, tstar".
def cal_omiga_tstar(f, omiga,tstar):
    return omiga*np.exp(-np.pi*f*tstar)/(1+(f/18.15)**2)
# Changed this call to get popt, pcov, and supplied the initial guesses
popt, pcov = curve_fit(cal_omiga_tstar,p_freqs,p_fft_amp1, p0=(1E-5, 1E-2))
Here's popt: array([ 4.51365934e-08, -1.48124194e-06]) and pcov: array([[1.35757744e-17, 3.54656128e-12],[3.54656128e-12, 2.90508007e-06]]). As you can see, the covariance matrix could be estimated in this case.
Here's the model vs. data curve:
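(The curve was attached as an image; below is a minimal sketch of how it can be reproduced from the code above. The axis labels are my own guesses, not from the original post.)

plt.scatter(p_freqs, p_fft_amp1, s=12, label="data")
plt.plot(p_freqs, cal_omiga_tstar(p_freqs, *popt), color="C1", label="fit")
plt.xlabel("frequency")   # label assumed
plt.ylabel("amplitude")   # label assumed
plt.legend()
plt.show()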

Summarise the posterior of a single parameter from an array with arviz

I am estimating a model using the pyMC3 library in python. In my "real" model, there are four parameter arrays, two of which have over 170,000 parameters in them. Summarising this array of parameters is too computationally intensive on my computer. I have been trying to figure out if the summary function in arviz will allow me to only summarise one (or a small number) of parameters in the array. Below is a reprex where the same problem is present, though the model is a lot simpler. In the linear regression model below, the parameter array b has three parameters in it b[0], b[1], b[2]. I would like to know how to get the summary for just b[0] and b[1] or alternatively for just a single parameter, e.g., b[0].
import pandas as pd
import pymc3 as pm
import arviz as az
d = pd.read_csv("https://quantoid.net/files/mtcars.csv")
mpg = d['mpg'].values
hp = d['hp'].values
weight = d['wt'].values
with pm.Model() as model:
    b = pm.Normal("b", mu=0, sigma=10, shape=3)
    sig = pm.HalfCauchy("sig", beta=2)
    mu = pm.Deterministic('mu', b[0] + b[1]*hp + b[2]*weight)
    like = pm.Normal('like', mu=mu, sigma=sig, observed=mpg)
    fit = pm.fit(10000, method='advi')
    samp = fit.sample(1500)

with model:
    smry = az.summary(samp, var_names = ["b"])
It looked like the coords argument to the summary() function would do it, but after googling around and finding a few examples, like the one here with plot_posterior() instead of summary(), I was unable to get something to work. In particular, I tried the following in the hopes that it would return the summary for b[0] and b[1].
with model:
    smry = az.summary(samp, var_names = ["b"], coords={"b_dim_0": range(1)})
or this to return the summary of b[0]:
with model:
    smry = az.summary(samp, var_names = ["b"], coords={"b_dim_0": [0]})
I suspect I am missing something simple (I'm an R user who dabbles occasionally with Python). Any help is greatly appreciated.
(BTW, I am using Python 3.8.0, pyMC3 3.9.3, arviz 0.10.0)
To use coords for this, you need to update to the development version of ArviZ (it will still report 0.11.2 but has the code from GitHub) or to any release newer than 0.11.2. Up to and including 0.11.2, the coords argument in summary was not used to subset the data (as it is in all plotting functions); it was only taken into account if the input was not already InferenceData, in which case it was passed to the converter.
With older versions, you need to use xarray to subset the data before passing it to summary, which means you have to explicitly convert the trace to InferenceData beforehand. In the example above it would look like:
with model:
    ...
    samp = fit.sample(1500)
    idata = az.from_pymc3(samp)

az.summary(idata.posterior[["b"]].sel({"b_dim_0": [0]}))
Moreover, you may also want to indicate summary to compute only a subset of the stats/diagnostics as shown in the docstring examples.
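For example, a minimal sketch assuming a recent ArviZ in which summary accepts coords for InferenceData input; the kind argument restricts the output to the summary statistics only:

# summarise only b[0] and skip the convergence diagnostics
az.summary(idata, var_names=["b"], coords={"b_dim_0": [0]}, kind="stats")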

How to define General deterministic function in PyMC

In my model, I need to obtain the value of my deterministic variable from a set of parent variables using a complicated python function.
Is it possible to do that?
Following is a pyMC3 code which shows what I am trying to do in a simplified case.
import numpy as np
import pymc as pm

# Predefine values on two-parameter grid (x, w) for a set of i values (1, 2, 3)
idata = np.array([1,2,3])
size = 20
gridlength = size*size
Grid = np.empty((gridlength,2+len(idata)))
for x in range(size):
    for w in range(size):
        # A silly version of my real model evaluated on grid.
        Grid[x*size+w,:] = np.array([x,w]+[(x**i + w**i) for i in idata])

# A function to find the nearest value in Grid and return its product with third variable z
def FindFromGrid(x,w,z):
    return Grid[int(x)*size+int(w),2:] * z

# Generate fake Y data with error
yerror = np.random.normal(loc=0.0, scale=9.0, size=len(idata))
ydata = Grid[16*size+12,2:]*3.6 + yerror  # i.e. true x=16, w=12 and z=3.6
with pm.Model() as model:
    # Priors
    x = pm.Uniform('x', lower=0, upper=size)
    w = pm.Uniform('w', lower=0, upper=size)
    z = pm.Uniform('z', lower=-5, upper=10)

    # Expected value
    y_hat = pm.Deterministic('y_hat', FindFromGrid(x,w,z))

    # Data likelihood
    ysigmas = np.ones(len(idata))*9.0
    y_like = pm.Normal('y_like', mu=y_hat, sd=ysigmas, observed=ydata)

    # Inference...
    start = pm.find_MAP()        # Find starting value by optimization
    step = pm.NUTS(state=start)  # Instantiate MCMC sampling algorithm
    trace = pm.sample(1000, step, start=start, progressbar=False)  # draw 1000 posterior samples using NUTS sampling

print('The trace plot')
fig = pm.traceplot(trace, lines={'x': 16, 'w': 12, 'z': 3.6})
fig.show()
When I run this code, I get an error at the y_hat stage, because the int() function inside FindFromGrid(x,w,z) needs an integer, not a FreeRV.
Finding y_hat from a pre-calculated grid is important because my real model for y_hat does not have an analytical form.
I earlier tried to use OpenBUGS, but I found out here that it is not possible to do this in OpenBUGS. Is it possible in PyMC?
Update
Based on an example on the pyMC GitHub page, I found that I need to add the following decorator to my FindFromGrid(x,w,z) function:
@pm.theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
This seems to solve the above-mentioned issue, but I cannot use the NUTS sampler anymore since it needs gradients.
Metropolis does not seem to be converging.
Which step method should I use in a scenario like this?
You found the correct solution with as_op.
Regarding the convergence: are you using pm.Metropolis() instead of pm.NUTS() by any chance? One reason this might not converge is that Metropolis() by default samples in the joint space, while Gibbs-within-Metropolis (updating one variable at a time) is often more effective (and was the default in pymc2). Having said that, I just merged this: https://github.com/pymc-devs/pymc/pull/587 which changes the default behavior of the Metropolis and Slice samplers to be non-blocked (i.e. within Gibbs). Other samplers like NUTS that are primarily designed to sample the joint space still default to blocked. You can always set this explicitly with the kwarg blocked=True.
Anyway, update pymc with the most recent master and see if convergence improves. If not, try the Slice sampler.
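For illustration, a minimal sketch (my addition, reusing the model and start value from the question) of setting the step method explicitly:

with model:
    # non-blocked Metropolis: propose one variable at a time (Gibbs-within-Metropolis)
    step = pm.Metropolis(blocked=False)
    # step = pm.Slice()  # alternative suggested above
    trace = pm.sample(5000, step, start=start)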

SciPy: generating custom random variable from PMF

I'm trying to generate random variables according to a certain ugly distribution, in Python. I have an explicit expression for the PMF, but it involves some products which makes it unpleasant to obtain and invert the CDF (see below code for explicit form of PMF).
In essence, I'm trying to define a random variable in Python by its PMF and then have built-in code do the hard work of sampling from the distribution. I know how to do this if the support of the RV is finite, but here the support is countably infinite.
The code I am currently trying to run as per #askewchan's advice below is:
import scipy as sp
import numpy as np
class x_gen(sp.stats.rv_discrete):
    def _pmf(self,k,param):
        num = np.arange(1+param, k+param, 1)
        denom = np.arange(3+2*param, k+3+2*param, 1)
        p = (2+param)*(np.prod(num)/np.prod(denom))
        return p

pa_limit = limitrv_gen()
print pa_limit.rvs(alpha,n=1)
However, this returns the error while running:
File "limiting_sim.py", line 42, in _pmf
num = np.arange(1+param, k+param, 1)
TypeError: only length-1 arrays can be converted to Python scalars
Basically, it seems that the np.arange() call isn't working somehow inside the _pmf() function. I'm at a loss to see why. Can anyone enlighten me here and/or point out a fix?
EDIT 1: cleared up some questions by askewchan, edits reflected above.
EDIT 2: askewchan suggested an interesting approximation using the factorial function, but I'm looking more for an exact solution such as the one that I'm trying to get work with np.arange.
You should be able to subclass rv_discrete like so:
class mydist_gen(rv_discrete):
    def _pmf(self, n, param):
        return yourpmf(n, param)
Then you can create a distribution instance with:
mydist = mydist_gen()
And generate samples with:
mydist.rvs(param, size=1000)
Or you can then create a frozen distribution object with:
mydistp = mydist(param)
And finally generate samples with:
mydistp.rvs(1000)
With your example, this should work, since factorial automatically broadcasts. But, it might fail for large enough alpha:
import scipy as sp
import numpy as np
from scipy.misc import factorial
class limitrv_gen(sp.stats.rv_discrete):
    def _pmf(self, k, alpha):
        # num = np.prod(np.arange(1+alpha, k+alpha))
        num = factorial(k+alpha-1) / factorial(alpha)
        # denom = np.prod(np.arange(3+2*alpha, k+3+2*alpha))
        denom = factorial(k + 2 + 2*alpha) / factorial(2 + 2*alpha)
        return (2+alpha) * num / denom

pa_limit = limitrv_gen()
alpha = 100
pa_limit.rvs(alpha, size=10)
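If you want to keep the exact np.arange-based product from the question rather than the factorial approximation, one option (a sketch under my own assumptions, not part of the original answer) is to vectorize the scalar PMF, since rv_discrete passes arrays of k to _pmf:

import numpy as np
import scipy.stats as stats

def _pmf_scalar(k, alpha):
    # exact product form from the question, valid for a single integer k
    num = np.arange(1 + alpha, k + alpha, 1)
    denom = np.arange(3 + 2*alpha, k + 3 + 2*alpha, 1)
    return (2 + alpha) * (np.prod(num) / np.prod(denom))

class exactrv_gen(stats.rv_discrete):
    def _pmf(self, k, alpha):
        # _pmf receives arrays, so evaluate the scalar version elementwise
        return np.vectorize(_pmf_scalar)(k, alpha)

pa_exact = exactrv_gen()
# pa_exact.rvs(100, size=10)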

Creating new distributions in scipy

I'm trying to create a distribution based on some data I have, then draw randomly from that distribution. Here's what I have:
from scipy import stats
import numpy
def getDistribution(data):
    kernel = stats.gaussian_kde(data)
    class rv(stats.rv_continuous):
        def _cdf(self, x):
            return kernel.integrate_box_1d(-numpy.Inf, x)
    return rv()

if __name__ == "__main__":
    # pretend this is real data
    data = numpy.concatenate((numpy.random.normal(2,5,100), numpy.random.normal(25,5,100)))
    d = getDistribution(data)
    print d.rvs(size=100)  # this usually fails
I think this is doing what I want it to, but I frequently get an error (see below) when I try to do d.rvs(), and d.rvs(100) never works. Am I doing something wrong? Is there an easier or better way to do this? If it's a bug in scipy, is there some way to get around it?
Finally, is there more documentation on creating custom distributions somewhere? The best I've found is the scipy.stats.rv_continuous documentation, which is pretty spartan and contains no useful examples.
The traceback:
Traceback (most recent call last):
  File "testDistributions.py", line 19, in <module>
    print d.rvs(size=100)
  File "/usr/local/lib/python2.6/dist-packages/scipy-0.10.0-py2.6-linux-x86_64.egg/scipy/stats/distributions.py", line 696, in rvs
    vals = self._rvs(*args)
  File "/usr/local/lib/python2.6/dist-packages/scipy-0.10.0-py2.6-linux-x86_64.egg/scipy/stats/distributions.py", line 1193, in _rvs
    Y = self._ppf(U,*args)
  File "/usr/local/lib/python2.6/dist-packages/scipy-0.10.0-py2.6-linux-x86_64.egg/scipy/stats/distributions.py", line 1212, in _ppf
    return self.vecfunc(q,*args)
  File "/usr/local/lib/python2.6/dist-packages/numpy-1.6.1-py2.6-linux-x86_64.egg/numpy/lib/function_base.py", line 1862, in __call__
    theout = self.thefunc(*newargs)
  File "/usr/local/lib/python2.6/dist-packages/scipy-0.10.0-py2.6-linux-x86_64.egg/scipy/stats/distributions.py", line 1158, in _ppf_single_call
    return optimize.brentq(self._ppf_to_solve, self.xa, self.xb, args=(q,)+args, xtol=self.xtol)
  File "/usr/local/lib/python2.6/dist-packages/scipy-0.10.0-py2.6-linux-x86_64.egg/scipy/optimize/zeros.py", line 366, in brentq
    r = _zeros._brentq(f,a,b,xtol,maxiter,args,full_output,disp)
ValueError: f(a) and f(b) must have different signs
Edit
For those curious, following the advice in the answer below, here's code that works:
from scipy import stats
import numpy
def getDistribution(data):
    kernel = stats.gaussian_kde(data)
    class rv(stats.rv_continuous):
        def _rvs(self, *x, **y):
            # don't ask me why it's using self._size
            # nor why I have to cast to int
            return kernel.resample(int(self._size))
        def _cdf(self, x):
            return kernel.integrate_box_1d(-numpy.Inf, x)
        def _pdf(self, x):
            return kernel.evaluate(x)
    return rv(name='kdedist', xa=-200, xb=200)
Specifically to your traceback:
rvs uses the inverse of the cdf, the ppf, to create random numbers. Since you are not specifying a ppf, it is calculated by a root-finding algorithm, brentq. brentq uses lower and upper bounds on where it should search for the value at which the function is zero (find x such that cdf(x) = q, where q is the quantile).
The defaults for the limits, xa and xb, are too small in your example. The following works for me with scipy 0.9.0; xa and xb can be set when creating the instance of the distribution:
def getDistribution(data):
    kernel = stats.gaussian_kde(data)
    class rv(stats.rv_continuous):
        def _cdf(self, x):
            return kernel.integrate_box_1d(-numpy.Inf, x)
    return rv(name='kdedist', xa=-200, xb=200)
There is currently a pull request for scipy to improve this, so in the next release xa and xb will be expanded automatically to avoid the f(a) and f(b) must have different signs exception.
There is not much documentation on this, the easiest is to follow some examples (and ask on the mailing list).
edit: addition
pdf: Since you have the density function also given by gaussian_kde, I would add the _pdf method, which will make some calculations more efficient.
edit2: addition
rvs: If you are interested in generating random numbers, then gaussian_kde has a resample method. Random Samples can be generated by sampling from the data and adding gaussian noise. So, this will be faster than the generic rvs using the ppf method. I would write a ._rvs method that just calls gaussian_kde's resample method.
precomputing ppf: I don't know of any general way to precompute the ppf. However, the way I thought of doing it (but never tried so far) is to precompute the ppf at many points and then use linear interpolation to approximate the ppf function.
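A minimal sketch of that idea (my own untested assumption of how it could look): evaluate the CDF on a grid and invert it with linear interpolation.

import numpy as np

def make_approx_ppf(dist, lo=-200.0, hi=200.0, n=2001):
    xs = np.linspace(lo, hi, n)
    cdf_vals = np.array([dist.cdf(x) for x in xs])  # non-decreasing by construction
    def approx_ppf(q):
        # invert the tabulated CDF: find x such that cdf(x) is approximately q
        return np.interp(q, cdf_vals, xs)
    return approx_ppf

# usage: approx_ppf = make_approx_ppf(d); samples = approx_ppf(np.random.uniform(size=100))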
edit3: about _rvs to answer Srivatsan's question in the comment
_rvs is the distribution-specific method that is called by the public method rvs. rvs is a generic method that does some argument checking, adds location and scale, sets the attribute self._size (the size of the requested array of random variables), and then calls the distribution-specific method ._rvs or its generic counterpart. The extra arguments in ._rvs are shape parameters, but since there are none in this case, *x and **y are redundant and unused.
I don't know how well the size or shape of the .rvs method works in the multivariate case. These distributions are designed for univariate distributions, and might not fully work for the multivariate case, or might need some reshapes.
