In my model, I need to obtain the value of my deterministic variable from a set of parent variables using a complicated Python function.
Is it possible to do that?
Below is PyMC3 code that shows what I am trying to do in a simplified case.
import numpy as np
import pymc as pm

# Predefine values on a two-parameter grid (x, w) for a set of i values (1, 2, 3)
idata = np.array([1, 2, 3])
size = 20
gridlength = size*size
Grid = np.empty((gridlength, 2+len(idata)))
for x in range(size):
    for w in range(size):
        # A silly version of my real model evaluated on the grid.
        Grid[x*size+w, :] = np.array([x, w] + [(x**i + w**i) for i in idata])

# A function to find the nearest value in Grid and return its product with a third variable z
def FindFromGrid(x, w, z):
    return Grid[int(x)*size+int(w), 2:] * z

# Generate fake Y data with error
yerror = np.random.normal(loc=0.0, scale=9.0, size=len(idata))
ydata = Grid[16*size+12, 2:]*3.6 + yerror  # i.e. true x=16, w=12 and z=3.6
with pm.Model() as model:
    # Priors
    x = pm.Uniform('x', lower=0, upper=size)
    w = pm.Uniform('w', lower=0, upper=size)
    z = pm.Uniform('z', lower=-5, upper=10)
    # Expected value
    y_hat = pm.Deterministic('y_hat', FindFromGrid(x, w, z))
    # Data likelihood
    ysigmas = np.ones(len(idata))*9.0
    y_like = pm.Normal('y_like', mu=y_hat, sd=ysigmas, observed=ydata)

    # Inference...
    start = pm.find_MAP()        # find starting value by optimization
    step = pm.NUTS(state=start)  # instantiate MCMC sampling algorithm
    trace = pm.sample(1000, step, start=start, progressbar=False)  # draw 1000 posterior samples using NUTS

print('The trace plot')
fig = pm.traceplot(trace, lines={'x': 16, 'w': 12, 'z': 3.6})
fig.show()
When I run this code, I get an error at the y_hat stage, because the int() function inside FindFromGrid(x, w, z) expects an integer, not a FreeRV.
Finding y_hat from a pre-calculated grid is important because my real model for y_hat has no analytical form.
I earlier tried OpenBUGS, but I found out here that this is not possible in OpenBUGS. Is it possible in PyMC?
Update
Based on an example on the PyMC GitHub page, I found that I need to add the following decorator to my FindFromGrid(x,w,z) function:
@pm.theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
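Applied to the grid lookup above, the decorated function might look like this (a sketch; it assumes theano.tensor has been imported as t):

import theano.tensor as t

@pm.theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar],
                             otypes=[t.dvector])
def FindFromGrid(x, w, z):
    # same nearest-grid-point lookup as before, scaled by z
    return Grid[int(x)*size + int(w), 2:] * z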
This seems to solve the above-mentioned issue, but I cannot use the NUTS sampler any more since it needs gradients.
Metropolis does not seem to converge.
Which step method should I use in a scenario like this?
You found the correct solution with as_op.
Regarding the convergence: are you using pm.Metropolis() instead of pm.NUTS() by any chance? One reason it might not converge is that Metropolis() by default samples in the joint space, while Gibbs-within-Metropolis is often more effective (and was the default in pymc2). Having said that, I just merged https://github.com/pymc-devs/pymc/pull/587, which changes the default behavior of the Metropolis and Slice samplers to be non-blocked (i.e. Gibbs-within). Other samplers like NUTS that are primarily designed to sample the joint space still default to blocked. You can always set this explicitly with the kwarg blocked=True.
Anyway, update pymc to the most recent master and see if convergence improves. If not, try the Slice sampler.
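For example, a rough sketch of switching to the Slice sampler on the model above (the number of draws is an arbitrary choice):

with model:
    start = pm.find_MAP()
    # Slice updates one variable at a time (non-blocked / Gibbs-within)
    step = pm.Slice([x, w, z])
    trace = pm.sample(5000, step, start=start)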
I am encountering a strange problem with scipy.integrate.ode. Here is a minimal working example:
import sys
import time
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
from scipy.integrate import ode, complex_ode
def fun(t):
    return np.exp(-t**2 / 2.)

def ode_fun(t, y):
    a, b = y
    f = fun(t)
    c = np.conjugate(b)
    dt_a = -2j*f*c + 2j*f*b
    dt_b = 1j*f*a
    return [dt_a, dt_b]
t_range = np.linspace(-10., 10., 10000)
init_cond = [-1, 0]
trajectory = np.empty((len(t_range), len(init_cond)), dtype=np.complex128)
### setup ###
r = ode(ode_fun).set_integrator('zvode', method='adams', with_jacobian=False)
r.set_initial_value(init_cond, t_range[0])
dt = t_range[1] - t_range[0]
### integration ###
for i, t_i in enumerate(t_range):
    trajectory[i, :] = r.integrate(r.t + dt)
a_traj = trajectory[:,0]
b_traj = trajectory[:,1]
fun_traj = fun(t_range)
### plot ###
plt.figure(figsize=(10,5))
plt.subplot(121, title='ODE solution')
plt.plot(t_range, np.real(a_traj))
plt.subplot(122, title='Input')
plt.plot(t_range, fun_traj)
plt.show()
This code works correctly; the output figure shows the input (right panel) and the ODE solution for the first variable (left panel). Note that the ODE depends explicitly on the input variable.
So in principle my code is working. What is strange is that if I simply replace the integration range
t_range = np.linspace(-10., 10., 10000)
by
t_range = np.linspace(-20., 20., 10000)
the solution comes out flat: somehow the integrator just gives up on the integration and leaves my solution as a constant. Why does this happen? How can I fix it?
Some things I've tested: it is clearly not a resolution problem; the integration steps are already very small. Instead, the integrator does not even seem to call the ODE function any more after a few steps. I tested that by putting a print statement inside ode_fun().
My current suspicion is that the integrator decided my function is constant after it did not change significantly during the first few integration steps. Do I perhaps have to set some tolerance levels somewhere?
Any help appreciated!
"My current suspicion is that the integrator decided that my function is constant after it did not change significantly during the first few integration steps." Your suspicion is correct.
ODE solvers typically have an internal step size that is adaptively adjusted based on error estimates computed by the solver. These step sizes are independent of the times for which the output is requested; the output at the requested times is computed using interpolation of the solution at the points computed at the internal steps.
When you start your solver at t = -20, apparently the input changes so slowly that the solver's internal step size becomes large enough that by the time the solver gets near t = 0, the solver skips right over the input pulse.
You can limit the internal step size with the option max_step of the set_integrator method. If I set max_step to 2.0 (for example),
r = ode(ode_fun).set_integrator('zvode', method='adams', with_jacobian=False,
                                max_step=2.0)
I get the output that you expect.
I'm trying to use scipy.optimize.curve_fit on a large latitude/longitude/time xarray, using dask.distributed as the computing backend.
The idea is to run an individual fit for every (latitude, longitude) point using its time series.
All of this runs fine outside xarray/dask; I tested it using the time series of a single location passed as a pandas DataFrame. However, if I try to run the same process on the same (latitude, longitude) directly on the xarray, the curve_fit operation returns the initial parameters.
I am performing this operation using xr.apply_ufunc like so (here I'm providing only the code that is strictly relevant to the problem):
# function to perform the fit
def _fit_rti_curve(data, data_rti, fit, loc=False):
    fit_func, linearize, find_init_params = _get_fit_functions(fit)
    # remove nans
    x, y = _filter_nodata(data_rti, data)
    # remove outliers
    x, y = _filter_for_outliers(x, y, linearize=linearize)
    # find a first guess for the maximum achievable value
    yscale = np.max(y) * 1.05
    # find a first guess for the other parameters
    # here loc can be passed manually if you have a good estimate
    init_parms = find_init_params(x, y, yscale, loc=loc, linearize=linearize)
    # fit the curve and return the parameters
    parms = curve_fit(fit_func, x, y, p0=init_parms, maxfev=10000)
    parms = parms[0]
    return parms
# shell around _fit_rti_curve
def find_rti_func_parms(data, rti, fit):
    # sort and fit the highest n values
    top_data = np.sort(data)
    top_data = top_data[-len(rti):]
    # convert to float64 if needed
    top_data = top_data.astype(np.float64)
    rti = rti.astype(np.float64)
    # run the fit
    parms = _fit_rti_curve(top_data, rti, fit, loc=0)  # TODO: maybe allow a free loc
    return parms
# call to apply_ufunc
# `fit` is a string that defines the distribution type
# `rti` is an array of x values
parms_data = xr.apply_ufunc(
    find_rti_func_parms,
    xr_obj,
    input_core_dims=[['time']],
    output_core_dims=[[fit + ' parameters']],
    output_sizes={fit + ' parameters': len(signature(fit_func).parameters) - 1},
    vectorize=True,
    kwargs={'rti': return_time_interval, 'fit': fit},
    dask='parallelized',
    output_dtypes=['float64']
)
My guess would be that this is a problem related to threading, or at least to some shared memory that is not properly passed between workers and the scheduler.
However, I am just not knowledgeable enough to test this within dask.
Any ideas on this problem?
You should have a look at this issue: https://github.com/pydata/xarray/issues/4300
I had the same problem and solved it using apply_ufunc. It is not optimized, since it has to perform rechunking operations, but it works!
I've created a GitHub Gist for it: https://gist.github.com/clausmichele/8350e1f7f15e6828f29579914276de71
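The rechunking it mentions is usually needed because apply_ufunc with dask='parallelized' requires each core dimension (here 'time') to sit in a single chunk; a minimal sketch, using the names from the question:

# make 'time' a single chunk so each (lat, lon) time series is processed by one task
xr_obj = xr_obj.chunk({'time': -1})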
This previous answer might be helpful. It uses numpy.polyfit, but I think the general approach should be similar:
Applying numpy.polyfit to xarray Dataset
Also, I haven't tried it, but xarray's polyfit() was just merged recently and could also be something to look into: http://xarray.pydata.org/en/stable/generated/xarray.DataArray.polyfit.html#xarray.DataArray.polyfit
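For instance, a minimal sketch of what that could look like for a linear fit along time (xr_obj is the DataArray from the question; the degree is just an assumption for illustration):

# fit a degree-1 polynomial along 'time' independently for every (lat, lon)
fit = xr_obj.polyfit(dim='time', deg=1)
coeffs = fit.polyfit_coefficients  # coefficients, indexed by 'degree' plus the remaining dims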
I am trying to fit the parameters of a transit light curve.
I have observed transit light curve data, and I am using a Python script that, given 4 parameters (period, a (semi-major axis), inclination, planet radius), returns a model transit light curve. I would like to minimize the residual between these two light curves. This is what I am trying to do: first, estimate a maximum-likelihood fit using method="L-BFGS-B", and then run MCMC with emcee to estimate the uncertainties.
The code:
p = lmfit.Parameters()
p.add_many(('per', 2.), ('inc', 90.), ('a', 5.), ('rp', 0.1))
per_b = [1., 3.]
a_b = [4., 6.]
inc_b = [88., 90.]
rp_b = [0.1, 0.3]
bounds = [(per_b[0], per_b[1]), (inc_b[0], inc_b[1]), (a_b[0], a_b[1]), (rp_b[0], rp_b[1])]
def residual(p):
    v = p.valuesdict()
    eclipse.criarEclipse(v['per'], v['a'], v['inc'], v['rp'])
    lc0 = numpy.array(eclipse.getCurvaLuz())     # observed flux data
    ts0 = numpy.array(eclipse.getTempoHoras())   # observed time data
    c = numpy.linspace(min(time_phased[bb]), max(time_phased[bb]),
                       len(time_phased[bb]), endpoint=True)
    nn = interpolate.interp1d(ts0, lc0)
    return nn(c) - smoothed_LC[bb]               # residual between the model and the data
Inside def residual(p) I make sure that the observed data (time_phased[bb] and smoothed_LC[bb]) and the model transit light curve have the same size. I want it to give me the best-fit values for the parameters (v['per'], v['a'], v['inc'], v['rp']).
I need your help and I appreciate your time and your attention. Kindest regards, Yuri.
Your example is incomplete, with many partial concepts and some invalid Python, which makes it slightly hard to understand your intention. If the answer below is not sufficient, update your question with a complete example.
It seems pretty clear that you want to model your data smoothed_LC[bb] (not sure what bb is) with a model for some effect of an eclipse. With that assumption, I would recommend using the lmfit.Model approach. Start by writing a function that models the data, just so you check and plot your model. I'm not entirely sure I understand everything you're doing, but this model function might look like this:
import numpy
from scipy import interpolate
from lmfit import Model
# import eclipse from somewhere....
def eclipse_lc(c, per, a, inc, rp):
    eclipse.criarEclipse(per, a, inc, rp)
    lc0 = numpy.array(eclipse.getCurvaLuz())     # observed flux data
    ts0 = numpy.array(eclipse.getTempoHoras())   # observed time data
    return interpolate.interp1d(ts0, lc0)(c)
With this model function, you can build a Model:
lc_model = Model(eclipse_lc)
and then build parameters for your model. This will automatically name them after the argument names of your model function. Here, you can also give them initial values:
params = lc_model.make_params(per=2, inc=90, a=5, rp=0.1)
You wanted to place upper and lower bounds on these parameters. This is done by setting the min and max attributes of the individual parameters, not by making an ordered array of bounds:
params['per'].min = 1.0
params['per'].max = 3.0
and so on. But also: setting such tight bounds is usually a bad idea. Set bounds to avoid unphysical parameter values or when it becomes evident that you need to place them.
Now, you can fit your data with this model. Well, first you need to get the data you want to model. This seems less clear from your example, but perhaps:
c_data = numpy.linspace(min(time_phased[bb]), max(time_phased[bb]),
                        len(time_phased[bb]), endpoint=True)
lc_data = smoothed_LC[bb]
Well: why do you need to make this c_data? Why not just use time_phased as the independent variable? Anyway, now you can fit your data to your model with your parameters:
result = lc_model.fit(lc_data, params, c=c_data)
At this point, you can print out a report of the results and/or view or get the best-fit arrays:
print(result.fit_report())
for p in result.params.items(): print(p)
import matplotlib.pyplot as plt
plt.plot(c_data, lc_data, label='data')
plt.plot(c_data, result.best_fit, label='fit')
plt.legend()
plt.show()
Hope that helps...
I would like to use scipy's ODR to fit a curve to a set of variables with variances. In this case, I am fitting a linear function with a fixed Y-axis crossing point (e.g. a*x + 100). Due to my inability to find an estimator (I asked about that here), I am using scipy.optimize's curve_fit to estimate the initial a value. Now, the function works perfectly without standard deviations, but when I add them, the output makes completely no sense (the curve is far above all of the points). What could be the cause of such behaviour? Thanks!
The code is attached here:
import scipy.odr as SODR
from scipy.optimize import curve_fit
def fun(kier, arg):
    '''
    Function to fit.
    :param kier: list of parameters
    :param arg: argument of the function fun
    :return y: value of the function at arg
    '''
    y = kier[0]*arg + 100  # + kier[1]
    return y
zx = [30.120348566300354, 36.218214083626386, 52.86998374096616]
zy = [83.47033171149137, 129.10207165602722, 85.59465198231146]
dx = [2.537935346025827, 4.918719773247683, 2.5477221183398977]
dy = [3.3729431749276837, 5.33696690247701, 2.0937213187876]
sx = [6.605581618516947, 8.221194790372632, 22.980577676739113]
sy = [1.0936584882351936, 0.7749999999999986, 20.915359045447914]
dx_total = [9.143516964542775, 13.139914563620316, 25.52829979507901]
dy_total = [4.466601663162877, 6.1119669024770085, 23.009080364235516]
# curve fitting
popt, pcov = curve_fit(fun, zx, zy)
danesd = SODR.RealData(x=zx, y=zy, sx=dx_total, sy=dy_total)
model = SODR.Model(fun)
onbig = SODR.ODR(danesd, model, beta0=[popt[0]])
outputbig = onbig.run()
biga=outputbig.beta[0]
print(biga)
daned = SODR.RealData(x=zx,y=zy,sx=dx,sy=dy)
on = SODR.ODR(daned, model, beta0=[popt[0]])
outputd = on.run()
normala = outputd.beta[0]
print(normala)
The outputs are:
30.926925885047254 (this is the output with standard deviation)
-0.25132703671513873 (this is without standard deviation)
This makes no sense: the fit that uses the standard deviations ends up far above all of the data points.
Also, I'd be happy to get any feedback on whether my code and the formatting of this question are clear. I am still very new here.
I want to shift my variables in PyMC3. I'm currently just using deterministic transforms, but when I perform the inference, it saves both the original samples and the shifted samples (which is expected behavior).
Example code:
import pymc3 as pm

x_lower = -3

with pm.Model():
    x = pm.Gamma('x', alpha=2., beta=1.5)
    x_shift = pm.Deterministic("x_shift", x + x_lower)
    trace = pm.sample(1000, tune=1000)

trace.remove_values("x")  # my current solution
tp = pm.traceplot(trace)
# other analysis...
Now trace stores all the x samples and all the x_shift samples, which is clearly a waste as the numbers of variables and samples increase. I can call trace.remove_values("x") before continuing with the analysis, but I would prefer simply not to save x at all.
Another option is not to save x_shift at all, but I can't find how to add x_lower to the samples after inference, so this isn't really a solution if I want to use the built-in analysis tools.
Can I save only the x_shift samples, and not the x samples, when I sample?
You can specify exactly what you want to save by setting the trace argument in the pm.sample() function, e.g.,
trace = pm.sample(1000, tune=1000, trace=[x_shift])
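In the context of the model from the question, a minimal sketch (only the shifted variable is recorded):

import pymc3 as pm

x_lower = -3
with pm.Model():
    x = pm.Gamma('x', alpha=2., beta=1.5)
    x_shift = pm.Deterministic("x_shift", x + x_lower)
    # pass the list of variables to track, so only x_shift ends up in the trace
    trace = pm.sample(1000, tune=1000, trace=[x_shift])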
In case anyone finds this...
I went for a still-unsatisfying solution: I'm not using Deterministic transforms any more. I still transform the variables, but I don't save them; I just apply the transformation to the saved (original) samples after sampling.
The above code now looks like this:
import pymc3 as pm

x_lower = -3

def f_x(x):
    return x + x_lower

with pm.Model():
    x = pm.Gamma('x', alpha=2., beta=1.5)
    x_shift = f_x(x)
    # keep = {"x": f_x}  # something like this for more variables
    trace = pm.sample(1000, tune=1000)

trace = {"x": f_x(trace["x"])}  # note this merges all chains
# trace = {varname: f(trace[varname]) for varname, f in keep.items()}
tp = pm.traceplot(trace)
# other analysis...
I think pm.traceplot(trace) still works with trace in this form, but otherwise just import arviz and use it directly; it works with dicts like this.
Note: take care with more complicated transformations, e.g. you'll have to use pm.math.exp in the model but np.exp in the post-sampling transformation.
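As an illustration of that note, here is a rough sketch (assuming the same workflow as above, with an exponential instead of a shift):

import numpy as np
import pymc3 as pm

with pm.Model():
    x = pm.Gamma('x', alpha=2., beta=1.5)
    x_exp = pm.math.exp(x)          # inside the model: a theano expression
    trace = pm.sample(1000, tune=1000)

x_exp_samples = np.exp(trace["x"])  # after sampling: plain NumPy on the raw draws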