Define different bounds for a multidimensional stochastic variable in pymc - python

I'm having an issue with defining bounds for a multidimensional stochastic variable.
Here is a dummy example to explain my problem.
If I want to have a 3-dimensional discrete uniform between [0, 100]:
import pymc as mc
from numpy import empty

truth = mc.DiscreteUniform("bin1", lower=0, upper=100, value=[50,50,50], size=3)

@mc.deterministic(plot=False)
def unfold(truth=truth):
    out = empty(3)
    for r in xrange(3):
        out[r] = truth[r]
    return out

data = [5, 10, 30]
unfolded = mc.Poisson('unfolded', mu=unfold, value=data, observed=True, size=3)

model = mc.Model([unfolded, unfold, truth])
mcmc = mc.MCMC( model )
mcmc.use_step_method(mc.AdaptiveMetropolis, truth)
mcmc.sample(10000, 1000, 10)
This will sample a DiscreteUniform for 3 bins with the same range for each bin (between 0 and 100).
Now, I have tried several things to define a different range for each bin, but cannot succeed. I tried arrays of DiscreteUniform and arrays of bounds (upper, lower), but they obviously do not work.
Does anyone have any idea how to define a different range for each bin of a stochastic variable?

To define different ranges and initial values you need to call the stochastic constructor N times to create a list of variables and then use the Container constructor to make the list pymc-readable:
bin1 = mc.DiscreteUniform("bin1", lower=0, upper=100, value=50, size=1)
bin2 = mc.DiscreteUniform("bin2", lower=0, upper=40, value=20, size=1)
bin3 = mc.DiscreteUniform("bin3", lower=10, upper=50, value=30, size=1)
truth = mc.Container([bin1,bin2,bin3])
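Not part of the original answer, but for completeness, a minimal sketch of how the rest of the question's model might look once truth is a Container (assuming the same PyMC2 conventions as the question, with the priors kept as scalars):

import pymc as mc
from numpy import empty

# Per-bin priors, each with its own bounds, wrapped in a Container
bin1 = mc.DiscreteUniform("bin1", lower=0, upper=100, value=50)
bin2 = mc.DiscreteUniform("bin2", lower=0, upper=40, value=20)
bin3 = mc.DiscreteUniform("bin3", lower=10, upper=50, value=30)
truth = mc.Container([bin1, bin2, bin3])

@mc.deterministic(plot=False)
def unfold(truth=truth):
    out = empty(3)
    for r in xrange(3):
        out[r] = truth[r]   # each element is now its own stochastic
    return out

data = [5, 10, 30]
unfolded = mc.Poisson('unfolded', mu=unfold, value=data, observed=True, size=3)

model = mc.Model([unfolded, unfold, truth])
mcmc = mc.MCMC(model)
mcmc.use_step_method(mc.AdaptiveMetropolis, [bin1, bin2, bin3])
mcmc.sample(10000, 1000, 10)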


Bifurcation diagram of dynamical system

TL;DR
How can one implement a bifurcation diagram of a seasonally forced epidemiological model such as SEIR (susceptible, exposed, infected, recovered) in Python? I already know how to implement the model itself and display a sampled time series (see this stackoverflow question), but I am struggling with reproducing a bifurcation figure from a textbook.
Context and My Attempt
I am trying to reproduce figures from the book "Modeling Infectious Diseases in Humans and Animals" (Keeling 2007) to both validate my implementations of models and to learn/visualize how different model parameters affect the evolution of a dynamical system. Below is the textbook figure.
I have found implementations of bifurcation diagrams for examples using the logistic map (see this ipython cookbook, this pythonalgos bifurcation, and this stackoverflow question). My main takeaway from these implementations was that a single point on the bifurcation diagram has an x-component equal to some particular value of the varied parameter (e.g., beta_one = 0.025) and a y-component equal to the solution (numerical or otherwise) at time t for a given model/function. I use this logic to implement the plot_bifurcation function in the code section at the end of this question.
Questions
Why do my panel outputs not match those in the figure? I assume I can't try to reproduce the bifurcation diagram from the textbook without my panels matching the output in the textbook.
I have tried to implement a function to produce a bifurcation diagram, but the output looks really strange. Am I misunderstanding something about the bifurcation diagram?
NOTE: I receive no warnings/errors during code execution.
Code to Reproduce my Figures
from typing import Callable, Dict, List, Optional, Any
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
def seasonal_seir(y: List, t: List, params: Dict[str, Any]):
    """Seasonally forced SEIR model.

    Function parameters must match those required
    by `scipy.integrate.odeint`.

    Args:
        y: Initial conditions.
        t: Timesteps over which numerical solution will be computed.
        params: Dict with the following key-value pairs:
            beta_zero -- Average transmission rate.
            beta_one -- Amplitude of seasonal forcing.
            omega -- Period of forcing.
            mu -- Natural mortality rate.
            sigma -- Latent period for infection.
            gamma -- Recovery from infection term.

    Returns:
        Tuple whose components are the derivatives of the
        susceptible, exposed, and infected state variables
        w.r.t. time.

    References:
        [SEIR Python Program from Textbook](http://homepages.warwick.ac.uk/~masfz/ModelingInfectiousDiseases/Chapter2/Program_2.6/Program_2_6.py)
        [Seasonally Forced SIR Program from Textbook](http://homepages.warwick.ac.uk/~masfz/ModelingInfectiousDiseases/Chapter5/Program_5.1/Program_5_1.py)
    """
    beta_zero = params['beta_zero']
    beta_one = params['beta_one']
    omega = params['omega']
    mu = params['mu']
    sigma = params['sigma']
    gamma = params['gamma']

    s, e, i = y
    beta = beta_zero*(1 + beta_one*np.cos(omega*t))
    sdot = mu - (beta*i + mu)*s
    edot = beta*s*i - (mu + sigma)*e
    idot = sigma*e - (mu + gamma)*i
    return sdot, edot, idot
def plot_panels(
        model: Callable,
        model_params: Dict,
        panel_param_space: List,
        panel_param_name: str,
        initial_conditions: List,
        timesteps: List,
        odeint_kwargs: Optional[Dict] = dict(),
        x_ticks: Optional[List] = None,
        time_slice: Optional[slice] = None,
        state_var_ix: Optional[int] = None,
        log_scale: bool = False):
    """Plot panels that are samples of the parameter space for bifurcation.

    Args:
        model: Function that models the dynamical system. Returns dydt.
        model_params: Dict whose key-value pairs are the names
            of parameters in a given model and the values of those parameters.
        panel_param_space: List of values of the varied panel parameter.
        panel_param_name: The name of the varied panel parameter.
        initial_conditions: Initial conditions for numerical integration.
        timesteps: Timesteps for numerical integration.
        odeint_kwargs: Keyword args for numerical integration.
        x_ticks: Tick values for the x-axis of each panel.
        time_slice: Restrict the plot to a subset of all solutions
            in the numerical integration timestep space.
        state_var_ix: State variable in solutions to use for plot.
        log_scale: Flag to natural-log scale the solutions.

    Returns:
        Figure and axes tuple.
    """
    # Set default ticks
    if x_ticks is None:
        x_ticks = timesteps

    # Create figure
    fig, axs = plt.subplots(ncols=len(panel_param_space))

    # For each parameter that is varied for a given panel
    # compute numerical solutions and plot
    for ix, panel_param in enumerate(panel_param_space):
        # Update model parameters with the varied parameter
        model_params[panel_param_name] = panel_param

        # Compute solutions
        solutions = odeint(
            model,
            initial_conditions,
            timesteps,
            args=(model_params,),
            **odeint_kwargs)

        # If there is a particular solution of interest, index it;
        # otherwise squeeze last dimension so that [T, 1] --> [T]
        # where T is the max number of timesteps
        if state_var_ix is not None:
            solutions = solutions[:, state_var_ix]
        elif state_var_ix is None and solutions.shape[-1] == 1:
            solutions = np.squeeze(solutions)
        else:
            raise ValueError(
                f'solutions to model are rank-2 tensor of shape {solutions.shape}'
                ' with the second dimension greater than 1. You must pass'
                ' a value to :param state_var_ix:')

        # Slice the solutions based on the desired time range
        if time_slice is not None:
            solutions = solutions[time_slice]

        # Natural log scale the results
        if log_scale:
            solutions = np.log(solutions)

        # Plot the results
        axs[ix].plot(x_ticks, solutions)

    return fig, axs
def plot_bifurcation(
        model: Callable,
        model_params: Dict,
        bifurcation_parameter_space: List,
        bifurcation_param_name: str,
        initial_conditions: List,
        timesteps: List,
        odeint_kwargs: Optional[Dict] = dict(),
        state_var_ix: Optional[int] = None,
        time_slice: Optional[slice] = None,
        log_scale: bool = False):
    """Plot a bifurcation diagram of a state variable from a dynamical system.

    Args:
        model: Function that models the system. Returns dydt.
        model_params: Dict whose key-value pairs are the names
            of parameters in a given model and the values of those parameters.
        bifurcation_parameter_space: List of varied bifurcation parameters.
        bifurcation_param_name: The name of the bifurcation parameter.
        initial_conditions: Initial conditions for numerical integration.
        timesteps: Timesteps for numerical integration.
        odeint_kwargs: Keyword args for numerical integration.
        state_var_ix: State variable in solutions to use for plot.
        time_slice: Restrict the bifurcation plot to a subset of all
            solutions in the numerical integration timestep space.
        log_scale: Flag to natural-log scale the solutions.

    Returns:
        Figure and axes tuple.
    """
    # Track the solutions for each parameter
    parameter_x_time_matrix = []

    # Iterate through parameters
    for param in bifurcation_parameter_space:
        # Update the parameter dictionary for the model
        model_params[bifurcation_param_name] = param

        # Compute the solutions to the model using the
        # dictionary of parameters (including the bifurcation parameter)
        solutions = odeint(
            model,
            initial_conditions,
            timesteps,
            args=(model_params,),
            **odeint_kwargs)

        # If there is a particular solution of interest, index it;
        # otherwise squeeze last dimension so that [T, 1] --> [T]
        # where T is the max number of timesteps
        if state_var_ix is not None:
            solutions = solutions[:, state_var_ix]
        elif state_var_ix is None and solutions.shape[-1] == 1:
            solutions = np.squeeze(solutions)
        else:
            raise ValueError(
                f'solutions to model are rank-2 tensor of shape {solutions.shape}'
                ' with the second dimension greater than 1. You must pass'
                ' a value to :param state_var_ix:')

        # Update the parent list of solutions for this particular
        # bifurcation parameter
        parameter_x_time_matrix.append(solutions)

    # Cast to numpy array
    parameter_x_time_matrix = np.array(parameter_x_time_matrix)

    # Transpose: bifurcation plots function output vs. parameter.
    # This ensures that each row in the matrix is the solution
    # to a particular state variable in the system of ODEs
    # at a timestep t,
    # and each column is that solution for a particular value of
    # the (varied) bifurcation parameter of interest
    time_x_parameter_matrix = np.transpose(parameter_x_time_matrix)

    # Slice the iterations to display to a smaller range
    if time_slice is not None:
        time_x_parameter_matrix = time_x_parameter_matrix[time_slice]

    # Make bifurcation plot
    fig, ax = plt.subplots()

    # For the solutions vector at each timestep, plot the bifurcation.
    # NOTE: The elements of the solutions vector represent the
    # numerical solutions at timestep t for all varied parameters
    # in the parameter space, e.g.,
    # t     beta1=0.025    beta1=0.030   ....   beta1=0.30
    # 0     solution00     solution01    ....   solution0P
    for sol_at_time_t_for_all_params in time_x_parameter_matrix:
        if log_scale:
            sol_at_time_t_for_all_params = np.log(sol_at_time_t_for_all_params)
        ax.plot(
            bifurcation_parameter_space,
            sol_at_time_t_for_all_params,
            ',k',
            alpha=0.25)

    return fig, ax
# Define initial conditions based on figure
s0 = 6e-2
e0 = i0 = 1e-3
initial_conditions = [s0, e0, i0]
# Define model parameters based on figure
# NOTE: omega is not mentioned in the figure, but
# omega is defined elsewhere as 2pi/365
days_per_year = 365
mu = 0.02/days_per_year
beta_zero = 1250
sigma = 1/8
gamma = 1/5
omega = 2*np.pi / days_per_year
model_params = dict(
    beta_zero=beta_zero,
    omega=omega,
    mu=mu,
    sigma=sigma,
    gamma=gamma)
# Define timesteps
nyears = 200
ndays = nyears * days_per_year
timesteps = np.arange(1, ndays + 1, 1)
# Define different levels of seasonality (from figure)
beta_ones = [0.025, 0.05, 0.25]
# Define the time range to actually show on the plot
min_year = 190
max_year = 200
# Create a slice of the iterations to display on the diagram
time_slice = slice(min_year*days_per_year, max_year*days_per_year)
# Get the xticks to display on the plot based on the time slice
x_ticks = timesteps[time_slice]/days_per_year
# Plot the panels using the infected state variable ix
infection_ix = 2
# Plot the panels
panel_fig, panel_ax = plot_panels(
    model=seasonal_seir,
    model_params=model_params,
    panel_param_space=beta_ones,
    panel_param_name='beta_one',
    initial_conditions=initial_conditions,
    timesteps=timesteps,
    odeint_kwargs=dict(hmax=5),
    x_ticks=x_ticks,
    time_slice=time_slice,
    state_var_ix=infection_ix,
    log_scale=False)
# Label the panels
panel_fig.suptitle('Attempt to Reproduce Panels from Keeling 2007')
panel_fig.supxlabel('Time (years)')
panel_fig.supylabel('Fraction Infected')
panel_fig.set_size_inches(15, 8)
# Plot bifurcation
bi_fig, bi_ax = plot_bifurcation(
    model=seasonal_seir,
    model_params=model_params,
    bifurcation_parameter_space=np.linspace(0.025, 0.3),
    bifurcation_param_name='beta_one',
    initial_conditions=initial_conditions,
    timesteps=timesteps,
    odeint_kwargs={'hmax': 5},
    state_var_ix=infection_ix,
    time_slice=time_slice,
    log_scale=False)
# Label the bifurcation
bi_fig.suptitle('Attempt to Reproduce Bifurcation Diagram from Keeling 2007')
bi_fig.supxlabel(r'$\beta_1$')
bi_fig.supylabel('Fraction Infected')
bi_fig.set_size_inches(15, 8)
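Written out from the code above (my restatement, so it can be checked against the textbook's equations), the ODE system implemented in seasonal_seir is:

\begin{aligned}
\beta(t) &= \beta_0\,\bigl(1 + \beta_1\cos(\omega t)\bigr),\\
\frac{dS}{dt} &= \mu - \bigl(\beta(t)\,I + \mu\bigr)S,\\
\frac{dE}{dt} &= \beta(t)\,S\,I - (\mu + \sigma)\,E,\\
\frac{dI}{dt} &= \sigma E - (\mu + \gamma)\,I.
\end{aligned}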
The answer to this question is here on the Computational Science Stack Exchange. All credit to Lutz Lehmann.
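The linked answer is not reproduced here. As a general illustration only (not necessarily Lehmann's exact approach), bifurcation diagrams for seasonally forced epidemic models are usually built by sampling the trajectory once per forcing period (here, once per year) after discarding a transient, rather than plotting every timestep. A minimal sketch of that idea, assuming seasonal_seir, model_params, initial_conditions, and days_per_year from the question are in scope:

import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint

transient_years, sample_years = 150, 50
t = np.arange(0, (transient_years + sample_years)*days_per_year + 1, 1.0)

fig, ax = plt.subplots()
for beta_one in np.linspace(0.025, 0.3, 100):
    params = dict(model_params, beta_one=beta_one)
    sol = odeint(seasonal_seir, initial_conditions, t, args=(params,), hmax=5)
    # Sample I(t) once per year, after the transient has been discarded
    yearly_ix = np.arange(transient_years, transient_years + sample_years)*days_per_year
    ax.plot([beta_one]*len(yearly_ix), sol[yearly_ix, 2], ',k', alpha=0.5)
ax.set_xlabel(r'$\beta_1$')
ax.set_ylabel('Fraction infected (sampled yearly)')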

How to interpolate in n dimensions with different length axes

So I was looking at how to use scipy's interpn function, and the example they have on the documentation isn't quite working with what I need it to do.
My implementation is a bit different. I have a precomputed value array with shape [200,40,40,40] that I get from a different script.
So when I do something like:
t = np.linspace(0,1, 200)
x = np.linspace(0,1, 40)
y = np.linspace(0,1, 40)
z = np.linspace(0,1, 40)
points = (t,x,y,z)
interpn(points,values,point)
I get an error: "ValueError: There are 40 points and 200 values in dimension 0"
It seems as though the dimensions of my points tuple and value array are not lining up, but I thought since my "t" axis is first in the tuple, it should match. Any advice?
So this works for me:
import numpy as np
from scipy.interpolate import interpn
def f(x, y, z, t):
    '''Simple 3D + time dimensional function.'''
    return (np.sin(x) + y + np.sqrt(z))*t
t = np.linspace(0,1,200)
x = np.linspace(0,1,40)
y = np.linspace(0,1,40)
z = np.linspace(0,1,40)
points = (x,y,z,t)
values = f(*np.meshgrid(*points))
# example point in domain
point = [0,0.5,0.75,1/3.]
print(interpn(points, values, point))
array([0.44846267])
You defined x, y, z as np.linspace(0, 40, 1), which means you have a single point on the interval [0, 40]. The same goes for t. That's probably your error. The example is taken from the official scipy documentation.
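One caveat worth flagging (my addition, not part of the original answer): np.meshgrid defaults to 'xy' (Cartesian) indexing, which swaps the first two axes of the output grids, whereas interpn expects the value array's axes to follow the order of the points tuple ('ij', matrix indexing). The example above only gets away with it because x and y have the same length and range; for a general grid you would build the values like this:

# Assumption: points = (x, y, z, t) as defined above
values = f(*np.meshgrid(*points, indexing='ij'))  # axes of values now match the order of points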

fisher's linear discriminant in Python

I have Fisher's linear discriminant, and I need to use it to reduce my examples A and B, which are high-dimensional matrices, to simply 2D, exactly like LDA. Each example has classes A and B, so if I had a third example it would also have classes A and B, and a fourth, fifth, and nth example would always have classes A and B; I would like to separate them with a simple use of Fisher's linear discriminant. I'm pretty new to machine learning, so I don't know how to separate my classes; I've been following the formula by eye and coding as I go. From what I was reading, I need to apply a linear transformation to my data so I can find a good threshold for it, but first I need to find the maximization function. For that task I managed to find Sw and Sb, but I don't know how to go from there...
This is where I also need to find the maximization function.
That maximization function gives me an eigenvalue solution:
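The equation images from the original post are not included here; the standard Fisher criterion and the eigenvalue problem it leads to (which those images presumably showed) are:

J(w) = \frac{w^{\top} S_B\, w}{w^{\top} S_W\, w},
\qquad
S_W^{-1} S_B\, w = \lambda\, w .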
What I have for each class are 5x2 matrices from 2 examples. For instance:
Example 1
Class_A = [
201, 103,
40, 43,
23, 50,
12, 123,
99, 78
]
Class_B = [
201, 129,
114, 195,
180, 90,
69, 62,
76, 90
]
Example 2
Class_A = [
68, 98,
201, 203,
78, 212,
49, 5,
204, 78
]
Class_B = [
52, 19,
220, 219,
159, 195,
99, 23,
46, 50
]
I tried finding Sw for the example above like this:
Example_1_Class_A = np.dot(Example_1_Class_A, np.transpose(Example_1_Class_A))
Example_1_Class_B = np.dot(Example_1_Class_B, np.transpose(Example_1_Class_B))
Example_2_Class_A = np.dot(Example_2_Class_A, np.transpose(Example_2_Class_A))
Example_2_Class_B = np.dot(Example_2_Class_B, np.transpose(Example_2_Class_B))
Sw = np.sum([Example_1_Class_A, Example_1_Class_B, Example_2_Class_A, Example_2_Class_B], axis=0)
As for Sb, I tried this:
Example_1_Class_A_mean = Example_1_Class_A.mean(axis=0)
Example_1_Class_B_mean = Example_1_Class_B.mean(axis=0)
Example_2_Class_A_mean = Example_2_Class_A.mean(axis=0)
Example_2_Class_B_mean = Example_2_Class_B.mean(axis=0)
Example_1_Class_A_Sb = np.dot(Example_1_Class_A_mean, np.transpose(Example_1_Class_A_mean))
Example_1_Class_B_Sb = np.dot(Example_1_Class_B_mean, np.transpose(Example_1_Class_B_mean))
Example_2_Class_A_Sb = np.dot(Example_2_Class_A_mean, np.transpose(Example_2_Class_A_mean))
Example_2_Class_B_Sb = np.dot(Example_2_Class_B_mean, np.transpose(Example_2_Class_B_mean))
Sb = np.sum([Example_1_Class_A_Sb, Example_1_Class_B_Sb, Example_2_Class_A_Sb, Example_2_Class_B_Sb], axis=0)
The problem is, I have no idea what else to do with my Sw and Sb; I am completely lost. Basically, what I need to do is get from here to this:
How, for given Example A and Example B, do I separate a cluster only for class A and only for class B?
Before answering your question, I will first touch on the basic difference between PCA and (F)LDA. In PCA you don't know anything about the underlying classes, but you assume that the information about class separability lies in the variance of the data. So you rotate your original axes (sometimes this is called projecting all the data onto new ones) in such a way that your first new axis points in the direction of most variance, the second one is perpendicular to the first and points in the direction of most residual variance, and so on. This way a PCA transformation results in a (sub)space of the same dimensionality as the original one. Then you can take only the first 2 dimensions, rejecting the rest, and hence get a dimensionality reduction from k dimensions to only 2.
LDA works a bit differently. In this case you know in advance how many classes there are in your data, and you can find their mean and covariance matrices. What the Fisher criterion does is find a direction in which the separation between class means is maximized, while at the same time total variability is minimized (total variability is the mean of the within-class covariance matrices). And for each pair of classes there is only one such line. This is why when your data has C classes, LDA can provide you with at most C-1 dimensions, regardless of the original data dimensionality. In your case this means that as you have only 2 classes, A and B, you will get a one-dimensional projection, i.e. a line. And this is exactly what you have in your picture: the original 2D data is projected onto a line. The direction of the line is the solution of the eigenproblem.
Let's generate data that is similar to your picture:
import numpy as np
import matplotlib.pyplot as plt

a = np.random.multivariate_normal((1.5, 3), [[0.5, 0], [0, .05]], 30)
b = np.random.multivariate_normal((4, 1.5), [[0.5, 0], [0, .05]], 30)
plt.plot(a[:,0], a[:,1], 'b.', b[:,0], b[:,1], 'r.')

mu_a, mu_b = a.mean(axis=0).reshape(-1,1), b.mean(axis=0).reshape(-1,1)
Sw = np.cov(a.T) + np.cov(b.T)
inv_S = np.linalg.inv(Sw)
res = inv_S.dot(mu_a-mu_b)  # the trick
####
# more general solution
#
# Sb = (mu_a-mu_b)*((mu_a-mu_b).T)
# eig_vals, eig_vecs = np.linalg.eig(inv_S.dot(Sb))
# res = sorted(zip(eig_vals, eig_vecs), reverse=True)[0][1] # take only the eigenvec corresponding to the largest (and only) eigenvalue
# res = res / np.linalg.norm(res)

plt.plot([-res[0], res[0]], [-res[1], res[1]])  # this is the solution
plt.plot(mu_a[0], mu_a[1], 'cx')
plt.plot(mu_b[0], mu_b[1], 'yx')
plt.gca().axis('square')

# let's project the data points onto it
r = res.reshape(2,)
n2 = np.linalg.norm(r)**2
for pt in a:
    prj = r * r.dot(pt) / n2
    plt.plot([prj[0], pt[0]], [prj[1], pt[1]], 'b.:', alpha=0.2)
for pt in b:
    prj = r * r.dot(pt) / n2
    plt.plot([prj[0], pt[0]], [prj[1], pt[1]], 'r.:', alpha=0.2)
The resulting projection is calculated using a neat trick for the two-class problem. You can read details on it here in section 1.6.
Regarding the "examples" you mention in your question: I believe you need to repeat the process for each example, as each is a different set of data points, probably with different distributions. Also, note that the estimated means (mu_a, mu_b) and class covariance matrices will be slightly different from the ones the data was generated with, especially for small sample sizes.
Mathematics
See https://sebastianraschka.com/Articles/2014_python_lda.html#lda-in-5-steps for more information.
Implementation using Iris
Since you want to use LDA for dimensionality reduction but provide only 2D data, I will show how to perform this procedure on the iris dataset.
Let's import libraries
import pandas as pd
import numpy as np
import sklearn as sk
from collections import Counter
from sklearn import datasets
# load dataset and transform to pandas df
X, y = datasets.load_iris(return_X_y=True)
X = pd.DataFrame(X, columns=[f'feat_{i}' for i in range(4)])
y = pd.DataFrame(y, columns=['labels'])
tot = pd.concat([X,y], axis=1)
# calculate class means
class_means = tot.groupby('labels').mean()
total_mean = X.mean()
The class_means are given by:
class_means
feat_0 feat_1 feat_2 feat_3
labels
0 5.006 3.428 1.462 0.246
1 5.936 2.770 4.260 1.326
2 6.588 2.974 5.552 2.026
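For reference (the equation images from the source article are not included here), the within-class and between-class scatter matrices computed below are:

S_W = \sum_{i=1}^{c} \sum_{x \in D_i} (x - m_i)(x - m_i)^{\top},
\qquad
S_B = \sum_{i=1}^{c} N_i\,(m_i - m)(m_i - m)^{\top},

where m_i is the mean of class i, m is the overall mean, and N_i is the number of observations in class i (here N_i = 50).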
To do this, we first subtract the corresponding class mean from each observation (basically we calculate x - m_i from the equation above):
x_mi = tot.transform(lambda x: x - class_means.loc[x['labels']], axis=1).drop('labels', 1)
def kronecker_and_sum(df, weights):
    S = np.zeros((df.shape[1], df.shape[1]))
    for idx, row in df.iterrows():
        x_m = row.values.reshape(df.shape[1], 1)
        S += weights[idx]*np.dot(x_m, x_m.T)
    return S
# Each x_mi is weighted with 1. Now we use the kronecker_and_sum function to calculate the within-class scatter matrix S_w
S_w = kronecker_and_sum(x_mi, 150*[1])
mi_m = class_means.transform(lambda x: x - total_mean, axis=1)
# Each mi_m is weighted with the number of observations per class which is 50 for each class in this example. We use kronecker_and_sum to calculate the between-class scatter matrix.
S_b=kronecker_and_sum(mi_m, 3*[50])
eig_vals, eig_vecs = np.linalg.eig(np.linalg.inv(S_w).dot(S_b))
We only need to consider the eigenvalues that are markedly different from zero (in this case only the first two):
eig_vals
array([ 3.21919292e+01, 2.85391043e-01, 6.53468167e-15, -2.24877550e-15])
Transform X with the matrix of the two eigenvectors which correspond to the highest eigenvalues
W = eig_vecs[:, :2]
X_trafo = np.dot(X, W)
tot_trafo = pd.concat([pd.DataFrame(X_trafo, index=range(len(X_trafo))), y], 1)
# plot the result
tot_trafo.plot.scatter(x=0, y=1, c='labels', colormap='viridis')
We have reduced the dimensions from 4 to 2 and chosen the space in such a way that the classes can be well separated.
Scikit-learn usage
Scikit-learn has LDA support as well. What we did in dozens of lines can be done with the following lines of code:
from sklearn import discriminant_analysis
lda = discriminant_analysis.LinearDiscriminantAnalysis(n_components=2)
X_trafo_sk = lda.fit_transform(X,y)
pd.DataFrame(np.hstack((X_trafo_sk, y))).plot.scatter(x=0, y=1, c=2, colormap='viridis')
I'm not giving a plot here, because it is the same as in our derived example (except for a 180-degree rotation).

How to pass coordinates to arviz / pymc3 function plot_posterior (similar to xarray.Dataset.sel)

I'm doing some Bayesian modelling in pymc3 and would like to plot the posterior distribution using plot_posterior (which comes from the arviz package). The resulting plot is awkwardly misaligned on the horizontal axis, and I would like to shift it over to be plotted precisely between -3 and +3. Unfortunately I can't work out what I should pass to the function to specify this.
The documentation for arviz.plot_posterior specifies the argument "coords" has the definition "Coordinates of var_names to be plotted. Passed to Dataset.sel" Presumably this is what I need to specify the range for the horizontal axis, but it doesn't tell me what sort of value it expects.
I have checked the documentation for Dataset.sel and it states that the first argument it expects is "A dict with keys matching dimensions and values given by scalars, slices or arrays of tick labels." My interpretation of this is that the keys are strings matching the name(s) of variable(s) and the values are some iterable structure of tickmarks.
My variable is called 'm' and was produced from the following code:
with pymc3.Model() as m1:
    m = pymc3.Normal('m', mu = 0, sigma = 1)
    obs = pymc3.Normal('obs', mu = m, sigma = 1, observed = numpy.random.randn(3))
    trace = pymc3.sample(1000, tune = 500, cores = 1)
My guess at what plot_posterior expects is this:
plot_posterior(trace, coords = {'m': [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]})
It gave me the error "ValueError: dimensions or multi-index levels ['m'] do not exist"
Presumably I'm on the right track but I just can't dig up any more precise definition of what arguments this function needs. Thanks for any help you can provide.
edit: I've worked out how to extend the axes themselves (the trick is ax = mpl.pyplot.axes(xlim = (-3.0, 3.0))) but I still don't know how to extend the plotting of the variable itself.
This is actually something you can go straight to matplotlib for: pm.plot_posterior will return an axis, which has getters and setters for most display attributes:
ax, = pymc3.plot_posterior(trace)
ax.set_xlim(-3, 3)
The coords argument is for multi-dimensional random variables. If your model looked like this:
with pymc3.Model() as m1:
    m = pymc3.Normal('m', mu = 0, sigma = 1, shape = 4)
    obs = pymc3.Normal('obs', mu = m, sigma = 1, observed = numpy.random.randn(3, 4))
    trace = pymc3.sample(1000, tune = 500, cores = 1)
then your plot will have all 4 dimensions of m:
pymc3.plot_posterior(trace)
You can use coords to cut that down:
pymc3.plot_posterior(trace, coords={'m_dim_0': [0, 2]})
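For completeness (my addition, assuming the arviz/pymc3 API of that era): coords is forwarded to xarray's Dataset.sel on the posterior, so the keys are dimension names such as m_dim_0, not ranges of variable values. Roughly:

import arviz as az

with m1:
    idata = az.from_pymc3(trace)       # convert the trace to InferenceData

print(idata.posterior['m'].dims)       # ('chain', 'draw', 'm_dim_0')
subset = idata.posterior.sel(m_dim_0=[0, 2])   # what coords={'m_dim_0': [0, 2]} selects
az.plot_posterior(idata, var_names=['m'], coords={'m_dim_0': [0, 2]})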

How to define General deterministic function in PyMC

In my model, I need to obtain the value of my deterministic variable from a set of parent variables using a complicated python function.
Is it possible to do that?
Following is pyMC3 code which shows what I am trying to do in a simplified case.
import numpy as np
import pymc as pm

#Predefine values on two parameter Grid (x,w) for a set of i values (1,2,3)
idata = np.array([1,2,3])
size= 20
gridlength = size*size
Grid = np.empty((gridlength,2+len(idata)))
for x in range(size):
    for w in range(size):
        # A silly version of my real model evaluated on grid.
        Grid[x*size+w,:]= np.array([x,w]+[(x**i + w**i) for i in idata])

# A function to find the nearest value in Grid and return its product with third variable z
def FindFromGrid(x,w,z):
    return Grid[int(x)*size+int(w),2:] * z

#Generate fake Y data with error
yerror = np.random.normal(loc=0.0, scale=9.0, size=len(idata))
ydata = Grid[16*size+12,2:]*3.6 + yerror # ie. True x= 16, w= 12 and z= 3.6

with pm.Model() as model:

    #Priors
    x = pm.Uniform('x',lower=0,upper= size)
    w = pm.Uniform('w',lower=0,upper =size)
    z = pm.Uniform('z',lower=-5,upper =10)

    #Expected value
    y_hat = pm.Deterministic('y_hat',FindFromGrid(x,w,z))

    #Data likelihood
    ysigmas = np.ones(len(idata))*9.0
    y_like = pm.Normal('y_like',mu= y_hat, sd=ysigmas, observed=ydata)

    # Inference...
    start = pm.find_MAP() # Find starting value by optimization
    step = pm.NUTS(state=start) # Instantiate MCMC sampling algorithm
    trace = pm.sample(1000, step, start=start, progressbar=False) # draw 1000 posterior samples using NUTS sampling

print('The trace plot')
fig = pm.traceplot(trace, lines={'x': 16, 'w': 12, 'z':3.6})
fig.show()
When I run this code, I get an error at the y_hat stage, because the int() function inside FindFromGrid(x, w, z) needs an integer, not a FreeRV.
Finding y_hat from a pre-calculated grid is important because my real model for y_hat does not have an analytical form to express.
I earlier tried to use OpenBUGS, but I found out here that it is not possible to do this in OpenBUGS. Is it possible in PyMC?
Update
Based on an example in pyMC github page, I found I need to add the following decorator to my FindFromGrid(x,w,z) function.
@pm.theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
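Not in the original post, but a minimal sketch of what the decorated function might look like, assuming theano.tensor is imported as t:

import theano.tensor as t

@pm.theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
def FindFromGrid(x, w, z):
    # inside as_op the inputs arrive as plain numpy values, so int() works here
    return Grid[int(x)*size + int(w), 2:] * z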
This seems to solve the above-mentioned issue, but I cannot use the NUTS sampler anymore since it needs gradients.
Metropolis does not seem to be converging.
Which step method should I use in a scenario like this?
You found the correct solution with as_op.
Regarding the convergence: are you using pm.Metropolis() instead of pm.NUTS() by any chance? One reason this might not be converging is that Metropolis() by default samples in the joint space, while Gibbs-within-Metropolis is often more effective (and this was the default in pymc2). Having said that, I just merged this: https://github.com/pymc-devs/pymc/pull/587 which changes the Metropolis and Slice samplers to be non-blocked (i.e. within-Gibbs) by default. Other samplers like NUTS that are primarily designed to sample the joint space still default to blocked. You can always explicitly set this with the kwarg blocked=True.
Anyway, update pymc with the most recent master and see if convergence improves. If not, try the Slice sampler.
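A minimal sketch of how the step method could be set explicitly (my illustration, using the blocked kwarg mentioned above, inside the model context from the question):

with model:
    # non-blocked (Gibbs-style, one variable at a time) Metropolis updates
    step = pm.Metropolis([x, w, z], blocked=False)
    # or, if mixing is still poor, try the Slice sampler instead:
    # step = pm.Slice([x, w, z])
    trace = pm.sample(5000, step=step, progressbar=False)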
