I am trying to optimize over lambda to maximize an objective (the equation image is omitted here), where Z_A is {|0><0|, |1><1|}, Gamma is {Id_4, |01><01|, |10><10|}, and gamma is [1, 0.5, 0.5]. I convert these quantum objects to cp.bmat objects using:
Gamma_cp = np.array([cp.bmat([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]),
                     cp.bmat([[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]]),
                     cp.bmat([[0.5, 0, 0, -0.5], [0, 0.5, -0.5, 0], [0, -0.5, 0.5, 0], [-0.5, 0, 0, 0.5]])])
Z_A_cp = np.array([cp.bmat([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]),
                   cp.bmat([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])])
Identity4 = cp.bmat([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
The objective I want to optimize is coded as:
def Theta_total_cvx(Q):
    gamma = np.array([1, Q, Q])
    l = cp.Variable(shape=len(Gamma_cp))
    r = cp.exp(Identity4 - sum([l[i] * Gamma_cp[i] for i in range(3)]))
    obj = cp.Maximize(-cp.lambda_max(sum(Z_A_cp[i] @ r @ Z_A_cp[i] for i in range(2)) - np.dot(l, gamma)))
    constraints = []
    results = cp.Problem(obj, constraints)
    results.solve(verbose=True)
    return results.value / np.log(2) - h(Q)
The problem is that Python has been compiling the objective for more than 10,000 minutes and is still running.
Is there anything wrong with this code?
I have been working with the following link,
Fitting empirical distribution to theoretical ones with Scipy (Python)?
I ran my data through the code from that link and found that the best-fitting distribution for my data is the Non-Central Student's T distribution. I couldn't find this distribution in the pymc3 package, so I decided to have a look at scipy to understand how the distribution is formed. I created a custom distribution and I have a few questions:
I would like to know if my approach to creating the distribution is right?
How can I implement the custom distribution into my models?
Regarding the priors, do I use the same approach as for normal distribution priors (mu and sigma), combined with HalfNormal priors for the degrees of freedom and the noncentrality value?
My custom distribution:
import numpy as np
import theano.tensor as tt
from scipy import stats
from scipy.special import hyp1f1, nctdtr
import warnings
from pymc3.theanof import floatX
from pymc3.distributions.dist_math import bound, gammaln
from pymc3.distributions.continuous import assert_negative_support, get_tau_sigma
from pymc3.distributions.distribution import Continuous, draw_values, generate_samples
class NonCentralStudentT(Continuous):
"""
Parameters
----------
nu: float
Degrees of freedom, also known as normality parameter (nu > 0).
mu: float
Location parameter.
sigma: float
Scale parameter (sigma > 0). Converges to the standard deviation as nu increases. (only required if lam is not specified)
lam: float
Scale parameter (lam > 0). Converges to the precision as nu increases. (only required if sigma is not specified)
nc: float
Noncentrality parameter.
"""
def __init__(self, nu, nc, mu=0, lam=None, sigma=None, sd=None, *args, **kwargs):
super().__init__(*args, **kwargs)
if sd is not None:
sigma = sd
warnings.warn("sd is deprecated, use sigma instead", DeprecationWarning)
self.nu = nu = tt.as_tensor_variable(floatX(nu))
self.nc = nc = tt.as_tensor_variable(floatX(nc))
lam, sigma = get_tau_sigma(tau=lam, sigma=sigma)
self.lam = lam = tt.as_tensor_variable(lam)
self.sigma = self.sd = sigma = tt.as_tensor_variable(sigma)
self.mean = self.median = self.mode = self.mu = mu = tt.as_tensor_variable(mu)
self.variance = tt.switch((nu > 2) * 1, (1 / self.lam) * (nu / (nu - 2)), np.inf)
assert_negative_support(lam, 'lam (sigma)', 'NonCentralStudentT')
assert_negative_support(nu, 'nu', 'NonCentralStudentT')
assert_negative_support(nc, 'nc', 'NonCentralStudentT')
def random(self, point=None, size=None):
"""
Draw random values from Non-Central Student's T distribution.
Parameters
----------
point: dict, optional
Dict of variable values on which random values are to be
conditioned (uses default point if not specified).
size: int, optional
Desired size of random sample (returns one sample if not
specified).
Returns
-------
array
"""
nu, nc, mu, lam = draw_values([self.nu, self.nc, self.mu, self.lam], point=point, size=size)
return generate_samples(stats.nct.rvs, nu, nc, loc=mu, scale=lam ** -0.5, dist_shape=self.shape, size=size)
def logp(self, value):
"""
Calculate log-probability of Non-Central Student's T distribution at specified value.
Parameters
----------
value: numeric
Value(s) for which log-probability is calculated. If the log probabilities for multiple
values are desired the values must be provided in a numpy array or theano tensor
Returns
-------
TensorVariable
"""
nu = self.nu
nc = self.nc
mu = self.mu
lam = self.lam
n = nu * 1.0
nc = nc * 1.0
x2 = value * value
ncx2 = nc * nc * x2
fac1 = n + x2
trm1 = n / 2. * tt.log(n) + gammaln(n + 1)
trm1 -= n * tt.log(2) + nc * nc / 2. + (n / 2.) * tt.log(fac1) + gammaln(n / 2.)
Px = tt.exp(trm1)
valF = ncx2 / (2 * fac1)
trm1 = tt.sqrt(2) * nc * value * hyp1f1(n / 2 + 1, 1.5, valF)
trm1 /= np.asarray(fac1 * tt.gamma((n + 1) / 2))
trm2 = hyp1f1((n + 1) / 2, 0.5, valF)
trm2 /= np.asarray(np.sqrt(fac1) * tt.gamma(n / 2 + 1))
Px *= trm1 + trm2
return bound(Px, lam > 0, nu > 0, nc > 0)
def logcdf(self, value):
"""
Compute the log of the cumulative distribution function for Non-Central Student's T distribution
at the specified value.
Parameters
----------
value: numeric
Value(s) for which log CDF is calculated. If the log CDF for multiple
values are desired the values must be provided in a numpy array or theano tensor.
Returns
-------
TensorVariable
"""
nu = self.nu
nc = self.nc
return nctdtr(nu, nc, value)
My Custom model:
with pm.Model() as model:
# Prior Distributions for unknown model parameters:
mu = pm.Normal('mu', 0, 10)
sigma = pm.Normal('sigma', 0, 10)
nc= pm.HalfNormal('nc', sigma=10)
nu= pm.HalfNormal('nu', sigma=1)
# Observed data is from a Likelihood distributions (Likelihood (sampling distribution) of observations):
# => insert the custom distribution here; placeholder likelihood for now:
observed_data = pm.Beta('observed_data', alpha=alpha, beta=beta, observed=data)
# draw 5000 posterior samples
trace = pm.sample(draws=5000, tune=2000, chains=3, cores=1)
# Obtaining Posterior Predictive Sampling:
post_pred = pm.sample_posterior_predictive(trace, samples=3000)
print(post_pred['observed_data'].shape)
print('\nSummary: ')
print(pm.stats.summary(data=trace))
print(pm.stats.summary(data=post_pred))
Edit 1:
I redesigned the custom model to include the custom distribution. However, I keep getting errors from the equations used to compute the likelihood, or sometimes the tensor locks up and the code just freezes. Find my code below:
with pm.Model() as model:
# Prior Distributions for unknown model parameters:
mu = pm.Normal('mu', mu=0, sigma=1)
sd = pm.HalfNormal('sd', sigma=1)
nc = pm.HalfNormal('nc', sigma=10)
nu = pm.HalfNormal('nu', sigma=1)
# Custom distribution:
# observed_data = pm.DensityDist('observed_data', NonCentralStudentT, observed=data_list)
# Observed data is from a Likelihood distributions (Likelihood (sampling distribution) of observations):
observed_data = NonCentralStudentT('observed_data', mu=mu, sd=sd, nc=nc, nu=nu, observed=data_list)
# draw 5000 posterior samples
trace_S = pm.sample(draws=5000, tune=2000, chains=3, cores=1)
# Obtaining Posterior Predictive Sampling:
post_pred_S = pm.sample_posterior_predictive(trace_S, samples=3000)
print(post_pred_S['observed_data'].shape)
print('\nSummary: ')
print(pm.stats.summary(data=trace_S))
print(pm.stats.summary(data=post_pred_S))
Edit 2:
I have been looking online for a way to convert the function to Theano; the only definition of the function I could find is in the following GitHub link: hyp1f1 function GitHub.
Will this be enough to convert the function into Theano?
In addition, I have a question: is it okay to use NumPy arrays with Theano?
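My understanding is that plain NumPy arrays can enter a Theano graph as constants, but any function applied to tensor variables has to be a Theano op. So one workaround I am considering (my own sketch, not code from the linked GitHub repository) is to wrap the SciPy function as an op with as_op; it becomes callable on tensors, but it has no gradient, so NUTS could not use it:
import numpy as np
import theano
import theano.tensor as tt
from scipy.special import hyp1f1

# sketch only: wrap SciPy's hyp1f1 as a Theano op (no gradient available)
@theano.compile.ops.as_op(itypes=[tt.dscalar, tt.dscalar, tt.dscalar],
                          otypes=[tt.dscalar])
def hyp1f1_op(a, b, x):
    # inputs arrive as numpy scalars; return a float64 array as the op output
    return np.asarray(hyp1f1(a, b, x), dtype=np.float64)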
Also, I thought of another way, but I am not sure whether it can be implemented. I looked into the nct function in SciPy, and the documentation says the following:
If Y is a standard normal random variable and V is an independent
chi-square random variable ( chi2 ) with k degrees of freedom, then
X=(Y+c) / sqrt(V/k)
has a non-central Student’s t distribution on the real line. The
degrees of freedom parameter k (denoted df in the implementation)
satisfies k>0 and the noncentrality parameter c (denoted nc in the
implementation) is a real number.
The probability density above is defined in the “standardized” form.
To shift and/or scale the distribution use the loc and scale
parameters. Specifically, nct.pdf(x, df, nc, loc, scale) is
identically equivalent to nct.pdf(y, df, nc) / scale with y = (x -
loc) / scale .
So, I thought of defining the priors as normal and chi-squared random variables, using the degrees-of-freedom variable mentioned before in the code, and combining them through the equation given in the SciPy documentation. Would that be enough to obtain the distribution?
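To make the idea concrete, here is a minimal sketch of what I have in mind (my own guess, with hypothetical variable names); it builds a noncentral-t quantity as a deterministic transform, but as far as I can tell it does not by itself give a likelihood for observed data, which brings me back to needing the logp above:
import pymc3 as pm

# sketch of the reparameterization X = mu + sigma*(Y + nc)/sqrt(V/nu)
with pm.Model() as nct_sketch:
    mu = pm.Normal('mu', mu=0.0, sigma=1.0)
    sigma = pm.HalfNormal('sigma', sigma=1.0)
    nu = pm.HalfNormal('nu', sigma=10.0)      # degrees of freedom
    nc = pm.Normal('nc', mu=0.0, sigma=10.0)  # noncentrality parameter

    Y = pm.Normal('Y', mu=0.0, sigma=1.0)     # standard normal
    V = pm.ChiSquared('V', nu=nu)             # independent chi-squared with nu dof

    X = pm.Deterministic('X', mu + sigma * (Y + nc) / pm.math.sqrt(V / nu))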
Edit 3:
I managed to run the code from the link about fitting an empirical distribution and found that the second-best fit was the Student's T distribution, so I will be using that. Thank you for your help. I just have a side question: I ran my model with the Student's T distribution but got these warnings:
There were 52 divergences after tuning. Increase target_accept or
reparameterize. The acceptance probability does not match the target.
It is 0.7037574708196309, but should be close to 0.8. Try to increase
the number of tuning steps. The number of effective samples is smaller
than 10% for some parameters.
I am just confused about these warnings. Do you have any idea what they mean? I know this won't break my code, but can I reduce the divergences? And regarding the effective samples, do I need to increase the number of samples in the trace code?
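In case it matters, this is what I am considering trying based on the warning text (just my guess; passing target_accept directly to pm.sample works in recent PyMC3 versions):
# raise target_accept and give the sampler more tuning steps
trace = pm.sample(draws=5000, tune=4000, chains=3, cores=1,
                  target_accept=0.95)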
Here I aim to estimate the parameters (gama and omega) of a damped harmonic oscillator given by
d^2X/dt^2 + gama*dX/dt + (2*pi*omega)^2*X = 0.
(We can add white Gaussian noise to the system.)
import pymc
import numpy as np
import scipy.io as sio
import matplotlib.pyplot as plt;
from scipy.integrate import odeint
#import data
xdata = sio.loadmat('T.mat')['T'][0] #time
ydata1 = sio.loadmat('V1.mat')['V1'][0] # V2=dV1/dt, (X=V1),
ydata2 = sio.loadmat('V2.mat')['V2'][0] # dV2/dt=-(2pi*omega)^2*V1-gama*V2
#time span for solving the equations
npts= 500
dt=0.01
Tspan=5.0
time = np.linspace(0,Tspan,npts+1)
#initial condition
V0 = [1.0, 1.0]
# Priors for unknown model parameters
sigma = pymc.Uniform('sigma', 0.0, 100.0)
gama= pymc.Uniform('gama', 0.0, 20.0)
omega=pymc.Uniform('omega',0.0, 20.0)
#Solve the equations
@pymc.deterministic
def DHOS(gama=gama, omega=omega):
    V1 = np.zeros(npts+1)
    V2 = np.zeros(npts+1)
    V1[0] = V0[0]
    V2[0] = V0[1]
    for i in range(1, npts+1):
        V1[i] = V1[i-1] + dt*V2[i-1]
        V2[i] = V2[i-1] + dt*(-((2*np.pi*omega)**2)*V1[i-1] - gama*V2[i-1])
    return [V1, V2]
#or we can use odeint
#@pymc.deterministic
#def DHS( gama=gama, omega=omega):
# def DOS_func(y, time):
# V1, V2 = y[0], y[1]
# dV1dt = V2
# dV2dt = -((2*np.pi*omega)**2)* V1 -gama*V2
# dydt = [dV1dt, dV2dt]
# return dydt
# soln = odeint(DOS_func,V0, time)
# V1, V2 = soln[:,0], soln[:,1]
# return V1, V2
# value of outcome (observations)
V1 = pymc.Lambda('V1', lambda DHOS=DHOS: DHOS[0])
V2 = pymc.Lambda('V2', lambda DHOS=DHOS: DHOS[1])
# liklihood of observations
Yobs1 = pymc.Normal('Yobs1', mu=V1, tau=1.0/sigma**2, value=ydata1, observed=True)
Yobs2 = pymc.Normal('Yobs2', mu=V2, tau=1.0/sigma**2, value=ydata2, observed=True)
After saving the above code as DampedOscil_model.py, we can run PyMC as follows:
import pymc
import DampedOscil_model
MDL = pymc.MCMC(DampedOscil_model, db='pickle')
MDL.sample(iter=1e4, burn=1e2, thin=2)
gama_trace=MDL.trace('gama')[-1000:]
omega_trace=MDL.trace('omega')[-1000:]
gama=MDL.gama.value
omega=MDL.omega.value
And it works well (see the figure, omitted here): the true signal, constructed with gama_true=2.0 and omega_true=1.5, versus the estimated signal. The estimated parameter values are gama_est=2.04 and omega_est=1.49.
Now I would like to convert this code to PyMC3 in order to use NUTS and ADVI.
import matplotlib.pyplot as plt
import scipy.io as sio
import pandas as pd
import numpy as np
import pymc3 as pm
import theano.tensor as tt
import theano
from pymc3 import Model, Normal, HalfNormal, Uniform
from pymc3 import NUTS, find_MAP, sample, Slice, traceplot, summary
from pymc3 import Deterministic
from scipy.optimize import fmin_powell
#import data
xdata = sio.loadmat('T.mat')['T'][0] #time
ydata1 = sio.loadmat('V1.mat')['V1'][0] # V2=dV1/dt, (X=V1),
ydata2 = sio.loadmat('V2.mat')['V2'][0] # dV2/dt=-(2pi*omega)^2*V1-gama*V2
#time span for solving the equations
npts= 500
dt=0.01
Tspan=5.0
time = np.linspace(0,Tspan,npts+1)
niter=10000
burn=niter//2;
with pm.Model() as model:
#Priors for unknown model parameters
sigma = pm.HalfNormal('sigma', sd=1)
gama= pm.Uniform('gama', 0.0, 20.0)
omega=pm.Uniform('omega',0.0, 20.0)
#initial condition
V0 = [1.0, 1.0]
#Solve the equations
# do I need to use theano.tensor here?!
@theano.compile.ops.as_op(itypes=[tt.dscalar, tt.dscalar], otypes=[tt.dvector])
def DHOS(gama=gama, omega=omega):
    V1 = np.zeros(npts+1)
    V2 = np.zeros(npts+1)
    V1[0] = V0[0]
    V2[0] = V0[1]
    for i in range(1, npts+1):
        V1[i] = V1[i-1] + dt*V2[i-1]
        V2[i] = V2[i-1] + dt*(-((2*np.pi*omega)**2)*V1[i-1] - gama*V2[i-1])
    return V1, V2
V1 = pm.Deterministic('V1', DHOS[0])
V2 = pm.Deterministic('V2', DHOS[1])
start = pm.find_MAP(fmin=fmin_powell, disp=True)
step=pm.NUTS()
trace=pm.sample(niter, step, start=start, progressbar=False)
traceplot(trace);
Summary=pm.df_summary(trace[-1000:])
gama_trace = trace.get_values('gama', burn)
omega_trace = trace.get_values('omega', burn)
For this code I get the following error:
V1 = pm.Deterministic('V1', DHOS[0])
TypeError: 'FromFunctionOp' object does not support indexing
Briefly, I would like to know how I can convert the following part of the PyMC code to PyMC3.
@pymc.deterministic
def DOS(gama=gama, omega=omega):
    V1 = np.zeros(npts+1)
    V2 = np.zeros(npts+1)
    V1[0] = V0[0]
    V2[0] = V0[1]
    for i in range(1, npts+1):
        V1[i] = V1[i-1] + dt*V2[i-1]
        V2[i] = V2[i-1] + dt*(-((2*np.pi*omega)**2)*V1[i-1] - gama*V2[i-1])
    return [V1, V2]
V1 = pymc.Lambda('V1', lambda DOS=DOS: DOS[0])
V2 = pymc.Lambda('V2', lambda DOS=DOS: DOS[1])
The problem is, first, that the arguments of the Deterministic function are different in PyMC3 than in PyMC, and second, that there is no Lambda function in PyMC3.
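To illustrate the difference I mean with a toy example (not my actual model): in PyMC3 a deterministic node is a named tensor expression built from other nodes, rather than a callable as with pymc.Lambda.
import pymc3 as pm
import theano.tensor as tt

with pm.Model():
    gama = pm.Uniform('gama', 0.0, 20.0)
    # PyMC2:  decay = pymc.Lambda('decay', lambda g=gama: np.exp(-g))
    # PyMC3:  the tensor expression itself is passed to Deterministic
    decay = pm.Deterministic('decay', tt.exp(-gama))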
I would appreciate your help with solving ODEs in PyMC3 for parameter estimation tasks in biological systems (estimating the equation parameters from data).
Thanks a lot in advance for your help.
Kind Regards,
Meysam
I would suggest, and have successfully implemented, using a 'black box' method for interfacing with PyMC3. In this case that means calculating the log-likelihood yourself and then using PyMC3 to sample it. This requires writing your functions in a way that Theano and PyMC3 can interface with them.
This is outlined in a notebook on the PyMC3 page, which uses Cython as an example.
Here is a bit shorter sample of what needs to be done.
First you can load your data and set-up any parameters you need such as your time steps etc.
import pymc3 as pm
import numpy as np
import theano
import theano.tensor as tt
#import data
xdata = sio.loadmat('T.mat')['T'][0] #time
ydata1 = sio.loadmat('V1.mat')['V1'][0] # V2=dV1/dt, (X=V1),
ydata2 = sio.loadmat('V2.mat')['V2'][0] # dV2/dt=-(2pi*omega)^2*V1-gama*V2
#time span for solving the equations
npts= 500
dt=0.01
Tspan=5.0
time = np.linspace(0,Tspan,npts+1)
#initial condition
V0 = [1.0, 1.0]
Then you define a data-generating function just as before, but you don't need to use any decorators from PyMC for this. The output of this function should be whatever you need to compare to your data to calculate the likelihood.
def DHOS(theta):
gama,omega=theta
V1= np.zeros(npts+1)
V2= np.zeros(npts+1)
V1[0] = V0[0]
V2[0] = V0[1]
for i in range(1,npts+1):
V1[i]= V1[i-1] + dt*V2[i-1];
V2[i] = V2[i-1] + dt*(-((2*np.pi*omega)**2)*V1[i-1]-gama*V2[i-1]);
return [V1, V2]
Next you write a function that calls the previous function and calculates the likelihood using whatever distribution you want, in this case a normal distribution.
def my_loglike(theta, data, sigma):
    """
    A Gaussian log-likelihood function for a model with parameters given in theta
    """
    model = np.asarray(DHOS(theta))  # V1 and V2 from the DHOS function
    data = np.asarray(data)          # here data = [ydata1, ydata2] to compare with model
    # sigma is either the same shape as model or a scalar
    # which corresponds to the uncertainty on the data
    return -0.5*np.sum((data - model)**2/sigma**2)
From here you now have to define a Theano Op class so that it can interface with PyMC3.
# define a theano Op for our likelihood function
class LogLike(tt.Op):
"""
Specify what type of object will be passed and returned to the Op when it is
called. In our case we will be passing it a vector of values (the parameters
that define our model) and returning a single "scalar" value (the
log-likelihood)
"""
itypes = [tt.dvector] # expects a vector of parameter values when called
otypes = [tt.dscalar] # outputs a single scalar value (the log likelihood)
def __init__(self, loglike, data, sigma):
"""
Initialise the Op with various things that our log-likelihood function
requires. Below are the things that are needed in this particular
example.
Parameters
----------
loglike:
The log-likelihood (or whatever) function we've defined
data:
The "observed" data that our log-likelihood function takes in
sigma:
The noise standard deviation that our function requires.
"""
# add inputs as class attributes
self.likelihood = loglike
self.data = data
self.sigma = sigma
def perform(self, node, inputs, outputs):
# the method that is used when calling the Op
theta, = inputs # this will contain my variables
# call the log-likelihood function
logl = self.likelihood(theta, self.data, self.sigma)
outputs[0][0] = np.array(logl) # output the log-likelihood
Finally you can use PYMC3 to build your model and sample accordingly.
ndraws = 10000 # number of draws from the distribution
nburn = 1000 # number of "burn-in points" (which we'll discard)
# create our Op, passing in the observed data and the noise level
logl = LogLike(my_loglike, np.array([ydata1, ydata2]), 10)

# use PyMC3 to sample from the log-likelihood
with pm.Model():
    gama = pm.Uniform('gama', 0.0, 20.0)
    omega = pm.Uniform('omega', 0.0, 20.0)

    # convert gama and omega to a tensor vector
    theta = tt.as_tensor_variable([gama, omega])

    # use a DensityDist (use a lambda function to "call" the Op)
    pm.DensityDist('likelihood', lambda v: logl(v), observed={'v': theta})

    trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True)
And you can use the internal plotting to see the results of the sampling
_ = pm.traceplot(trace)
This was just adapted from the example notebook in the link, and as mentioned there, if you want to use NUTS you need gradient information, which you do not have given your custom function. The link discusses how to compute the gradient and construct the Op so you can pass it into the sampler, but I have not shown that here.
Additionally if you want to use solve_ivp (or odeint or another solver), all you have to do is change the DHOS function as you normally would to invoke the solver. The rest of the code should be portable to whatever problem you, or anyone else, need.
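For instance, a minimal sketch of DHOS using odeint (assuming the same V0 and time arrays defined above) could look like this:
from scipy.integrate import odeint

def DHOS(theta):
    gama, omega = theta

    def rhs(y, t):
        V1, V2 = y
        dV1dt = V2
        dV2dt = -((2*np.pi*omega)**2)*V1 - gama*V2
        return [dV1dt, dV2dt]

    soln = odeint(rhs, V0, time)
    return [soln[:, 0], soln[:, 1]]  # V1, V2 on the same time grid as before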
I'm trying to convert this example of Bayesian correlation for PyMC2 to PyMC3, but get completely different results. Most importantly, the mean of the multivariate Normal distribution quickly goes to zero, whereas it should be around 400 (as it is for PyMC2). Consequently, the estimated correlation quickly goes towards 1, which is wrong as well.
The full code is available in this notebook for PyMC2 and in this notebook for PyMC3.
The relevant code for PyMC2 is
def analyze(data):
    # priors might be adapted here to be less flat
    mu = pymc.Normal('mu', 0, 0.000001, size=2)
    sigma = pymc.Uniform('sigma', 0, 1000, size=2)
    rho = pymc.Uniform('r', -1, 1)

    @pymc.deterministic
    def precision(sigma=sigma, rho=rho):
        ss1 = float(sigma[0] * sigma[0])
        ss2 = float(sigma[1] * sigma[1])
        rss = float(rho * sigma[0] * sigma[1])
        return np.linalg.inv(np.mat([[ss1, rss], [rss, ss2]]))

    mult_n = pymc.MvNormal('mult_n', mu=mu, tau=precision, value=data.T, observed=True)

    model = pymc.MCMC(locals())
    model.sample(50000, 25000)
My port of the above code to PyMC3 is as follows:
def precision(sigma, rho):
C = T.alloc(rho, 2, 2)
C = T.fill_diagonal(C, 1.)
S = T.diag(sigma)
return T.nlinalg.matrix_inverse(T.nlinalg.matrix_dot(S, C, S))
def analyze(data):
with pm.Model() as model:
# priors might be adapted here to be less flat
mu = pm.Normal('mu', mu=0., sd=0.000001, shape=2, testval=np.mean(data, axis=1))
sigma = pm.Uniform('sigma', lower=1e-6, upper=1000., shape=2, testval=np.std(data, axis=1))
rho = pm.Uniform('r', lower=-1., upper=1., testval=0)
prec = pm.Deterministic('prec', precision(sigma, rho))
mult_n = pm.MvNormal('mult_n', mu=mu, tau=prec, observed=data.T)
return model
model = analyze(data)
with model:
trace = pm.sample(50000, tune=25000, step=pm.Metropolis())
The PyMC3 version runs, but clearly does not return the expected result. Any help would be highly appreciated.
The call signature of pymc.Normal is
In [125]: pymc.Normal?
Init signature: pymc.Normal(self, *args, **kwds)
Docstring:
N = Normal(name, mu, tau, value=None, observed=False, size=1, trace=True, rseed=True, doc=None, verbose=-1, debug=False)
Notice that the third positional argument of pymc.Normal is tau, not the standard deviation, sd.
Therefore, since the pymc code uses
mu = Normal('mu', 0, 0.000001, size=2)
The corresponding pymc3 code should use
mu = pm.Normal('mu', mu=0., tau=0.000001, shape=2, ...)
or
mu = pm.Normal('mu', mu=0., sd=math.sqrt(1/0.000001), shape=2, ...)
since tau = 1/sigma**2.
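For example, the precision tau = 0.000001 used above corresponds to sd = sqrt(1/0.000001) = 1000, i.e. a very flat prior.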
With this one change, your pymc3 code produces something like the expected result (resulting plot omitted here).
I'm trying to implement Bayesian PCA using the PyMC library for Python, but I'm stuck where I define the lower-dimensional coordinates...
Model is
x = Wz + e
where x is the observation vector, W is the transformation matrix, and z is the lower-dimensional coordinate vector.
First I define a distribution for the transformation matrix W. Each column is drawn from a normal distribution (zero mean, and identity covariance for simplicity)
import numpy as np
import pymc
from scipy.stats import multivariate_normal

def W_logp(value):
logLikes = np.array([multivariate_normal.logpdf(value[:,i], mean=np.zeros(dimX), cov=1) for i in range(0, dimZ)])
return logLikes.sum()
def W_random():
W = np.zeros([dimX, dimZ])
for i in range(0, dimZ):
W[:,i] = multivariate_normal.rvs(mean=np.zeros(dimX), cov=1)
return W
w0 = np.random.randn(dimX, dimZ)
W = pymc.Stochastic(
logp = W_logp,
doc = 'Transformation',
name = 'W',
parents = {},
random = W_random,
trace = True,
value = w0,
dtype = float,
rseed = 116.,
observed = False,
cache_depth = 2,
plot = False,
verbose = 0)
Then, I want to define a distribution for z, which is again a multivariate normal (zero mean and identity covariance). However, I need to draw a z for each observation separately, while W is common to all of them. So, I tried
z = pymc.MvNormal('z', np.zeros(dimZ), np.eye(dimZ), size=N)
However, pymc.MvNormal does not have a size parameter, so it raises an error. The next step would be
m = Data.mean(axis=0) + np.dot(W, z)
obs = pymc.MvNormal('Obs', m, C, value=Data, observed=True)
I did not give the specification of C above since it is irrelevant for now. Any ideas on how to implement this?
Thanks
EDIT
After Chris Fonnesbeck's answer I changed my code as follows
numD, dimX = Data.shape
dimZ = 3
mm = Data.mean(axis=0)
tau = pymc.Gamma('tau', alpha=10, beta=2)
tauW = pymc.Gamma('tauW', alpha=20, beta=2, size=dimZ)
@pymc.deterministic(dtype=float)
def C(tau=tau):
    return tau*np.eye(dimX)

@pymc.deterministic(dtype=float)
def CW(tau=tauW):
    return np.diag(tau)
W = [pymc.MvNormal('W%i'%i, np.zeros(dimZ), CW) for i in range(dimX)]
z = [pymc.MvNormal('z%i'%i, np.zeros(dimZ), np.eye(dimZ)) for i in range(numD)]
mu = [pymc.Lambda('mu%i'%i, lambda W=W, z=z, i=i: mm + np.dot(np.array(W), np.array(z[i]))) for i in range(numD)]
obs = [pymc.MvNormal('Obs%i'%i, mu[i], C, value=Data[i,:], observed=True) for i in range(numD)]
model = pymc.Model([tau, tauW] + obs + W + z)
mcmc = pymc.MCMC(model)
But this time, it tries to allocate a huge amount of memory (more than 8GB) when running pymc.MCMC(model), with numD=45 and dimX=504. Even when I try it with only numD=1 (so creating only 1 z, mu, and obs), it does the same. Any idea why?
Unfortunately, PyMC does not easily let you define vectors of multivariate stochastics. Hopefully we can make this happen in PyMC 3. For now, you would have to specify this using a container. For example:
z = [pymc.MvNormal('z_%i' % i, np.zeros(dimZ), np.eye(dimZ)) for i in range(N)]
Regarding the memory issue, try using a different backend for the traces. The default ("ram") keeps everything in RAM. You can try something like "pickle" or "sqlite" instead.
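For example (a minimal sketch; the filename is just illustrative):
# keep the traces on disk instead of in RAM
mcmc = pymc.MCMC(model, db='pickle', dbname='bpca_traces.pickle')
mcmc.sample(iter=10000, burn=5000)
mcmc.db.close()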
Regarding the plate notation, it might be something we could pursue for PyMC 3. Feel free to create an issue suggesting this in our issue tracker.