I am trying to implement ERGMs with PyMC.
I've found this, this, this and this, but these resources are a bit dated.
I have an NxN matrix for each network statistic (density, triangles, istar2, istar3 & distance). Each cell in each matrix indicates how the presence of that potential edge would change that statistic, holding the rest of the network constant. am is the adjacency matrix of graph G (nx.to_numpy_array(G)).
My model looks like this.
with pm.Model() as model:
    density = pm.ConstantData("density", density)
    triangles = pm.ConstantData("triangles", triangles)
    istar2 = pm.ConstantData("istar2", istar2)
    istar3 = pm.ConstantData("istar3", istar3)
    distance = pm.ConstantData("distance", distance)
    β_density = pm.Normal('β_density', mu=0, sigma=100)
    β_triangles = pm.Normal('β_triangles', mu=0, sigma=100)
    β_istar2 = pm.Normal('β_istar2', mu=0, sigma=100)
    β_istar3 = pm.Normal('β_istar3', mu=0, sigma=100)
    β_distance = pm.Normal('β_distance', mu=0, sigma=100)
    μ = β_density*density + β_triangles*triangles + β_istar2*istar2 + β_istar3*istar3 + β_distance*distance
    θ = pm.Deterministic('θ', pm.math.sigmoid(μ))
    y = pm.Bernoulli('y', p=θ, observed=am)
    trace = pm.sample(
        draws=500,
        tune=1000,
        cores=1,
    )
Am I doing this correctly?
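For reference, the "change statistic" matrices described above can be sanity-checked on a tiny graph. This is only a minimal sketch in plain Python for the triangle statistic, assuming an undirected graph without self-loops; the helper name triangle_change_stats is hypothetical and not part of PyMC or networkx. For such a graph, the number of triangles that edge (i, j) would close equals the number of common neighbours of i and j.

```python
def triangle_change_stats(adj):
    """Return delta[i][j]: how many triangles the edge i-j would close.

    Assumes an undirected 0/1 adjacency matrix with a zero diagonal.
    The count is the number of common neighbours of i and j, so it
    does not depend on whether the edge i-j itself is present.
    """
    n = len(adj)
    delta = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                delta[i][j] = sum(adj[i][k] * adj[k][j] for k in range(n))
    return delta


# Path graph 0 - 1 - 2: only the missing edge (0, 2) would close a triangle.
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
delta = triangle_change_stats(adj)
```

Comparing a hand-checked case like this against the triangles matrix fed to the model is one way to confirm the inputs mean what the model assumes.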
Ok so this is maybe more a math question than a programming one, but something is really bogging me down.
Suppose I manually perform gradient descent for a simple univariate linear regression, as follows:
# add biases to data
X_ = np.concatenate(
    [np.ones(X_scaled.shape[0]).reshape(-1, 1), X_scaled], axis=1)
X_copy = X_.copy()
history = []
thetas = initial_theta
costs = []
grads = []
for step in range(200):
    hypothesis = np.dot(X_copy, thetas)
    # cost
    J = (1 / m) * np.sum(np.square(hypothesis-y))
    # derivative
    d = np.dot(hypothesis-y, X_copy) / m
    # store
    history.append(thetas)
    costs.append(J)
    grads.append(d)
    # update
    thetas = thetas - d * 0.1
The final thetas I get are approximately the same as the ones I get with scikit-learn, so far so good.
Now I want to plot the tangent line to the cost function for a given value of one of the theta params.
I do this:
fig = plt.figure()
s = 4 # which gradient descent iteration should I pick
i = 2 # just a basic increment factor to plot the tangent line
# plot cost as function of first param
plt.plot([params[0] for params in history], costs, "-")
# pick a tangent point
tangent_point_x, tangent_point_y = history[s][0], costs[s]
plt.plot(tangent_point_x, tangent_point_y, "ro")
# plot tangent
slope = grads[s][0]
new_point1_x = history[s-i][0]
new_point1_y = tangent_point_y + slope * (new_point1_x - tangent_point_x)
new_point2_x = history[s+i][0]
new_point2_y = tangent_point_y + slope * (new_point2_x - tangent_point_x)
plt.plot((new_point1_x, new_point2_x), (new_point1_y, new_point2_y), "-")
plt.plot(new_point1_x, new_point1_y, "bo")
plt.plot(new_point2_x, new_point2_y, "go")
Here is the resulting plot. What am I doing wrong?
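One thing that may be worth checking here is whether the stored gradient d is actually the derivative of the stored cost J. The sketch below uses plain Python and a made-up four-point dataset (the data and the helper names are assumptions for illustration only). It reproduces the cost and gradient formulas from the loop above and compares the gradient against a central finite difference of the cost.

```python
def cost(theta, xs, ys):
    """Cost exactly as in the loop above: (1/m) * sum of squared errors."""
    m = len(xs)
    return sum((theta[0] + theta[1] * x - y) ** 2 for x, y in zip(xs, ys)) / m


def grad_as_in_loop(theta, xs, ys):
    """Gradient exactly as in the loop above: X^T (h - y) / m."""
    m = len(xs)
    residuals = [theta[0] + theta[1] * x - y for x, y in zip(xs, ys)]
    return [sum(residuals) / m,
            sum(r * x for r, x in zip(residuals, xs)) / m]


def finite_difference(theta, xs, ys, k, eps=1e-6):
    """Central-difference estimate of d(cost)/d(theta[k])."""
    up, down = list(theta), list(theta)
    up[k] += eps
    down[k] -= eps
    return (cost(up, xs, ys) - cost(down, xs, ys)) / (2 * eps)


# Hypothetical toy data, only for the check.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta = [0.5, 0.5]

d0 = grad_as_in_loop(theta, xs, ys)[0]
fd0 = finite_difference(theta, xs, ys, 0)
ratio = fd0 / d0  # close to 2: d is the gradient of J/2, not of J
```

In this sketch the true slope of J is twice the stored d, because d is the gradient of (1/(2m)) times the squared error sum while J uses 1/m. Separately, the curve of costs against history[...][0] follows the descent path, along which the other parameter changes as well, so its slope need not equal the partial derivative with respect to the first parameter alone.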
I am trying to solve a dynamic optimization problem using gekko. The goal is to minimize a form of energy consumption represented by VSP over a set distance under speed constraints. I define a piecewise linear function as the speed constraint, and another to model the slope of the road at different distances:
min_velocity = 0
max_velocity = 10
max_decel = -1
max_accel = 1
distances = np.linspace(0,20,21)
goal_dist = 200
trip_time = 100
# set up PWL functions
distances = np.linspace(0,200,10)
speed_limits = np.ones(10)*5
speed_limits[5:]=7
slope = np.zeros(10)
slope[3:5]=1; slope[7:9]=-1
model = GEKKO(remote=False)
model.time = [i for i in range(trip_time)]
x = model.Var(value=0.0, lb=0)
v = model.Var(value=0.0, lb = min_velocity, ub = max_velocity)
v_max = model.Var()
slope_var = model.Var()
a = model.MV(value=0, lb=max_decel ,ub=max_accel)
a.STATUS = 1
#define vehicle movement
model.Equation(x.dt()==v)
model.Equation(v.dt()==a)
#aggregated velocity constraint
model.pwl(x, v_max, distances, speed_limits)
model.Equation(v<=v_max)
#slope is modeled as a piecewise linear function
model.pwl(x, slope_var, distances, slope)
#End state constraints
p = np.zeros_like(model.time); p[-1]=1
final = model.Param(p)
model.Minimize(1e4*final*(v**2))# vehicle must be fully stopped
model.Minimize(1e4*final*((x-goal_dist)**2))# vehicle must arrive at destination
#VSPI Objective function
obj = model.Intermediate(v * (1.1 * a + 9.81 * slope_var + 0.132) + 0.0003002*pow(v, 3))
#VSPI Objective function
model.Obj(obj)
# solve
model.options.IMODE = 6
model.options.REDUCE = 3
model.options.MAX_ITER=1000
model.solve(disp=False)
plt.plot(x.value, v_max.value, 'b-', label = r'$vmaxvals$')
plt.plot(x.value , v.value,'g-',label=r'$vopt$')
plt.plot(x.value, a.value, 'c-', label=r'$accel$')
plt.plot(x.value, slope_var.value, 'r-', label=r'$slope$')
plt.plot([i*20 for i in range(10)], slope, 'mx', label=r'$orig_slope$')
plt.plot([i*20 for i in range(10)], speed_limits, 'kx', label=r'$orig_spd_limit$')
plt.legend(loc='best')
plt.xlabel('Distance Covered')
plt.show()
print(model.options.APPSTATUS)
Unfortunately, however, the values of slope_var and v_max get adjusted in the process of solving the problem. I am sure this is intended in this case, so is there a way to fix these PWL functions in place similar to a Parameter?
If I use a cspline object to approximate the speed limits and slope, the values don't change, since it is pre-built as far as I understand. However, a cubic spline is only accurate for a few data points and few changes in slope, which is why I would like to model it using a piecewise linear function.
The pwl function does give a linear interpolation, but it relies on a Mathematical Program with Complementarity Constraints (MPCC) formulation, which is challenging to solve and has many local minima at saddle points. You mentioned that you don't want to use the cspline function, but it may be your best option. There are some slight errors at the transition points, but they can be reduced by adding additional points during transitions or by increasing the resolution.
import numpy as np
from gekko import GEKKO
import matplotlib.pyplot as plt
min_velocity = 0
max_velocity = 10
max_decel = -1
max_accel = 1
distances = np.linspace(0,20,21)
goal_dist = 200
trip_time = 100
# set up PWL functions
distances = np.linspace(0,200,10)
speed_limits = np.ones(10)*5
speed_limits[5:]=7
slope = np.zeros(10)
slope[3:5]=1; slope[7:9]=-1
model = GEKKO(remote=False)
model.time = [i for i in range(trip_time)]
x = model.Var(value=0.0, lb=0)
v = model.Var(value=0.0, lb = min_velocity, ub = max_velocity)
v_max = model.Var()
slope_var = model.Var()
a = model.MV(value=0, lb=max_decel ,ub=max_accel)
a.STATUS = 1
#define vehicle movement
model.Equation(x.dt()==v)
model.Equation(v.dt()==a)
#aggregated velocity constraint
model.cspline(x,v_max,distances,speed_limits,True)
#model.pwl(x, v_max, distances, speed_limits)
model.Equation(v<=v_max)
#slope is modeled as a piecewise linear function
#model.pwl(x, slope_var, distances, slope)
model.cspline(x,slope_var,distances,slope,True)
#End state constraints
p = np.zeros_like(model.time); p[-1]=1
final = model.Param(p)
model.Minimize(1e4*final*(v**2))# vehicle must be fully stopped
model.Minimize(1e4*final*((x-goal_dist)**2))# vehicle must arrive at destination
#VSPI Objective function
obj = model.Intermediate(v * (1.1 * a + 9.81 * slope_var + 0.132) + 0.0003002*pow(v, 3))
#VSPI Objective function
model.Obj(obj)
# solve
model.options.IMODE = 6
model.options.REDUCE = 3
model.options.MAX_ITER=1000
model.solve(disp=False)
plt.plot(x.value, v_max.value, 'b-', label = 'vmaxvals')
plt.plot(x.value , v.value,'g-',label='vopt')
plt.plot(x.value, a.value, 'c-', label='accel')
plt.plot(x.value, slope_var.value, 'r-', label='slope')
plt.plot(distances, slope, 'mx', label='orig_slope')
plt.plot(distances, speed_limits, 'kx', label='orig_spd_limit')
plt.legend(loc='best')
plt.xlabel('Distance Covered')
plt.show()
print(model.options.APPSTATUS)
There was an error in the plotting with:
plt.plot([i*20 for i in range(10)], slope, 'mx', label=r'$orig_slope$')
plt.plot([i*20 for i in range(10)], speed_limits, 'kx', label=r'$orig_spd_limit$')
Use this instead:
plt.plot(distances, slope, 'mx', label='orig_slope')
plt.plot(distances, speed_limits, 'kx', label='orig_spd_limit')
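The suggestion above to increase the resolution could be sketched as follows. This is a plain-Python illustration rather than gekko code: lin_interp and densify are hypothetical helpers, and the choice of 91 points is arbitrary. The idea is to resample the 10 breakpoints by linear interpolation so that a spline fitted through the denser points stays closer to the piecewise linear profile.

```python
from bisect import bisect_right


def lin_interp(x, xp, fp):
    """Piecewise linear interpolation of (xp, fp) at x, clamped at the ends."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    k = bisect_right(xp, x) - 1
    t = (x - xp[k]) / (xp[k + 1] - xp[k])
    return fp[k] + t * (fp[k + 1] - fp[k])


def densify(xp, fp, n):
    """Resample (xp, fp) on n evenly spaced points between xp[0] and xp[-1]."""
    lo, hi = xp[0], xp[-1]
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return xs, [lin_interp(x, xp, fp) for x in xs]


# Same breakpoints as in the question: 10 points from 0 to 200.
distances = [200 * i / 9 for i in range(10)]
speed_limits = [5.0] * 5 + [7.0] * 5

fine_x, fine_v = densify(distances, speed_limits, 91)
mid_transition = lin_interp(100.0, distances, speed_limits)
```

The resampled fine_x and fine_v arrays would then be passed to model.cspline in place of the original 10 points. Whether this is accurate enough for a given problem is something to verify by plotting.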
I am struggling to implement a linear regression in pymc3 with a custom likelihood.
I previously posted this question on CrossValidated & it was recommended to post here as the question is more code orientated (closed post here)
Suppose you have two independent variables x1, x2 and a target variable y, as well as an indicator variable called delta.
When delta is 0, the likelihood function is standard least squares
When delta is 1, the likelihood function is the least squares contribution only when the target variable is greater than the prediction
Example snippet of observed data:
x_1   x_2   𝛿   observed_target
10    1     0   100
20    2     0   50
5     -1    1   200
10    -2    1   100
Does anyone know how this can be implemented in pymc3? As a starting point...
model = pm.Model()
with model as ttf_model:
    intercept = pm.Normal('param_intercept', mu=0, sd=5)
    beta_0 = pm.Normal('param_x1', mu=0, sd=5)
    beta_1 = pm.Normal('param_x2', mu=0, sd=5)
    # HalfNormal is parameterized by sd, not beta
    std = pm.HalfNormal('param_std', sd=0.5)
    x_1 = pm.Data('var_x1', df['x1'])
    x_2 = pm.Data('var_x2', df['x2'])
    mu = (intercept + beta_0*x_1 + beta_1*x_2)
In case this is helpful, from reading the docs it looks like something along these lines might work, but I have not been able to test it and it was too long to pop into a comment.
model = pm.Model()
with model as ttf_model:
    intercept = pm.Normal('param_intercept', mu=0, sd=5)
    beta_0 = pm.Normal('param_x1', mu=0, sd=5)
    beta_1 = pm.Normal('param_x2', mu=0, sd=5)
    # HalfNormal is parameterized by sd, not beta
    std = pm.HalfNormal('param_std', sd=0.5)
    x_1 = pm.Data('var_x1', df['x1'])
    x_2 = pm.Data('var_x2', df['x2'])
    delta = pm.Data('delta', df['delta']) # Or whatever this column is
    target = pm.Data('target', df['observed_target'])
    ypred = (intercept + beta_0*x_1 + beta_1*x_2) # Intermediate result
    target_ge_ypred = pm.math.ge(target, ypred) # Compare target to intermediate result
    zero = pm.math.constant(0) # Use this if delta==1 and target<ypred
    # EDIT: Check delta
    alternate = pm.math.switch(target_ge_ypred, ypred, zero) # Alternative result
    mu = pm.math.switch(pm.math.eq(delta, zero), ypred, alternate) # Actual result wanted?
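It may also help to write the likelihood from the question down as a plain Python function first, so that any PyMC3 version can be compared against it on a few rows. The sketch below is one possible reading of the question and not a confirmed specification: it assumes the "least squares contribution" is the unnormalized term -0.5 * ((y - y_pred) / sigma)**2, and that rows with delta == 1 contribute nothing when the target is at or below the prediction. The function name row_loglike is hypothetical.

```python
def row_loglike(y, y_pred, delta, sigma):
    """Log-likelihood contribution of one observation (up to a constant).

    delta == 0: ordinary least squares term.
    delta == 1: least squares term only when y exceeds the prediction,
                otherwise the row contributes zero (assumed reading).
    """
    resid = y - y_pred
    squared_term = -0.5 * (resid / sigma) ** 2
    if delta == 0:
        return squared_term
    return squared_term if resid > 0 else 0.0


def total_loglike(rows, sigma):
    """Sum the contributions over (y, y_pred, delta) tuples."""
    return sum(row_loglike(y, y_pred, d, sigma) for y, y_pred, d in rows)


# Hypothetical predictions paired with targets from the example table.
rows = [
    (100.0, 90.0, 0),   # delta = 0, residual +10: counted
    (50.0, 60.0, 0),    # delta = 0, residual -10: counted
    (200.0, 190.0, 1),  # delta = 1, target above prediction: counted
    (100.0, 110.0, 1),  # delta = 1, target below prediction: ignored
]
total = total_loglike(rows, sigma=5.0)
```

If this reading is right, a term of this shape could be added to the model as a custom log-probability rather than by switching the mean, but that step is left untested here.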
I'm still a noob in PyMC3, so the question might be naive, but I don't know how to translate this pymc2 code to pymc3. In particular, it's not clear to me how to translate the R function.
beta = pymc.Normal('beta', mu=0, tau=1.0e-4)
s = pymc.Uniform('s', lower=0, upper=1.0e+4)
tau = pymc.Lambda('tau', lambda s=s: s**(-2))
### Intrinsic CAR
@pymc.stochastic
def R(tau=tau, value=np.zeros(N)):
    # Calculate mu based on average of neighbors
    mu = np.array([sum(W[i]*value[A[i]])/Wplus[i] for i in xrange(N)])
    # Scale precision to the number of neighbors
    taux = tau*Wplus
    return pymc.normal_like(value, mu, taux)
@pymc.deterministic
def M(beta=beta, R=R):
    return [np.exp(beta + R[i]) for i in xrange(N)]
obsvd = pymc.Poisson("obsvd", mu=M, value=Y, observed=True)
model = pymc.Model([s, beta, obsvd])
Code from https://github.com/Youki/statistical-modeling-for-data-analysis-with-python/blob/945c13549a872d869e33bc48082c42efc022a07b/Chapter11/Chapter11.rst, and http://glau.ca/?p=340
Can you help me? Thanks
In PyMC3, you can implement the CAR model using Theano's scan function. There is sample code in the PyMC3 documentation, and the linked document contains two implementations of CAR. Here is the first one [Source]:
import pymc3 as pm
import theano.tensor as tt
from theano import scan
floatX = "float32"
from pymc3.distributions import continuous
from pymc3.distributions import distribution

class CAR(distribution.Continuous):
    """
    Conditional Autoregressive (CAR) distribution

    Parameters
    ----------
    a : list of adjacency information
    w : list of weight information
    tau : precision at each location
    """
    def __init__(self, w, a, tau, *args, **kwargs):
        super(CAR, self).__init__(*args, **kwargs)
        self.a = a = tt.as_tensor_variable(a)
        self.w = w = tt.as_tensor_variable(w)
        self.tau = tau*tt.sum(w, axis=1)
        self.mode = 0.

    def get_mu(self, x):
        # Weighted mean of the neighbouring values for each location
        def weigth_mu(w, a):
            a1 = tt.cast(a, 'int32')
            return tt.sum(w*x[a1])/tt.sum(w)
        mu_w, _ = scan(fn=weigth_mu,
                       sequences=[self.w, self.a])
        return mu_w

    def logp(self, x):
        mu_w = self.get_mu(x)
        tau = self.tau
        return tt.sum(continuous.Normal.dist(mu=mu_w, tau=tau).logp(x))

with pm.Model() as model1:
    # Vague prior on intercept
    beta0 = pm.Normal('beta0', mu=0.0, tau=1.0e-5)
    # Vague prior on covariate effect
    beta1 = pm.Normal('beta1', mu=0.0, tau=1.0e-5)
    # Random effects (hierarchical) prior
    tau_h = pm.Gamma('tau_h', alpha=3.2761, beta=1.81)
    # Spatial clustering prior
    tau_c = pm.Gamma('tau_c', alpha=1.0, beta=1.0)
    # Regional random effects
    theta = pm.Normal('theta', mu=0.0, tau=tau_h, shape=N)
    mu_phi = CAR('mu_phi', w=wmat, a=amat, tau=tau_c, shape=N)
    # Zero-centre phi
    phi = pm.Deterministic('phi', mu_phi-tt.mean(mu_phi))
    # Mean model
    mu = pm.Deterministic('mu', tt.exp(logE + beta0 + beta1*aff + theta + phi))
    # Likelihood
    Yi = pm.Poisson('Yi', mu=mu, observed=O)
    # Marginal SD of heterogeneity effects
    sd_h = pm.Deterministic('sd_h', tt.std(theta))
    # Marginal SD of clustering (spatial) effects
    sd_c = pm.Deterministic('sd_c', tt.std(phi))
    # Proportion spatial variance
    alpha = pm.Deterministic('alpha', sd_c/(sd_h+sd_c))
    trace1 = pm.sample(1000, tune=500, cores=4,
                       init='advi',
                       nuts_kwargs={"target_accept":0.9,
                                    "max_treedepth": 15})
The M function is written here as:
mu = pm.Deterministic('mu', tt.exp(logE + beta0 + beta1*aff + theta + phi))
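To see what the R function in the question and get_mu in the class above both compute, here is a plain-Python sketch of the conditional mean of an intrinsic CAR prior: each location's mean is the weighted average of its neighbours' current values. The function name car_conditional_means and the three-node chain are assumptions made up for illustration, and the data layout (one list of neighbour indices and one list of weights per location) mirrors A and W in the pymc2 code.

```python
def car_conditional_means(values, neighbours, weights):
    """Weighted average of neighbouring values for every location.

    values[i]     : current value at location i
    neighbours[i] : indices of the neighbours of location i
    weights[i]    : weights matching neighbours[i]
    """
    means = []
    for nbrs, w in zip(neighbours, weights):
        w_plus = sum(w)  # plays the role of Wplus[i] in the question
        means.append(sum(wi * values[j] for wi, j in zip(w, nbrs)) / w_plus)
    return means


# Hypothetical chain of three regions: 0 - 1 - 2, all weights equal to 1.
values = [1.0, 2.0, 4.0]
neighbours = [[1], [0, 2], [1]]
weights = [[1.0], [1.0, 1.0], [1.0]]

means = car_conditional_means(values, neighbours, weights)
```

The precision for location i is then tau scaled by the sum of its weights, which corresponds to taux = tau*Wplus in the pymc2 version and to tau*tt.sum(w, axis=1) in the class above.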
I am trying to fit a hierarchical Poisson regression to estimate time_delay per group and globally. I am confused as to whether pymc automatically applies a log link function to mu or do I have to do so explicitly:
with pm.Model() as model:
    alpha = pm.Gamma('alpha', alpha=1, beta=1)
    beta = pm.Gamma('beta', alpha=1, beta=1)
    a = pm.Gamma('a', alpha=alpha, beta=beta, shape=n_participants)
    mu = a[participants_idx]
    y_est = pm.Poisson('y_est', mu=mu, observed=messages['time_delay'].values)
    start = pm.find_MAP(fmin=scipy.optimize.fmin_powell)
    step = pm.Metropolis(start=start)
    trace = pm.sample(20000, step, start=start, progressbar=True)
The below traceplot shows estimates for a. You can see group estimates between 0 and 750.
My confusion begins when I plot the hyper parameter gamma distribution by using the mean for alpha and beta as parameters. The below distribution shows support between 0 and 5 approx. This doesn't fit my expectation whilst looking at the group estimates for a above. What does a represent? Is it log(a) or something else?
Thanks for any pointers.
Adding example using fake data as requested in comments: This example has just a single group, so it should be easier to see if the hyper parameter could plausibly produce the Poisson distribution of the group.
test_data = []
model = []
for i in np.arange(1):
    # between 1 and 100 messages per conversation
    num_messages = np.random.uniform(1, 100)
    avg_delay = np.random.gamma(15, 1)
    for j in np.arange(num_messages):
        delay = np.random.poisson(avg_delay)
        test_data.append([i, j, delay, i])
    model.append([i, avg_delay])
model_df = pd.DataFrame(model, columns=['conversation_id', 'synthetic_mean_delay'])
test_df = pd.DataFrame(test_data, columns=['conversation_id', 'message_id', 'time_delay', 'participants_str'])
test_df.head()
# Estimate parameters of model using test data
# convert categorical variables to integer
le = preprocessing.LabelEncoder()
test_participants_map = le.fit(test_df['participants_str'])
test_participants_idx = le.fit_transform(test_df['participants_str'])
n_test_participants = len(test_df['participants_str'].unique())
with pm.Model() as model:
    alpha = pm.Gamma('alpha', alpha=1, beta=1)
    beta = pm.Gamma('beta', alpha=1, beta=1)
    a = pm.Gamma('a', alpha=alpha, beta=beta, shape=n_test_participants)
    mu = a[test_participants_idx]
    y = test_df['time_delay'].values
    y_est = pm.Poisson('y_est', mu=mu, observed=y)
    start = pm.find_MAP(fmin=scipy.optimize.fmin_powell)
    step = pm.Metropolis(start=start)
    trace = pm.sample(20000, step, start=start, progressbar=True)
I don't see how the below hyper parameter could produce a poisson distribution with parameter between 13 and 17.
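ANSWER: pymc and scipy parameterize the Gamma distribution differently. scipy uses a shape (alpha) and a scale, whereas pymc uses alpha and beta, where beta is a rate (beta = 1/scale). pymc also does not apply a log link automatically: mu is used directly as the Poisson rate, so a is on the original scale, not log(a). The below model works as expected: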
with pm.Model() as model:
    alpha = pm.Gamma('alpha', alpha=1, beta=1)
    scale = pm.Gamma('scale', alpha=1, beta=1)
    a = pm.Gamma('a', alpha=alpha, beta=1.0/scale, shape=n_test_participants)
    #mu = T.exp(a[test_participants_idx])
    mu = a[test_participants_idx]
    y = test_df['time_delay'].values
    y_est = pm.Poisson('y_est', mu=mu, observed=y)
    start = pm.find_MAP(fmin=scipy.optimize.fmin_powell)
    step = pm.Metropolis(start=start)
    trace = pm.sample(20000, step, start=start, progressbar=True)
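The size of the mismatch between the two parameterizations can be seen with a few lines of plain Python. This is only an illustrative sketch: the values alpha = 3 and beta = 0.2 are made up, and the helper names are hypothetical. Note that the standard library's random.gammavariate takes a shape and a scale, like scipy, so a rate has to be inverted before it is passed in.

```python
import random


def gamma_mean_rate(alpha, beta):
    """Mean of a Gamma distribution when beta is a rate (as in pymc)."""
    return alpha / beta


def gamma_mean_scale(alpha, scale):
    """Mean of a Gamma distribution when the second parameter is a scale."""
    return alpha * scale


# Hypothetical posterior means for the hyperparameters.
alpha, beta = 3.0, 0.2

mean_if_rate = gamma_mean_rate(alpha, beta)    # about 15
mean_if_scale = gamma_mean_scale(alpha, beta)  # about 0.6

# Draw samples using the rate interpretation: scale = 1 / rate.
random.seed(0)
draws = [random.gammavariate(alpha, 1.0 / beta) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
```

Plotting the hyperprior with the wrong interpretation of beta would therefore put its mass near 0.6 instead of near 15, which is the kind of discrepancy described in the question.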