I am trying to use sysid to periodically regress an ARX model, then evaluate that model's predictive ability by simulating it with the future inputs and comparing the output with the experimental data. When I solve with m.solve() I get the following error: Exception: Data arrays must have the same length, and match time discretization in dynamic problems
The following is an MRE:
import numpy as np
from gekko import GEKKO
m = GEKKO()
X = [[0.9, 0.], [0.9, 0.], [0.9, 0.], [0.9, 0.], [0.9, 0.],
     [0.5, 0.], [0.5, 0.], [0.5, 0.], [0.5, 0.], [0.5, 0.]]  # 2 input values at each time step
Y = [20.3, 20.3, 20.2, 20.2, 20.1, 20.1, 20.1, 20.0, 19.9, 19.8]  # 1 output at each time step
t = np.linspace(0, 9*300, 10) # 10 points 5 minutes apart each
na = 1 # output coefficients
nb = 2 # input coefficients
res, p, K = m.sysid(t, X, Y, na, nb, pred='meas')
m.time = t - t[0]
y_, u_ = m.arx(p)
u_[0].value = X[0]
u_[1].value = X[1]
m.options.IMODE = 4
m.options.NODES = 2
# simulate
m.solve()
I don't want to do control; rather, I want to apply the experimental input values at the future timesteps and see how the ARX model extrapolates.
Thanks for your help
The problem is in this section, in how the data is loaded into the value property:
u_[0].value = X[:,0]
u_[1].value = X[:,1]
y_[0].value = Y
Try printing X[0] and X[1] to see that they are only the first and second rows of the original list, not the input columns. Converting to a numpy array enables the column slicing.
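For example (a quick check of the difference; this two-row array is illustrative):
import numpy as np
X = [[0.9, 0.0], [0.5, 0.0]]
print(X[0])      # [0.9, 0.0] -> the first row (both inputs at t=0)
Xn = np.array(X)
print(Xn[:, 0])  # [0.9 0.5]  -> the first input at every time step
Here is the full corrected script: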
import numpy as np
from gekko import GEKKO
m = GEKKO()
X = np.array([[0.9, 0.], [0.9, 0.], [0.9, 0.], [0.9, 0.], [0.9, 0.],
              [0.5, 0.], [0.5, 0.], [0.5, 0.], [0.5, 0.], [0.5, 0.]])  # 2 input values at each time step
Y = np.array([20.3, 20.3, 20.2, 20.2, 20.1,
              20.1, 20.1, 20.0, 19.9, 19.8])  # 1 output at each time step
t = np.linspace(0, 9*300, 10) # 10 points 5 minutes apart each
na = 1 # output coefficients
nb = 2 # input coefficients
res, p, K = m.sysid(t, X, Y, na, nb, pred='meas')
m.time = t - t[0]
y_, u_ = m.arx(p)
u_[0].value = X[:,0]
u_[1].value = X[:,1]
y_[0].value = Y
print(X[0])  # first row (both inputs at t=0), not the first input column
print(X[1])  # second row
m.options.IMODE = 4
m.options.NODES = 2
# simulate
m.solve()
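After the solve, y_[0].value holds the simulated output trajectory, so the extrapolation can be checked against the data (a sketch; this plot is an addition, not in the original):
import matplotlib.pyplot as plt
plt.plot(m.time, y_[0].value, 'r-', label='ARX prediction')
plt.plot(m.time, Y, 'bo', label='measured')
plt.xlabel('time (s)')
plt.legend()
plt.show()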
Here is another example with Pandas DataFrames:
from gekko import GEKKO
import pandas as pd
import matplotlib.pyplot as plt
# load data and parse into columns
url = 'http://apmonitor.com/do/uploads/Main/tclab_dyn_data2.txt'
data = pd.read_csv(url)
t = data['Time']
u = data[['H1','H2']]
y = data['T1']
# generate time-series model
m = GEKKO(remote=False) # remote=True for MacOS
# system identification
na = 2 # output coefficients
nb = 2 # input coefficients
yp,p,K = m.sysid(t,u,y,na,nb,diaglevel=1)
plt.figure()
plt.subplot(2,1,1)
plt.plot(t,u)
plt.legend([r'$u_0$',r'$u_1$'])
plt.ylabel('MVs')
plt.subplot(2,1,2)
plt.plot(t,y)
plt.plot(t,yp)
plt.legend([r'$y_0$',r'$z_0$'])
plt.ylabel('CVs')
plt.xlabel('Time')
plt.savefig('sysid.png')
plt.show()
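The same arx / IMODE=4 pattern from the first example can be used to simulate this identified model forward with the recorded inputs (a sketch under that assumption; the column names mirror the data file):
# simulate the identified ARX model with the measured heater inputs
m.time = (t - t[0]).values
y_, u_ = m.arx(p)
u_[0].value = data['H1'].values
u_[1].value = data['H2'].values
y_[0].value = data['T1'].values[0]  # initial condition
m.options.IMODE = 4
m.options.NODES = 2
m.solve(disp=False)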
We're also working on a package with a Seeq add-on for system identification that runs in Python and Jupyter notebooks.
I have 3 data sets of y-axis values, as follows.
y = [0.2535, 0.3552, 0.456, 0.489, 0.5265, 0.58384, 1.87616, 2.87328, 2.55184, 2.66992, 2.8208, 3.09632, 3.51616]
    [0.116112, 0.425088, 0.582528, 0.70192, 1.07584, 2.41408, 3.75232, 4.61824, 2.55184, 2.66992, 2.8208, 3.09632, 3.51616]
    [0.389664, 1.166368, 1.60392, 2.05984, 2.788, 4.02784, 5.0184, 5.60224, 2.55184, 2.66992, 2.8208, 3.09632, 3.51616]
and one data set for x values
x = [0., 8.75, 17.5, 26.25, 35., 43.75, 52.5, 61.25, 70., 78.75, 87.5, 96.25, 105.]
I am using the following command to curve-fit:
curve = np.polyfit(x, y, 4)
poly = np.poly1d(curve)
This works fine for one data set of y and x. What kind of loop should I use if I want 3 different curve-fit equations for the different y data sets with the same x set? I am new to Python, which is why I struggle with such a basic loop.
My expected output is an equation that represents a curve for the given data sets (x and y). I managed to get the equations one by one, but I have tons of different data sets for y and I don't want to find the equivalent equations one at a time; I can do it in a loop over the y values but don't know how.
This is the working example for one set of y values. In reality I have 3 data sets for y. I could change y and get 3 different results, but I want to do it in a single loop over all y values.
import numpy as np
import matplotlib.pyplot as plt
x = [0, 5.25, 10.5, 21, 31.5, 42, 52.5, 63, 73.5, 84, 94.5, 99.75, 105]
y = [0, 0.116112, 0.389664, 1.739712, 3.566016, 4.860304, 5.05776, 5.04792,
4.197744, 2.210064, 0.505776, 0.1312, 0]
curve = np.polyfit(x, y, 4)
poly = np.poly1d(curve)
new_x = []
new_y= []
for a in range(105):
    new_x.append(a+1)
    calc = poly(a+1)
    new_y.append(calc)
plt.plot(new_x, new_y)
plt.scatter(x, y)
print(poly)
Here are some improvements to your code. I've made a function to automate the processing of your data. Don't forget that numpy provides vectorized operations, so there is no need to iterate over new_x to get new_y; vectorized operations are more readable and much more performant.
I'll let you call the function inside a loop over your datasets.
import numpy as np
import matplotlib.pyplot as plt
def my_function(x, y):
    curve = np.polyfit(x, y, 4)
    poly = np.poly1d(curve)
    new_x = np.arange(x[0], x[-1], 1)
    new_y = poly(new_x)
    plt.plot(new_x, new_y)
    plt.scatter(x, y)
    print(poly)
x = [0, 5.25, 10.5, 21, 31.5, 42, 52.5, 63, 73.5, 84, 94.5, 99.75, 105]
y1 = [0, 0.116112, 0.389664, 1.739712, 3.566016, 4.860304, 5.05776, 5.04792,
      4.197744, 2.210064, 0.505776, 0.1312, 0]
y2 = [0.116112, 0.425088, 0.582528, 0.70192, 1.07584, 2.41408, 3.75232, 4.61824,
      2.55184, 2.66992, 2.8208, 3.09632, 3.51616]
y3 = [0.389664, 1.166368, 1.60392, 2.05984, 2.788, 4.02784, 5.0184, 5.60224,
      2.55184, 2.66992, 2.8208, 3.09632, 3.51616]
ylist = [y1, y2, y3]
for y in ylist:
    my_function(x, y)
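If you also want to keep each fitted equation for later use, a small extension of the same idea (a sketch; the polys list is an addition, not part of the answer above):
# collect the fitted 4th-order polynomial object for each dataset
polys = [np.poly1d(np.polyfit(x, y, 4)) for y in ylist]
print(polys[0])  # the equation fitted to y1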
I am trying to create a neural network to predict the behavior of the variable "miu".
Since I only have 6 data points, I used a spline to generate more points that follow the behavior of the system, so that all of those points can then be used in the neural network.
I am trying to use 2 inputs: time and cell concentration. The expected output is the miu value, which is given as the derivative dy/dx, where y is the cell concentration and x is the time.
I implemented the following code:
from gekko import brain
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
from numpy import diff
from scipy.interpolate import CubicSpline
xm = np.array([0.0, 23.0, 47.0, 49.0,
               71.5, 95.0, 119.0, 143.0])
def spline(cell):
    m = GEKKO()
    m.options.IMODE = 2
    c = [m.FV(value=0) for i in range(4)]
    x = m.Param(value=xm)
    cell = np.array(cell)
    y = m.CV(value=cell)
    y.FSTATUS = 1
    # polynomial model
    m.Equation(y == c[0] + c[1]*x + c[2]*x**2 + c[3]*x**3)
    c[0].STATUS = 1
    m.solve(disp=False)
    c[1].STATUS = 1
    m.solve(disp=False)
    c[2].STATUS = 1
    c[3].STATUS = 1
    m.solve(disp=False)
    pbr = [c[3].value[0], c[2].value[0],
           c[1].value[0], c[0].value[0]]
    print(pbr)
    xp = np.linspace(0, 144, 100)
    plot1 = plt.figure(1)
    if cell[0] == cell_br2[0]:
        plt.plot(xm, cell_br2, 'ko', label='BR2')
        plt.plot(xp, np.polyval(pbr, xp), 'g:', linewidth=2)
    elif cell[0] == cell_br1[0]:
        plt.plot(xm, cell_br1, 'mo', label='BR1')
        plt.plot(xp, np.polyval(pbr, xp), 'r:', linewidth=2)
    plt.xlabel('time(hr)')
    plt.ylabel('cells')
    plt.legend()
    dx = diff(xp)
    dy1 = diff(np.polyval(pbr, xp))
    deriv1 = dy1/dx
    time = np.linspace(0, 144, 99)
    plot1 = plt.figure(2)
    if cell[0] == cell_br2[0]:
        plt.plot(time, deriv1, 'b:', linewidth=2, label='BR2')
    elif cell[0] == cell_br1[0]:
        plt.plot(time, deriv1, 'm:', linewidth=2, label='BR1')
    plt.xlabel('time(hr)')
    plt.ylabel('miu(1/h)')
    plt.legend()
    plt.show()
    return deriv1
m = GEKKO()
cell_br1 = (0.63*10**6 , 1.10*10**6, 2.06*10**6, 2.08*10**6,\
3.73*10**6, 3.89*10**6, 3.47*10**6,2.312*10**6)
cell_br2= (0.58*10**6 , 0.96*10**6, 2.07*10**6, 1.79*10**6,\
3.57*10**6, 3.34*10**6, 2.62*10**6, 1.75*10**6)
b = brain.Brain()
b.input_layer(2)
b.layer(linear=5)
b.layer(tanh=5)
b.layer(linear=5)
b.output_layer(1)
x_s = np.linspace(0,144,99)
xg = np.array([ 0.0 , 23.0 , 47.0 , 49.0 , 71.5 ,\
95.0 , 119.0 , 144.0 ])
cells_spline = CubicSpline(xm, cell_br1)
y_cells = cells_spline(x_s)
miu_1 = spline(cell_br1)
miu_2 = spline(cell_br2)
x = (x_s, y_cells)#, y_glucose) #Inputs (3)
y = (miu_1) #Output (2)
b.learn(x,y) # train
xp = np.linspace(0,144,99)
yp = b.think(x) # validate
yyp = np.array(yp)
miu = np.reshape(yyp, (99,))
plot1 = plt.figure(3)
plt.plot(xp,miu,'r-', label = 'Predicted ')
plt.plot(x_s,miu_1,'bo', label = 'Experimental points')
plt.xlabel('Time [hr]')
plt.ylabel('miu [1/h]')
plt.legend()
plt.show()
Although the solver finds a solution, the predicted output is constant (a flat line in my plot), which indicates that the network is not learning.
Can someone please help? I can't find what is failing. Thanks
Here are a couple of issues with your current approach:
The training uses two inputs while the validation uses only one input
The data is not scaled. It generally helps to scale the data to -1 to 1. I included a simple scale factor, but there are better ways to do this that also zero-center the data.
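For reference, a zero-centered alternative would standardize each signal (a sketch; x_raw and y_raw are placeholder names, and this is not what the code below does, which divides by fixed magnitudes):
# hypothetical standardization: zero mean, unit variance
x_scaled = (x_raw - x_raw.mean()) / x_raw.std()
y_scaled = (y_raw - y_raw.mean()) / y_raw.std()
# invert with: y = y_scaled * y_raw.std() + y_raw.mean()
Here is the revised script: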
from gekko import brain
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
from numpy import diff
from scipy.interpolate import CubicSpline
xm = np.array([ 0.0 , 23.0 , 47.0 , 49.0 ,\
71.5 , 95.0 , 119.0 , 143.0 ])
def spline(cell):
    m = GEKKO()
    m.options.IMODE = 2
    c = [m.FV(value=0) for i in range(4)]
    x = m.Param(value=xm)
    cell = np.array(cell)
    y = m.CV(value=cell)
    y.FSTATUS = 1
    # polynomial model
    m.Equation(y == c[0] + c[1]*x + c[2]*x**2 + c[3]*x**3)
    c[0].STATUS = 1
    m.solve(disp=False)
    c[1].STATUS = 1
    m.solve(disp=False)
    c[2].STATUS = 1
    c[3].STATUS = 1
    m.solve(disp=False)
    pbr = [c[3].value[0], c[2].value[0],
           c[1].value[0], c[0].value[0]]
    print(pbr)
    xp = np.linspace(0, 144, 100)
    plot1 = plt.figure(1)
    if cell[0] == cell_br2[0]:
        plt.plot(xm, cell_br2, 'ko', label='BR2')
        plt.plot(xp, np.polyval(pbr, xp), 'g:', linewidth=2)
    elif cell[0] == cell_br1[0]:
        plt.plot(xm, cell_br1, 'mo', label='BR1')
        plt.plot(xp, np.polyval(pbr, xp), 'r:', linewidth=2)
    plt.xlabel('time(hr)')
    plt.ylabel('cells')
    plt.legend()
    dx = diff(xp)
    dy1 = diff(np.polyval(pbr, xp))
    deriv1 = dy1/dx
    time = np.linspace(0, 144, 99)
    plot1 = plt.figure(2)
    if cell[0] == cell_br2[0]:
        plt.plot(time, deriv1, 'b:', linewidth=2, label='BR2')
    elif cell[0] == cell_br1[0]:
        plt.plot(time, deriv1, 'm:', linewidth=2, label='BR1')
    plt.xlabel('time(hr)')
    plt.ylabel('miu(1/h)')
    plt.legend()
    #plt.show()
    return deriv1
cell_br1 = np.array([0.63*10**6 , 1.10*10**6, 2.06*10**6, 2.08*10**6,\
3.73*10**6, 3.89*10**6, 3.47*10**6,2.312*10**6])
cell_br2= np.array([0.58*10**6 , 0.96*10**6, 2.07*10**6, 1.79*10**6,\
3.57*10**6, 3.34*10**6, 2.62*10**6, 1.75*10**6])
b = brain.Brain(remote=True)
b.input_layer(1)
b.layer(linear=1)
b.layer(tanh=4)
b.layer(linear=1)
b.output_layer(1)
x_s = np.linspace(0,144,99)
xg = np.array([ 0.0 , 23.0 , 47.0 , 49.0 , 71.5 ,\
95.0 , 119.0 , 144.0 ])
cells_spline = CubicSpline(xm, cell_br1)
y_cells = cells_spline(x_s)
miu_1 = spline(cell_br1)
miu_2 = spline(cell_br2)
scale = [1.0e6,1.0e4]
x = (y_cells/scale[0])  # input: cell concentration (scaled)
y = (miu_1/scale[1])    # output: miu (scaled)
b.learn(x,y) # train
yp = b.think(x) # validate
xp = np.linspace(0,144,99)
yyp = np.array(yp)
miu = np.reshape(yyp, (99,))
plot1 = plt.figure(3)
plt.plot(xp,miu*scale[1],'r-', label = 'Predicted ')
plt.plot(x_s,miu_1,'bo', label = 'Experimental points')
plt.xlabel('Time [hr]')
plt.ylabel('miu [1/h]')
plt.legend()
plt.show()
Recommendations:
Adjust the number of nodes and types of layers.
Use a package such as Keras or PyTorch for this type of problem (a minimal Keras sketch follows this list). Here is a tutorial on Keras. Gekko is especially good at problems that need extra things such as constraints, non-standard activation functions, and hybrid machine learning where the model is a combination of physics-based and empirical elements.
Gekko uses gradient-based solvers that may get stuck at local minima.
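For comparison, a minimal Keras version of the same 1-input, 1-output regression (a sketch assuming TensorFlow is installed; the layer sizes, optimizer, and epoch count are illustrative and untuned):
import numpy as np
from tensorflow import keras

# x and y are the scaled input/output arrays from the script above
model = keras.Sequential([
    keras.layers.Dense(4, activation='tanh', input_shape=(1,)),
    keras.layers.Dense(1, activation='linear')
])
model.compile(optimizer='adam', loss='mse')
model.fit(np.reshape(x, (-1, 1)), np.reshape(y, (-1, 1)), epochs=1000, verbose=0)
yp = model.predict(np.reshape(x, (-1, 1)))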
Sorry, but the title is not clear enough because I didn't know how to describe this in a few words.
As you can see in the image, I have used interp1d to graphically "predict" the value of y when x = 7.
What I'm trying to do is predict the next value of y at x+1 (8), and so on, each time the size of X grows, until the last value of the dataset is reached (let's say 100), using a for loop. Like:
[1 2 3 4 5 6]
[ 4470.76 25465.72 25465.72 25465.72 21480.59 20024.53]
[1 2 3 4 5 6 7]
[ 4470.76 25465.72 25465.72 25465.72 21480.59 20024.53 15487.45]
[1 2 3 4 5 6 7 8]
[ 4470.76 25465.72 25465.72 25465.72 21480.59 20024.53 15487.45 25654.14]
[1 2 3 4 5 6 7 8 9]
[ 4470.76 25465.72 25465.72 25465.72 21480.59 20024.53 15487.45 25654.14 54874.22]
...
Any ideas, please?
edit: csv_file
import pandas as pd
import numpy as np
import os
import scipy.stats as sp
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set(rc={'figure.figsize': (18, 5)})
from scipy.interpolate import interp1d
# Load dataset
df = pd.read_csv('data.csv', sep=";", index_col = 'date')
df = df[['pow']]
# Reset index
df = df.reset_index()
df = df[['date', 'pow(+)']]
df.head(10)
X = np.array(pd.to_datetime(df['date'].index.values+1, format='%Y-%m-%d'), dtype=int)#.reshape((-1, 1))
X = X[:6]
y = np.array(df['pow(+)'], dtype=float)#.reshape(-1, 1)
y = y[:6]
print (X)
print (y)
f = interp1d(X, y, fill_value = "extrapolate")
#start, stop , nber of samples to generate, If True, stop is the last sample
X_new = np.linspace(0, 7, num=8, endpoint=True)
plt.plot(X, y, 'o', X_new, f(X_new), '-')
plt.legend(['data', 'linear'], loc='best')
plt.show()
#print('\n')
#print("X shape:", X.shape)
#print("y shape:", y.shape)
This is not a simple task. You need to:
find a single function, or a mix of functions, that fits your data
in this example, find the trend of your data with line fitting
set proper bounds for each parameter of the complex function
predict the new value based on the fitted parameters
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
import pandas as pd
df = pd.read_csv('df.csv')
x_data = np.array(pd.to_datetime(df['date'].index.values+1, format='%Y-%m-%d'), dtype=int)
y_data = np.array(df['pow'], dtype=float)
# normalise data
y_data = (y_data - np.min(y_data))/ np.max(y_data)
# find data trend
def line_function(x, a, b):
    return a*x + b
# fit function
parameters_line, covariance_line = curve_fit(line_function, x_data, y_data, method='lm')
# define fitting function
def fit_function(x, A, t, fi, c, d):
    return A*np.sin(x*t + fi)**2 + c*x + d
# set bounds for each parameter
param_bounds = ([0, 0, 0, -1, 0], [2, (2*np.pi/600), 10, parameters_line[0], 10])
# fit function
parameters_fit, covariance_fit = curve_fit(fit_function, x_data, y_data,bounds=param_bounds , method='trf')
A, t, fi, c, d = [value for value in parameters_fit]
# predict new value
x_predict = 900
y_predict = fit_function(x_predict, A, t, fi, c, d)
# plot data
x_fit_data = np.linspace(-100, 1000, 1000)
y_fit_data = fit_function(x_fit_data, A, t, fi, c, d)
plt.plot(x_data, y_data, '.')
plt.plot(x_fit_data, y_fit_data, '-')
y_line_fit_data = line_function(x_fit_data, parameters_line[0], parameters_line[1])
plt.plot(x_fit_data, y_line_fit_data, '--')
plt.plot(x_predict, y_predict, 'o')
plt.show()
Output: a plot of the normalized data points, the fitted curve, the dashed linear trend line, and the predicted point at x = 900.
There is an equation of an exponentially truncated power law in the article below:
Gonzalez, M. C., Hidalgo, C. A., & Barabasi, A. L. (2008). Understanding individual human mobility patterns. Nature, 453(7196), 779-782.
like this:
P(rg) = (rg + rg0)^(-beta) * exp(-rg / K)
It is an exponentially truncated power law. There are three parameters to be estimated: rg0, beta, and K. We have several users' radius of gyration (rg), uploaded to GitHub: radius of gyrations.txt
The following code can be used to read the data and calculate P(rg):
import numpy as np
# read radius of gyration from file
rg = []
with open('/path-to-the-data/radius of gyrations.txt', 'r') as f:
    for i in f:
        rg.append(float(i.strip('\n')))
# calculate P(rg) as a complementary cumulative distribution
rg = sorted(rg, reverse=True)
rg = np.array(rg)
prg = np.arange(len(rg)) / float(len(rg) - 1)
Or you can directly get the rg and prg data as follows:
rg = np.array([ 20.7863444 , 9.40547933, 8.70934714, 8.62690145,
7.16978087, 7.02575052, 6.45280959, 6.44755478,
5.16630287, 5.16092884, 5.15618737, 5.05610068,
4.87023561, 4.66753197, 4.41807645, 4.2635671 ,
3.54454372, 2.7087178 , 2.39016885, 1.9483156 ,
1.78393238, 1.75432688, 1.12789787, 1.02098332,
0.92653501, 0.32586582, 0.1514813 , 0.09722761,
0. , 0. ])
prg = np.array([ 0. , 0.03448276, 0.06896552, 0.10344828, 0.13793103,
0.17241379, 0.20689655, 0.24137931, 0.27586207, 0.31034483,
0.34482759, 0.37931034, 0.4137931 , 0.44827586, 0.48275862,
0.51724138, 0.55172414, 0.5862069 , 0.62068966, 0.65517241,
0.68965517, 0.72413793, 0.75862069, 0.79310345, 0.82758621,
0.86206897, 0.89655172, 0.93103448, 0.96551724, 1. ])
I can plot P(rg) versus rg using the following Python script:
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(rg, prg, 'bs', alpha = 0.3)
# roughly estimated params:
# rg0=1.8, beta=0.15, K=5
plt.plot(rg, (rg+1.8)**-.15*np.exp(-rg/5))
plt.yscale('log')
plt.xscale('log')
plt.xlabel('$r_g$', fontsize = 20)
plt.ylabel('$P(r_g)$', fontsize = 20)
plt.show()
How can I use these rg data to estimate the three parameters above? I hope to solve it using Python.
Following @Michael's suggestion, we can solve the problem using scipy.optimize.curve_fit:
def func(rg, rg0, beta, K):
    return (rg + rg0) ** (-beta) * np.exp(-rg / K)

from scipy import optimize
popt, pcov = optimize.curve_fit(func, rg, prg, p0=[1.8, 0.15, 5])
print(popt)
print(pcov)
The results are given below:
[ 1.04303608e+03 3.02058550e-03 4.85784945e+00]
[[ 1.38243336e+18 -6.14278286e+11 -1.14784675e+11]
[ -6.14278286e+11 2.72951900e+05 5.10040746e+04]
[ -1.14784675e+11 5.10040746e+04 9.53072925e+03]]
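As a side note (an addition, not in the original post), the square roots of the diagonal of pcov give the one-standard-deviation parameter uncertainties; the very large values here signal a poorly conditioned fit:
perr = np.sqrt(np.diag(pcov))  # 1-sigma uncertainties for rg0, beta, K
print(perr)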
Then we can inspect the results by plotting the fitted curve.
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(rg, prg, 'bs', alpha = 0.3)
plt.plot(rg, (rg+popt[0])**-(popt[1])*np.exp(-rg/popt[2]) )
plt.yscale('log')
plt.xscale('log')
plt.xlabel('$r_g$', fontsize = 20)
plt.ylabel('$P(r_g)$', fontsize = 20)
plt.show()
I'd like to implement a hierarchical Bayesian model with PyMC3. Before designing a complex model, I'm trying to get accustomed to PyMC3 by implementing Bayesian PCA and comparing the results with sklearn.decomposition.PCA.
In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
# Generate data
nsamp_cl = 1000 #Number of samples per class and per site
cov = np.matrix([[1, 0.9, 0.05],
[0.9, 1, 0.05],
[0.05, 0.05, 1]])
nfeat = cov.shape[0] #Number of features
X0 = np.random.multivariate_normal(np.zeros(nfeat),cov,nsamp_cl)
X1 = np.random.multivariate_normal(np.zeros(nfeat),cov,nsamp_cl)
# Rotate class 1
theta = np.radians(90)
cos, sin = np.cos(theta), np.sin(theta)
R = np.matrix('{} {}; {} {}'.format(cos, -sin, sin, cos))
X1[:,0:2] = np.dot(X1[:,0:2],R.T)
X = np.concatenate([X0,X1])
Y = np.concatenate([np.zeros(X0.shape[0]),np.ones(X1.shape[0])])
n = X.shape[0]
d = X.shape[1]
In [2]:
plt.figure()
cols = ['b','r']
colors = [cols[y.astype(int)] for y in Y ]
plt.scatter(X[:,0],X[:,1],20,colors)
plt.title('Features 0 and 1')
plt.figure()
cols = ['b','r']
colors = [cols[y.astype(int)] for y in Y ]
plt.scatter(X[:,1],X[:,2],20,colors)
plt.title('Features 1 and 2')
Out [2]: scatter plots of features 0 vs 1 and features 1 vs 2, colored by class.
In [3]:
from pymc3 import Model,Normal,Gamma,math,variational
common_latent_model = Model()
# Building a latent model to extract site-robust principal components
with common_latent_model:
    n_latent = 3
    # ARD prior
    alphas = Gamma('alphas', alpha=1e-6, beta=1e-6, shape=n_latent)
    # Weight vector
    w = Normal('w', mu=0, tau=alphas, shape=(d, n_latent))
    # Latent space
    z = Normal('z', mu=0, tau=1, shape=(n, n_latent))
    # Multiply latent variables by W to go from latent to observation space
    t = math.dot(z, w.T)
    # Add bias
    mu = Normal('mu', mu=0, tau=0.01, shape=d)
    u = t + mu
    # Precision of the observation
    sigma = Gamma('sigma', alpha=1e-6, beta=1e-6, shape=1)
    # Likelihood (sampling distribution) of observations
    X_obs = Normal('X_obs', mu=u, tau=sigma, observed=X)
with common_latent_model:
    means, sds, elbos = variational.advi(n=10000, learning_rate=0.1, accurate_elbo=True)  # 100000
plt.plot(elbos)
plt.ylabel('ELBO')
plt.xlabel('iteration')
In [4]:
for key in means:
    print("key: %s , value: %s" % (key, means[key]))
key: mu , value: [ 0.03288066 -0.05347487 0.00260641]
key: alphas_log_ , value: [ 6.94631195 6.85621834 6.84792233]
key: sigma_log_ , value: [-0.009662]
key: z , value: [[-0.01260083 -0.00460729 -0.01360558]
[-0.02817471 0.04281501 0.01643355]
[-0.05178572 -0.02470609 -0.05092171]
...,
[-0.05201711 0.00150599 -0.01167801]
[-0.01097088 -0.02666511 0.03660954]
[ 0.0609949 0.01156182 0.01814843]]
key: w , value: [[-0.06004834 0.00599346 -0.03071374]
[ 0.00668656 -0.01306511 0.00400904]
[-0.00141243 -0.00778869 0.03257137]]
In [5]:
PC_bayes = means['z']
plt.figure()
cols = ['b','r']
colors = [cols[y.astype(int)] for y in Y ]
plt.scatter(PC_bayes[:,0],PC_bayes[:,1],20,colors,alpha=.1)
Out [5]: scatter plot of the first two latent components (PC_bayes), colored by class.
In [6]:
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca.fit(X)
PC = pca.transform(X)
In [7]:
plt.figure()
cols = ['b','r']
colors = [cols[y.astype(int)] for y in Y ]
plt.scatter(PC[:,0],PC[:,1],20,colors,alpha=.1)
Out [7]: scatter plot of the first two principal components from sklearn, colored by class.
(You can find the iPython Notebook here:
https://github.com/peppeFarAway/pymc3/blob/master/BayesPCA.ipynb)
Why can't my Bayes PCA implementation recover the Principal Components, while sklearn.decomposition.pca can? Where am I making a mistake?
The main reference I used to implement the model is:
https://blogs.msdn.microsoft.com/infernet_team_blog/2011/09/30/bayesian-pca/