I'm trying to implement the DOSNES algorithm from this publication but in Python for a project. I found this Matlab Implementation which works well but I probably mistranslated one or more steps in my code (mainly with axis I guess) because I clearly don't reach the same result. This is the part I'm strugglering with in Matlab:
P(1:n + 1:end) = 0; % set diagonal to zero
P = 0.5 * (P + P'); '% symmetrize P-values
P = max(P ./ sum(P(:)), realmin); % make sure P-values sum to one
const = sum(P(:) .* log(P(:))); % constant in KL divergence
ydata = .0001 * randn(n, no_dims);
y_incs = zeros(size(ydata));
gains = ones(size(ydata));
% Run the iterations
for iter=1:max_iter
% Compute joint probability that point i and j are neighbors
sum_ydata = sum(ydata .^ 2, 2);
num = 1 ./ (1 + bsxfun(#plus, sum_ydata, bsxfun(#plus, sum_ydata', -2 * (ydata * ydata')))); % Student-t distribution
num(1:n+1:end) = 0; % set diagonal to zero
Q = max(num ./ sum(num(:)), realmin); % normalize to get probabilities
% Compute the gradients (faster implementation)
L = (P - Q) .* num;
y_grads = 4 * (diag(sum(L, 1)) - L) * ydata;
% Update the solution
gains = (gains + .2) .* (sign(y_grads) ~= sign(y_incs)) ... % note that the y_grads are actually -y_grads
+ (gains * .8) .* (sign(y_grads) == sign(y_incs));
gains(gains < min_gain) = min_gain;
y_incs = momentum * y_incs - epsilon * (gains .* y_grads);
ydata = ydata + y_incs;
% Spherical projection
ydata = bsxfun(#minus, ydata, mean(ydata, 1));
r_mean = mean(sqrt(sum(ydata.^2,2)),1);
ydata = bsxfun(#times, ydata, r_mean./ sqrt(sum(ydata.^2,2)) );
% Update the momentum if necessary
if iter == mom_switch_iter
momentum = final_momentum;
end
% Print out progress
if ~rem(iter, 10)
cost = const - sum(P(:) .* log(Q(:)));
disp(['Iteration ' num2str(iter) ': error is ' num2str(cost)]);
end
end
and this is my python version :
no_dims = 3
n = X.shape[0]
min_gain = 0.01
momentum = 0.5
final_momentum = 0.8
epsilon = 500
mom_switch_iter = 250
max_iter = 1000
P[np.diag_indices_from(P)] = 0.
P = ( P + P.T )/2
P = np.max(P / np.sum(P), axis=0)
const = np.sum( P * np.log(P) )
ydata = 1e-4 * np.random.random(size=(n, no_dims))
y_incs = np.zeros(shape=ydata.shape)
gains = np.ones(shape=ydata.shape)
for iter in range(max_iter):
sum_ydata = np.sum(ydata**2, axis = 1)
bsxfun_1 = sum_ydata.T + -2*np.dot(ydata, ydata.T)
bsxfun_2 = sum_ydata + bsxfun_1
num = 1. / ( 1 + bsxfun_2 )
num[np.diag_indices_from(num)] = 0.
Q = np.max(num / np.sum(num), axis=0)
L = (P - Q) * num
t = np.diag( L.sum(axis=0) ) - L
y_grads = 4 * np.dot( t , ydata )
gains = (gains + 0.2) * ( np.sign(y_grads) != np.sign(y_incs) ) \
+ (gains * 0.8) * ( np.sign(y_grads) == np.sign(y_incs) )
# gains[np.where(np.sign(y_grads) != np.sign(y_incs))] += 0.2
# gains[np.where(np.sign(y_grads) == np.sign(y_incs))] *= 0.8
gains = np.clip(gains, a_min = min_gain, a_max = None)
y_incs = momentum * y_incs - epsilon * gains * y_grads
ydata += y_incs
ydata -= ydata.mean(axis=0)
alpha = np.sqrt(np.sum(ydata ** 2, axis=1))
r_mean = np.mean(alpha)
ydata = ydata * (r_mean / alpha).reshape(-1, 1)
if iter == mom_switch_iter:
momentum = final_momentum
if iter % 10 == 0:
cost = const - np.sum( P * np.log(Q) )
print( "Iteration {} : error is {}".format(iter, cost) )
If you want to do trials, you can download the repository here which uses Iris Dataset and an attached library. test.py is my test implementation with Iris dataset and visu.py is the result the paper has on MNIST dataset but restricted to 1000k random points.
Many thanks for your support,
Nicolas
EDIT 1
This is the final code working as expected :
P[np.diag_indices_from(P)] = 0.
P = ( P + P.T )/2
P = P / np.sum(P)
const = np.sum(xlogy(P, P))
ydata = 1e-4 * np.random.random(size=(n, no_dims))
y_incs = np.zeros(shape=ydata.shape)
gains = np.ones(shape=ydata.shape)
for iter in range(max_iter):
sum_ydata = np.sum(ydata**2, axis = 1)
bsxfun_1 = sum_ydata.T + -2*np.dot(ydata, ydata.T)
bsxfun_2 = sum_ydata + bsxfun_1
num = 1. / ( 1 + bsxfun_2 )
num[np.diag_indices_from(num)] = 0.
Q = num / np.sum(num)
L = (P - Q) * num
t = np.diag( L.sum(axis=0) ) - L
y_grads = 4 * np.dot( t , ydata )
gains = (gains + 0.2) * ( np.sign(y_grads) != np.sign(y_incs) ) \
+ (gains * 0.8) * ( np.sign(y_grads) == np.sign(y_incs) )
gains = np.clip(gains, a_min = min_gain, a_max = None)
y_incs = momentum * y_incs - epsilon * gains * y_grads
ydata += y_incs
ydata -= ydata.mean(axis=0)
alpha = np.sqrt(np.sum(ydata ** 2, axis=1))
r_mean = np.mean(alpha)
ydata = ydata * (r_mean / alpha).reshape(-1, 1)
if iter == mom_switch_iter:
momentum = final_momentum
if iter % 10 == 0:
cost = const - np.sum( xlogy(P, Q) )
print( "Iteration {} : error is {}".format(iter, cost) )
Right at the beginning you seem to replace a nonreducing max in matlab (it has two arguments, so it will compare those one by one and return a full size P) with a reducing max in python (axis=0 will reduce along this axis, meaning that the result will have one dimension less).
My advice, however, is to leave out the max altogether because it looks pretty much like an amateurish attempt of sidestepping the problem of p log p being defined at 0 only via taking the limit p->0 which using L'Hopital's rule can be shown to be 0, whereas the computer will returm NaN when asked to compute 0 * log(0).
The proper way of going about this is using scipy.special.xlogy which treats 0 correctly.
Related
I have a modified version of the spring optimization Gekko model.
Is there a way to slightly relax constraints so that the solver can still give me solutions even if they start falling out of the range of my constraints?
I'm aware of RTOL, but is there a way of specifying tolerances for individual equations?
One way to do this is create a new variable eps that has a lower bound of zero and an upper bound that is the maximum allowable violation. This becomes the deviation of the inequality constraints that can be minimized with an importance factor (e.g. 10).
eps = m.Var(lb=0,ub=0.5)
m.Minimize(10*eps)
Here are modified equations with eps:
m.Equations([
d_coil / d_wire >= 4-eps,
d_coil / d_wire <= 16+eps
])
and the full script:
from gekko import GEKKO
# Initialize Gekko model
m = GEKKO()
#Maximize force of a spring at its preload height h_0 of 1 inches
#The stress at Hs (solid height) must be less than Sy to protect from damage
# Constants
from numpy import pi
# Model Parameters
delta_0 = 0.4 # inches (spring deflection)
h_0 = 1.0 # inches (preload height)
Q = 15e4 # psi
G = 12e6 # psi
S_e = 45e3 # psi
S_f = 1.5
w = 0.18
# Variables
# inches (wire diameter)
d_wire = m.Var(value=0.07247, lb = 0.01, ub = 0.2)
# inches (coil diameter)
d_coil = m.Var(value=0.6775, lb = 0.0)
# number of coils in the spring
n_coils = m.Var(value=7.58898, lb = 0.0)
# inches (free height spring exerting no force)
h_f = m.Var(value=1.368117, lb = 1.0)
F = m.Var() # Spring force
# Intermediates
S_y = m.Intermediate((0.44 * Q) / (d_wire**w))
h_s = m.Intermediate(n_coils * d_wire)
k = m.Intermediate((G * d_wire**4)/(8 * d_coil**3 * n_coils))
kW = m.Intermediate((4 * d_coil - d_wire)/(4 * (d_coil-d_wire)) \
+ 0.62 * d_wire/d_coil)
n = m.Intermediate((8 * kW * d_coil)/(pi * d_wire**3))
tau_max = m.Intermediate((h_f - h_0 + delta_0) * k * n)
tau_min = m.Intermediate((h_f - h_0) * k * n)
tau_mean = m.Intermediate((tau_max + tau_min) / 2)
tau_alt = m.Intermediate((tau_max - tau_min) / 2)
h_def = m.Intermediate(h_0 - delta_0)
tau_hs = m.Intermediate((h_f - h_s) * k * n)
# Equations
eps = m.Var(lb=0,ub=0.5)
m.Minimize(10*eps)
m.Equations([
F == k * (h_f - h_0),
d_coil / d_wire >= 4-eps,
d_coil / d_wire <= 16+eps,
d_coil + d_wire < 0.75,
h_def - h_s > 0.05,
tau_alt < S_e / S_f,
tau_alt + tau_mean < S_y / S_f,
tau_hs < S_y / S_f
])
# Objective function
m.Maximize(F)
# Send to solver
m.solve()
# Print solution
print('Maximum force: ' + str(F[0]))
print('Optimal values: ')
print('d_wire : ' + str(d_wire[0]))
print('d_coil : ' + str(d_coil[0]))
print('n_coils : ' + str(n_coils[0]))
print('h_f : ' + str(h_f[0]))
print('eps : ' + str(eps[0]))
The function HH_model(I,area_factor) has as return value the number of spikes which are triggered by n runs. Assuming 1000 runs, there are 157 times that max(v[]-v_rest) > 60, then the return value of HH_model(I,area_factor) is 157.
Now I know value pairs from another model - the x-values are related to the stimulus I, while the y-values are the number of spikes.
I have written these values as a comment under the code. I want to choose my input parameters I and area_factor in a way that the error to the data is as small as possible. I have no idea how I should do this optimization.
import matplotlib.pyplot as py
import numpy as np
import scipy.optimize as optimize
# HH parameters
v_Rest = -65 # in mV
gNa = 1200 # in mS/cm^2
gK = 360 # in mS/cm^2
gL = 0.3*10 # in mS/cm^2
vNa = 115 # in mV
vK = -12 # in mV
vL = 10.6 # in mV
#Number of runs
runs = 1000
c = 1 # in uF/cm^2
ROOT = False
def HH_model(I,area_factor):
count = 0
t_end = 10 # in ms
delay = 0.1 # in ms
duration = 0.1 # in ms
dt = 0.0025 # in ms
area_factor = area_factor
#geometry
d = 2 # diameter in um
r = d/2 # Radius in um
l = 10 # Length of the compartment in um
A = (1*10**(-8))*area_factor # surface [cm^2]
I = I
C = c*A # uF
for j in range(0,runs):
# Introduction of equations and channels
def alphaM(v): return 12 * ((2.5 - 0.1 * (v)) / (np.exp(2.5 - 0.1 * (v)) - 1))
def betaM(v): return 12 * (4 * np.exp(-(v) / 18))
def betaH(v): return 12 * (1 / (np.exp(3 - 0.1 * (v)) + 1))
def alphaH(v): return 12 * (0.07 * np.exp(-(v) / 20))
def alphaN(v): return 12 * ((1 - 0.1 * (v)) / (10 * (np.exp(1 - 0.1 * (v)) - 1)))
def betaN(v): return 12 * (0.125 * np.exp(-(v) / 80))
# compute the timesteps
t_steps= t_end/dt+1
# Compute the initial values
v0 = 0
m0 = alphaM(v0)/(alphaM(v0)+betaM(v0))
h0 = alphaH(v0)/(alphaH(v0)+betaH(v0))
n0 = alphaN(v0)/(alphaN(v0)+betaN(v0))
# Allocate memory for v, m, h, n
v = np.zeros((int(t_steps), 1))
m = np.zeros((int(t_steps), 1))
h = np.zeros((int(t_steps), 1))
n = np.zeros((int(t_steps), 1))
# Set Initial values
v[:, 0] = v0
m[:, 0] = m0
h[:, 0] = h0
n[:, 0] = n0
### Noise component
knoise= 0.0005 #uA/(mS)^1/2
### --------- Step3: SOLVE
for i in range(0, int(t_steps)-1, 1):
# Get current states
vT = v[i]
mT = m[i]
hT = h[i]
nT = n[i]
# Stimulus current
IStim = 0
if delay / dt <= i <= (delay + duration) / dt:
IStim = I # in uA
else:
IStim = 0
# Compute change of m, h and n
m[i + 1] = (mT + dt * alphaM(vT)) / (1 + dt * (alphaM(vT) + betaM(vT)))
h[i + 1] = (hT + dt * alphaH(vT)) / (1 + dt * (alphaH(vT) + betaH(vT)))
n[i + 1] = (nT + dt * alphaN(vT)) / (1 + dt * (alphaN(vT) + betaN(vT)))
# Ionic currents
iNa = gNa * m[i + 1] ** 3. * h[i + 1] * (vT - vNa)
iK = gK * n[i + 1] ** 4. * (vT - vK)
iL = gL * (vT-vL)
Inoise = (np.random.normal(0, 1) * knoise * np.sqrt(gNa * A))
IIon = ((iNa + iK + iL) * A) + Inoise #
# Compute change of voltage
v[i + 1] = (vT + ((-IIon + IStim) / C) * dt)[0] # in ((uA / cm ^ 2) / (uF / cm ^ 2)) * ms == mV
# adjust the voltage to the resting potential
v = v + v_Rest
# test if there was a spike
if max(v[:]-v_Rest) > 60:
count += 1
return count
# some datapoints from another model out of 1000 runs. ydata means therefore 'count' out of 1000 runs.
# xdata = np.array([0.92*I,0.925*I,0.9535*I,0.975*I,0.9789*I,I,1.02*I,1.043*I,1.06*I,1.078*I,1.09*I])
# ydata = np.array([150,170,269,360,377,500,583,690,761,827,840])
EDIT:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import minimize
# HH parameters
v_Rest = -65 # in mV
gNa = 120 # in mS/cm^2
gK = 36 # in mS/cm^2
gL = 0.3 # in mS/cm^2
vNa = 115 # in mV
vK = -12 # in mV
vL = 10.6 # in mV
#Number of runs
runs = 1000
c = 1 # in uF/cm^2
def HH_model(x,I,area_factor):
count = 0
t_end = 10 # in ms
delay = 0.1 # in ms
duration = 0.1 # in ms
dt = 0.0025 # in ms
area_factor = area_factor
#geometry
d = 2 # diameter in um
r = d/2 # Radius in um
l = 10 # Length of the compartment in um
A = (1*10**(-8))*area_factor # surface [cm^2]
I = I*x
C = c*A # uF
for j in range(0,runs):
# Introduction of equations and channels
def alphaM(v): return 12 * ((2.5 - 0.1 * (v)) / (np.exp(2.5 - 0.1 * (v)) - 1))
def betaM(v): return 12 * (4 * np.exp(-(v) / 18))
def betaH(v): return 12 * (1 / (np.exp(3 - 0.1 * (v)) + 1))
def alphaH(v): return 12 * (0.07 * np.exp(-(v) / 20))
def alphaN(v): return 12 * ((1 - 0.1 * (v)) / (10 * (np.exp(1 - 0.1 * (v)) - 1)))
def betaN(v): return 12 * (0.125 * np.exp(-(v) / 80))
# compute the timesteps
t_steps= t_end/dt+1
# Compute the initial values
v0 = 0
m0 = alphaM(v0)/(alphaM(v0)+betaM(v0))
h0 = alphaH(v0)/(alphaH(v0)+betaH(v0))
n0 = alphaN(v0)/(alphaN(v0)+betaN(v0))
# Allocate memory for v, m, h, n
v = np.zeros((int(t_steps), 1))
m = np.zeros((int(t_steps), 1))
h = np.zeros((int(t_steps), 1))
n = np.zeros((int(t_steps), 1))
# Set Initial values
v[:, 0] = v0
m[:, 0] = m0
h[:, 0] = h0
n[:, 0] = n0
### Noise component
knoise= 0.0005 #uA/(mS)^1/2
### --------- Step3: SOLVE
for i in range(0, int(t_steps)-1, 1):
# Get current states
vT = v[i]
mT = m[i]
hT = h[i]
nT = n[i]
# Stimulus current
IStim = 0
if delay / dt <= i <= (delay + duration) / dt:
IStim = I # in uA
else:
IStim = 0
# Compute change of m, h and n
m[i + 1] = (mT + dt * alphaM(vT)) / (1 + dt * (alphaM(vT) + betaM(vT)))
h[i + 1] = (hT + dt * alphaH(vT)) / (1 + dt * (alphaH(vT) + betaH(vT)))
n[i + 1] = (nT + dt * alphaN(vT)) / (1 + dt * (alphaN(vT) + betaN(vT)))
# Ionic currents
iNa = gNa * m[i + 1] ** 3. * h[i + 1] * (vT - vNa)
iK = gK * n[i + 1] ** 4. * (vT - vK)
iL = gL * (vT-vL)
Inoise = (np.random.normal(0, 1) * knoise * np.sqrt(gNa * A))
IIon = ((iNa + iK + iL) * A) + Inoise #
# Compute change of voltage
v[i + 1] = (vT + ((-IIon + IStim) / C) * dt)[0] # in ((uA / cm ^ 2) / (uF / cm ^ 2)) * ms == mV
# adjust the voltage to the resting potential
v = v + v_Rest
# test if there was a spike
if max(v[:]-v_Rest) > 60:
count += 1
return count
def loss(parameters, model, x_ref, y_ref):
# unpack multiple parameters
I, area_factor = parameters
# compute prediction
y_predicted = np.array([model(x, I, area_factor) for x in x_ref])
# compute error and use it as loss
mse = ((y_ref - y_predicted) ** 2).mean()
return mse
# some datapoints from another model out of 1000 runs. ydata means therefore 'count' out of 1000 runs.
xdata = np.array([0.92,0.925,0.9535, 0.975, 0.9789, 1])
ydata = np.array([150,170,269, 360, 377, 500])
y_data_scaled = ydata / runs
y_predicted = np.array([HH_model(x,I=10**(-3), area_factor=1) for x in xdata])
parameters = (10**(-3), 1)
mse0 = loss(parameters, HH_model, xdata, y_data_scaled)
# compute the parameters that minimize the loss (alias, the error between the data and the predictions of the model)
optimum = minimize(loss, x0=np.array([10**(-3), 1]), args=(HH_model, xdata, y_data_scaled))
# compute the predictions with the optimized parameters
I = optimum['x'][0]
area_factor = optimum['x'][1]
y_predicted_opt = np.array([HH_model(x, I, area_factor) for x in xdata])
# plot the raw data, the model with handcrafted guess and the model with optimized parameters
fig, ax = plt.subplots(1, 1)
ax.set_xlabel('input')
ax.set_ylabel('output predictions')
ax.plot(xdata, y_data_scaled, marker='o')
ax.plot(xdata, y_predicted, marker='*')
ax.plot(xdata, y_predicted_opt, marker='v')
ax.legend([
"raw data points",
"initial guess",
"predictions with optimized parameters"
])
I started using your function,
then I noticed it was very slow to execute.
Hence, I decided to show the process with a toy (linear) model.
The process remains the same.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
def loss(parameters, model, x_ref, y_ref):
# unpack multiple parameters
m, q = parameters
# compute prediction
y_predicted = np.array([model(x, m, q) for x in x_ref])
# compute error and use it as loss
mse = ((y_ref - y_predicted) ** 2).mean()
return mse
# load a dataset to fit a model
x_data = np.array([0.92, 0.925, 0.9535, 0.975, 0.9789, 1, 1.02, 1.043, 1.06, 1.078, 1.09])
y_data = np.array([150, 170, 269, 360, 377, 500, 583, 690, 761, 827, 840])
# normalise the data - input is already normalised
y_data_scaled = y_data / 1000
# create a model (linear, as an example) using handcrafted parameters, ex:(1,1)
linear_fun = lambda x, m, q: m * x + q
y_predicted = np.array([linear_fun(x, m=1, q=1) for x in x_data])
# create a function that given a model (linear_fun), a dataset(x,y) and the parameters, compute the error
parameters = (1, 1)
mse0 = loss(parameters, linear_fun, x_data, y_data_scaled)
# compute the parameters that minimize the loss (alias, the error between the data and the predictions of the model)
optimum = minimize(loss, x0=np.array([1, 1]), args=(linear_fun, x_data, y_data_scaled))
# compute the predictions with the optimized parameters
m = optimum['x'][0]
q = optimum['x'][1]
y_predicted_opt = np.array([linear_fun(x, m, q) for x in x_data])
# plot the raw data, the model with handcrafted guess and the model with optimized parameters
fig, ax = plt.subplots(1, 1)
ax.set_xlabel('input')
ax.set_ylabel('output predictions')
ax.plot(x_data, y_data_scaled, marker='o')
ax.plot(x_data, y_predicted, marker='*')
ax.plot(x_data, y_predicted_opt, marker='v')
ax.legend([
"raw data points",
"initial guess",
"predictions with optimized parameters"
])
# Note1: good practise is to validate your model with a different set of data,
# respect to the one that you have used to find the parameters
# here, however, it is shown just the optimization procedure
# Note2: in your case you should use the HH_model instead of the linear_fun
# and I and Area_factor instead of m and q.
Output:
-- EDIT: To use the HH_model:
I went deeper in your code,
I tried few values for area and stimulus
and I executed a single run of HH_Model without taking the threshold.
Then, I checked the predicted dynamic of voltage (v):
the sequence is always diverging ( all values become nan after few steps )
if you have an initial guess for stimulus and area that could make the code to work, great.
if you have no idea of the order of magnitude of these parameters
the unique solution I see is a grid search over them - just to find this initial guess.
however, it might take a very long time without guarantee of success.
given that the code is based on a physical model, I would suggest to:
1 - find pen and paper a reasonable values.
2 - check that this simulation works for these values.
3 - then, run the optimizer to find the minimum.
Or, worst case scenario, reverse engineer the code and find the value that makes the equation to converge
Here the refactored code:
import math
import numpy as np
# HH parameters
v_Rest = -65 # in mV
gNa = 1200 # in mS/cm^2
gK = 360 # in mS/cm^2
gL = 0.3 * 10 # in mS/cm^2
vNa = 115 # in mV
vK = -12 # in mV
vL = 10.6 # in mV
# Number of runs
c = 1 # in uF/cm^2
# Introduction of equations and channels
def alphaM(v):
return 12 * ((2.5 - 0.1 * (v)) / (np.exp(2.5 - 0.1 * (v)) - 1))
def betaM(v):
return 12 * (4 * np.exp(-(v) / 18))
def betaH(v):
return 12 * (1 / (np.exp(3 - 0.1 * (v)) + 1))
def alphaH(v):
return 12 * (0.07 * np.exp(-(v) / 20))
def alphaN(v):
return 12 * ((1 - 0.1 * (v)) / (10 * (np.exp(1 - 0.1 * (v)) - 1)))
def betaN(v):
return 12 * (0.125 * np.exp(-(v) / 80))
def predict_voltage(A, C, delay, dt, duration, stimulus, t_end):
# compute the timesteps
t_steps = t_end / dt + 1
# Compute the initial values
v0 = 0
m0 = alphaM(v0) / (alphaM(v0) + betaM(v0))
h0 = alphaH(v0) / (alphaH(v0) + betaH(v0))
n0 = alphaN(v0) / (alphaN(v0) + betaN(v0))
# Allocate memory for v, m, h, n
v = np.zeros((int(t_steps), 1))
m = np.zeros((int(t_steps), 1))
h = np.zeros((int(t_steps), 1))
n = np.zeros((int(t_steps), 1))
# Set Initial values
v[:, 0] = v0
m[:, 0] = m0
h[:, 0] = h0
n[:, 0] = n0
# Noise component
knoise = 0.0005 # uA/(mS)^1/2
for i in range(0, int(t_steps) - 1, 1):
# Get current states
vT = v[i]
mT = m[i]
hT = h[i]
nT = n[i]
# Stimulus current
if delay / dt <= i <= (delay + duration) / dt:
IStim = stimulus # in uA
else:
IStim = 0
# Compute change of m, h and n
m[i + 1] = (mT + dt * alphaM(vT)) / (1 + dt * (alphaM(vT) + betaM(vT)))
h[i + 1] = (hT + dt * alphaH(vT)) / (1 + dt * (alphaH(vT) + betaH(vT)))
n[i + 1] = (nT + dt * alphaN(vT)) / (1 + dt * (alphaN(vT) + betaN(vT)))
# Ionic currents
iNa = gNa * m[i + 1] ** 3. * h[i + 1] * (vT - vNa)
iK = gK * n[i + 1] ** 4. * (vT - vK)
iL = gL * (vT - vL)
Inoise = (np.random.normal(0, 1) * knoise * np.sqrt(gNa * A))
IIon = ((iNa + iK + iL) * A) + Inoise #
# Compute change of voltage
v[i + 1] = (vT + ((-IIon + IStim) / C) * dt)[0] # in ((uA / cm ^ 2) / (uF / cm ^ 2)) * ms == mV
# stop simulation if it diverges
if math.isnan(v[i + 1]):
return [None]
# adjust the voltage to the resting potential
v = v + v_Rest
return v
def HH_model(stimulus, area_factor, runs=1000):
count = 0
t_end = 10 # in ms
delay = 0.1 # in ms
duration = 0.1 # in ms
dt = 0.0025 # in ms
area_factor = area_factor
# geometry
d = 2 # diameter in um
r = d / 2 # Radius in um
l = 10 # Length of the compartment in um
A = (1 * 10 ** (-8)) * area_factor # surface [cm^2]
stimulus = stimulus
C = c * A # uF
for j in range(0, runs):
v = predict_voltage(A, C, delay, dt, duration, stimulus, t_end)
if max(v[:] - v_Rest) > 60:
count += 1
return count
And the attempt to run one simulation:
import time
from ex_21.equations import c, predict_voltage
area_factor = 0.1
stimulus = 70
# input signal
count = 0
t_end = 10 # in ms
delay = 0.1 # in ms
duration = 0.1 # in ms
dt = 0.0025 # in ms
# geometry
d = 2 # diameter in um
r = d / 2 # Radius in um
l = 10 # Length of the compartment in um
A = (1 * 10 ** (-8)) * area_factor # surface [cm^2]
C = c * A # uF
start = time.time()
voltage_dynamic = predict_voltage(A, C, delay, dt, duration, stimulus, t_end)
elapse = time.time() - start
print(voltage_dynamic)
Output:
[None]
import math
import numpy as np
S0 = 100.; K = 100.; T = 1.0; r = 0.05; sigma = 0.2
M = 100; dt = T / M; I = 500000
S = np.zeros((M + 1, I))
S[0] = S0
for t in range(1, M + 1):
z = np.random.standard_normal(I)
S[t] = S[t - 1] * np.exp((r - 0.5 * sigma ** 2) * dt + sigma *
math.sqrt(dt) * z)
C0 = math.exp(-r * T) * np.sum(np.maximum(S[-1] - K, 0)) / I
print ("European Option Value is ", C0)
It gives a value of around 10.45 as you increase the number of simulations, but using the B-S formula the value should be around 10.09. Anybody know why the code isn't giving a number closer to the formula?
I am using scipy.optimize.minimize, with the default method ('Neldear-Mead').
The function I try to minimize is not strictly convex. It stays at the same value on some significant areas.
The issue that I have is that the steps taken by the algorithm are too small. For example my starting point has a first coordinate x0 = 0.2 . I know that the function will result in a different value only for a significant step, for example moving by 0.05. Unfortunately, I can see that the algorithm makes very small step (moving by around 0.000001). As a result, my function returns the same value, and the algorithm does not converge. Can I change that behaviour?
For convenience, here's the scipy code:
def _minimize_neldermead(func, x0, args=(), callback=None,
xtol=1e-4, ftol=1e-4, maxiter=None, maxfev=None,
disp=False, return_all=False,
**unknown_options):
"""
Minimization of scalar function of one or more variables using the
Nelder-Mead algorithm.
Options for the Nelder-Mead algorithm are:
disp : bool
Set to True to print convergence messages.
xtol : float
Relative error in solution `xopt` acceptable for convergence.
ftol : float
Relative error in ``fun(xopt)`` acceptable for convergence.
maxiter : int
Maximum number of iterations to perform.
maxfev : int
Maximum number of function evaluations to make.
This function is called by the `minimize` function with
`method=Nelder-Mead`. It is not supposed to be called directly.
"""
_check_unknown_options(unknown_options)
maxfun = maxfev
retall = return_all
fcalls, func = wrap_function(func, args)
x0 = asfarray(x0).flatten()
N = len(x0)
rank = len(x0.shape)
if not -1 < rank < 2:
raise ValueError("Initial guess must be a scalar or rank-1 sequence.")
if maxiter is None:
maxiter = N * 200
if maxfun is None:
maxfun = N * 200
rho = 1
chi = 2
psi = 0.5
sigma = 0.5
one2np1 = list(range(1, N + 1))
if rank == 0:
sim = numpy.zeros((N + 1,), dtype=x0.dtype)
else:
sim = numpy.zeros((N + 1, N), dtype=x0.dtype)
fsim = numpy.zeros((N + 1,), float)
sim[0] = x0
if retall:
allvecs = [sim[0]]
fsim[0] = func(x0)
nonzdelt = 0.05
zdelt = 0.00025
for k in range(0, N):
y = numpy.array(x0, copy=True)
if y[k] != 0:
y[k] = (1 + nonzdelt)*y[k]
else:
y[k] = zdelt
sim[k + 1] = y
f = func(y)
fsim[k + 1] = f
ind = numpy.argsort(fsim)
fsim = numpy.take(fsim, ind, 0)
# sort so sim[0,:] has the lowest function value
sim = numpy.take(sim, ind, 0)
iterations = 1
while (fcalls[0] < maxfun and iterations < maxiter):
if (numpy.max(numpy.ravel(numpy.abs(sim[1:] - sim[0]))) <= xtol and
numpy.max(numpy.abs(fsim[0] - fsim[1:])) <= ftol):
break
xbar = numpy.add.reduce(sim[:-1], 0) / N
xr = (1 + rho) * xbar - rho * sim[-1]
fxr = func(xr)
doshrink = 0
if fxr < fsim[0]:
xe = (1 + rho * chi) * xbar - rho * chi * sim[-1]
fxe = func(xe)
if fxe < fxr:
sim[-1] = xe
fsim[-1] = fxe
else:
sim[-1] = xr
fsim[-1] = fxr
else: # fsim[0] <= fxr
if fxr < fsim[-2]:
sim[-1] = xr
fsim[-1] = fxr
else: # fxr >= fsim[-2]
# Perform contraction
if fxr < fsim[-1]:
xc = (1 + psi * rho) * xbar - psi * rho * sim[-1]
fxc = func(xc)
if fxc <= fxr:
sim[-1] = xc
fsim[-1] = fxc
else:
doshrink = 1
else:
# Perform an inside contraction
xcc = (1 - psi) * xbar + psi * sim[-1]
fxcc = func(xcc)
if fxcc < fsim[-1]:
sim[-1] = xcc
fsim[-1] = fxcc
else:
doshrink = 1
if doshrink:
for j in one2np1:
sim[j] = sim[0] + sigma * (sim[j] - sim[0])
fsim[j] = func(sim[j])
ind = numpy.argsort(fsim)
sim = numpy.take(sim, ind, 0)
fsim = numpy.take(fsim, ind, 0)
if callback is not None:
callback(sim[0])
iterations += 1
if retall:
allvecs.append(sim[0])
x = sim[0]
fval = numpy.min(fsim)
warnflag = 0
if fcalls[0] >= maxfun:
warnflag = 1
msg = _status_message['maxfev']
if disp:
print('Warning: ' + msg)
elif iterations >= maxiter:
warnflag = 2
msg = _status_message['maxiter']
if disp:
print('Warning: ' + msg)
else:
msg = _status_message['success']
if disp:
print(msg)
print(" Current function value: %f" % fval)
print(" Iterations: %d" % iterations)
print(" Function evaluations: %d" % fcalls[0])
result = OptimizeResult(fun=fval, nit=iterations, nfev=fcalls[0],
status=warnflag, success=(warnflag == 0),
message=msg, x=x)
if retall:
result['allvecs'] = allvecs
return result
I have used Nelder-Mead long time ago,but as I remember that you will find different local minima if you start from different starting points.You didn't give us your function,so we could only guess what should be best strategy for you.You should also read this
http://www.webpages.uidaho.edu/~fuchang/res/ANMS.pdf
Then you can try pure Python implementation
https://github.com/fchollet/nelder-mead/blob/master/nelder_mead.py
After reading How Not to Sort by Average Rating, I was curious if anyone has a Python implementation of a Lower bound of Wilson score confidence interval for a Bernoulli parameter?
Reddit uses the Wilson score interval for comment ranking, an explanation and python implementation can be found here
#Rewritten code from /r2/r2/lib/db/_sorts.pyx
from math import sqrt
def confidence(ups, downs):
n = ups + downs
if n == 0:
return 0
z = 1.0 #1.44 = 85%, 1.96 = 95%
phat = float(ups) / n
return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))
I think this one has a wrong wilson call, because if you have 1 up 0 down you get NaN because you can't do a sqrt on the negative value.
The correct one can be found when looking at the ruby example from the article How not to sort by average page:
return ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))
To get the Wilson CI without continuity correction, you can use proportion_confint in statsmodels.stats.proportion. To get the Wilson CI with continuity correction, you can use the code below.
# cf.
# [1] R. G. Newcombe. Two-sided confidence intervals for the single proportion, 1998
# [2] R. G. Newcombe. Interval Estimation for the difference between independent proportions: comparison of eleven methods, 1998
import numpy as np
from statsmodels.stats.proportion import proportion_confint
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
def propci_wilson_cc(count, nobs, alpha=0.05):
# get confidence limits for proportion
# using wilson score method w/ cont correction
# i.e. Method 4 in Newcombe [1];
# verified via Table 1
from scipy import stats
n = nobs
p = count/n
q = 1.-p
z = stats.norm.isf(alpha / 2.)
z2 = z**2
denom = 2*(n+z2)
num = 2.*n*p+z2-1.-z*np.sqrt(z2-2-1./n+4*p*(n*q+1))
ci_l = num/denom
num = 2.*n*p+z2+1.+z*np.sqrt(z2+2-1./n+4*p*(n*q-1))
ci_u = num/denom
if p == 0:
ci_l = 0.
elif p == 1:
ci_u = 1.
return ci_l, ci_u
def dpropci_wilson_nocc(a,m,b,n,alpha=0.05):
# get confidence limits for difference in proportions
# a/m - b/n
# using wilson score method WITHOUT cont correction
# i.e. Method 10 in Newcombe [2]
# verified via Table II
theta = a/m - b/n
l1, u1 = proportion_confint(count=a, nobs=m, alpha=0.05, method='wilson')
l2, u2 = proportion_confint(count=b, nobs=n, alpha=0.05, method='wilson')
ci_u = theta + np.sqrt((a/m-u1)**2+(b/n-l2)**2)
ci_l = theta - np.sqrt((a/m-l1)**2+(b/n-u2)**2)
return ci_l, ci_u
def dpropci_wilson_cc(a,m,b,n,alpha=0.05):
# get confidence limits for difference in proportions
# a/m - b/n
# using wilson score method w/ cont correction
# i.e. Method 11 in Newcombe [2]
# verified via Table II
theta = a/m - b/n
l1, u1 = propci_wilson_cc(count=a, nobs=m, alpha=alpha)
l2, u2 = propci_wilson_cc(count=b, nobs=n, alpha=alpha)
ci_u = theta + np.sqrt((a/m-u1)**2+(b/n-l2)**2)
ci_l = theta - np.sqrt((a/m-l1)**2+(b/n-u2)**2)
return ci_l, ci_u
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# single proportion testing
# these come from Newcombe [1] (Table 1)
a_vec = np.array([81, 15, 0, 1])
m_vec = np.array([263, 148, 20, 29])
for (a,m) in zip(a_vec,m_vec):
l1, u1 = proportion_confint(count=a, nobs=m, alpha=0.05, method='wilson')
l2, u2 = propci_wilson_cc(count=a, nobs=m, alpha=0.05)
print(a,m,l1,u1,' ',l2,u2)
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# difference in proportions testing
# these come from Newcombe [2] (Table II)
a_vec = np.array([56,9,6,5,0,0,10,10],dtype=float)
m_vec = np.array([70,10,7,56,10,10,10,10],dtype=float)
b_vec = np.array([48,3,2,0,0,0,0,0],dtype=float)
n_vec = np.array([80,10,7,29,20,10,20,10],dtype=float)
print('\nWilson without CC')
for (a,m,b,n) in zip(a_vec,m_vec,b_vec,n_vec):
l, u = dpropci_wilson_nocc(a,m,b,n,alpha=0.05)
print('{:2.0f}/{:2.0f}-{:2.0f}/{:2.0f} ; {:6.4f} ; {:8.4f}, {:8.4f}'.format(a,m,b,n,a/m-b/n,l,u))
print('\nWilson with CC')
for (a,m,b,n) in zip(a_vec,m_vec,b_vec,n_vec):
l, u = dpropci_wilson_cc(a,m,b,n,alpha=0.05)
print('{:2.0f}/{:2.0f}-{:2.0f}/{:2.0f} ; {:6.4f} ; {:8.4f}, {:8.4f}'.format(a,m,b,n,a/m-b/n,l,u))
HTH
The accepted solution seems to use a hard-coded z-value (best for performance).
In the event that you wanted a direct python equivalent of the ruby formula from the blogpost with a dynamic z-value (based on the confidence interval):
import math
import scipy.stats as st
def ci_lower_bound(pos, n, confidence):
if n == 0:
return 0
z = st.norm.ppf(1 - (1 - confidence) / 2)
phat = 1.0 * pos / n
return (phat + z * z / (2 * n) - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)
If you'd like to actually calculate z directly from a confidence bound and want to avoid installing numpy/scipy, you can use the following snippet of code,
import math
def binconf(p, n, c=0.95):
'''
Calculate binomial confidence interval based on the number of positive and
negative events observed. Uses Wilson score and approximations to inverse
of normal cumulative density function.
Parameters
----------
p: int
number of positive events observed
n: int
number of negative events observed
c : optional, [0,1]
confidence percentage. e.g. 0.95 means 95% confident the probability of
success lies between the 2 returned values
Returns
-------
theta_low : float
lower bound on confidence interval
theta_high : float
upper bound on confidence interval
'''
p, n = float(p), float(n)
N = p + n
if N == 0.0: return (0.0, 1.0)
p = p / N
z = normcdfi(1 - 0.5 * (1-c))
a1 = 1.0 / (1.0 + z * z / N)
a2 = p + z * z / (2 * N)
a3 = z * math.sqrt(p * (1-p) / N + z * z / (4 * N * N))
return (a1 * (a2 - a3), a1 * (a2 + a3))
def erfi(x):
"""Approximation to inverse error function"""
a = 0.147 # MAGIC!!!
a1 = math.log(1 - x * x)
a2 = (
2.0 / (math.pi * a)
+ a1 / 2.0
)
return (
sign(x) *
math.sqrt( math.sqrt(a2 * a2 - a1 / a) - a2 )
)
def sign(x):
if x < 0: return -1
if x == 0: return 0
if x > 0: return 1
def normcdfi(p, mu=0.0, sigma2=1.0):
"""Inverse CDF of normal distribution"""
if mu == 0.0 and sigma2 == 1.0:
return math.sqrt(2) * erfi(2 * p - 1)
else:
return mu + math.sqrt(sigma2) * normcdfi(p)
Here is a simplified (no need for numpy) and slightly improved (0 and n values for k do not cause a math domain error) version of the Wilson score confidence interval with continuity correction, from the original sourcecode written by batesbatesbates in another answer, and also a pure python no-numpy non-continuity correction version, with 2 equivalent ways to calculate (can be switched with eqmode argument, but both ways give the exact same non-continuity correction results):
import math
def propci_wilson_nocc(k, n, z=1.96, eqmode=0):
# Calculates the Binomial Proportion Confidence Interval using the Wilson Score method without continuation correction
# Equations eqmode == 1 from: https://en.wikipedia.org/w/index.php?title=Binomial_proportion_confidence_interval&oldid=1101942017#Wilson_score_interval
# Equations eqmode == 0 from: https://www.evanmiller.org/how-not-to-sort-by-average-rating.html
# The results should be close to:
# from statsmodels.stats.proportion import proportion_confint
# proportion_confint(k, n, alpha=0.05, method='wilson')
#z=1.44 = 85%, 1.96 = 95%
if n == 0:
return 0
p_hat = float(k) / n
z2 = z**2
if eqmode == 0:
ci_l = (p_hat + z2/(2*n) - z*math.sqrt(max(0.0, (p_hat*(1 - p_hat) + z2/(4*n))/n))) / (1 + z2 / n)
else:
ci_l = (1.0 / (1.0 + z2/n)) * (p_hat + z2/(2*n)) - (z / (1 + z2/n)) * math.sqrt(max(0.0, (p_hat*(1 - p_hat)/n + z2/(4*(n**2)))))
if eqmode == 0:
ci_u = (p_hat + z2/(2*n) + z*math.sqrt(max(0.0, (p_hat*(1 - p_hat) + z2/(4*n))/n))) / (1 + z2 / n)
else:
ci_u = (1.0 / (1.0 + z2/n)) * (p_hat + z2/(2*n)) + (z / (1 + z2/n)) * math.sqrt(max(0.0, (p_hat*(1 - p_hat)/n + z2/(4*(n**2)))))
return [ci_l, ci_u]
def propci_wilson_cc(n, k, z=1.96):
# Calculates the Binomial Proportion Confidence Interval using the Wilson Score method with continuation correction
# i.e. Method 4 in Newcombe [1]: R. G. Newcombe. Two-sided confidence intervals for the single proportion, 1998;
# verified via Table 1
# originally written by batesbatesbates https://stackoverflow.com/questions/10029588/python-implementation-of-the-wilson-score-interval/74021634#74021634
p_hat = k/n
q = 1.0-p
z2 = z**2
denom = 2*(n+z2)
num = 2.0*n*p_hat + z2 - 1.0 - z*math.sqrt(max(0.0, z2 - 2 - 1.0/n + 4*p_hat*(n*q + 1)))
ci_l = num/denom
num2 = 2.0*n*p_hat + z2 + 1.0 + z*math.sqrt(max(0.0, z2 + 2 - 1.0/n + 4*p_hat*(n*q - 1)))
ci_u = num2/denom
if p_hat == 0:
ci_l = 0.0
elif p_hat == 1:
ci_u = 1.0
return [ci_l, ci_u]
Note that the returned value will always be bounded between [0.0, 1.0] (due to how p_hat is a ratio of k/n), this is why it's a score and not really a confidence interval, but it's easy to project back to a confidence interval by multiplying ci_l * n and ci_u * n, these values will be in the same domain as k and can be plotted alongside.
Here is a much more readable version for how to compute the Wilson Score interval without continuity correction, by Bartosz Mikulski:
from math import sqrt
def wilson(p, n, z = 1.96):
denominator = 1 + z**2/n
centre_adjusted_probability = p + z*z / (2*n)
adjusted_standard_deviation = sqrt((p*(1 - p) + z*z / (4*n)) / n)
lower_bound = (centre_adjusted_probability - z*adjusted_standard_deviation) / denominator
upper_bound = (centre_adjusted_probability + z*adjusted_standard_deviation) / denominator
return (lower_bound, upper_bound)