Problems solving XOR with a genetic algorithm - Python

I'm trying to solve the XOR problem using a neural network, trained with a genetic algorithm. Settings:
population size: 200
max generations: 10000
crossover rate: 0.8
mutation rate: 0.1
number of weights: 9
activation function: sigmoid
selection method: fitness-proportionate (higher probability for the fittest individuals)
Code:
def crossover(self, wfather, wmother):
    r = np.random.random()
    if r <= self.crossover_perc:
        new_weight = self.crossover_perc*wfather + (1-self.crossover_perc)*wmother
        new_weight2 = self.crossover_perc*wmother + (1-self.crossover_perc)*wfather
        return new_weight, new_weight2
    else:
        return wfather, wmother

def select(self, fits):
    percentuais = np.array(fits) / float(sum(fits))
    vet = [percentuais[0]]
    for p in percentuais[1:]:
        vet.append(vet[-1] + p)
    r = np.random.random()
    #print(len(vet), r)
    for i in range(len(vet)):
        if r <= vet[i]:
            return i

def mutate(self, weight):
    r = np.random.random()
    if r <= self.mut_perc:
        mutr = np.random.randint(self.number_weights)
        weight[mutr] = weight[mutr] + np.random.normal()
    return weight

def activation_fuction(self, net):
    return 1 / (1 + math.exp(-net))
Problem:
About 5 out of 10 runs work fine.
Expected Output:
0,0 0
0,1 1
1,0 1
1,1 0
Tests:
It's inconsistent: sometimes I get four 0's, sometimes three 1's, all sorts of results.
Could you help me find the error?
**Edit**
Full code:
def create_initial_population(self):
    population = np.random.uniform(-40, 40, [self.population_size, self.number_weights])
    return population

def feedforward(self, inp1, inp2, weights):
    bias = 1
    x = self.activation_fuction(bias * weights[0] + (inp1 * weights[1]) + (inp2 * weights[2]))
    x2 = self.activation_fuction(bias * weights[3] + (inp1 * weights[4]) + (inp2 * weights[5]))
    out = self.activation_fuction(bias * weights[6] + (x * weights[7]) + (x2 * weights[8]))
    print(inp1, inp2, out)
    return out

def fitness(self, weights):
    y1 = abs(0.0 - self.feedforward(0.0, 0.0, weights))
    y2 = abs(1.0 - self.feedforward(0.0, 1.0, weights))
    y3 = abs(1.0 - self.feedforward(1.0, 0.0, weights))
    y4 = abs(0.0 - self.feedforward(1.0, 1.0, weights))
    error = (y1 + y2 + y3 + y4) ** 2
    # print("Error: ", 1/error)
    return 1 / error

def sortpopbest(self, pop):
    pop_with_fit = [(weights, self.fitness(weights)) for weights in pop]
    sorted_population = sorted(pop_with_fit, key=lambda weights_fit: weights_fit[1])  # sorted worst -> best
    fits = []
    pop = []
    for i in sorted_population:
        pop.append(i[0])
        fits.append(i[1])
    return pop, fits

def execute(self):
    pop = self.create_initial_population()
    for g in range(self.max_generations):  # maximum number of generations
        pop, fits = self.sortpopbest(pop)
        nova_pop = []
        for c in range(int(self.population_size/2)):
            weights = pop[self.select(fits)]
            weights2 = pop[self.select(fits)]
            new_weights, new_weights2 = self.crossover(weights, weights2)
            new_weights = self.mutate(new_weights)
            new_weights2 = self.mutate(new_weights2)
            #print(fits)
            nova_pop.append(new_weights)  # append to the new population
            nova_pop.append(new_weights2)
        pop = nova_pop
        print(len(fits), fits)

Some input:
XOR is a simple problem. With a few hundred random initializations, you should get some lucky ones that solve it immediately (if "solved" means that the output is correct after thresholding). This is a good test to see whether your initialization and feed-forward pass are correct, without debugging the whole GA all at once. Or you could just hand-craft the correct weights and biases and see if that works, as in the sketch below.
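With the weight layout from the question's feedforward (weights[0..2] feed hidden unit 1, weights[3..5] hidden unit 2, weights[6..8] the output unit), one hand-crafted solution makes hidden1 act like OR, hidden2 like AND, and the output like "OR and not AND". A minimal sketch; the magnitudes are my own choice (anything large enough works), and net stands for an instance of the question's class:

weights = [-10, 20, 20,   # bias, w_in1, w_in2 for hidden unit 1 (~OR)
           -30, 20, 20,   # bias, w_in1, w_in2 for hidden unit 2 (~AND)
           -10, 20, -20]  # bias, w_h1, w_h2 for the output unit
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    net.feedforward(a, b, weights)  # should print roughly 0, 1, 1, 0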
Your initial weights (uniform in -40...+40) are far too large. For XOR this may be okay-ish, but in general initial weights should be small enough that most neurons don't saturate, yet not so small that they sit entirely in the linear zone of the sigmoid.
After your implementation works, have a look at a numpy implementation of the feed-forward pass of a neural network to see how to do it with less code, for example the sketch below.
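For instance, a minimal vectorized sketch of the same 2-2-1 architecture; the names and shapes here are my own, not taken from any particular library:

import numpy as np

def feedforward_batch(X, W1, b1, W2, b2):
    # X: (n, 2) inputs; W1: (2, 2); b1: (2,); W2: (2, 1); b2: (1,)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    hidden = sigmoid(X @ W1 + b1)      # both hidden units at once
    return sigmoid(hidden @ W2 + b2)   # output unit

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
# feedforward_batch(X, ...) evaluates all four XOR cases in one call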


L-BFGS-B code, Scipy (sciopt.fmin_l_bfgs_b(func, init_guess, maxiter=10, bounds=list(bounds), disp=1, iprint=101))

I'm using the L-BFGS-B optimizer to find the minima of a function; this will help me calculate sharpness for the function. However, I'm not sure whether the following message is normal, i.e., is there something wrong with my program, or is this message typical? See below:
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 28149514 M = 10
At X0 0 variables are exactly at the bounds
At iterate 0 f= -3.59325D+00 |proj g|= 2.10249D-03
At iterate 1 f= -2.47853D+01 |proj g|= 4.20499D-03
Bad direction in the line search;
refresh the lbfgs memory and restart the iteration.
At iterate 2 f= -2.53202D+01 |proj g|= 4.17686D-03
At iterate 3 f= -2.53202D+01 |proj g|= 4.17686D-03
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
***** 3 43 ****** 0 ***** 4.177D-03 -2.532D+01
F = -25.320247650146484
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
Warning: more than 10 function and gradient
evaluations in the last line search. Termination
may possibly be caused by a bad search direction.
I got the following sharpness anyway, which is reasonably consistent with the paper I'm trying to reproduce; it's just that I'm a bit concerned about the above message.
tensor(473.0201)
Here is my code for computing sharpness:
def get_sharpness(data_loader, model, criterion, epsilon, manifolds=0):
    # extract current x0
    x0 = None
    for p in model.parameters():
        if x0 is None:
            x0 = p.data.view(-1)
        else:
            x0 = torch.cat((x0, p.data.view(-1)))
    x0 = x0.cpu().numpy()
    # get current f_x
    f_x0, _ = get_minus_cross_entropy(x0, data_loader, model, criterion)
    f_x0 = -f_x0
    logging.info('min loss f_x0 = {loss:.4f}'.format(loss=f_x0))
    # find the minimum
    if 0 == manifolds:
        x_min = np.reshape(x0 - epsilon * (np.abs(x0) + 1), (x0.shape[0], 1))
        x_max = np.reshape(x0 + epsilon * (np.abs(x0) + 1), (x0.shape[0], 1))
        bounds = np.concatenate([x_min, x_max], 1)
        func = lambda x: get_minus_cross_entropy(x, data_loader, model, criterion, training=True)
        init_guess = x0
    else:
        warnings.warn("Small manifolds may not be able to explore the space.")
        assert(manifolds <= x0.shape[0])
        #transformer = rp.GaussianRandomProjection(n_components=manifolds)
        #transformer.fit(np.random.rand(manifolds, x0.shape[0]))
        #A_plus = transformer.components_
        #A = np.linalg.pinv(A_plus)
        A_plus = np.random.rand(manifolds, x0.shape[0])*2. - 1.
        # normalize each row to unit length
        A_plus_norm = np.linalg.norm(A_plus, axis=1)
        A_plus = A_plus / np.reshape(A_plus_norm, (manifolds, 1))
        A = np.linalg.pinv(A_plus)
        abs_bound = epsilon * (np.abs(np.dot(A_plus, x0)) + 1)
        abs_bound = np.reshape(abs_bound, (abs_bound.shape[0], 1))
        bounds = np.concatenate([-abs_bound, abs_bound], 1)
        def func(y):
            floss, fg = get_minus_cross_entropy(x0 + np.dot(A, y), data_loader, model, criterion, training=True)
            return floss, np.dot(np.transpose(A), fg)
        #func = lambda y: get_minus_cross_entropy(x0+np.dot(A, y), data_loader, model, criterion, training=True)
        init_guess = np.zeros(manifolds)
        #rand_selections = (np.random.rand(bounds.shape[0])+1e-6)*0.99
        #init_guess = np.multiply(1.-rand_selections, bounds[:,0])+np.multiply(rand_selections, bounds[:,1])
    minimum_x, f_x, d = sciopt.fmin_l_bfgs_b(func, init_guess, maxiter=10, bounds=list(bounds), disp=1, iprint=101)
    #factr=10.,
    #pgtol=1.e-12,
    f_x = -f_x
    logging.info('max loss f_x = {loss:.4f}'.format(loss=f_x))
    sharpness = (f_x - f_x0)/(1 + f_x0)*100
    print(sharpness)
    # recover the model
    x0 = torch.from_numpy(x0).float()
    x0 = x0.cuda()
    x_start = 0
    for p in model.parameters():
        psize = p.data.size()
        peltnum = 1
        for s in psize:
            peltnum *= s
        x_part = x0[x_start:x_start + peltnum]
        p.data = x_part.view(psize)
        x_start += peltnum
    return sharpness
Which was taken from this repository:
https://github.com/wenwei202/smoothout/blob/master/measure_sharpness.py
I'm concerned about exact accuracy.
First, L-BFGS-B will only give a global minimum for a convex function.
The message
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
is the normal convergence message.
The warning you are getting says that there were a lot of function/gradient evaluations in the last line search; this can often happen when you use L-BFGS-B on non-convex functions. So if the thing you're minimizing is non-convex (and it seems like it might be, just by glancing at the code), I would say this is normal.
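If local minima are the worry, a common mitigation is to restart the optimizer from several random points and keep the best result. A minimal sketch on an invented non-convex objective; nothing here comes from the question's code:

import numpy as np
import scipy.optimize as sciopt

# Toy non-convex objective with an analytic gradient (made up for illustration):
def func(x):
    f = np.sum(0.1 * x**2 + np.cos(3 * x))
    g = 0.2 * x - 3 * np.sin(3 * x)
    return f, g

# A single L-BFGS-B run may land in any local minimum; restarting from
# several random points and keeping the lowest value is a cheap mitigation.
results = [sciopt.fmin_l_bfgs_b(func, np.random.uniform(-2, 2, 5))
           for _ in range(20)]
x_best, f_best, info = min(results, key=lambda r: r[1])
print(f_best)  # still only the best local minimum found, not a guarantee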

Why don't my 2 functions give the same results?

Below is the code of my attempt to make a neural network with 2 inputs and 3 outputs. While the training gives good results, when I try to input the numbers the results are way off. After making some small changes, I observed that even though both functions return the output of the same computation, the results were different. The only explanation I can think of is that there is a bug.
The functions that I'm talking about are "train" and "result".
Here is the code:
from numpy import dot, exp, max, sum, random, array

class Network:
    def __init__(self):
        self.w = random.random((2, 3))

    def sigmoid(self, x, derivate=False):
        if derivate == True:
            return x * (1 - x)
        return 1 / (1 + exp(-x))

    def train(self):
        trainingInput = array([[0, 0], [0, 1], [1, 0], [1, 1]])
        trainingOutput = array([[0, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
        n = 0
        while n < 10000:
            exOutput = self.sigmoid(dot(trainingInput, self.w) - 0.1)
            error = trainingOutput - exOutput
            self.w += dot(trainingInput.T, error * self.sigmoid(exOutput, True))
            n += 1
        return exOutput

    def result(self):
        trainingInput = array([[0, 0], [0, 1], [1, 0], [1, 1]])
        exOutput = self.sigmoid(dot(trainingInput, self.w) - 0.1)
        return exOutput
network = Network()
c = 0
d = 1
o = network.result()
output = network.train()
print(o)
print(output)
You should first train and then check the results; if you check them before training, the two results will obviously differ. You can simply recompute the result once more after training, which should resolve the discrepancy, as in the sketch below.
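A minimal sketch of the corrected call order, using the Network class from the question:

network = Network()
network.train()            # train first, so self.w gets updated
output = network.result()  # now computed from the trained weights
print(output)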

Trying to build neural net for digit recognition in Python. Unable to get theta2 and predictions correct

I am following Andrew Ng's Coursera course on machine learning. I am trying to build a 3-layer neural net for digit recognition in Python (784 input units, 25 hidden, 10 output). However, I am unable to get the predictions (on the training data) correct: accuracy is below 5% at 100 iterations and does not increase with further iterations.
J (the cost function) seems to be going down (see photo 1) and I have done gradient checking (before minimizing) and it seems to match to around 1e-11 (see photo 2).
I have compared theta1 and theta2 after 100 iterations to my working MATLAB code (see code snippet 1 for Octave and code snippet 2 for Python). theta1 is reasonably similar, but theta2 is very different; see code snippet 2. (I know they should differ because of the different optimisation routines. However, first, I placed the same initial thetas into both codes; second, my reasoning is that they should start to converge, or at least get close, after 100 iterations.)
The only error I see is:
-c:32: RuntimeWarning: overflow encountered in exp
when running the sigmoid during the optimisation. However, I was told that this is not critical and that it is normal to encounter it while optimising. Furthermore, because it is a sigmoid, any time the input is large the output tends towards 1 anyway.
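For reference, the warning itself can be avoided with a numerically stable sigmoid. A minimal sketch; the clipping bound is an arbitrary safe choice, and scipy.special.expit is an alternative:

import numpy as np
from scipy.special import expit  # option 1: scipy's numerically stable sigmoid

def sigmoid(lz):
    # option 2: clip the argument so np.exp never overflows float64
    return 1.0 / (1.0 + np.exp(-np.clip(lz, -500, 500)))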
I have also attached my code in snippet 3. I have cut out all the other non-essential bits (like gradient checking) to make it as short as possible.
I would appreciate any help into this as I cannot even find where it is going wrong, let alone fix it. Thank you.
Photos:
J (cost function) decreasing to 1.8 after 12 iterations
Gradient checking before optimizing, they look very similar
Code snippet:
Initializing Neural Network Parameters ...
initial1
-0.0100100
-0.0771400
-0.1113800
-0.0230100
0.0547800
-0.0505500
-0.0731200
-0.0988700
0.0128000
-0.0855400
-0.1002500
-0.1137200
-0.0669300
-0.0999900
0.0084500
-0.0363200
-0.0588600
-0.0431100
-0.1133700
-0.0326300
0.0282800
0.0052400
-0.1134600
-0.0617700
0.0267600
initial2
0.0273700
0.1026000
-0.0502100
-0.0699100
0.0190600
0.1004000
0.0784600
-0.0075900
-0.0362100
0.0286200
Doing fminunc
Training Neural Network...
Iteration 100 | Cost: 6.219605e-01
theta1
-0.0099719
-0.0768462
-0.1109559
-0.0229224
0.0545714
-0.0503575
-0.0728415
-0.0984935
0.0127513
-0.0852143
-0.0998682
-0.1132869
-0.0666751
-0.0996092
0.0084178
-0.0361817
-0.0586359
-0.0429458
-0.1129383
-0.0325057
0.0281723
0.0052200
-0.1130279
-0.0615348
0.0266581
theta2
1.124918
1.603780
-1.266390
-0.848874
0.037956
-1.360841
2.145562
-1.448657
-1.262285
-1.357635
theta1_initial
[-0.01001 -0.07714 -0.11138 -0.02301 0.05478 -0.05055 -0.07312 -0.09887
0.0128 -0.08554 -0.10025 -0.11372 -0.06693 -0.09999 0.00845 -0.03632
-0.05886 -0.04311 -0.11337 -0.03263 0.02828 0.00524 -0.11346 -0.06177
0.02676]
theta2_initial
[ 0.02737 0.1026 -0.05021 -0.06991 0.01906 0.1004 0.07846 -0.00759
-0.03621 0.02862]
Doing fminunc
-c:32: RuntimeWarning: overflow encountered in exp
theta1
[-0.00997202 -0.07680716 -0.11086841 -0.02292044 0.05455335 -0.05034252
-0.07280686 -0.09842603 0.01275117 -0.08516515 -0.0997987 -0.11319546
-0.06664666 -0.09954009 0.00841804 -0.03617494 -0.05861458 -0.04293555
-0.1128474 -0.0325006 0.02816879 0.00522031 -0.1129369 -0.06151103
0.02665508]
theta2
[ 0.27954826 -0.08007496 -0.36449273 -0.22988024 0.06849659 -0.47803973
1.09023041 -0.25570559 -0.24537494 -0.40341995]
#-----------------BEGIN HEADERS-----------------
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import csv
import scipy
#-----------------END HEADERS-----------------

#-----------------BEGIN FUNCTION 1-----------------
def randinitialize(L_in, L_out):
    epsilon_init = 0.12
    w = np.random.rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init
    return w
#-----------------END FUNCTION 1-----------------

#-----------------BEGIN FUNCTION 2-----------------
def sigmoid(lz):
    g = 1.0/(1.0 + np.exp(-lz))
    return g
#-----------------END FUNCTION 2-----------------

#-----------------BEGIN FUNCTION 3-----------------
def sigmoidgradient(lz):
    g = np.multiply(sigmoid(lz), (1 - sigmoid(lz)))
    return g
#-----------------END FUNCTION 3-----------------

#-----------------BEGIN FUNCTION 4-----------------
def nncostfunction(ltheta_ravel, linput_layer_size, lhidden_layer_size, lnum_labels, lx, ly, llambda_reg):
    ltheta1 = np.array(np.reshape(ltheta_ravel[:lhidden_layer_size * (linput_layer_size + 1)], (lhidden_layer_size, (linput_layer_size + 1))))
    ltheta2 = np.array(np.reshape(ltheta_ravel[lhidden_layer_size * (linput_layer_size + 1):], (lnum_labels, (lhidden_layer_size + 1))))
    ltheta1_grad = np.zeros((np.shape(ltheta1)))
    ltheta2_grad = np.zeros((np.shape(ltheta2)))
    y_matrix = []
    lm = np.shape(lx)[0]
    eye_matrix = np.eye(lnum_labels)
    for i in range(len(ly)):
        y_matrix.append(eye_matrix[int(ly[i])-1, :])  # the minus one as python is zero based
    y_matrix = np.array(y_matrix)
    a1 = np.hstack((np.ones((lm, 1)), lx)).astype(float)
    z2 = sigmoid(ltheta1.dot(a1.T))
    a2 = (np.concatenate((np.ones((np.shape(z2)[1], 1)), z2.T), axis=1)).astype(float)
    a3 = sigmoid(ltheta2.dot(a2.T))
    h = a3
    J_unreg = (1/float(lm))*np.sum(
        -np.multiply(y_matrix, np.log(h.T))
        - np.multiply((1 - y_matrix), np.log(1 - h.T)), axis=None)
    J = J_unreg + (llambda_reg/(2*float(lm)))*(
        np.sum(np.multiply(ltheta1[:, 1:], ltheta1[:, 1:]), axis=None)
        + np.sum(np.multiply(ltheta2[:, 1:], ltheta2[:, 1:]), axis=None))
    delta3 = a3.T - y_matrix
    delta2 = np.multiply((delta3.dot(ltheta2[:, 1:])), (sigmoidgradient(ltheta1.dot(a1.T))).T)
    cdelta2 = ((a2.T).dot(delta3)).T
    cdelta1 = ((a1.T).dot(delta2)).T
    ltheta1_grad = (1/float(lm))*cdelta1
    ltheta2_grad = (1/float(lm))*cdelta2
    theta1_hold = ltheta1
    theta2_hold = ltheta2
    theta1_hold[:, 0] = 0
    theta2_hold[:, 0] = 0
    ltheta1_grad = ltheta1_grad + (llambda_reg/float(lm))*theta1_hold
    ltheta2_grad = ltheta2_grad + (llambda_reg/float(lm))*theta2_hold
    thetagrad_ravel = np.concatenate((np.ravel(ltheta1_grad), np.ravel(ltheta2_grad)))
    return (J, thetagrad_ravel)
#-----------------END FUNCTION 4-----------------

#-----------------BEGIN FUNCTION 5-----------------
def predict(ltheta1, ltheta2, x):
    m, n = np.shape(x)
    p = np.zeros(m)
    h1 = sigmoid((np.hstack((np.ones((m, 1)), x.astype(float)))).dot(ltheta1.T))
    h2 = sigmoid((np.hstack((np.ones((m, 1)), h1))).dot(ltheta2.T))
    for i in range(0, np.shape(h2)[0]):
        p[i] = np.argmax(h2[i, :])
    return p
#-----------------END FUNCTION 5-----------------

## Setup the parameters you will use for this exercise
input_layer_size = 784  # 28x28 input images of digits
hidden_layer_size = 25  # 25 hidden units
num_labels = 10         # 10 labels, from 0 to 9

data = []
# Reading in data, split into X and y, rewrite label 0 to 10 (for easy comparison to course)
with open('train.csv', 'rb') as csvfile:
    has_header = csv.Sniffer().has_header(csvfile.read(1024))
    csvfile.seek(0)  # rewind
    data_csv = csv.reader(csvfile, delimiter=',')
    if has_header:
        next(data_csv)
    for row in data_csv:
        data.append(row)
data = np.array(data)
x = data[:, 1:]
y = data[:, 0]
y = y.astype(int)
for i in range(len(y)):
    if y[i] == 0:
        y[i] = 10

# Set basic parameters
m, n = np.shape(x)
lambda_reg = 1.0

# Randomly initialize weights for Theta_initial
#theta1_initial = np.genfromtxt('tt1.csv', delimiter=',')
#theta2_initial = np.genfromtxt('tt2.csv', delimiter=',')
theta1_initial = randinitialize(input_layer_size, hidden_layer_size)
theta2_initial = randinitialize(hidden_layer_size, num_labels)
theta_initial_ravel = np.concatenate((np.ravel(theta1_initial), np.ravel(theta2_initial)))

# Doing optimize
fmin = scipy.optimize.minimize(fun=nncostfunction, x0=theta_initial_ravel, args=(input_layer_size, hidden_layer_size, num_labels, x, y, lambda_reg), method='L-BFGS-B', jac=True, options={'maxiter': 10, 'disp': True})
theta1 = np.array(np.reshape(fmin.x[:hidden_layer_size * (input_layer_size + 1)], (hidden_layer_size, (input_layer_size + 1))))
theta2 = np.array(np.reshape(fmin.x[hidden_layer_size * (input_layer_size + 1):], (num_labels, (hidden_layer_size + 1))))

p = predict(theta1, theta2, x)
for i in range(len(y)):
    if y[i] == 10:
        y[i] = 0
correct = [1 if a == b else 0 for (a, b) in zip(p, y)]
accuracy = (sum(map(int, correct)) / float(len(correct)))
print 'accuracy = {0}%'.format(accuracy * 100)
I think I have fixed the problem: it seems I messed up the index. It should be:
y_matrix.append(eye_matrix[int(ly[i]),:])
instead of:
y_matrix.append(eye_matrix[int(ly[i])-1,:])
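As an aside, the whole label-encoding loop can be replaced by fancy indexing into the identity matrix. A one-line sketch, assuming ly already holds integer labels 0..9 (i.e., skipping the 0 -> 10 remapping):

y_matrix = np.eye(lnum_labels)[np.asarray(ly, dtype=int)]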

Finding the self-consistent solution to an equation

At the bottom of this question is a set of functions transcribed from a published neural-network model. When I call R, I get the following error:
RuntimeError: maximum recursion depth exceeded while calling a Python object
Note that within each call to R, a recursive call to R is made for every other neuron in the network; this is what causes the recursion depth to be exceeded. Each return value of R depends on all the others (the network involves N = 512 values in total). Does anyone have any idea what method should be used to compute the self-consistent solution for R? Note that R itself is a smooth function. I've tried treating this as a vector root-finding problem, but in this case the 512 dimensions are not independent; with so many degrees of freedom, the roots are never found (using the scipy.optimize functions). Does Python have any tools that can help with this? Maybe it would be more natural to solve for R using something like Mathematica? I don't know how this is normally done.
"""Recurrent model with strong excitatory recurrence."""
import numpy as np
l = 3.14
def R(x_i):
"""Steady-state firing rate of neuron at location x_i.
Parameters
----------
x_i : number
Location of this neuron.
Returns
-------
rate : float
Firing rate.
"""
N = 512
T = 1
x = np.linspace(-2, 2, N)
sum_term = 0
for x_j in x:
sum_term += J(x_i - x_j) * R(x_j)
rate = I_S(x_i) + I_A(x_i) + 1.0 / N * sum_term - T
if rate < 0:
return 0
return rate
def I_S(x):
"""Sensory input.
Parameters
----------
x : number
Location of this neuron.
Returns
-------
float
Sensory input to neuron at x.
"""
S_0 = 0.46
S_1 = 0.66
x_S = 0
sigma_S = 1.31
return S_0 + S_1 * np.exp(-0.5 * (x - x_S) ** 2 / sigma_S ** 2)
def I_A(x):
"""Attentional additive bias.
Parameters
----------
x : number
Location of this neuron.
Returns
-------
number
Additive bias for neuron at x.
"""
x_A = 0
A_1 = 0.089
sigma_A = 0.35
A_0 = 0
sigma_A_prime = 0.87
if np.abs(x - x_A) < l:
return (A_1 * np.exp(-0.5 * (x - x_A) ** 2 / sigma_A ** 2) +
A_0 * np.exp(-0.5 * (x - x_A) ** 2 / sigma_A_prime ** 2))
return 0
def J(dx):
"""Connection strength.
Parameters
----------
dx : number
Neuron i's distance from neuron j.
Returns
-------
number
Connection strength.
"""
J_0 = -2.5
J_1 = 8.5
sigma_J = 1.31
if np.abs(dx) < l:
return J_0 + J_1 * np.exp(-0.5 * dx ** 2 / sigma_J ** 2)
return 0
if __name__ == '__main__':
pass
This recursion never ends, since there is no termination condition before the recursive call; adjusting the maximum recursion depth does not help:
def R(x_i):
    ...
    for x_j in x:
        sum_term += J(x_i - x_j) * R(x_j)
Perhaps you should be doing something like
# some suitable initial guess
state = guess
while True:  # or a fixed number of iterations
    next_state = compute_next_state(state)
    if some_condition_check(state, next_state):
        # return answer
        return state
    if some_other_check(state, next_state):
        # something wrong, terminate
        raise ...
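Applied to the model in the question, that iteration can be vectorized over all 512 neurons. A minimal fixed-point sketch that reuses I_S, I_A, and J from the posted code; convergence is not guaranteed for strong recurrence, and damping may be needed if the update oscillates:

import numpy as np

N, T = 512, 1
x = np.linspace(-2, 2, N)
Jmat = np.array([[J(xi - xj) for xj in x] for xi in x])  # connection matrix
I_ext = np.array([I_S(xi) + I_A(xi) for xi in x])        # external input per neuron
r = np.zeros(N)                                          # initial guess for the rates
for _ in range(10000):
    r_next = np.maximum(I_ext + Jmat.dot(r) / N - T, 0)  # rectified update, as in R
    if np.max(np.abs(r_next - r)) < 1e-9:                # self-consistent: fixed point
        break
    r = r_next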
Change the maximum recursion depth using sys.setrecursionlimit:
import sys
sys.setrecursionlimit(10000)

def rec(i):
    if i > 1000:
        print 'i is over 1000!'
        return
    rec(i + 1)

rec(0)
More info: https://docs.python.org/3/library/sys.html#sys.setrecursionlimit

Simulating a neuron spike train in python

The model I'm working on has a neuron (modeled by the Hodgkin-Huxley equations), and the neuron itself receives a bunch of synaptic inputs from other neurons because it is in a network. The standard way to model the inputs is with a spike train made up of a bunch of delta-function pulses that arrive at a specified rate, as a Poisson process. Some of the pulses provide an excitatory reaction to the neuron, and some provide an inhibitory pulse. So the synaptic current should look like this:

$$I_{syn}(t) = C_m G_{ex}\left(\sum_{k=1}^{N_e}\sum_{l} h_k\,\delta(t - t_k^l) - K\sum_{m=1}^{N_i}\sum_{n} h_m\,\delta(t - t_m^n)\right)$$

Here, $N_e$ is the number of excitatory neurons, $N_i$ the number of inhibitory ones, the $h$'s are either 0 or 1 (1 with probability $p$), representing whether or not a spike was successfully transmitted, and $t_k^l$ in the delta function is the discharge time of the l-th spike of the k-th neuron (likewise $t_m^n$ for the inhibitory ones). The basic idea behind how we tried coding this was to suppose first that I had 100 neurons providing pulses into my HH neuron (80 excitatory, 20 inhibitory). We then formed an array in which one column enumerated the neurons (so that neurons #0-79 were excitatory and #80-99 inhibitory). We then checked to see if there was a spike in some time interval; if there was, we chose a random number between 0 and 1 and, if it was below my specified probability p, assigned it the number 1, otherwise 0. We then plot the voltage as a function of time to see when the neuron spikes.
I think the code works, BUT the problem is that as soon as I add more neurons to the network (one paper claimed they used 5000 neurons in total), it takes forever to run, which is unfeasible for numerical simulations. My question is: is there a better way to simulate a spike train pulsing into a neuron, so that the computation is substantially faster for a large number of neurons in the network? Here is the code we tried (it's a little long because the HH equations are quite detailed):
import scipy as sp
import numpy as np
import pylab as plt

# Constants
C_m = 1.0      # membrane capacitance, in uF/cm^2
g_Na = 120.0   # sodium (Na) maximum conductance, in mS/cm^2
g_K = 36.0     # potassium (K) maximum conductance, in mS/cm^2
g_L = 0.3      # leak maximum conductance, in mS/cm^2
E_Na = 50.0    # sodium (Na) Nernst reversal potential, in mV
E_K = -77.0    # potassium (K) Nernst reversal potential, in mV
E_L = -54.387  # leak Nernst reversal potential, in mV

def poisson_spikes(t, N=100, rate=1.0):
    spks = []
    dt = t[1] - t[0]
    for n in range(N):
        spkt = t[np.random.rand(len(t)) < rate*dt/1000.]  # determine list of spike times
        idx = [n]*len(spkt)  # neuron ID vector of the same length as the spike times
        spkn = np.concatenate([[idx], [spkt]], axis=0).T  # combine the two lists
        if len(spkn) > 0:
            spks.append(spkn)
    spks = np.concatenate(spks, axis=0)
    return spks

N = 100
N_ex = 80  # (0..79)
N_in = 20  # (80..99)
G_ex = 1.0
K = 4

dt = 0.01
t = sp.arange(0.0, 300.0, dt)  # the time to integrate over
ic = [-65, 0.05, 0.6, 0.32]
spks = poisson_spikes(t, N, rate=10.)

def alpha_m(V):
    return 0.1*(V+40.0)/(1.0 - sp.exp(-(V+40.0) / 10.0))

def beta_m(V):
    return 4.0*sp.exp(-(V+65.0) / 18.0)

def alpha_h(V):
    return 0.07*sp.exp(-(V+65.0) / 20.0)

def beta_h(V):
    return 1.0/(1.0 + sp.exp(-(V+35.0) / 10.0))

def alpha_n(V):
    return 0.01*(V+55.0)/(1.0 - sp.exp(-(V+55.0) / 10.0))

def beta_n(V):
    return 0.125*sp.exp(-(V+65) / 80.0)

def I_Na(V, m, h):
    return g_Na * m**3 * h * (V - E_Na)

def I_K(V, n):
    return g_K * n**4 * (V - E_K)

def I_L(V):
    return g_L * (V - E_L)

def I_app(t):
    return 3

def I_syn(spks, t):
    """
    Synaptic current
    spks = [[synid, t],]
    """
    exspk = spks[spks[:,0] < N_ex]  # check for all excitatory spikes
    delta_k = exspk[:,1] == t       # delta function
    if sum(delta_k) > 0:
        h_k = np.random.rand(len(delta_k)) < 0.5  # p = 0.5
    else:
        h_k = 0
    inspk = spks[spks[:,0] >= N_ex]  # check remaining neurons for inhibitory spikes
    delta_m = inspk[:,1] == t        # delta function for inhibitory neurons
    if sum(delta_m) > 0:
        h_m = np.random.rand(len(delta_m)) < 0.5  # p = 0.5
    else:
        h_m = 0
    isyn = C_m*G_ex*(np.sum(h_k*delta_k) - K*np.sum(h_m*delta_m))
    return isyn

def dALLdt(X, t):
    V, m, h, n = X
    dVdt = (I_app(t) + I_syn(spks, t) - I_Na(V, m, h) - I_K(V, n) - I_L(V)) / C_m
    dmdt = alpha_m(V)*(1.0-m) - beta_m(V)*m
    dhdt = alpha_h(V)*(1.0-h) - beta_h(V)*h
    dndt = alpha_n(V)*(1.0-n) - beta_n(V)*n
    return np.array([dVdt, dmdt, dhdt, dndt])

X = [ic]
for i in t[1:]:
    dx = dALLdt(X[-1], i)
    x = X[-1] + dt*dx
    X.append(x)
X = np.array(X)
V = X[:,0]
m = X[:,1]
h = X[:,2]
n = X[:,3]
ina = I_Na(V, m, h)
ik = I_K(V, n)
il = I_L(V)

plt.figure()
plt.subplot(3,1,1)
plt.title('Hodgkin-Huxley Neuron')
plt.plot(t, V, 'k')
plt.ylabel('V (mV)')
plt.subplot(3,1,2)
plt.plot(t, ina, 'c', label='$I_{Na}$')
plt.plot(t, ik, 'y', label='$I_{K}$')
plt.plot(t, il, 'm', label='$I_{L}$')
plt.ylabel('Current')
plt.legend()
plt.subplot(3,1,3)
plt.plot(t, m, 'r', label='m')
plt.plot(t, h, 'g', label='h')
plt.plot(t, n, 'b', label='n')
plt.ylabel('Gating Value')
plt.legend()
plt.show()
I'm not familiar with other packages designed specifically for neural networks, but I wanted to write my own, mainly because I plan to do stochastic analysis, which requires quite a bit of mathematical detail, and I don't know whether those packages provide that level of detail.
Profiling shows that most of your time is being spent in these two lines:
if sum(delta_k) > 0:
and
if sum(delta_m) > 0:
Changing each of these to:
if np.any(...)
speeds everything up by a factor of 10. Take a look at kernprof if you'd like to do more line-by-line profiling:
https://github.com/rkern/line_profiler
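For reference, a minimal sketch of how kernprof/line_profiler is used; sim.py is an assumed file name, and the @profile decorator is injected by kernprof at run time, so no import is needed:

@profile
def I_syn(spks, t):
    ...  # body as in the question

# then, from the shell:
#   kernprof -l -v sim.py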
To complement welch's answer, you can use scipy.integrate.odeint to accelerate the integration: replacing
X = [ic]
for i in t[1:]:
    dx = dALLdt(X[-1], i)
    x = X[-1] + dt*dx
    X.append(x)
by
from scipy.integrate import odeint
X = odeint(dALLdt, ic, t)
speeds the calculation up by more than a factor of 10 on my computer.
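In the same spirit, the per-neuron loop in poisson_spikes can be vectorized. A sketch under the same rate convention as the question (rate in Hz, t in ms):

import numpy as np

def poisson_spikes_vec(t, N=100, rate=1.0):
    dt = t[1] - t[0]
    fired = np.random.rand(N, len(t)) < rate * dt / 1000.  # one Bernoulli draw per neuron and time bin
    nrn, idx = np.nonzero(fired)                           # (neuron id, time-bin index) pairs
    return np.column_stack([nrn, t[idx]])                  # same [[synid, t], ...] layout as the question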
If you have an NVIDIA graphics board, you can use numba/numbapro to accelerate your Python code and reach real-time simulation of 4K neurons, each with 128 presynaptic neurons.
