All outputs going to zero: MNIST NumPy solution with simple neural net - python

I'm trying to use only NumPy to build a simple, reasonably accurate digit-recognition neural net. My code runs and loads the MNIST digit data correctly, but it ends up predicting, for every input, that the digit is unlikely to belong to any of the 10 classes (all outputs drift toward zero).
I think my error has to be something basic. Is there a huge issue with not having thresholds (biases)? Are my datatypes messed up? Anything to point me in the right direction would be hugely appreciated; I've been staring at this and tweaking stuff for hours.
Here is a link to my code on GitHub: https://github.com/popuguy/ai-tests/blob/master/npmnistnn.py
And here's a paste:
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

def display_mnist(img, label):
    '''Visually display the 28x28 unformatted array
    '''
    basic_array = img
    plt.imshow(basic_array.reshape((28,28)), cmap=cm.Greys)
    plt.suptitle('Image is of a ' + label)
    plt.show()

hidden_layer_1_num_nodes = 500
hidden_layer_2_num_nodes = 500
hidden_layer_3_num_nodes = 500
output_layer_num_nodes = 10
batch_size = 100
dimension = 28
full_iterations = 10

def convert_digit_to_onehot(digit):
    return [0] * digit + [1] + [0] * (9 - digit)

images = mnist.train.images
# images = np.add(images, 0.1)
labels = mnist.train.labels

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def slope_from_sigmoid(x):
    # sigmoid derivative, expressed in terms of the sigmoid's output x
    return x * (1 - x)

# weights in [-1, 1), no biases
syn1 = 2 * np.random.random((dimension**2, hidden_layer_1_num_nodes)) - 1
syn2 = 2 * np.random.random((hidden_layer_1_num_nodes, hidden_layer_2_num_nodes)) - 1
syn3 = 2 * np.random.random((hidden_layer_2_num_nodes, hidden_layer_3_num_nodes)) - 1
syn4 = 2 * np.random.random((hidden_layer_3_num_nodes, output_layer_num_nodes)) - 1

testing = False
test_n = 3

for iter in range(full_iterations):
    print('Epic epoch bro, we\'re at #' + str(iter+1))
    for section in range(0, len(images), batch_size):
        if testing:
            print('Syn before', syn1)
        training_images = images[section:section+batch_size]
        training_labels = labels[section:section+batch_size]
        # forward pass
        l0 = training_images
        l1 = sigmoid(np.dot(l0, syn1))
        l2 = sigmoid(np.dot(l1, syn2))
        l3 = sigmoid(np.dot(l2, syn3))
        l4 = sigmoid(np.dot(l3, syn4))
        # backpropagation
        l4_err = training_labels - l4
        l4_delta = l4_err * slope_from_sigmoid(l4)
        l3_err = np.dot(l4_delta, syn4.T)
        l3_delta = l3_err * slope_from_sigmoid(l3)
        l2_err = np.dot(l3_delta, syn3.T)
        l2_delta = l2_err * slope_from_sigmoid(l2)
        l1_err = np.dot(l2_delta, syn2.T)
        l1_delta = l1_err * slope_from_sigmoid(l1)
        # weight updates, applied raw (i.e. an implicit learning rate of 1.0)
        syn4_update = np.dot(l3.T, l4_delta)
        syn4 += syn4_update
        syn3_update = np.dot(l2.T, l3_delta)
        syn3 += syn3_update
        syn2_update = np.dot(l1.T, l2_delta)
        syn2 += syn2_update
        syn1_update = np.dot(l0.T, l1_delta)
        syn1 += syn1_update
        if testing:
            print('Syn after', syn1)
            print('Due to syn1 update', syn1_update)
            print('Number non-zero elems', len(syn1_update.nonzero()))
            print('Which were', syn1_update.nonzero())
            print('From the l1_delta', l1_delta)
            print(l0[0:test_n])
            print("----------")
            print(l1[0:test_n])
            print("----------")
            print(l2[0:test_n])
            print("----------")
            print(l3[0:test_n])
            print("----------")
            print(l4[0:test_n])
            print("----------")
            print(training_labels[0:test_n])
            a = input()
            if len(a) > 0 and a[0] == 's':
                testing = False

# evaluate on the last training batch
correct = 0
total = 0
l4list = l4.tolist()
training_labelslist = training_labels.tolist()
print('Num things', len(l4list))
for i in range(len(l4list)):
    print(["{0:0.2f}".format(a) for a in l4list[i]])
    # print(l4list[i])
    # display_mnist(l0[i], str(l4list[i].index(max(l4list[i]))))
    if l4list[i].index(max(l4list[i])) == training_labelslist[i].index(max(training_labelslist[i])):
        correct += 1
    total += 1
print('Final round', 100*(correct/total), 'percent correct')

The hyperparameters in this instance were simply poorly tuned. Bringing the number of nodes per hidden layer down to 15 and scaling the weight updates by a learning rate of 0.1 (the posted code applies the raw updates, i.e. an implicit rate of 1.0) yields a significant performance increase.
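A minimal sketch of that change, reusing the variable names from the question (the exact constants are illustrative, not prescribed by the answer):

# at the top, in place of the 500-node settings
hidden_layer_1_num_nodes = 15
hidden_layer_2_num_nodes = 15
hidden_layer_3_num_nodes = 15
learning_rate = 0.1

# inside the batch loop, scale every update instead of applying it raw
syn4 += learning_rate * np.dot(l3.T, l4_delta)
syn3 += learning_rate * np.dot(l2.T, l3_delta)
syn2 += learning_rate * np.dot(l1.T, l2_delta)
syn1 += learning_rate * np.dot(l0.T, l1_delta)

With sigmoid activations and hundreds of inputs per unit, full-size updates quickly saturate the units, after which slope_from_sigmoid is near zero and learning stalls.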

Related

Floating RMS in Python

I'm trying to implement a floating-window RMS in Python. I'm simulating an incoming stream of measurement data by simply iterating over time and calculating a sine wave. Since it's a perfect sine wave, it's easy to check the results mathematically. I also added a NumPy calculation to confirm my arrays are populated correctly.
However, my floating RMS is not returning the right values, regardless of my sample size.
Code:
import matplotlib.pyplot as plot
import numpy as np
import math

if __name__ == '__main__':
    # sine generation
    time_array = []
    value_array = []
    start = 0
    end = 6*math.pi
    steps = 100000
    amplitude = 10
    # rms calc
    acc_load_current = 0
    sample_size = 1000
    for time in np.linspace(0, end, steps):
        time_array.append(time)
        actual_value = amplitude * math.sin(time)
        value_array.append(actual_value)
        # rms calc
        acc_load_current -= (acc_load_current/sample_size)
        # square here
        sq_value = actual_value * actual_value
        acc_load_current += sq_value
    # mean and then root here
    floating_rms = np.sqrt(acc_load_current/sample_size)
    fixed_rms = np.sqrt(np.mean(np.array(value_array)**2))
    math_rms = 1/math.sqrt(2) * amplitude
    print(floating_rms)
    print(fixed_rms)
    print(math_rms)
    plot.plot(time_array, value_array)
    plot.show()
Result:
2.492669969708522
7.071032456438027
7.071067811865475
I solved the issue by using a recursive average with zero-crossing detection:
import matplotlib.pyplot as plot
import numpy as np
import math

def getAvg(prev_avg, x, n):
    return (prev_avg * n + x) / (n+1)

if __name__ == '__main__':
    # sine generation
    time_array = []
    value_array = []
    used_value_array = []
    start = 0
    end = 6*math.pi + 0.5
    steps = 10000
    amplitude = 325
    # rms calc
    rms_stream = 0
    stream_counter = 0
    # zero crossing
    in_crossing = 0
    crossing_counter = 0
    crossing_limits = [-5, 5]
    left_crossing = 0
    for time in np.linspace(0, end, steps):
        time_array.append(time)
        actual_value = amplitude * math.sin(time) + 4 * np.random.rand()
        value_array.append(actual_value)
        # detect zero crossing, by checking the first time we reach the limits
        # and then not counting until we left it again
        is_crossing = crossing_limits[0] < actual_value < crossing_limits[1]
        # when we are at amp/2 we can be sure the noise is not causing zero crossing
        left_crossing = abs(actual_value) > amplitude/2
        if is_crossing and not in_crossing:
            in_crossing = 1
            crossing_counter += 1
        elif not is_crossing and in_crossing and left_crossing:
            in_crossing = 0
        # rms calc
        # square here
        if 2 <= crossing_counter <= 3:
            sq_value = actual_value * actual_value
            rms_stream = getAvg(rms_stream, sq_value, stream_counter)
            stream_counter += 1
            # debugging by recording the used values
            used_value_array.append(actual_value)
        else:
            used_value_array.append(0)
    # mean and then root here
    stream_rms_sqrt = np.sqrt(rms_stream)
    fixed_rms_sqrt = np.sqrt(np.mean(np.array(value_array)**2))
    math_rms_sqrt = 1/math.sqrt(2) * amplitude
    print(stream_rms_sqrt)
    print(fixed_rms_sqrt)
    print(math_rms_sqrt)
    plot.plot(time_array, value_array, time_array, used_value_array)
    plot.show()
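For comparison: the original floating RMS reads low because its 1000-sample window spans only a tiny fraction of one sine period (about 3% of it at 100000 steps over 6*pi), so the accumulator tracks the recent instantaneous squared value, and at the end of the sweep (t = 6*pi) the sine is near a zero crossing. A sketch of an exact sliding-window RMS (my own illustration, reusing the question's signal parameters), which fixes this by making the window cover a full period:

from collections import deque
import math
import numpy as np

amplitude = 10
end = 6 * math.pi
steps = 100000
window = steps // 3  # ~one full period of the 6*pi sweep

buf = deque()  # the squared samples currently inside the window
acc = 0.0      # running sum of those squares
for t in np.linspace(0, end, steps):
    sq = (amplitude * math.sin(t)) ** 2
    buf.append(sq)
    acc += sq
    if len(buf) > window:
        acc -= buf.popleft()  # drop the square leaving the window

print(np.sqrt(acc / len(buf)))   # ~7.07 once the window spans a period
print(amplitude / math.sqrt(2))  # 7.071..., the analytic RMS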

MNIST handwritten digit

I've tried to make a script in Python that can recognize handwritten digits, using this data set: http://deeplearning.net/data/mnist/mnist.pkl.gz.
More information about this problem and about the algorithm that I'm trying to implement can be found at this link: http://neuralnetworksanddeeplearning.com/chap1.html
I've implemented a classification algorithm using a perceptron for each digit.
import cPickle, gzip
import numpy as np

f = gzip.open('mnist.pkl.gz', 'rb')
train_set, valid_set, test_set = cPickle.load(f)
f.close()

def activation(x):
    if x > 0:
        return 1
    return 0

bias = 0.5
learningRate = 0.01
images = train_set[0]
targets = train_set[1]
weights = np.random.uniform(0, 1, (10, 784))

for nr in range(0, 10):
    for i in range(0, 49999):
        x = images[i]
        t = targets[i]
        z = np.dot(weights[nr], x) + bias
        output = activation(z)
        weights[nr] = weights[nr] + (t - output) * x * learningRate
        bias = bias + (t - output) * learningRate

images = test_set[0]
targets = test_set[1]
OK = 0
for i in range(0, 10000):
    vec = []
    for j in range(0, 10):
        vec.append(np.dot(weights[j], images[i]))
    if np.argmax(vec) == targets[i]:
        OK = OK + 1
print("The network recognized " + str(OK) + '/' + "10000")
I usually recognize about 10% of the digits, which means my algorithm is doing nothing better than random guessing.
Even though I know this problem is popular and I could easily find another solution on the web, I'm still asking you to help me identify the mistakes in my code.
Maybe I've initialized the values of learningRate, bias and weights wrongly.
Thanks to @Kevinj22 and the others, I was able to solve this problem in the end.
import cPickle, gzip
import numpy as np

f = gzip.open('mnist.pkl.gz', 'rb')
train_set, valid_set, test_set = cPickle.load(f)
f.close()

def activation(x):
    if x > 0:
        return 1
    return 0

learningRate = 0.01
images = train_set[0]
targets = train_set[1]
weights = np.random.uniform(0, 1, (10, 784))

for nr in range(0, 10):
    for i in range(0, 50000):
        x = images[i]
        t = targets[i]
        z = np.dot(weights[nr], x)
        output = activation(z)
        # binary one-vs-all target: 1 only when this perceptron's digit matches the label
        if nr == t:
            target = 1
        else:
            target = 0
        adjust = np.multiply((target - output) * learningRate, x)
        weights[nr] = np.add(weights[nr], adjust)

images = test_set[0]
targets = test_set[1]
OK = 0
for i in range(0, 10000):
    vec = []
    for j in range(0, 10):
        vec.append(np.dot(weights[j], images[i]))
    if np.argmax(vec) == targets[i]:
        OK = OK + 1
print("The network recognized " + str(OK) + '/' + "10000")
Here is my updated code. I didn't introduce loss computation in my first attempt. I also got rid of the bias because I didn't find it useful in my implementation.
I ran this piece of code 10 times, with an average accuracy of 88%.
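The crucial change is the binary one-vs-all target. A tiny sketch of the same idea (an illustration only, not code from the answer):

import numpy as np

# Row k of the 10x10 identity is the one-vs-all target for digit k, so
# perceptron nr trains against 1 only when nr equals the label.
labels = np.array([3, 0, 9])   # a hypothetical mini-batch of labels
targets = np.eye(10)[labels]   # shape (3, 10), one one-hot row per label
print(targets[:, 3])           # [1. 0. 0.] -- what perceptron nr == 3 sees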

Trying to build neural net for digit recognition in Python. Unable to get theta2 and predictions correct

I am following Andrew Ng's Coursera course on machine learning. I am trying to build a 3-layer neural net for digit recognition in Python (784 input, 25 hidden, 10 output). However, I am unable to get the predictions (on the training data) correct (accuracy < 5% at 100 iterations, and not increasing with iteration).
J (the cost function) seems to be going down (see photo 1), and I have done gradient checking (before minimizing), which matches to around 1e-11 (see photo 2).
I have compared theta1 and theta2 after 100 iterations to my working Matlab code (see code snippet 1 for Octave and code snippet 2 for Python). theta1 seems reasonably similar, but theta2 is very different -- see code snippet 2. (I know they should differ because of the different optimisation routines. However, firstly, I placed the same initial thetas into both codes. Secondly, my reasoning is that they should start to converge, or at least get close, after 100 iterations.)
The only error I see is:
-c:32: RuntimeWarning: overflow encountered in exp
when running the sigmoid during the optimising. However, I was told that this is not critical and that it is normal to encounter during optimising. Furthermore, because it is a sigmoid, any time the input is large the output will tend towards 1 anyway.
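(As an aside, the overflow warning itself can be avoided with a numerically stable sigmoid; the sketch below is an illustration, not part of the posted code, and assumes z is a NumPy array:)

import numpy as np

def stable_sigmoid(z):
    # Evaluate exp only where its argument is negative, so np.exp never
    # overflows for large |z|; both branches equal 1/(1+e^-z) algebraically.
    out = np.empty_like(z, dtype=float)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out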
I have also attached my code in snippet 3. I have cut out all the other non-essential bits (like gradient checking) to make it as short as possible.
I would appreciate any help with this, as I cannot even find where it is going wrong, let alone fix it. Thank you.
Photos:
[Photo 1] J (cost function) decreasing to 1.8 after 12 iterations
[Photo 2] Gradient checking before optimizing -- the two columns look very similar
Code snippet 1 (Octave output):
Initializing Neural Network Parameters ...
initial1
-0.0100100
-0.0771400
-0.1113800
-0.0230100
0.0547800
-0.0505500
-0.0731200
-0.0988700
0.0128000
-0.0855400
-0.1002500
-0.1137200
-0.0669300
-0.0999900
0.0084500
-0.0363200
-0.0588600
-0.0431100
-0.1133700
-0.0326300
0.0282800
0.0052400
-0.1134600
-0.0617700
0.0267600
initial2
0.0273700
0.1026000
-0.0502100
-0.0699100
0.0190600
0.1004000
0.0784600
-0.0075900
-0.0362100
0.0286200
Doing fminunc
Training Neural Network...
Iteration 100 | Cost: 6.219605e-01
theta1
-0.0099719
-0.0768462
-0.1109559
-0.0229224
0.0545714
-0.0503575
-0.0728415
-0.0984935
0.0127513
-0.0852143
-0.0998682
-0.1132869
-0.0666751
-0.0996092
0.0084178
-0.0361817
-0.0586359
-0.0429458
-0.1129383
-0.0325057
0.0281723
0.0052200
-0.1130279
-0.0615348
0.0266581
theta2
1.124918
1.603780
-1.266390
-0.848874
0.037956
-1.360841
2.145562
-1.448657
-1.262285
-1.357635
Code snippet 2 (Python output):
theta1_initial
[-0.01001 -0.07714 -0.11138 -0.02301 0.05478 -0.05055 -0.07312 -0.09887
0.0128 -0.08554 -0.10025 -0.11372 -0.06693 -0.09999 0.00845 -0.03632
-0.05886 -0.04311 -0.11337 -0.03263 0.02828 0.00524 -0.11346 -0.06177
0.02676]
theta2_initial
[ 0.02737 0.1026 -0.05021 -0.06991 0.01906 0.1004 0.07846 -0.00759
-0.03621 0.02862]
Doing fminunc
-c:32: RuntimeWarning: overflow encountered in exp
theta1
[-0.00997202 -0.07680716 -0.11086841 -0.02292044 0.05455335 -0.05034252
-0.07280686 -0.09842603 0.01275117 -0.08516515 -0.0997987 -0.11319546
-0.06664666 -0.09954009 0.00841804 -0.03617494 -0.05861458 -0.04293555
-0.1128474 -0.0325006 0.02816879 0.00522031 -0.1129369 -0.06151103
0.02665508]
theta2
[ 0.27954826 -0.08007496 -0.36449273 -0.22988024 0.06849659 -0.47803973
1.09023041 -0.25570559 -0.24537494 -0.40341995]
#-----------------BEGIN HEADERS-----------------
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import csv
import scipy.optimize  # imported explicitly so scipy.optimize.minimize is available
#-----------------END HEADERS-----------------

#-----------------BEGIN FUNCTION 1-----------------
def randinitialize(L_in, L_out):
    w = np.zeros((L_out, 1 + L_in))
    epsilon_init = 0.12
    w = np.random.rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init
    return w
#-----------------END FUNCTION 1-----------------

#-----------------BEGIN FUNCTION 2-----------------
def sigmoid(lz):
    g = 1.0/(1.0+np.exp(-lz))
    return g
#-----------------END FUNCTION 2-----------------

#-----------------BEGIN FUNCTION 3-----------------
def sigmoidgradient(lz):
    g = np.multiply(sigmoid(lz), (1-sigmoid(lz)))
    return g
#-----------------END FUNCTION 3-----------------

#-----------------BEGIN FUNCTION 4-----------------
def nncostfunction(ltheta_ravel, linput_layer_size, lhidden_layer_size, lnum_labels, lx, ly, llambda_reg):
    ltheta1 = np.array(np.reshape(ltheta_ravel[:lhidden_layer_size * (linput_layer_size + 1)], (lhidden_layer_size, (linput_layer_size + 1))))
    ltheta2 = np.array(np.reshape(ltheta_ravel[lhidden_layer_size * (linput_layer_size + 1):], (lnum_labels, (lhidden_layer_size + 1))))
    ltheta1_grad = np.zeros((np.shape(ltheta1)))
    ltheta2_grad = np.zeros((np.shape(ltheta2)))
    y_matrix = []
    lm = np.shape(lx)[0]
    eye_matrix = np.eye(lnum_labels)
    for i in range(len(ly)):
        y_matrix.append(eye_matrix[int(ly[i])-1,:])  # the minus one as python is zero based
    y_matrix = np.array(y_matrix)
    a1 = np.hstack((np.ones((lm,1)), lx)).astype(float)
    z2 = sigmoid(ltheta1.dot(a1.T))
    a2 = (np.concatenate((np.ones((np.shape(z2)[1], 1)), z2.T), axis=1)).astype(float)
    a3 = sigmoid(ltheta2.dot(a2.T))
    h = a3
    J_unreg = 0
    J = 0
    J_unreg = (1/float(lm))*np.sum(
        -np.multiply(y_matrix, np.log(h.T))
        - np.multiply((1-y_matrix), np.log(1-h.T)), axis=None)
    J = J_unreg + (llambda_reg/(2*float(lm)))*(
        np.sum(np.multiply(ltheta1[:,1:], ltheta1[:,1:]), axis=None)
        + np.sum(np.multiply(ltheta2[:,1:], ltheta2[:,1:]), axis=None))
    delta3 = a3.T - y_matrix
    delta2 = np.multiply((delta3.dot(ltheta2[:,1:])), (sigmoidgradient(ltheta1.dot(a1.T))).T)
    cdelta2 = ((a2.T).dot(delta3)).T
    cdelta1 = ((a1.T).dot(delta2)).T
    ltheta1_grad = (1/float(lm))*cdelta1
    ltheta2_grad = (1/float(lm))*cdelta2
    theta1_hold = ltheta1
    theta2_hold = ltheta2
    theta1_hold[:,0] = 0
    theta2_hold[:,0] = 0
    ltheta1_grad = ltheta1_grad + (llambda_reg/float(lm))*theta1_hold
    ltheta2_grad = ltheta2_grad + (llambda_reg/float(lm))*theta2_hold
    thetagrad_ravel = np.concatenate((np.ravel(ltheta1_grad), np.ravel(ltheta2_grad)))
    return (J, thetagrad_ravel)
#-----------------END FUNCTION 4-----------------

#-----------------BEGIN FUNCTION 5-----------------
def predict(ltheta1, ltheta2, x):
    m, n = np.shape(x)
    p = np.zeros(m)
    h1 = sigmoid((np.hstack((np.ones((m,1)), x.astype(float)))).dot(ltheta1.T))
    h2 = sigmoid((np.hstack((np.ones((m,1)), h1))).dot(ltheta2.T))
    for i in range(0, np.shape(h2)[0]):
        p[i] = np.argmax(h2[i,:])
    return p
#-----------------END FUNCTION 5-----------------

## Setup the parameters you will use for this exercise
input_layer_size = 784  # 28x28 Input Images of Digits
hidden_layer_size = 25  # 25 hidden units
num_labels = 10         # 10 labels, from 0 to 9

data = []
# Reading in data, split into X and y, rewrite label 0 to 10 (for easy comparison to course)
with open('train.csv', 'rb') as csvfile:
    has_header = csv.Sniffer().has_header(csvfile.read(1024))
    csvfile.seek(0)  # rewind
    data_csv = csv.reader(csvfile, delimiter=',')
    if has_header:
        next(data_csv)
    for row in data_csv:
        data.append(row)
data = np.array(data)
x = data[:,1:]
y = data[:,0]
y = y.astype(int)
for i in range(len(y)):
    if y[i] == 0:
        y[i] = 10

# Set basic parameters
m, n = np.shape(x)
lambda_reg = 1.0

# Randomly initalize weights for Theta_initial
#theta1_initial = np.genfromtxt('tt1.csv', delimiter=',')
#theta2_initial = np.genfromtxt('tt2.csv', delimiter=',')
theta1_initial = randinitialize(input_layer_size, hidden_layer_size)
theta2_initial = randinitialize(hidden_layer_size, num_labels)
theta_initial_ravel = np.concatenate((np.ravel(theta1_initial), np.ravel(theta2_initial)))

# Doing optimize
fmin = scipy.optimize.minimize(fun=nncostfunction, x0=theta_initial_ravel, args=(input_layer_size, hidden_layer_size, num_labels, x, y, lambda_reg), method='L-BFGS-B', jac=True, options={'maxiter': 10, 'disp': True})
fmin
theta1 = np.array(np.reshape(fmin.x[:hidden_layer_size * (input_layer_size + 1)], (hidden_layer_size, (input_layer_size + 1))))
theta2 = np.array(np.reshape(fmin.x[hidden_layer_size * (input_layer_size + 1):], (num_labels, (hidden_layer_size + 1))))

p = predict(theta1, theta2, x)
for i in range(len(y)):
    if y[i] == 10:
        y[i] = 0
correct = [1 if a == b else 0 for (a, b) in zip(p, y)]
accuracy = (sum(map(int, correct)) / float(len(correct)))
print 'accuracy = {0}%'.format(accuracy * 100)
I think I have fixed the problem: it seems I messed up the index. It should be:
y_matrix.append(eye_matrix[int(ly[i]),:])
instead of:
y_matrix.append(eye_matrix[int(ly[i])-1,:])
(Note: the unshifted index assumes the labels stay in 0-9; with the 0→10 remapping in the script above, index 10 would fall outside the 10x10 identity matrix.)
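A quick sanity check of why the unshifted index is the consistent one (my own illustration, assuming labels kept in 0-9):

import numpy as np

# Row k of the identity is the one-hot target whose argmax is k, so training
# against eye[label] lines up with np.argmax(h2[i,:]) in predict().
eye = np.eye(10)
for label in (0, 3, 9):
    assert np.argmax(eye[label]) == label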

Neural network XOR gate not learning

I'm trying to make an XOR gate using a 2-layer perceptron network, but for some reason the network is not learning; when I plot the change of error on a graph, the error flattens out at a certain level and oscillates in that region.
I haven't added any bias to the network at the moment.
import numpy as np

def S(x):
    return 1/(1+np.exp(-x))

win = np.random.randn(2,2)
wout = np.random.randn(2,1)
eta = 0.15
# win = [[1,1], [2,2]]
# wout = [[1],[2]]
obj = [[0,0],[1,0],[0,1],[1,1]]
target = [0,1,1,0]
epoch = int(10000)
emajor = ""

for r in range(0,epoch):
    for xy in range(len(target)):
        tar = target[xy]
        fdata = obj[xy]
        fdata = S(np.dot(1,fdata))
        hnw = np.dot(fdata,win)
        hnw = S(np.dot(fdata,win))
        out = np.dot(hnw,wout)
        out = S(out)
        diff = tar-out
        E = 0.5 * np.power(diff,2)
        emajor += str(E[0]) + ",\n"
        delta_out = (out-tar)*(out*(1-out))
        nindelta_out = delta_out * eta
        wout_change = np.dot(nindelta_out[0], hnw)
        for x in range(len(wout_change)):
            change = wout_change[x]
            wout[x] -= change
        delta_in = np.dot(hnw,(1-hnw)) * np.dot(delta_out[0], wout)
        nindelta_in = eta * delta_in
        for x in range(len(nindelta_in)):
            midway = np.dot(nindelta_in[x][0], fdata)
            for y in range(len(win)):
                win[y][x] -= midway[y]

f = open('xor.csv','w')
f.write(emajor)  # python will convert \n to os.linesep
f.close()  # you can omit in most cases as the destructor will call it
This is the error changing with the number of learning rounds. Is this correct? The red line is how I was expecting the error to change.
Is there anything wrong in my code? I can't seem to figure out what's causing the problem. Help much appreciated.
Thanks in advance
Here is a one-hidden-layer network with backpropagation which can be customized to run experiments with relu, sigmoid and other activations. After several experiments it was concluded that with relu the network performed better and reached convergence sooner, while with sigmoid the loss value fluctuated. This happens because "the gradient of sigmoids becomes increasingly small as the absolute value of x increases".
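A quick numeric check of that claim (my own illustration, not from the original answer):

import numpy as np

# The sigmoid gradient s(x)*(1-s(x)) peaks at 0.25 and collapses as |x|
# grows, while relu's gradient stays 1 for any positive input.
s = lambda x: 1.0 / (1.0 + np.exp(-x))
for x in (0.0, 2.0, 5.0, 10.0):
    print("%.1f -> %.2e" % (x, s(x) * (1 - s(x))))  # 2.5e-01, 1.05e-01, 6.65e-03, 4.54e-05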
import numpy as np
import matplotlib.pyplot as plt
from operator import xor

class neuralNetwork():
    def __init__(self):
        # Define hyperparameters
        self.noOfInputLayers = 2
        self.noOfOutputLayers = 1
        self.noOfHiddenLayerNeurons = 2
        # Define weights
        self.W1 = np.random.rand(self.noOfInputLayers, self.noOfHiddenLayerNeurons)
        self.W2 = np.random.rand(self.noOfHiddenLayerNeurons, self.noOfOutputLayers)

    def relu(self, z):
        return np.maximum(0, z)

    def sigmoid(self, z):
        return 1/(1+np.exp(-z))

    def forward(self, X):
        self.z2 = np.dot(X, self.W1)
        self.a2 = self.relu(self.z2)
        self.z3 = np.dot(self.a2, self.W2)
        yHat = self.relu(self.z3)
        return yHat

    def costFunction(self, X, y):
        # Compute cost for given X,y, use weights already stored in class.
        self.yHat = self.forward(X)
        J = 0.5*sum((y-self.yHat)**2)
        return J

    def costFunctionPrime(self, X, y):
        # Compute derivative with respect to W1 and W2
        delta3 = np.multiply(-(y-self.yHat), self.sigmoid(self.z3))
        djw2 = np.dot(self.a2.T, delta3)
        delta2 = np.dot(delta3, self.W2.T)*self.sigmoid(self.z2)
        djw1 = np.dot(X.T, delta2)
        return djw1, djw2

if __name__ == "__main__":
    EPOCHS = 6000
    SCALAR = 0.01
    nn = neuralNetwork()
    COST_LIST = []
    inputs = [np.array([[0,0]]), np.array([[0,1]]), np.array([[1,0]]), np.array([[1,1]])]
    for epoch in xrange(1, EPOCHS):
        cost = 0
        for i in inputs:
            X = i  # inputs
            y = xor(X[0][0], X[0][1])
            cost += nn.costFunction(X, y)[0]
            djw1, djw2 = nn.costFunctionPrime(X, y)
            nn.W1 = nn.W1 - SCALAR*djw1
            nn.W2 = nn.W2 - SCALAR*djw2
        COST_LIST.append(cost)
    plt.plot(np.arange(1, EPOCHS), COST_LIST)
    plt.ylim(0, 1)
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.title(str('Epochs: '+str(EPOCHS)+', Scalar: '+str(SCALAR)))
    plt.show()
    inputs = [np.array([[0,0]]), np.array([[0,1]]), np.array([[1,0]]), np.array([[1,1]])]
    print "X\ty\ty_hat"
    for inp in inputs:
        print (inp[0][0], inp[0][1]), "\t", xor(inp[0][0], inp[0][1]), "\t", round(nn.forward(inp)[0][0], 4)
End Result:
X y y_hat
(0, 0) 0 0.0
(0, 1) 1 0.9997
(1, 0) 1 0.9997
(1, 1) 0 0.0005
The weights obtained after training were:
nn.w1
[ [-0.81781753 0.71323677]
[ 0.48803631 -0.71286155] ]
nn.w2
[ [ 2.04849235]
[ 1.40170791] ]
I found the following YouTube series extremely helpful for understanding neural nets: Neural networks demystified
There is only so much that can be explained in this answer. If you want an even better understanding of neural nets, I would suggest you go through the following link: cs231n: Modelling one neuron
The error calculated in each epoch should be the sum of the squared errors over all targets (i.e. the error for every target):
import numpy as np

def S(x):
    return 1/(1+np.exp(-x))

win = np.random.randn(2,2)
wout = np.random.randn(2,1)
eta = 0.15
# win = [[1,1], [2,2]]
# wout = [[1],[2]]
obj = [[0,0],[1,0],[0,1],[1,1]]
target = [0,1,1,0]
epoch = int(10000)
emajor = ""

for r in range(0,epoch):
    # ***** initialize final error *****
    finalError = 0
    for xy in range(len(target)):
        tar = target[xy]
        fdata = obj[xy]
        fdata = S(np.dot(1,fdata))
        hnw = np.dot(fdata,win)
        hnw = S(np.dot(fdata,win))
        out = np.dot(hnw,wout)
        out = S(out)
        diff = tar-out
        E = 0.5 * np.power(diff,2)
        # ***** sum all errors *****
        finalError += E
        delta_out = (out-tar)*(out*(1-out))
        nindelta_out = delta_out * eta
        wout_change = np.dot(nindelta_out[0], hnw)
        for x in range(len(wout_change)):
            change = wout_change[x]
            wout[x] -= change
        delta_in = np.dot(hnw,(1-hnw)) * np.dot(delta_out[0], wout)
        nindelta_in = eta * delta_in
        for x in range(len(nindelta_in)):
            midway = np.dot(nindelta_in[x][0], fdata)
            for y in range(len(win)):
                win[y][x] -= midway[y]
    # ***** Save final error *****
    emajor += str(finalError[0]) + ",\n"

f = open('xor.csv','w')
f.write(emajor)  # python will convert \n to os.linesep
f.close()  # you can omit in most cases as the destructor will call it

Python: TypeError: 'float' object has no attribute '__getitem__'

I am trying to implement a particle filter algorithm in Python. I am getting this error:
x_P_update[i] = 0.5*x_P[i] + 25*x_P[i]/(1 + x_P[i]**2) + 8*math.cos(1.2*(t-1)) + math.sqrt(x_N)*np.random.randn()
TypeError: 'float' object has no attribute '__getitem__'
My code:
import math
import numpy as np
import matplotlib.pyplot as plt
x = 0.1 #initial value
x_N = 1 #process noise covariance in state update
x_R = 1 #noise covariance in measurement
T = 75 #number of iterations
N = 10 #number of particles
V = 2
x_P = [None]*(N)
for i in xrange(0, N):
x_P[i] = x + math.sqrt(V)*np.random.randn()
z_out = np.array([x**2 / 20 + math.sqrt(x_R) * np.random.randn()]) #the actual output vector for measurement values.
x_out = np.array([x]) #the actual output vector for measurement values.
x_est = np.array([x]); # time by time output of the particle filters estimate
x_est_out = np.array([x_est]) # the vector of particle filter estimates.
x_P_update = [None]*N
z_update = [None]*N
P_w = [None]*N
for t in xrange(1, T+1):
x = 0.5*x + 25*x/(1 + x**2) + 8*math.cos(1.2*(t-1)) + math.sqrt(x_N)*np.random.randn()
z = x**2/20 + math.sqrt(x_R)*np.random.randn()
for i in xrange(0, N):
#each particle is updated with process eq
x_P_update[i] = 0.5*x_P[i] + 25*x_P[i]/(1 + x_P[i]**2) + 8*math.cos(1.2*(t-1)) + math.sqrt(x_N)*np.random.randn()
#observations are updated for each particle
z_update[i] = x_P_update[i]**2/20
#generate weights
P_w[i] = (1/math.sqrt(2*math.pi*x_R)) * math.exp(-(z - z_update[i])**2/(2*x_R))
P_w[:] = [ k / sum(P_w) for k in P_w]
# print(np.where(np.cumsum(P_w, axis=0) >= np.random.rand()))
# print(index_tuple[0][1])
# P_w_array = np.array(list(P_w))
# indices = [i for i in range(len(P_w)) if np.cumsum(P_w_array) >= np.random.rand()]
for i in xrange(0, N):
index_tuple = np.where(np.random.rand() <= np.cumsum(P_w, axis=0))
m = index_tuple[0][1]
x_P = x_P_update[m]
x_est = np.array([np.mean(x_P)])
x_out = np.array([x_out, x])
z_out = np.array([z_out, z])
x_est_out = np.array([x_est_out, x_est])
I am using the Matlab code from here to learn how to implement this algorithm in Python using SciPy: http://studentdavestutorials.weebly.com/particle-filter-with-matlab-code.html
I just started learning Python and can't get past this problem; kindly help.
I'm not going to go through the video tutorial and fix your algorithm, but I can show you why you're getting this error.
In this line:
x_P = x_P_update[m]
You are overwriting the x_P list with a single float value, which you then attempt to index as an array in the next outer-loop iteration. Updating the element instead will get rid of your error:
x_P[m] = x_P_update[m]
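A minimal reproduction of the error (an illustration only, not from the post):

x_P = 3.7  # what x_P becomes after `x_P = x_P_update[m]`
try:
    x_P[0]  # what the next outer-loop iteration attempts
except TypeError as e:
    print(e)  # 'float' object has no attribute '__getitem__' (Python 2)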
