TensorFlow Linear Regression - Exponential Model Not Fitting Exponent (Python)

I'm trying to fit an exponentially decaying model (y = A*x^B + C) to some data, but have yet to get a value other than 0 for B. I have two "working" sets of code right now: one steps through each X,Y pair, and the other attempts to use the entire [X,Y] array, but I'm not sure I have implemented that correctly. For now I'd just like it to fit a curve correctly. The linear model works fine, so I'm not sure where this is going south.
Data is here - PASTEBIN
#!/usr/bin/python
import numpy as np
import tensorflow as tf
import sys
import matplotlib.pyplot as plt
k=0
xdata= []
ydata = []
# Open the data and read it in, ignore the header.
with open('curvedata_full_formatted.csv') as f:
    for line in f:
        k += 1
        if k == 1: continue
        items = line.split(',')
        xdata.append(float(items[0]))
        ydata.append(float(items[1]))
# Model linear regression y = A*x^B+C
# x - data to be fed into the model - 1 feature
x = tf.placeholder(tf.float32, [None, 1])
# A - training variable - 1 feature, 1 output
A = tf.Variable(tf.zeros([1,1]))
# B - training variable - 1 output
B = tf.Variable(tf.zeros([1,1]))
# C - training variable - 1 output
C = tf.Variable(tf.zeros([1]))
# x^B
xb = tf.exp(B)
# A*x^b
product = tf.mul(A,xb)
# Prediction
y = tf.add(product,C)
# Actual value ybar
y_ = tf.placeholder(tf.float32)
# Cost function sum((y_-y)**2)
cost = tf.reduce_mean(tf.square(y_-y))
# Training using Gradient Descent to minimize cost
train_step = tf.train.GradientDescentOptimizer(1*10**-9).minimize(cost)
sess = tf.Session()
init = tf.initialize_all_variables()
sess.run(init)
steps = 150
for i in range(steps):
    # Read in data from log file and use as x,y
    for (X,Y) in zip(xdata,ydata):
        #xs = np.array([[xdata]])
        #ys = np.array([[ydata]])
        # Train
        # Feed dict x placeholder xs, y_ placeholder ys
        X = np.array([[X]])
        Y = np.array([[Y]])
        feed = { x: X, y_: Y }
        sess.run(train_step, feed_dict=feed)
    sys.stdout.write("\rIteration %i " % i + "cost %.15f" % sess.run(cost, feed_dict=feed))
    sys.stdout.flush()
print ''
print 'A: %f'%sess.run(A)
print 'B: %f'%sess.run(B)
print 'C: %f'%sess.run(C)

As a test, try starting the optimizer with initial values close to the expected final parameters. This test will tell you whether or not the problem is in the selection of initial parameter values.
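As a minimal sketch of that test (the numbers below are hypothetical placeholders, not values derived from your data), initialize the variables near the parameters you expect instead of at zero:

A = tf.Variable(tf.constant([[1500.0]]))  # hypothetical expected scale
B = tf.Variable(tf.constant([[-0.5]]))    # hypothetical expected exponent
C = tf.Variable(tf.constant([100.0]))     # hypothetical expected offset

If training then stays near a sensible fit, the problem is in the choice of initial values (and possibly the very small learning rate); if B still collapses to 0, the problem is elsewhere in the model or in the feeding code.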

Related

Linear Regression using Tensor flow gives different weights and cost every time

I am trying to implement linear regression using TensorFlow. Following is the code I am using.
import tensorflow as tf
import numpy as np
import pandas as pd
import os
rng = np.random
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
# reading data from a csv file
file1 = pd.read_csv('out.csv')
x_data=file1['^GSPC']
# converting datafram into array
x_data=x_data.values
y_data=file1['FB']
#converting dataframe into array
y_data=y_data.values
n_steps = 1000 #Total number of steps
n_iterations = [] #Nth iteration value
n_loss = [] #Loss at nth iteration
learned_weight = [] #weight at nth iteration
learned_bias = [] #bias value at nth iteration
# Try to find values for W and b that compute y_data = W * x_data + b
W = tf.Variable(rng.randn())
b = tf.Variable(rng.rand())
y = W * x_data + b
# Minimize the mean squared errors.
loss=tf.reduce_sum(tf.pow(y-y_data, 2))/(2*28)
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
with tf.Session() as sess:
    # Before starting, initialize the variables. We will 'run' this first.
    sess.run(tf.global_variables_initializer())
    for step in range(n_steps):
        sess.run(train)
        n_iterations.append(step)
        n_loss.append(loss.eval())
        learned_weight.append(W.eval())
        learned_bias.append(b.eval())
    print("Final Weight: "+str(learned_weight[-1])+", Final Bias: "+str(learned_bias[-1]) + ", Final cost:"+str(n_loss[-1]))
The problem is that every time I run the code I get different results (weights, bias and cost/loss). I have read in a few resources that the weights, bias and cost should be approximately the same in every run.
Secondly, the fitted line (i.e. y = weights*x_data + bias) does not fit the training data well.
Thirdly, I have to convert the dataframes x_data and y_data to arrays as follows:
x_data=x_data.values
y_data=y_data.values
If I don't do this, my code raises the following error:
Traceback (most recent call last): File "python", line 33, in File "tensorflow/python/framework/fast_tensor_util.pyx", line 120, in tensorflow.python.framework.fast_tensor_util.AppendObjectArrayToTensorProto TypeError: Expected binary or unicode string, got tf.Tensor 'sub:0' shape=(28,) dtype=float32
Please help me understand what I am doing wrong!
P.S: My questions may sound stupid because I am new to TensorFlow and machine learning.
The code has a couple of implementation problems:
Use tf.placeholder for data that will be passed into the model.
Use the feed_dict argument of sess.run to pass data to the placeholder when executing the graph.
Here's an updated example:
Build the Graph
import numpy as np
import tensorflow as tf
# dataset
X_data = np.random.randn(100,3)
y_data = 2*np.sum(X_data, 1)+0.01
# reshape y to be a column vector
y_data = np.reshape(y_data, [-1, 1])
# parameters
n_steps = 1000 #Total number of steps
batch_size = 20
input_length = X_data.shape[0] # => 100
display_cost = 500
# data placeholders
X = tf.placeholder(shape=[None, 3],dtype = tf.float32)
y = tf.placeholder(shape=[None, 1],dtype = tf.float32)
# build the model
W = tf.Variable(initial_value = tf.random_normal([3,1]))
b = tf.Variable(np.random.rand())
y_fitted = tf.add(tf.matmul(X, W), b)
# Minimize the mean squared errors
loss=tf.losses.mean_squared_error(labels=y, predictions=y_fitted)
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
Execute in Session
# execute in Session
with tf.Session() as sess:
    # initialize all variables
    tf.global_variables_initializer().run()
    # Train the model
    for steps in range(n_steps):
        mini_batch = zip(range(0, input_length, batch_size),
                         range(batch_size, input_length+1, batch_size))
        # train data in mini-batches
        for (start, end) in mini_batch:
            sess.run(optimizer, feed_dict={X: X_data[start:end],
                                           y: y_data[start:end]})
        # print training performance
        if (steps+1) % display_cost == 0:
            print('Step: {}'.format((steps+1)))
            # evaluate loss function
            cost = sess.run(loss, feed_dict={X: X_data,
                                             y: y_data})
            print('Cost: {}'.format(cost))
    # report rmse for training and test data
    print('\nFinal Weight: {}'.format(W.eval()))
    print('\nFinal Bias: {}'.format(b.eval()))
Output for 2 runs
# Run 1
Step: 500
Cost: 3.1569701713918263e-11
Step: 1000
Cost: 3.1569701713918263e-11
Final Weight: [[2.0000048]
[2.0000024]
[1.9999973]]
Final Bias: 0.010000854730606079
# Run 2
Step: 500
Cost: 7.017615221566187e-12
Step: 1000
Cost: 7.017615221566187e-12
Final Weight: [[1.9999975]
[1.9999989]
[1.9999999]]
Final Bias: 0.0099998963996768
Indeed, the weight and bias come out approximately the same across multiple runs on the same dataset. Also, for numerical computation NumPy ndarrays are generally the preferred data format, hence the conversion with .values.
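If you want runs to be exactly reproducible rather than just approximately equal, one option (not part of the answer above, and assuming the TF 1.x API used here) is to fix the random seeds before building the graph, so the random weight initialization is the same every run:

import numpy as np
import tensorflow as tf

np.random.seed(42)       # makes rng.randn()/rng.rand() repeatable
tf.set_random_seed(42)   # makes TensorFlow's own random ops repeatable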

how do I predict answers in tensorflow using a model trained with linear regression?

I've modified some really simple tutorial code (below) to train a linear regression model (way simple because I'm trying to learn how it works). The training works fine, as I get the predicted result. I am, however, finding it impossible to figure out the syntax to feed in a piece of test data (x) and get the model to spit out the predicted y value. The error I get is "cannot feed value of shape (1,) for Tensor which has shape (?,1)".
Can somebody take a look at the last section (#rebuild the graph so we can make a prediction from test data....) and point me in the right direction please? Thank you in advance.
sample data for x is in column 0 of each row in the CSV
import numpy as np
import tensorflow as tf
import csv
# Model linear regression y = Wx + b
x = tf.placeholder(tf.float32, [None, 1], name='varx')
W = tf.Variable(tf.zeros([1,1]), name='W')
b = tf.Variable(tf.zeros([1]), name='b')
product = tf.matmul(x, W)
y = tf.placeholder(tf.float32, [None, 1], name='vary')
y = product + b
y_ = tf.placeholder(tf.float32, [None, 1], name='varouty')
# Cost function sum((y_-y)**2)
cost = tf.reduce_mean(tf.square(y_-y))
# Training using Gradient Descent to minimize cost
train_step = tf.train.GradientDescentOptimizer(0.0000001).minimize(cost)
sess = tf.Session()
init = tf.initialize_all_variables()
sess.run(init)
epochs = 500
list1 = []
with open(r'c:\temp\book1.csv','r') as f:
    reader = csv.reader(f)
    for row in reader:
        i2 = int(row[0])
        list1.append(i2)
# now loop through the list and add the items
for i in range(epochs):
    # Create fake data for y = W.x + b where W = 2, b = 0
    i0 = list1[i]
    xs = np.array([[i0]])
    ys = np.array([[2*i0]])
    # Train
    feed = { x: xs, y_: ys }
    sess.run(train_step, feed_dict=feed)
print("xs=" + str(xs))
print("ys=" + str(ys))
print("After %d iterations:" % i)
print("W: %f" % sess.run(W))
print("b: %f" % sess.run(b))
# NOTE: W should be close to 2, and b should be close to 0
# rebuild the graph so we can make a prediction from test data....
xg = tf.get_default_graph()
x_input = xg.get_tensor_by_name('varx:0')
y_output = xg.get_tensor_by_name('vary:0')
W1 = xg.get_tensor_by_name('W:0')
b1 = xg.get_tensor_by_name('b:0')
with tf.Session(graph=xg) as sess1:
    x_example = [3]
    y_prediction = sess1.run(y_output, feed_dict={x_input: x_example})
    print(y_prediction)
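For reference, a minimal sketch of the prediction step (an illustrative assumption based on the placeholder shape, not a verified answer): the placeholder was declared with shape [None, 1], so a single test value has to be fed as a 2-D array of shape (1, 1), and the tensor to fetch is the model output product + b (which the Python name y ends up pointing at), not the placeholder registered under the name 'vary'. Staying in the original sess rather than rebuilding the graph, that would look roughly like:

# A single test value must be 2-D to match the placeholder shape [None, 1].
x_example = np.array([[3.0]])                      # shape (1, 1) instead of (1,)
# y here is the product + b tensor defined above, not the 'vary' placeholder.
y_prediction = sess.run(y, feed_dict={x: x_example})
print(y_prediction)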

How to train a simple linear regression model with SGD in pytorch successfully?

I was trying to train a simple polynomial linear regression model in PyTorch with SGD. I wrote some self-contained code (what I thought would be an extremely simple example); however, for some reason my model does not train as I thought it should.
I have 5 points sampled from a sine curve and try to fit them with a polynomial of degree 4. This is a convex problem, so GD or SGD should eventually find a solution with zero train error as long as we have enough iterations and a small enough step size. For some reason, however, my model does not train well (even though it seems to be changing the parameters of the model). Anyone have an idea why? Here is the code (I tried making it self-contained and minimal):
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
import torch
from torch.autograd import Variable
from maps import NamedDict
from plotting_utils import *
def index_batch(X, batch_indices, dtype):
    '''
    returns the indexed/sliced batch
    '''
    if len(X.shape) == 1:  # i.e. dimension (M,), just a vector
        batch_xs = torch.FloatTensor(X[batch_indices]).type(dtype)
    else:
        batch_xs = torch.FloatTensor(X[batch_indices,:]).type(dtype)
    return batch_xs

def get_batch2(X, Y, M, dtype):
    '''
    get batch for pytorch model
    '''
    # TODO fix and make it nicer, there is pytorch forum question
    X, Y = X.data.numpy(), Y.data.numpy()
    N = len(Y)
    valid_indices = np.array(range(N))
    batch_indices = np.random.choice(valid_indices, size=M, replace=False)
    batch_xs = index_batch(X, batch_indices, dtype)
    batch_ys = index_batch(Y, batch_indices, dtype)
    return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

def get_sequential_lifted_mdl(nb_monomials, D_out, bias=False):
    return torch.nn.Sequential(torch.nn.Linear(nb_monomials, D_out, bias=bias))

def train_SGD(mdl, M, eta, nb_iter, logging_freq, dtype, X_train, Y_train):
    ##
    N_train, _ = tuple(X_train.size())
    #print(N_train)
    for i in range(nb_iter):
        # Forward pass: compute predicted Y using operations on Variables
        batch_xs, batch_ys = get_batch2(X_train, Y_train, M, dtype)  # [M, D], [M, 1]
        ## FORWARD PASS
        y_pred = mdl.forward(batch_xs)
        ## LOSS + Regularization
        batch_loss = (1/M)*(y_pred - batch_ys).pow(2).sum()
        ## BACKWARD PASS
        batch_loss.backward()  # Use autograd to compute the backward pass. Now w will have gradients
        ## SGD update
        for W in mdl.parameters():
            delta = eta*W.grad.data
            W.data.copy_(W.data - delta)
        ## train stats
        if i % (nb_iter/10) == 0 or i == 0:
            current_train_loss = (1/N_train)*(mdl.forward(X_train) - Y_train).pow(2).sum().data.numpy()
            print('i = {}, current_loss = {}'.format(i, current_train_loss))
        ## Manually zero the gradients after updating weights
        mdl.zero_grad()
##
logging_freq = 100
dtype = torch.FloatTensor
## SGD params
M = 3
eta = 0.0002
nb_iter = 20*1000
##
lb,ub = 0,1
f_target = lambda x: np.sin(2*np.pi*x)
N_train = 5
X_train = np.linspace(lb,ub,N_train)
Y_train = f_target(X_train)
## degree of mdl
Degree_mdl = 4
## pseudo-inverse solution
c_pinv = np.polyfit( X_train, Y_train , Degree_mdl )[::-1]
## linear mdl to train with SGD
nb_terms = c_pinv.shape[0]
mdl_sgd = get_sequential_lifted_mdl(nb_monomials=nb_terms,D_out=1, bias=False)
## Make polynomial Kernel
poly_feat = PolynomialFeatures(degree=Degree_mdl)
Kern_train = poly_feat.fit_transform(X_train.reshape(N_train,1))
Kern_train_pt, Y_train_pt = Variable(torch.FloatTensor(Kern_train).type(dtype), requires_grad=False), Variable(torch.FloatTensor(Y_train).type(dtype), requires_grad=False)
train_SGD(mdl_sgd, M,eta,nb_iter,logging_freq ,dtype, Kern_train_pt,Y_train_pt)
The error seems to hover around 2:
i = 0, current_loss = [ 2.08996224]
i = 2000, current_loss = [ 2.03536892]
i = 4000, current_loss = [ 2.02014995]
i = 6000, current_loss = [ 2.01307297]
i = 8000, current_loss = [ 2.01300406]
i = 10000, current_loss = [ 2.01125693]
i = 12000, current_loss = [ 2.01162267]
i = 14000, current_loss = [ 2.01296973]
i = 16000, current_loss = [ 2.00951076]
i = 18000, current_loss = [ 2.00967121]
which is weird because it should be able to reach zero.
I also plotted the learned function:
the code for the plotting:
##
x_horizontal = np.linspace(lb,ub,1000).reshape(1000,1)
X_plot = poly_feat.fit_transform(x_horizontal)
X_plot_pytorch = Variable( torch.FloatTensor(X_plot), requires_grad=False)
##
fig1 = plt.figure()
#plots objs
p_sgd, = plt.plot(x_horizontal, [ float(f_val) for f_val in mdl_sgd.forward(X_plot_pytorch).data.numpy() ])
p_pinv, = plt.plot(x_horizontal, np.dot(X_plot,c_pinv))
p_data, = plt.plot(X_train,Y_train,'ro')
## legend
nb_terms = c_pinv.shape[0]
legend_mdl = f'SGD solution standard parametrization, number of monomials={nb_terms}, batch-size={M}, iterations={nb_iter}, step size={eta}'
plt.legend(
[p_sgd,p_pinv,p_data],
[legend_mdl,f'linear algebra soln, number of monomials={nb_terms}',f'data points = {N_train}']
)
##
plt.xlabel('x'), plt.ylabel('f(x)')
plt.show()
I actually went ahead and implemented a TensorFlow version. That one does seem to train the model. I tried having both of them match by giving them the same initialization:
mdl_sgd[0].weight.data.fill_(0)
but that still didn't work. TensorFlow code:
graph = tf.Graph()
with graph.as_default():
    X = tf.placeholder(tf.float32, [None, nb_terms])
    Y = tf.placeholder(tf.float32, [None, 1])
    w = tf.Variable( tf.zeros([nb_terms,1]) )
    #w = tf.Variable( tf.truncated_normal([Degree_mdl,1],mean=0.0,stddev=1.0) )
    #w = tf.Variable( 1000*tf.ones([Degree_mdl,1]) )
    ##
    f = tf.matmul(X,w)  # [N,1] = [N,D] x [D,1]
    #loss = tf.reduce_sum(tf.square(Y - f))
    loss = tf.reduce_sum( tf.reduce_mean(tf.square(Y-f), 0))
    l2loss_tf = (1/N_train)*2*tf.nn.l2_loss(Y-f)
    ##
    learning_rate = eta
    #global_step = tf.Variable(0, trainable=False)
    #learning_rate = tf.train.exponential_decay(learning_rate=eta, global_step=global_step, decay_steps=nb_iter/2, decay_rate=1, staircase=True)
    train_step = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)

with tf.Session(graph=graph) as sess:
    Y_train = Y_train.reshape(N_train,1)
    tf.global_variables_initializer().run()
    # Train
    for i in range(nb_iter):
        #if i % (nb_iter/10) == 0:
        if i % (nb_iter/10) == 0 or i == 0:
            current_loss = sess.run(fetches=loss, feed_dict={X: Kern_train, Y: Y_train})
            print(f'i = {i}, current_loss = {current_loss}')
        ## train
        batch_xs, batch_ys = get_batch(Kern_train,Y_train,M)
        sess.run(train_step, feed_dict={X: batch_xs, Y: batch_ys})
I also tried changing the initialization, but it didn't change anything, which makes sense because it shouldn't make a big difference:
mdl_sgd[0].weight.data.normal_(mean=0,std=0.001)
Original post:
https://discuss.pytorch.org/t/how-to-train-a-simple-linear-regression-model-with-sgd-in-pytorch-successfully/9620
This is what it should look like:
SOLUTION:
It seems the issue was that the result was being returned as a vector instead of a number; the following code fixed things:
y_pred = model(batch_xs).view(-1) # change this to "y_pred = model(batch_xs)" to get the incorrect results
loss = (y_pred - batch_ys).pow(2).mean()
which seems completely mysterious to me. Does someone know why this fixed the issue? It just seems like magic.
The bug is really subtle, but essentially it's because PyTorch is using NumPy broadcasting rules. So when a column vector of shape (3,1) is subtracted from an array of shape (3,), broadcasting produces a (3,3) matrix (note this wouldn't happen when you subtract a row vector of shape (1,3) from a (3,) array; I guess arrays are treated as row vectors). This is really bad because it means we compute the matrix of all pairwise differences between every label and every prediction. Of course this is nonsensical and produces a bug, because we don't want the prediction for the first data point to be matched against every other label in the data set.
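A quick way to see this (a small illustrative snippet, not from the original post) is to subtract a (3,) tensor from a (3,1) tensor and inspect the resulting shape:

import torch

y_pred = torch.zeros(3, 1)   # column vector, shape (3, 1)
y_true = torch.zeros(3)      # plain 1-D tensor, shape (3,)

diff = y_pred - y_true       # broadcasting builds all pairwise differences
print(diff.shape)            # torch.Size([3, 3]) -- not the element-wise (3,) you wanted

print((y_pred.view(-1) - y_true).shape)   # torch.Size([3]) -- element-wise again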
So it seems the answer is just to avoid wrong NumPy broadcasting by reshaping things either during training or before the data is fed. Either one should work.
To avoid the error one can use this check:
def check_vectors_have_same_dimensions(Y, Y_):
    '''
    Checks whether vectors Y and Y_ have the same dimensions. If they don't,
    then there might be an error caused by wrong broadcasting.
    '''
    DY = tuple(Y.size())
    DY_ = tuple(Y_.size())
    if len(DY) != len(DY_):
        return True
    for i in range(len(DY)):
        if DY[i] != DY_[i]:
            return True
    return False
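Note that, despite its name, the function returns True when the shapes differ. One possible way to use it (just a sketch) is as an assertion in the training loop, right before the loss is computed:

# hypothetical usage inside train_SGD, just before the loss is computed
assert not check_vectors_have_same_dimensions(y_pred, batch_ys), \
    'y_pred and batch_ys have different shapes -- the loss would silently broadcast'
batch_loss = (1/M)*(y_pred - batch_ys).pow(2).sum()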

simple linear regression failed to converge in tensorflow

I am new to machine learning and TensorFlow. Currently I am trying to follow the tutorial's logic to create a simple linear regression model of the form y = a*x (there is no bias term here). However, for some reason, the model fails to converge to the correct value of "a". The data set was created by me in Excel, as shown below:
Here is my code that tries to run TensorFlow on the dummy data set I generated.
import tensorflow as tf
import pandas as pd
w = tf.Variable([[5]],dtype=tf.float32)
b = tf.Variable([-5],dtype=tf.float32)
x = tf.placeholder(shape=(None,1),dtype=tf.float32)
y = tf.add(tf.matmul(x,w),b)
label = tf.placeholder(dtype=tf.float32)
loss = tf.reduce_mean(tf.squared_difference(y,label))
data = pd.read_csv("D:\\dat2.csv")
xs = data.iloc[:,:1].as_matrix()
ys = data.iloc[:,1].as_matrix()
optimizer = tf.train.GradientDescentOptimizer(0.000001).minimize(loss)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for i in range(10000):
    sess.run(optimizer, {x: xs, label: ys})
    if i % 100 == 0: print(i, sess.run(w))
print(sess.run(w))
Below is the printout in the IPython console; as you can see, after the 10000th iteration the value of w is around 4.53 instead of the correct value 6.
I would really appreciate it if anyone could shed some light on what is going wrong here. I have played around with different learning rates from 0.01 to 0.0000001, and none of the settings is able to make w converge to 6. I have read some suggestions to normalize the feature to a standard normal distribution, and I would like to know if this normalization is a must. Without normalization, is gradient descent not able to find the solution? Thank you very much!
It is a shaping problem: y and label don't have the same shape ([batch_size, 1] vs [batch_size]). In loss = tf.reduce_mean(tf.squared_difference(y, label)), this causes TensorFlow to interpret things differently from what you want, probably by using some broadcasting... Anyway, the result is that your loss is not at all the one you want.
To correct that, simply replace
y = tf.add(tf.matmul(x, w), b)
by
y = tf.add(tf.matmul(x, w), b)
y = tf.reshape(y, shape=[-1])
My full working code below:
import tensorflow as tf
import pandas as pd
w = tf.Variable([[4]], dtype=tf.float64)
b = tf.Variable([10.0], dtype=tf.float64, trainable=True)
x = tf.placeholder(shape=(None, 1), dtype=tf.float64)
y = tf.add(tf.matmul(x, w), b)
y = tf.reshape(y, shape=[-1])
label = tf.placeholder(shape=(None), dtype=tf.float64)
loss = tf.reduce_mean(tf.squared_difference(y, label))
my_path = "/media/sf_ShareVM/data2.csv"
data = pd.read_csv(my_path, sep=";")
max_n_samples_to_use = 50
xs = data.iloc[:max_n_samples_to_use, :1].as_matrix()
ys = data.iloc[:max_n_samples_to_use, 1].as_matrix()
lr = 0.000001
optimizer = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(loss)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for i in range(100000):
    _, loss_value, w_value, b_value, y_val, lab_val = sess.run([optimizer, loss, w, b, y, label], {x: xs, label: ys})
    if i % 100 == 0: print(i, loss_value, w_value, b_value)
    if (i % 2000 == 0 and 0 < i < 10000):  # We use a smaller LR at first to avoid exploding gradient. It would be MUCH cleaner to use gradient clipping (by global norm)
        lr *= 2
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(loss)
print(sess.run(w))
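For reference, the gradient clipping mentioned in the comment above could look roughly like this with the TF 1.x API (a sketch, not tested on this data set); it replaces the single optimizer line and avoids having to grow the learning rate by hand:

opt = tf.train.GradientDescentOptimizer(learning_rate=lr)
grads_and_vars = opt.compute_gradients(loss)
grads, variables = zip(*grads_and_vars)
clipped_grads, _ = tf.clip_by_global_norm(grads, clip_norm=5.0)  # 5.0 is an arbitrary example threshold
optimizer = opt.apply_gradients(list(zip(clipped_grads, variables)))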

How to test my Neural Network developed in TensorFlow?

I've just finished writing a neural net with TensorFlow.
Attached code:
import tensorflow as tensorFlow
import csv
# read data from csv
file = open('stub.csv')
reader = csv.reader(file)
temp = list(reader)
del temp[0]
# change data from string to float (Tensorflow)
# create data & goal lists
data = []
goal = []
for i in range(len(temp)):
    data.append(map(float, temp[i]))
    goal.append([data[i][6], 0.0])
    del data[i][6]
# change lists to tuple
data = tuple(tuple(x) for x in data)
goal = tuple(goal)
# create training data and test data by 70-30
a = int(len(data) * 0.6) # training set 60%
b = int(len(data) * 0.8) # validation & test: each one is 20%
trainData = data[0:a] # 60%
validationData = data[b: len(data)]
testData = data[a: b] # 20%
trainGoal = goal[0:a]
validationGoal = goal[b:len(data)]
testGoal = goal[a: b]
numberOfLayers = 500
nodesLayer = []
# define the numbers of nodes in hidden layers
for i in range(numberOfLayers):
    nodesLayer.append(500)
# define our goal class
classes = 2
batchSize = 2000
# x for input, y for output
sizeOfRow = len(data[0])
x = tensorFlow.placeholder(dtype= tensorFlow.float32, shape=[None, sizeOfRow])
y = tensorFlow.placeholder(dtype= tensorFlow.float32, shape=[None, classes])
hiddenLayers = []
layers = []
def neuralNetworkModel(x):
    # first step: (input * weights) + bias, linear operation like y = ax + b
    # each layer connection to the next layer is represented by nodes(i) * nodes(i+1) weights
    for i in range(0, numberOfLayers):
        if i == 0:
            hiddenLayers.append({"weights": tensorFlow.Variable(tensorFlow.random_normal([sizeOfRow, nodesLayer[i]])),
                                 "biases": tensorFlow.Variable(tensorFlow.random_normal([nodesLayer[i]]))})
        elif i > 0 and i < numberOfLayers-1:
            hiddenLayers.append({"weights": tensorFlow.Variable(tensorFlow.random_normal([nodesLayer[i], nodesLayer[i+1]])),
                                 "biases": tensorFlow.Variable(tensorFlow.random_normal([nodesLayer[i+1]]))})
        else:
            outputLayer = {"weights": tensorFlow.Variable(tensorFlow.random_normal([nodesLayer[i], classes])),
                           "biases": tensorFlow.Variable(tensorFlow.random_normal([classes]))}
    # create the layers
    for i in range(numberOfLayers):
        if i == 0:
            layers.append(tensorFlow.add(tensorFlow.matmul(x, hiddenLayers[i]["weights"]), hiddenLayers[i]["biases"]))
            layers.append(tensorFlow.nn.relu(layers[i]))  # pass values to activation function (i.e. sigmoid, softmax) and add it to the layer
        elif i > 0 and i < numberOfLayers-1:
            layers.append(tensorFlow.add(tensorFlow.matmul(layers[i-1], hiddenLayers[i]["weights"]), hiddenLayers[i]["biases"]))
            layers.append(tensorFlow.nn.relu(layers[i]))
    output = tensorFlow.matmul(layers[numberOfLayers-1], outputLayer["weights"]) + outputLayer["biases"]
    return output
def neuralNetworkTrain(data, x, y):
    prediction = neuralNetworkModel(x)
    # using softmax function, normalize values to range(0,1)
    cost = tensorFlow.reduce_mean(tensorFlow.nn.softmax_cross_entropy_with_logits(prediction, y))
    # minimize the cost function
    # using the Adadelta optimizer
    optimizer = tensorFlow.train.AdadeltaOptimizer().minimize(cost)
    epochs = 2  # feed machine forward + backpropagation = epoch
    # build sessions and train the model
    with tensorFlow.Session() as sess:
        sess.run(tensorFlow.initialize_all_variables())
        for epoch in range(epochs):
            epochLoss = 0
            i = 0
            for _ in range(int(len(data) / batchSize)):
                ex, ey = nextBatch(i)  # takes 500 examples
                i += 1
                feedDict = {x: ex, y: ey}
                _, cos = sess.run([optimizer, cost], feed_dict=feedDict)  # run session to optimize the cost function
                epochLoss += cos
            print("Epoch", epoch + 1, "completed out of", epochs, "loss:", epochLoss)
        correct = tensorFlow.equal(tensorFlow.argmax(prediction, 1), tensorFlow.argmax(y, 1))
        accuracy = tensorFlow.reduce_mean(tensorFlow.cast(correct, "float"))
        print("Accuracy:", accuracy.eval({x: trainData, y: trainGoal}))
# takes 500 examples each iteration
def nextBatch(num):
    # Return the next `batch_size` examples from this data set.
    # case: using our data & batch size
    num *= batchSize
    if num < (len(data) - batchSize):
        return data[num: num+batchSize], goal[num: num+batchSize]
neuralNetworkTrain(trainData, x, y)
Each epoch (iteration) I get the value of the loss function, and all is good.
Now I want to try it on my validation/test set.
Does someone know what exactly I should do?
Thanks
If you want to get predictions on the trained data you can simply put something like:
tf_p = tf.nn.softmax(prediction)
...In your graph, having loaded your test data into x_test. Then evaluate predictions with:
[p] = session.run([tf_p], feed_dict={
    x: x_test,
    y: y_test
})
at the end of your neuralNetworkTrain method, and you should end up having them in p.
...Or using tf.train.Saver:
Alternatively you could use tf.train.Saver object to save and restore (and optionally persist) your model. In order to do that you create a saver after you initialise all variables:
...
tf.initialize_all_variables().run()
saver = tf.train.Saver()
...
And then save it once you're done training, at the end of your neuralNetworkTrain method:
...
model_path = saver.save(sess)
You then build a new graph for evaluation, and restore the model before running it on your test data:
# Load test dataset into X_test
...
tf_x = tf.constant(X_test)
tf_p = tf.nn.softmax(neuralNetworkModel(tf_x))
with tf.Session() as session:
    tf.initialize_all_variables().run()
    saver.restore(session, model_path)
    p = tf_p.eval()
And, once again, p should contain softmax activations for your test dataset.
(I haven't actually run this code I'm afraid, but it should give you an idea of how to implement it.)
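As a small follow-up sketch (assuming the test labels y_test are stored in the same two-column format as goal above), the softmax activations in p can be turned into a test accuracy with NumPy:

import numpy as np

predicted_classes = np.argmax(p, axis=1)              # most probable class per test example
true_classes = np.argmax(np.asarray(y_test), axis=1)  # index of the largest entry in each label row
test_accuracy = np.mean(predicted_classes == true_classes)
print("Test accuracy:", test_accuracy)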
