where to save the checkpoint while training the neural network - python

In the example below I train on the data a number of times determined by iterations. At first I saved a checkpoint on every iteration, but when I move the save to just after the iteration loop, training obviously becomes much faster. I'm not sure whether that makes a difference. Is one save per session enough? And does each iteration use the values (weights) set by the previous iteration?
def train():
    """Trains the neural network
    :returns: 0 for now. Why? 42
    """
    unique_labels, labels = get_labels(training_data_file, savelabels=True)  # get the labels from the training data
    randomize_output(unique_labels)  # create the correct amount of output neurons
    prediction = neural_network_model(p_inputdata)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=p_known_labels))  # calculates the difference between the known labels and the results
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())  # initialize all variables in the session
        with open(training_data_file, "r") as file:
            inputdata = [re.split(csvseperatorRE, line.rstrip("\n"))[:-1] for line in file]
        if path.isfile(temp_dir + "\\" + label + ".ckpt.index"):
            saver.restore(sess, temp_dir + "\\" + label + ".ckpt")
        for i in range(iterations):
            _, cost_of_iteration = sess.run([optimizer, cost], feed_dict={
                p_inputdata: np.array(inputdata),
                p_known_labels: np.array(labels)
            })
        saver.save(sess, temp_dir + "\\" + label + ".ckpt")
        correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(p_known_labels, 1))
        accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        with open(verify_data_file, "r") as file:
            verifydata = [re.split(csvseperatorRE, line.rstrip("\n"))[:-1] for line in file]
        npverifydata = np.array(verifydata)
        nplabels = np.array(get_labels(verify_data_file)[1])
        print("Accuracy: ", accuracy.eval({p_inputdata: npverifydata, p_known_labels: nplabels}))
    return 0
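For reference, here is a minimal sketch (mine, not part of the question) of a middle ground between the two placements. Variables live inside the session, so every sess.run of the optimizer keeps updating the same weights regardless of when you save; the checkpoint is only a snapshot on disk, and it can be written every N iterations instead of on every pass or only once at the end. The save_every interval below is a made-up placeholder.

save_every = 100  # hypothetical interval
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    for i in range(iterations):
        _, cost_of_iteration = sess.run([optimizer, cost], feed_dict={
            p_inputdata: np.array(inputdata),
            p_known_labels: np.array(labels)
        })
        if (i + 1) % save_every == 0:
            # periodic snapshot: insurance against crashes, far less I/O than saving every iteration
            saver.save(sess, temp_dir + "\\" + label + ".ckpt")
    saver.save(sess, temp_dir + "\\" + label + ".ckpt")  # final weights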

Related

Tensorflow Deep neural network code implementation

I recently tried building a deep NN from scratch to classify images as 1 or 0. My code seems to get very low accuracy and I am wondering why. Since I am still a beginner, I don't know how to check or evaluate my model. Is there a way to improve it, and at the same time print each test example to see whether the model predicted it correctly?
My dataset is from here: https://www.kaggle.com/kanncaa1/deep-learning-tutorial-for-beginners
My codes:
from zipfile import ZipFile
# from builtins import FileExistsError
from subprocess import check_output
from sklearn.model_selection import train_test_split
import os
import numpy as np
#import matplotlib
#matplotlib.use("Agg")
#import matplotlib.pyplot as plt
import tensorflow as tf
dirName = "input_file"
try:
    # Create a directory
    os.mkdir(dirName)
    print("directory " + dirName + " created")
except:
    # If directory exists
    print("directory " + dirName + " exists \n")
# Extract the zipfile to the directory
with ZipFile("sign-language-digits-dataset.zip", "r") as zip_data:
    zip_data.extractall("input_file")
# Print the files
print("List of files:")
print(check_output(["ls","input_file"]))
# Load data
x_data = np.load("input_file/X.npy")
y_data = np.load("input_file/Y.npy")
X = np.concatenate((x_data[204:409], x_data[822:1027] ), axis=0)
# Is there a way to improve this part? from here
first_zeros = np.zeros(205)
first_ones = np.ones(205)
zeros = list(zip(first_zeros, first_ones))
second_zeros = np.zeros(205)
second_ones = np.ones(205)
ones = list(zip(second_ones, second_zeros))
ones = np.asarray(ones)
zeros = np.asarray(zeros)
# to here???
Y = np.concatenate((zeros, ones), axis=0).reshape(X.shape[0],2)
# Create 25% for test data and random state is the seed with randomly picked
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.25,
                                                    random_state=42)
try:
    if len(X_train) == len(Y_train):
        print("Both have {} lengths".format(len(X_train)))
except:
    print("Different length")
X_train_flatten = X_train.reshape(X_train.shape[0],
                                  X_train.shape[1]*X_train.shape[2])
X_test_flatten = X_test.reshape(X_test.shape[0],
                                X_test.shape[1]*X_test.shape[2])
# Network parameters
h_layer = 255
h_layer2 = 255
num_input = X_train_flatten.shape[1] # .shape is used for getting the shape
num_classes = Y_train.shape[1] # .shape is used for getting the shape
# parameters
learning_rate = 0.1
num_steps = 500
batch_size = 128
display_step = 100
# tf graph input
x = tf.placeholder("float", [None, num_input]) #4096 because these are the
#image size which required to be trained
y = tf.placeholder("float", [None, num_classes]) #1 because these are the
# output values that are either 0 or 1 in 2D but only 1 column
# Weight and bias
weights = {
    "h1": tf.Variable(tf.random_normal([num_input, h_layer])),
    "h2": tf.Variable(tf.random_normal([h_layer, h_layer2])),
    "output": tf.Variable(tf.random_normal([h_layer2, num_classes]))
}
biases = {
    "h1": tf.Variable(tf.random_normal([h_layer])),
    "h2": tf.Variable(tf.random_normal([h_layer2])),
    "output": tf.Variable(tf.random_normal([num_classes]))
}
# Create model
def neural_network(x):
    layer_1 = tf.add(tf.matmul(x, weights["h1"]), biases["h1"])
    layer_2 = tf.add(tf.matmul(layer_1, weights["h2"]), biases["h2"])
    output = tf.add(tf.matmul(layer_2, weights["output"]), biases["output"])
    return output
# Construct model
logits = neural_network(x)
# Define loss and optimizer
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for step in range(1, num_steps+1):
        if step % display_step == 0 or step == 1:
            # feed_dict takes values from the placeholder which is the tf
            # graph input in the form of 2D - (410, 4096) shape
            loss_val, accuracy_val = sess.run([loss, accuracy],
                                              feed_dict={x: X_train_flatten,
                                                         y: Y_train})
            print("Step " + str(step) + ", Minibatch Loss= " + \
                  "{:.4f}".format(loss_val) + ", Training Accuracy= " + \
                  "{:.3f}".format(accuracy_val))
    print("Optimization Finished!")
    # Calculate accuracy for MNIST test images
    print("Testing Accuracy:", \
          sess.run(accuracy, feed_dict={x: X_test_flatten,
                                        y: Y_test}))
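On the question of checking whether each test example was predicted correctly: a rough sketch (mine, reusing the logits, x, y, X_test_flatten and Y_test names from the code above) that could be appended inside the with tf.Session() as sess: block after training:

pred_class = tf.argmax(logits, 1)   # predicted class index per example
true_class = tf.argmax(y, 1)        # true class index per example
pred, true = sess.run([pred_class, true_class],
                      feed_dict={x: X_test_flatten, y: Y_test})
for i, (p_i, t_i) in enumerate(zip(pred, true)):
    status = "correct" if p_i == t_i else "wrong"
    print("test example {}: predicted {}, actual {} -> {}".format(i, p_i, t_i, status))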
My side questions:
1) My classes should be [zero, one], so I have 2 classes. The input is an image of 4096 pixels and the output should identify it as either 0 or 1: for "0" the network should output [1,0], and for "1" it should output [0,1]. (A small sketch of building such one-hot labels follows this list.)
2) Is my code even considered a DNN, or just a simple NN? If it is not a DNN, how should I change it so that it becomes one?
3) From my understanding, a DNN does not require manual feature extraction because it learns features from the training data and then performs classification, unlike classical machine learning. Am I understanding this correctly?
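On the block marked "Is there a way to improve this part?": a shorter way (my sketch, not the original code) to build the same one-hot labels is to index an identity matrix with the class ids. The zip() code above labels the first 205 rows [0, 1] and the next 205 rows [1, 0], which this reproduces:

import numpy as np

class_ids = np.concatenate((np.ones(205, dtype=int), np.zeros(205, dtype=int)))
Y = np.eye(2)[class_ids]   # row i is [0, 1] for id 1 and [1, 0] for id 0; shape (410, 2)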

Tensorflow GPU Out Of Memory during runtime using dynamic_rnn

I'm having trouble training a seq2seq model using Tensorflow on an Nvidia P100 GPU.
Here are the versions I'm using: TensorFlow 1.10.0, Keras 2.2.2, Python 3.6.3, CUDA 9.2.148.1, cuDNN 7.2.1
I currently get an OOM error well in the middle of training (18 minutes).
I've been doing a little digging and tried setting allow_growth = True (the flag is not set in the code below), but the memory never grows gradually; it all gets allocated at the start.
I also tried setting the graph to read-only with tf.Graph.finalize(), but the program still runs, which suggests either that no nodes are being added or that I haven't placed the call in the correct place in the code.
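For reference, a minimal sketch (mine, not the question's code) of how both experiments are usually wired up in TF 1.x: allow_growth goes into the session config, and finalize() is called on the graph object once everything has been added to it:

import tensorflow as tf

train_graph = tf.Graph()
with train_graph.as_default():
    # ... build the whole model, saver, summaries and init op here ...
    pass

train_graph.finalize()                   # from now on, adding any op raises an error

config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # allocate GPU memory on demand instead of up front
train_sess = tf.Session(graph=train_graph, config=config)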
Since the graph trains and can be saved and all, it doesn't seem to be too large at the start.
Here are some of the hyper parameters I'm using:
batch_size = 10
rnn_size = 1024
src/tgt_vocab_size = 8000
epochs = 20
display_steps = 10
The dataset consists of docstrings and the associated function code, so the sequences are not incredibly long.
One of my original thoughts was that since the size of the sentences is dynamic, one really long one could be too big. But I shuffled the dataset to see if the crash happened at a different time and it's still at 18 minutes with the same parameters.
Here is the code including the graphs and the training/testing loop.
def train(...):
    ...
    ...
    def source_generator():
        for el in train_source_sents:
            yield el
    def target_generator():
        for el in train_target_sents:
            yield el
    train_graph = tf.Graph()
    with train_graph.as_default():
        # Dataset and batch preparation
        with tf.name_scope("Dataset-prep"):
            source_dataset = tf.data.Dataset.from_generator(source_generator, output_types=tf.int32, output_shapes=(tf.TensorShape([None])))  # Converting sentence array to dataset
            target_dataset = tf.data.Dataset.from_generator(target_generator, output_types=tf.int32, output_shapes=(tf.TensorShape([None])))  # Converting sentence array to dataset
            target_dataset = target_dataset.map(lambda x: tf.concat([x, [target_tok2ID['<EOS>']]], 0))  # Adding the <EOS> token to the end of all the target sentences
            target_input_dataset = target_dataset.map(lambda x: tf.concat([[target_tok2ID['<GO>']], x], 0))  # Creating training inputs for the decoder; this requires adding <GO> to the start of the sequence
            target_sequence_length = target_dataset.map(lambda x: tf.shape(x)[0])  # Adding the sizes of all the sequences to be paired up with the rest of the dataset
            dataset = tf.data.Dataset.zip((source_dataset, target_dataset, target_input_dataset, target_sequence_length))  # create the collection of all the individual datasets
            dataset = dataset.shuffle(buffer_size=10000)
        with tf.name_scope("Dataset-batching"):
            dataset = dataset.repeat(epochs)
            pad_id = target_tok2ID['<PAD>']
            batched_dataset = dataset.padded_batch(batch_size, padded_shapes=([None], [None], [None], []), padding_values=(pad_id, pad_id, pad_id, pad_id))
            batched_dataset = batched_dataset.prefetch(buffer_size=batch_size)  # could be removed, perhaps yields improvements
            iterator = batched_dataset.make_one_shot_iterator()
            source_batch, target_batch, target_input_batch, batch_target_sequence_length = iterator.get_next()
        with tf.name_scope("Encoding-layer"):
            source_vocab_size = len(source_tok2ID)
            embed = tf.contrib.layers.embed_sequence(source_batch, vocab_size=source_vocab_size, embed_dim=encoding_embedding_size)
            stacked_cells = tf.contrib.rnn.MultiRNNCell([tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size), keep_prob) for _ in range(num_layers)])
            outputs, encoder_state = tf.nn.dynamic_rnn(stacked_cells, embed, dtype=tf.float32)
        with tf.name_scope("Decoding-layer"):
            target_vocab_size = len(target_tok2ID)
            dec_embeddings = tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))
            dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, target_input_batch)
            cells = tf.contrib.rnn.MultiRNNCell([tf.contrib.rnn.LSTMCell(rnn_size) for _ in range(num_layers)])
            output_layer = tf.layers.Dense(target_vocab_size)
            dec_cell = tf.contrib.rnn.DropoutWrapper(cells, output_keep_prob=keep_prob)
            helper = tf.contrib.seq2seq.TrainingHelper(dec_embed_input, batch_target_sequence_length)
            decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell, helper, encoder_state, output_layer)
            max_target_sequence_length = tf.reduce_max(batch_target_sequence_length, axis=0)
            dec_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder, impute_finished=True, maximum_iterations=max_target_sequence_length)
            dec_outputs = tf.identity(dec_outputs.rnn_output, name='logits')  # This step might seem a little mysterious, but it takes the output of the dynamic decoder and exposes it as a named tensor (documentation is scarce here)
        masks = tf.sequence_mask(batch_target_sequence_length, max_target_sequence_length, dtype=tf.float32, name='masks')
        cost = tf.contrib.seq2seq.sequence_loss(dec_outputs, target_batch, masks)
        optimizer = tf.train.AdamOptimizer(learning_rate)
        gradients = optimizer.compute_gradients(cost)
        capped_gradients = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gradients if grad is not None]
        train_op = optimizer.apply_gradients(capped_gradients)
        tf.summary.scalar('cost', cost)
        merged = tf.summary.merge_all()
        train_saver = tf.train.Saver()
        init_op = tf.global_variables_initializer()
    train_sess = tf.Session(graph=train_graph)
    if model_path:
        print(model_path)
        train_saver.restore(train_sess, model_path)
    else:
        train_sess.run(init_op)
    store_path = "./visualisations/"
    writer = tf.summary.FileWriter(store_path, train_sess.graph)
    step = 0
    start_time = datetime.now()
    while True:
        try:
            step += 1
            model_cost, _, summary = train_sess.run((cost, train_op, merged))
            writer.add_summary(summary, step)
            if step % display_step == 0:
                save_path = train_saver.save(train_sess, "./checkpoints/NMT")
                print("Model saved in path: %s" % save_path)
                test_sess = tf.Session(graph=test_graph)
                test_saver.restore(test_sess, save_path)
                # print(test_sess.run(foo))
                scores = []
                while True:
                    try:
                        predictions, refrence = test_sess.run([dec_predictions, test_target_batch])
                        for i in range(len(refrence)):
                            BLEU_score = nltk.translate.bleu_score.sentence_bleu([np.trim_zeros(refrence[i])], np.trim_zeros(predictions[i]), weights=(1, 0))
                            scores.append(BLEU_score)
                            print("########################################")
                            print("ref:", list(map(lambda x: target_ID2tok[x], np.trim_zeros(refrence[i]))))
                            print("")
                            print("pred:", list(map(lambda x: target_ID2tok[x], np.trim_zeros(predictions[i]))))
                    except tf.errors.OutOfRangeError:
                        print("Exhausted test data")
                        break
                delta_time = datetime.now() - start_time
                total_exp_time = delta_time * (total_steps / step)
                remaining_time = total_exp_time - delta_time
                print("")
                print("Test set BLEU Score:", np.mean(scores))
                print("Model cost:", model_cost)
                print("Step {} from {}".format(step, total_steps))
                print("Current time:", datetime.now())
                print("Total Experiment time (Hours:Minutes:Seconds):", str(total_exp_time))
                print("Time Elapsed (Hours:Minutes:Seconds):", str(delta_time))
                print("Time remaining (Hours:Minutes:Seconds):", str(remaining_time))
                print("")
        except tf.errors.OutOfRangeError:
            print("Model finished training")
            break
    return save_path
Here is an output of a training run: [screenshot: command line output]
Is there something wrong with the way I'm executing the graph, or am I repeating some step leading to the memory fill up?
Thanks for all your help!

TensorFlow - cannot recreate neural network

I am trying to recreate/rerun a previous neural network. The idea is that I save the weights and biases with which the network is initialised in run (1), and reuse those exact weights and biases for initialisation in run (2), so that the outcome of run (2) will be exactly that of run (1). The relevant code is below.
However, I can't manage to do this. If I rerun the network, setting use_stored_weights to True (which prompts neural_network_model to restore the previously saved weights and biases, and, I hope, initialise the network with them), I do indeed always get the same results - but these are not the results of the run I am trying to emulate (in fact, they are much worse). It is also strange that I ALWAYS get the same results - they thus seem to be independent of the stored weights and biases.
The first thing I checked is whether I am correctly restoring my weights and biases (in neural_network_model), and this is the case.
I'm not really sure what is going on here. I have two suspicions, both of which I don't know how to check:
Although the correct weights and biases are restored in neural_network_model, the network is in fact not initialised with them. Is it the command sess.run(init) in train_neural_network that somehow does this?
I am reinitialising the network with the stored weights and biases not once, but every time - this would explain the poor results.
... but it might as well be something completely different. Any help much appreciated!
UPDATE: I figured out what is happening: when use_stored_weights is set to True, the weights are always initialised as all zeros. As far as I can see, this is due to the sess.run(init) in train_neural_network. However, if I leave out that line, I get an error:
FailedPreconditionError: Attempting to use uninitialized value b2
So I guess my question is now: how can I make the restored weights and biases accessible in train_neural_network?
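For what it's worth, a minimal sketch (mine, not the poster's code) of the ordering that usually works: run the initializer first so every variable has a value, then restore inside the same session so the restored values are not overwritten afterwards. ckpt_path is a placeholder.

saver = tf.train.Saver()
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)                  # every variable now has *some* value
    saver.restore(sess, ckpt_path)  # then overwrite them with the stored weights
    # training / evaluation in this same session now sees the restored values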
Imports:
import tensorflow as tf
import numpy as np
from numpy import genfromtxt
This function builds the neural network:
def neural_network_model(data, layer_sizes, use_stored_weights):
    num_layers = len(layer_sizes) - 1  # hidden and output layers
    weights = {}
    biases = {}
    # initialise the weights
    # (a) create new weights and biases
    if not use_stored_weights:
        for i in range(num_layers):
            w_name = 'W' + str(i+1)
            b_name = 'b' + str(i+1)
            weights[w_name] = tf.get_variable(w_name, [layer_sizes[i], layer_sizes[i+1]],
                                              initializer=tf.contrib.layers.xavier_initializer(), dtype=tf.float32)
            biases[b_name] = tf.get_variable(b_name, [layer_sizes[i+1]],
                                             initializer=tf.zeros_initializer(), dtype=tf.float32)
        # save weights and biases
        saver = tf.train.Saver()
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            save_path = saver.save(sess, fold_path + 'weights/' + 'weights.ckpt')
    # (b) restore weights and biases
    else:
        for i in range(num_layers):
            # prepare variable
            w_name = 'W' + str(i+1)
            b_name = 'b' + str(i+1)
            weights[w_name] = tf.get_variable(w_name, [layer_sizes[i], layer_sizes[i+1]],
                                              initializer=tf.zeros_initializer(), dtype=tf.float32)
            biases[b_name] = tf.get_variable(b_name, [layer_sizes[i+1]],
                                             initializer=tf.zeros_initializer(), dtype=tf.float32)
        saver = tf.train.Saver()
        with tf.Session() as sess:
            saver.restore(sess, fold_path + 'weights/' + 'weights.ckpt')
    # calculate linear and relu outputs for hidden layers
    a_prev = data
    for i in range(len(weights)-1):
        z = tf.add(tf.matmul(a_prev, weights['W' + str(i+1)]), biases['b' + str(i+1)])
        a = tf.nn.relu(z)
        a_r = tf.nn.dropout(a, keep_prob)
        a_prev = a_r
    # calculate linear output for output layer
    z_o = tf.add(tf.matmul(a_prev, weights['W' + str(len(weights))]), biases['b' + str(len(weights))])
    return z_o
This function trains and evaluates the network:
def train_neural_network(x, layer_sizes, use_stored_weights):
    prediction = neural_network_model(x, layer_sizes, use_stored_weights)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=lrn_rate).minimize(cost)
    softm = tf.nn.softmax(prediction)
    pred_class = tf.argmax(softm)
    costs = []
    correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for epoch in range(epochs):
            epoch_loss = 0
            for _ in range(int(len(x_train)/batch_size)):
                epoch_x, epoch_y = x_train, y_train
                _, c = sess.run([optimizer, cost], feed_dict={x: epoch_x, y: epoch_y, keep_prob: kp})
                epoch_loss += c
        softmaxes_tst = sess.run([softm, pred_class], feed_dict={x: x_test, keep_prob: 1.0})[0]
        incorr = 0
        for i in range(len(softmaxes_tst)):
            curr_sm = softmaxes_tst[i]
            curr_lbl = y_test[i]
            if np.argmax(curr_sm) != np.argmax(curr_lbl):
                incorr += 1
        print('incorr: ', incorr)
        num_ex = len(x_test)
        print('acc = ' + str(num_ex-incorr) + '/' + str(num_ex) + ' = ' + str((num_ex-incorr)/num_ex))
        print('accuracy train:', accuracy.eval({x: x_train, y: y_train, keep_prob: kp}))
        print('accuracy test :', accuracy.eval({x: x_test, y: y_test, keep_prob: 1.0}))
And this, finally, is the code that calls the above functions:
lrn_rate = 0.01
kp = 0.85
epochs = 400
path = '/my/path/to/data/'
fold_path = ''
num_folds = 19
for i in range(num_folds):
    tf.reset_default_graph()
    fold_path = path + 'fold_' + str(i+1) + '/'
    x_train = genfromtxt(fold_path + 'fv_train.csv', delimiter=',')
    y_train = genfromtxt(fold_path + 'lbl_train.csv', delimiter=',')
    x_test = genfromtxt(fold_path + 'fv_test.csv', delimiter=',')
    y_test = genfromtxt(fold_path + 'lbl_test.csv', delimiter=',')
    num_classes = len(y_train[0])
    num_features = len(x_train[0])
    batch_size = len(x_train)
    num_nodes_hl1 = num_features
    num_nodes_hl2 = num_features
    layer_sizes = [num_features, num_nodes_hl1, num_nodes_hl2, num_classes]
    x = tf.placeholder('float', [None, num_features])
    y = tf.placeholder('float')
    keep_prob = tf.placeholder('float')
    use_stored_weights = True
    train_neural_network(x, layer_sizes, use_stored_weights)

TensorFlow: model saved successful but restore failed, where am I wrong?

I have been learning TensorFlow recently, so obviously I am a newbie, but I have tried many approaches to this problem. I wrote this code to train my model, and I want to restore the model directly instead of training it again if the model.ckpt file already exists. After training, my test accuracy is about 90%, but if I restore the model directly the accuracy is only about 10%, so I think I am failing to restore my model correctly. I just have two variables named weights and biases; this is the main part of my code:
def train(bottleneck_tensor, jpeg_data_tensor):
    image_lists = create_image_lists(TEST_PERCENTAGE, VALIDATION_PERCENTAGE)
    n_classes = len(image_lists.keys())
    # input
    bottleneck_input = tf.placeholder(tf.float32, [None, BOTTLENECK_TENSOR_SIZE],
                                      name='BottleneckInputPlaceholder')
    ground_truth_input = tf.placeholder(tf.float32, [None, n_classes], name='GroundTruthInput')
    # this is the new_layer code
    # with tf.name_scope('final_training_ops'):
    #     weights = tf.Variable(tf.truncated_normal([BOTTLENECK_TENSOR_SIZE, n_classes], stddev=0.001))
    #     biases = tf.Variable(tf.zeros([n_classes]))
    #     logits = tf.matmul(bottleneck_input, weights) + biases
    logits = transfer_new_layer.new_layer(bottleneck_input, n_classes)
    final_tensor = tf.nn.softmax(logits)
    # losses
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=ground_truth_input)
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    train_step = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy_mean)
    # calculate the accuracy
    with tf.name_scope('evaluation'):
        correct_prediction = tf.equal(tf.argmax(final_tensor, 1), tf.argmax(ground_truth_input, 1))
        evaluation_step = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    image_order_step = tf.arg_max(final_tensor, 1)
    saver = tf.train.Saver(tf.global_variables(), write_version=tf.train.SaverDef.V1)
    with tf.Session() as sess:
        init = tf.global_variables_initializer()
        sess.run(init)
        if os.path.exists('F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt'):
            saver.restore(sess, "F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt")
            reader = tf.train.NewCheckpointReader('F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt')
            all_variables = reader.get_variable_to_shape_map()
            for each in all_variables:
                print(each, all_variables[each])
                print(reader.get_tensor(each))
        else:
            print("retrain model")
            for i in range(STEPS):
                train_bottlenecks, train_ground_truth = get_random_cached_bottlenecks(
                    sess, n_classes, image_lists, BATCH, 'training', jpeg_data_tensor, bottleneck_tensor)
                sess.run(train_step,
                         feed_dict={bottleneck_input: train_bottlenecks, ground_truth_input: train_ground_truth})
                # test accuracy on the validation data
                if i % 100 == 0 or i + 1 == STEPS:
                    validation_bottlenecks, validation_ground_truth = get_random_cached_bottlenecks(
                        sess, n_classes, image_lists, BATCH, 'validation', jpeg_data_tensor, bottleneck_tensor)
                    validation_accuracy = sess.run(evaluation_step, feed_dict={
                        bottleneck_input: validation_bottlenecks, ground_truth_input: validation_ground_truth})
                    print('Step %d: Validation accuracy on random sampled %d examples = %.1f%%' % (
                        i, BATCH, validation_accuracy * 100))
            saver.save(sess, 'F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt')
        print(tf.get_session_tensor("final_training_ops/Variable", dtype=float))
        print(tf.get_session_tensor("final_training_ops/Variable_1", dtype=float))
        print('Beginning Test')
        # test
        test_bottlenecks, test_ground_truth = get_tst_bottlenecks(sess, image_lists, n_classes,
                                                                  jpeg_data_tensor,
                                                                  bottleneck_tensor)
        # saver.restore(sess, 'F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt')
        test_accuracy = sess.run(evaluation_step, feed_dict={
            bottleneck_input: test_bottlenecks, ground_truth_input: test_ground_truth})
        print('Final test accuracy = %.1f%%' % (test_accuracy * 100))
        label_name_list = list(image_lists.keys())
        for label_index, label_name in enumerate(label_name_list):
            category = 'testing'
            for index, unused_base_name in enumerate(image_lists[label_name][category]):
                bottlenecks = []
                ground_truths = []
                print("real label %s:" % label_name)
                # print(unused_base_name)
                bottleneck = get_or_create_bottleneck(sess, image_lists, label_name, index, category,
                                                      jpeg_data_tensor, bottleneck_tensor)
                # saver.restore(sess, 'F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt')
                ground_truth = np.zeros(n_classes, dtype=np.float32)
                ground_truth[label_index] = 1.0
                bottlenecks.append(bottleneck)
                ground_truths.append(ground_truth)
                image_kind = sess.run(image_order_step, feed_dict={
                    bottleneck_input: bottlenecks, ground_truth_input: ground_truths})
                image_kind_order = int(image_kind[0])
                print("pre_label %s:" % label_name_list[image_kind_order])
Try this method to save and restore:
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(initVar)
    # restore saved model
    new_saver = tf.train.import_meta_graph('my-model.meta')
    new_saver.restore(sess, tf.train.latest_checkpoint('./'))
    # save model weights, after training process
    saver.save(sess, 'my-model')
Define a tf.train.Saver outside the session. After the training process finishes, save the weights with saver.save(sess, 'my-model'), and restore them as shown above.
I figured out where I went wrong: I was in fact restoring the model successfully, but because I generate the label list in a random order on every run, the mapping from output indices to labels changes between runs while the restored weights and biases stay the same. For example, on the first run the label list is [A1,A2,A3,A4,A5,A6]; image_order_step = tf.arg_max(final_tensor, 1) returns 3, so the prediction is A4. On the next run the label list happens to be [A5,A3,A1,A6,A2,A4]; image_order_step still returns 3, but now that index maps to A6. So the predicted label, and with it the accuracy, changed every run, essentially at random.
This question taught me to be careful with every detail; a small error can leave you confused for a long time. OVER!
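A small sketch of one way to pin the label order down between runs (my illustration, not part of the original code): write the label list to disk next to the checkpoint on the first run, and reload it instead of regenerating it on later runs. The labels.json path is a placeholder.

import json
import os

labels_file = 'F:/_pythonWS/imageClassifier/ckpt/labels.json'  # hypothetical path
if os.path.exists(labels_file):
    with open(labels_file) as f:
        label_name_list = json.load(f)          # same order as when the model was trained
else:
    label_name_list = list(image_lists.keys())  # first run: fix the order...
    with open(labels_file, 'w') as f:
        json.dump(label_name_list, f)           # ...and persist it for later runs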

How to test my Neural Network developed Tensorflow?

I've just finished writing a neural net with TensorFlow.
The code is attached:
import tensorflow as tensorFlow
import csv
# read data from csv
file = open('stub.csv')
reader = csv.reader(file)
temp = list(reader)
del temp[0]
# change data from string to float (Tensorflow)
# create data & goal lists
data = []
goal = []
for i in range(len(temp)):
    data.append(map(float, temp[i]))
    goal.append([data[i][6], 0.0])
    del data[i][6]
# change lists to tuple
data = tuple(tuple(x) for x in data)
goal = tuple(goal)
# create training data and test data by 70-30
a = int(len(data) * 0.6) # training set 60%
b = int(len(data) * 0.8) # validation & test: each one is 20%
trainData = data[0:a] # 60%
validationData = data[b: len(data)]
testData = data[a: b] # 20%
trainGoal = goal[0:a]
validationGoal = goal[b:len(data)]
testGoal = goal[a: b]
numberOfLayers = 500
nodesLayer = []
# define the numbers of nodes in hidden layers
for i in range(numberOfLayers):
    nodesLayer.append(500)
# define our goal class
classes = 2
batchSize = 2000
# x for input, y for output
sizeOfRow = len(data[0])
x = tensorFlow.placeholder(dtype= tensorFlow.float32, shape=[None, sizeOfRow])
y = tensorFlow.placeholder(dtype= tensorFlow.float32, shape=[None, classes])
hiddenLayers = []
layers = []
def neuralNetworkModel(x):
    # first step: (input * weights) + bias, linear operation like y = ax + b
    # each layer connection to other layer will represent by nodes(i) * nodes(i+1)
    for i in range(0, numberOfLayers):
        if i == 0:
            hiddenLayers.append({"weights": tensorFlow.Variable(tensorFlow.random_normal([sizeOfRow, nodesLayer[i]])),
                                 "biases": tensorFlow.Variable(tensorFlow.random_normal([nodesLayer[i]]))})
        elif i > 0 and i < numberOfLayers-1:
            hiddenLayers.append({"weights": tensorFlow.Variable(tensorFlow.random_normal([nodesLayer[i], nodesLayer[i+1]])),
                                 "biases": tensorFlow.Variable(tensorFlow.random_normal([nodesLayer[i+1]]))})
        else:
            outputLayer = {"weights": tensorFlow.Variable(tensorFlow.random_normal([nodesLayer[i], classes])),
                           "biases": tensorFlow.Variable(tensorFlow.random_normal([classes]))}
    # create the layers
    for i in range(numberOfLayers):
        if i == 0:
            layers.append(tensorFlow.add(tensorFlow.matmul(x, hiddenLayers[i]["weights"]), hiddenLayers[i]["biases"]))
            layers.append(tensorFlow.nn.relu(layers[i]))  # pass values to activation function (i.e sigmoid, softmax) and add it to the layer
        elif i > 0 and i < numberOfLayers-1:
            layers.append(tensorFlow.add(tensorFlow.matmul(layers[i-1], hiddenLayers[i]["weights"]), hiddenLayers[i]["biases"]))
            layers.append(tensorFlow.nn.relu(layers[i]))
    output = tensorFlow.matmul(layers[numberOfLayers-1], outputLayer["weights"]) + outputLayer["biases"]
    return output
def neuralNetworkTrain(data, x, y):
    prediction = neuralNetworkModel(x)
    # using softmax function, normalize values to range(0,1)
    cost = tensorFlow.reduce_mean(tensorFlow.nn.softmax_cross_entropy_with_logits(prediction, y))
    # minimize the cost function
    # using AdamOptimizer algorithm
    optimizer = tensorFlow.train.AdadeltaOptimizer().minimize(cost)
    epochs = 2  # feed machine forward + backpropagation = epoch
    # build sessions and train the model
    with tensorFlow.Session() as sess:
        sess.run(tensorFlow.initialize_all_variables())
        for epoch in range(epochs):
            epochLoss = 0
            i = 0
            for _ in range(int(len(data) / batchSize)):
                ex, ey = nextBatch(i)  # takes 500 examples
                i += 1
                feedDict = {x: ex, y: ey}
                _, cos = sess.run([optimizer, cost], feed_dict=feedDict)  # start session to optimize the cost function
                epochLoss += cos
            print("Epoch", epoch + 1, "completed out of", epochs, "loss:", epochLoss)
        correct = tensorFlow.equal(tensorFlow.argmax(prediction, 1), tensorFlow.argmax(y, 1))
        accuracy = tensorFlow.reduce_mean(tensorFlow.cast(correct, "float"))
        print("Accuracy:", accuracy.eval({x: trainData, y: trainGoal}))
# takes 500 examples each iteration
def nextBatch(num):
    # Return the next `batch_size` examples from this data set.
    # case: using our data & batch size
    num *= batchSize
    if num < (len(data) - batchSize):
        return data[num: num+batchSize], goal[num: num+batchSize]
neuralNetworkTrain(trainData, x, y)
Each epoch (iteration) I get the value of the loss function, and all looks good.
Now I want to try the model on my validation/test set.
Does anyone know what exactly I should do?
Thanks
If you want to get predictions on the trained data you can simply put something like:
tf_p = tf.nn.softmax(prediction)
...In your graph, having loaded your test data into x_test. Then evaluate predictions with:
[p] = session.run([tf_p], feed_dict={
    x: x_test,
    y: y_test
})
at the end of your neuralNetworkTrain method, and you should end up having them in p.
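As a small follow-up (my sketch, not the answerer's): once p holds the softmax activations for the test set, a plain-NumPy accuracy check could look like this, assuming testGoal holds the one-hot targets from the question's code:

import numpy as np

pred_labels = np.argmax(p, axis=1)                   # predicted class per test row
true_labels = np.argmax(np.array(testGoal), axis=1)  # one-hot goals -> class ids
print("Test accuracy:", np.mean(pred_labels == true_labels))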
...Or using tf.train.Saver:
Alternatively you could use tf.train.Saver object to save and restore (and optionally persist) your model. In order to do that you create a saver after you initialise all variables:
...
tf.initialize_all_variables().run()
saver = tf.train.Saver()
...
And then save it once you're done training, at the end of your neuralNetworkTrain method:
...
model_path = saver.save(sess, "model.ckpt")  # a save path must be given; "model.ckpt" is a placeholder
You then build a new graph for evaluation, and restore the model before running it on your test data:
# Load test dataset into X_test
...
tf_x = tf.constant(X_test)
tf_p = tf.nn.softmax(neuralNetworkModel(tf_x))
with tf.Session() as session:
    tf.initialize_all_variables().run()
    saver.restore(session, model_path)
    p = tf_p.eval()
And, once again, p should contain softmax activations for your test dataset.
(I haven't actually run this code I'm afraid, but it should give you an idea of how to implement it.)
