I have trained a neural network with TensorFlow. After training, I saved it and loaded it again in a new .py file to avoid accidental retraining. While testing it with some extra data I found out that it predicts different things for the same data. Shouldn't it theoretically compute the same thing for the same data?
Some information:
feed-forward net
4 hidden layers with 900 neurons each
5000 training epochs
reached an accuracy of ~80%
data was normalized using normalize from sklearn.preprocessing
cost function: tensorflow.nn.softmax_cross_entropy_with_logits
optimizer: tf.train.AdamOptimizer
I am giving my network the data as a matrix, the same way I used for training (each row containing a data sample, with as many columns as there are input neurons).
Out of ten prediction cycles with the same data, my network produces different results in at least 2 cycles (maximum observed so far: 4).
How can this be? In theory, all that is happening are data-processing calculations of the form W_i * x_i + b_i. As my x_i, W_i and b_i no longer change, how come the prediction varies? Could there be a mistake in my model-reloading routine?
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('path to .meta')
    saver.restore(sess, tf.train.latest_checkpoint('path to checkpoints'))
    result = sess.run(tf.argmax(prediction.eval(feed_dict={x: input_data}), 1))
    print(result)
So this was a really stupid mistake on my part. Loading the model from a save now works fine. The problem was caused by the global variables initializer: if you leave it out, it works. The information I found previously may prove useful for someone, so I will leave it here. The solution is now:
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, 'path to your saved file C:x/y/z/model/model.ckpt')
After this you can go on as usual. I do not really know why the variables initializer prevents this from working. As I see it, it should be something like: initialize all variables so they exist, with random values, and then go to the saved file and use the values from there, but apparently something else happens...
So I have been doing some testing and found out the following about this problem.
As I was trying to reuse my created model, I had to use tf.global_variables_initializer(). By doing so it overwrote my imported graph and all the values were random, which explains the differing network outputs. This still left me with a problem to solve: how do I load my network? The workaround I am currently using is far from optimal, but it at least lets me use my saved model. TensorFlow allows you to give unique names to the functions and tensors you use. By doing so I could access them through the graph:
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('path to .meta')
    saver.restore(sess, tf.train.latest_checkpoint('path to checkpoints'))
    graph = tf.get_default_graph()
    graph.get_tensor_by_name('name:0')
Using this method I could access all my saved values, but they were separated! That means I had one weight and one bias per operation used, which led to a bunch of new variables. If you do not know the names, use the following:
print(graph.get_all_collection_keys())
This prints the collection names (our variables are stored in collections).
print(graph.get_collection('name'))
This allows us to access the collection and see what the names/keys of our variables are.
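Put together, a minimal sketch of that inspection step (the collection key 'name' and the tensor name 'w1:0' are hypothetical placeholders):
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('path to .meta')
    saver.restore(sess, tf.train.latest_checkpoint('path to checkpoints'))
    graph = tf.get_default_graph()

    # list every collection in the restored graph
    print(graph.get_all_collection_keys())

    # inspect one collection to see the names of the variables stored in it
    for var in graph.get_collection('name'):
        print(var)

    # fetch a concrete tensor once its name is known and read its value
    w1 = graph.get_tensor_by_name('w1:0')
    print(sess.run(w1))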
This led to another problem: I could no longer use my model, as the global variables initializer had overwritten everything. Thus I had to redefine the whole model manually, with the weights and biases I had obtained previously.
Unfortunately, this is the only thing I could come up with. If anyone has a better idea, please let me know.
The whole thing, with the mistake, looked like this:
# imports...
# placeholders for data...

def my_network(data):
    ## network definition with tf functions ##
    return output

def train_my_net():
    prediction = my_network(data)
    # cost function
    # optimizer
    with tf.Session() as sess:
        for i in range(num_epochs):  # however many epochs I want
            # training routine
        # save

def use_my_net():
    prediction = my_network(data)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())  # <-- the mistake: this overwrites the restored values
        saver = tf.train.import_meta_graph('path to .meta')
        saver.restore(sess, tf.train.latest_checkpoint('path to checkpoints'))
        print(prediction.eval(feed_dict={placeholder: data}))
        graph = tf.get_default_graph()
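And a minimal sketch of the corrected use_my_net, assuming the same hypothetical my_network, placeholder and data as above; since the model is rebuilt in code, a plain tf.train.Saver can restore the weights directly and the global variables initializer is simply left out:
def use_my_net():
    prediction = my_network(data)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        # no tf.global_variables_initializer() here; restore() fills in the saved values
        saver.restore(sess, 'path to your saved file C:x/y/z/model/model.ckpt')
        print(prediction.eval(feed_dict={placeholder: data}))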
Related
I am trying to save my model at different steps while training. Let's say I would like to save after 5 epochs.
At this moment I am using:
tf.saved_model.simple_save(
    sess, model_folder, inputs, outputs
)
which works like a charm. Nevertheless, I realize it is saving the whole graph and the weights on each call, which has a high computational cost.
I would like to update the weights of my model while keeping the graph from the previous save (since it is not changing during training).
I have read about tf.train.Saver, which seems to fit my intentions, but it forces me to specify all the variables I want to save, which is not as practical as the simple_save method. So I am wondering if there is any way of using simple_save in a checkpoint fashion.
I think you have a wrong understanding of tf.train.Saver. You can do something as simple as:
saver = tf.train.Saver()
with tf.Session() as sess:
    for e in range(epochs):
        ...
        if e % 5 == 0:
            saver.save(sess, "/path/where/to/save/model")
So there is no need to specify every single variable you want to save.
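If it helps, a minimal sketch of restoring the newest of those periodic checkpoints later (the directory is the hypothetical path used above):
saver = tf.train.Saver()
with tf.Session() as sess:
    # tf.train.latest_checkpoint picks the most recent checkpoint in that directory
    ckpt = tf.train.latest_checkpoint("/path/where/to/save")
    saver.restore(sess, ckpt)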
I am currently going through this tutorial on how to use TensorFlow to train a CNN and use it to categorize images based on the CIFAR-10 data set. When running the evaluation script, cifar10_eval.py, the output is a precision rating of how accurate the model is against the test set. I instead wanted to see the output of the model's classifications for each category on the test data. The way the logits are calculated and stored is through:
# Build a graph that computes the logits predictions from the
# inference model.
logits = cifar10.inference(images)
After running this line, I edited the script to display the type of the "logits" variable, its shape, and the type of its elements through the following:
print(type(logits))
print(logits.dtype)
print(logits.shape)
Which returns the following output:
<class 'tensorflow.python.framework.ops.Tensor'>
<dtype: 'float32'>
(128, 10)
I am assuming the shape is (128, 10) because there are 128 test images, with each image being given a score for how likely it is to belong to each of the 10 categories. In order to display this I am trying the following code:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(logits.eval())
This .eval() statement never terminates. I was wondering where I've gone wrong and how to fix this so that I can access the logits.
This is probably because you are opening a new session (and reinitializing the variables with it!). Try evaluating logits in the same session in which it was created. It is weird that it doesn't terminate, though; it should raise an error. Also, it's tf.Session(), not tf.Session.
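A rough sketch of what that looks like in practice, assuming the tutorial's cifar10_eval.py setup (FLAGS.checkpoint_dir and the queue-based input pipeline come from the tutorial; the exact restore call here is an assumption):
logits = cifar10.inference(images)

with tf.Session() as sess:
    saver = tf.train.Saver()
    saver.restore(sess, tf.train.latest_checkpoint(FLAGS.checkpoint_dir))
    # the tutorial feeds images through input queues, so the queue runners
    # must be started or sess.run(logits) will wait for data indefinitely
    tf.train.start_queue_runners(sess=sess)
    print(sess.run(logits))  # a (128, 10) array of per-class scores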
I'd like to train my model for many epochs using TensorFlow v1.0, and my idea is to save the model from every epoch. But I soon found that the current model would replace the last one (I mean, the last one would vanish). So I want to know how to keep all of the models and restore them one by one. I think it's hard and I haven't found a nice solution yet. Thanks for every suggestion!
tf.train.Saver().save() has an argument global_step.
From the documentation:
Savers can automatically number checkpoint filenames with a provided counter. This lets you keep multiple checkpoints at different steps while training a model.
So you should try something like:
saver = tf.train.Saver(...)
sess = tf.Session(...)
for epoch in range(num_epochs):
    ... train model ...
    saver.save(sess, "MODEL_NAME", global_step=epoch)
Note that by default, TensorFlow keeps only the last 5 checkpoints. If you want to keep them all, you should initialize your Saver with something along the lines of:
saver = tf.train.Saver(max_to_keep=num_epochs)
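Later, one of those numbered checkpoints can be restored by pointing restore() at the corresponding prefix; a minimal sketch (MODEL_NAME and the step number are the placeholders from above):
saver = tf.train.Saver()
with tf.Session() as sess:
    # checkpoints are written as MODEL_NAME-<global_step>, e.g. MODEL_NAME-3
    saver.restore(sess, "MODEL_NAME-3")

    # alternatively, list everything kept in the checkpoint directory
    # ("." assumes the checkpoints were written to the working directory)
    print(tf.train.get_checkpoint_state(".").all_model_checkpoint_paths)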
I trained a RNN with a fixed batch size, but now I'd like to modify the graph I saved with tf.train.Saver to have batch size 1 for inference. How can I go about this?
session = tf.InteractiveSession()
saver = tf.train.import_meta_graph('model.ckpt.meta')
saver.restore(session, 'model.ckpt')
A way to achieve this is to reconstruct a different (albeit compatible) network at test time and limit the restore to the weights only.
During training,
net = make_my_net(batch_size)
...
saver.save(session, model_name)
During testing,
net = make_my_net(1)
...
saver.restore(session, model_name)
The latter will replace the values of the variables (including the network weights) with the ones that were saved earlier. According to the documentation, you don't have to initialize the variables you are about to overwrite, although I believe this has not always been the case.
Note that reconstructing a different network gives you the opportunity to build a cleaner test network, e.g. by removing layers such as dropout.
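For what it's worth, a minimal end-to-end sketch of the pattern, with a made-up single-layer network standing in for the RNN (make_my_net, the sizes and the checkpoint path are all hypothetical):
import tensorflow as tf

num_features, num_classes = 10, 3      # hypothetical sizes
model_name = "/tmp/my_model.ckpt"      # hypothetical checkpoint path

def make_my_net(batch_size):
    # same variable names and shapes regardless of batch size;
    # only the placeholder's first dimension differs
    x = tf.placeholder(tf.float32, [batch_size, num_features], name="x")
    w = tf.get_variable("w", [num_features, num_classes])
    b = tf.get_variable("b", [num_classes])
    return tf.matmul(x, w) + b

# training: build with the training batch size, then save
with tf.Graph().as_default():
    net = make_my_net(batch_size=64)
    saver = tf.train.Saver()
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        # ... training ...
        saver.save(session, model_name)

# testing: rebuild with batch size 1 and restore only the weights
with tf.Graph().as_default():
    net = make_my_net(batch_size=1)
    saver = tf.train.Saver()
    with tf.Session() as session:
        saver.restore(session, model_name)  # no initializer needed before inference
        # ... run inference ...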
I trained an FCN model in TensorFlow following the implementation in the link and saved the complete model as a checkpoint. Now I want to use the saved (pre-trained) model for a different problem.
I tried to restore the model from the checkpoint by specifying the weights in the Saver as:
saver = tf.train.Saver({"weights" : [w1_1,w1_2,w2_1,w2_2,w3_1,w3_2,w3_3,w3_4, w4_1, w4_2, w4_3, w4_4,w5_1,w5_2,w5_3,w6,w7]})
I am getting weights as:
w1_1=tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,scope='inference/conv1_1_w')
and so on....
I am not able to restore it successfully (up to a specific layer).
TensorFlow version: r0.12
Either you can call init = tf.initialize_variables([list_of_vars]) followed by sess.run(init), which would reinitialize those variables for you, or you can recreate the graph with the same structure from the point where you want to freeze the weights, but keep different names for the variables. Furthermore, if you only want to train certain variables, you can pass just those variables to the optimizer: tf.train.AdamOptimizer(learning_rate).minimize(loss, var_list=[wi, wj, ....])
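A minimal sketch of that combination, restoring only some variables from the checkpoint and handing only the rest to the optimizer (the scope name follows the question's inference/conv1_1_w naming; learning_rate and loss are assumed to be defined by the rebuilt model):
# variables to take from the checkpoint (repeat for the other layers you want to keep)
restore_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='inference/conv1_1_w')

# a Saver built from this list restores only these variables
restore_saver = tf.train.Saver(var_list=restore_vars)

# everything else keeps training; only these variables go to the optimizer
train_vars = [v for v in tf.trainable_variables() if v not in restore_vars]
train_op = tf.train.AdamOptimizer(learning_rate).minimize(loss, var_list=train_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())        # initialize everything first...
    restore_saver.restore(sess, 'path to checkpoint')  # ...then overwrite the restored subset
    # ... training loop running train_op ...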