I train a model with the method from the TensorFlow tutorials (code is here). At the end I save the model to a checkpoint directory. Now I want to restore from that checkpoint directory:
import tensorflow as tf

def main(_):
    saver = tf.train.Saver()
    with tf.Session() as sess:
        ckpt = tf.train.latest_checkpoint("/data/lstm_models")
        saver.restore(sess, ckpt)

if __name__ == "__main__":
    tf.app.run()
However, I get the error:
ValueError: No variables to save
It looks like you haven't defined your graph to restore the checkpoint into, so when building the saver it complains that your graph is empty.
Could you try to build your graph again (e.g. redefining your variables) before trying to restore it?
From the restore method's docstring:
It requires a session in which the graph was launched.
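For example, here is a minimal sketch, assuming the graph is rebuilt with the same variables that were used during training (the weights/biases below are placeholders for whatever your LSTM code actually defines):

import tensorflow as tf

# Rebuild the same graph that was used during training (illustrative variables
# only; use the exact definitions from your training script).
weights = tf.Variable(tf.zeros([128, 10]), name="weights")
biases = tf.Variable(tf.zeros([10]), name="biases")

def main(_):
    saver = tf.train.Saver()  # the graph now contains variables to restore
    with tf.Session() as sess:
        ckpt = tf.train.latest_checkpoint("/data/lstm_models")
        saver.restore(sess, ckpt)  # restore() sets the values; no initializer needed

if __name__ == "__main__":
    tf.app.run()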
I've looked at many questions about saving a trained neural network, including Tensorflow: how to save/restore a model? and https://cv-tricks.com/tensorflow-tutorial/save-restore-tensorflow-models-quick-complete-tutorial/, but none of them saves a model without explicitly specifying particular variables along with it, which is my case. Here is my code:
# In session "sesh"
saver = tf.train.Saver()
saver.save(sesh, os.getcwd(), latest_filename='RNN_plasma.ckpt')
Now, I quit the session and want to restore the model I just saved. How can I do this? When trying:
import tensorflow as tf

with tf.Session() as session1:
    # First let's load the meta graph and restore weights
    saver = tf.train.import_meta_graph('RNN_plasma.ckpt')  # error line
    saver.restore(session1, tf.train.latest_checkpoint('./'))
the tf.train.import_meta_graph() call raises:
raise IOError("Cannot parse file %s: %s." % (filename, str(e)))
IOError: Cannot parse file RNN_plasma.ckpt: 1:1 : Message type "tensorflow.MetaGraphDef" has no field named "model_checkpoint_path"..
Can anyone give any insight as to what is going on here, and how to solve it?
(My version of TensorFlow doesn't come with tf.python.saved_model.simple_save(); I have git_version 1.5.0.)
Save:
saver = tf.train.Saver()
saver.save(sess, "/tmp/network")
Restore:
sess = tf.Session()
saver = tf.train.import_meta_graph('/tmp/network.meta')
saver.restore(sess, tf.train.latest_checkpoint('/tmp'))
graph = tf.get_default_graph()
You save a simple checkpoint but then you are trying to load it as a meta graph. This cannot work.
There is a writeup on the TensorFlow website explaining the differences: https://www.tensorflow.org/mobile/prepare_models#what_is_up_with_all_the_different_saved_file_formats
There must be a file ending with .meta.
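As a hedged sketch of how the save/restore round trip from the question could look (a toy variable stands in for the RNN, and the RNN_plasma prefix is just an example): the key point is that the second argument to saver.save() is a checkpoint prefix, while latest_filename only renames the small bookkeeping file called checkpoint.

import os
import tensorflow as tf

w = tf.Variable(tf.zeros([4]), name="w")  # toy variable standing in for the real RNN
saver = tf.train.Saver()

save_prefix = os.path.join(os.getcwd(), "RNN_plasma")  # a checkpoint *prefix*, not a directory

with tf.Session() as sesh:
    sesh.run(tf.global_variables_initializer())
    saver.save(sesh, save_prefix)  # writes RNN_plasma.meta, RNN_plasma.index, RNN_plasma.data-*

# Restore later: import the graph from the .meta file, then load the weights.
tf.reset_default_graph()
restorer = tf.train.import_meta_graph(save_prefix + ".meta")
with tf.Session() as session1:
    restorer.restore(session1, tf.train.latest_checkpoint(os.getcwd()))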
I have a module called neural.py
I define the variables in the body of the module:
import tensorflow as tf
tf_x = tf.placeholder(tf.float32, [None, length])
tf_y = tf.placeholder(tf.float32, [None, num_classes])
...
I save the checkpoint in a function train() after training:
def train():
    ...
    pred = tf.layers.dense(dropout, num_classes, tf.identity)
    ...
    cross_entropy = tf.losses.softmax_cross_entropy(tf_y, pred)
    ...
    with tf.Session() as sess:
        init = tf.global_variables_initializer()
        sess.run(init)
        saver = tf.train.Saver(tf.trainable_variables())
        for ep in range(epochs):
            ... (training steps) ...
    saver.save(sess, "checkpoints/cnn")
I want to also restore and run the network after training in the run() function of this module:
def run():
    # I have tried adding tf.reset_default_graph() here
    # I have also tried with tf.Graph().as_default() as g: and adding (graph=g) in tf.Session()
    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, "checkpoints/cnn")
        ... (run network etc.)
It just doesn't work. It gives me either NotFoundError (see above for traceback): Key beta2_power not found in checkpoint, or ValueError: No variables to save if I add tf.reset_default_graph() inside run(), as noted in the comments above.
However, if I put the exact same code for run() in a new module without train() and with tf.reset_default_graph() at the top, it works perfectly. How do I make it work in the same module?
Final snippet:
if __name__ == '__main__':
    print("Start training")
    train()
    print("Finished training. Generate prediction")
    run()
This might be a typo, but saver.save(sess, "checkpoints/cnn") should definitely be inside the with tf.Session() as sess: block, otherwise you're saving a closed session.
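For instance, a trimmed-down sketch of train() with the save call moved inside the session block (a toy variable stands in for the layers from the question):

import tensorflow as tf

def train(epochs=10):
    w = tf.Variable(tf.zeros([2, 2]), name="w")  # stands in for the real layers
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver()
        for ep in range(epochs):
            pass  # ... training steps ...
        # Save while the session is still open, i.e. inside the with-block.
        saver.save(sess, "checkpoints/cnn")  # assumes the checkpoints/ directory exists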
NotFoundError (see above for traceback): Key beta2_power not found in checkpoint
I think the problem is that part of your graph is only defined in train(). beta1_power and beta2_power are internal variables of AdamOptimizer, and they, along with pred and softmax_cross_entropy, are not in the graph if train() is not invoked (e.g. commented out?). So one solution would be to make the whole graph accessible in both train and run.
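A hedged sketch of that first option, using a hypothetical build_graph() helper and toy sizes for length and num_classes, so that both functions construct exactly the same graph (including Adam's slot variables):

import tensorflow as tf

length, num_classes = 8, 3  # toy sizes standing in for the question's values

def build_graph():
    # All graph construction lives here, so train() and run() always build
    # identical graphs, including Adam's beta1_power/beta2_power slots.
    tf_x = tf.placeholder(tf.float32, [None, length], name="tf_x")
    tf_y = tf.placeholder(tf.float32, [None, num_classes], name="tf_y")
    pred = tf.layers.dense(tf_x, num_classes, tf.identity)
    cross_entropy = tf.losses.softmax_cross_entropy(tf_y, pred)
    train_op = tf.train.AdamOptimizer().minimize(cross_entropy)
    return tf_x, tf_y, pred, train_op

def train():
    tf.reset_default_graph()
    tf_x, tf_y, pred, train_op = build_graph()
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # ... training steps using train_op ...
        saver.save(sess, "checkpoints/cnn")  # assumes the checkpoints/ directory exists

def run():
    tf.reset_default_graph()
    tf_x, tf_y, pred, _ = build_graph()  # same graph as at save time
    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, "checkpoints/cnn")
        # ... run the network ...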
Another solution is to separate them and use the restored graph in run, instead of default one. Like this:
tf.reset_default_graph()
saver = tf.train.import_meta_graph('checkpoints/cnn.meta')

with tf.Session() as sess:
    saver.restore(sess, "checkpoints/cnn")
    print("Model restored.")
    tf_x = sess.graph.get_tensor_by_name('tf_x:0')
    ...
But you'll need to give names to all of your variables (a good idea anyway) and then look those tensors up in the graph; you can't use the previously defined Python variables here. This approach ensures that run() works with the saved version of the model, can easily be extracted into a separate script, etc.
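For example (a sketch; the name "pred" and the toy sizes are assumptions, not from the question):

import tensorflow as tf

length, num_classes = 8, 3  # toy sizes

# When building the graph, give explicit names to everything you'll need later.
tf_x = tf.placeholder(tf.float32, [None, length], name="tf_x")
tf_y = tf.placeholder(tf.float32, [None, num_classes], name="tf_y")
logits = tf.layers.dense(tf_x, num_classes)
pred = tf.identity(logits, name="pred")  # stable, explicit name for the output

# Later, after import_meta_graph() and saver.restore() in run():
# tf_x = sess.graph.get_tensor_by_name("tf_x:0")
# pred = sess.graph.get_tensor_by_name("pred:0")
# output = sess.run(pred, feed_dict={tf_x: batch})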
I am trying to restore a saved model and run it on test data.
However, I get the error Attempting to use uninitialized value. I've read some posts about this; it seems I should not run the global initializer after restoring, but the error still appears.
My code is:
new_saver = tf.train.import_meta_graph("trained_model_epoch-1.meta")
sess = tf.Session()
new_saver.restore(sess, './trained_model_epoch-1')
print('Test')
run_test_model(sess, y_out, ...... split='Test', N=Ntest)
Have you tried using tf.train.Saver()?
building_graph_method()
saver = tf.train.Saver()
sess = tf.Session()
saver.restore(sess, save_path)
Of course, you would need to have saved your model first, using the saver:
saver.save(sess, save_path)
I believe you are accessing your tensors/operations directly (if they are defined in the same script), rather than pulling them from the restored graph:
sess = tf.Session()
new_saver.restore(sess, './trained_model_epoch-1')
graph = sess.graph
w1 = graph.get_tensor_by_name("w1:0") # this tensor is initialized
w2 = graph.get_tensor_by_name("w2:0") # this tensor is initialized too
I want to ask whether the syntax for saving and loading a file in Python and TensorFlow is the same or different.
How can I reload results saved like this?
np.save("Result/"+FLAGS.result_file,W)
If you are loading NumPy files, you can use np.load() to get the results back into a NumPy array.
x = np.load("Result/" + FLAGS.result_file)
If you want to save a TensorFlow graph, you need to create a Saver object after you create your tensors.
x = tf.Variable(..., name="x_saved")
init_op = tf.global_variables_initializer()
...
saver = tf.train.Saver()
Then use the saver object to save the graph to file.
with tf.Session() as sess:
    sess.run(init_op)
    # Do some work with the model.
    ...
    # Save the variables to disk.
    save_path = saver.save(sess, "Result/" + FLAGS.result_file)
When you want to load the model, you need to create tensors of the same shapes and create a Saver object. If you load all your tensors from the file, you don't need to call the initializer.
saver = tf.train.Saver()
and restore the session using that saver.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "Result/" + FLAGS.result_file)
This will load the tensors with the values you saved earlier. If you want to save and load only specific tensors, you can initialize the Saver object with a mapping for those tensors.
x_loaded = tf.Variable(..., name="x")
saver = tf.train.Saver({"x_saved": x_loaded})  # checkpoint name -> variable to restore into
Bear in mind, if you load only some tensors and not the whole graph, you need to initialize all the other tensors.
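A small sketch of that last point, assuming a second variable y that is not in the checkpoint (the shapes and the checkpoint path are illustrative):

import tensorflow as tf

x_loaded = tf.Variable(tf.zeros([3]), name="x")
y = tf.Variable(tf.zeros([2]), name="y")       # not stored in the checkpoint

saver = tf.train.Saver({"x_saved": x_loaded})  # restores only x_loaded

with tf.Session() as sess:
    saver.restore(sess, "Result/my_model")     # illustrative checkpoint path
    sess.run(tf.variables_initializer([y]))    # variables not restored must be initialized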
I have finished training a big model in TensorFlow (Python), but I did not save it inside the session. Now that the training is over, I want to save the variables. I am doing the following:
saver = tf.train.Saver()
with tf.Session(graph=graph) as sess:
    save_path = saver.save(sess, "86_model.ckpt")
    print("Model saved in file: %s" % save_path)
This returns: ValueError: No variables to save. According to their website, what is missing is initialize_all_variables(). The documentation says little about what exactly that does. The word "initialize" scares me; I do not want to reset all my trained values. Is there any way to save my model without re-running it?
It seems, from the TensorFlow documentation, that the session is the thing that holds the information from the trained model. So presumably somewhere you called sess.run() to train your model; what you want to do is call saver.save() with THAT session, not a new one you create afterwards just for saving.
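A minimal sketch of that idea (the training loop is elided; the point is that saver.save() is called with the session that did the training, while it is still open):

import tensorflow as tf

x = tf.Variable(tf.zeros([10]), name="x")  # stands in for the model's variables
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... sess.run(train_op, ...) for many steps ...
    save_path = saver.save(sess, "86_model.ckpt")  # save with THIS session
    print("Model saved in file: %s" % save_path)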
I believe it's because you are not initializing your variables and passing all of them to the saver. This should work:
with tf.Session() as sess:
    tf.initialize_all_variables().run()
    saver = tf.train.Saver(tf.all_variables())
    # ------- everything your session does -------------
    checkpoint_path = os.path.join(save_dir, 'model.ckpt')
    saver.save(sess, checkpoint_path, global_step=your_global_step)
How about using skflow? With skflow (now integrated into TensorFlow) you can specify the model_dir parameter on your constructor, and it will automatically save your model while training (it saves checkpoints, so if something goes wrong during training you can restart from the last checkpoint).
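A hedged sketch using the tf.contrib.learn incarnation of skflow (the DNNClassifier, the toy data, and /tmp/my_model are just examples; the exact API depends on your TensorFlow version):

import numpy as np
import tensorflow as tf

# Toy data standing in for your real features/labels.
x_train = np.random.rand(100, 4).astype(np.float32)
y_train = np.random.randint(0, 3, size=100)

feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]

# model_dir makes the estimator write checkpoints there during fit(); a new
# estimator pointed at the same model_dir resumes from the latest checkpoint.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 10],
                                            n_classes=3,
                                            model_dir="/tmp/my_model")
classifier.fit(x=x_train, y=y_train, steps=200)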