I am new to TensorFlow and have been using a trained model from a Git repository. The pre-trained model is saved under '../model/' as 'snapshot-38'; I have the snapshot-38.index, snapshot-38.meta, snapshot-38.data-00000-of-00001 and checkpoint files there. My Python script files and data are in '../src', and I don't use any location other than these in my code to save the model.
def save(self):
    "save model to file"
    self.snapID += 1
    self.saver.save(self.sess, '../model/snapshot', global_step=self.snapID)
I am using Python 3.6 and TensorFlow 1.12.2.
I backed up these files and tried re-training with a different set of data to generate a new model, but aborted half way through.
I then restored my pre-trained model files from the backup, but since then I get the error "Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:" whenever I try to either retrain or restore the model.
Are there some temporary files that I need to remove? I wonder whether TensorFlow is doing something I am not aware of; none of the solutions in similar threads answer this for me. Below is the detailed stack trace:
as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Validation character error rate of saved model: 10.624916%
Python: 3.6.10 |Anaconda, Inc.| (default, May 7 2020, 19:46:08) [MSC v.1916 64 bit (AMD64)]
Tensorflow: 1.12.0
2020-06-26 00:53:20.161185: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
model DIR ---- ../model/
model latestSnapshot ---- ../model/snapshot-38
Init with stored values from ../model/snapshot-38
Traceback (most recent call last):
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
return fn(*args)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [1,1,512,71] rhs shape= [1,1,512,80]
[[{{node save/Assign_15}} = Assign[T=DT_FLOAT, _class=["loc:#Variable_5"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable_5, save/RestoreV2:15)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 1546, in restore
{self.saver_def.filename_tensor_name: save_path})
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
run_metadata_ptr)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
run_metadata)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [1,1,512,71] rhs shape= [1,1,512,80]
[[node save/Assign_15 (defined at P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py:141) = Assign[T=DT_FLOAT, _class=["loc:#Variable_5"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable_5, save/RestoreV2:15)]]
Caused by op 'save/Assign_15', defined at:
File "main.py", line 145, in <module>
main()
File "main.py", line 140, in main
model = Model(open(FilePaths.fnCharList).read(), decoderType, mustRestore=True, dump=args.dump)
File "P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py", line 53, in __init__
(self.sess, self.saver) = self.setupTF()
File "P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py", line 141, in setupTF
saver = tf.train.Saver(max_to_keep=1) # saver saves model to file
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 1102, in __init__
self.build()
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 1114, in build
self._build(self._filename, build_save=True, build_restore=True)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 1151, in _build
build_save=build_save, build_restore=build_restore)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 795, in _build_internal
restore_sequentially, reshape)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 428, in _AddRestoreOps
assign_ops.append(saveable.restore(saveable_tensors, shapes))
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 119, in restore
self.op.get_shape().is_fully_defined())
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\ops\state_ops.py", line 221, in assign
validate_shape=validate_shape)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 60, in assign
use_locking=use_locking, name=name)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
op_def=op_def)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [1,1,512,71] rhs shape= [1,1,512,80]
[[node save/Assign_15 (defined at P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py:141) = Assign[T=DT_FLOAT, _class=["loc:#Variable_5"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable_5, save/RestoreV2:15)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 145, in <module>
main()
File "main.py", line 140, in main
model = Model(open(FilePaths.fnCharList).read(), decoderType, mustRestore=True, dump=args.dump)
File "P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py", line 53, in __init__
(self.sess, self.saver) = self.setupTF()
File "P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py", line 153, in setupTF
saver.restore(sess, latestSnapshot)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 1582, in restore
err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Assign requires shapes of both tensors to match. lhs shape= [1,1,512,71] rhs shape= [1,1,512,80]
[[node save/Assign_15 (defined at P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py:141) = Assign[T=DT_FLOAT, _class=["loc:#Variable_5"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable_5, save/RestoreV2:15)]]
Caused by op 'save/Assign_15', defined at:
File "main.py", line 145, in <module>
main()
File "main.py", line 140, in main
model = Model(open(FilePaths.fnCharList).read(), decoderType, mustRestore=True, dump=args.dump)
File "P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py", line 53, in __init__
(self.sess, self.saver) = self.setupTF()
File "P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py", line 141, in setupTF
saver = tf.train.Saver(max_to_keep=1) # saver saves model to file
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 1102, in __init__
self.build()
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 1114, in build
self._build(self._filename, build_save=True, build_restore=True)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 1151, in _build
build_save=build_save, build_restore=build_restore)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 795, in _build_internal
restore_sequentially, reshape)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 428, in _AddRestoreOps
assign_ops.append(saveable.restore(saveable_tensors, shapes))
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\training\saver.py", line 119, in restore
self.op.get_shape().is_fully_defined())
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\ops\state_ops.py", line 221, in assign
validate_shape=validate_shape)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 60, in assign
use_locking=use_locking, name=name)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
op_def=op_def)
File "C:\Users\rcs70\.conda\envs\tensorflow_opencv\lib\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Assign requires shapes of both tensors to match. lhs shape= [1,1,512,71] rhs shape= [1,1,512,80]
[[node save/Assign_15 (defined at P:\Desktop\COSC428_ComputerVision\SimpleHTR-master\SimpleHTR-master\src\Model.py:141) = Assign[T=DT_FLOAT, _class=["loc:#Variable_5"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable_5, save/RestoreV2:15)]]
The error says this: Assign requires shapes of both tensors to match. lhs shape= [1,1,512,71] rhs shape= [1,1,512,80]
This means that the dimensions of one of the tensors in the snapshot differ from those of the corresponding tensor in the model: in the snapshot it is [1,1,512,80], while in the model it is [1,1,512,71].
Therefore, something is different. You have to load the snapshot into a model that matches exactly the one it was saved from.
If I had to guess, I would say that this is a multi-class classification model and that the number of classes the model was trained on (i.e. in the snapshot) was 80, while the model has now been built to classify 71 classes.
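If you want to see exactly which variable disagrees, here is a minimal sketch (TF 1.x; it assumes the checkpoint prefix '../model/snapshot-38' and that the model graph has already been built):
import tensorflow as tf

# Shapes stored in the checkpoint
reader = tf.train.NewCheckpointReader('../model/snapshot-38')
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    print('checkpoint:', name, shape)

# Shapes of the variables in the graph you are restoring into
for var in tf.global_variables():
    print('graph:', var.name, var.get_shape().as_list())
In this trace the model is built from open(FilePaths.fnCharList).read(), so if the aborted retraining run rewrote that character-list file, the output layer would now have a different number of classes (71 instead of 80); restoring the original char list together with the snapshot files should make the shapes agree again.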
How can I fix this error? I downloaded this code from GitHub.
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].numpy()
throws the error
AttributeError: 'Tensor' object has no attribute 'numpy'
Please help me fix this!
I used:
sess = tf.Session()
with sess.as_default():
    predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
And I get this error. Can someone help me? I just want it to work; why is this so hard?
D:\Python>python TextGenOut.py
File "TextGenOut.py", line 72
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
^
IndentationError: unexpected indent
D:\Python>python TextGenOut.py
2018-09-16 21:50:57.008663: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-09-16 21:50:57.272973: W T:\src\github\tensorflow\tensorflow\core\framework\op_kernel.cc:1275] OP_REQUIRES failed at resource_variable_ops.cc:480 : Not found: Container localhost does not exist. (Could not find resource: localhost/model/embedding/embeddings)
Traceback (most recent call last):
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1278, in _do_call
return fn(*args)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable model/dense/kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/model/dense/kernel)
[[Node: model/dense/MatMul/ReadVariableOp = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/dense/kernel)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "TextGenOut.py", line 72, in <module>
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 680, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 4951, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 877, in run
run_metadata_ptr)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1272, in _do_run
run_metadata)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable model/dense/kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/model/dense/kernel)
[[Node: model/dense/MatMul/ReadVariableOp = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/dense/kernel)]]
Caused by op 'model/dense/MatMul/ReadVariableOp', defined at:
File "TextGenOut.py", line 66, in <module>
predictions, hidden = model(input_eval, hidden)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\keras\engine\base_layer.py", line 736, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "TextGenOut.py", line 39, in call
x = self.fc(output)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\keras\engine\base_layer.py", line 736, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\keras\layers\core.py", line 943, in call
outputs = gen_math_ops.mat_mul(inputs, self.kernel)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_math_ops.py", line 4750, in mat_mul
name=name)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py", line 510, in _apply_op_helper
preferred_dtype=default_dtype)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 1094, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1045, in _dense_var_to_tensor
return var._dense_var_to_tensor(dtype=dtype, name=name, as_ref=as_ref) # pylint: disable=protected-access
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1000, in _dense_var_to_tensor
return self.value()
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 662, in value
return self._read_variable_op()
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 745, in _read_variable_op
self._dtype)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_resource_variable_ops.py", line 562, in read_variable_op
"ReadVariableOp", resource=resource, dtype=dtype, name=name)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 3155, in create_op
op_def=op_def)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 1717, in __init__
self._traceback = tf_stack.extract_stack()
FailedPreconditionError (see above for traceback): Error while reading resource variable model/dense/kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/model/dense/kernel)
[[Node: model/dense/MatMul/ReadVariableOp = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/dense/kernel)]]
I suspect the place where you copied the code from had eager execution enabled, i.e. had invoked tf.enable_eager_execution() at the start of the program.
You could do the same.
UPDATE: Note that eager execution is enabled by default in TensorFlow 2.0. So the answer above applies only to TensorFlow 1.x
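For TF 1.x that looks roughly like this (the predictions tensor below is just a stand-in for the one produced by the model in the question):
import tensorflow as tf

tf.enable_eager_execution()  # must run once, before building any other TF ops

predictions = tf.constant([[0.1, 0.2, 0.7]])  # stand-in for the model output
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].numpy()
print(predicted_id)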
Since the accepted answer did not solve the problem for me, I thought this might be helpful for people who face the problem and already have TensorFlow >= 2.2.0 with eager execution enabled.
The issue seems to be that for certain functions called during fitting (model.fit()),
the @tf.function decorator prohibits the execution of functions like tensor.numpy() for performance reasons.
The solution for me was to pass the flag run_eagerly=True to the model.compile() like this:
model.compile(..., run_eagerly=True)
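For example, a custom metric that calls .numpy() only works when the model is compiled with run_eagerly=True. A minimal sketch (the model, optimizer, loss and data here are placeholders, not taken from the question):
import numpy as np
import tensorflow as tf
from tensorflow import keras

def debug_metric(y_true, y_pred):
    # .numpy() works here only because run_eagerly=True keeps this code out of graph mode
    print('batch max prediction:', tf.reduce_max(y_pred).numpy())
    return tf.reduce_mean(tf.abs(y_true - y_pred))

model = keras.Sequential([keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer='adam', loss='mse', metrics=[debug_metric], run_eagerly=True)
model.fit(np.random.rand(32, 4), np.random.rand(32, 1), epochs=1, verbose=0)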
TensorFlow 2 has a config option to run functions "eagerly", which enables getting Tensor values via the .numpy() method. To enable this, use the following command:
tf.config.run_functions_eagerly(True)
Note that this is useful mainly for debugging.
See also: https://www.tensorflow.org/api_docs/python/tf/config/run_functions_eagerly
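A small sketch of what the flag changes; .numpy() inside a tf.function body only works while run_functions_eagerly(True) is set (in TF 2.0-2.2 the call is tf.config.experimental_run_functions_eagerly instead):
import tensorflow as tf

@tf.function
def sample(predictions):
    ids = tf.random.categorical(tf.exp(predictions), num_samples=1)
    print('sampled id:', ids[0][0].numpy())  # would raise in graph mode
    return ids

tf.config.run_functions_eagerly(True)   # debugging only
sample(tf.constant([[0.1, 0.2, 0.7]]))
tf.config.run_functions_eagerly(False)  # restore normal graph execution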
This can also happen in TF 2.0 if your code is wrapped in a @tf.function or inside a Keras layer. Both of those run in graph mode. There is a lot of secretly broken code out there because behavior differs between eager and graph modes, and people are not aware that they are switching contexts, so be careful!
It happens in older versions of TF, so try pip install tensorflow --upgrade
otherwise run
import tensorflow as tf
tf.enable_eager_execution()
If you are using Jupyter notebook, restart the Kernel.
tf.multinomial returns a Tensor object that contains a 2D list with drawn samples of shape [batch_size, num_samples]. Calling .eval() on that tensor object is expected to return a numpy ndarray.
Something like this:
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
You also need to ensure that you have a session active (doesn't make a lot of sense otherwise):
sess = tf.Session()
with sess.as_default():
    predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
I saw a similar error when I ran code something like the following:
tensor = tf.multiply(ndarray, 42)
tensor.numpy() # throw AttributeError: 'Tensor' object has no attribute 'numpy'
I use Anaconda 3 with TensorFlow 1.14.0. I upgraded TensorFlow with the command below:
conda update tensorflow
Now TensorFlow is 2.0.0 and the issue is fixed. Try this to see if it resolves your issue.
I had the same issue inside a tf.function(). What worked for me was to transform the NumPy array into a TensorFlow tensor via tf.convert_to_tensor (see the documentation) and then go ahead with TensorFlow ops. Maybe this trick is useful for someone...
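A minimal sketch of that approach (the computation inside the tf.function is only a placeholder):
import numpy as np
import tensorflow as tf

@tf.function
def scale(x):
    # x is a Tensor here; keep using tf ops instead of calling x.numpy()
    return tf.multiply(x, 42.0)

ndarray = np.array([1.0, 2.0, 3.0], dtype=np.float32)
tensor_in = tf.convert_to_tensor(ndarray)  # numpy -> Tensor before entering the graph
result = scale(tensor_in)
print(result.numpy())                      # back to numpy outside the tf.function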
You can also use tf.get_static_value() to obtain the value of a tensor. This has the benefit of not needing eager mode. See docs here.
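A short sketch; note that tf.get_static_value() only returns a value when the tensor's contents can be determined at graph-construction time (e.g. constants), and returns None otherwise:
import tensorflow as tf

print(tf.get_static_value(tf.constant([[1, 2], [3, 4]])))  # numpy array [[1 2] [3 4]]

@tf.function
def f(x):
    doubled = x * 2
    print('inside tf.function:', tf.get_static_value(doubled))  # None: not statically known
    return doubled

f(tf.constant([1, 2, 3]))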
I have the following problem:
I have created a model using DeepChem, which is a wrapped Keras model, trained it, and reloaded it. I can predict using this model without a problem.
Now I want to make a copy of this model with one less input, since one input is always constant in my use scenario and always passing it leads to errors in a function I can't edit.
data = np.array(data.data, dtype=float32)
with tf.Graph().as_default() as temp_graph:
    tf.import_graph_def(self.model.session.graph.as_graph_def(),
                        input_map={self.model._input_placeholders[1].name:
                                   tf.constant(np.array([0], dtype=float32)),})
    #self.model.session.graph = temp_graph
    #for deep explainer: replace all switched dropouts with dropouts
    #get input tensor for this graph
    tensors = tf.contrib.graph_editor.get_tensors(temp_graph)
    for t in tensors:
        if "input_1" in t.name:
            input_tensor = t
            break
    #reshape output --> only singletask!
    output = tf.reshape(tensors[-1], [-1, 1])
    model = (input_tensor, output)
    sess = tf.Session(graph=temp_graph)
    feed_dict = dict(zip([input_tensor], [data]))
    print(sess.run(output, feed_dict))
In this code fragment I was able to load the graph of my model and pass a constant into one of its inputs. Now obviously I can't run this new model in the same session, since that session contains the old model. The way the model is run with the feed dict can't be changed, since it lives in another package in the real scenario. I get the following error message:
Error while reading resource variable dense_2/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist.
The full trace is:
Traceback (most recent call last):
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable dense_2/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense_2/bias)
[[{{node import/model/dense_2/BiasAdd/ReadVariableOp}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 490, in <module>
main()
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 478, in main
evaluate()
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 445, in evaluate
reader.explain()
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 1534, in explain
self.explain()
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 1519, in explain
self._explain_Gradient_SHAP(self.df)
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 2047, in _explain_Gradient_SHAP
print(sess.run(output, feed_dict))
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
run_metadata)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable dense_2/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense_2/bias)
[[node import/model/dense_2/BiasAdd/ReadVariableOp (defined at /eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py:2033) ]]
Original stack trace for 'import/model/dense_2/BiasAdd/ReadVariableOp':
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 490, in <module>
main()
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 478, in main
evaluate()
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 445, in evaluate
reader.explain()
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 1534, in explain
self.explain()
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 1519, in explain
self._explain_Gradient_SHAP(self.df)
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 2033, in _explain_Gradient_SHAP
tf.constant(np.array([0], dtype=float32)),})
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 443, in import_graph_def
_ProcessNewOps(graph)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 236, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3751, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3751, in <listcomp>
for c_op in c_api_util.new_tf_operations(self)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3641, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
I am using TensorFlow 1.14 and Python 3.6 (this can't be changed either).
So my problem could be solved in two different ways: either I get to run the second graph with the variable values that live in the old session, or I get to tell the old session to use one constant input.
Thanks for any help in advance!
best regards
Tobias
Edit:
I eventually fixed this by wrapping the class I was trying to use and overriding some methods. I think another idea could have been to replace one Keras input with a Keras constant, roughly as sketched below.
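A rough sketch of that "Keras constant" idea with tf.keras (the two-input model, its layer sizes and the constant value are all assumptions standing in for the wrapped DeepChem model):
import tensorflow as tf
from tensorflow import keras

# Stand-in for the original two-input model; `flag` is the input that should become constant.
feat = keras.Input(shape=(75,), name='features')
flag = keras.Input(shape=(1,), name='flag')
hidden = keras.layers.Dense(16, activation='relu')(keras.layers.Concatenate()([feat, flag]))
orig_model = keras.Model([feat, flag], keras.layers.Dense(1)(hidden))

# Copy with one less input: `flag` is baked in as a constant 0.
features_only = keras.Input(shape=(75,), name='features_only')
const_flag = keras.layers.Lambda(lambda x: tf.zeros_like(x[:, :1]))(features_only)
single_input_model = keras.Model(features_only, orig_model([features_only, const_flag]))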
This error is a little tricky. Here are a couple of suggestions that spring to mind:
DeepChem HEAD is now running on TensorFlow 2.X. If your problem would be easier to handle in Eager mode, that might be one option. Of course, HEAD isn't stable and there might be other issues that crop up there.
DeepChem models are, under the hood, just made of Keras layers. If you can make a Keras model from the constituent layers of your model, then you can possibly avoid the DeepChem wrapper and solve the problem directly in Keras.
It might also help to add more information on the DeepChem model you're trying to use and the downstream function you're seeing an error in.
export_saved_model used on TPUEstimator raises TypeError: Failed to convert object of type ... to Tensor with TensorFlow 1.12.0. Am I using it incorrectly, or, if it is a bug, is there some workaround?
I would like to train a model on TPU using TPUEstimator and then use the trained model locally on CPU. I cannot use the graph saved during training directly, but I need to use export_saved_model instead (Github issue).
export_saved_model on TPUEstimator works correctly with TensorFlow 1.13.0rc0, but it fails with the current TensorFlow 1.12.0 (another Github issue). At the moment, however, TPUs with TensorFlow 1.13 are not available on Google Cloud and TPUs with TensorFlow 1.12 are not compatible, so upgrading TensorFlow to 1.13 is not an option.
The relevant code is:
def serving_input_receiver_fn():
    feature = tf.placeholder(tf.float32, shape=[None, None, None, 2])
    return tf.estimator.export.TensorServingInputReceiver(feature, feature)

estimator.export_saved_model(FLAGS.export_dir, serving_input_receiver_fn)
Expected result:
The model should be exported correctly. This happens with TensorFlow 1.13.0rc0 or with TPUEstimator replaced by Estimator. The former can be reproduced using this colab.
Actual result:
Exporting fails with TypeError: Failed to convert object of type ... to Tensor, with the full traceback included below. This can be reproduced with this colab.
...
WARNING:tensorflow:From /Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py:1044: calling SavedModelBuilder.add_meta_graph_and_variables (from tensorflow.python.saved_model.builder_impl) with legacy_init_op is deprecated and will be removed in a future version.
Instructions for updating:
Pass your op to the equivalent parameter main_op instead.
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
WARNING:tensorflow:rewrite_for_inference (from tensorflow.contrib.tpu.python.tpu.tpu) is experimental and may change or be removed at any time, and without warning.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Running infer on CPU
ERROR:tensorflow:Operation of type Placeholder (policy_labels) is not supported on the TPU. Execution will fail if this op is used in the graph.
ERROR:tensorflow:Operation of type Placeholder (sat_labels) is not supported on the TPU. Execution will fail if this op is used in the graph.
INFO:tensorflow:Done calling model_fn.
Traceback (most recent call last):
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 527, in make_tensor_proto
str_values = [compat.as_bytes(x) for x in proto_values]
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 527, in <listcomp>
str_values = [compat.as_bytes(x) for x in proto_values]
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 61, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got dict_values([<tf.Tensor 'sat_prob:0' shape=(?,) dtype=float32>, <tf.Tensor 'policy_prob:0' shape=(?, ?, 2) dtype=float32>])
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "neurosat_tpu.py", line 253, in <module>
tf.app.run()
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "neurosat_tpu.py", line 248, in main
estimator.export_saved_model(FLAGS.export_dir, serving_input_receiver_fn)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 734, in export_saved_model
strip_default_attrs=True)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 663, in export_savedmodel
mode=model_fn_lib.ModeKeys.PREDICT)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 789, in _export_saved_model_for_mode
strip_default_attrs=strip_default_attrs)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 907, in _export_all_saved_models
mode=model_fn_lib.ModeKeys.PREDICT)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2188, in _add_meta_graph_for_mode
check_variables=False))
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 984, in _add_meta_graph_for_mode
config=self.config)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2192, in _call_model_fn
return self._call_model_fn_for_inference(features, labels, mode, config)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2253, in _call_model_fn_for_inference
new_tensors.append(array_ops.identity(t))
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 81, in identity
return gen_array_ops.identity(input, name=name)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3454, in identity
"Identity", input=input, name=name)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 513, in _apply_op_helper
raise err
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 510, in _apply_op_helper
preferred_dtype=default_dtype)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1146, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 229, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/Users/michal/.virtualenvs/deepsat/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 531, in make_tensor_proto
"supported type." % (type(values), values))
TypeError: Failed to convert object of type <class 'dict_values'> to Tensor. Contents: dict_values([<tf.Tensor 'sat_prob:0' shape=(?,) dtype=float32>, <tf.Tensor 'policy_prob:0' shape=(?, ?, 2) dtype=float32>]). Consider casting elements to a supported type.
Adding the argument export_to_tpu=False to the TPUEstimator constructor prevents the error in TensorFlow 1.12:
estimator = tf.contrib.tpu.TPUEstimator(..., export_to_tpu=False)
export_to_tpu=False disables exporting the TPU version of the model, but the CPU version is still exported, and this is sufficient to run the model locally. With TensorFlow 1.13 the bug is fixed and the flag is not necessary.
The answer is based on the Github thread linked in the question.
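For the "run locally on CPU" part, a hedged sketch of loading the exported SavedModel with TF 1.12 (the export path and the 'input' feature key are assumptions; the key depends on the serving_input_receiver_fn above):
import numpy as np
from tensorflow.contrib import predictor

export_dir = 'export/1546300800'  # one of the timestamped directories created under FLAGS.export_dir
predict_fn = predictor.from_saved_model(export_dir)

batch = np.zeros((1, 8, 8, 2), dtype=np.float32)  # dummy input matching shape [None, None, None, 2]
print(predict_fn({'input': batch}))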
I'm stuck restoring a pre-trained network with TensorFlow...
import tensorflow as tf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
sess=tf.Session()
saver = tf.train.import_meta_graph('./model/20170512-110547/model-20170512-110547.meta')
saver.restore(sess,'./model/20170512-110547/')
I'd like to use a pre-trained network which was trained for face recognition, and then add some layers on top for transfer learning.
(I downloaded the model from here. https://github.com/davidsandberg/facenet)
When I execute the code above, it shows the error,
WARNING:tensorflow:The saved meta_graph is possibly from an older release:
'model_variables' collection should be of type 'byte_list', but instead is of type 'node_list'.
Traceback (most recent call last):
File "/Users/user/Desktop/desktop/Python/HCR/Transfer_face/test.py", line 7, in <module>
saver.restore(sess,'./model/20170512-110547/')
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./model/20170512-110547/
[[Node: save/RestoreV2_491 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_491/tensor_names, save/RestoreV2_491/shape_and_slices)]]
Caused by op u'save/RestoreV2_491', defined at:
File "/Users/user/Desktop/desktop/Python/HCR/Transfer_face/test.py", line 6, in <module>
saver = tf.train.import_meta_graph('./model/20170512-110547/model-20170512-110547.meta')
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1698, in import_meta_graph
**kwargs)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/meta_graph.py", line 656, in import_scoped_meta_graph
producer_op_list=producer_op_list)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 313, in import_graph_def
op_def=op_def)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./model/20170512-110547/
[[Node: save/RestoreV2_491 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_491/tensor_names, save/RestoreV2_491/shape_and_slices)]]
I can't understand why the system can't find the pre-trained data...
The directory structure is as below:
USER-no-MacBook-Pro:Transfer_face user$ ls -R
model test.py
./model:
20170512-110547
./model/20170512-110547:
20170512-110547.pb
model-20170512-110547.ckpt-250000.index
model-20170512-110547.ckpt-250000.data-00000-of-00001
model-20170512-110547.meta
Import the .pb file.
import tensorflow as tf
from tensorflow.python.framework import tensor_util

with tf.gfile.GFile('20170512-110547.pb', "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

# import into default graph
tf.import_graph_def(graph_def)

# print some data
wts = [n for n in graph_def.node if n.op == 'Const']
for n in wts:
    print(tensor_util.MakeNdarray(n.attr['value'].tensor))
Linked questions:
Import a simple Tensorflow frozen_model.pb file and make prediction in C++
get the value weights from .pb file by Tensorflow
Related documentation: GraphDef
You need to use the checkpoint path "./model/20170512-110547/model-20170512-110547.ckpt-250000" instead of the folder path.
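In other words (a minimal sketch; the file names come from the directory listing above):
import tensorflow as tf

sess = tf.Session()
saver = tf.train.import_meta_graph('./model/20170512-110547/model-20170512-110547.meta')
# Restore from the checkpoint prefix, not the directory:
saver.restore(sess, './model/20170512-110547/model-20170512-110547.ckpt-250000')
tf.train.latest_checkpoint('./model/20170512-110547/') would also give you that prefix, but only if the directory contains a 'checkpoint' file, which the listing above does not show.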
im2txt trains for a few thousand steps then halts with the following error.
I've checked the training files and they appear OK.
Running on Ubuntu 16.04, TF r.0.11, GPU mode GTX 970 4Gb.
Not sure if it is a lack of RAM?
INFO:tensorflow:global step 56396: loss = 2.4654 (0.41 sec/step)
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors.DataLossError'>, truncated record at 369740238
[[Node: ReaderRead = ReaderRead[_class=["loc:#TFRecordReader", "loc:#filename_queue"], _device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReader, filename_queue)]]
Caused by op u'ReaderRead', defined at:
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 114, in <module>
tf.app.run()
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 65, in main
model.build()
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/show_and_tell_model.py", line 352, in build
self.build_inputs()
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/show_and_tell_model.py", line 153, in build_inputs
num_reader_threads=self.config.num_input_reader_threads)
File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/ops/inputs.py", line 115, in prefetch_input_data
_, value = reader.read(filename_queue)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/io_ops.py", line 277, in read
return gen_io_ops._reader_read(self._reader_ref, queue_ref, name=name)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 211, in _reader_read
queue_handle=queue_handle, name=name)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 748, in apply_op
op_def=op_def)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2403, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1305, in __init__
self._traceback = _extract_stack()
DataLossError (see above for traceback): truncated record at 369740238
[[Node: ReaderRead = ReaderRead[_class=["loc:#TFRecordReader", "loc:#filename_queue"], _device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReader, filename_queue)]]
INFO:tensorflow:global step 56397: loss = 2.5540 (0.40 sec/step)
I have the same problem and am not sure why. I did not see any errors when creating the TFRecords. During training, the error comes up near the end of the records. BTW, I am using TF 0.11rc.
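One way to find the damaged shard is to iterate over every TFRecord file and let the reader surface the DataLossError (a sketch; the glob pattern for the im2txt training shards is an assumption):
import glob
import tensorflow as tf

for path in sorted(glob.glob('/path/to/mscoco/train-?????-of-00256')):
    try:
        n = sum(1 for _ in tf.python_io.tf_record_iterator(path))
        print('%s: %d records OK' % (path, n))
    except tf.errors.DataLossError as err:
        print('%s: truncated or corrupted (%s)' % (path, err))
Regenerating or re-downloading the shard reported there should let training run past that point.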