NotFoundError: Key not found in checkpoint - python

System information
TensorFlow version (use command below): 2.4.0
Python version: 3.6.2
Problem
I am upgrading a LAS model from TensorFlow 1.8.0 to 2.4.0. Training runs without problems, but in the testing phase, restoring the model fails with a "parameter not found" error. When I print the contents of the saved checkpoint file, there is a parameter named "BeamSearchDecoderStep/multi_rnn_cell/cell_0_attention/attention_wrapper/lstm_cell_9/bias" in it. I would be very grateful if you could answer my question!
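A minimal sketch of how the checkpoint's contents can be listed, so the stored variable names can be compared against the ones the restore op expects (the path is taken from the log below):

import tensorflow as tf

# Print every variable name and shape stored in the checkpoint; any
# mismatch with the names in the error message points at the culprit.
ckpt = './data_kss/Kspon_dataset/model_test/model.ckpt-0'
for name, shape in tf.train.list_variables(ckpt):
    print(name, shape)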
Error Message
2021-05-11 17:09:33.841199: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2021-05-11 17:09:33.841721: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-05-11 17:09:33.850617: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-I630CDV
2021-05-11 17:09:33.851304: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-I630CDV
2021-05-11 17:09:33.852078: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-05-11 17:09:33.853485: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
INFO:tensorflow:Building speller
WARNING:tensorflow:From C:\Users\yangrui\Desktop\PythonProject\Korean_Speech\las\model.py:346: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-05-11T17:09:47Z
INFO:tensorflow:Graph was finalized.
2021-05-11 17:09:48.052221: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
INFO:tensorflow:Restoring parameters from ./data_kss/Kspon_dataset/model_test\model.ckpt-0
2021-05-11 17:09:48.135902: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:196] None of the MLIR optimization passes are enabled (registered 0 passes)
2021-05-11 17:09:48.321843: W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at save_restore_v2_ops.cc:205 : Not found: Key BeamSearchDecoderStep/multi_rnn_cell/cell_0_attention/attention_wrapper/lstm_cell_9/bias not found in checkpoint
Traceback (most recent call last):
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\client\session.py", line 1375, in _do_call
return fn(*args)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\client\session.py", line 1360, in _run_fn
target_list, run_metadata)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\client\session.py", line 1453, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key BeamSearchDecoderStep/multi_rnn_cell/cell_0_attention/attention_wrapper/lstm_cell_9/bias not found in checkpoint
[[{{node save/RestoreV2}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 1298, in restore
{self.saver_def.filename_tensor_name: save_path})
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\client\session.py", line 968, in run
run_metadata_ptr)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\client\session.py", line 1191, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\client\session.py", line 1369, in _do_run
run_metadata)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\client\session.py", line 1394, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key BeamSearchDecoderStep/multi_rnn_cell/cell_0_attention/attention_wrapper/lstm_cell_9/bias not found in checkpoint
[[node save/RestoreV2 (defined at \Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py:1647) ]]
Original stack trace for 'save/RestoreV2':
File "/Users/yangrui/Desktop/PythonProject/Korean_Speech/eval.py", line 114, in <module>
main(args)
File "/Users/yangrui/Desktop/PythonProject/Korean_Speech/eval.py", line 85, in main
input_fn=lambda: input_fn(
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 467, in evaluate
name=name)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 510, in _actual_eval
return _evaluate()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 499, in _evaluate
output_dir=self.eval_dir(name))
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1647, in _evaluate_run
config=self._session_config)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\evaluation.py", line 269, in _evaluate_once
session_creator=session_creator, hooks=hooks) as session:
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1038, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 749, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1231, in __init__
_WrappedSession.__init__(self, self._create_session())
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1236, in _create_session
return self._sess_creator.create_session()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 902, in create_session
self.tf_sess = self._session_creator.create_session()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 660, in create_session
self._scaffold.finalize()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 235, in finalize
self._saver = training_saver._get_saver_or_default() # pylint: disable=protected-access
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 606, in _get_saver_or_default
saver = Saver(sharded=True, allow_empty=True)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 835, in __init__
self.build()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 847, in build
self._build(self._filename, build_save=True, build_restore=True)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 885, in _build
build_restore=build_restore)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 509, in _build_internal
restore_sequentially, reshape)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 388, in _AddShardedRestoreOps
name="restore_shard"))
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 335, in _AddRestoreOps
restore_sequentially)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 582, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\ops\gen_io_ops.py", line 1510, in restore_v2
name=name)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
attrs=attr_protos, op_def=op_def)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\framework\ops.py", line 3536, in _create_op_internal
op_def=op_def)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\framework\ops.py", line 1990, in __init__
self._traceback = tf_stack.extract_stack()
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 70, in get_tensor
self, compat.as_bytes(tensor_str))
RuntimeError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 1308, in restore
names_to_keys = object_graph_key_mapping(save_path)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 1626, in object_graph_key_mapping
object_graph_string = reader.get_tensor(trackable.OBJECT_GRAPH_PROTO_KEY)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 74, in get_tensor
error_translator(e)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 35, in error_translator
raise errors_impl.NotFoundError(None, None, error_message)
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/yangrui/Desktop/PythonProject/Korean_Speech/eval.py", line 114, in <module>
main(args)
File "C:/Users/yangrui/Desktop/PythonProject/Korean_Speech/eval.py", line 85, in main
input_fn=lambda: input_fn(
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 467, in evaluate
name=name)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 510, in _actual_eval
return _evaluate()
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 499, in _evaluate
output_dir=self.eval_dir(name))
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1647, in _evaluate_run
config=self._session_config)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\evaluation.py", line 269, in _evaluate_once
session_creator=session_creator, hooks=hooks) as session:
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1038, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 749, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1231, in __init__
_WrappedSession.__init__(self, self._create_session())
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1236, in _create_session
return self._sess_creator.create_session()
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 902, in create_session
self.tf_sess = self._session_creator.create_session()
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 669, in create_session
init_fn=self._scaffold.init_fn)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\session_manager.py", line 295, in prepare_session
config=config)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\session_manager.py", line 209, in _restore_checkpoint
saver.restore(sess, checkpoint_filename_with_path)
File "C:\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 1314, in restore
err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Key BeamSearchDecoderStep/multi_rnn_cell/cell_0_attention/attention_wrapper/lstm_cell_9/bias not found in checkpoint
[[node save/RestoreV2 (defined at \Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py:1647) ]]
Original stack trace for 'save/RestoreV2':
File "/Users/yangrui/Desktop/PythonProject/Korean_Speech/eval.py", line 114, in <module>
main(args)
File "/Users/yangrui/Desktop/PythonProject/Korean_Speech/eval.py", line 85, in main
input_fn=lambda: input_fn(
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 467, in evaluate
name=name)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 510, in _actual_eval
return _evaluate()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 499, in _evaluate
output_dir=self.eval_dir(name))
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1647, in _evaluate_run
config=self._session_config)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\evaluation.py", line 269, in _evaluate_once
session_creator=session_creator, hooks=hooks) as session:
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1038, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 749, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1231, in __init__
_WrappedSession.__init__(self, self._create_session())
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1236, in _create_session
return self._sess_creator.create_session()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 902, in create_session
self.tf_sess = self._session_creator.create_session()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 660, in create_session
self._scaffold.finalize()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\monitored_session.py", line 235, in finalize
self._saver = training_saver._get_saver_or_default() # pylint: disable=protected-access
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 606, in _get_saver_or_default
saver = Saver(sharded=True, allow_empty=True)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 835, in __init__
self.build()
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 847, in build
self._build(self._filename, build_save=True, build_restore=True)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 885, in _build
build_restore=build_restore)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 509, in _build_internal
restore_sequentially, reshape)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 388, in _AddShardedRestoreOps
name="restore_shard"))
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 335, in _AddRestoreOps
restore_sequentially)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\training\saver.py", line 582, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\ops\gen_io_ops.py", line 1510, in restore_v2
name=name)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
attrs=attr_protos, op_def=op_def)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\framework\ops.py", line 3536, in _create_op_internal
op_def=op_def)
File "\Anaconda3\envs\tf2\lib\site-packages\tensorflow\python\framework\ops.py", line 1990, in __init__
self._traceback = tf_stack.extract_stack()
Process finished with exit code 1
Part of my code:
import tensorflow as tf
import numpy as np
from tensorflow.python.util import nest
import tensorflow_addons as tfa

from las.ops import lstm_cell
from las.ops import pyramidal_bilstm

__all__ = [
    'listener',
    'speller',
]
class AttentionMultiCell(tf.keras.layers.StackedRNNCells):
    # class AttentionMultiCell(tf.compat.v1.nn.rnn_cell.MultiRNNCell):
    """A MultiCell with attention style."""

    def __init__(self, attention_cell, cells, use_new_attention=False):
        """Creates an AttentionMultiCell.

        Args:
          attention_cell: An instance of AttentionWrapper.
          cells: A list of RNNCell wrapped with AttentionInputWrapper.
          use_new_attention: Whether to use the attention generated from the
            current step's bottom layer output. Default is False.
        """
        cells = [attention_cell] + cells
        self.use_new_attention = use_new_attention
        super(AttentionMultiCell, self).__init__(cells)

    def __call__(self, inputs, state, training=False, scope=None):
        """Run the cell with the bottom layer's attention copied to all upper layers."""
        if not nest.is_sequence(state):
            raise ValueError(
                "Expected state to be a tuple of length %d, but received: %s"
                % (len(self.state_size), state))
        with tf.compat.v1.variable_scope(scope or "multi_rnn_cell"):
            new_states = []
            with tf.compat.v1.variable_scope("cell_0_attention"):
                attention_cell = self.cells[0]
                attention_state = state[0]
                cur_inp, new_attention_state = attention_cell(
                    inputs, attention_state)
                new_states.append(new_attention_state)
            for i in range(1, len(self.cells)):
                with tf.compat.v1.variable_scope("cell_%d" % i):
                    cell = self.cells[i]
                    cur_state = state[i]
                    if self.use_new_attention:
                        cur_inp = tf.concat(
                            [cur_inp, new_attention_state.attention], -1)
                    else:
                        cur_inp = tf.concat(
                            [cur_inp, attention_state.attention], -1)
                    cur_inp, new_state = cell(cur_inp, cur_state)
                    new_states.append(new_state)
        return cur_inp, new_states
class CustomAttention(tfa.seq2seq.LuongAttention):

    def __init__(self,
                 num_units,
                 memory,
                 memory_sequence_length=None,
                 scale=False,
                 probability_fn=None,
                 score_mask_value=None,
                 dtype=None,
                 name="CustomAttention"):
        super(CustomAttention, self).__init__(
            num_units=num_units,
            memory=memory,
            memory_sequence_length=memory_sequence_length,
            scale=scale,
            probability_fn=probability_fn,
            score_mask_value=score_mask_value,
            dtype=dtype,
            name=name)
        self._query_layer = tf.compat.v1.layers.Dense(
            num_units, name='query_layer', use_bias=False, dtype=dtype)
        self._keys = tf.nn.relu(self.keys)

    def __call__(self, query, state):
        processed_query = tf.nn.relu(self.query_layer(query))
        return super(CustomAttention, self).__call__(processed_query, state)
def listener(encoder_inputs,
             source_sequence_length,
             mode,
             hparams):
    if hparams['use_pyramidal']:
        return pyramidal_bilstm(encoder_inputs, source_sequence_length, mode, hparams)
    else:
        forward_cell_list, backward_cell_list = [], []
        for layer in range(hparams['num_layers']):
            with tf.compat.v1.variable_scope('fw_cell_{}'.format(layer)):
                cell = lstm_cell(hparams['num_units'], hparams['dropout'], mode)
                forward_cell_list.append(cell)
            with tf.compat.v1.variable_scope('bw_cell_{}'.format(layer)):
                cell = lstm_cell(hparams['num_units'], hparams['dropout'], mode)
                backward_cell_list.append(cell)
        forward_cell = tf.keras.layers.StackedRNNCells(forward_cell_list)
        backward_cell = tf.keras.layers.StackedRNNCells(backward_cell_list)
        encoder_outputs, encoder_state = tf.keras.layers.Bidirectional(
            forward_cell,
            backward_cell,
            encoder_inputs,
            sequence_length=source_sequence_length,
            dtype=tf.float32)
        encoder_outputs = tf.concat(encoder_outputs, -1)
        return (encoder_outputs, source_sequence_length), encoder_state
def attend(encoder_outputs,
           source_sequence_length,
           mode,
           hparams):
    memory = encoder_outputs

    if hparams['attention_type'] == 'luong':
        attention_fn = tfa.seq2seq.LuongAttention
    elif hparams['attention_type'] == 'bahdanau':
        attention_fn = tfa.seq2seq.BahdanauAttention
    elif hparams['attention_type'] == 'custom':
        attention_fn = CustomAttention

    attention_mechanism = attention_fn(
        hparams['num_units'], memory, source_sequence_length)

    cell_list = []
    for layer in range(hparams['num_layers']):
        # Give each decoder layer its own variable scope.
        with tf.compat.v1.variable_scope('decoder_cell_{}'.format(layer)):
            cell = lstm_cell(hparams['num_units'], hparams['dropout'], mode)
            cell_list.append(cell)

    alignment_history = (mode != tf.estimator.ModeKeys.TRAIN)

    if hparams['bottom_only']:  # False
        # Only wrap the bottom layer with the attention mechanism.
        attention_cell = cell_list.pop(0)
        attention_cell = tfa.seq2seq.AttentionWrapper(
            attention_cell, attention_mechanism,
            attention_layer_size=hparams['attention_layer_size'],
            alignment_history=alignment_history)
        decoder_cell = AttentionMultiCell(attention_cell, cell_list)
    else:
        decoder_cell = tf.keras.layers.StackedRNNCells(cell_list)
        decoder_cell = tfa.seq2seq.AttentionWrapper(
            decoder_cell, attention_mechanism,
            attention_layer_size=hparams['attention_layer_size'],
            alignment_history=alignment_history)

    return decoder_cell
def speller(encoder_outputs,
            encoder_state,
            decoder_inputs,
            source_sequence_length,
            target_sequence_length,
            mode,
            hparams):
    batch_size = tf.shape(input=encoder_outputs)[0]
    beam_width = hparams['beam_width']

    if mode == tf.estimator.ModeKeys.PREDICT and beam_width > 0:
        encoder_outputs = tfa.seq2seq.tile_batch(
            encoder_outputs, multiplier=beam_width)
        source_sequence_length = tfa.seq2seq.tile_batch(
            source_sequence_length, multiplier=beam_width)
        encoder_state = tfa.seq2seq.tile_batch(
            encoder_state, multiplier=beam_width)
        batch_size = batch_size * beam_width

    if mode == tf.estimator.ModeKeys.EVAL and beam_width > 0:
        encoder_outputs = tfa.seq2seq.tile_batch(
            encoder_outputs, multiplier=beam_width)
        source_sequence_length = tfa.seq2seq.tile_batch(
            source_sequence_length, multiplier=beam_width)
        encoder_state = tfa.seq2seq.tile_batch(
            encoder_state, multiplier=beam_width)
        batch_size = batch_size * beam_width

    def embedding_fn(ids):
        # Pass a callable object to avoid OOM when using one-hot encoding.
        if hparams['embedding_size'] != 0:
            target_embedding = tf.compat.v1.get_variable(
                'target_embedding',
                [hparams['target_vocab_size'], hparams['embedding_size']],
                dtype=tf.float32,
                initializer=tf.compat.v1.keras.initializers.VarianceScaling(
                    scale=1.0, mode="fan_avg", distribution="uniform"))
            return tf.nn.embedding_lookup(params=target_embedding, ids=ids)
        else:
            return tf.one_hot(ids, hparams['target_vocab_size'])

    decoder_cell = attend(
        encoder_outputs, source_sequence_length, mode, hparams)

    projection_layer = tf.keras.layers.Dense(
        hparams['target_vocab_size'], use_bias=True, name='projection_layer')

    if hparams['pass_hidden_state'] and hparams['bottom_only']:
        initial_state = tuple(
            zs.clone(cell_state=es)
            if isinstance(zs, tfa.seq2seq.AttentionWrapperState) else es
            for zs, es in zip(
                decoder_cell.get_initial_state(batch_size=batch_size, dtype=tf.float32),
                encoder_state))
    else:
        initial_state = decoder_cell.get_initial_state(
            batch_size=batch_size, dtype=tf.float32)

    maximum_iterations = None
    if mode != tf.estimator.ModeKeys.TRAIN:
        max_source_length = tf.reduce_max(input_tensor=source_sequence_length)
        maximum_iterations = tf.cast(
            tf.round(tf.cast(max_source_length, dtype=tf.float32)
                     * hparams['decoding_length_factor']),
            dtype=tf.int32)

    if mode == tf.estimator.ModeKeys.TRAIN:
        decoder_inputs = embedding_fn(decoder_inputs)
        decay_steps = hparams['decay_steps']
        iter_num = tf.compat.v1.train.get_global_step()
        inverse_probability = tf.compat.v1.train.polynomial_decay(
            1.0, iter_num, decay_steps, 0.6)
        sampling_probability = 1.0 - inverse_probability

        if hparams['sampling_probability']:
            helper = tfa.seq2seq.ScheduledEmbeddingTrainingSampler(
                sampling_probability=sampling_probability,
                embedding_fn=embedding_fn)
        else:
            helper = tfa.seq2seq.TrainingSampler()

        decoder = tfa.seq2seq.BasicDecoder(
            cell=decoder_cell,
            sampler=helper,
            output_layer=projection_layer,
            maximum_iterations=maximum_iterations)

        decoder_outputs, final_context_state, final_sequence_length = tfa.seq2seq.dynamic_decode(
            decoder, training=True, decoder_init_input=decoder_inputs,
            decoder_init_kwargs={
                'initial_state': initial_state,
                'sequence_length': target_sequence_length,
            })
    elif mode == tf.estimator.ModeKeys.PREDICT and beam_width > 0:
        start_tokens = tf.fill(
            [tf.compat.v1.div(batch_size, beam_width)], hparams['sos_id'])
        decoder = tfa.seq2seq.BeamSearchDecoder(
            cell=decoder_cell,
            embedding_fn=embedding_fn,
            beam_width=beam_width,
            output_layer=projection_layer,
            maximum_iterations=maximum_iterations)
        decoder_outputs, final_context_state, final_sequence_length = tfa.seq2seq.dynamic_decode(
            decoder, decoder_inputs=embedding_fn(decoder_inputs),
            training=False, decoder_init_kwargs={
                'start_tokens': start_tokens,
                'end_token': hparams['eos_id'],
                'initial_state': initial_state,
            })
    else:
        '''
        start_tokens = tf.fill([batch_size], hparams.sos_id)
        helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
            embedding_fn, start_tokens, hparams.eos_id)
        decoder = tf.contrib.seq2seq.BasicDecoder(
            decoder_cell, helper, initial_state, output_layer=projection_layer)
        '''
        start_tokens = tf.fill(
            [tf.compat.v1.div(batch_size, beam_width)], hparams['sos_id'])
        decoder = tfa.seq2seq.BeamSearchDecoder(
            cell=decoder_cell,
            embedding_fn=embedding_fn,
            beam_width=beam_width,
            output_layer=projection_layer,
            maximum_iterations=maximum_iterations)
        decoder_outputs, final_context_state, final_sequence_length = tfa.seq2seq.dynamic_decode(
            decoder, decoder_inputs=embedding_fn(decoder_inputs),
            training=False, decoder_init_kwargs={
                'start_tokens': start_tokens,
                'end_token': hparams['eos_id'],
                'initial_state': initial_state,
            })

    return decoder_outputs, final_context_state, final_sequence_length

Related

Incompatible shapes: [84,6] vs. [128,6]. Error at end of first epoch

This is the model that I built. Please help me understand whether the problem is with my model or something else, since I keep running into this issue.
The error occurs after this:
Train on 63828 samples, validate on 95743 samples
Epoch 1/1
63744/63828 [============================>.] - ETA: 2s - loss: 0.3427 - acc: 0.9943
The error occurs at the end, so I removed the validation set during training.
from tensorflow.python.keras.layers import Embedding, Input
from tensorflow.python.keras.layers import LSTM, Bidirectional, GlobalMaxPool1D, Dropout

embedding_layer = Embedding(num_of_words, EMBEDDING_DIM, weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH, trainable=False)

# building the model
# INPUT LAYER
input_layer = Input((MAX_SEQUENCE_LENGTH,))
# EMBEDDING LAYER
embedding_layer = embedding_layer(input_layer)
# BI-LSTM LAYER
lstm_layer_output = Bidirectional(LSTM(128, return_sequences=True))(embedding_layer)
lstm, forward_h, forward_c, backward_h, backward_c = Bidirectional(
    LSTM(128,
         dropout=0.2,
         return_sequences=True,
         return_state=True,
         recurrent_activation='relu',
         recurrent_initializer='glorot_uniform'))(embedding_layer)

from tensorflow.python.keras import backend as K

# CNN LAYERS WITH KERNELS 3, 4, 5
from tensorflow.python.keras.layers import Conv1D, MaxPooling1D

first_conv_layer = Conv1D(128, 3, activation='relu')(lstm_layer_output)
first_max_pooling_layer = MaxPooling1D(3)(first_conv_layer)
second_conv_layer = Conv1D(128, 4, activation='relu')(first_max_pooling_layer)
second_max_pooling_layer = MaxPooling1D(4)(second_conv_layer)
third_conv_layer = Conv1D(128, 5, activation='relu')(second_max_pooling_layer)
# third_max_pooling_layer = MaxPooling1D(5)(third_conv_layer)
global_max_pooling = GlobalMaxPool1D()(third_conv_layer)

# from tensorflow.python.keras.layers import Concatenate
# merged_pooling_layers = Concatenate(axis=1)([first_max_pooling_layer, second_max_pooling_layer, third_max_pooling_layer])
# global_max_pooling = GlobalMaxPool1D()(merged_pooling_layers)

# implementing the attention layer manually
from tensorflow.python.keras.layers import Add

rnn_output = Add()([forward_h, backward_h])
hidden_size = int(lstm.shape[2])

from tensorflow.python.keras.layers import Lambda

hsf = Lambda(lambda x: x[:, -1], output_shape=(hidden_size,),
             name='last_hidden_state_forward')(rnn_output)

from tensorflow.python.keras.layers import Multiply

def norm(m):
    return K.transpose(m)

u_t = Multiply()([Lambda(norm)(rnn_output), hsf])
context_vector = Multiply()([u_t, global_max_pooling])

def ex(m):
    return K.exp(context_vector)

exp_u_t = Lambda(ex)(context_vector)

from tensorflow.python.keras.layers import Dense

attention_vector = Dense(128, activation='softmax')(exp_u_t)
x = Dense(64, activation="softmax")(weighted_input)
output_layer = Dense(6, activation="softmax")(x)

from tensorflow.python.keras.models import Model
from tensorflow.python.keras.optimizers import Adam

model = Model(input_layer, output_layer)

from tensorflow.python.keras import optimizers

model.compile(
    loss='categorical_crossentropy',
    optimizer='sgd',
    metrics=['accuracy']
)

print('Training model...')
r = model.fit(
    data,
    target_values,
    batch_size=128,
    epochs=1,
    validation_split=0.0
)
The error I got is this:
InvalidArgumentError (see above for traceback): Incompatible shapes: [84,6] vs. [128,6]
[[Node: training/SGD/gradients/loss/dense_3_loss/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:#training/SGD/gradients/loss/dense_3_loss/mul_grad/Reshape_1"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](training/SGD/gradients/loss/dense_3_loss/mul_grad/Shape, training/SGD/gradients/loss/dense_3_loss/mul_grad/Shape_1)]]
Please help me fix this problem. Thank you.
Edit:
This is the traceback of the error
Epoch 1/1
31872/31914 [============================>.] - ETA: 1s - loss: 0.2419 Traceback (most recent call last):
File "<ipython-input-1-a7cc2e59a772>", line 165, in <module>
validation_split=0.8
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\engine\training.py", line 1216, in fit
validation_steps=validation_steps)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\engine\training_arrays.py", line 245, in fit_loop
outs = f(ins_batch)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\backend.py", line 2824, in __call__
fetches=fetches, feed_dict=feed_dict, **self.session_kwargs)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 900, in run
run_metadata_ptr)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1316, in _do_run
run_metadata)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
InvalidArgumentError: Incompatible shapes: [128,6] vs. [42,6]
[[Node: training/SGD/gradients/loss/dense_3_loss/logistic_loss/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:#training/SGD/gradients/loss/dense_3_loss/logistic_loss/mul_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](training/SGD/gradients/loss/dense_3_loss/logistic_loss/mul_grad/Shape, training/SGD/gradients/loss/dense_3_loss/logistic_loss/mul_grad/Shape_1)]]
Caused by op 'training/SGD/gradients/loss/dense_3_loss/logistic_loss/mul_grad/BroadcastGradientArgs', defined at:
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\spyder\utils\ipython\start_kernel.py", line 268, in <module>
main()
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\spyder\utils\ipython\start_kernel.py", line 264, in main
kernel.start()
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 478, in start
self.io_loop.start()
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\zmq\eventloop\ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tornado\ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tornado\stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tornado\stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 233, in dispatch_shell
handler(stream, idents, msg)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 208, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 537, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2728, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2850, in run_ast_nodes
if self.run_code(code, result):
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-1-a7cc2e59a772>", line 165, in <module>
validation_split=0.8
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\engine\training.py", line 1216, in fit
validation_steps=validation_steps)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\engine\training_arrays.py", line 90, in fit_loop
model._make_train_function()
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\engine\training.py", line 572, in _make_train_function
params=self._collected_trainable_weights, loss=self.total_loss)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\optimizers.py", line 208, in get_updates
grads = self.get_gradients(loss, params)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\optimizers.py", line 114, in get_gradients
grads = K.gradients(loss, params)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\backend.py", line 2866, in gradients
loss, variables, colocate_gradients_with_ops=True)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 494, in gradients
gate_gradients, aggregation_method, stop_gradients)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 636, in _GradientsHelper
lambda: grad_fn(op, *out_grads))
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 385, in _MaybeCompile
return grad_fn() # Exit early
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 636, in <lambda>
lambda: grad_fn(op, *out_grads))
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\math_grad.py", line 874, in _MulGrad
rx, ry = gen_array_ops.broadcast_gradient_args(sx, sy)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 673, in broadcast_gradient_args
"BroadcastGradientArgs", s0=s0, s1=s1, name=name)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3392, in create_op
op_def=op_def)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
...which was originally created as op 'loss/dense_3_loss/logistic_loss/mul', defined at:
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\spyder\utils\ipython\start_kernel.py", line 268, in <module>
main()
[elided 16 identical lines from previous traceback]
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-1-a7cc2e59a772>", line 153, in <module>
optimizer='sgd',
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\engine\training.py", line 428, in compile
output_loss = weighted_loss(y_true, y_pred, sample_weight, mask)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\engine\training_utils.py", line 438, in weighted
score_array = fn(y_true, y_pred)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\losses.py", line 116, in binary_crossentropy
return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\keras\_impl\keras\backend.py", line 3448, in binary_crossentropy
return nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_impl.py", line 181, in sigmoid_cross_entropy_with_logits
relu_logits - logits * labels,
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 979, in binary_op_wrapper
return func(x, y, name=name)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1211, in _mul_dispatch
return gen_math_ops.mul(x, y, name=name)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 4758, in mul
"Mul", x=x, y=y, name=name)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3392, in create_op
op_def=op_def)
File "C:\Users\JCMat\New\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Incompatible shapes: [128,6] vs. [42,6]
[[Node: training/SGD/gradients/loss/dense_3_loss/logistic_loss/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:#training/SGD/gradients/loss/dense_3_loss/logistic_loss/mul_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](training/SGD/gradients/loss/dense_3_loss/logistic_loss/mul_grad/Shape, training/SGD/gradients/loss/dense_3_loss/logistic_loss/mul_grad/Shape_1)]]
The issue is that your last batch doesn't contain 128 rows, but only 84, since the length of your dataset isn't evenly divisible by the batch size. Either adjust your code to allow for dynamic batch sizes, or try padding the last batch.
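One quick workaround, assuming data and target_values are NumPy arrays as in the code above (a sketch, not the only possible fix), is to trim the training set to a multiple of the batch size so every batch really contains 128 rows:

batch_size = 128
# Drop the trailing partial batch so every batch matches the fixed
# shape the custom attention ops implicitly assume.
n_full = (len(data) // batch_size) * batch_size
r = model.fit(
    data[:n_full],
    target_values[:n_full],
    batch_size=batch_size,
    epochs=1,
    validation_split=0.0
)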

How to initialize a kernel with a tensor

I have created a custom layer in Keras which simply performs a dot product between the input and a kernel. For the kernel, I wanted to use the mean of the batch as the initialization, i.e. take the mean of the batch and produce a kernel whose initial value is that mean. To do so, I created a custom kernel initializer as follows:
class Tensor_Init(Initializer):
    """Initializer that generates tensors initialized to a given tensor.

    # Arguments
        Tensor: the generator tensors.
    """
    def __init__(self, Tensor=None):
        self.Tensor = Tensor

    def __call__(self, shape, dtype=None):
        return tf.Variable(self.Tensor)

    def get_config(self):
        return {'Tensor': self.Tensor}
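For comparison, a minimal sketch (my assumption, not necessarily the intended design) of an initializer that returns a plain tensor instead of constructing a tf.Variable inside __call__; add_weight already wraps the returned value in a variable, so the initializer only needs to supply the initial value:

import tensorflow as tf
from tensorflow.keras.initializers import Initializer

class TensorValueInit(Initializer):
    """Hypothetical variant that hands back the given tensor as the initial value."""

    def __init__(self, tensor=None):
        self.tensor = tensor

    def __call__(self, shape, dtype=None):
        # Return a tensor, not a tf.Variable; Keras creates the variable itself.
        return tf.convert_to_tensor(self.tensor, dtype=dtype)

    def get_config(self):
        return {'tensor': self.tensor}

Note that an initial value computed from the layer's inputs still depends on the input placeholder, which is consistent with the feed error reported further down.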
This is the call method of the custom layer in Keras. I simply compute the mean of the batch and use it with the above initializer class to produce a kernel. I use it as follows in the custom layer:
def call(self, inputs):
    data_format = conv_utils.convert_data_format(self.data_format, self.rank + 2)
    inputs = tf.extract_image_patches(
        inputs,
        ksizes=(1,) + self.kernel_size + (1,),
        strides=(1,) + self.strides + (1,),
        rates=(1,) + self.dilation_rate + (1,),
        padding=self.padding.upper(),
    )
    inputs = K.reshape(inputs, [-1, inputs.get_shape().as_list()[1],
                                inputs.get_shape().as_list()[2],
                                self.kernel_size[0] * self.kernel_size[1],
                                self.output_dim])
    self.kernel = self.add_weight(
        name='kernel', shape=(),
        initializer=Tensor_Init(Tensor=tf.reduce_mean(inputs, 0)),
        trainable=True)
    outputs = (tf.einsum('NHWKC,HWKC->NHWC', inputs, self.kernel) + self.c) ** self.p
    if self.data_format == 'channels_first':
        outputs = K.permute_dimensions(outputs, (0, 3, 1, 2))
    return outputs
The model is created and compiled normally, but when I start training I get this error:
InvalidArgumentError: You must feed a value for placeholder tensor 'conv2d_1_input' with dtype float and shape [?,48,48,3]
[[node conv2d_1_input (defined at C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras\backend\tensorflow_backend.py:736) ]]
Original stack trace for 'conv2d_1_input':
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
app.start()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelapp.py", line 563, in start
self.io_loop.start()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\platform\asyncio.py", line 148, in start
self.asyncio_loop.run_forever()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\asyncio\base_events.py", line 438, in run_forever
self._run_once()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\asyncio\base_events.py", line 1451, in _run_once
handle._run()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\asyncio\events.py", line 145, in _run
self._callback(*self._args)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\ioloop.py", line 690, in <lambda>
lambda f: self._run_callback(functools.partial(callback, future))
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\ioloop.py", line 743, in _run_callback
ret = callback()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 787, in inner
self.run()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 748, in run
yielded = self.gen.send(value)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 378, in dispatch_queue
yield self.process_one()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 225, in wrapper
runner = Runner(result, future, yielded)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 714, in __init__
self.run()
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 748, in run
yielded = self.gen.send(value)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 365, in process_one
yield gen.maybe_future(dispatch(*args))
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 209, in wrapper
yielded = next(result)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 272, in dispatch_shell
yield gen.maybe_future(handler(stream, idents, msg))
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 209, in wrapper
yielded = next(result)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 542, in execute_request
user_expressions, allow_stdin,
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tornado\gen.py", line 209, in wrapper
yielded = next(result)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\ipkernel.py", line 294, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\zmqshell.py", line 536, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 2855, in run_cell
raw_cell, store_history, silent, shell_futures)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in _run_cell
return runner(coro)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\async_helpers.py", line 68, in _pseudo_sync_runner
coro.send(None)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 3058, in run_cell_async
interactivity=interactivity, compiler=compiler, result=result)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 3249, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-35eda01d200a>", line 75, in <module>
model = create_vgg16()
File "<ipython-input-2-35eda01d200a>", line 12, in create_vgg16
model.add(Conv2D(64, (5, 5), input_shape=(48,48,3), padding='same'))
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\sequential.py", line 162, in add
name=layer.name + '_input')
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\input_layer.py", line 178, in Input
input_tensor=tensor)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras\engine\input_layer.py", line 87, in __init__
name=self.name)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras\backend\tensorflow_backend.py", line 736, in placeholder
shape=shape, ndim=ndim, dtype=dtype, sparse=sparse, name=name)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\backend.py", line 998, in placeholder
x = array_ops.placeholder(dtype, shape=shape, name=name)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2143, in placeholder
return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 7401, in placeholder
"Placeholder", dtype=dtype, shape=shape, name=name)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
op_def=op_def)
File "C:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
I was able to pass the mean of the batch to the kernel by simply creating a zero-initialized kernel and then assigning the mean value to it, without even creating a custom initializer. I modified the custom layer as follows:
def call(self, inputs):
    data_format = conv_utils.convert_data_format(self.data_format, self.rank + 2)
    inputs = tf.extract_image_patches(
        inputs,
        ksizes=(1,) + self.kernel_size + (1,),
        strides=(1,) + self.strides + (1,),
        rates=(1,) + self.dilation_rate + (1,),
        padding=self.padding.upper(),
    )
    inputs = K.reshape(inputs, [-1, inputs.get_shape().as_list()[1],
                                inputs.get_shape().as_list()[2],
                                self.kernel_size[0] * self.kernel_size[1],
                                self.output_dim])
    weights = tf.reduce_mean(inputs, 0)
    self.kernel = self.add_weight(name='kernel',
                                  shape=(weights.get_shape().as_list()[0],
                                         weights.get_shape().as_list()[1],
                                         weights.get_shape().as_list()[2],
                                         weights.get_shape().as_list()[3]),
                                  initializer='zeros',
                                  trainable=True)
    tf.compat.v1.assign(self.kernel, weights)
    outputs = (tf.einsum('NHWKC,HWKC->NHWC', inputs, self.kernel) + self.c) ** self.p
    if self.data_format == 'channels_first':
        outputs = K.permute_dimensions(outputs, (0, 3, 1, 2))
    return outputs

TensorFlow create a dataset using generator for multiple columns with different data types

I know that I can use tensorflow.data.TextLineDataset for this, but I'd like to write a customized function that creates a Dataset from a generator.
I'm implementing the input function for the census income data like this:
_CSV_COLUMNS = [
    ('age', tf.int32),
    ('workclass', tf.string),
    ('fnlwgt', tf.int32),
    ('education', tf.string),
    ('education_num', tf.int32),
    ('marital_status', tf.string),
    ('occupation', tf.string),
    ('relationship', tf.string),
    ('race', tf.string),
    ('gender', tf.string),
    ('capital_gain', tf.int32),
    ('capital_loss', tf.int32),
    ('hours_per_week', tf.int32),
    ('native_country', tf.string),
    ('income_bracket', tf.string),
]
def input_csv(data_file, num_epochs, batch_size):
    df = pd.read_csv(data_file, header=None)

    def gen():
        for row in df.iterrows():
            row = row[1]
            yield dict(zip([n[0] for n in _CSV_COLUMNS[:14]], row[:14])), row[14] == '>50K'

    return tf.data.Dataset.from_generator(gen, (dict(_CSV_COLUMNS[:14]), tf.bool))
When I try this function with the Estimator API, it results in this error:
InvalidArgumentError (see above for traceback): assertion failed: [Feature (key: age) cannot have rank 0. Given: Tensor("IteratorGetNext:0", dtype=int32)] [Condition x > 0 did not hold element-wise:] [x (linear/linear_model_1/linear_model/age/Rank:0) = ] [0]
Any ideas? Thanks in advance.
Additional info:
I'm testing it with SageMaker local mode. The train_input_fn and model_fn look like this:
_NUMERIC_COLUMNS = [
    tf.feature_column.numeric_column(c) for c in
    ['age', 'education_num', 'capital_gain', 'capital_loss', 'hours_per_week']
]

def model_fn(features, labels, mode, hyperparameters):
    classifier = tf.estimator.LinearClassifier(_NUMERIC_COLUMNS)
    return classifier.model_fn(features, labels, mode, None)

def train_input_fn(training_dir, hyperparameters):
    return input_csv(os.path.join(training_dir, 'adult.data.csv'), 3, 20)
The traceback is as follows (I added 2 blank lines around my source):
Caused by op 'linear/linear_model_1/linear_model/age/assert_positive/assert_less/Assert/Assert', defined at:
File "/usr/local/bin/entry.py", line 28, in <module>
modes[mode]()
File "/usr/local/lib/python3.6/site-packages/container_support/training.py", line 36, in start
fw.train()
File "/usr/local/lib/python3.6/site-packages/tf_container/train_entry_point.py", line 164, in train
train_wrapper.train()
File "/usr/local/lib/python3.6/site-packages/tf_container/trainer.py", line 73, in train
tf.estimator.train_and_evaluate(estimator=estimator, train_spec=train_spec, eval_spec=eval_spec)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 451, in train_and_evaluate
return executor.run()
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 617, in run
getattr(self, task_to_run)()
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 654, in run_master
self._start_distributed_training(saving_listeners=saving_listeners)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 767, in _start_distributed_training
saving_listeners=saving_listeners)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 376, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1145, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1170, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1133, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tf_container/trainer.py", line 108, in _model_fn
return self.customer_script.model_fn(features, labels, mode, params)
File "/opt/ml/code/train.py", line 32, in model_fn
return classifier.model_fn(features, labels, mode, None)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 263, in public_model_fn
return self._call_model_fn(features, labels, mode, config)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1133, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/canned/linear.py", line 339, in _model_fn
sparse_combiner=sparse_combiner)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/canned/linear.py", line 163, in _linear_model_fn
logits = logit_fn(features=features)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/canned/linear.py", line 101, in linear_logit_fn
cols_to_vars=cols_to_vars)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 464, in linear_model
retval = linear_model_layer(features) # pylint: disable=not-callable
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 736, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 647, in call
weighted_sum = layer(builder)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 362, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 736, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 539, in call
weight_var=self._weight_var)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 2030, in _create_weighted_sum
weight_var=weight_var)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 2043, in _create_dense_column_weighted_sum
trainable=trainable)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 2474, in _get_dense_tensor
return inputs.get(self)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 2263, in get
transformed = column._transform_feature(self) # pylint: disable=protected-access
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 2442, in _transform_feature
input_tensor = inputs.get(self.key)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 2250, in get
feature_tensor = self._get_raw_feature_as_tensor(key)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py", line 2312, in _get_raw_feature_as_tensor
key, feature_tensor))]):
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/check_ops.py", line 198, in assert_positive
return assert_less(zero, x, data=data, summarize=summarize)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/check_ops.py", line 559, in assert_less
return control_flow_ops.Assert(condition, data, summarize=summarize)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 118, in wrapped
return _add_should_use_warning(fn(*args, **kwargs))
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 149, in Assert
return gen_logging_ops._assert(condition, data, summarize, name="Assert")
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_logging_ops.py", line 51, in _assert
name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
self._traceback = tf_stack.extract_stack()
You cannot create a Tensor object with different data types. Check out the official documentation.
You can consider encoding everything as a string, as the documentation suggests, or one-hot encoding and doing further preprocessing before converting to a tensor, depending on your application.
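For instance, a minimal sketch of both options; the column name and bucket size here are made up for illustration, not taken from the question:

import pandas as pd
import tensorflow as tf

df = pd.DataFrame({'zipcode': [94103, 'unknown', 10001]})  # mixed int/str column

# Option 1: cast everything to string and let a feature column handle it
df['zipcode'] = df['zipcode'].astype(str)
zip_col = tf.feature_column.categorical_column_with_hash_bucket(
    'zipcode', hash_bucket_size=100)
zip_feature = tf.feature_column.indicator_column(zip_col)

# Option 2: one-hot encode in pandas before anything becomes a tensor,
# so every value handed to TensorFlow is already a plain float32
one_hot = pd.get_dummies(df['zipcode'], prefix='zip').astype('float32')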

TFRecords QueueRunner Error

I am trying to load many CSV files into a single TFRecord file and then feed that TFRecord to my model. Below is all my code, broken down into what I think each piece is doing.
Generate the data; the target variable will be the last column.
for i in range(10):
    filename = './Data/random_csv' + str(i) + '.csv'
    pd.DataFrame(np.random.randint(0, 100, size=(100, 50))).to_csv(filename)
Functions for making TFRecord File
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def make_q_list(filepathlist, filetype):
    filepaths = []
    labels = []
    for path in filepathlist:
        data_files = os.listdir(path)
        for data in data_files:
            if data.endswith(filetype):
                data_file = os.path.join(path, data)
                data_label = os.path.basename(os.path.normpath(path))
                filepaths.append(data_file)
                labels.append(data_label)
    return filepaths, labels

def rnn_list_format(df):
    input_data_list = []
    output_data_list = []
    y = df[df.columns[-1]]
    X = df[df.columns[:-1]]
    for i in range(len(df)):
        output_data_list.append(y.loc[i])
        input_data_list.append(X.loc[i].as_matrix())
    return input_data_list, output_data_list

def data_split(df):
    y = df[df.columns[-1]]
    X = df[df.columns[:-1]]
    X, y = X.as_matrix(), y.as_matrix()
    return X, y
The function to load the CSVs into pandas and take the last column as the target variable, y. The pandas DataFrames get converted to NumPy arrays and written to the TFRecords file.
def tables_to_TF(queue_list, tf_filename, file_type='csv'):
    # Target variable needs to be the last column of data
    filepath = os.path.join(tf_filename)
    print('Writing', filepath)
    writer = tf.python_io.TFRecordWriter(tf_filename)
    for file in tqdm(queue_list):
        if file_type == 'csv':
            data = pd.read_csv(file)
            X, y = data_split(data)
        elif file_type == 'hdf':
            data = pd.read_hdf(file)
            X, y = data_split(data)
        else:
            print(file_type, 'is not supported at this time...')
            break
        rec_count = X.shape[0]
        for index in range(rec_count):
            _X = np.asarray(X[index]).tostring()
            _y = np.asarray(y[index]).tostring()
            example = tf.train.Example(features=tf.train.Features(feature={
                'X': _bytes_feature(_X),
                'y': _bytes_feature(_y)}))
            writer.write(example.SerializeToString())
The function to read the TFRecords file.
def read_and_decode(filename_queue, datashape=160*160*3):
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'X': tf.FixedLenFeature([], tf.string),
            'y': tf.FixedLenFeature([], tf.string)
        })
    X = tf.decode_raw(features['X'], tf.float32)
    X.set_shape([datashape])
    X = tf.cast(X, tf.float32)
    y = tf.decode_raw(features['y'], tf.float32)
    y.set_shape([1])
    y = tf.cast(y, tf.float32)
    return X, y
Create the batches in TensorFlow.
def inputs(train_dir, file, batch_size, num_epochs, n_classes, one_hot_labels=False, datashape=160*160*3):
    if not num_epochs: num_epochs = None
    filename = os.path.join(train_dir, file)
    with tf.name_scope('input'):
        filename_queue = tf.train.string_input_producer(
            [filename], num_epochs=num_epochs)
        X, y = read_and_decode(filename_queue, datashape)
        if one_hot_labels:
            y = tf.one_hot(y, n_classes, dtype=tf.int32)
        example_batch, label_batch = tf.train.shuffle_batch(
            [X, y], batch_size=batch_size, num_threads=2,
            capacity=2000, enqueue_many=False,
            # Ensures a minimum amount of shuffling of examples.
            min_after_dequeue=1000, name=file)
    return example_batch, label_batch
Make the TFRecord file from the data that was created.
filepathlist = ['./Data']
q, _ = make_q_list(filepathlist, '.csv')
tffilename = 'Demo_TFR.tfrecords'
tables_to_TF(q, tffilename, file_type='csv')
Attempt to load the TFRecord file into a QueueRunner.
X_train_batch, y_train_batch = inputs('./',
                                      'Demo_TFR.tfrecords',
                                      50,
                                      1,
                                      0,
                                      one_hot_labels=False,
                                      datashape=50)
sess = tf.Session()
init_op = tf.group(tf.global_variables_initializer())
sess.run(init_op)
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
sess.run([X_train_batch, y_train_batch])
ERROR
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.FailedPreconditionError'>, Attempting to use uninitialized value input/input_producer/limit_epochs/epochs
[[Node: input/input_producer/limit_epochs/CountUpTo = CountUpTo[T=DT_INT64, _class=["loc:@input/input_producer/limit_epochs/epochs"], limit=1, _device="/job:localhost/replica:0/task:0/cpu:0"](input/input_producer/limit_epochs/epochs)]]
Caused by op 'input/input_producer/limit_epochs/CountUpTo', defined at:
File "/home/mcamp/anaconda3/lib/python3.5/runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "/home/mcamp/anaconda3/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/__main__.py", line 3, in <module>
app.launch_new_instance()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 474, in start
ioloop.IOLoop.instance().start()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tornado/ioloop.py", line 887, in start
handler_func(fd_obj, events)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 276, in dispatcher
return self.dispatch_shell(stream, msg)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 228, in dispatch_shell
handler(stream, idents, msg)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 390, in execute_request
user_expressions, allow_stdin)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 501, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2717, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2821, in run_ast_nodes
if self.run_code(code, result):
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-13-a00f528d3e80>", line 7, in <module>
datashape=50)
File "<ipython-input-11-468d0a66f589>", line 94, in inputs
[filename], num_epochs=num_epochs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/training/input.py", line 230, in string_input_producer
cancel_op=cancel_op)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/training/input.py", line 156, in input_producer
input_tensor = limit_epochs(input_tensor, num_epochs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/training/input.py", line 96, in limit_epochs
counter = epochs.count_up_to(num_epochs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 652, in count_up_to
return state_ops.count_up_to(self._variable, limit=limit)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/gen_state_ops.py", line 126, in count_up_to
result = _op_def_lib.apply_op("CountUpTo", ref=ref, limit=limit, name=name)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value input/input_producer/limit_epochs/epochs
[[Node: input/input_producer/limit_epochs/CountUpTo = CountUpTo[T=DT_INT64, _class=["loc:@input/input_producer/limit_epochs/epochs"], limit=1, _device="/job:localhost/replica:0/task:0/cpu:0"](input/input_producer/limit_epochs/epochs)]]
---------------------------------------------------------------------------
OutOfRangeError Traceback (most recent call last)
/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1020 try:
-> 1021 return fn(*args)
1022 except errors.OpError as e:
/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
1002 feed_dict, fetch_list, target_list,
-> 1003 status, run_metadata)
1004
/home/mcamp/anaconda3/lib/python3.5/contextlib.py in __exit__(self, type, value, traceback)
65 try:
---> 66 next(self.gen)
67 except StopIteration:
/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in raise_exception_on_not_ok_status()
468 compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 469 pywrap_tensorflow.TF_GetCode(status))
470 finally:
OutOfRangeError: RandomShuffleQueue '_7_input_1/Demo_TFR.tfrecords/random_shuffle_queue' is closed and has insufficient elements (requested 50, current size 0)
[[Node: input_1/Demo_TFR.tfrecords = QueueDequeueMany[_class=["loc:@input_1/Demo_TFR.tfrecords/random_shuffle_queue"], component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input_1/Demo_TFR.tfrecords/random_shuffle_queue, input_1/Demo_TFR.tfrecords/n)]]
During handling of the above exception, another exception occurred:
OutOfRangeError Traceback (most recent call last)
<ipython-input-17-a00f528d3e80> in <module>()
12 coord = tf.train.Coordinator()
13 threads = tf.train.start_queue_runners(sess=sess, coord=coord)
---> 14 sess.run([X_train_batch, y_train_batch])
/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
764 try:
765 result = self._run(None, fetches, feed_dict, options_ptr,
--> 766 run_metadata_ptr)
767 if run_metadata:
768 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
962 if final_fetches or final_targets:
963 results = self._do_run(handle, final_targets, final_fetches,
--> 964 feed_dict_string, options, run_metadata)
965 else:
966 results = []
/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1012 if handle is None:
1013 return self._do_call(_run_fn, self._session, feed_dict, fetch_list,
-> 1014 target_list, options, run_metadata)
1015 else:
1016 return self._do_call(_prun_fn, self._session, handle, feed_dict,
/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1032 except KeyError:
1033 pass
-> 1034 raise type(e)(node_def, op, message)
1035
1036 def _extend_graph(self):
OutOfRangeError: RandomShuffleQueue '_7_input_1/Demo_TFR.tfrecords/random_shuffle_queue' is closed and has insufficient elements (requested 50, current size 0)
[[Node: input_1/Demo_TFR.tfrecords = QueueDequeueMany[_class=["loc:@input_1/Demo_TFR.tfrecords/random_shuffle_queue"], component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input_1/Demo_TFR.tfrecords/random_shuffle_queue, input_1/Demo_TFR.tfrecords/n)]]
Caused by op 'input_1/Demo_TFR.tfrecords', defined at:
File "/home/mcamp/anaconda3/lib/python3.5/runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "/home/mcamp/anaconda3/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/__main__.py", line 3, in <module>
app.launch_new_instance()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 474, in start
ioloop.IOLoop.instance().start()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tornado/ioloop.py", line 887, in start
handler_func(fd_obj, events)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 276, in dispatcher
return self.dispatch_shell(stream, msg)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 228, in dispatch_shell
handler(stream, idents, msg)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 390, in execute_request
user_expressions, allow_stdin)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 501, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2717, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2821, in run_ast_nodes
if self.run_code(code, result):
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-17-a00f528d3e80>", line 7, in <module>
datashape=50)
File "<ipython-input-15-468d0a66f589>", line 105, in inputs
min_after_dequeue=1000, name=file)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/training/input.py", line 917, in shuffle_batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/data_flow_ops.py", line 458, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 1099, in _queue_dequeue_many
timeout_ms=timeout_ms, name=name)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/mcamp/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()
OutOfRangeError (see above for traceback): RandomShuffleQueue '_7_input_1/Demo_TFR.tfrecords/random_shuffle_queue' is closed and has insufficient elements (requested 50, current size 0)
[[Node: input_1/Demo_TFR.tfrecords = QueueDequeueMany[_class=["loc:@input_1/Demo_TFR.tfrecords/random_shuffle_queue"], component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](input_1/Demo_TFR.tfrecords/random_shuffle_queue, input_1/Demo_TFR.tfrecords/n)]]
EDIT:
The code below seems to be the root cause of the problem. I think I am not parsing the TFRecord file properly (duh*); maybe I am not reading it back in as the correct data type. Almost the exact same code will read pictures into a TFRecord and back out. The only difference is that I am trying to send float32 values through it all.
def read_and_decode(filename_queue, datashape=160*160*3):
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'X': tf.FixedLenFeature([], tf.string),
            'y': tf.FixedLenFeature([], tf.string)
        })
    X = tf.decode_raw(features['X'], tf.float32)
    X.set_shape([datashape])
    X = tf.cast(X, tf.float32)
    y = tf.decode_raw(features['y'], tf.float32)
    y.set_shape([1])
    y = tf.cast(y, tf.float32)
    return X, y
There's a lot to follow there, so I'm not sure, but the quickest thing to check is whether your num_epochs is set properly: those OutOfRangeErrors are thrown when the epoch limit has been reached. Also note the INFO line at the top of your log: the epoch counter itself was never initialized. When num_epochs is set, tf.train.string_input_producer creates a local variable for it, so you need to run tf.local_variables_initializer() in addition to the global one; otherwise the input queue closes before anything is enqueued and shuffle_batch raises OutOfRangeError.
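Beyond that, two details in the snippets above stand out. Here is a hedged sketch of the changes, reusing the names from the question (I can't run the original data, so treat this as a sketch rather than a verified fix):

# 1. Close the writer so the TFRecord file is actually flushed to disk;
#    tables_to_TF never calls this, which can leave the file empty.
writer = tf.python_io.TFRecordWriter(tf_filename)
# ... the writing loop from tables_to_TF ...
writer.close()

# 2. Decode with the dtype that tostring() actually serialized:
#    np.random.randint yields int64 on most 64-bit platforms, so each
#    row was written as int64 bytes, not float32.
X = tf.decode_raw(features['X'], tf.int64)
X.set_shape([50])
X = tf.cast(X, tf.float32)
y = tf.decode_raw(features['y'], tf.int64)
y.set_shape([1])
y = tf.cast(y, tf.float32)

# 3. Initialize local variables as well: the epoch counter created by
#    string_input_producer(num_epochs=...) is a local variable, which is
#    exactly the uninitialized value named in the INFO log.
init_op = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())
sess.run(init_op)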

tf.nn.dynamic_rnn raised Attempting to use uninitialized value error

My graph looks like this:
with graph.as_default():
    train_inputs = tf.placeholder(tf.int32, shape=[None, None])
    with tf.device('/cpu:0'):
        embeddings = tf.Variable(tf.zeros([vocab_size, options.embed_size]))
        restorer = tf.train.Saver({'embeddings': embeddings})
        init = tf.variables_initializer([embeddings])
        uninit = tf.report_uninitialized_variables()
        embed = tf.nn.embedding_lookup(embeddings, train_inputs)
    # length() returns a [batch_size,] tensor of true sentence lengths (lengths before zero-padding)
    sequence_length = length(embed)
    lstm = tf.nn.rnn_cell.LSTMCell(options.rnn_size)
    output, _ = tf.nn.dynamic_rnn(
        lstm,
        embed,
        dtype=tf.float32,
        sequence_length=sequence_length
    )
And my session:
with tf.Session(graph=graph) as session:
    restorer.restore(session, options.restore_path)
    # tf.global_variables_initializer().run()
    init.run()
    print session.run([uninit])
    while len(data.ids):
        # data.generate_batch returns a list of size [batch_size, max_length], zero-padded
        # when the sentences are shorter than max_length. For example,
        # batch_inputs = [[1,2,3,4], [3,2,1,0], [1,2,0,0]]
        batch_inputs, _ = data.generate_batch(options.batch_size)
        feed_dict = {train_inputs: batch_inputs}
        test = session.run([tf.shape(output)], feed_dict=feed_dict)
        print test
Function length():
def length(sequence):
    length = tf.sign(sequence)
    length = tf.reduce_sum(length, reduction_indices=1)
    length = tf.cast(length, tf.int32)
    return length
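As a quick sanity check (a toy example, not from the question), the sign/reduce_sum trick computes the true lengths of a zero-padded 2-D batch of ids like the one in the comment above:

import tensorflow as tf

batch = tf.constant([[1, 2, 3, 4], [3, 2, 1, 0], [1, 2, 0, 0]])
# sign() maps every nonzero id to 1 and leaves the padding at 0, so
# summing over the time axis counts the real tokens in each row
lengths = tf.cast(tf.reduce_sum(tf.sign(batch), reduction_indices=1), tf.int32)
with tf.Session() as sess:
    print(sess.run(lengths))  # -> [4 3 2]

Note that in the graph above, length() is applied to embed, which is 3-D after the embedding lookup; if that is what the real code does, computing the lengths from train_inputs instead would avoid any ambiguity about which axis gets reduced.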
The error I got:
Traceback (most recent call last):
File "rnn.py", line 103, in <module>
test = session.run([tf.shape(output)], feed_dict=feed_dict)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value RNN/LSTMCell/W_0
[[Node: RNN/LSTMCell/W_0/read = Identity[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](RNN/LSTMCell/W_0)]]
Caused by op u'RNN/LSTMCell/W_0/read', defined at:
File "rnn.py", line 75, in <module>
sequence_length=sequence_length,
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 845, in dynamic_rnn
dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 1012, in _dynamic_rnn_loop
swap_memory=swap_memory)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2636, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2469, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2419, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 995, in _time_step
skip_conditionals=True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 403, in _rnn_step
new_output, new_state = call_cell()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 983, in <lambda>
call_cell = lambda: cell(input_t, state)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 496, in __call__
dtype, self._num_unit_shards)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 329, in _get_concat_variable
sharded_variable = _get_sharded_variable(name, shape, dtype, num_shards)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 359, in _get_sharded_variable
dtype=dtype))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 1024, in get_variable
custom_getter=custom_getter)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 850, in get_variable
custom_getter=custom_getter)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 346, in get_variable
validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 331, in _true_getter
caching_device=caching_device, validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 677, in _get_single_variable
expected_shape=shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 224, in __init__
expected_shape=expected_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 367, in _init_from_args
self._snapshot = array_ops.identity(self._variable, name="read")
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1424, in identity
result = _op_def_lib.apply_op("Identity", input=input, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value RNN/LSTMCell/W_0
[[Node: RNN/LSTMCell/W_0/read = Identity[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](RNN/LSTMCell/W_0)]]
However, when I printed out the uninitialized variables, I got [array([], dtype=object)].
When I replaced init.run() with tf.global_variables_initializer().run(), it worked.
Any idea why init.run() doesn't work?
You defined init as follows:
init = tf.variables_initializer([embeddings])
This definition means that init initializes only the embeddings variable. Calling the tf.nn.dynamic_rnn() function creates more variables, representing the various internal weights of the LSTM, and these are not initialized by init.
By contrast, tf.global_variables_initializer() returns an op that, when run, will initialize all of the (global) variables in your model, including those created for the LSTM. Incidentally, that is also why your uninit printout was empty: tf.report_uninitialized_variables() captures the variable list at the point where it is called, and yours is built before tf.nn.dynamic_rnn() has created the LSTM variables.
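A tiny, self-contained illustration of the difference (toy variable names, not the question's graph):

import tensorflow as tf

v = tf.Variable(tf.zeros([2]), name='embeddings')
init = tf.variables_initializer([v])          # covers only v
w = tf.Variable(tf.ones([2]), name='lstm_w')  # stands in for the LSTM weights
uninit = tf.report_uninitialized_variables()  # built after w, so it sees it

with tf.Session() as sess:
    init.run()
    print(sess.run(uninit))                   # -> [b'lstm_w']; missed by init
    tf.global_variables_initializer().run()   # initializes every global variable
    print(sess.run(uninit))                   # -> []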
