I have the following problem:
I have created a model using deepchem, which is a wrapped keras model, trained it and reloaded it. I can predict using this model without a problem.
Now I want to make a copy of this model, which has one less inputs, since one input is always constant in my use scenario and always passing it lead to errors in a function I can't edit.
data = np.array(data.data, dtype=float32)
with tf.Graph().as_default() as temp_graph:
tf.import_graph_def(self.model.session.graph.as_graph_def(),
input_map={self.model._input_placeholders[1].name:
tf.constant(np.array([0], dtype=float32)),})
#self.model.session.graph = temp_graph
#for deep explainer: replace all switched dropouts with dropouts
#get input tensor for this graph
tensors = tf.contrib.graph_editor.get_tensors(temp_graph)
for t in tensors:
if "input_1" in t.name:
input_tensor = t
break
#reshape output --> only singletask!
output = tf.reshape(tensors[-1], [-1, 1])
model = (input_tensor, output)
sess = tf.Session(graph=temp_graph)
feed_dict = dict(zip([input_tensor], [data]))
print(sess.run(output, feed_dict))
In this code fragments I was able to load the graph of my model and pass a constant into its input. Now obviously I can't run this new model in the same session, since that session contains the old model. The way of running the model with the feed dict can't be changed, since it is in another package in the real scenario. I get the following error message:
Error while reading resource variable dense_2/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist.
The full trace is:
Traceback (most recent call last):
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable dense_2/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense_2/bias)
[[{{node import/model/dense_2/BiasAdd/ReadVariableOp}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 490, in <module>
main()
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 478, in main
evaluate()
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 445, in evaluate
reader.explain()
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 1534, in explain
self.explain()
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 1519, in explain
self._explain_Gradient_SHAP(self.df)
File "/EXT/Tobha/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 2047, in _explain_Gradient_SHAP
print(sess.run(output, feed_dict))
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
run_metadata)
File "/EXT/Tobha/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable dense_2/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense_2/bias)
[[node import/model/dense_2/BiasAdd/ReadVariableOp (defined at /eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py:2033) ]]
Original stack trace for 'import/model/dense_2/BiasAdd/ReadVariableOp':
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 490, in <module>
main()
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 478, in main
evaluate()
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/Models.py", line 445, in evaluate
reader.explain()
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 1534, in explain
self.explain()
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 1519, in explain
self._explain_Gradient_SHAP(self.df)
File "/eclipse-workspace/Bachelorarbeit/toolbox_dc_2_3_0/python_source/DataHandling.py", line 2033, in _explain_Gradient_SHAP
tf.constant(np.array([0], dtype=float32)),})
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 443, in import_graph_def
_ProcessNewOps(graph)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 236, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3751, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3751, in <listcomp>
for c_op in c_api_util.new_tf_operations(self)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3641, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/.conda/envs/test_BA_Tobias_std_deepchem-2-3-0_py36_20200114/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
I am using tensorflow 1.14 and Python 3.6 (This can't be changed aswell)
So my problem could be solved in 2 different ways: Either I get to run the second graph with the information that is in the old session, or I get to tell the old session to use one constant input.
Thanks for any help in advance!
best regards
Tobias
Edit:
I eventually fixed this by wrapping the class I was trying to use and overwriting some methods. I think another idea could have been, to replace one Keras input with a keras constant.
This error is a little tricky. Here's a couple of suggestions that spring to mind:
DeepChem HEAD is now running on TensorFlow 2.X. If your problem would be easier to handle in Eager mode, that might be one option. Of course, HEAD isn't stable and there might be other issues that crop up there.
DeepChem models are underneath the hood just made of Keras layers. If you can make a Keras model from the constituent layers of your model, then you can possibly avoid the DeepChem wrapper and solve the problem directly in Keras.
It might also help to add more information on the DeepChem model you're trying to use and the downstream function you're seeing an error in.
Related
Does torch.save work on hugging face models (I am using vit)? I assumed yes.
My error:
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/serialization.py", line 379, in save
_save(obj, opened_zipfile, pickle_module, pickle_protocol)
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/serialization.py", line 499, in _save
zip_file.write_record(name, storage.data_ptr(), num_bytes)
OSError: [Errno 116] Stale file handle
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/shared/rsaas/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_dist_maml_l2l.py", line 1815, in <module>
main()
File "/shared/rsaas/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_dist_maml_l2l.py", line 1748, in main
train(args=args)
File "/shared/rsaas/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_dist_maml_l2l.py", line 1795, in train
meta_train_iterations_ala_l2l(args, args.agent, args.opt, args.scheduler)
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/training/meta_training.py", line 213, in meta_train_iterations_ala_l2l
log_train_val_stats(args, args.it, step_name, train_loss, train_acc, training=True)
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/logging_uu/wandb_logging/supervised_learning.py", line 55, in log_train_val_stats
_log_train_val_stats(args=args,
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/logging_uu/wandb_logging/supervised_learning.py", line 113, in _log_train_val_stats
save_for_supervised_learning(args, ckpt_filename='ckpt.pt')
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/checkpointing_uu/supervised_learning.py", line 54, in save_for_supervised_learning
torch.save({'training_mode': args.training_mode,
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/serialization.py", line 380, in save
return
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/serialization.py", line 259, in __exit__
self.file_like.write_end_of_file()
RuntimeError: [enforce fail at inline_container.cc:298] . unexpected pos 2736460544 vs 2736460432
my code:
# - ckpt
args_pickable: Namespace = uutils.make_args_pickable(args)
# note not saving any objects, to make sure checkpoint is loadable later with no problems
torch.save({'training_mode': args.training_mode,
'it': args.it,
'epoch_num': args.epoch_num,
# 'args': args_pickable, # some versions of this might not have args!
# decided only to save the dict version to avoid this ckpt not working, make it args when loading
'args_dict': vars(args_pickable), # some versions of this might not have args!
'model_state_dict': get_model_from_ddp(args.model).state_dict(),
'model_str': str(args.model), # added later, to make it easier to check what optimizer was used
'model_hps': args.model_hps,
'model_option': args.model_option,
'opt_state_dict': args.opt.state_dict(),
'opt_str': str(args.opt),
'opt_hps': args.opt_hps,
'opt_option': args.opt_option,
'scheduler_str': str(args.scheduler),
'scheduler_state_dict': try_to_get_scheduler_state_dict(args.scheduler),
'scheduler_hps': args.scheduler_hps,
'scheduler_option': args.scheduler_option,
},
pickle_module=pickle,
f=args.log_root / ckpt_filename)
if this is not the right way to checkpoint hugging face (HF) models, what is?
cross: hf discussion forum: https://discuss.huggingface.co/t/torch-save-with-hugging-face-models-fails/25034
How can I fix this error I downloaded this code from GitHub.
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].numpy()
throws the error
AttributeError: 'Tensor' object has no attribute 'numpy'
Please help me fix this!
I used:
sess = tf.Session()
with sess.as_default():
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
And i get this error. Someone help me i just want it to work why is this so hard?
D:\Python>python TextGenOut.py
File "TextGenOut.py", line 72
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
^
IndentationError: unexpected indent
D:\Python>python TextGenOut.py
2018-09-16 21:50:57.008663: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-09-16 21:50:57.272973: W T:\src\github\tensorflow\tensorflow\core\framework\op_kernel.cc:1275] OP_REQUIRES failed at resource_variable_ops.cc:480 : Not found: Container localhost does not exist. (Could not find resource: localhost/model/embedding/embeddings)
Traceback (most recent call last):
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1278, in _do_call
return fn(*args)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable model/dense/kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/model/dense/kernel)
[[Node: model/dense/MatMul/ReadVariableOp = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/dense/kernel)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "TextGenOut.py", line 72, in <module>
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 680, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 4951, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 877, in run
run_metadata_ptr)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1272, in _do_run
run_metadata)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable model/dense/kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/model/dense/kernel)
[[Node: model/dense/MatMul/ReadVariableOp = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/dense/kernel)]]
Caused by op 'model/dense/MatMul/ReadVariableOp', defined at:
File "TextGenOut.py", line 66, in <module>
predictions, hidden = model(input_eval, hidden)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\keras\engine\base_layer.py", line 736, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "TextGenOut.py", line 39, in call
x = self.fc(output)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\keras\engine\base_layer.py", line 736, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\keras\layers\core.py", line 943, in call
outputs = gen_math_ops.mat_mul(inputs, self.kernel)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_math_ops.py", line 4750, in mat_mul
name=name)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py", line 510, in _apply_op_helper
preferred_dtype=default_dtype)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 1094, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1045, in _dense_var_to_tensor
return var._dense_var_to_tensor(dtype=dtype, name=name, as_ref=as_ref) # pylint: disable=protected-access
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1000, in _dense_var_to_tensor
return self.value()
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 662, in value
return self._read_variable_op()
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 745, in _read_variable_op
self._dtype)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_resource_variable_ops.py", line 562, in read_variable_op
"ReadVariableOp", resource=resource, dtype=dtype, name=name)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 3155, in create_op
op_def=op_def)
File "C:\Users\fried\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 1717, in __init__
self._traceback = tf_stack.extract_stack()
FailedPreconditionError (see above for traceback): Error while reading resource variable model/dense/kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/model/dense/kernel)
[[Node: model/dense/MatMul/ReadVariableOp = ReadVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/dense/kernel)]]
I suspect the place where you copied the code from had eager execution enabled, i.e. had invoked tf.enable_eager_execution() at the start of the program.
You could do the same.
UPDATE: Note that eager execution is enabled by default in TensorFlow 2.0. So the answer above applies only to TensorFlow 1.x
Since the accepted answer did not solve the problem for me so I thought it might be helpful for some people who face the problem and that already have tensorflow version >= 2.2.0 and eager execution enabled.
The issue seems to be that for certain functions during the fitting model.fit()
the #tf.function decorator prohibits the execution of functions like tensor.numpy() for performance reasons.
The solution for me was to pass the flag run_eagerly=True to the model.compile() like this:
model.compile(..., run_eagerly=True)
Tensorflow 2 has a config option to run functions "eagerly" which will enable getting Tensor values via .numpy() method. To enable eager execution, use following command:
tf.config.run_functions_eagerly(True)
Note that this is useful mainly for debugging.
See also: https://www.tensorflow.org/api_docs/python/tf/config/run_functions_eagerly
This can also happen in TF2.0 if your code is wrapped in a #tf.function or inside a Keras layer. Both of those run in graph mode. There's a lot of secretly broken code out of there because behavior differs between eager and graph modes and people are not aware that they're switching contexts, so be careful!
It happens in older version of TF. So try pip install tensorflow --upgrade
otherwise run
import tensorflow as tf
tf.enable_eager_execution()
If you are using Jupyter notebook, restart the Kernel.
tf.multinomial returns a Tensor object that contains a 2D list with drawn samples of shape [batch_size, num_samples]. Calling .eval() on that tensor object is expected to return a numpy ndarray.
Something like this:
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
You also need to ensure that you have a session active (doesn't make a lot of sense otherwise):
sess = tf.Session()
with sess.as_default():
predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].eval()
I saw similar error when I run code something like the following,
tensor = tf.multiply(ndarray, 42)
tensor.numpy() # throw AttributeError: 'Tensor' object has no attribute 'numpy'
I use anaconda 3 with tensorflow 1.14.0. I upgraded tensorflow with the command below
conda update tensorflow
now tensorflow is 2.0.0, issue fixed. Try this to see if it resolves your issue.
I had the same issue in a tf.function(): But what has worked for me is to transform the numpy array into a tensorflow tensor via tf.convert_to_tensor Doku and then go ahead with tensorflow. Maybe this trick could be useful for anyone...
You can also use tf.get_static_value() to obtain the value of a tensor. This has the benefit of not needing eager mode. See docs here.
I am trying to run an example through TF 2.0 that is basically identical to the example here: https://www.tensorflow.org/tutorials/estimator/boosted_trees except for the facts that:
My array is larger, 63319 rows x 7330 cols instead of 627 rows and 9 cols
I have no categorical columns, just numeric ones
Many of my column names are fairly long (maybe 50 chars, not that long!)
As in the example, the data comes from a pandas dataframe, has an int label, etc. There are no nan's or inf's in my data.
If I try to train either a linear classifier or a boosted trees classifier on my data, I eventually get
google.protobuf.message.DecodeError: Error parsing message
and the last file mentioned is
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3211, in _as_graph_def
graph.ParseFromString(compat.as_bytes(data))
Any ideas? I'm seeing the same behavior on both Ubuntu 18.04 and MacOS 10.15.6. Ubuntu is Python 3.6.9, TF 2.3.0, protobuf 3.13.0. Mac is Python 3.8.3, TF 2.3.0, protobuf 3.12.3
There are things I can do to track this down, like gradually shrink my data set until the error goes away, change my column headings, etc, but they are all moderately painful and seem unlikely to produce useful information.
Thanks so much!
Full traceback of error (the formatting is somewhat unpleasant; I hope it's ok):
File "<stdin>", line 1, in <module>
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 349, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1175, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1208, in _train_model_default
saving_listeners)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1507, in _train_with_estimator_spec
log_step_count_steps=log_step_count_steps) as mon_sess:
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 604, in MonitoredTrainingSession
stop_grace_period_secs=stop_grace_period_secs)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1038, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 737, in __init__
h.begin()
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 563, in begin
self._summary_writer = SummaryWriterCache.get(self._checkpoint_dir)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer_cache.py", line 63, in get
logdir, graph=ops.get_default_graph())
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer.py", line 371, in __init__
super(FileWriter, self).__init__(event_writer, graph, graph_def)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer.py", line 84, in __init__
self.add_graph(graph=graph, graph_def=graph_def)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer.py", line 194, in add_graph
true_graph_def = graph.as_graph_def(add_shapes=True)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3294, in as_graph_def
result, _ = self._as_graph_def(from_version, add_shapes)
File "/home/ginsberg/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3211, in _as_graph_def
graph.ParseFromString(compat.as_bytes(data))
google.protobuf.message.DecodeError: Error parsing message
and the code producing the error was simply est.train(train_input_fn,max_steps=100) just like in the example (with train_input_fn a copy of the example as well). Array sizes as above; X is 63319 x 7330 and Y is 63319 x 1.
I am running the Generative Adversarial Network in my personal system and I am getting the error provided as below, it may be because of GPU accessing problem as explained in this link: (Function call stack: keras_scratch_graph Error)
Since I want to run the code in my personal system which does not consist with GPU then how to manage that the code should not access the GPU?
The python code is provided in this link: (https://github.com/eriklindernoren/Keras-GAN/tree/master/pix2pix), where the running code is present in pix2pix.py file.
Produced Error is as follow:
C:\ProgramData\Anaconda2\envs\GaitRecognitionCNN-master13\lib\site-packages\keras\engine\training.py:297: UserWarning: Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?
'Discrepancy between trainable weights and collected trainable'
2020-04-08 17:42:33.366720: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Failed precondition: Error while reading resource variable _AnonymousVar131 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar131/class tensorflow::Var does not exist.
[[{{node mul_33/ReadVariableOp}}]]
Traceback (most recent call last):
File "H:/data_rar_and_others/Code_For_GAN5/pix2pix.py", line 217, in <module>
gan.train(epochs=220, batch_size=4, sample_interval=50)
File "H:/data_rar_and_others/Code_For_GAN5/pix2pix.py", line 165, in train
d_loss_real = self.discriminator.train_on_batch([imgs_A, imgs_B], valid)
File "C:\ProgramData\Anaconda2\envs\GaitRecognitionCNN-master13\lib\site-packages\keras\engine\training.py", line 1514, in train_on_batch
outputs = self.train_function(ins)
File "C:\ProgramData\Anaconda2\envs\GaitRecognitionCNN-master13\lib\site-packages\tensorflow_core\python\keras\backend.py", line 3727, in __call__
outputs = self._graph_fn(*converted_inputs)
File "C:\ProgramData\Anaconda2\envs\GaitRecognitionCNN-master13\lib\site-packages\tensorflow_core\python\eager\function.py", line 1551, in __call__
return self._call_impl(args, kwargs)
File "C:\ProgramData\Anaconda2\envs\GaitRecognitionCNN-master13\lib\site-packages\tensorflow_core\python\eager\function.py", line 1591, in _call_impl
return self._call_flat(args, self.captured_inputs, cancellation_manager)
File "C:\ProgramData\Anaconda2\envs\GaitRecognitionCNN-master13\lib\site-packages\tensorflow_core\python\eager\function.py", line 1692, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "C:\ProgramData\Anaconda2\envs\GaitRecognitionCNN-master13\lib\site-packages\tensorflow_core\python\eager\function.py", line 545, in call
ctx=ctx)
File "C:\ProgramData\Anaconda2\envs\GaitRecognitionCNN-master13\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable _AnonymousVar131 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar131/class tensorflow::Var does not exist.
[[node mul_33/ReadVariableOp (defined at C:\ProgramData\Anaconda2\envs\GaitRecognitionCNN-master13\lib\site-packages\keras\backend\tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_6898]
Function call stack:
keras_scratch_graph
This error has been resolved when I have changed the
from keras.optimizers import Adam
with the following
from tensorflow.keras.optimizers import Adam
I'm stuck with restoring pre-trained network with Tensorflow....
import tensorflow as tf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
sess=tf.Session()
saver = tf.train.import_meta_graph('./model/20170512-110547/model-20170512-110547.meta')
saver.restore(sess,'./model/20170512-110547/')
I'd like to use pre-trained network which was trained for face recognition, and then wanna add some layers for transfer learning.
(I downloaded the model from here. https://github.com/davidsandberg/facenet)
When I execute the code above, it shows the error,
WARNING:tensorflow:The saved meta_graph is possibly from an older release:
'model_variables' collection should be of type 'byte_list', but instead is of type 'node_list'.
Traceback (most recent call last):
File "/Users/user/Desktop/desktop/Python/HCR/Transfer_face/test.py", line 7, in <module>
saver.restore(sess,'./model/20170512-110547/')
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./model/20170512-110547/
[[Node: save/RestoreV2_491 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_491/tensor_names, save/RestoreV2_491/shape_and_slices)]]
Caused by op u'save/RestoreV2_491', defined at:
File "/Users/user/Desktop/desktop/Python/HCR/Transfer_face/test.py", line 6, in <module>
saver = tf.train.import_meta_graph('./model/20170512-110547/model-20170512-110547.meta')
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1698, in import_meta_graph
**kwargs)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/meta_graph.py", line 656, in import_scoped_meta_graph
producer_op_list=producer_op_list)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 313, in import_graph_def
op_def=op_def)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Users/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./model/20170512-110547/
[[Node: save/RestoreV2_491 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_491/tensor_names, save/RestoreV2_491/shape_and_slices)]]
I can't understand why the system can't find pre-trained data...
And the directory structure is as below
USER-no-MacBook-Pro:Transfer_face user$ ls -R
model test.py
./model:
20170512-110547
./model/20170512-110547:
20170512-110547.pb
model-20170512-110547.ckpt-250000.index
model-20170512-110547.ckpt-250000.data-00000-of-00001
model-20170512-110547.meta
Import the .pb file.
import tensorflow as tf
from tensorflow.python.framework import tensor_util
with tf.gfile.GFile('20170512-110547.pb', "rb") as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
#import into default graph
tf.import_graph_def(graph_def)
#print some data
wts = [n for n in graph_def.node if n.op == 'Const']
for n in wts:
print(tensor_util.MakeNdarray(n.attr['value'].tensor))
Linked questions:
Import a simple Tensorflow frozen_model.pb file and make prediction in C++
get the value weights from .pb file by Tensorflow
Related documentation: GraphDef
You need use the ckpt path "./model/20170512-110547/model-20170512-110547.ckpt-250000" instead of the folder path.