Restoring a Tensorflow model that uses Iterators

I have a model that trains my network using an Iterator, following the new Dataset API pipeline model that Google now recommends.
I read tfrecord files, feed data to the network, and training goes well. At the end of training I save my model so I can run inference on it later. A simplified version of the code is as follows:
""" Training and saving """
training_dataset = tf.contrib.data.TFRecordDataset(training_record)
training_dataset = training_dataset.map(ds._path_records_parser)
training_dataset = training_dataset.batch(BATCH_SIZE)
with tf.name_scope("iterators"):
    training_iterator = Iterator.from_structure(training_dataset.output_types, training_dataset.output_shapes)
    next_training_element = training_iterator.get_next()
    training_init_op = training_iterator.make_initializer(training_dataset)
def train(num_epochs):
    # compute for the number of epochs
    for e in range(1, num_epochs+1):
        session.run(training_init_op) #initializing iterator here
        while True:
            try:
                images, labels = session.run(next_training_element)
                session.run(optimizer, feed_dict={x: images, y_true: labels})
            except tf.errors.OutOfRangeError:
                saver_name = './saved_models/ucf-model'
                print("Finished Training Epoch {}".format(e))
                break
""" Restoring """
# restoring the saved model and its variables
session = tf.Session()
saver = tf.train.import_meta_graph(r'saved_models\ucf-model.meta')
saver.restore(session, tf.train.latest_checkpoint('.\saved_models'))
graph = tf.get_default_graph()
# restoring relevant tensors/ops
accuracy = graph.get_tensor_by_name("accuracy/Mean:0") #the tensor that when evaluated returns the mean accuracy of the batch
testing_iterator = graph.get_operation_by_name("iterators/Iterator") #my iterator used in testing.
next_testing_element = graph.get_operation_by_name("iterators/IteratorGetNext") #the GetNext operator for my iterator
# loading my testing set tfrecords
testing_dataset = tf.contrib.data.TFRecordDataset(testing_record_path)
testing_dataset = testing_dataset.map(ds._path_records_parser, num_threads=4, output_buffer_size=BATCH_SIZE*20)
testing_dataset = testing_dataset.batch(BATCH_SIZE)
testing_init_op = testing_iterator.make_initializer(testing_dataset) #to initialize the dataset
with tf.Session() as session:
    session.run(testing_init_op)
    while True:
        try:
            images, labels = session.run(next_testing_element)
            accuracy = session.run(accuracy, feed_dict={x: test_images, y_true: test_labels}) #error here, x, y_true not defined
        except tf.errors.OutOfRangeError:
            break
My problem is mainly with restoring the model. How do I feed testing data to the network?
When I restore my Iterator using testing_iterator = graph.get_operation_by_name("iterators/Iterator") and next_testing_element = graph.get_operation_by_name("iterators/IteratorGetNext"), I get the following error:
GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element.
So I tried to initialize my dataset using testing_init_op = testing_iterator.make_initializer(testing_dataset), and got this error: AttributeError: 'Operation' object has no attribute 'make_initializer'
Another issue: since an iterator is being used, there is no need for placeholders in the training model, as the iterator feeds data directly to the graph. But then how do I restore my feed_dict keys in the third-to-last line, where I feed data to the "accuracy" op?
EDIT: if someone could suggest a way to add placeholders between the Iterator and the network input, then I could try running the graph by evaluating the "accuracy" tensor while feeding data to the placeholders and ignoring the iterator altogether.

When restoring a saved meta graph, you can restore the initialization operation by name and then use it again to initialize the input pipeline for inference.
That is, when creating the graph, you can do
dataset_init_op = iterator.make_initializer(dataset, name='dataset_init')
And then restore this operation by doing:
dataset_init_op = graph.get_operation_by_name('dataset_init')
Here is a self-contained code snippet that compares the results of a randomly initialized model before and after restoring.
Saving an Iterator
import numpy as np
import tensorflow as tf
np.random.seed(42)
data = np.random.random([4, 4])
X = tf.placeholder(dtype=tf.float32, shape=[4, 4], name='X')
dataset = tf.data.Dataset.from_tensor_slices(X)
iterator = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)
dataset_next_op = iterator.get_next()
# name the operation
dataset_init_op = iterator.make_initializer(dataset, name='dataset_init')
w = np.random.random([1, 4])
W = tf.Variable(w, name='W', dtype=tf.float32)
output = tf.multiply(W, dataset_next_op, name='output')
sess = tf.Session()
saver = tf.train.Saver()
sess.run(tf.global_variables_initializer())
sess.run(dataset_init_op, feed_dict={X:data})
while True:
    try:
        print(sess.run(output))
    except tf.errors.OutOfRangeError:
        # save once the dataset is exhausted
        saver.save(sess, 'tmp/', global_step=1002)
        break
And then you can restore the same model for inference as follows:
Restoring saved iterator
import os
import numpy as np
import tensorflow as tf
np.random.seed(42)
data = np.random.random([4, 4])
tf.reset_default_graph()
sess = tf.Session()
saver = tf.train.import_meta_graph('tmp/-1002.meta')
ckpt = tf.train.get_checkpoint_state(os.path.dirname('tmp/checkpoint'))
saver.restore(sess, ckpt.model_checkpoint_path)
graph = tf.get_default_graph()
# Restore the init operation
dataset_init_op = graph.get_operation_by_name('dataset_init')
X = graph.get_tensor_by_name('X:0')
output = graph.get_tensor_by_name('output:0')
sess.run(dataset_init_op, feed_dict={X:data})
while True:
    try:
        print(sess.run(output))
    except tf.errors.OutOfRangeError:
        break

I would suggest using tf.contrib.data.make_saveable_from_iterator, which has been designed precisely for this purpose. It is much less verbose and does not require you to change existing code, in particular how you define your iterator.
Here is a working example in which we save everything after step 5 has completed. Note that I don't even need to know what seed is used.
import tensorflow as tf
iterator = (
    tf.data.Dataset.range(100)
    .shuffle(10)
    .make_one_shot_iterator())
batch = iterator.get_next(name='batch')
saveable_obj = tf.contrib.data.make_saveable_from_iterator(iterator)
tf.add_to_collection(tf.GraphKeys.SAVEABLE_OBJECTS, saveable_obj)
saver = tf.train.Saver()
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    for step in range(10):
        print('{}: {}'.format(step, sess.run(batch)))
        if step == 5:
            saver.save(sess, './foo', global_step=step)
# 0: 1
# 1: 6
# 2: 7
# 3: 3
# 4: 8
# 5: 10
# 6: 12
# 7: 14
# 8: 5
# 9: 17
Then later, if we resume from step 6, we get the same output.
import tensorflow as tf
saver = tf.train.import_meta_graph('./foo-5.meta')
with tf.Session() as sess:
    saver.restore(sess, './foo-5')
    for step in range(6, 10):
        print('{}: {}'.format(step, sess.run('batch:0')))
# 6: 12
# 7: 14
# 8: 5
# 9: 17
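To see why saving the iterator itself is enough to resume mid-epoch, here is a library-free sketch in plain Python (not TensorFlow code). It assumes the iterator's state amounts to a read position, the contents of a shuffle buffer, and the random generator state; the class name ShuffleStream and its methods are made up for illustration.

```python
import pickle
import random


class ShuffleStream:
    """Toy stand-in for range(n) followed by a buffered shuffle (hypothetical, not a TensorFlow API)."""

    def __init__(self, n, buffer_size, seed=None):
        self.n = n
        self.buffer_size = buffer_size
        self.position = 0          # next source element to read
        self.buffer = []           # elements waiting to be emitted
        self.rng = random.Random(seed)

    def get_next(self):
        # Top up the buffer, then emit one randomly chosen element from it.
        while self.position < self.n and len(self.buffer) < self.buffer_size:
            self.buffer.append(self.position)
            self.position += 1
        if not self.buffer:
            raise IndexError("end of sequence")
        return self.buffer.pop(self.rng.randrange(len(self.buffer)))

    def save(self):
        # Everything needed to continue exactly where we stopped.
        return pickle.dumps((self.position, self.buffer, self.rng.getstate()))

    def restore(self, blob):
        self.position, self.buffer, state = pickle.loads(blob)
        self.rng.setstate(state)


original = ShuffleStream(100, 10)
first_steps = [original.get_next() for _ in range(6)]   # steps 0..5
checkpoint = original.save()
expected = [original.get_next() for _ in range(4)]      # steps 6..9

resumed = ShuffleStream(100, 10)   # fresh object with its own random seed
resumed.restore(checkpoint)
replayed = [resumed.get_next() for _ in range(4)]
print(replayed == expected)
```

Because the random generator state travels with the checkpoint, the resumed stream does not need to know the original seed, which mirrors the remark in the answer.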

I couldn't solve the problem of initializing the iterator. However, I pre-process my dataset with the map method, applying transformations defined by Python operations wrapped in py_func, which cannot be serialized for storing/restoring, so I have to initialize my dataset when I restore it anyway.
So the remaining problem is how to feed data to my graph when I restore it. I placed a tf.identity node between the iterator output and my network input. Upon restoring, I feed my data to the identity node. A better solution that I discovered later is using placeholder_with_default(), as described in this answer.
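The behaviour that makes placeholder_with_default() attractive here can be sketched without TensorFlow: the input normally takes its value from the iterator, but an explicitly fed value overrides it and the iterator is left untouched. The snippet below is a plain-Python analogy under that assumption; DefaultedInput, its run method, and the toy network function are hypothetical names, not TensorFlow APIs.

```python
class DefaultedInput:
    """Toy model of an input that pulls from a pipeline unless a value is fed."""

    _NOT_FED = object()

    def __init__(self, pipeline):
        self.pipeline = iter(pipeline)

    def run(self, network, feed=_NOT_FED):
        # Use the fed value when given; otherwise pull the next pipeline element.
        value = next(self.pipeline) if feed is self._NOT_FED else feed
        return network(value)


def network(batch):
    # Stand-in for the model: sums the batch.
    return sum(batch)


training_batches = [[1, 2], [3, 4], [5, 6]]
model_input = DefaultedInput(training_batches)

from_pipeline = model_input.run(network)             # consumes [1, 2]
from_feed = model_input.run(network, feed=[10, 20])  # pipeline not advanced
next_from_pipeline = model_input.run(network)        # consumes [3, 4]
print(from_pipeline, from_feed, next_from_pipeline)
```

During training nothing is fed and the pipeline drives the model; after restoring, test data can be fed directly, which is the role the identity node played in the first workaround.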

I would suggest having a look at CheckpointInputPipelineHook, which implements saving iterator state for further training with tf.Estimator.

Related

How to replace feed_dict when using an input pipeline?

I have a network that has so far used feed_dict to inject data into the graph. Every few epochs, I evaluated the training and test loss by feeding a batch from either dataset to my graph.
Now, for performance reasons, I decided to use an input pipeline. Take a look at this dummy example:
import tensorflow as tf
import numpy as np
dataset_size = 200
batch_size= 5
dimension = 4
# create some training dataset
dataset = tf.data.Dataset.\
    from_tensor_slices(np.random.normal(2.0,size=(dataset_size,dimension)).
    astype(np.float32))
dataset = dataset.batch(batch_size) # take batches
iterator = dataset.make_initializable_iterator()
x = tf.cast(iterator.get_next(),tf.float32)
w = tf.Variable(np.random.normal(size=(1,dimension)).astype(np.float32))
loss_func = lambda x,w: tf.reduce_mean(tf.square(x-w)) # notice that the loss function is a mean!
loss = loss_func(x,w) # this is the loss that will be minimized
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # train one epoch
    sess.run(iterator.initializer)
    for i in range(dataset_size//batch_size):
        # the training step will update the weights based on ONE batch of examples each step
        loss1,_ = sess.run([loss,train_op])
        print('train step {:d}. batch loss {:f}.'.format(i,loss1))
    # I want to print the loss from another dataset (test set) here
Printing the loss of the training data is no problem, but how do I do this for another dataset? When using feed_dict, I simply took a batch from that set and fed it as the value for x.

There are several things you can do for that. One simple option could be something like having two datasets and iterators and use tf.cond to switch between them. However, the more powerful way of doing it is to use an iterator that supports this directly. See the guide on how to create iterators for a description of the various iterator types. For example, using a reinitializable iterator you could have something like this:
import tensorflow as tf
import numpy as np
dataset_size = 200
dataset_test_size = 20
batch_size= 5
dimension = 4
# create some training dataset
dataset = tf.data.Dataset.\
    from_tensor_slices(np.random.normal(2.0,size=(dataset_size,dimension)).
    astype(np.float32))
dataset = dataset.batch(batch_size) # take batches
# create some test dataset
dataset_test = tf.data.Dataset.\
    from_tensor_slices(np.random.normal(2.0,size=(dataset_test_size,dimension)).
    astype(np.float32))
dataset_test = dataset_test.batch(batch_size) # take batches
iterator = tf.data.Iterator.from_structure(dataset.output_types,
                                           dataset.output_shapes)
dataset_init_op = iterator.make_initializer(dataset)
dataset_test_init_op = iterator.make_initializer(dataset_test)
x = tf.cast(iterator.get_next(),tf.float32)
w = tf.Variable(np.random.normal(size=(1,dimension)).astype(np.float32))
loss_func = lambda x,w: tf.reduce_mean(tf.square(x-w)) # notice that the loss function is a mean!
loss = loss_func(x,w) # this is the loss that will be minimized
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # train one epoch
    sess.run(dataset_init_op)
    for i in range(dataset_size//batch_size):
        # the training step will update the weights based on ONE batch of examples each step
        loss1,_ = sess.run([loss,train_op])
        print('train step {:d}. batch loss {:f}.'.format(i,loss1))
    # print test loss
    sess.run(dataset_test_init_op)
    for i in range(dataset_test_size//batch_size):
        loss1 = sess.run(loss)
        print('test step {:d}. batch loss {:f}.'.format(i,loss1))
You can do something similar with a feedable iterator, depending on what you find more convenient, and I suppose even with an initializable iterator, for example making a boolean dataset that then you map to some data with tf.cond, although that would not be a very natural way to do it.
EDIT:
Here is how you can do it with an initializable iterator, actually in a cleaner way than what I was initially thinking, so maybe you actually like this more:
import tensorflow as tf
import numpy as np
dataset_size = 200
dataset_test_size = 20
batch_size= 5
dimension = 4
# create data
data = tf.constant(np.random.normal(2.0,size=(dataset_size,dimension)), tf.float32)
data_test = tf.constant(np.random.normal(2.0,size=(dataset_test_size,dimension)), tf.float32)
# choose data
testing = tf.placeholder_with_default(False, ())
current_data = tf.cond(testing, lambda: data_test, lambda: data)
# create dataset
dataset = tf.data.Dataset.from_tensor_slices(current_data)
dataset = dataset.batch(batch_size)
# create iterator
iterator = dataset.make_initializable_iterator()
x = tf.cast(iterator.get_next(),tf.float32)
w = tf.Variable(np.random.normal(size=(1,dimension)).astype(np.float32))
loss_func = lambda x,w: tf.reduce_mean(tf.square(x-w)) # notice that the loss function is a mean!
loss = loss_func(x,w) # this is the loss that will be minimized
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # train one epoch
    sess.run(iterator.initializer)
    for i in range(dataset_size//batch_size):
        # the training step will update the weights based on ONE batch of examples each step
        loss1,_ = sess.run([loss,train_op])
        print('train step {:d}. batch loss {:f}.'.format(i,loss1))
    # print test loss
    sess.run(iterator.initializer, feed_dict={testing: True})
    for i in range(dataset_test_size//batch_size):
        loss1 = sess.run(loss)
        print('test step {:d}. batch loss {:f}.'.format(i,loss1))
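The loops in these snippets run dataset_size // batch_size steps, which covers the whole dataset only because the sizes divide evenly. Dataset.batch() keeps a smaller final batch by default, so with other sizes floor division would skip it. The plain-Python sketch below (the batches helper is a hypothetical stand-in, not a TensorFlow function) shows the difference with a made-up test size of 23.

```python
import math


def batches(items, batch_size):
    """Yield consecutive batches, keeping a smaller final batch."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


dataset_test_size = 23   # deliberately not a multiple of the batch size
batch_size = 5
test_items = list(range(dataset_test_size))

floor_steps = dataset_test_size // batch_size            # steps a fixed-count loop would run
actual_batches = list(batches(test_items, batch_size))   # what is actually available
seen_with_floor = sum(len(b) for b in actual_batches[:floor_steps])

print(floor_steps, len(actual_batches), math.ceil(dataset_test_size / batch_size))
print(seen_with_floor, "of", dataset_test_size, "examples evaluated")
```

Looping until tf.errors.OutOfRangeError is raised, as other snippets on this page do, avoids having to count steps at all.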

Tensorflow dataset with changing batch size to compute test loss during training

I'm trying to run a training loop where I periodically determine the current average loss and print it to the console. In order to determine the loss I'd like to use a different batch size. So it goes like this:
dataset = create_dataset().shuffle(1000).repeat().batch(minibatch_size)
iterator = dataset.make_one_shot_iterator() # using this iterator in the graph
while ...:
    session.run(...) # perform training
    if epoch % 10 == 0:
        test_avg_loss = session.run(avg_loss) # want a different number of items here
I want a minibatch size of 10 during training but I'd like to test with 100 data points to obtain a better estimate for the average loss. How can I make the dataset return a different number of items here? I tried passing a placeholder to batch but it seems unsupported. The error is:
'ValueError : Cannot capture a placeholder (name:batchSize, type:Placeholder) by value.'
I'm open to using a different code structure altogether if that seems like a better solution. I understand it is important not to pass data using feed_dict for performance reasons, so using a dataset seems like the way to go. I'm not seeking some kind of hack; I'd like to know the right way to do this.

A good solution is to use a reinitializable iterator, which lets you switch between two (or more) Datasets, typically one for training and one for validation.
The example in the documentation is actually pretty neat:
# Define training and validation datasets with the same structure.
training_dataset = tf.data.Dataset.range(100).map(
    lambda x: x + tf.random_uniform([], -10, 10, tf.int64))
validation_dataset = tf.data.Dataset.range(50)
# A reinitializable iterator is defined by its structure. We could use the
# `output_types` and `output_shapes` properties of either `training_dataset`
# or `validation_dataset` here, because they are compatible.
iterator = tf.data.Iterator.from_structure(training_dataset.output_types,
                                           training_dataset.output_shapes)
next_element = iterator.get_next()
training_init_op = iterator.make_initializer(training_dataset)
validation_init_op = iterator.make_initializer(validation_dataset)
# Run 20 epochs in which the training dataset is traversed, followed by the
# validation dataset.
for _ in range(20):
    # Initialize an iterator over the training dataset.
    sess.run(training_init_op)
    for _ in range(100):
        sess.run(next_element)
    # Initialize an iterator over the validation dataset.
    sess.run(validation_init_op)
    for _ in range(50):
        sess.run(next_element)
Just make sure in your case that the iterator you create has an unknown batch size.
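A reinitializable iterator only accepts datasets whose shapes are compatible with the structure it was built from, which is why the batch dimension should be left unknown (None) when training and validation use different batch sizes. Here is a minimal plain-Python sketch of that compatibility rule, assuming None means "any size"; is_compatible is a hypothetical helper, not a TensorFlow function.

```python
def is_compatible(declared_shape, actual_shape):
    """Return True if actual_shape fits declared_shape, where None matches any size."""
    if len(declared_shape) != len(actual_shape):
        return False
    return all(d is None or d == a for d, a in zip(declared_shape, actual_shape))


iterator_shape = [None, 4]      # unknown batch size, 4 features

print(is_compatible(iterator_shape, [10, 4]))    # training batch of 10
print(is_compatible(iterator_shape, [100, 4]))   # evaluation batch of 100
print(is_compatible([10, 4], [100, 4]))          # a fixed batch size rejects the larger batch
```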

Based on your comment, you should look into a feedable iterator that can be used together with tf.placeholder to select what Iterator to use in each call to tf.Session.run, via the familiar feed_dict mechanism. It offers the same functionality as a reinitializable iterator, but it does not require you to initialize the iterator from the start of a dataset when you switch between iterators.
# Training and validation datasets
training_dataset = tf.data.Dataset.range(100).repeat().batch(100)
validation_dataset = tf.data.Dataset.range(150, 200).repeat().batch(10)
# A feedable iterator to toggle between validation and training dataset
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(
    handle, training_dataset.output_types, training_dataset.output_shapes)
next_element = iterator.get_next()
training_iterator = training_dataset.make_one_shot_iterator()
validation_iterator = validation_dataset.make_one_shot_iterator()
with tf.Session() as sess:
    # The `Iterator.string_handle()` method returns a tensor that can be evaluated
    # and used to feed the `handle` placeholder.
    training_handle = sess.run(training_iterator.string_handle())
    validation_handle = sess.run(validation_iterator.string_handle())
    # Run 20 epochs in which the training dataset is traversed, followed by the
    # validation dataset.
    for _ in range(20):
        for _ in range(100):
            out = sess.run(next_element, feed_dict={handle: training_handle})
        for _ in range(50):
            out = sess.run(next_element, feed_dict={handle: validation_handle})
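The practical difference between the two iterator kinds is where the read position lives. The sketch below models it in plain Python rather than TensorFlow: re-initializing restarts the chosen dataset from its first element, while separate iterators selected by a handle each keep their own position. All names here (ReinitializableIterator, iterators, the handle keys) are hypothetical.

```python
import itertools


class ReinitializableIterator:
    """One iterator object; initialize() rewinds it onto the given dataset."""

    def __init__(self):
        self._it = iter(())

    def initialize(self, dataset):
        self._it = iter(dataset)

    def get_next(self):
        return next(self._it)


training = list(range(100))
validation = list(range(150, 200))

# Reinitializable: switching datasets always starts from the beginning.
reinit = ReinitializableIterator()
reinit.initialize(training)
a = [reinit.get_next() for _ in range(3)]
reinit.initialize(validation)
b = reinit.get_next()
reinit.initialize(training)
c = reinit.get_next()          # back at the first training element

# Feedable: one iterator per dataset, chosen per call by a handle.
iterators = {
    "train": itertools.cycle(training),    # cycle() plays the role of repeat()
    "valid": itertools.cycle(validation),
}


def get_next(handle):
    return next(iterators[handle])


d = [get_next("train") for _ in range(3)]
e = get_next("valid")
f = get_next("train")          # continues where training left off

print(a, b, c)
print(d, e, f)
```

For periodic evaluation during a long training run, the second behaviour is usually what is wanted, since the training stream is not rewound each time the test loss is computed.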

Shape your placeholder with [None, None] so that it accepts any batch size.
Then, during evaluation and training, do something like the following. This is a template: dataset, epoch, input_data, label_data and the placeholders input_place and labels stand for your own objects.
import numpy as np
import tensorflow as tf
def shape(batch):
    # shape your data here: build input_data and label_data from batch
    return {'input': np.array(input_data), 'label': np.array(label_data)}
def evaluate(model, batch_size=100):
    # reuse the session opened in train()
    sess = tf.get_default_session()
    iteration = len(dataset) // batch_size
    loss = []
    for j in range(iteration):
        batch = dataset[j * batch_size:(j + 1) * batch_size]
        # shape it here before feeding to network
        batch = shape(batch)
        out = sess.run(model, feed_dict={input_place: batch['input'], labels: batch['label']})
        loss.append(out['loss'])
    return np.mean(loss)
def train(model, batch_size=10):
    iteration = len(dataset) // batch_size
    with tf.Session() as sess:
        for i in range(epoch):
            for j in range(iteration):
                batch = dataset[j * batch_size:(j + 1) * batch_size]
                # shape it here before feeding to network
                batch = shape(batch)
                out = sess.run(model, feed_dict={input_place: batch['input'], labels: batch['label']})
                print(out['loss'], out['training_accuracy'])
            print(evaluate(model))
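One caveat with averaging per-batch losses, as evaluate() does with np.mean(loss): if batches end up with different sizes (for example a smaller final batch), the plain mean of batch means differs from the mean over all examples. Here is a sketch in plain Python with made-up numbers (the per-example losses and helper names are purely illustrative) that weights each batch by its size.

```python
def mean(values):
    return sum(values) / len(values)


def weighted_mean_of_batches(batch_losses, batch_sizes):
    """Combine per-batch mean losses into the mean over all examples."""
    total = sum(loss * size for loss, size in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


# Illustrative per-example losses, split into batches of 4, 4 and 2.
per_example = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 6.0, 6.0]
batches = [per_example[0:4], per_example[4:8], per_example[8:10]]

batch_losses = [mean(b) for b in batches]
batch_sizes = [len(b) for b in batches]

naive = mean(batch_losses)                                    # treats all batches as equal
exact = weighted_mean_of_batches(batch_losses, batch_sizes)   # matches the overall mean

print(naive, exact, mean(per_example))
```

When every batch has the same size, both averages coincide, so the simpler np.mean is fine in that case.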

Saving predicted tensor to image in TensorFlow - Graph finalized

I was able to train a model in TensorFlow with my own data. The input and output of the model are images. I then tried to take the output of the predictions and save it to a PNG image file to see what's going on. Unfortunately, I get an error when running the following function, which I created to test predictions. My goal is to save the prediction, which is also an image, so that I can open it with a normal image viewer.
Some more about the code: in my main I create an estimator.
def predict_element(my_model, features):
    eval_input_fn = tf.estimator.inputs.numpy_input_fn(
        x=features,
        num_epochs=1,
        shuffle=False)
    eval_results = my_model.predict(input_fn=eval_input_fn)
    predictions = eval_results.next() #this returns a dict with my tensors
    prediction_tensor = predictions["y"] #get the tensor from the dict
    image_tensor = tf.reshape(prediction_tensor, [IMG_WIDTH, -1]) #reshape to a matrix because my returned tensor is a flat 1D one
    decoded_image = tf.image.encode_png(image_tensor)
    write_image = tf.write_file("output/my_output_image.png", decoded_image)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(write_image))
def get_input():
    filename_dataset = tf.data.Dataset.list_files("features/*.png")
    label_dataset = tf.data.Dataset.list_files("labels/*.png")
    # Make a Dataset of image tensors by reading and decoding the files.
    image_dataset = filename_dataset.map(lambda x: tf.cast(tf.image.decode_png(tf.read_file(x), channels=1),tf.float32))
    l_dataset = label_dataset.map(lambda x: tf.cast(tf.image.decode_png(tf.read_file(x),channels=1),tf.float32))
    image_reshape = image_dataset.map(lambda x: tf.reshape(x, [IM_WIDTH * IM_HEIGHT]))
    label_reshape = l_dataset.map(lambda x: tf.reshape(x, [IM_WIDTH * IM_HEIGHT]))
    iterator = image_reshape.make_one_shot_iterator()
    iterator2 = label_reshape.make_one_shot_iterator()
    next_img = iterator.get_next()
    next_lbl = iterator2.get_next()
    features = []
    labels = []
    # read all 10 images and labels and put them in the arrays
    # so we can pass them to the estimator
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(10):
            t1, t2 = sess.run([next_img, next_lbl])
            features.append(t1)
            labels.append(t2)
    return {"x": np.array(features)}, np.array(labels)
def main(unused_argv):
    features, labels = get_input() # creating the features dict {"x": }
    my_estimator = tf.estimator.Estimator(model_fn=my_cnn_model, model_dir="/tmp/my_model")
    predict_element(my_estimator, features)
The error is
Graph is finalized and cannot be modified
With some easy print() statements I could see that retrieving the dict with
eval_results = my_model.predict(input_fn=eval_input_fn)
is probably the call that finalizes the graph.
I absolutely don't know what to do or where to look for a solution here. How could I save the output?
I tried this in my model_fn:
#the last layer of my network is dropout
predictions = {
    "y": dropout
}
if mode == tf.estimator.ModeKeys.PREDICT:
    reshape1 = tf.reshape(dropout, [-1,IM_WIDTH, IM_HEIGHT])
    sliced = tf.slice(reshape1, [0,0,0], [1, IM_WIDTH, IM_HEIGHT])
    encoded = tf.image.encode_png(tf.cast(sliced, dtype=tf.uint8))
    outputfile = tf.write_file(params["output_path"], encoded)
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
My problem here is that I can't pass back the "outputfile" node so I can work with it.

Well, your graph is finalized and cannot be modified. You can either add these TensorFlow operations to your model (before running it) or simply write some Python code that saves the images separately (without using TensorFlow). Maybe I'll find some old code of mine as an example.
You could also create a second graph; then you can use TensorFlow without changing the existing model graph.
You have to distinguish between graph nodes and evaluated objects. tf.reshape does not compute a result on an evaluated array right away; it adds a new node to the graph, which is exactly what a finalized graph forbids.
https://www.tensorflow.org/programmers_guide/graphs
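As an illustration of the "plain Python, no TensorFlow" route, the sketch below encodes a flat list of grayscale values as an 8-bit PNG using only the standard library. It assumes the evaluated prediction is a flat, row-major sequence with values already scaled to 0..255; the function names and the sample data are hypothetical. In practice an imaging library would do the same job with less code.

```python
import struct
import zlib


def _chunk(tag, data):
    """Build one PNG chunk: length, tag, payload, CRC over tag + payload."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def encode_gray_png(flat_values, width, height):
    """Encode width * height grayscale values (row-major) as 8-bit PNG bytes."""
    if len(flat_values) != width * height:
        raise ValueError("expected width * height values")
    # Round and clip every value into the 0..255 range of an 8-bit pixel.
    pixels = bytes(min(255, max(0, int(round(v)))) for v in flat_values)
    # Each scanline is prefixed with filter type 0 (no filtering).
    rows = b"".join(
        b"\x00" + pixels[r * width:(r + 1) * width] for r in range(height)
    )
    # IHDR: size, bit depth 8, colour type 0 (grayscale), default methods.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(rows))
        + _chunk(b"IEND", b"")
    )


# Example: a 4x2 gradient standing in for an evaluated prediction array.
prediction = [0.0, 85.0, 170.0, 255.0, 255.0, 170.0, 85.0, 0.0]
png_bytes = encode_gray_png(prediction, width=4, height=2)
print(len(png_bytes), "bytes of PNG data")
```

The resulting bytes can be written to a file opened in binary mode ("wb"). Because this operates on the evaluated values returned by predict(), nothing is added to the estimator's graph.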

For everyone with the same problem, here is my solution. I don't know if this is the proper way, but it works.
In my predict function I created a second graph for the reshaping, slicing, encoding and saving:
pred_dict = eval_results.next() #generator the predict function returns
preds = pred_dict["y"] #get the predictions from the dict
#create the second graph
g = tf.Graph()
with g.as_default():
    inp = tf.Variable(preds)
    reshape1 = tf.reshape(inp, [IM_WIDTH, IM_HEIGHT, -1])
    sliced = tf.slice(reshape1, [0,0,0], [ IM_WIDTH, IM_HEIGHT,1])
    reshaped = tf.reshape(sliced, [IM_HEIGHT, IM_WIDTH, 1])
    encoded = tf.image.encode_png(tf.image.convert_image_dtype(reshaped,tf.uint16))
    outputfile = tf.write_file("/tmp/pred_output/prediction_img.png", encoded)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(outputfile)

What is a standard way to add tf.placeholder for test/validation data with queue runners

I have a training model that takes all training data and creates a queue:
x = tf.placeholder(tf.float32, (N, steps, size), name='x')
y = tf.placeholder(tf.float32, (N, out_size), name='y')
var_x = tf.Variable(x, trainable=False, collections=[])
var_y = tf.Variable(y, trainable=False, collections=[])
x_queue, y_queue = tf.train.slice_input_producer([var_x, var_y],
                                                 num_epochs=10, shuffle=True)
x_batch, y_batch = tf.train.batch([x_queue, y_queue], batch_size=batch_size)
...
with tf.Session() as sess:
    sess.run(var_x, feed_dict={x: X})
    sess.run(var_y, feed_dict={y: Y})
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    ...
This network works fine and I am able to train it.
In this network I'd like to add a new placeholder that takes my test data:
x_test = tf.placeholder(tf.float32, (1, steps, size), name='x_test')
And I'd like to use tf.cond to control which placeholder gets fed:
rnn_inputs = tf.cond(is_train, lambda: x, lambda: x_test)
However, a lot of posts say using tf.cond is not efficient. In addition, using a new placeholder for test/validation data is a problem since tensorflow throws an error asking me to feed data into it even if I am trying to train the model.
Is there a standard way of doing this?

The most efficient approach is to use iterators to feed your data. You can create a handle to specify whether to feed from the training or the validation dataset. Here is an example from https://www.tensorflow.org/programmers_guide/datasets. I have found this method effective.
# Define training and validation datasets with the same structure.
training_dataset = tf.data.Dataset.range(100).map(
    lambda x: x + tf.random_uniform([], -10, 10, tf.int64)).repeat()
validation_dataset = tf.data.Dataset.range(50)
# A feedable iterator is defined by a handle placeholder and its structure. We
# could use the output_types and output_shapes properties of either
# training_dataset or validation_dataset here, because they have
# identical structure.
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(
    handle, training_dataset.output_types, training_dataset.output_shapes)
next_element = iterator.get_next()
# You can use feedable iterators with a variety of different kinds of iterator
# (such as one-shot and initializable iterators).
training_iterator = training_dataset.make_one_shot_iterator()
validation_iterator = validation_dataset.make_initializable_iterator()
# The Iterator.string_handle() method returns a tensor that can be evaluated
# and used to feed the handle placeholder.
training_handle = sess.run(training_iterator.string_handle())
validation_handle = sess.run(validation_iterator.string_handle())
# Loop forever, alternating between training and validation.
while True:
    # Run 200 steps using the training dataset. Note that the training dataset is
    # infinite, and we resume from where we left off in the previous `while` loop
    # iteration.
    for _ in range(200):
        sess.run(next_element, feed_dict={handle: training_handle})
    # Run one pass over the validation dataset.
    sess.run(validation_iterator.initializer)
    for _ in range(50):
        sess.run(next_element, feed_dict={handle: validation_handle})

How to use dataset in TensorFlow session for training

I would like to perform image classification on our own large image library (millions of labeled images) with TensorFlow. I'm new to Stack Overflow, Python and TensorFlow. I worked through a few tutorials (MNIST etc.) and got to the point where I was able to prepare a TensorFlow dataset from a dictionary containing the absolute paths to the images and the corresponding labels. However, I'm stuck on using the dataset in a TensorFlow session. Here is my (example) code:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import numpy as np
import time
import mymodule # I build my module to read the images and labels
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
from tensorflow.contrib.data import Iterator
beginTime = time.time()
batch_size = 100
learning_rate = 0.005
max_steps = 2
NUM_CLASSES = 25
def input_parser(img_path, label):
    one_hot = tf.one_hot(label, NUM_CLASSES)
    img_file = tf.read_file(img_path)
    img_decoded = tf.image.decode_jpeg(img_file, channels = 3)
    return img_decoded, one_hot
#Import Training data (returns the dictionary with paths and labels)
train_dict = mymodule.getFileMap(labelList, imageList)
#Import Test data
test_dict = mymodule.getFileMap(labelList, imageList)
#Get train data
train_file_list, train_label_list = get_file_label_list(train_dict)
train_images_tensor = ops.convert_to_tensor(train_file_list, dtype=dtypes.string)
train_labels_tensor = ops.convert_to_tensor(train_label_list, dtype=dtypes.int64)
#Get test data
test_file_list, test_label_list = get_file_label_list(test_dict)
test_images_tensor = ops.convert_to_tensor(test_file_list, dtype=dtypes.string)
test_labels_tensor = ops.convert_to_tensor(test_label_list, dtype=dtypes.int64)
#Create TensorFlow Dataset object
train_data = tf.data.Dataset.from_tensor_slices((train_images_tensor, train_labels_tensor))
test_data = tf.data.Dataset.from_tensor_slices((test_images_tensor, test_labels_tensor))
# Transform the dataset so that it contains decoded images
# and one-hot vector labels
train_data = train_data.map(input_parser)
test_data = test_data.map(input_parser)
# Batching --> How to do it right?
#train_data = train_data.batch(batch_size = 100)
#test_data = train_data.batch(batch_size = 100)
#Define input placeholders
image_size = 990*990*3
images_placeholder = tf.placeholder(tf.float32, shape=[None, image_size])
labels_placeholder = tf.placeholder(tf.int64, shape=[None])
# Define variables (these are the values we want to optimize)
weigths = tf.Variable(tf.zeros([image_size, NUM_CLASSES]))
biases = tf.Variable(tf.zeros([NUM_CLASSES]))
# Define the classifier's result
logits = tf.matmul(images_placeholder, weigths) + biases
# Define the loss function
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits = logits, labels = labels_placeholder))
# Define the training operation
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Operation comparing prediction with true label
correct_prediciton = tf.equal(tf.argmax(logits, 1), labels_placeholder)
# Operation calculating the accuracy of our predictions
accuracy = tf.reduce_mean(tf.cast(correct_prediciton, tf.float32))
#Create TensorFlow Iterator object
iterator = Iterator.from_structure(train_data.output_types,
                                   train_data.output_shapes)
next_element = iterator.get_next()
#Create two initialization ops to switch between the datasets
train_init_op = iterator.make_initializer(train_data)
test_init_op = iterator.make_initializer(test_data)
with tf.Session() as sess:
    #Initialize variables
    sess.run(tf.global_variables_initializer())
    sess.run(train_init_op)
    for _ in range(10):
        try:
            elem = sess.run(next_element)
            print(elem)
        except tf.errors.OutOfRangeError:
            print("End of training dataset.")
            break
Even after following this and this tutorial, I could not work out how to use the (image and label) dataset in a TensorFlow session for training. I was able to print out the dataset by iterating through it, but wasn't able to use it for learning.
I don't understand how to access the images and labels separately after they have been merged in the train_data = tf.data.Dataset.from_tensor_slices((train_images_tensor, train_labels_tensor)) operation, as required by the 2nd tutorial. I also don't know how to implement batching correctly.
What I want to do in the session is basically this (from the 2nd tutorial):
# Generate input data batch
indices = np.random.choice(data_sets['images_train'].shape[0], batch_size)
images_batch = data_sets['images_train'][indices]
labels_batch = data_sets['labels_train'][indices]
# Periodically print out the model's current accuracy
if i % 100 == 0:
    train_accuracy = sess.run(accuracy, feed_dict={
        images_placeholder: images_batch, labels_placeholder: labels_batch})
    print('Step {:5d}: training accuracy {:g}'.format(i, train_accuracy))
# Perform a single training step
sess.run(train_step, feed_dict={images_placeholder: images_batch,
                                labels_placeholder: labels_batch})
# After finishing the training, evaluate on the test set
test_accuracy = sess.run(accuracy, feed_dict={
    images_placeholder: data_sets['images_test'],
    labels_placeholder: data_sets['labels_test']})
print('Test accuracy {:g}'.format(test_accuracy))
endTime = time.time()
print('Total time: {:5.2f}s'.format(endTime - beginTime))
If anyone can tell me how to access the images and labels in the dataset separately and use them for training, I would be really thankful. A tip on where and how to do the batching would also be appreciated.
Thank you.

In your code, next_element is a tuple of two tensors, matching the structure of your datasets: i.e. it is a tuple whose first element is an image, and second element is a label. To access the individual tensors, you can do the following:
next_element = iterator.get_next()
next_image = next_element[0]
next_label = next_element[1]
# Or, in a single line:
next_image, next_label = iterator.get_next()
To batch a tf.data.Dataset, you can use the Dataset.batch() transformation. Your commented-out code for this should work, with one fix: the second line should batch test_data rather than train_data:
train_data = train_data.batch(batch_size = 100)
test_data = test_data.batch(batch_size = 100)
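The structure rule can be pictured with ordinary Python sequences: slicing a tuple of two aligned lists yields (image, label) pairs, and batching groups those pairs while keeping the two components separate. This is only an analogy in plain Python; slices and batch are hypothetical helpers that mimic, but are not, from_tensor_slices() and Dataset.batch(), and the file names are made up.

```python
def slices(images, labels):
    """Pair up aligned sequences element by element."""
    return list(zip(images, labels))


def batch(pairs, batch_size):
    """Group (image, label) pairs into (images, labels) batches."""
    for start in range(0, len(pairs), batch_size):
        chunk = pairs[start:start + batch_size]
        images, labels = zip(*chunk)
        yield list(images), list(labels)


image_paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
label_ids = [0, 1, 1, 0, 2]

elements = slices(image_paths, label_ids)
next_image, next_label = elements[0]          # unpack one element

batched = list(batch(elements, batch_size=2))
first_images, first_labels = batched[0]       # unpack one batch

print(next_image, next_label)
print(first_images, first_labels)
print(len(batched))
```

After batching, the same two-way unpacking still works; each component simply gains a leading batch dimension.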
