Tensorflow: Modern way to load large data - python

I want to train a convolutional neural network (using tf.keras from Tensorflow version 1.13) using numpy arrays as input data. The training data (which I currently store in a single >30GB '.npz' file) does not fit in RAM all at once. What is the best way to save and load large data-sets into a neural network for training? Since I didn't manage to find a good answer to this (surely ubiquitous?) problem, I'm hoping to hear one here. Thank you very much in advance for any help!
Sources
Similar questions seem to have been asked many times (e.g. training-classifier-from-tfrecords-in-tensorflow, tensorflow-synchronize-readings-from-tfrecord, how-to-load-data-parallelly-in-tensorflow) but are several years old and usually contain no conclusive answer.
My current understanding is that using TFRecord files is a good way to approach this problem. The most promising tutorial I found so far explaining how to use TFRecord files with keras is medium.com. Other helpful sources were machinelearninguru.com and medium.com_source2 and sources therin.
The official tensorflow documentation and tutorials (on tf.data.Dataset, Importing Data, tf_records etc.) did not help me. In particular, several of the examples given there didn't work for me even without modifications.
My Attempt at using TFRecord files
I'm assuming TFRecords are a good way to solve my problem but I'm having a hard time using them. Here is an example I made based on the tutorial medium.com. I stripped down the code as much as I could.
# python 3.6, tensorflow 1.13.
# Adapted from https://medium.com/#moritzkrger/speeding-up-keras-with-tfrecord-datasets-5464f9836c36
import tensorflow as tf
import numpy as np
from tensorflow.python import keras as keras
# Helper functions (see also https://www.tensorflow.org/tutorials/load_data/tf_records)
def _int64_feature(value):
return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def writeTFRecords():
number_of_samples = 100 # create some random data to play with
images, labels = (np.random.sample((number_of_samples, 256, 256, 1)), np.random.randint(0, 30, number_of_samples))
writer = tf.python_io.TFRecordWriter("bla.tfrecord")
for index in range(images.shape[0]):
image = images[index]
label = labels[index]
feature = {'image': _bytes_feature(tf.compat.as_bytes(image.tostring())),
'label': _int64_feature(int(label))}
example = tf.train.Example(features=tf.train.Features(feature=feature))
writer.write(example.SerializeToString())
writer.close()
def loadTFRecord(data_path):
with tf.Session() as sess:
feature = {'train/image': tf.FixedLenFeature([], tf.string),
'train/label': tf.FixedLenFeature([], tf.int64)}
# Create a list of filenames and pass it to a queue
filename_queue = tf.train.string_input_producer([data_path], num_epochs=1)
# Define a reader and read the next record
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
# Decode the record read by the reader
features = tf.parse_single_example(serialized_example, features=feature)
# Convert the image data from string back to the numbers
image = tf.decode_raw(features['train/image'], tf.float32)
# Cast label data into int32
label = tf.cast(features['train/label'], tf.int32)
# Reshape image data into the original shape
image = tf.reshape(image, [256, 256, 1])
return image, label # I'm not 100% sure that's how this works...
# ######### generate a TFRecords file in the working directory containing random data. #################################
writeTFRecords()
# ######## Load the TFRecords file and use it to train a simple example neural network. ################################
image, label = loadTFRecord("bla.tfrecord")
model_input = keras.layers.Input(tensor=image)
model_output = keras.layers.Flatten(input_shape=(-1, 256, 256, 1))(model_input)
model_output = keras.layers.Dense(16, activation='relu')(model_output)
train_model = keras.models.Model(inputs=model_input, outputs=model_output)
train_model.compile(optimizer=keras.optimizers.RMSprop(lr=0.0001),
loss='mean_squared_error',
target_tensors=[label])
print("\n \n start training \n \n") # Execution gets stuck on fitting
train_model.fit(epochs=1, steps_per_epoch=10) # no output or error messages.
The code creates a TFRecord file and starts fitting, then just gets stuck with no output or error messages. I don't know what the problem is or how I could try to fix it.

While this is no real answer to the original question (i.e. "what is the optimal way to train on large datasets"), I managed to get tfrecords and datasets to work. Of particular help was this tutorial on YouTube. I include a minimal example with working code for anyone struggling with the same problem.
# Developed using python 3.6, tensorflow 1.14.0.
# This code writes data (pairs (label, image) where label is int64 and image is np.ndarray) into .tfrecord files and
# uses them for training a simple neural network. It is meant as a minimal working example of how to use tfrecords. This
# solution is likely not optimal. If you know how to improve it, please comment on
# https://stackoverflow.com/q/57717004/9988487. Refer to links therein for further information.
import tensorflow as tf
import numpy as np
from tensorflow.python import keras as keras
# Helper functions (see also https://www.tensorflow.org/tutorials/load_data/tf_records)
def _int64_feature(value):
return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def write_tfrecords_file(out_path: str, images: np.ndarray, labels: np.ndarray) -> None:
"""Write all image-label pairs into a single .tfrecord file.
:param out_path: File path of the .tfrecord file to generate or overwrite.
:param images: array with first dimension being the image index. Every images[i].tostring() is
serialized and written into the file as 'image': wrap_bytes(img_bytes)
:param labels: 1d array of integers. labels[i] is the label of images[i]. Written as 'label': wrap_int64(label)"""
assert len(images) == len(labels)
with tf.io.TFRecordWriter(out_path) as writer: # could use writer_options parameter to enable compression
for i in range(len(labels)):
img_bytes = images[i].tostring() # Convert the image to raw bytes.
label = labels[i]
data = {'image': _bytes_feature(img_bytes), 'label': _int64_feature(label)}
feature = tf.train.Features(feature=data) # Wrap the data as TensorFlow Features.
example = tf.train.Example(features=feature) # Wrap again as a TensorFlow Example.
serialized = example.SerializeToString() # Serialize the data.
writer.write(serialized) # Write the serialized data to the TFRecords file.
def parse_example(serialized, shape=(256, 256, 1)):
features = {'image': tf.io.FixedLenFeature([], tf.string), 'label': tf.io.FixedLenFeature([], tf.int64)}
# Parse the serialized data so we get a dict with our data.
parsed_example = tf.io.parse_single_example(serialized=serialized, features=features)
label = parsed_example['label']
image_raw = parsed_example['image'] # Get the image as raw bytes.
image = tf.decode_raw(image_raw, tf.float32) # Decode the raw bytes so it becomes a tensor with type.
image = tf.reshape(image, shape=shape)
return image, label # this function will be called once (to add it to tf graph; then parse images individually)
# create some arbitrary data to play with: 1000 images sized 256x256 with one colour channel. Use your custom np-arrays
IMAGE_WIDTH, NUM_OF_IMAGES, NUM_OF_CLASSES, COLOUR_CHANNELS = 256, 10_000, 10, 1
# using float32 to save memory. Must match type in parse_example(), tf.decode_raw(image_raw, tf.float32)
features_train = np.random.sample((NUM_OF_IMAGES, IMAGE_WIDTH, IMAGE_WIDTH, COLOUR_CHANNELS)).astype(np.float32)
labels_train = np.random.randint(low=0, high=NUM_OF_CLASSES, size=NUM_OF_IMAGES) # one random label for each image
features_eval = features_train[:200] # use the first 200 images as evaluation data for simplicity.
labels_eval = labels_train[:200]
write_tfrecords_file("train.tfrecord", features_train, labels_train) # normal: split the data files of several GB each
write_tfrecords_file("eval.tfrecord", features_eval, labels_eval) # this may take a while. Consider a progressbar
# The files are complete. Now define a model and use datasets to feed the data from the .tfrecord files into the model.
model = keras.Sequential([keras.layers.Flatten(input_shape=(256, 256, 1)),
keras.layers.Dense(128, activation='relu'),
keras.layers.Dense(10, activation='softmax')])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
# Check docs for parameters (compression, buffer size, thread count. Also www.tensorflow.org/guide/performance/datasets
train_dataset = tf.data.TFRecordDataset("train.tfrecord") # specify a list (or dataset) of file names for large data
train_dataset = train_dataset.map(parse_example) # parse tfrecords. Parameter num_parallel_calls may help performance.
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
validation_dataset = tf.data.TFRecordDataset("eval.tfrecord")
validation_dataset = validation_dataset.map(parse_example).batch(64)
model.fit(train_dataset, epochs=3)
# evaluate the results
results = model.evaluate(validation_dataset)
print('\n\nvalidation loss, validation acc:', results)
Note that it's tricky to use some_keras_model.fit(..., validation_data=some_dataset) with dataset objects. It may result in
TypeError: 'DatasetV1Adapter' object does not support indexing.
This seems to be a bug (see github.com/tensorflow/tensorflow/issues/28995) and is supposedly fixed as of tf-nightly version '1.15.0-dev20190808'; The official tutorial uses this too, although it doesn't work in most versions. An easy but dirty-ish fix is to use verbose=0 (which only suppresses program output) and plot the validation results using tensorboard. Also see Keras model.fit() with tf.dataset API + validation_data.

Related

Tensorflow returns ValueError: Cannot create a tensor proto whose content is larger than 2GB

def loadData():
images_dir = os.path.join(current_dir, 'image_data')
images = []
for each in os.listdir(images_dir):
images.append(os.path.join(images_dir,each))
all_images = tf.convert_to_tensor(images, dtype = tf.string)
images_batch = tf.train.shuffle_batch(
[all_images], batch_size = BATCH_SIZE)
return images_batch
returns
ValueError: Cannot create a tensor proto whose content is larger than 2GB.
I'm trying to load about 11GB of images. How can I overcome those limitation?
Edit: Possbile duplicate:
You can split the output classes into multiple operations and concatenate them at the end is suggest, but I do not have multiple classes I can split.
Edit2:
Solutions to this problem suggest using placeholders. So now I'm not sure who to use placeholders in that case and where I can feed the array of images to tensorflow.
Here's a minimal version of my train function to show how I initialize the session.
def train():
images_batch = loadData()
sess = tf.Session()
saver = tf.train.Saver()
sess.run(tf.global_variables_initializer())
sess.run(tf.local_variables_initializer())
for i in range(EPOCH):
train_image = sess.run(image_batch)
Using convert_to_tensor has the unexpected effect of adding your images to the computational graph, which has a hard limit of 2GB. If you hit this limit, you should reconsider how to feed images for the training process.
We already have a simple solution in TensorFlow, just use placeholders (tf.placeholder) and feed_dict in session.run. The only disadvantage in this case is that you have to produce batches of your data manually.

Tensorflow record: how to read and plot image values?

I have data in a tensorflow record file (data.record), and I seem to be able to read that data. I want to do something simple: just display the (png-encoded) image for a given example. But I can't get the image as a numpy array and simply show it. I mean, the data are in there how hard can it be to just pull it out and show it? I imagine I am missing something really obvious.
height = 700 # Image height
width = 500 # Image width
file_path = r'/home/train.record'
with tf.Session() as sess:
feature = {'image/encoded': tf.FixedLenFeature([], tf.string),
'image/object/class/label': tf.FixedLenFeature([], tf.int64)}
filename_queue = tf.train.string_input_producer([data_path], num_epochs=1)
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
parsed_example = tf.parse_single_example(serialized_example, features=feature)
image_raw = parsed_example['image/encoded']
image = tf.decode_raw(image_raw, tf.uint8)
image = tf.cast(image, tf.float32)
image = tf.reshape(image, (height, width))
This seems to have extracted an image from train.record, with the right dimensions, but it is of type tensorflow.python.framework.ops.Tensor, and when I try to plot it with something like:
cv2.imshow("image", image)
I just get an error: TypeError: Expected cv::UMat for argument 'mat'.
I have tried using eval, as recommended at a link below:
array = image.eval(session = sess)
But it did not work. The program just hangs at that point (for instance if I put it after the last line above).
More generally, it seems I am just missing something, for even when I try to get the class label:
label = parsed_example['label']
I get the same thing: not the value, but an object of type tensorflow.python.framework.ops.Tensor. I can literally see the value is there when I type the name in my ipython notebook, but am not sure how to access it as an int (or whatever).
Note I tried this, which has some methods that seem to directly convert to a numpy array but they did not work: https://github.com/yinguobing/tfrecord_utility/blob/master/view_record.py
I just got the error there is no numpy method for a tensor object.
Note I am using tensorflow 1.13, Python 3.7, working in Ubuntu 18. I get the same results whether I run from Spyder or the command line.
Related questions
- How to print the value of a Tensor object in TensorFlow?
- https://github.com/aymericdamien/TensorFlow-Examples/issues/40
To visualize a single image from the TFRecord file, you could do something along the lines of:
import tensorflow as tf
import matplotlib.pyplot as plt
def parse_fn(data_record):
feature = {'image/encoded': tf.FixedLenFeature([], tf.string),
'image/object/class/label': tf.FixedLenFeature([], tf.int64)}
sample = tf.parse_single_example(data_record, feature)
return sample
file_path = r'/home/train.record'
dataset = tf.data.TFRecordDataset([file_path])
record_iterator = dataset.make_one_shot_iterator().get_next()
with tf.Session() as sess:
# Read and parse record
parsed_example = parse_fn(record_iterator)
# Decode image and get numpy array
encoded_image = parsed_example['image/encoded']
decoded_image = tf.image.decode_jpeg(encoded_image, channels=3)
image_np = sess.run(decoded_image)
# Display image
plt.imshow(image_np)
plt.show()
This assumes that the image is JPEG-encoded. You should use the appropriate decoding function (e.g. for PNG images, use tf.image.decode_png).
NOTE: Not tested.
import tensorflow as tf
with tf.Session() as sess:
r = tf.random.uniform([10, 10])
print(type(r))
# <class 'tensorflow.python.framework.ops.Tensor'>
a = r.eval()
print(type(a))
# <class 'numpy.ndarray'>
I could not reproduce your exact case. But, you need to evaluate Tensor to NumPy NDArray. As far as I understand, this is not an issue with TensorRecord. Colab link for the code.
Try tfrmaker, a tensorflow TFRrecord utility package. You can install the package with pip:
pip install tfrmaker
Then you could visualize the batches of your dataset like this example:
import os
from tfrmaker import images, display
# mapping label names with integer encoding.
LABELS = {"bishop": 0, "knight": 1, "pawn": 2, "queen": 3, "rook": 4}
# directory contains tfrecords
TFR_DIR = "tfrecords/chess/"
# fetch all tfrecords in the directory to a list
tfr_paths = [os.path.join(TFR_DIR,file) for file in os.listdir(TFR_DIR) if os.fsdecode(file).endswith(".tfrecord")]
# load one or more tfrecords as an iterator object.
dataset = images.load(tfr_paths, batch_size = 32)
# iterate one batch and visualize it along with labels.
databatch = next(iter(dataset))
display.batch(databatch, LABELS)
The package also has some cool features like:
dynamic resizing
splitting tfrecords into optimal shards
spliting training, validation, testing of tfrecords
count no of images in tfrecords
asynchronous tfrecord creation
NOTE: This package currently supports image datasets that are organised as directories with class names as sub directory names.

How to get two tf.dataset from tf.data.Dataset.zip((images, labels))

I am working on the Python/tensorflow/mnist tutorial.
Since a few weeks using the orignal code from tensorflow web site i get the warning that the image dataset would soon be deprecated abd that i should use the following one :
https://github.com/tensorflow/models/blob/master/official/mnist/dataset.py
I load it it my code using :
from tensorflow.models.official.mnist import dataset
trainfile = dataset.train(data_dir)
Which returns :
tf.data.Dataset.zip((images, labels))
The issue is that I cannot find a,way to separate them in the following way for example :
trainfile = dataset.train(data_dir)
train_data= trainfile.images
train_label= trainfile.label
But this clearly doesnot work because the attributrs images and label do not exist. trainfile is a tf.dataset.
Knowing that tf.dataset is made of int32 and float32 i tried :
train_data = trainfile.map(lambda x,y : x.dtype == tf.float32)
But it returns and empty dataset.
I insist (but will be open mimded) in doing it this way (two complete batches of image and label) because this is how the tutorial works :
https://www.tensorflow.org/tutorials/estimators/cnn
I saw a lot of solution to get elements from datasets but nothing to go back from the zip operations that is done in the following code
tf.data.Dataset.zip((images, labels))
Thanks you in advance for your help.
I hope this helps:
inputs = tf.placeholder(tf.float32, shape=(None, 784), name='inputs')
outputs = tf.placeholder(tf.float32, shape=(None,), name='outputs')
#Prepare a tensorflow dataset
ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
ds = ds.shuffle(buffer_size=10, reshuffle_each_iteration=True).batch(batch_size=batch_size, drop_remainder=True).repeat()
iter = ds.make_one_shot_iterator()
next = iter.get_next()
inputs = next[0]
outputs = next[1]
TensorFlow's get_single_element() is finally around which can be used to unzip features and labels from the dataset.
This avoids the need of generating and using an iterator using .map(), iter() or one_shot_iterator() (which could be costly for big datasets).
get_single_element() returns a tensor (or a tuple or dict of tensors) encapsulating all the members of the dataset. We need to pass all the members of the dataset batched into a single element.
This can be used to get features as a tensor-array, or features and labels as a tuple or dictionary (of tensor-arrays) depending upon how the original dataset was created.
Check this answer on SO for an example that unpacks features and labels into a tuple of tensor-arrays.
Instead of separating into two datasets, one for images and another for labels, it's best to make a single iterator which returns both the image and the label.
The reason why this is preferred is that it's a lot easier to ensure that you match each example with its label even after a complicated series of shuffles, reorderings, filterings, etc, as you might have in a nontrivial input pipeline.
You can visualize images and find its associated labels
ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
ds = ds.shuffle(buffer_size=10).batch(batch_size=batch_size)
iter = ds.make_one_shot_iterator()
next = iter.get_next()
def display(image, label):
# display image
...
plt.imshow(image)
...
with tf.Session() as sess:
try:
while True:
image, label = sess.run(next)
# image = numpy array (batch, image_size)
# label = numpy array (batch, label)
display(image[0], label[0]) #display first image in batch
except:
pass

How do I create padded batches in Tensorflow for tf.train.SequenceExample data using the DataSet API?

For training an LSTM model in Tensorflow, I have structured my data into a tf.train.SequenceExample format and stored it into a TFRecord file. I would now like to use the new DataSet API to generate padded batches for training. In the documentation there is an example for using padded_batch, but for my data I can't figure out what the value of padded_shapes should be.
For reading the TFrecord file into the batches I have written the following Python code:
import math
import tensorflow as tf
import numpy as np
import struct
import sys
import array
if(len(sys.argv) != 2):
print "Usage: createbatches.py [RFRecord file]"
sys.exit(0)
vectorSize = 40
inFile = sys.argv[1]
def parse_function_dataset(example_proto):
sequence_features = {
'inputs': tf.FixedLenSequenceFeature(shape=[vectorSize],
dtype=tf.float32),
'labels': tf.FixedLenSequenceFeature(shape=[],
dtype=tf.int64)}
_, sequence = tf.parse_single_sequence_example(example_proto, sequence_features=sequence_features)
length = tf.shape(sequence['inputs'])[0]
return sequence['inputs'], sequence['labels']
sess = tf.InteractiveSession()
filenames = tf.placeholder(tf.string, shape=[None])
dataset = tf.contrib.data.TFRecordDataset(filenames)
dataset = dataset.map(parse_function_dataset)
# dataset = dataset.batch(1)
dataset = dataset.padded_batch(4, padded_shapes=[None])
iterator = dataset.make_initializable_iterator()
batch = iterator.get_next()
# Initialize `iterator` with training data.
training_filenames = [inFile]
sess.run(iterator.initializer, feed_dict={filenames: training_filenames})
print(sess.run(batch))
The code works well if I use dataset = dataset.batch(1) (no padding needed in that case), but when I use the padded_batch variant, I get the following error:
TypeError: If shallow structure is a sequence, input must also be a
sequence. Input has type: .
Can you help me figuring out what I should pass for the padded_shapes parameter?
(I know there is lots of example code using threading and queues for this, but I'd rather use the new DataSet API for this project)
You need to pass a tuple of shapes.
In your case you should pass
dataset = dataset.padded_batch(4, padded_shapes=([vectorSize],[None]))
or try
dataset = dataset.padded_batch(4, padded_shapes=([None],[None]))
Check this code for more details. I had to debug this method to figure out why it wasn't working for me.
If your current Dataset object contains a tuple, you can also to specify the shape of each padded element.
For example, I have a (same_sized_images, Labels) dataset and each label has different length but same rank.
def process_label(resized_img, label):
# Perfrom some tensor transformations
# ......
return resized_img, label
dataset = dataset.map(process_label)
dataset = dataset.padded_batch(batch_size,
padded_shapes=([None, None, 3],
[None, None])) # my label has rank 2
You may need to get help from the dataset output shapes:
padded_shapes = dataset.output_shapes
Beware of not passing a tuple of tuples. That gives a very vague error of "cannot convert value None to type Nonetype".
So correct:
padded_shapes = ([None, None], [None])
INCORRECT:
padded_shapes = ((None, None), (None))

How to obtain gcloud predictions by passing a base64 image to a retrained inception model?

I am trying to obtain a prediction with gcloud by passing a base64 encoded image to a retrained inception model, by using a similar approach as the one adopted by Davide Biraghi in this post.
When using 'DecodeJpeg/contents:0' as input I also get the same error when trying to get predictions, therefore I adopted a slightly different approach.
Following rhaertel80's suggestions in his answer to this post, I have created a graph that takes a jpeg image as input in 'B64Connector/input', preprocess it and feeds it to the inception model in 'ResizeBilinear:0'.
The prediction returns the values, although the wrong ones (I am trying to find a solution in another post) but at least it doesn't fail. The placeholder I use as input is
images_placeholder = tf.placeholder(dtype=tf.string, shape=(None,), name='B64Connector/input')
And I add it to the model inputs with
inputs = {"b64_bytes": 'B64Connector/input:0'}
tf.add_to_collection("inputs", json.dumps(inputs))
As Davide I am following the suggestions found in these posts: here, here and here and I am trying to get predictions with
gcloud beta ml predict --json-instances=request.json --model=MODEL
where the file request.json has been obtained with this code
jpgtxt = base64.b64encode(open(imagefile ,"rb").read())
with open( outputfile, 'w' ) as f :
f.write( json.dumps( {"b64_bytes": {"b64": jpgtxt}} ) )
I would like to know why the prediction fails when I use as input 'DecodeJpeg/contents:0' and it doesn't when I use this different approach, since they look almost identical to me: I use the same script to generate the instances (changing the input_key) and the same command line to request predictions
Is there a way to pass the instance fed to 'B64Connector/input:0' to 'DecodeJpeg/contents:0' in order to get the right predictions?
Here I describe more in detail my approach and how I use the images_placeholder.
I define a function that resizes the image:
def decode_and_resize(image_str_tensor):
"""Decodes jpeg string, resizes it and returns a uint8 tensor."""
image = tf.image.decode_jpeg(image_str_tensor, channels=MODEL_INPUT_DEPTH)
# Note resize expects a batch_size, but tf_map supresses that index,
# thus we have to expand then squeeze. Resize returns float32 in the
# range [0, uint8_max]
image = tf.expand_dims(image, 0)
image = tf.image.resize_bilinear(
image, [MODEL_INPUT_HEIGHT, MODEL_INPUT_WIDTH], align_corners=False)
image = tf.squeeze(image, squeeze_dims=[0])
image = tf.cast(image, dtype=tf.uint8)
return image
and one that generates the definition of the graph in which the resize takes place and where images_placeholder is defined and used
def create_b64_graph() :
with tf.Graph().as_default() as b64_graph:
images_placeholder = tf.placeholder(dtype=tf.string, shape=(None,),
name='B64Connector/input')
decoded_images = tf.map_fn(
decode_and_resize, images_placeholder, back_prop=False, dtype=tf.uint8)
# convert_image_dtype, also scales [0, uint8_max] -> [0, 1).
images = tf.image.convert_image_dtype(decoded_images, dtype=tf.float32)
# Finally, rescale to [-1,1] instead of [0, 1)
images = tf.sub(images, 0.5)
images = tf.mul(images, 2.0)
# NOTE: using identity to get a known name for the output tensor.
output = tf.identity(images, name='B64Connector/output')
b64_graph_def = b64_graph.as_graph_def()
return b64_graph_def
Moreover I am using the following code to merge the resizing graph with the inception graph. Can I use a similar approach to link images_placeholder directly to 'DecodeJpeg/contents:0'?
def concatenate_to_inception_graph( b64_graph_def ):
model_dir = INPUT_MODEL_PATH
model_filename = os.path.join(
model_dir, 'classify_image_graph_def.pb')
with tf.Session() as sess:
# Import the b64_graph and get its output tensor
resized_b64_tensor, = (tf.import_graph_def(b64_graph_def, name='',
return_elements=['B64Connector/output:0']))
with gfile.FastGFile(model_filename, 'rb') as f:
inception_graph_def = tf.GraphDef()
inception_graph_def.ParseFromString(f.read())
# Concatenate b64_graph and inception_graph
g_1 = tf.import_graph_def(inception_graph_def, name='inception',
input_map={'ResizeBilinear:0' : resized_b64_tensor} )
return sess.graph

Categories