I'm using the TensorFlow Dataset API and trying to predict for a multilabel classification. It works fine, but the resulting predictions have no corresponding id, so I don't know which input each prediction belongs to.
I'm using the following code to create the dataset and predict:
def test_input_fn():
    filenames = tf.train.match_filenames_once("../input/test/*_green.png")
    dataset =
def _parse_image_data(filename):
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_png(image_string, channels=1)
    image_reshape = tf.reshape(image_decoded, [512*512*1])
    return image_reshape
pred_result = estimator.predict(test_input)
Is there an easy way to attach the ids to the predictions?
You can return image_reshape, filename from your _parse_image_data(filename) function and modify your model function accordingly, so that for predictions it returns not only the predicted labels but also the filenames.
In more detail, suppose your model function is model_fn(features, labels, mode, params). Your features are now no longer just image_reshape but the pair (image_reshape, filename), so you should change every place where you use features to features[0]; you may also need to add features[1] to your predictions (I'm assuming you name your variables like this; I can give more specific code if you show me your model function).
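Here is a minimal sketch of that idea, assuming TensorFlow 2.x with eager execution (the question uses the older TF 1.x names such as tf.read_file). The helper names parse_image_with_id and make_test_dataset and the 4x4 dummy images are made up for illustration; the only point is that the filename travels through the pipeline in the same element as its pixels.

```python
import os
import tempfile

import tensorflow as tf


def parse_image_with_id(filename):
    """Return the flattened image together with the filename that identifies it."""
    image = tf.io.decode_png(tf.io.read_file(filename), channels=1)
    return {"image": tf.reshape(image, [-1]), "filename": filename}


def make_test_dataset(filenames, batch_size=2):
    # Each element carries its own id, so batching keeps image and id together.
    dataset = tf.data.Dataset.from_tensor_slices(filenames)
    return dataset.map(parse_image_with_id).batch(batch_size)


# Demo on three tiny dummy PNGs written to a temporary directory (hypothetical data).
tmp_dir = tempfile.mkdtemp()
paths = []
for i in range(3):
    path = os.path.join(tmp_dir, "img%d_green.png" % i)
    pixels = tf.fill([4, 4, 1], tf.constant(i, dtype=tf.uint8))
    tf.io.write_file(path, tf.io.encode_png(pixels))
    paths.append(path)

ids, images = [], []
for batch in make_test_dataset(paths):
    ids.extend(name.decode() for name in batch["filename"].numpy())
    images.extend(batch["image"].numpy())
```

With an Estimator, the model function could then read features["image"] for the network input and copy features["filename"] into the predictions dictionary next to the predicted labels.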
Or, if you would like a separate function that simply prints the contents of filenames:
def print_filenames():
    filenames = tf.train.match_filenames_once("../input/test/*_green.png")
    with tf.Session() as sess:
In the code above, the filenames are output in their original order, so they should be in the same order as your predictions.
I'm studying the TensorFlow OCR model from the Keras example authored by A_K_Nain. This model uses a custom object (CTCLayer). It is on this site:
I trained the model on my own dataset, and the predictions of the resulting prediction model are perfect.
I want to save and load this model, and I tried to. But I got some errors, so I added this code to the CTCLayer class.
def get_config(self):
    config = super(CTCLayer, self).get_config()
    return config
After that, I tried to save the whole model and the weights, but nothing worked.
So I tried two ways of saving.
First way.
history =
new_model = load_model('./model/my_model', custom_objects={'CTCLayer': CTCLayer})
prediction_model = keras.models.Model(
    new_model.get_layer(name='image').input, new_model.get_layer(name='dense2').output
)
and second way.
prediction_model = keras.models.Model(
    model.get_layer(name='image').input, model.get_layer(name='dense2').output
)
These still did not work. There was no error, but the prediction results are terrible.
Accurate results are obtained when training, saving and loading are performed together in the same run.
If I load the same model in a run without training, the results are very bad.
How can I use this model without training every time? Please help me.
The problem does not come from TensorFlow. In the captcha_ocr tutorial, characters is a set, and sets are unordered. So the mapping from characters to integers built by StringLookup depends on the current run of the notebook. That is why you get rubbish when using the model in another notebook without retraining: the mapping is not the same!
A solution is to use an ordered list instead of the set for characters:
characters = sorted(list(set([char for label in labels for char in label])))
Note that set here keeps a single copy of each character; the result is then converted back to a list and sorted. The mapping will then be the same in any script/notebook without retraining (as long as the same formula is used).
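To make the point concrete, here is a small sketch in plain Python. The labels are made up, and ordinary dictionaries stand in for StringLookup, so this only illustrates the ordering issue, not the Keras layer itself. The iteration order of a set of strings can change between interpreter runs because string hashing is randomized, whereas the sorted list is always the same.

```python
labels = ["b2a", "a1b", "2ab1"]  # made-up labels for illustration

# Sorting pins the vocabulary order down, independent of the set's internal order.
characters = sorted(set(char for label in labels for char in label))

# Plain dicts standing in for the char -> int and int -> char lookups.
char_to_index = {char: index for index, char in enumerate(characters)}
index_to_char = {index: char for char, index in char_to_index.items()}

encoded = [char_to_index[char] for char in "a1b"]
decoded = "".join(index_to_char[index] for index in encoded)
```

Because characters is rebuilt identically in every run, indices produced in one notebook decode to the same text in another.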
The problem is not in the saved model but in the character list that you use to map numbers back to strings. Every time you restart the notebook, the character list is rebuilt, and when you load your model it can't accurately map the numbers back to strings. To resolve this issue you need to save the character list. Please follow the code below.
train_labels_cleaned = []
characters = set()
max_len = 0
for label in train_labels:
    label = label.split(" ")[-1].strip()
    for char in label:
        characters.add(char)
    max_len = max(max_len, len(label))
print("Maximum length: ", max_len)
print("Vocab size: ", len(characters))
# Check some label samples
ff = list(characters)
# Save the list as a pickle file
import pickle
with open("/content/drive/MyDrive/Colab Notebooks/OCR_course/characters", "wb") as fp:  # Pickling
    pickle.dump(ff, fp)
# Load the character list again
import pickle
with open("/content/drive/MyDrive/Colab Notebooks/OCR_course/characters", "rb") as fp:  # Unpickling
    b = pickle.load(fp)
# Mapping characters to integers
char_to_num = StringLookup(vocabulary=b, mask_token=None)
# Mapping integers back to original characters
num_to_chars = StringLookup(vocabulary=char_to_num.get_vocabulary(), mask_token=None, invert=True)
Now when you map the numbers back to strings after prediction, the original order is retained and the predictions are decoded accurately.
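As a rough illustration of why the saved order matters, the sketch below round-trips a made-up vocabulary through pickle and decodes the same indices with the restored list and with a reordered one. The vocabulary, the indices and the temporary file path are all hypothetical, and list indexing stands in for the inverted StringLookup.

```python
import os
import pickle
import tempfile

vocabulary = ["c", "a", "t"]   # order in use at training time (made up)
predicted_indices = [0, 1, 2]  # pretend these came out of the model

# Save the vocabulary next to the model, then load it as a fresh session would.
vocab_path = os.path.join(tempfile.mkdtemp(), "characters.pkl")
with open(vocab_path, "wb") as fp:
    pickle.dump(vocabulary, fp)
with open(vocab_path, "rb") as fp:
    restored = pickle.load(fp)


def decode(indices, vocab):
    """Map integer indices back to characters using the given vocabulary order."""
    return "".join(vocab[i] for i in indices)


right = decode(predicted_indices, restored)                    # same order as training
wrong = decode(predicted_indices, list(reversed(vocabulary)))  # order changed
```

The model output is identical in both cases; only the lookup order differs, which is enough to turn a correct prediction into nonsense.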
If you still didn't understand the logic, you can watch my video in which I explained this project from scratch and resolved all the issues that you are facing.
I am working through the Python/TensorFlow MNIST tutorial.
For a few weeks now, using the original code from the TensorFlow web site, I have been getting a warning that the image dataset will soon be deprecated and that I should use the following one:
I load it in my code using:
from tensorflow.models.official.mnist import dataset
trainfile = dataset.train(data_dir)
Which returns: tf.data.Dataset.zip((images, labels))
The issue is that I cannot find a way to separate them, for example in the following way:
trainfile = dataset.train(data_dir)
train_data= trainfile.images
train_label= trainfile.label
But this clearly does not work because the attributes images and label do not exist. trainfile is a tf.data.Dataset.
Knowing that the dataset is made of int32 and float32 elements, I tried:
train_data = trainfile.filter(lambda x, y: x.dtype == tf.float32)
But it returns an empty dataset.
I insist (but am open-minded) on doing it this way (two complete batches of images and labels) because this is how the tutorial works:
I saw a lot of solutions for getting elements from datasets, but nothing for reversing the zip operation that is done in the following code: tf.data.Dataset.zip((images, labels))
Thank you in advance for your help.
I hope this helps:
inputs = tf.placeholder(tf.float32, shape=(None, 784), name='inputs')
outputs = tf.placeholder(tf.float32, shape=(None,), name='outputs')
#Prepare a tensorflow dataset
ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
ds = ds.shuffle(buffer_size=10, reshuffle_each_iteration=True).batch(batch_size=batch_size, drop_remainder=True).repeat()
iter = ds.make_one_shot_iterator()
next = iter.get_next()
inputs = next[0]
outputs = next[1]
TensorFlow's get_single_element() is now available and can be used to unzip features and labels from the dataset.
This avoids the need to create and use an iterator via .map(), iter() or one_shot_iterator() (which could be costly for big datasets).
get_single_element() returns a tensor (or a tuple or dict of tensors) containing all the members of the dataset. The dataset must first be batched so that all of its members form a single element.
This can be used to get the features as a tensor-array, or the features and labels as a tuple or dictionary (of tensor-arrays), depending on how the original dataset was created.
Check this answer on SO for an example that unpacks features and labels into a tuple of tensor-arrays.
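A minimal sketch of that approach, assuming TensorFlow 2.6 or later (where get_single_element() is a Dataset method) and a dataset small enough to fit in memory. The dummy arrays and the variable names are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Four dummy "images" of three pixels each, with made-up labels.
images = np.arange(12, dtype=np.float32).reshape(4, 3)
labels = np.array([0, 1, 0, 1], dtype=np.int32)
dataset = tf.data.Dataset.from_tensor_slices((images, labels))

# Batch everything into one element, then pull that single element out.
num_examples = len(images)
all_images, all_labels = dataset.batch(num_examples).get_single_element()

# Two complete arrays, as the tutorial-style code expects.
train_data = all_images.numpy()
train_label = all_labels.numpy()
```

Note that this materializes the whole dataset at once, so it only makes sense when the data fits in memory, as MNIST does.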
Instead of separating into two datasets, one for images and another for labels, it's best to make a single iterator which returns both the image and the label.
The reason why this is preferred is that it's a lot easier to ensure that you match each example with its label even after a complicated series of shuffles, reorderings, filterings, etc, as you might have in a nontrivial input pipeline.
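The following sketch illustrates that property, assuming TensorFlow 2.x with eager execution. The data is made up so that each label can be read off its image, which makes any mismatch visible after shuffling, filtering and batching.

```python
import numpy as np
import tensorflow as tf

# Dummy data: every pixel of image i has the value i, and its label is i.
labels = np.arange(10, dtype=np.int32)
images = np.repeat(labels[:, None], 4, axis=1).astype(np.float32)

dataset = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .shuffle(buffer_size=10, seed=0)
    .filter(lambda image, label: tf.equal(label % 2, 0))  # keep even labels only
    .batch(3)
)

pairs_match = True
seen_labels = []
for image_batch, label_batch in dataset:
    image_batch, label_batch = image_batch.numpy(), label_batch.numpy()
    # The first pixel of every image must still equal its label.
    pairs_match = pairs_match and bool(np.all(image_batch[:, 0] == label_batch))
    seen_labels.extend(label_batch.tolist())
```

Had images and labels been split into two datasets and shuffled separately, nothing would guarantee this pairing.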
You can visualize the images and find their associated labels:
ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
ds = ds.shuffle(buffer_size=10).batch(batch_size=batch_size)
iter = ds.make_one_shot_iterator()
next = iter.get_next()
def display(image, label):
    # display image
    ...
with tf.Session() as sess:
    while True:
        image, label = sess.run(next)
        # image = numpy array (batch, image_size)
        # label = numpy array (batch, label)
        display(image[0], label[0])  # display first image in batch
I have trained a model using the Dataset API, so my training code looks something like this:
with graph.as_default():
    dataset =
    dataset =, num_parallel_calls=n_workers)
    dataset = dataset.shuffle(10000)
    dataset = dataset.padded_batch(batch_size, padded_shapes={...})
    handle = tf.placeholder(tf.string, shape=[])
    iterator =,
    batch = iterator.get_next()
    # Model code
    iterator = dataset.make_initializable_iterator()
with tf.Session(graph=graph) as sess:
    train_handle =
    for epoch in range(n_epochs):
        while True:
            try:, feed_dict={handle: train_handle})
            except tf.errors.OutOfRangeError:
Now after the model is trained I want to infer on examples that are not in the datasets and I am not sure how to go about doing it.
Just to be clear, I know how to use another dataset, for example I just pass a handle to my test set upon testing.
The question is about given the scaling scheme and the fact that the network expects a handle, if I want to make a prediction to a new example which is not written to a TFRecord, how would I go about doing that?
If I'd modify the batch I'd be responsible for the scaling beforehand which is something I would like to avoid if possible.
So how should I run inference on single examples with a model trained this way?
(This is not for production purposes it is for evaluating what will happen if I change specific features)
Actually, when you use the Dataset API there is a tensor named "IteratorGetNext:0" in the graph, so you can feed it directly in the following way:
# get the input tensor from the graph
input = graph.get_tensor_by_name("IteratorGetNext:0")
# define the target tensor you want to evaluate for your prediction
predictions = ...
# finally, call the session to run it
sess.run(predictions, feed_dict={input: np.asanyarray(images), ...})
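A toy version of this trick, written against the tf.compat.v1 API so that it runs in graph mode on TensorFlow 2.x. The tiny dataset and the "model" (a sum of doubled features) are made up for illustration. The sketch reads the tensor name from the iterator output instead of hard-coding it, since the name can carry a suffix such as "IteratorGetNext_1:0" when a graph holds more than one iterator.

```python
import numpy as np
import tensorflow as tf


def build_graph():
    """Build a toy graph whose only input comes from a dataset iterator."""
    graph = tf.Graph()
    with graph.as_default():
        data = np.arange(12, dtype=np.float32).reshape(4, 3)
        dataset = tf.data.Dataset.from_tensor_slices(data).batch(2)
        iterator = tf.compat.v1.data.make_one_shot_iterator(dataset)
        batch = iterator.get_next()
        # Stand-in for the real model code.
        tf.reduce_sum(batch * 2.0, axis=1, name="predictions")
    return graph, batch.name


graph, input_name = build_graph()
with tf.compat.v1.Session(graph=graph) as sess:
    input_tensor = graph.get_tensor_by_name(input_name)
    predictions = graph.get_tensor_by_name("predictions:0")
    # Feeding the iterator output bypasses the dataset for this call.
    new_examples = np.array([[1.0, 1.0, 1.0]], dtype=np.float32)
    result = sess.run(predictions, feed_dict={input_tensor: new_examples})
```

Keep in mind that values fed this way skip everything done inside the dataset's map function, so any scaling applied there would have to be applied to the fed array by hand.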
I am trying to obtain a prediction with gcloud by passing a base64 encoded image to a retrained inception model, by using a similar approach as the one adopted by Davide Biraghi in this post.
When using 'DecodeJpeg/contents:0' as input I also get the same error when trying to get predictions, therefore I adopted a slightly different approach.
Following rhaertel80's suggestions in his answer to this post, I have created a graph that takes a jpeg image as input in 'B64Connector/input', preprocesses it and feeds it to the inception model in 'ResizeBilinear:0'.
The prediction returns values, although the wrong ones (I am trying to find a solution in another post), but at least it doesn't fail. The placeholder I use as input is
images_placeholder = tf.placeholder(dtype=tf.string, shape=(None,), name='B64Connector/input')
And I add it to the model inputs with
inputs = {"b64_bytes": 'B64Connector/input:0'}
tf.add_to_collection("inputs", json.dumps(inputs))
As Davide I am following the suggestions found in these posts: here, here and here and I am trying to get predictions with
gcloud beta ml predict --json-instances=request.json --model=MODEL
where the file request.json has been obtained with this code
jpgtxt = base64.b64encode(open(imagefile, "rb").read())
with open(outputfile, 'w') as f:
    f.write(json.dumps({"b64_bytes": {"b64": jpgtxt}}))
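For reference, a Python 3 variant of the request-building step might look like the sketch below. On Python 3, base64.b64encode returns bytes, which json.dumps cannot serialize, so the result is decoded to a string first. The function name make_request_line and the placeholder bytes are made up for illustration; the key name follows the input_key used above.

```python
import base64
import json


def make_request_line(image_bytes, input_key="b64_bytes"):
    """Build one JSON instance with the image wrapped as {"b64": ...}."""
    # b64encode returns bytes on Python 3; json.dumps needs a str.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({input_key: {"b64": encoded}})


# Placeholder bytes standing in for the contents of a JPEG file.
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"
line = make_request_line(fake_jpeg)

# Decoding the request again recovers the original bytes.
round_trip = base64.b64decode(json.loads(line)["b64_bytes"]["b64"])
```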
I would like to know why the prediction fails when I use as input 'DecodeJpeg/contents:0' and it doesn't when I use this different approach, since they look almost identical to me: I use the same script to generate the instances (changing the input_key) and the same command line to request predictions
Is there a way to pass the instance fed to 'B64Connector/input:0' to 'DecodeJpeg/contents:0' in order to get the right predictions?
Here I describe more in detail my approach and how I use the images_placeholder.
I define a function that resizes the image:
def decode_and_resize(image_str_tensor):
    """Decodes jpeg string, resizes it and returns a uint8 tensor."""
    image = tf.image.decode_jpeg(image_str_tensor, channels=MODEL_INPUT_DEPTH)
    # Note resize expects a batch_size, but tf_map suppresses that index,
    # thus we have to expand then squeeze. Resize returns float32 in the
    # range [0, uint8_max]
    image = tf.expand_dims(image, 0)
    image = tf.image.resize_bilinear(
        image, [MODEL_INPUT_HEIGHT, MODEL_INPUT_WIDTH], align_corners=False)
    image = tf.squeeze(image, squeeze_dims=[0])
    image = tf.cast(image, dtype=tf.uint8)
    return image
and one that generates the definition of the graph in which the resize takes place and where images_placeholder is defined and used
def create_b64_graph():
    with tf.Graph().as_default() as b64_graph:
        images_placeholder = tf.placeholder(dtype=tf.string, shape=(None,),
                                            name='B64Connector/input')
        decoded_images = tf.map_fn(
            decode_and_resize, images_placeholder, back_prop=False, dtype=tf.uint8)
        # convert_image_dtype, also scales [0, uint8_max] -> [0, 1).
        images = tf.image.convert_image_dtype(decoded_images, dtype=tf.float32)
        # Finally, rescale to [-1,1] instead of [0, 1)
        images = tf.sub(images, 0.5)
        images = tf.mul(images, 2.0)
        # NOTE: using identity to get a known name for the output tensor.
        output = tf.identity(images, name='B64Connector/output')
        b64_graph_def = b64_graph.as_graph_def()
        return b64_graph_def
Moreover I am using the following code to merge the resizing graph with the inception graph. Can I use a similar approach to link images_placeholder directly to 'DecodeJpeg/contents:0'?
def concatenate_to_inception_graph(b64_graph_def):
    model_dir = INPUT_MODEL_PATH
    model_filename = os.path.join(
        model_dir, 'classify_image_graph_def.pb')
    with tf.Session() as sess:
        # Import the b64_graph and get its output tensor
        resized_b64_tensor, = (tf.import_graph_def(b64_graph_def, name='',
        with gfile.FastGFile(model_filename, 'rb') as f:
            inception_graph_def = tf.GraphDef()
        # Concatenate b64_graph and inception_graph
        g_1 = tf.import_graph_def(inception_graph_def, name='inception',
                                  input_map={'ResizeBilinear:0': resized_b64_tensor})
        return sess.graph
I created a neural network model in TensorFlow.
I saved the model and restored it in another Python file.
The code is below:
def restoreModel():
    prediction = neuralNetworkModel(x)
    tf_p = tensorFlow.nn.softmax(prediction)
    temp = np.array([2,1,541,161124,3,3])
    temp = np.vstack(temp)
    with tensorFlow.Session() as sess:
        new_saver = tensorFlow.train.import_meta_graph('model.ckpt.meta')
        new_saver.restore(sess, tensorFlow.train.latest_checkpoint('./'))
        all_vars = tensorFlow.trainable_variables()
        predict = sess.run([tf_p], feed_dict={
            tensorFlow.transpose(x): temp,
            y : ***
        })
where the "temp" variable is what I want to predict.
x has the shape of a vector, and I "transposed" it to match the shapes.
I don't understand what I need to write in the feed_dict variable.
I am answering late but maybe it can still be useful. feed_dict is used to give TensorFlow the values you want your placeholders to take. fetches (the first argument of run) is the list of results you want. The keys of feed_dict and the elements of fetches must be either the names of the tensors (I didn't try it though) or tensors you can get by
graph = tf.get_default_graph()
var = graph.get_operation_by_name('name_of_operation').outputs[0]
Maybe graph.get_tensor_by_name('name_of_operation:0') works too; I didn't try.
By default, the names of placeholders are simply 'Placeholder', 'Placeholder_1', etc., following the order of creation in the graph definition.
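As a small illustration, the sketch below builds a toy graph with the tf.compat.v1 API (so it runs on TensorFlow 2.x), looks its tensors up by name and feeds only the input placeholder. The placeholder names "x" and "y", the constant weights and the output name "probabilities" are made up; in a restored graph you would use whatever names the original model gave its tensors. A placeholder such as y only has to be fed when the fetched tensor depends on it, which is usually not the case for a pure prediction.

```python
import numpy as np
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    x = tf.compat.v1.placeholder(tf.float32, shape=(None, 6), name="x")
    # Label placeholder: present in the graph but not needed to predict.
    y = tf.compat.v1.placeholder(tf.float32, shape=(None, 2), name="y")
    weights = tf.constant(np.ones((6, 2), dtype=np.float32))  # stand-in for a trained model
    logits = tf.matmul(x, weights)
    tf.identity(tf.nn.softmax(logits), name="probabilities")

# One row per example, matching the (None, 6) shape of the placeholder.
temp = np.array([[2, 1, 541, 161124, 3, 3]], dtype=np.float32)

with tf.compat.v1.Session(graph=graph) as sess:
    x_tensor = graph.get_tensor_by_name("x:0")
    prob_tensor = graph.get_operation_by_name("probabilities").outputs[0]
    result = sess.run(prob_tensor, feed_dict={x_tensor: temp})
```

Shaping temp as a single row also removes the need to transpose x inside feed_dict.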