CIFAR-10 neural net doesn't accept batch - python

I am building a neural network in Python using the TensorFlow library to classify the CIFAR-10 dataset. The issue is that I cannot find a way to convert the batch so that the train step will accept the feed. This is my code, with the sections I have verified to work replaced by comments describing them:
# import statements
# declare some global variables
# also declare filepaths
def read_cifar10(filename_queue):
    # parse cifar10 file, return as label with uint8image (Cifar10Record)
def generate_batch(image, label):
    # generate shuffled batch, given images and labels
def get_imgs(test = False, fmin = 1, files = 1):
    # generate the filename(s) and the filename_queue
    # read the input using read_cifar10
    # cast it to uint8image and float32
    # apply distortions
    # set the shape of the image and label
    # return the batch, made using generate_batch
# build placeholders for input and output
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
# create a weight variable with a given shape
# slight amount of variation
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev = 0.1)
    return tf.Variable(initial)
# same but for bias variables
def bias_variable(shape):
    initial = tf.constant(0.1, shape = shape)
    return tf.Variable(initial)
# convolve it, do some more layers, apply relu, make dropout and readout layers, etc
# do softmax regression
# define the loss function, training step, accuracy function
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels = y_, logits = y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
guess = tf.argmax(y_conv, 1)
answer = tf.argmax(y_, 1)
correct_prediction = tf.equal(guess, answer)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
def train_nn():
    for i in range(TRAIN_STEPS):
        # this is where the error occurs
        batch = get_imgs(files = 5).eval()
        # every 100 steps output training accuracy
        if i % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict = {x: batch[0], y_: batch[1], keep_prob: 0.5})
            print('step %d, training accuracy %g' % (i, train_accuracy))
        train_step.run(feed_dict = {x: batch[0], y_: batch[1], keep_prob: 0.5})
def test_nn():
    # test it
    batch = get_imgs(test = True).eval()
    print('test accuracy %g' % accuracy.eval(feed_dict = {x: batch[0], y_: batch[1], keep_prob: 1.0}))
# create session
with tf.Session() as sess:
    # initialize global variables
    # make queue runners
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess = sess, coord = coord)
    # train it, test it
Some things I have tried and the results:
batch = get_imgs(args) gives 'TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, numpy ndarrays, or TensorHandles.'
batch = get_imgs(args).eval() gives 'AttributeError: 'tuple' object has no attribute 'eval''
batch = sess.run(get_imgs(args)) causes the program to run indefinitely with no output
printing type(batch) says the batch is a tuple
printing batch gives a description of a tensor
printing batch.eval() or type(batch.eval()) gives 'W tensorflow/core/kernels/] _3_input_producer: Skipping cancelled dequeue attempt with queue not closed'
I suspect the issue is either with the batch conversion, the queueing with tf.train.Coordinator(), or the placeholders. Any help would be greatly appreciated.


TensorFlow restore throwing "No Variable to save" error

I am working through some code to understand how to save and restore checkpoints in TensorFlow. To do so, I implemented a simple neural network that works with MNIST digits and saved the .ckpt file like so:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
learning_rate = 0.001
n_input = 784 # MNIST data input (img shape = 28*28)
n_classes = 10 # MNIST total classes 0-9
#import MNIST data
mnist = input_data.read_data_sets('.', one_hot = True)
#Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])
#Weights and biases
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))
#logits = xW + b
logits = tf.add(tf.matmul(features, weights), bias)
#Define loss and optimizer
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)
# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
import math
save_file = './train_model.ckpt'
batch_size = 128
n_epochs = 100
saver = tf.train.Saver()
# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Training cycle
    for epoch in range(n_epochs):
        total_batch = math.ceil(mnist.train.num_examples / batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_features, batch_labels = mnist.train.next_batch(batch_size)
            sess.run(
                optimizer,
                feed_dict={features: batch_features, labels: batch_labels})
        # Print status for every 10 epochs
        if epoch % 10 == 0:
            valid_accuracy = sess.run(
                accuracy,
                feed_dict={
                    features: mnist.validation.images,
                    labels: mnist.validation.labels})
            print('Epoch {:<3} - Validation Accuracy: {}'.format(
                epoch, valid_accuracy))
    # Save the model
    saver.save(sess, save_file)
    print('Trained Model Saved.')
This part works well, and I get the .ckpt file saved in the correct directory. The problem comes in when I try to restore the model in an attempt to work on it again. I use the following code to restore the model:
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, 'train_model.ckpt.meta')
    print('model restored')
and end up with the error: ValueError: No variables to save
I'm not sure what the mistake is here. Any help is appreciated. Thanks in advance.
A Graph is different from a Session. A graph is the set of operations joining tensors, each of which is a symbolic representation of a set of values. A Session assigns specific values to the Variable tensors and allows you to run operations in that graph.
The .ckpt file saves variable values (i.e. those stored in the weights and biases) but not the graph itself.
The solution is simple: re-run the graph construction (everything before the Session), then start your session and load the values from the .ckpt file.
Alternatively, you can check out this guide for exporting and importing MetaGraphs.
You should tell the Saver which Variables to restore; the default Saver collects all the Variables from the default graph.
So in your case, you should put the graph construction code before saver = tf.train.Saver().
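The advice above can be sketched as follows. This is a minimal illustration, not the asker's program: it assumes a TensorFlow build that ships the tensorflow.compat.v1 module, and the build_graph helper, the variable names and the temporary checkpoint path are made up for the example. It rebuilds the same variables before creating the Saver, and it passes the checkpoint prefix (not the .meta file) to restore.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF1-style API is available

tf.disable_eager_execution()


def build_graph():
    """Hypothetical helper: recreate the same variables on every call."""
    graph = tf.Graph()
    with graph.as_default():
        weights = tf.Variable(tf.random_normal([784, 10]), name="weights")
        bias = tf.Variable(tf.random_normal([10]), name="bias")
        init = tf.global_variables_initializer()
        # The Saver is created AFTER the variables, so it has something to save.
        saver = tf.train.Saver()
    return graph, weights, bias, init, saver


# Illustrative path in a temporary directory.
ckpt_path = os.path.join(tempfile.mkdtemp(), "train_model.ckpt")

# "Training" program: initialise the variables and save their values.
graph, weights, bias, init, saver = build_graph()
with tf.Session(graph=graph) as sess:
    sess.run(init)
    saved_weights = sess.run(weights)
    saver.save(sess, ckpt_path)

# "Restore" program: rebuild the graph first, then load the values into it.
graph2, weights2, bias2, _, saver2 = build_graph()
with tf.Session(graph=graph2) as sess:
    saver2.restore(sess, ckpt_path)  # checkpoint prefix, not 'train_model.ckpt.meta'
    restored_weights = sess.run(weights2)

print(np.array_equal(saved_weights, restored_weights))
```

Creating the Saver in a script that has not defined any variables yet is what produces "ValueError: No variables to save".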

TensorFlow: tf.layers vs low-level API

I am currently in the process of planning my first Conv. NN implementation in Tensorflow, and have been reading many of the tutorials available on Tensorflow's website for insight.
It seems that there are essentially two ways to create a custom CNN:
1) Use TensorFlow's layers module, tf.layers, which is the "high-level API". With this method, you write a model definition function built from tf.layers objects and, in the main function, instantiate a tf.learn.Estimator, passing the model definition function to it. From there, the fit() and evaluate() methods can be called on the Estimator object to train and validate, respectively. Main function below:
def main(unused_argv):
    # Load training and eval data
    mnist = learn.datasets.load_dataset("mnist")
    train_data = mnist.train.images # Returns np.array
    train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
    eval_data = mnist.test.images # Returns np.array
    eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)
    # Create the Estimator
    mnist_classifier = learn.Estimator(
        model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")
    # Set up logging for predictions
    # Log the values in the "Softmax" tensor with label "probabilities"
    tensors_to_log = {"probabilities": "softmax_tensor"}
    logging_hook = tf.train.LoggingTensorHook(
        tensors=tensors_to_log, every_n_iter=50)
    # Train the model
    mnist_classifier.fit(
        x=train_data,
        y=train_labels,
        batch_size=100,
        steps=20000,
        monitors=[logging_hook])
    # Configure the accuracy metric for evaluation
    metrics = {
        "accuracy":
            learn.MetricSpec(
                metric_fn=tf.metrics.accuracy, prediction_key="classes"),
    }
    # Evaluate the model and print results
    eval_results = mnist_classifier.evaluate(
        x=eval_data, y=eval_labels, metrics=metrics)
    print(eval_results)
2) Use TensorFlow's "low-level API", in which layers are defined in a definition function. Here, layers are defined manually, and the user must perform many calculations by hand. In the main function, the user starts a tf.Session() and manually configures training/validation using for loop(s). Main function below:
def main(_):
    # Import data
    mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
    # Create the model
    x = tf.placeholder(tf.float32, [None, 784])
    # Define loss and optimizer
    y_ = tf.placeholder(tf.float32, [None, 10])
    # Build the graph for the deep net
    y_conv, keep_prob = deepnn(x)
    with tf.name_scope('loss'):
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_,
                                                                logits=y_conv)
    cross_entropy = tf.reduce_mean(cross_entropy)
    with tf.name_scope('adam_optimizer'):
        train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    with tf.name_scope('accuracy'):
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
        correct_prediction = tf.cast(correct_prediction, tf.float32)
    accuracy = tf.reduce_mean(correct_prediction)
    graph_location = tempfile.mkdtemp()
    print('Saving graph to: %s' % graph_location)
    train_writer = tf.summary.FileWriter(graph_location)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(20000):
            batch = mnist.train.next_batch(50)
            if i % 100 == 0:
                train_accuracy = accuracy.eval(feed_dict={
                    x: batch[0], y_: batch[1], keep_prob: 1.0})
                print('step %d, training accuracy %g' % (i, train_accuracy))
            train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
        print('test accuracy %g' % accuracy.eval(feed_dict={
            x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
My dilemma is, I like the simplicity of defining the neural network using tf.layers (option 1), but I want the customizability of the training that the "low-level API" (option 2) provides. Specifically, when using the tf.layers implementation, is there a way to report validation accuracy every n iterations of training? Or more generally, can I train/validate using the tf.Session(), or am I confined to using the tf.learn.Estimator's fit() and evaluate() methods?
It seems odd that one would want a final evaluation score after all training is complete, as I thought the whole point of validation is to track network progression during training. Otherwise, what would be the difference between validation and testing?
Any help would be appreciated.
You're nearly right; however, tf.layers is separate from the Estimator class of functions. If you want, you can use tf.layers to define your layers but then build your own training loops or whatever else you like. You can think of tf.layers as just being those functions that you would otherwise write yourself in your second option above.
If you are interested in building a basic model quickly while still being able to extend it with other functions, your own training loops, etc., then there's no reason you can't use layers to build your model and interact with it however you wish.

TensorFlow: model saved successful but restore failed, where am I wrong?

I have been learning TensorFlow recently and am clearly a beginner, but I have tried many approaches to this problem. I wrote this code to train my model, and I want to restore it directly instead of training again if the model.ckpt file already exists. After training, my test accuracy is about 90%, but if I restore the model directly the accuracy is only about 10%, so I think I am failing to restore my model. I have just two variables, named weights and biases. This is the main part of my code:
def train(bottleneck_tensor, jpeg_data_tensor):
image_lists = create_image_lists(TEST_PERCENTAGE, VALIDATION_PERCENTAGE)
n_classes = len(image_lists.keys())
# input
bottleneck_input = tf.placeholder(tf.float32, [None, BOTTLENECK_TENSOR_SIZE])
ground_truth_input = tf.placeholder(tf.float32, [None, n_classes], name='GroundTruthInput')
# this is the new_layer code
# with tf.name_scope('final_training_ops'):
# weights = tf.Variable(tf.truncated_normal([BOTTLENECK_TENSOR_SIZE, n_classes], stddev=0.001))
# biases = tf.Variable(tf.zeros([n_classes]))
# logits = tf.matmul(bottleneck_input, weights) + biases
final_tensor = tf.nn.softmax(logits)
# losses
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=ground_truth_input)
cross_entropy_mean = tf.reduce_mean(cross_entropy)
train_step = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy_mean)
# calculate the accuracy
with tf.name_scope('evaluation'):
correct_prediction = tf.equal(tf.argmax(final_tensor, 1), tf.argmax(ground_truth_input, 1))
evaluation_step = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
image_order_step = tf.arg_max(final_tensor, 1)
saver = tf.train.Saver(tf.global_variables(), write_version=tf.train.SaverDef.V1)
with tf.Session() as sess:
init = tf.global_variables_initializer()
if os.path.exists('F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt'):
reader = tf.train.NewCheckpointReader('F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt')
all_variables = reader.get_variable_to_shape_map()
for each in all_variables:
print(each, all_variables[each])
print("retrain model")
for i in range(STEPS):
train_bottlenecks, train_ground_truth = get_random_cached_bottlenecks(
sess, n_classes, image_lists, BATCH, 'training', jpeg_data_tensor, bottleneck_tensor)
sess.run(train_step,
feed_dict={bottleneck_input: train_bottlenecks, ground_truth_input: train_ground_truth})
# evaluate accuracy on the validation data
if i % 100 == 0 or i + 1 == STEPS:
validation_bottlenecks, validation_ground_truth = get_random_cached_bottlenecks(
sess, n_classes, image_lists, BATCH, 'validation', jpeg_data_tensor, bottleneck_tensor)
validation_accuracy = sess.run(evaluation_step, feed_dict={
bottleneck_input: validation_bottlenecks, ground_truth_input: validation_ground_truth})
print('Step %d: Validation accuracy on random sampled %d examples = %.1f%%' % (
i, BATCH, validation_accuracy * 100))
saver.save(sess, 'F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt')
print('Beginning Test')
# test
test_bottlenecks, test_ground_truth = get_tst_bottlenecks(sess, image_lists, n_classes,
jpeg_data_tensor, bottleneck_tensor)
# saver.restore(sess, 'F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt')
test_accuracy = sess.run(evaluation_step, feed_dict={
bottleneck_input: test_bottlenecks, ground_truth_input: test_ground_truth})
print('Final test accuracy = %.1f%%' % (test_accuracy * 100))
label_name_list = list(image_lists.keys())
for label_index, label_name in enumerate(label_name_list):
category = 'testing'
for index, unused_base_name in enumerate(image_lists[label_name][category]):
bottlenecks = []
ground_truths = []
print("real lable%s:" % label_name)
# print(unused_base_name)
bottleneck = get_or_create_bottleneck(sess, image_lists, label_name, index, category,
jpeg_data_tensor, bottleneck_tensor)
# saver.restore(sess, 'F:/_pythonWS/imageClassifier/ckpt/imagesClassFilter.ckpt')
ground_truth = np.zeros(n_classes, dtype=np.float32)
ground_truth[label_index] = 1.0
image_kind = sess.run(image_order_step, feed_dict={
bottleneck_input: bottlenecks, ground_truth_input: ground_truths})
image_kind_order = int(image_kind[0])
print("pre_lable%s:" % label_name_list[image_kind_order])
Try this method to save and restore:
saver = tf.train.Saver()
with tf.Session() as sess:
    # restore saved model
    new_saver = tf.train.import_meta_graph('my-model.meta')
    new_saver.restore(sess, tf.train.latest_checkpoint('./'))
    # save model weights, after training process
    saver.save(sess, 'my-model')
Define a tf.train.Saver outside the session. After the training process finishes, save the weights with saver.save(sess, 'my-model'), and restore the weights as shown above.
I found my mistake. The model was in fact restored successfully. The problem is that I build the label list in a random order on every run, and I use image_order_step = tf.arg_max(final_tensor, 1) to compute the class index of each test image. When I run the code again, the order of the labels changes, but the weights and biases are the same as last time. For example, on the first run the label list is [A1,A2,A3,A4,A5,A6] and image_order_step = tf.arg_max(final_tensor, 1) returns 3, so the prediction is A4. On the next run the label list changes to [A5,A3,A1,A6,A2,A4], but image_order_step still returns 3, so the prediction becomes A6. The accuracy therefore changes on every run and is essentially random.
The lesson for me: pay attention to every detail, or a small error can leave you confused for a long time.
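One way to avoid this kind of mismatch is to make the index-to-label mapping deterministic, for example by sorting the class names before using them. The sketch below is only an illustration in plain Python: the class names and both helper functions are hypothetical, and the hard-coded index stands in for the result of tf.arg_max.

```python
import random

# Hypothetical class names; in the question these are the keys of image_lists.
class_names = ["A1", "A2", "A3", "A4", "A5", "A6"]


def shuffled_labels(seed):
    """Mimic a label list whose order changes from run to run."""
    labels = list(class_names)
    random.Random(seed).shuffle(labels)
    return labels


def stable_labels(seed):
    """Sort the labels so that an index always maps to the same class."""
    return sorted(shuffled_labels(seed))


predicted_index = 3  # stand-in for the value returned by arg_max

# Two "runs" that receive the labels in different orders.
run1 = stable_labels(seed=1)
run2 = stable_labels(seed=2)
print(run1[predicted_index], run2[predicted_index])  # A4 A4
```

Another option is to store the label list next to the checkpoint and reload it together with the weights, so that the order used at training time is the order used at prediction time.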

Tensorflow: Run training phase on GPU and test phase on CPU

I want to run the training phase of my TensorFlow code on my GPU and then, after training finishes and the results are stored, load the model I created and run its test phase on the CPU.
I have written this code (I am including only part of it for reference, because it is very long otherwise; I know the rules ask for fully functional code, and I apologise for that).
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.contrib.rnn.python.ops import rnn_cell, rnn
# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x_train = mnist.train.images
# Check that the dataset contains 55,000 rows and 784 columns
N,D = x_train.shape
sess = tf.InteractiveSession()
x = tf.placeholder("float", [None, n_steps,n_input])
y_true = tf.placeholder("float", [None, n_classes])
keep_prob = tf.placeholder(tf.float32,shape=[])
learning_rate = tf.placeholder(tf.float32,shape=[])
#[............Build the RNN graph model.............]
# Because I am using my GPU for the training, I avoid allocating the whole
# mnist.validation set because of memory error, so I fragment it into
# small batches (100)
x_validation_bin, y_validation_bin = mnist.validation.next_batch(batch_size)
x_validation_bin = binarize(x_validation_bin, threshold=0.1)
x_validation_bin = x_validation_bin.reshape((-1,n_steps,n_input))
for k in range(epochs):
steps = 0
for i in range(training_iters):
#Stochastic descent
batch_x, batch_y = mnist.train.next_batch(batch_size)
batch_x = binarize(batch_x, threshold=0.1)
batch_x = batch_x.reshape((-1,n_steps,n_input))
sess.run(..., feed_dict={x: batch_x, y_true: batch_y,keep_prob: keep_prob,eta:learning_rate})
if do_report_err == 1:
if steps % display_step == 0:
# Calculate batch accuracy
acc = sess.run(..., feed_dict={x: batch_x, y_true: batch_y,keep_prob: 1.0})
# Calculate batch loss
loss = sess.run(..., feed_dict={x: batch_x, y_true: batch_y,keep_prob: 1.0})
print("Iter " + str(i) + ", Minibatch Loss= " + "{:.6f}".format(loss) + ", Training Accuracy = " + "{:.5f}".format(acc))
steps += 1
# Validation Accuracy and Cost
validation_accuracy = sess.run(...,feed_dict={x:x_validation_bin, y_true:y_validation_bin, keep_prob:1.0})
validation_cost = sess.run(...,feed_dict={x:x_validation_bin, y_true:y_validation_bin, keep_prob:1.0})
validation_accuracy_array.append(final_validation_accuracy)
saver.save(sess, savefilename)
total_epochs = total_epochs + 1
np.savez(datasavefilename,epochs_saved = total_epochs,learning_rate_saved = learning_rate,keep_prob_saved = best_keep_prob, validation_loss_array_saved = validation_loss_array,validation_accuracy_array_saved = validation_accuracy_array,modelsavefilename = savefilename)
After that, my model has been trained successfully and the relevant data saved, so I want to load the file and do a final train and test pass on the model, this time using my CPU. The reason is that the GPU can't handle the whole dataset of mnist.train.images and mnist.train.labels.
So I manually select this part and run it:
with tf.device('/cpu:0'):
# Initialise variables
# Accuracy and Cost
saver.restore(sess, savefilename)
x_train_bin = binarize(mnist.train.images, threshold=0.1)
x_train_bin = x_train_bin.reshape((-1,n_steps,n_input))
final_train_accuracy = sess.run(...,feed_dict={x:x_train_bin, y_true:mnist.train.labels, keep_prob:1.0})
final_train_cost = sess.run(...,feed_dict={x:x_train_bin, y_true:mnist.train.labels, keep_prob:1.0})
x_test_bin = binarize(mnist.test.images, threshold=0.1)
x_test_bin = x_test_bin.reshape((-1,n_steps,n_input))
final_test_accuracy = sess.run(...,feed_dict={x:x_test_bin, y_true:mnist.test.labels, keep_prob:1.0})
final_test_cost = sess.run(...,feed_dict={x:x_test_bin, y_true:mnist.test.labels, keep_prob:1.0})
But I get an OOM (out of memory) GPU error, which doesn't make sense to me since I think I have forced the program to rely on the CPU. I did not put a sess.close() call in the first (training with batches) code, but I am not sure whether this is really the reason behind it. I followed this post for the CPU part.
Any suggestions on how to run the last part on the CPU only?
with tf.device() statements only apply to graph building, not to execution, so calling sess.run inside a device block is equivalent to not having the device block at all.
To do what you want, you need to build separate training and test graphs that share variables.

How to use a trained model on different inputs

I implemented a relatively straightforward logistic regression function. I save all the necessary variables such as weights, bias, x, y, etc. and then I run the training algorithm...
# launch the graph
with tf.Session() as sess:
    sess.run(init)
    # training cycle
    for epoch in range(FLAGS.training_epochs):
        avg_cost = 0
        total_batch = int(mnist.train.num_examples/FLAGS.batch_size)
        # loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
            # compute average loss
            avg_cost += c / total_batch
        # display logs per epoch step
        if (epoch + 1) % FLAGS.display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(avg_cost))
    save_path = saver.save(sess, "/tmp/model.ckpt")
The model is saved and the prediction and accuracy of the trained model is displayed...
# list of booleans to determine the correct predictions
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
print(correct_prediction.eval({x:mnist.test.images, y:mnist.test.labels}))
# calculate total accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
This is all fine and dandy. However, now I want to be able to predict any given image using the trained model. For example, I want to feed it picture of say 7 and see what it predicts it to be.
I have another module that restores the model. First we load the variables...
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes
# set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax
# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)
# initializing the variables
init = tf.global_variables_initializer()
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "/tmp/model.ckpt")
This is good. Now I want to compare one image to the model and get a prediction. In this example, I take the first image from the test dataset mnist.test.images[0] and I attempt to compare it to the model.
classification = sess.run(tf.argmax(pred, 1), feed_dict={x: mnist.test.images[0]})
I know this will not work. I get the error...
ValueError: Cannot feed value of shape (784,) for Tensor 'Placeholder:0', which has shape '(?, 784)'
I am at a loss for ideas. This question is rather long, if a straightforward answer is not possible, some guidance as to the steps I may take to do this is appreciated.
Your input placeholder has shape (?, 784), where the question mark denotes a variable-size dimension, here the batch size. You are feeding an input of shape (784,), which does not work, as the error message states.
In your case, at prediction time, the batch size is just 1, so the following should work:
import numpy as np
x_in = np.expand_dims(mnist.test.images[0], axis=0)
classification = sess.run(tf.argmax(pred, 1), feed_dict={x: x_in})
This assumes that the input image is available as a NumPy array. If it is already a tensor, the corresponding function is tf.expand_dims(..).
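The shape change can be checked without TensorFlow. In this small sketch a zero array stands in for mnist.test.images[0] (that stand-in is an assumption made for the example), and two equivalent NumPy spellings of the same operation are shown alongside np.expand_dims.

```python
import numpy as np

# Stand-in for mnist.test.images[0]: one flattened 28x28 image.
image = np.zeros(784, dtype=np.float32)

batch_a = np.expand_dims(image, axis=0)  # add a leading batch axis
batch_b = image[np.newaxis, :]           # equivalent indexing form
batch_c = image.reshape(1, -1)           # equivalent reshape form

print(image.shape)    # (784,)
print(batch_a.shape)  # (1, 784)
```

Any of the three results matches a placeholder of shape (?, 784) with a batch size of 1.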
