I am currently a newbie at TensorFlow. I have trained a model on the MNIST set, and now I have made some pictures of numbers that I want to use to test the accuracy. I think I have a syntax error, or a misunderstanding of how things work in TensorFlow.
This is my model:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)  # imports and data loading needed for the snippet to run; path assumed

x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
sess = tf.InteractiveSession()

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# stride 1 and zero padding
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

# pooling over 2x2
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second layer
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Fully connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Readout layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

sess.run(tf.global_variables_initializer())
for i in range(200):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
# Here is my custom dataset
custom_data = GetDataset()
print sess.run(y_conv, feed_dict={x: custom_data})
Is this not the right syntax for making a prediction with my custom data? Am I missing something here? My data are in the same format as the MNIST set, but I can't find the correct syntax for how to make a prediction:
print sess.run(y_conv, feed_dict={x: custom_data})
Thanks a lot for any help!
y_conv will provide what you need to make predictions. You're probably just not understanding the form the data takes in that tensor.
In your code you have a loss function and an optimizer:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
Note that you pass y_conv to the softmax_cross_entropy_with_logits method. y_conv at this point holds unscaled numbers (logits): the larger a class's logit, the more strongly the model favors that class.
softmax will convert these to a probability distribution over all outputs. This notably converts all outputs to a [0,1] range. Cross entropy then computes the error (cross entropy assumes values in the [0,1] range).
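For intuition, here is a small NumPy sketch (the logits are made up for illustration) of what the softmax step does to one row of y_conv:

import numpy as np

logits = np.array([2.0, 1.0, 0.1])               # hypothetical unscaled outputs for 3 classes
probs = np.exp(logits) / np.sum(np.exp(logits))  # softmax
print(probs)                                     # -> [0.659 0.242 0.099], sums to 1.0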
It's common to simply create another tensor that actually computes the prediction; if you're using softmax, then:
prediction = tf.nn.softmax(y_conv)
That would give you the predicted probability distribution over the labels. Just request that tensor in your sess.run step.
If you just care about the most probable class, then you can take the argmax of the y_conv values. Note also that if this is true for you, you might want to experiment with tf.nn.sigmoid_cross_entropy_with_logits, which is slightly more tuned to producing results that are not a probability distribution (slightly better for single-class prediction, and mandatory for multi-label prediction).
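Putting that together, a minimal sketch (assuming the graph, sess and custom_data from the question; note that keep_prob must also be fed at prediction time, with 1.0 to disable dropout) could be:

prediction = tf.nn.softmax(y_conv)           # probability distribution over the 10 digits
predicted_class = tf.argmax(prediction, 1)   # index of the most probable digit

probs, classes = sess.run([prediction, predicted_class],
                          feed_dict={x: custom_data, keep_prob: 1.0})
print(classes)  # one predicted digit per input image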
Related
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

import tensorflow as tf
sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x: batch[0], y_: batch[1], keep_prob: 1.0})
            print('step %d, training accuracy %g' % (i, train_accuracy))
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

    print('test accuracy %g' % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
The above is the code for the multilayer convolutional neural network, straight from https://www.tensorflow.org/versions/r1.3/get_started/mnist/pros
I've been trying to obtain the values in h_conv1 and h_conv2, and I've tried using
get_value = h_conv1.eval() or h_conv1.eval(session=sess)
but neither was successful. I even tried setting a name on h_conv1 and getting it by using
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1, name='example')
test = tf.get_default_graph().get_tensor_by_name("example:0")
and still it wasn't successful.
However, it's easy to extract the values of W_conv1 by using
weights = W_conv1.eval()
and it shows up in Spyder's variable explorer as a NumPy array, so I can do whatever I want with it.
I was wondering whether there is any other way to get the h_conv1 values, so I can do some processing steps on them before feeding them to the next operation.
If you're running into the problem I expect you are, the issue is that the weights you are printing out successfully are TensorFlow variables (meaning their values get stored as part of the session), while h_conv1 is an operation, meaning its output is defined as a function of its inputs. Since those inputs end up routing back to placeholders, and you're not feeding any placeholders when you eval the operation, it fails with an InvalidArgumentError.
What I'm guessing you're looking for is the value of the output the last time you ran the training operation. To get this, the approach I got working is simply to attach a tf.Print operation to that node, and then evaluate it at the same time as I run the training operation. So in your graph definition you have something like this:
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_conv1_printop = tf.Print(h_conv1, [h_conv1])
Then replace this line:
train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
with this line:
_, hconv1_out = sess.run([train_step, h_conv1_printop], feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print(hconv1_out)
Basically, attach a print Op to the relu Op you want to see the output of, and then evaluate that Op at the same time you are running the rest of the session so that it gets filled in with the value.
Hopefully that makes sense and solves your issue. It's always helpful to review the docs on graphs and sessions, since this is all a common point of confusion.
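As an aside, a simpler alternative sketch (assuming the graph and session from the question) is to fetch h_conv1 directly in a sess.run call, feeding the placeholders it depends on; the result comes back as a NumPy array, just like W_conv1.eval():

batch = mnist.train.next_batch(50)
# h_conv1 depends only on x (dropout and the labels sit downstream of it),
# so feeding x alone is enough
h_conv1_values = sess.run(h_conv1, feed_dict={x: batch[0]})
print(h_conv1_values.shape)  # (50, 28, 28, 32)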
I'm experiencing a weird case and have no idea how to solve it. Basically, I am training a multi-layer NN. However, when I try to add summaries, I receive the following error:
Caused by op 'Placeholder_2', defined at:
File "multilayer.py", line 235, in <module>
train_model(data, real_output, real_check, args.learning_rate, args.op, args.batch)
File "multilayer.py", line 120, in train_model
keep_prob = tf.placeholder(tf.float32)
File "C:\Users\dangz\Anaconda3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1599, in placeholder
return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
File "C:\Users\dangz\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 3090, in _placeholder
"Placeholder", dtype=dtype, shape=shape, name=name)
File "C:\Users\dangz\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\dangz\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2956, in create_op
op_def=op_def)
File "C:\Users\dangz\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_2' with dtype float
[[Node: Placeholder_2 = Placeholder[dtype=DT_FLOAT, shape=<unknown>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
The code is:
# Initializing the variables
init = tf.global_variables_initializer()

# Merge all the summaries
summaries = tf.summary.merge_all()

# Create Saver
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init)
    writer = tf.summary.FileWriter(log_path, sess.graph)
    for i in range(1000):
        # until 1000
        batch_ini = 50*i
        batch_end = 50*i+50
        batch_xs = data[0][0][batch_ini:batch_end]
        batch_ys = real_output[batch_ini:batch_end]
        if i % 10 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x: batch_xs, y_: batch_ys, keep_prob: 1.0})
        curr_loss, cur_accuracy, _, summary = sess.run([cross_entropy, accuracy, train_step, summaries],
                                                       feed_dict={x: batch_xs,
                                                                  y_: batch_ys,
                                                                  keep_prob: 0.5})
However, when I delete summaries from sess.run, I can train the model. The error says that I have to feed a value for keep_prob, but I am doing so; that's the part I don't understand. Deleting the summaries works; these are the code lines that I changed:
curr_loss, cur_accuracy, _ = sess.run([cross_entropy, accuracy, train_step],
                                      feed_dict={x: batch_xs,
                                                 y_: batch_ys,
                                                 keep_prob: 0.5})
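One way to track down which op 'Placeholder_2' actually is (a diagnostic sketch, assuming the model lives in the default graph) is to list every placeholder op after the graph has been built; an unexpected extra entry points at a duplicated placeholder:

for op in tf.get_default_graph().get_operations():
    if op.type == 'Placeholder':
        print(op.name)  # e.g. 'Placeholder', 'Placeholder_1', 'keep_prob', ...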
Graph definition (due to Stack Overflow formatting, everything after the for up to the cross-entropy section is inside the for loop):
# set up the computation. Definition of the variables.
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
y_ = tf.placeholder(tf.float32, [None, 10])

# declare weights and biases
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# convolution and pooling
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

for i in range(2):
    # First convolutional layer: 32 features per each 5x5 patch
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])

    # Reshape x to a 4d tensor
    x_image = tf.reshape(x, [-1, 28, 28, 1])

    # We convolve x_image with the weight tensor, add the bias, apply the ReLU function, and finally max pool.
    # The max_pool_2x2 method will reduce the image size to 14x14.
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    # Second convolutional layer: 64 features for each 5x5 patch.
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    # Densely connected layer: processes the 64 7x7 images with 1024 neurons.
    # Reshape the tensor from the pooling layer into a batch of vectors,
    # multiply by a weight matrix, add a bias, and apply a ReLU.
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    # drop_out
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    # Readout Layer
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])

# Crossentropy
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
with tf.name_scope('cross_entropy'):
    deltas = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv)
    with tf.name_scope('total'):
        cross_entropy = tf.reduce_mean(deltas)
tf.summary.scalar('cross_entropy', cross_entropy)

''' Optimization Algorithm '''
with tf.name_scope('train_step'):
    train_step = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cross_entropy)

with tf.name_scope("evaluation"):
    with tf.name_scope("correct_prediction"):
        y_p = tf.argmax(y_conv, 1)
        correct_predictions = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
    with tf.name_scope("accuracy"):
        accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
    tf.summary.scalar("accuracy", accuracy)
I tried to follow the TensorFlow tutorial 'Deep MNIST for Experts' [1]. My program (source below) is working, but it converges very slowly and achieves a quite bad accuracy of ~90%, where it should achieve about 99.2%. I compared my solution to the 'mnist_deep.py' available for download [2], which looks quite similar ... but that one achieves the 99.2% accuracy on the same machine (so it's not a bug in TensorFlow, nor anything wrong with my installation). Surprisingly, the downloaded version also needs much more time for training on the same machine, which tells me the trained model must be different / more complex. I checked my source, compared it against the download, reordered stuff, and checked the numbers, but I didn't find any relevant difference except coding style. I'm new to Python, so maybe it's just some simple syntax issue...
Questions:
What is the difference in my version causing this problem?
Additional: how do I debug such issues in TensorFlow? I saw some generated graphs of the models on the webpage... how do I generate them from the source?
My Program:
import os
import tensorflow as tf
# os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

from tensorflow.examples.tutorials.mnist import input_data

x = tf.placeholder(tf.float32, [None, 784])

# reshape
x_image = tf.reshape(x, [-1, 28, 28, 1])

# conv1
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

# pool1
h_pool1 = max_pool_2x2(h_conv1)

# conv2
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

# pool2
h_pool2 = max_pool_2x2(h_conv2)

# fc1
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# dropout
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# map 1024 features to 10 classes
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
y_ = tf.placeholder(tf.float32, [None, 10])

# loss function
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv)
)

# ADAM Optimizer
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

# accuracy
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
correct_prediction = tf.cast(correct_prediction, tf.float32)
accuracy = tf.reduce_mean(correct_prediction)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x: batch[0], y_: batch[1], keep_prob: 1.0
            })
            print('step %d, training accuracy %g' % (i, train_accuracy))
            train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

    print('test accuracy %g' % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
OUTPUT:
# python3 mnist_deep.py
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
step 0, training accuracy 0.06
step 100, training accuracy 0.06
step 200, training accuracy 0
step 300, training accuracy 0.12
step 400, training accuracy 0.06
step 500, training accuracy 0.1
step 600, training accuracy 0.18
step 700, training accuracy 0.12
step 800, training accuracy 0.1
step 900, training accuracy 0.22
step 1000, training accuracy 0.2
[...]
Version from the Webpage:
# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

"""A deep MNIST classifier using convolutional layers.

See extensive documentation at
https://www.tensorflow.org/get_started/mnist/pros
"""
# Disable linter warnings to maintain consistency with tutorial.
# pylint: disable=invalid-name
# pylint: disable=g-bad-import-order

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import sys
import tempfile

from tensorflow.examples.tutorials.mnist import input_data

import tensorflow as tf

FLAGS = None


def deepnn(x):
  """deepnn builds the graph for a deep net for classifying digits.

  Args:
    x: an input tensor with the dimensions (N_examples, 784), where 784 is the
    number of pixels in a standard MNIST image.

  Returns:
    A tuple (y, keep_prob). y is a tensor of shape (N_examples, 10), with values
    equal to the logits of classifying the digit into one of 10 classes (the
    digits 0-9). keep_prob is a scalar placeholder for the probability of
    dropout.
  """
  # Reshape to use within a convolutional neural net.
  # Last dimension is for "features" - there is only one here, since images are
  # grayscale -- it would be 3 for an RGB image, 4 for RGBA, etc.
  with tf.name_scope('reshape'):
    x_image = tf.reshape(x, [-1, 28, 28, 1])

  # First convolutional layer - maps one grayscale image to 32 feature maps.
  with tf.name_scope('conv1'):
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

  # Pooling layer - downsamples by 2X.
  with tf.name_scope('pool1'):
    h_pool1 = max_pool_2x2(h_conv1)

  # Second convolutional layer -- maps 32 feature maps to 64.
  with tf.name_scope('conv2'):
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

  # Second pooling layer.
  with tf.name_scope('pool2'):
    h_pool2 = max_pool_2x2(h_conv2)

  # Fully connected layer 1 -- after 2 round of downsampling, our 28x28 image
  # is down to 7x7x64 feature maps -- maps this to 1024 features.
  with tf.name_scope('fc1'):
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

  # Dropout - controls the complexity of the model, prevents co-adaptation of
  # features.
  with tf.name_scope('dropout'):
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

  # Map the 1024 features to 10 classes, one for each digit
  with tf.name_scope('fc2'):
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
  return y_conv, keep_prob


def conv2d(x, W):
  """conv2d returns a 2d convolution layer with full stride."""
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
  """max_pool_2x2 downsamples a feature map by 2X."""
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')


def weight_variable(shape):
  """weight_variable generates a weight variable of a given shape."""
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)


def bias_variable(shape):
  """bias_variable generates a bias variable of a given shape."""
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)


def main(_):
  # Import data
  mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)

  # Create the model
  x = tf.placeholder(tf.float32, [None, 784])

  # Define loss and optimizer
  y_ = tf.placeholder(tf.float32, [None, 10])

  # Build the graph for the deep net
  y_conv, keep_prob = deepnn(x)

  with tf.name_scope('loss'):
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_,
                                                            logits=y_conv)
  cross_entropy = tf.reduce_mean(cross_entropy)

  with tf.name_scope('adam_optimizer'):
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

  with tf.name_scope('accuracy'):
    correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
    correct_prediction = tf.cast(correct_prediction, tf.float32)
  accuracy = tf.reduce_mean(correct_prediction)

  graph_location = tempfile.mkdtemp()
  print('Saving graph to: %s' % graph_location)
  train_writer = tf.summary.FileWriter(graph_location)
  train_writer.add_graph(tf.get_default_graph())

  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
      batch = mnist.train.next_batch(50)
      if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print('step %d, training accuracy %g' % (i, train_accuracy))
      train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

    print('test accuracy %g' % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument('--data_dir', type=str,
                      default='/tmp/tensorflow/mnist/input_data',
                      help='Directory for storing input data')
  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
OUTPUT:
# python3 mnist_deep.py
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz
Saving graph to: /tmp/tmpw8uaz0vs
step 0, training accuracy 0.12
step 100, training accuracy 0.64
step 200, training accuracy 0.86
step 300, training accuracy 0.94
step 400, training accuracy 0.94
[...]
Links
[1] https://www.tensorflow.org/get_started/mnist/pros
[2] https://www.github.com/tensorflow/tensorflow/blob/r1.3/tensorflow/examples/tutorials/mnist/mnist_deep.py
Question 1
Possible reason for the poor results: the call train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5}) should be outside the if i % 100 == 0 block, so that a training step runs on every iteration rather than only every 100th.
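In other words, the training loop should look like this (only the indentation of the last line changes from the code in the question):

for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print('step %d, training accuracy %g' % (i, train_accuracy))
    # run a training step on EVERY iteration, not only every 100th
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})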
Question 2
To answer your second question: the graphs are generated by TensorBoard (as mentioned by @JoshVarty). To visualize your graph, you need to write it to a file. Add the following somewhere after defining the complete graph and before running the session:
file_writer = tf.summary.FileWriter(".", tf.get_default_graph())
Then start the TensorBoard server from the terminal in your current path with tensorboard --logdir=".". You can open it in your browser at the default port 6006, and the graph will appear after you run the script.
I am trying to build a CNN using the Adadelta optimizer but am getting the following error.
tensorflow.python.framework.errors.FailedPreconditionError: Attempting to use uninitialized value Variable_7/Adadelta
[[Node: Adadelta/update_Variable_7/ApplyAdadelta = ApplyAdadelta[T=DT_FLOAT, _class=["loc:@Variable_7"], use_locking=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Variable_7, Variable_7/Adadelta, Variable_7/Adadelta_1, Adadelta/lr, Adadelta/rho, Adadelta/epsilon, gradients/add_3_grad/tuple/control_dependency_1)]]
Caused by op u'Adadelta/update_Variable_7/ApplyAdadelta',
optimizer = tf.train.AdadeltaOptimizer(learning_rate).minimize(cross_entropy)
I tried reinitializing the session variables after the Adadelta statement, as mentioned in the post 'Tensorflow: Using Adam optimizer', but that didn't help either.
How can I avoid this error? Thanks.
import tensorflow as tf
import numpy
from tensorflow.examples.tutorials.mnist import input_data

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 100
batch_size = 1000
display_step = 1

# Set model weights
W = tf.Variable(tf.zeros([784, 10]), name="weights")
b = tf.Variable(tf.zeros([10]), name="bias")

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

# Initializing the variables
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        total_batch = int(mnist.train.num_examples/batch_size)
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            x_image = tf.reshape(batch_xs, [-1, 28, 28, 1])
            h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
            h_pool1 = max_pool_2x2(h_conv1)
            h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
            h_pool2 = max_pool_2x2(h_conv2)
            h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
            h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
            y_conv = tf.nn.softmax(tf.matmul(h_fc1, W_fc2) + b_fc2)

            cross_entropy = tf.reduce_mean(-tf.reduce_sum(batch_ys * tf.log(y_conv), reduction_indices=[1]))
            #optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)
            optimizer = tf.train.AdadeltaOptimizer(learning_rate).minimize(cross_entropy)
            sess.run(init)
            correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(batch_ys, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
            sess.run([cross_entropy, y_conv, optimizer])
            print cross_entropy.eval()
The problem here is that tf.initialize_all_variables() is a misleading name. It really means "return an operation that initializes all variables that have already been created (in the default graph)". When you call tf.train.AdadeltaOptimizer(...).minimize(), TensorFlow creates additional variables, which are not covered by the init op that you created earlier.
Moving the line:
init = tf.initialize_all_variables()
...after the construction of the tf.train.AdadeltaOptimizer should make your program work.
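Concretely, the corrected ordering would be (a sketch, assuming the rest of the graph is already built):

optimizer = tf.train.AdadeltaOptimizer(learning_rate).minimize(cross_entropy)
# create the init op AFTER the optimizer, so that the accumulator
# variables Adadelta adds to the graph are covered by it
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)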
N.B. Your program rebuilds the entire network, apart from the variables, on each training step. This is likely to be very inefficient, and the Adadelta algorithm will not adapt as expected because its state is recreated on each step. I would strongly recommend moving the code from the definition of batch_xs to the creation of the optimizer outside of the two nested for loops. You should define tf.placeholder() ops for the batch_xs and batch_ys inputs, and use the feed_dict argument to sess.run() to pass in the values returned by mnist.train.next_batch().
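A sketch of that restructuring (placeholder shapes assume the flattened 784-pixel MNIST images used in the question):

# build the graph ONCE, before the training loops
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
x_image = tf.reshape(x, [-1, 28, 28, 1])
# ... convolution / pooling / fully connected layers as before,
#     ending with y_conv and cross_entropy defined in terms of x and y_ ...
optimizer = tf.train.AdadeltaOptimizer(learning_rate).minimize(cross_entropy)
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)  # initialize once, not inside the loop
    for epoch in range(training_epochs):
        for i in range(int(mnist.train.num_examples / batch_size)):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            _, loss = sess.run([optimizer, cross_entropy],
                               feed_dict={x: batch_xs, y_: batch_ys})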
I am using TensorFlow to run a convolutional neural network on the MNIST database, but I am getting the following error.
tensorflow.python.framework.errors.InvalidArgumentError: You must feed a value for placeholder tensor 'x' with dtype float
[[Node: x = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
x = tf.placeholder(tf.float32, [None, 784], name='x') # mnist data image of shape 28*28=784
I thought I was correctly updating the value of x using feed_dict, but it's saying I haven't fed the placeholder x.
Also, is there any other logical flaw in my code?
Any help would be greatly appreciated. Thanks.
import tensorflow as tf
import numpy
from tensorflow.examples.tutorials.mnist import input_data

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 10
batch_size = 100
display_step = 1

# tf Graph Input
#x = tf.placeholder(tf.float32, [50, 784], name='x') # mnist data image of shape 28*28=784
#y = tf.placeholder(tf.float32, [50, 10], name='y') # 0-9 digits recognition => 10 classes

# Set model weights
W = tf.Variable(tf.zeros([784, 10]), name="weights")
b = tf.Variable(tf.zeros([10]), name="bias")

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

# Initializing the variables
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for i in range(1000):
        print i
        batch_xs, batch_ys = mnist.train.next_batch(50)
        x_image = tf.reshape(x, [-1, 28, 28, 1])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
        h_pool1 = max_pool_2x2(h_conv1)
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
        h_pool2 = max_pool_2x2(h_conv2)
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        y_conv = tf.nn.softmax(tf.matmul(h_fc1, W_fc2) + b_fc2)
        cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(y_conv), reduction_indices=[1]))
        sess.run(
            [cross_entropy, y_conv],
            feed_dict={x: batch_xs, y: batch_ys})
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y, 1))
        print correct_prediction.eval()
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
Why are you trying to create placeholder variables? You should be able to use the outputs generated by mnist.train.next_batch(50) directly, provided that you move the computation of correct_prediction and accuracy inside the model itself.
batch_xs, batch_ys = mnist.train.next_batch(50)
x_image = tf.reshape(batch_xs, [-1, 28, 28, 1])
...
cross_entropy = tf.reduce_mean(-tf.reduce_sum(batch_ys * tf.log(y_conv), reduction_indices=[1]))
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(batch_ys, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
loss, preds, predictions_correct, acc = sess.run([cross_entropy, y_conv, correct_prediction, accuracy])
print predictions_correct, acc
You're receiving that error because you're attempting to run eval() on correct_prediction. That tensor requires the batch inputs (x and y) in order to be evaluated. You could correct the error by changing it to:
print correct_prediction.eval(feed_dict={x: batch_xs, y: batch_ys})
But as Benoit Steiner mentioned, you could just as easily pull it into the model.
On a more general note, you're not doing any kind of optimization here, but maybe you just haven't gotten around to that yet. As it stands now, it'll just print out bad predictions for a while. :)
Firstly, your x and y placeholders are commented out; if that is the case in your actual code, it is very likely the issue.
correct_prediction.eval() is equivalent to tf.Session.run(correct_prediction) (or in your case sess.run()) and thus requires the same syntax. So it would need to be correct_prediction.eval(feed_dict={x: batch_xs, y: batch_ys}) in order to run. Be warned, however, that this is generally RAM-intensive and may cause your system to hang. Pulling the accuracy function into the model may be a good idea because of the RAM usage.
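For clarity, the two forms below compute exactly the same thing (a sketch, assuming sess is the active session and the x and y placeholders are defined, i.e. un-commented):

result_a = correct_prediction.eval(feed_dict={x: batch_xs, y: batch_ys}, session=sess)
result_b = sess.run(correct_prediction, feed_dict={x: batch_xs, y: batch_ys})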
I did not see an optimization function using your cross entropy; however, I have never tried not using one, so if it works, don't fix it. But if it ends up throwing an error, you may want to try:
optimizer = tf.train.AdamOptimizer().minimize(cross_entropy)
and replace the 'cross_entropy' in
sess.run([cross_entropy, y_conv], feed_dict={x: batch_xs, y: batch_ys})
with 'optimizer'
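Putting that suggestion together, the training call might look like this (a sketch; AdamOptimizer's default learning rate is assumed, and x and y are assumed to be real placeholders). Note that, as discussed in the Adadelta answer above, the init op must be created after the optimizer so its internal variables get initialized:

optimizer = tf.train.AdamOptimizer().minimize(cross_entropy)
init = tf.initialize_all_variables()  # created after the optimizer on purpose

with tf.Session() as sess:
    sess.run(init)
    # running `optimizer` evaluates cross_entropy internally and applies the gradients
    _, preds = sess.run([optimizer, y_conv], feed_dict={x: batch_xs, y: batch_ys})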
https://pythonprogramming.net/tensorflow-neural-network-session-machine-learning-tutorial/
Check the accuracy evaluation section of the script.