ValueError: Cannot feed value of shape - python

I'm new to TensorFlow and trying to program a CNN with it. I'm following a tutorial and using exactly the same code, yet I still get an error at the end that says:
ValueError: Cannot feed value of shape (100, 10) for Tensor 'y_2:0', which has shape '(?, 1)'
from __future__ import absolute_import, division, print_function
import tensorflow as tf
import numpy as np
get_ipython().run_line_magic('matplotlib', 'inline')
import matplotlib
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
training_digits, training_labels = mnist.train.next_batch(1000)
test_digits, test_labels = mnist.test.next_batch(200)
height = 28
width = 28
channels = 1
n_inputs = height * width
conv1_feature_maps = 32
conv1_kernel_size = 3
conv1_stride = 1
conv1_pad = "SAME"
conv2_feature_maps = 64
conv2_kernel_size = 3
conv2_stride = 2
conv2_pad = "SAME"
pool3_feature_maps = conv2_feature_maps
n_fullyconn1 = 64
n_outputs = 10
tf.reset_default_graph()
X = tf.placeholder(tf.float32, shape=[None, n_inputs], name="X")
X_reshaped = tf.reshape(X, shape=[-1, height, width, channels])
y = tf.placeholder(tf.int32, shape=[None], name="y")
conv1 = tf.layers.conv2d(X_reshaped, filters=conv1_feature_maps,
kernel_size=conv1_kernel_size,
strides=conv1_stride, padding=conv1_pad,
activation = tf.nn.relu, name="conv1")
conv2 = tf.layers.conv2d(conv1, filters=conv2_feature_maps,
kernel_size=conv2_kernel_size,
strides=conv2_stride, padding=conv2_pad,
activation = tf.nn.relu, name="conv2")
pool3 = tf.nn.max_pool(conv2, ksize=[1,2,2,1], strides=[1,2,2,1], padding="VALID")
pool3_flat = tf.reshape(pool3, shape=[-1,pool3_feature_maps * 7 *7])
fullyconn1 = tf.layers.dense(pool3_flat, n_fullyconn1, activation = tf.nn.relu, name="fc1")
logits = tf.layers.dense(fullyconn1, n_outputs, name="output")
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits = logits,
labels = y)
loss = tf.reduce_mean(xentropy)
optimizer = tf.train.AdamOptimizer()
training_op = optimizer.minimize(loss)
correct = tf.nn.in_top_k(logits, y, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
init = tf.global_variables_initializer()
saver = tf.train.Saver()
n_epochs = 5
batch_size = 100
with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(mnist.train.num_examples // batch_size):
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        acc_test = accuracy.eval(feed_dict={X: mnist.test.images, y: mnist.test.labels})
        print(epoch, "Train accuracy: ", acc_train, "Test accuracy: ", acc_test)
        save_path = saver.save(sess, "./my_mnist_model")
What am I doing wrong and what should I change to solve this problem?
I tried to read all the other answers on Stack Overflow, but I cannot connect them to my code.
Thanks!

You define y with shape [?, 1] here:
y = tf.placeholder(tf.int32, shape=[None, 1], name="y")
Change it to:
y = tf.placeholder(tf.int32, shape=[None, 10], name="y")
Note that the shape is changed to [None, 10], so it matches the one-hot labels you are feeding.
Edit: alternatively, keep the [None]-shaped placeholder from your posted code and set one_hot=False in mnist = input_data.read_data_sets("MNIST_data/", one_hot=True); the labels are then fed as integer class indices, which is what tf.nn.sparse_softmax_cross_entropy_with_logits expects. Either way, the placeholder shape, the labels you feed, and the loss function have to agree, as sketched below.
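For clarity, a minimal sketch of the two consistent setups, reusing the names from the question (pick one; they are alternatives, not a pair):
# Option 1 (sketch): integer labels + the sparse loss you already use; set one_hot=False
mnist = input_data.read_data_sets("MNIST_data/", one_hot=False)
y = tf.placeholder(tf.int32, shape=[None], name="y")                        # labels are class indices, shape (batch,)
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
correct = tf.nn.in_top_k(logits, y, 1)
# Option 2 (sketch): one-hot labels + the dense loss; keep one_hot=True
y = tf.placeholder(tf.float32, shape=[None, 10], name="y")                  # labels are one-hot, shape (batch, 10)
xentropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits)
correct = tf.nn.in_top_k(logits, tf.argmax(y, axis=1), 1)                   # in_top_k still needs integer class targets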

Related

ValueError: Shape must be rank 2 but is rank 4 for 'in_top_k/InTopKV2' (op: 'InTopKV2') with input shapes: [?,28,28,10], [?], []

I'm new to Tensorflow and I am trying to train on MNIST. However, the code fails on
correct = tf.nn.in_top_k(logits, tf.argmax(y, axis=1), 1)
with the error "ValueError: Shape must be rank 2 but is rank 4 for 'in_top_k/InTopKV2' (op: 'InTopKV2') with input shapes: [?,28,28,10], [?], []"
What is going on here, and what do I need to know to make this compatible with different architectures in the future? I've included the entire file below.
import tensorflow as tf
import numpy as np
from tensorflow.python.framework import graph_util
from tensorflow.python.framework import graph_io
tf.reset_default_graph()
x = tf.placeholder(tf.float32, shape=(None, 28, 28), name='x_input')
y = tf.placeholder(tf.float32, shape=(None, 10), name='y_label')
y = tf.stop_gradient(y, name="stop_gradient_y")
input_layer = tf.reshape(x, [-1, 28, 28, 1], name='x_reshaped')
fc_layer1 = tf.layers.dense(
inputs=input_layer, units=1024, activation=tf.nn.relu, name='fc_layer_1')
fc_layer2 = tf.layers.dense(
inputs=fc_layer1, units=512, activation=tf.nn.relu, name='fc_layer_2')
fc_layer3 = tf.layers.dense(
inputs=fc_layer2, units=512, activation=tf.nn.relu, name='fc_layer_3')
fc_layer4 = tf.layers.dense(
inputs=fc_layer3, units=512, activation=tf.nn.relu, name='fc_layer_4')
fc_layer5 = tf.layers.dense(
inputs=fc_layer4, units=512, activation=tf.nn.relu, name='fc_layer_5')
logits = tf.layers.dense(inputs=fc_layer5, units=10, name='logits')
classes = tf.argmax(input=logits, axis=1, name='classes')
probabilities = tf.nn.softmax(logits, name="probabilities_out")
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits, name='loss_func')
grad = tf.gradients(loss, x)
grad_out = tf.identity(grad, name='gradient_out')
optimizer = tf.train.AdamOptimizer()
train_op = optimizer.minimize(loss)
correct = tf.nn.in_top_k(logits, tf.argmax(y, axis=1), 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train/np.float32(255)
y_train = y_train.astype(np.int32)
x_test = x_test/np.float32(255)
y_test = y_test.astype(np.int32)
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)
num_epochs = 100
batch_size = 100
init = tf.global_variables_initializer()
with tf.Session() as sess:
init.run()
for epoch in range(num_epochs):
print('Epoch: {}'.format(epoch))
for i in range(x_train.shape[0] // batch_size):
batch_indices = np.random.randint(x_train.shape[0], size=batch_size)
x_batch = x_train[batch_indices]
y_batch = y_train[batch_indices]
sess.run(train_op, feed_dict={x: x_batch, y: y_batch})
acc_test = accuracy.eval(feed_dict={x: x_test, y: y_test})
print(epoch, "Test accuracy:", acc_test)
constant_graph = graph_util.convert_variables_to_constants(
sess,
sess.graph.as_graph_def(),
['probabilities_out', 'gradient_out'])
graph_io.write_graph(constant_graph, '.', 'mnist_gradient_fc_without.pb', as_text=False)
Thanks JacoSolari and Jarom Allen. For the benefit of the community, here is the complete working code. The key change is to flatten each 28x28 image to a 784-vector (tf.reshape(x, [-1, 784])) before the dense layers, so logits comes out rank 2, which is what tf.nn.in_top_k expects.
import tensorflow as tf
import numpy as np
from tensorflow.python.framework import graph_util
from tensorflow.python.framework import graph_io
tf.reset_default_graph()
x = tf.placeholder(tf.float32, shape=(None, 28, 28), name='x_input')
y = tf.placeholder(tf.float32, shape=(None, 10), name='y_label')
y = tf.stop_gradient(y, name="stop_gradient_y")
input_layer = tf.reshape(x, [-1, 784 ], name='x_reshaped')
fc_layer1 = tf.layers.dense(
inputs=input_layer, units=1024, activation=tf.nn.relu, name='fc_layer_1')
fc_layer2 = tf.layers.dense(
inputs=fc_layer1, units=512, activation=tf.nn.relu, name='fc_layer_2')
fc_layer3 = tf.layers.dense(
inputs=fc_layer2, units=512, activation=tf.nn.relu, name='fc_layer_3')
fc_layer4 = tf.layers.dense(
inputs=fc_layer3, units=512, activation=tf.nn.relu, name='fc_layer_4')
fc_layer5 = tf.layers.dense(
inputs=fc_layer4, units=512, activation=tf.nn.relu, name='fc_layer_5')
logits = tf.layers.dense(inputs=fc_layer5, units=10, name='logits')
classes = tf.argmax(input=logits, axis=1, name='classes')
probabilities = tf.nn.softmax(logits, name="probabilities_out")
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits, name='loss_func')
grad = tf.gradients(loss, x)
grad_out = tf.identity(grad, name='gradient_out')
optimizer = tf.train.AdamOptimizer()
train_op = optimizer.minimize(loss)
correct = tf.nn.in_top_k(logits, tf.argmax(y, axis=1), 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train/np.float32(255)
y_train = y_train.astype(np.int32)
x_test = x_test/np.float32(255)
y_test = y_test.astype(np.int32)
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)
num_epochs = 5
batch_size = 100
init = tf.global_variables_initializer()
with tf.Session() as sess:
    init.run()
    for epoch in range(num_epochs):
        print('Epoch: {}'.format(epoch))
        for i in range(x_train.shape[0] // batch_size):
            batch_indices = np.random.randint(x_train.shape[0], size=batch_size)
            x_batch = x_train[batch_indices]
            y_batch = y_train[batch_indices]
            sess.run(train_op, feed_dict={x: x_batch, y: y_batch})
        acc_test = accuracy.eval(feed_dict={x: x_test, y: y_test})
        print(epoch, "Test accuracy:", acc_test)
    constant_graph = graph_util.convert_variables_to_constants(
        sess,
        sess.graph.as_graph_def(),
        ['probabilities_out', 'gradient_out'])
    graph_io.write_graph(constant_graph, '.', 'mnist_gradient_fc_without.pb', as_text=False)
Output:
Epoch: 0
0 Test accuracy: 0.9527
Epoch: 1
1 Test accuracy: 0.9683
Epoch: 2
2 Test accuracy: 0.9731
Epoch: 3
3 Test accuracy: 0.9776
Epoch: 4
4 Test accuracy: 0.9821
INFO:tensorflow:Froze 12 variables.
INFO:tensorflow:Converted 12 variables to const ops.

Tensorflow feed_dict ValueError: setting an array element with a sequence

I'm new to tensorflow and trying to run a CNN on Twitter embedding matrices (each embedding matrix is 574x300 - word x embedding length) in batches of 100 tweets at a time. I keep getting the error ValueError: setting an array element with a sequence. at the following line at the bottom: sess.run(training_op, feed_dict={input_tweets: x_batch, tweet_labels: y_batch}).
filter_size = 2
embedding_size = 300
length_embedding = 575
num_filters = 100
filter_shape = [filter_size, embedding_size, 1, num_filters]
batch_size = 100
n_epochs = 10
n_inputs = length_embedding*embedding_size
n_outputs = 2 #classify between 2 categories
num_train_examples = 2000
with tf.name_scope("inputs"):
input_tweets = tf.placeholder(tf.float32, shape = [batch_size, length_embedding], name="input_tweets")
input_tweets_reshaped = tf.expand_dims(input_tweets, -1)
tweet_labels = tf.placeholder(tf.int32, shape = [batch_size], name="tweet_labels")
W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
conv = tf.nn.conv2d(input_tweets_reshaped, W,
strides = [1,1,1,1], padding="VALID", name="conv")
conv_bias = tf.nn.bias_add(conv, b)
#pooling
sequence_length=input_tweets_reshaped.shape[1]
with tf.name_scope("pool"):
pool = tf.nn.max_pool(conv, ksize=[1, sequence_length - filter_size + 1, 1, 1],
strides=[1,1,1,1],
padding="VALID",
name="pool")
pool_flat = tf.reshape(pool, shape=[-1, num_filters])
#fully-connected layer
with tf.name_scope("fc_layer"):
fc_layer = tf.layers.dense(pool_flat, num_filters, activation=tf.nn.relu, name="fc_layer")
#output
with tf.name_scope("output_layer"):
logits = tf.layers.dense(fc_layer, n_outputs, name="output_layer")
Y_proba = tf.nn.softmax(logits, name="Y_proba")
#train
with tf.name_scope("train"):
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=tweet_labels)
loss=tf.reduce_mean(xentropy)
optimizer=tf.train.AdamOptimizer()
training_op=optimizer.minimize(loss)
with tf.name_scope("eval"):
correct = tf.nn.in_top_k(logits, tweet_labels, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
with tf.name_scope("init_and_save"):
init = tf.global_variables_initializer()
saver = tf.train.Saver()
#--run model
with tf.Session() as sess:
init.run()
for epoch in range(n_epochs):
for iteration in range(num_train_examples // batch_size):
print("iteration: "+str(iteration))
x_batch = x_train[iteration*batch_size : (iteration+1)*batch_size]
y_batch = y_train[iteration*batch_size : (iteration+1)*batch_size]
sess.run(training_op, feed_dict={input_tweets: x_batch, tweet_labels: y_batch})
acc_train = accuracy.eval(feed_dict={input_tweets: x_batch, tweet_labels: y_batch})
acc_test = accuracy.eval(feed_dict={input_tweets: x_test, tweet_labels: y_test})
print(epoch, "Train accuracy:", acc_train, "Test accuracy:", acc_test)
x_batch is a numpy array of length 100, and each element is a matrix of dimension 575 x 300 (though when I call x_batch.shape, it returns (100, 575)). y_batch is a 1d numpy array of 1's and 0's; y_batch.shape returns (100,). I think the problem is maybe about the dimensions of the inputs - can anyone see clearly what the mismatch is?
Thank you!
The input to conv2d must have rank 4, but yours has rank 3.
Also, embedding_size, which determines the second dimension of your filter, must be less than or equal to the third dimension of your input tensor. After expand_dims that dimension is 1, so a filter with embedding dimension 300 cannot fit.
You could use tf.layers.conv2d(), which creates the convolution variables for you.
Or maybe you intended to use tf.layers.conv1d(); it expects a rank-3 tensor as input.
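For reference, a rough conv1d sketch along those lines; it assumes each tweet is fed as a full [length_embedding, embedding_size] matrix (a rank-3 batch), which is an assumption about how your data is actually stored:
input_tweets = tf.placeholder(tf.float32, shape=[None, length_embedding, embedding_size], name="input_tweets")
conv = tf.layers.conv1d(input_tweets, filters=num_filters, kernel_size=filter_size,
                        padding="valid", activation=tf.nn.relu, name="conv1d")   # slides over the word axis
pool_flat = tf.reduce_max(conv, axis=1)                                          # global max-pool over words -> [batch, num_filters]
logits = tf.layers.dense(pool_flat, n_outputs, name="output_layer")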
I'm not sure what you want to achieve with your code, but here's the modified version that works:
import tensorflow as tf
import numpy as np
filter_size = 2
embedding_size = 300
length_embedding = 575
num_filters = 100
filter_shape = [filter_size, 1, 1, num_filters]
batch_size = 100
n_epochs = 10
n_inputs = length_embedding*embedding_size
n_outputs = 2 #classify between 2 categories
num_train_examples = 2000
with tf.name_scope("inputs"):
input_tweets = tf.placeholder(tf.float32, shape = [None, length_embedding], name="input_tweets")
input_tweets_reshaped = input_tweets[..., tf.newaxis, tf.newaxis]
tweet_labels = tf.placeholder(tf.int32, shape = [None], name="tweet_labels")
W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
b = tf.Variable(0.1*tf.ones([num_filters]), name="b")
conv = tf.nn.conv2d(input_tweets_reshaped,
W,
strides=[1,1,1,1],
padding="VALID",
name="conv")
conv_bias = tf.nn.bias_add(conv, b)
#pooling
sequence_length=input_tweets_reshaped.shape[1]
with tf.name_scope("pool"):
pool = tf.nn.max_pool(conv, ksize=[1, sequence_length - filter_size + 1, 1, 1],
strides=[1,1,1,1],
padding="VALID",
name="pool")
pool_flat = tf.reshape(pool, shape=[-1, num_filters])
#fully-connected layer
with tf.name_scope("fc_layer"):
fc_layer = tf.layers.dense(pool_flat, num_filters, activation=tf.nn.relu, name="fc_layer")
#output
with tf.name_scope("output_layer"):
logits = tf.layers.dense(fc_layer, n_outputs, name="output_layer")
Y_proba = tf.nn.softmax(logits, name="Y_proba")
#train
with tf.name_scope("train"):
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=tweet_labels)
loss=tf.reduce_mean(xentropy)
optimizer=tf.train.AdamOptimizer()
training_op=optimizer.minimize(loss)
with tf.name_scope("eval"):
correct = tf.nn.in_top_k(logits, tweet_labels, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
with tf.name_scope("init_and_save"):
init = tf.global_variables_initializer()
saver = tf.train.Saver()
x_train = np.random.normal(size=(10*batch_size, length_embedding, ))
y_train = np.random.randint(low=0, high=2, size=10*batch_size)
x_test = x_train
y_test = y_train
with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(num_train_examples // batch_size):
            print("iteration: " + str(iteration))
            x_batch = x_train[iteration*batch_size : (iteration+1)*batch_size]
            y_batch = y_train[iteration*batch_size : (iteration+1)*batch_size]
            sess.run(training_op, feed_dict={input_tweets: x_batch, tweet_labels: y_batch})
        acc_train = accuracy.eval(feed_dict={input_tweets: x_batch, tweet_labels: y_batch})
        acc_test = accuracy.eval(feed_dict={input_tweets: x_test, tweet_labels: y_test})
        print(epoch, "Train accuracy:", acc_train, "Test accuracy:", acc_test)

Tensorflow can't determine why size of tensor is changing and can't debug directly

I am creating a convolutional neural network class using tensorflow and have run into an error where, during optimization, one tensor has 40000 elements when it should only have 10000. The problem arises in the line:
correct_prediction = tf.equal(tf.argmax(self.y, 1), tf.argmax(output, 1))
giving the error:
InvalidArgumentError (see above for traceback): Incompatible shapes: [10000] vs. [40000]
[[Node: Equal = Equal[T=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ArgMax, ArgMax_1)]]
So basically it's saying that the output of my network is 4x larger than the test labels I provided. My question is: how do I view the size of my output tensor as it changes? It seems like everything happens behind the scenes via the single session.run('model-name') call, which prevents me from debugging it. How do I peek inside and figure out what's going on?
Here is the full code:
class ConvNet:
def __init__(self, epochs=1, learning_rate=0.01, batch_size=50):
self.learning_rate = learning_rate
self.epochs = epochs
self.batch_size = batch_size
self.x = tf.placeholder(tf.float32, [None, 784])
self.x_shaped = tf.reshape(self.x, [-1, 28, 28, 1])
self.y = tf.placeholder(tf.float32, [None, 10])
self.accuracy = None
def predict(self, x_test, y_test):
with tf.Session() as sess:
new_saver = tf.train.import_meta_graph('convnet-model.meta')
new_saver.restore(sess, tf.train.latest_checkpoint('./'))
test_acc = sess.run(self.accuracy, feed_dict={self.x: x_test, self.y: y_test})
print('test accuracy: {:.3f}'.format(test_acc))
def train(self, x_train, y_train):
output, raw_output = self.create_structure()
self.optimize(raw_output, output, x_train, y_train)
def create_structure(self):
output = self.conv_layer(self.x_shaped, 1, 32, [5, 5], 'conv1')
output = self.relu_layer(output)
output = self.conv_layer(output, 32, 64, [5, 5], 'conv2')
output = self.relu_layer(output)
output = self.max_pool_layer(output, [2, 2])
raw_output, output = self.full_connect_layers(output)
return output, raw_output
def optimize(self, raw_output, output, x_train, y_train):
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=raw_output, labels=output))
optimiser = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(self.y, 1), tf.argmax(output, 1))
self.accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init_op)
total_batch = int(len(y_train) / self.batch_size)
for epoch in range(self.epochs):
for i in range(total_batch):
batch_x, batch_y = self.get_batch(x_train, y_train)
_, c = sess.run([optimiser, cross_entropy], feed_dict={self.x: batch_x, self.y: batch_y})
saver = tf.train.Saver()
saver.save(sess, './convnet-model')
def conv_layer(self, input_data, num_input_channels, num_filters, filter_shape, name):
# setup the filter shape for tf.nn.conv2d
conv_filter_shape = [filter_shape[0], filter_shape[1], num_input_channels, num_filters]
# init weights and bias for the filter
weights = tf.Variable(tf.truncated_normal(conv_filter_shape, stddev=0.03), name=name+'_w')
bias = tf.Variable(tf.truncated_normal([num_filters]), name=name+'_b')
# set up the convolutional layer operation
out_layer = tf.nn.conv2d(input_data, weights, [1, 1, 1, 1], padding='SAME')
# add the bias
out_layer += bias
return out_layer
def relu_layer(self, out_layer):
return tf.nn.relu(out_layer)
def max_pool_layer(self, out_layer, pool_shape):
ksize = [1, pool_shape[0], pool_shape[1], 1]
strides = [1, 2, 2, 1]
out_layer = tf.nn.max_pool(out_layer, ksize=ksize, strides=strides, padding='SAME')
return out_layer
def full_connect_layers(self, output):
flattened = tf.reshape(output, [-1, 7 * 7 * 64])
w1 = tf.Variable(tf.truncated_normal([7*7*64, 1000], stddev=0.03), name='w1')
b1 = tf.Variable(tf.truncated_normal([1000], stddev=0.01), name='b1')
dense_layer1 = tf.matmul(flattened, w1) + b1
dense_layer1 = tf.nn.relu(dense_layer1)
w2 = tf.Variable(tf.truncated_normal([1000, 10], stddev=0.03), name='w2')
b2 = tf.Variable(tf.truncated_normal([10], stddev=0.01), name='b2')
dense_layer2 = tf.matmul(dense_layer1, w2) + b2
y = tf.nn.softmax(dense_layer2)
return dense_layer2, y
def get_batch(self, features, labels):
index_list = np.arange(0, len(labels))
np.random.shuffle(index_list)
new_features = np.asarray([features[i] for i in index_list])
new_labels = np.asarray([labels[i] for i in index_list])
return new_features[0:self.batch_size], new_labels[0:self.batch_size]
main file:
from convnet import ConvNet
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
convnet = ConvNet()
convnet.train(mnist.test.images, mnist.test.labels)
convnet.predict(mnist.test.images, mnist.test.labels)

Trouble predicting with tensorflow model

I've trained a Deep Neural Network on the MNIST dataset. Here is the code for training.
n_classes = 10
batch_size = 100
x = tf.placeholder(tf.float32, [None, 784],name='Xx')
y = tf.placeholder(tf.float32,[None,10],name='Yy')
input = 784
n_nodes_1 = 300
n_nodes_2 = 300
def neural_network_model(data):
variables = {'w1':tf.Variable(tf.random_normal([input,n_nodes_1])),
'w2':tf.Variable(tf.random_normal([n_nodes_1,n_nodes_2])),
'w3':tf.Variable(tf.random_normal([n_nodes_2,n_classes])),
'b1':tf.Variable(tf.random_normal([n_nodes_1])),
'b2':tf.Variable(tf.random_normal([n_nodes_2])),
'b3':tf.Variable(tf.random_normal([n_classes]))}
output1 = tf.add(tf.matmul(data,variables['w1']),variables['b1'])
output2 = tf.nn.relu(output1)
output3 = tf.add(tf.matmul(output2, variables['w2']), variables['b2'])
output4 = tf.nn.relu(output3)
output5 = tf.add(tf.matmul(output4, variables['w3']), variables['b3'],name='last')
return output5
def train_neural_network(x):
prediction = neural_network_model(x)
name_of_final_layer = 'fin'
final = tf.nn.softmax_cross_entropy_with_logits_v2(logits=prediction,
labels=y,name=name_of_final_layer)
cost = tf.reduce_mean(final)
optimizer = tf.train.AdamOptimizer().minimize(cost)
hm_epochs = 3
saver = tf.train.Saver()
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(hm_epochs):
epoch_loss = 0
for _ in range(int(mnist.train.num_examples/batch_size)):
epoch_x, epoch_y = mnist.train.next_batch(batch_size)
_,c=sess.run([optimizer,cost],feed_dict={x:epoch_x,y:epoch_y})
epoch_loss += c
print("Epoch",epoch+1,"Completed Total Loss:",epoch_loss)
correct = tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct,'float'))
print('Accuracy on val_set:',accuracy.eval({x:mnist.test.images,y:mnist.test.labels}))
path = saver.save(sess,"net/network")
print("Saved to",path)
Here is my code for evaluating a single datapoint
def eval_neural_network():
with tf.Session() as sess:
new_saver = tf.train.import_meta_graph('net/network.meta')
new_saver.restore(sess, "net/network")
sing = np.reshape(mnist.test.images[0],(-1,784))
output = sess.run([y],feed_dict={x:sing})
print(output)
eval_neural_network()
The error that popped up is :
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Yy' with dtype float and shape [?,10]
[[Node: Yy = Placeholder[dtype=DT_FLOAT, shape=[?,10], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
I've researched this online for multiple days now and still could not get it to work. Any advice?
The losses oscillate like this, but the predictions don't seem to be bad; it works.
It also extracts the MNIST archive repeatedly. Accuracy can also reach 0.98 with a simpler network.
Epoch 1 Completed Total Loss: 47.47844
Accuracy on val_set: 0.8685
Epoch 2 Completed Total Loss: 10.217445
Accuracy on val_set: 0.9
Epoch 3 Completed Total Loss: 14.013474
Accuracy on val_set: 0.9104
[2]
import tensorflow as tf
import tensorflow.examples.tutorials.mnist.input_data as input_data
import numpy as np
import matplotlib.pyplot as plt
n_classes = 10
batch_size = 100
x = tf.placeholder(tf.float32, [None, 784],name='Xx')
y = tf.placeholder(tf.float32,[None,10],name='Yy')
input = 784
n_nodes_1 = 300
n_nodes_2 = 300
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)
def neural_network_model(data):
variables = {'w1':tf.Variable(tf.random_normal([input,n_nodes_1])),
'w2':tf.Variable(tf.random_normal([n_nodes_1,n_nodes_2])),
'w3':tf.Variable(tf.random_normal([n_nodes_2,n_classes])),
'b1':tf.Variable(tf.random_normal([n_nodes_1])),
'b2':tf.Variable(tf.random_normal([n_nodes_2])),
'b3':tf.Variable(tf.random_normal([n_classes]))}
output1 = tf.add(tf.matmul(data,variables['w1']),variables['b1'])
output2 = tf.nn.relu(output1)
output3 = tf.add(tf.matmul(output2, variables['w2']), variables['b2'])
output4 = tf.nn.relu(output3)
output5 = tf.add(tf.matmul(output4, variables['w3']), variables['b3'],name='last')
return output5
def train_neural_network(x):
prediction = neural_network_model(x)
name_of_final_layer = 'fin'
final = tf.nn.softmax_cross_entropy_with_logits_v2(logits=prediction,
labels=y,name=name_of_final_layer)
cost = tf.reduce_mean(final)
optimizer = tf.train.AdamOptimizer().minimize(cost)
hm_epochs = 3
saver = tf.train.Saver()
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(hm_epochs):
for _ in range(int(mnist.train.num_examples/batch_size)):
epoch_x, epoch_y = mnist.train.next_batch(batch_size)
_,c=sess.run([optimizer,cost],feed_dict={x:epoch_x,y:epoch_y})
print("Epoch",epoch+1,"Completed Total Loss:",c)
correct = tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct,'float'))
print('Accuracy on val_set:',accuracy.eval({x:mnist.test.images,y:mnist.test.labels}))
#path = saver.save(sess,"net/network")
#print("Saved to",path)
return prediction
def eval_neural_network(prediction):
with tf.Session() as sess:
new_saver = tf.train.import_meta_graph('net/network.meta')
new_saver.restore(sess, "net/network")
singleprediction = tf.argmax(prediction, 1)
sing = np.reshape(mnist.test.images[1], (-1, 784))
output = singleprediction.eval(feed_dict={x:sing},session=sess)
digit = mnist.test.images[1].reshape((28, 28))
plt.imshow(digit, cmap='gray')
plt.show()
print(output)
prediction = train_neural_network(x)
eval_neural_network(prediction)
This complete example, based on the TensorFlow GitHub repository, worked for me.
(I modified a few lines of code by removing the name scopes for x and keep_prob and changing keep_prob to tf.placeholder_with_default. There's probably a better way to do this somewhere.)
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import pandas as pd
import argparse
import sys
import tempfile
​
from tensorflow.examples.tutorials.mnist import input_data
​
import tensorflow as tf
​
FLAGS = None
​
​
def deepnn(x):
"""deepnn builds the graph for a deep net for classifying digits.
​
Args:
x: an input tensor with the dimensions (N_examples, 784), where 784 is the
number of pixels in a standard MNIST image.
​
Returns:
A tuple (y, keep_prob). y is a tensor of shape (N_examples, 10), with values
equal to the logits of classifying the digit into one of 10 classes (the
digits 0-9). keep_prob is a scalar placeholder for the probability of
dropout.
"""
# Reshape to use within a convolutional neural net.
# Last dimension is for "features" - there is only one here, since images are
# grayscale -- it would be 3 for an RGB image, 4 for RGBA, etc.
with tf.name_scope('reshape'):
x_image = tf.reshape(x, [-1, 28, 28, 1])
​
# First convolutional layer - maps one grayscale image to 32 feature maps.
with tf.name_scope('conv1'):
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
​
# Pooling layer - downsamples by 2X.
with tf.name_scope('pool1'):
h_pool1 = max_pool_2x2(h_conv1)
​
# Second convolutional layer -- maps 32 feature maps to 64.
with tf.name_scope('conv2'):
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
​
# Second pooling layer.
with tf.name_scope('pool2'):
h_pool2 = max_pool_2x2(h_conv2)
​
# Fully connected layer 1 -- after 2 round of downsampling, our 28x28 image
# is down to 7x7x64 feature maps -- maps this to 1024 features.
with tf.name_scope('fc1'):
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
​
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
​
# Dropout - controls the complexity of the model, prevents co-adaptation of
# features.
​
keep_prob = tf.placeholder_with_default(1.0,())
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
​
# Map the 1024 features to 10 classes, one for each digit
with tf.name_scope('fc2'):
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
​
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
return y_conv, keep_prob
​
​
def conv2d(x, W):
"""conv2d returns a 2d convolution layer with full stride."""
return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
​
​
def max_pool_2x2(x):
"""max_pool_2x2 downsamples a feature map by 2X."""
return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
​
​
def weight_variable(shape):
"""weight_variable generates a weight variable of a given shape."""
initial = tf.truncated_normal(shape, stddev=0.1)
return tf.Variable(initial)
​
​
def bias_variable(shape):
"""bias_variable generates a bias variable of a given shape."""
initial = tf.constant(0.1, shape=shape)
return tf.Variable(initial)
​
​
# Import data
mnist = input_data.read_data_sets("/tmp")
# Create the model
x = tf.placeholder(tf.float32, [None, 784],name='x')
# Define loss and optimizer
y_ = tf.placeholder(tf.int64, [None])
# Build the graph for the deep net
y_conv, keep_prob = deepnn(x)
with tf.name_scope('loss'):
cross_entropy = tf.losses.sparse_softmax_cross_entropy(
labels=y_, logits=y_conv)
cross_entropy = tf.reduce_mean(cross_entropy)
​
with tf.name_scope('adam_optimizer'):
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
​
with tf.name_scope('accuracy'):
correct_prediction = tf.equal(tf.argmax(y_conv, 1), y_)
correct_prediction = tf.cast(correct_prediction, tf.float32)
accuracy = tf.reduce_mean(correct_prediction)
​
graph_location = tempfile.mkdtemp()
print('Saving graph to: %s' % graph_location)
train_writer = tf.summary.FileWriter(graph_location)
train_writer.add_graph(tf.get_default_graph())
​
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(1000):
        batch = mnist.train.next_batch(50)
        if i % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x: batch[0], y_: batch[1], keep_prob: 1.0})
            print('step %d, training accuracy %g' % (i, train_accuracy))
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

    print('test accuracy %g' % accuracy.eval(feed_dict={
        x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
    sing = np.reshape(mnist.test.images[0], (-1, 784))
    output = sess.run(y_conv, feed_dict={x: sing, keep_prob: 1.0})
    print(tf.argmax(output, 1).eval())
    saver = tf.train.Saver()
    saver.save(sess, "/tmp/network")
Extracting /tmp/train-images-idx3-ubyte.gz
Extracting /tmp/train-labels-idx1-ubyte.gz
Extracting /tmp/t10k-images-idx3-ubyte.gz
Extracting /tmp/t10k-labels-idx1-ubyte.gz
Saving graph to: /tmp/tmp17hf_6c7
step 0, training accuracy 0.2
step 100, training accuracy 0.86
step 200, training accuracy 0.8
step 300, training accuracy 0.94
step 400, training accuracy 0.94
step 500, training accuracy 0.96
step 600, training accuracy 0.88
step 700, training accuracy 0.98
step 800, training accuracy 0.98
step 900, training accuracy 0.98
test accuracy 0.9663
[7]
If you want to restore from a new python run:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import numpy as np
import argparse
import sys
import tempfile
from tensorflow.examples.tutorials.mnist import input_data
sess = tf.Session()
saver = tf.train.import_meta_graph('/tmp/network.meta')
saver.restore(sess,tf.train.latest_checkpoint('/tmp'))
graph = tf.get_default_graph()
mnist = input_data.read_data_sets("/tmp")
simg = np.reshape(mnist.test.images[0],(-1,784))
op_to_restore = graph.get_tensor_by_name("fc2/MatMul:0")
x = graph.get_tensor_by_name("x:0")
output = sess.run(op_to_restore,feed_dict= {x:simg})
print("Result = ", np.argmax(output))

Tensorflow accuracy at .99 but predictions awful

Maybe I'm making predictions wrong?
Here's the project... I have a greyscale input image that I am trying to segment. The segmentation is a simple binary classification (think of foreground vs background), so the ground truth (y) is a matrix of 0's and 1's -- there are 2 classes. Oh, and the input image is a square, so I just use one variable called n_input.
My accuracy essentially converges to 0.99, but when I make a prediction I get all zeros. EDIT --> there is a single 1 in each output matrix, both in the same place...
Here's my session code(everything else is working)...
with tf.Session() as sess:
sess.run(init)
summary = tf.train.SummaryWriter('/tmp/logdir/', sess.graph_def)
step = 1
from tensorflow.contrib.learn.python.learn.datasets.scroll import scroll_data
data = scroll_data.read_data('/home/kendall/Desktop/')
# Keep training until reach max iterations
flag = 0
# while flag == 0:
while step * batch_size < training_iters:
batch_y, batch_x = data.train.next_batch(batch_size)
# pdb.set_trace()
# batch_x = batch_x.reshape((batch_size, n_input))
batch_x = batch_x.reshape((batch_size, n_input, n_input))
batch_y = batch_y.reshape((batch_size, n_input, n_input))
batch_y = convert_to_2_channel(batch_y, batch_size)
# batch_y = batch_y.reshape((batch_size, n_output, n_classes))
batch_y = batch_y.reshape((batch_size, 200, 200, n_classes))
sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
keep_prob: dropout})
if step % display_step == 0:
flag = 1
# Calculate batch loss and accuracy
loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
y: batch_y,
keep_prob: 1.})
print "Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
"{:.6f}".format(loss) + ", Training Accuracy= " + \
"{:.5f}".format(acc)
step += 1
print "Optimization Finished!"
save_path = "model.ckpt"
saver.save(sess, save_path)
im = Image.open('/home/kendall/Desktop/HA900_frames/frame0635.tif')
batch_x = np.array(im)
pdb.set_trace()
batch_x = batch_x.reshape((1, n_input, n_input))
batch_x = batch_x.astype(float)
# pdb.set_trace()
prediction = sess.run(pred, feed_dict={x: batch_x, keep_prob: 1.})
print prediction
arr1 = np.empty((n_input,n_input))
arr2 = np.empty((n_input,n_input))
for i in xrange(n_input):
for j in xrange(n_input):
for k in xrange(2):
if k == 0:
arr1[i][j] = prediction[0][i][j][k]
else:
arr2[i][j] = prediction[0][i][j][k]
# prediction = np.asarray(prediction)
# prediction = np.reshape(prediction, (200,200))
# np.savetxt("prediction.csv", prediction, delimiter=",")
np.savetxt("prediction1.csv", arr1, delimiter=",")
np.savetxt("prediction2.csv", arr2, delimiter=",")
Since there are two classes, that end part (with the couple of loops) is just there to partition the prediction into two 200x200 matrices.
I saved the prediction arrays to a CSV file, and like I said, they were all zeros.
I have also confirmed all data is correct (dimensions and values).
Why would the training converge, but predictions are awful?
If you want to look at all the code, here it is...
import tensorflow as tf
import pdb
import numpy as np
from numpy import genfromtxt
from PIL import Image
# Import MINST data
# from tensorflow.examples.tutorials.mnist import input_data
# mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
# Parameters
learning_rate = 0.001
training_iters = 20000
batch_size = 128
display_step = 1
# Network Parameters
n_input = 200 # MNIST data input (img shape: 28*28)
n_output = 40000 # MNIST total classes (0-9 digits)
n_classes = 2
#n_input = 200
dropout = 0.75 # Dropout, probability to keep units
# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input, n_input])
y = tf.placeholder(tf.float32, [None, n_input, n_input, n_classes])
keep_prob = tf.placeholder(tf.float32) #dropout (keep probability)
# Create some wrappers for simplicity
def conv2d(x, W, b, strides=1):
# Conv2D wrapper, with bias and relu activation
x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
x = tf.nn.bias_add(x, b)
return tf.nn.relu(x)
def maxpool2d(x, k=2):
# MaxPool2D wrapper
return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1],
padding='SAME')
# Create model
def conv_net(x, weights, biases, dropout):
# Reshape input picture
x = tf.reshape(x, shape=[-1, n_input, n_input, 1])
# Convolution Layer
conv1 = conv2d(x, weights['wc1'], biases['bc1'])
# Max Pooling (down-sampling)
conv1 = maxpool2d(conv1, k=2)
conv1 = tf.nn.local_response_normalization(conv1)
# Convolution Layer
conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
# Max Pooling (down-sampling)
conv2 = tf.nn.local_response_normalization(conv2)
conv2 = maxpool2d(conv2, k=2)
# Convolution Layer
conv3 = conv2d(conv2, weights['wc3'], biases['bc3'])
# Max Pooling (down-sampling)
conv3 = tf.nn.local_response_normalization(conv3)
conv3 = maxpool2d(conv3, k=2)
# pdb.set_trace()
# Fully connected layer
# Reshape conv2 output to fit fully connected layer input
fc1 = tf.reshape(conv3, [-1, weights['wd1'].get_shape().as_list()[0]])
fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
fc1 = tf.nn.relu(fc1)
# Apply Dropout
fc1 = tf.nn.dropout(fc1, dropout)
output = []
for i in xrange(2):
output.append(tf.nn.softmax(tf.add(tf.matmul(fc1, weights['out']), biases['out'])))
return output
# return tf.nn.softmax(tf.add(tf.matmul(fc1, weights['out']), biases['out']))
# Store layers weight & bias
weights = {
# 5x5 conv, 1 input, 32 outputs
'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
# 5x5 conv, 32 inputs, 64 outputs
'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
# 5x5 conv, 32 inputs, 64 outputs
'wc3': tf.Variable(tf.random_normal([5, 5, 64, 128])),
# fully connected, 7*7*64 inputs, 1024 outputs
'wd1': tf.Variable(tf.random_normal([25*25*128, 1024])),
# 1024 inputs, 10 outputs (class prediction)
'out': tf.Variable(tf.random_normal([1024, n_output]))
}
biases = {
'bc1': tf.Variable(tf.random_normal([32])),
'bc2': tf.Variable(tf.random_normal([64])),
'bc3': tf.Variable(tf.random_normal([128])),
'bd1': tf.Variable(tf.random_normal([1024])),
'out': tf.Variable(tf.random_normal([n_output]))
}
# Construct model
pred = conv_net(x, weights, biases, keep_prob)
# pdb.set_trace()
pred = tf.pack(tf.transpose(pred,[1,2,0]))
pred = tf.reshape(pred, [-1,n_input,n_input,n_classes])
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(pred, y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
# Initializing the variables
init = tf.initialize_all_variables()
saver = tf.train.Saver()
def convert_to_2_channel(x, batch_size):
#assume input has dimension (batch_size,x,y)
#output will have dimension (batch_size,x,y,2)
output = np.empty((batch_size, 200, 200, 2))
temp_arr1 = np.empty((batch_size, 200, 200))
temp_arr2 = np.empty((batch_size, 200, 200))
for i in xrange(batch_size):
for j in xrange(200):
for k in xrange(200):
if x[i][j][k] == 1:
temp_arr1[i][j][k] = 1
temp_arr2[i][j][k] = 0
else:
temp_arr1[i][j][k] = 0
temp_arr2[i][j][k] = 1
for i in xrange(batch_size):
for j in xrange(200):
for k in xrange(200):
for l in xrange(2):
if l == 0:
output[i][j][k][l] = temp_arr1[i][j][k]
else:
output[i][j][k][l] = temp_arr2[i][j][k]
return output
# Launch the graph
with tf.Session() as sess:
sess.run(init)
summary = tf.train.SummaryWriter('/tmp/logdir/', sess.graph_def)
step = 1
from tensorflow.contrib.learn.python.learn.datasets.scroll import scroll_data
data = scroll_data.read_data('/home/kendall/Desktop/')
# Keep training until reach max iterations
flag = 0
# while flag == 0:
while step * batch_size < training_iters:
batch_y, batch_x = data.train.next_batch(batch_size)
# pdb.set_trace()
# batch_x = batch_x.reshape((batch_size, n_input))
batch_x = batch_x.reshape((batch_size, n_input, n_input))
batch_y = batch_y.reshape((batch_size, n_input, n_input))
batch_y = convert_to_2_channel(batch_y, batch_size)
# batch_y = batch_y.reshape((batch_size, n_output, n_classes))
batch_y = batch_y.reshape((batch_size, 200, 200, n_classes))
sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
keep_prob: dropout})
if step % display_step == 0:
flag = 1
# Calculate batch loss and accuracy
loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
y: batch_y,
keep_prob: 1.})
print "Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
"{:.6f}".format(loss) + ", Training Accuracy= " + \
"{:.5f}".format(acc)
step += 1
print "Optimization Finished!"
save_path = "model.ckpt"
saver.save(sess, save_path)
im = Image.open('/home/kendall/Desktop/HA900_frames/frame0635.tif')
batch_x = np.array(im)
pdb.set_trace()
batch_x = batch_x.reshape((1, n_input, n_input))
batch_x = batch_x.astype(float)
# pdb.set_trace()
prediction = sess.run(pred, feed_dict={x: batch_x, keep_prob: 1.})
print prediction
arr1 = np.empty((n_input,n_input))
arr2 = np.empty((n_input,n_input))
for i in xrange(n_input):
for j in xrange(n_input):
for k in xrange(2):
if k == 0:
arr1[i][j] = prediction[0][i][j][k]
else:
arr2[i][j] = prediction[0][i][j][k]
# prediction = np.asarray(prediction)
# prediction = np.reshape(prediction, (200,200))
# np.savetxt("prediction.csv", prediction, delimiter=",")
np.savetxt("prediction1.csv", arr1, delimiter=",")
np.savetxt("prediction2.csv", arr2, delimiter=",")
# Calculate accuracy for 256 mnist test images
print "Testing Accuracy:", \
sess.run(accuracy, feed_dict={x: data.test.images[:256],
y: data.test.labels[:256],
keep_prob: 1.})
Errors in the code
There are multiple errors in your code:
You shouldn't call tf.nn.sigmoid_cross_entropy_with_logits with the output of a softmax layer, but with the unscaled logits:
WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.
In fact, since you have 2 classes, you should use a softmax-based loss, tf.nn.softmax_cross_entropy_with_logits.
When using tf.argmax(pred, 1), you only apply argmax over axis 1, which is the height of the output image. You should use tf.argmax(pred, 3) on the last axis (of size 2).
This might explain why you get 0.99 accuracy: on the output image, the argmax is taken over the height of the image, and it returns 0 by default (since all the compared values are equal for each channel). Both fixes are sketched below.
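A minimal sketch of both fixes (pred_logits here stands for the raw network output of shape [batch, 200, 200, 2] before any softmax; the name is illustrative, not from your code):
xent = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=pred_logits)   # softmax over the last (class) axis
cost = tf.reduce_mean(xent)
correct_pred = tf.equal(tf.argmax(pred_logits, 3), tf.argmax(y, 3))            # per-pixel class comparison
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))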
Wrong model
The biggest drawback is that your model in general will be very hard to optimize.
You have a softmax over 40,000 classes, which is huge.
You do not take advantage at all of the fact that you want to output an image (the prediction foreground / background).
For instance, prediction 2,345 is highly correlated with predictions 2,346 and 2,545, but you don't take that into account.
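As a rough illustration only (the feature-map name and layer sizes below are assumptions, not your code), "outputting an image" means keeping the prediction spatially structured, e.g. upsampling the last feature map back to the input resolution and predicting 2 classes per pixel instead of one flat 40,000-way output:
# features: last conv feature map, e.g. shape [batch, 25, 25, 128]
up = tf.layers.conv2d_transpose(features, filters=64, kernel_size=16, strides=8,
                                padding="same", activation=tf.nn.relu)                # -> [batch, 200, 200, 64]
pred_logits = tf.layers.conv2d(up, filters=n_classes, kernel_size=1, padding="same")  # -> [batch, 200, 200, 2] per-pixel logits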
I recommend reading a bit about semantic segmentation first:
this paper: Fully Convolutional Networks for Semantic Segmentation
these slides from CS231n (Stanford): especially the part about upsampling and deconvolution
Recommendations
If you want to work with TensorFlow, you will need to start small. First try a very simple network with maybe 1 hidden layer.
You need to print all the shapes of your tensors to make sure they correspond to what you thought (see the snippet after these recommendations). For instance, if you had printed the shape of tf.argmax(y, 1), you would have realized it is [batch_size, 200, 2] instead of the expected [batch_size, 200, 200].
TensorBoard is your friend, you should try to plot the input image here, as well as your predictions to see what they look like.
Try small, with a very small dataset of 10 images and see if you can overfit it and predict almost the exact response.
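For instance, a quick way to check shapes with the question's own tensors (x, keep_prob, pred) is to print both the static and the runtime shape:
print(pred.get_shape())                                                         # static shape, known at graph-construction time
print(sess.run(tf.shape(pred), feed_dict={x: batch_x, keep_prob: 1.}))          # concrete shape for this particular batch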
To conclude, I am not sure about all of my suggestions, but they are worth trying, and I hope this will help you on the path to success!
