Will unused placeholders in tensorflow affect the result? - python

I had an unused placeholder in my code that was intended for later use. However, I found that adding or removing it affects the cost for each epoch and even the final result (e.g. accuracy, AUC). I have checked that it isn't connected to the computational graph, since the placeholder was added on top of a complete program and appears only once in the code. Since my project is too large to reproduce here, I tried to reproduce the issue using a simple code piece copied from a blog: https://adventuresinmachinelearning.com/python-tensorflow-tutorial/.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
tf.random.set_random_seed(1234)
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Python optimisation variables
learning_rate = 0.5
epochs = 10
batch_size = 100
# declare the training data placeholders
# input x - for 28 x 28 pixels = 784
x = tf.placeholder(tf.float32, [None, 784])
# now declare the output data placeholder - 10 digits
y = tf.placeholder(tf.float32, [None, 10])
# zz = tf.placeholder(tf.float32, name='zz') <--- the added placeholder
# now declare the weights connecting the input to the hidden layer
W1 = tf.Variable(tf.random_normal([784, 300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([300]), name='b1')
# and the weights connecting the hidden layer to the output layer
W2 = tf.Variable(tf.random_normal([300, 10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random_normal([10]), name='b2')
# calculate the output of the hidden layer
hidden_out = tf.add(tf.matmul(x, W1), b1)
hidden_out = tf.nn.relu(hidden_out)
# now calculate the hidden layer output - in this case, let's use a softmax activated
# output layer
y_ = tf.nn.softmax(tf.add(tf.matmul(hidden_out, W2), b2))
y_clipped = tf.clip_by_value(y_, 1e-10, 0.9999999)
cross_entropy = -tf.reduce_mean(tf.reduce_sum(y * tf.log(y_clipped)
+ (1 - y) * tf.log(1 - y_clipped), axis=1))
# add an optimiser
optimiser = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cross_entropy)
# finally setup the initialisation operator
init_op = tf.global_variables_initializer()
# define an accuracy assessment operation
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# start the session
with tf.Session() as sess:
    # initialise the variables
    sess.run(init_op)
    total_batch = int(len(mnist.train.labels) / batch_size)
    for epoch in range(epochs):
        avg_cost = 0
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size=batch_size)
            _, c = sess.run([optimiser, cross_entropy],
                            feed_dict={x: batch_x, y: batch_y})
            avg_cost += c / total_batch
        print("Epoch:", (epoch + 1), "cost =", "{:.3f}".format(avg_cost))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))
I used tf.random.set_random_seed(1234) to make the random initialization of each run the same (I'm not sure it fully does that, but the results are consistent across multiple runs).
With the placeholder added, the cost for epoch 1 is 0.000; with it removed, the cost is 0.001.
Although in this simple example the final accuracy stays the same (accuracy=0.9783), in my project this change matters a lot (adding the placeholder actually improves the AUC for a binary classification problem). And if I change the type of the placeholder, e.g. from placeholder to sparse_placeholder, the cost changes as well (accuracy=0.9772 in this example). So I am wondering why an unused placeholder contributes to the result at all, and how I can debug such a problem.
Thanks in advance! Anyone can copy-paste the code, run it with python <filename>.py after installing TensorFlow 1.12+, and observe the differences by commenting or uncommenting the placeholder zz.
Plus: I see this post with a similar issue and am reading through it: https://github.com/tensorflow/tensorflow/issues/19171
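One thing worth checking (an assumption, not a confirmed diagnosis): in TF 1.x the graph-level seed is combined with an op-level seed derived from the order in which ops are created, so adding any node, even an unused placeholder, can shift the seeds of the later random initializers and therefore the initial weights and the whole training trajectory. Pinning an explicit seed on each initializer makes the initialization independent of graph construction order; a minimal sketch against the code above:
# Explicit per-op seeds so the initial weights don't depend on how many ops were created before them
W1 = tf.Variable(tf.random_normal([784, 300], stddev=0.03, seed=1), name='W1')
b1 = tf.Variable(tf.random_normal([300], seed=2), name='b1')
W2 = tf.Variable(tf.random_normal([300, 10], stddev=0.03, seed=3), name='W2')
b2 = tf.Variable(tf.random_normal([10], seed=4), name='b2')
If the results stop changing when the placeholder is toggled, the difference was only in the random initialization, not in the graph itself.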

Related

Weird Error with TensorFlow 2.0 Being Incompatible with TensorFlow 1.0

I am testing some TensorFlow code; I'm seeing this error:
AttributeError: module 'tensorflow' has no attribute 'variable_scope'
I am running TensorFlow version 2.1.0.
Here is the code that I am testing.
# imports
import os
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# Input data:
# For this tutorial we use the MNIST dataset. MNIST is a dataset of handwritten digits. If you are into machine learning, you might have heard of this dataset by now. MNIST is kind of benchmark of datasets for deep learning. One other reason that we use the MNIST is that it is easily accesible through Tensorflow. If you want to know more about the MNIST dataset you can check Yann Lecun's website. We can easily import the dataset and see the size of training, test and validation set:
# Import MNIST data
# from tensorflow.examples.tutorials.mnist import input_data
#import tensorflow_datasets as tfds
# Construct a tf.data.Dataset
#mnist = tfds.load(name="mnist", split=tfds.Split.TRAIN)
mnist = tf.keras.datasets.mnist
#mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
#print("Size of:")
#print("- Training-set:\t\t{}".format(len(mnist.train.labels)))
#print("- Test-set:\t\t{}".format(len(mnist.test.labels)))
#print("- Validation-set:\t{}".format(len(mnist.validation.labels)))
# hyper-parameters
logs_path = "C:/Users/ryans/MNIST_data/logs/embedding/" # path to the folder that we want to save the logs for Tensorboard
learning_rate = 0.001 # The optimization learning rate
epochs = 10 # Total number of training epochs
batch_size = 100 # Training batch size
display_freq = 100 # Frequency of displaying the training results
# Network Parameters
# We know that MNIST images are 28 pixels in each dimension.
img_h = img_w = 28
# Images are stored in one-dimensional arrays of this length.
img_size_flat = img_h * img_w
# Number of classes, one class for each of 10 digits.
n_classes = 10
# number of units in the first hidden layer
h1 = 200
# Graph:
# Like before, we start by constructing the graph. But, we need to define some functions that we need rapidly in our code.
# weight and bais wrappers
def weight_variable(name, shape):
    """
    Create a weight variable with appropriate initialization
    :param name: weight name
    :param shape: weight shape
    :return: initialized weight variable
    """
    initer = tf.truncated_normal_initializer(stddev=0.01)
    return tf.get_variable('W_' + name,
                           dtype=tf.float32,
                           shape=shape,
                           initializer=initer)
def bias_variable(name, shape):
    """
    Create a bias variable with appropriate initialization
    :param name: bias variable name
    :param shape: bias variable shape
    :return: initialized bias variable
    """
    initial = tf.constant(0., shape=shape, dtype=tf.float32)
    return tf.get_variable('b_' + name,
                           dtype=tf.float32,
                           initializer=initial)
def fc_layer(x, num_units, name, use_relu=True):
    """
    Create a fully-connected layer
    :param x: input from previous layer
    :param num_units: number of hidden units in the fully-connected layer
    :param name: layer name
    :param use_relu: boolean to add ReLU non-linearity (or not)
    :return: The output array
    """
    with tf.variable_scope(name):
        in_dim = x.get_shape()[1]
        W = weight_variable(name, shape=[in_dim, num_units])
        tf.summary.histogram('W', W)
        b = bias_variable(name, [num_units])
        tf.summary.histogram('b', b)
        layer = tf.matmul(x, W)
        layer += b
        if use_relu:
            layer = tf.nn.relu(layer)
        return layer
# Now that we have our helper functions we can create our graph.
# Create graph
# Placeholders for inputs (x), outputs(y)
with tf.compat.v1.variable_scope('Input'):
    x = tf.compat.v1.placeholder(tf.float32, shape=[None, img_size_flat], name='X')
    tf.summary.image('input_image', tf.reshape(x, (-1, img_w, img_h, 1)), max_outputs=5)
    y = tf.compat.v1.placeholder(tf.float32, shape=[None, n_classes], name='Y')
fc1 = fc_layer(x, h1, 'Hidden_layer', use_relu=True)
output_logits = fc_layer(fc1, n_classes, 'Output_layer', use_relu=False)
# Define the loss function, optimizer, and accuracy
with tf.compat.v1.variable_scope('Train'):
    with tf.compat.v1.variable_scope('Loss'):
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=output_logits), name='loss')
    tf.summary.scalar('loss', loss)
    with tf.compat.v1.variable_scope('Optimizer'):
        optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, name='Adam-op').minimize(loss)
    with tf.compat.v1.variable_scope('Accuracy'):
        correct_prediction = tf.equal(tf.argmax(output_logits, 1), tf.argmax(y, 1), name='correct_pred')
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32), name='accuracy')
    tf.summary.scalar('accuracy', accuracy)
    # Network predictions
    cls_prediction = tf.argmax(output_logits, axis=1, name='predictions')
# Initializing the variables
init = tf.global_variables_initializer()
merged = tf.summary.merge_all()
# Session:
# Launch the graph (session)
sess = tf.InteractiveSession() # using InteractiveSession instead of Session to test network in separate cell
sess.run(init)
train_writer = tf.summary.FileWriter(logs_path, sess.graph)
num_tr_iter = int(mnist.train.num_examples / batch_size)
global_step = 0
for epoch in range(epochs):
    print('Training epoch: {}'.format(epoch + 1))
    for iteration in range(num_tr_iter):
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        global_step += 1
        # Run optimization op (backprop)
        feed_dict_batch = {x: batch_x, y: batch_y}
        _, summary_tr = sess.run([optimizer, merged], feed_dict=feed_dict_batch)
        train_writer.add_summary(summary_tr, global_step)
        if iteration % display_freq == 0:
            # Calculate and display the batch loss and accuracy
            loss_batch, acc_batch = sess.run([loss, accuracy],
                                             feed_dict=feed_dict_batch)
            print("iter {0:3d}:\t Loss={1:.2f},\tTraining Accuracy={2:.01%}".
                  format(iteration, loss_batch, acc_batch))
    # Run validation after every epoch
    feed_dict_valid = {x: mnist.validation.images, y: mnist.validation.labels}
    loss_valid, acc_valid = sess.run([loss, accuracy], feed_dict=feed_dict_valid)
    print('---------------------------------------------------------')
    print("Epoch: {0}, validation loss: {1:.2f}, validation accuracy: {2:.01%}".
          format(epoch + 1, loss_valid, acc_valid))
    print('---------------------------------------------------------')
I think the code was designed for an earlier version of TensorFlow. I made a few small modifications to get the code to run on my laptop. Here's the part that I am struggling with.
# Placeholders for inputs (x), outputs(y)
with tf.compat.v1.variable_scope('Input'):
    x = tf.compat.v1.placeholder(tf.float32, shape=[None, img_size_flat], name='X')
    tf.summary.image('input_image', tf.reshape(x, (-1, img_w, img_h, 1)), max_outputs=5)
    y = tf.compat.v1.placeholder(tf.float32, shape=[None, n_classes], name='Y')
fc1 = fc_layer(x, h1, 'Hidden_layer', use_relu=True)
output_logits = fc_layer(fc1, n_classes, 'Output_layer', use_relu=False)
The 'with' statement runs, but I am getting an error on this line:
fc1 = fc_layer(x, h1, 'Hidden_layer', use_relu=True)
I thought the change to 'tf.compat.v1' would overcome the issue of different TensorFlow versions, but apparently not.
I found the code sample here.
https://www.easy-tensorflow.com/tf-tutorials/tensorboard/tb-embedding-visualization
Since placeholder has been removed from TensorFlow 2.0, compat.v1 must be used. However, there is another incompatibility, which can be solved by calling tf.compat.v1.disable_eager_execution() before with tf.compat.v1.variable_scope(...):
You can turn eager execution back on later by calling tf.compat.v1.enable_eager_execution().
You may check https://www.tensorflow.org/guide/migrate
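The migration guide linked above also describes a shortcut that avoids editing every v1-only call (such as tf.variable_scope and tf.get_variable inside fc_layer): import the v1 compatibility module as tf and disable the v2 behaviour, which includes disabling eager execution. A minimal sketch, assuming TensorFlow 2.x is installed:
# Sketch of the migration-guide shortcut
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()  # restores graph mode and disables eager execution

# the tutorial's v1-only calls then resolve without further edits, e.g.
with tf.variable_scope('Hidden_layer'):
    W = tf.get_variable('W', shape=[784, 200],
                        initializer=tf.truncated_normal_initializer(stddev=0.01))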

Neural Network with Tensorflow doesn't update weights/bias

Problem
I'm trying to classify some 64x64 images as a black-box exercise. The NN I have written doesn't change my weights. This is my first time writing something like this; the same code works just fine on MNIST letters input, but on this task it does not train like it should:
import tensorflow as tf
import numpy as np
path = ""
# x is a holder for the 64x64 image
x = tf.placeholder(tf.float32, shape=[None, 4096])
# y_ is a 1 element vector, containing the predicted probability of the label
y_ = tf.placeholder(tf.float32, [None, 1])
# define weights and balances
W = tf.Variable(tf.zeros([4096, 1]))
b = tf.Variable(tf.zeros([1]))
# define our model
y = tf.nn.softmax(tf.matmul(x, W) + b)
# loss is cross entropy
cross_entropy = tf.reduce_mean(
tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
# each training step in gradient decent we want to minimize cross entropy
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
train_labels = np.reshape(np.genfromtxt(path + "train_labels.csv", delimiter=',', skip_header=1), (14999, 1))
train_data = np.genfromtxt(path + "train_samples.csv", delimiter=',', skip_header=1)
# perform 150 training steps with each taking 100 train data
for i in range(0, 15000, 100):
    sess.run(train_step, feed_dict={x: train_data[i:i+100], y_: train_labels[i:i+100]})
    if i % 500 == 0:
        print(sess.run(cross_entropy, feed_dict={x: train_data[i:i+100], y_: train_labels[i:i+100]}))
        print(sess.run(b), sess.run(W))
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.close()
How do I solve this problem?
The key to the problem is that the class number of your outputs y_ and y is 1. You should adopt one-hot mode when you use tf.nn.softmax_cross_entropy_with_logits on classification problems in TensorFlow. tf.nn.softmax_cross_entropy_with_logits will first compute tf.nn.softmax. When your class number is 1, the results are all the same. For example:
import tensorflow as tf
y = tf.constant([[1],[0],[1]],dtype=tf.float32)
y_ = tf.constant([[1],[2],[3]],dtype=tf.float32)
softmax_var = tf.nn.softmax(logits=y_)
cross_entropy = tf.multiply(y, tf.log(softmax_var))
errors = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)
with tf.Session() as sess:
    print(sess.run(softmax_var))
    print(sess.run(cross_entropy))
    print(sess.run(errors))
[[1.]
[1.]
[1.]]
[[0.]
[0.]
[0.]]
[0. 0. 0.]
This means that no matter what your output y_ is, your loss will be zero, so your weights and biases are never updated.
The solution is to modify the class number of y_ and y.
Suppose your class number is n.
First approach: convert the labels to one-hot before feeding them, then use the following code.
y_ = tf.placeholder(tf.float32, [None, n])
W = tf.Variable(tf.zeros([4096, n]))
b = tf.Variable(tf.zeros([n]))
Second approach: convert the labels to one-hot inside the graph, after feeding them.
y_in = tf.placeholder(tf.int32, [None, 1])   # feed the integer labels here; their dtype needs to be tf.int32
y_ = tf.one_hot(tf.reshape(y_in, [-1]), n)   # one-hot encode inside the graph, keeping a separate handle to the placeholder
W = tf.Variable(tf.zeros([4096, n]))
b = tf.Variable(tf.zeros([n]))
All your initial weights are zeros. When you initialize them that way, the NN doesn't learn well. You need to initialize the weights with small random values.
See Why should weights of Neural Networks be initialized to random numbers?
"Why Not Set Weights to Zero?
We can use the same set of weights each time we train the network; for example, you could use the values of 0.0 for all weights.
In this case, the equations of the learning algorithm would fail to make any changes to the network weights, and the model will be stuck. It is important to note that the bias weight in each neuron is set to zero by default, not a small random value.
"
See
https://machinelearningmastery.com/why-initialize-a-neural-network-with-random-weights/
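A minimal sketch of the suggested initialization change, reusing the shapes from the question's code (the stddev value and the class count n are assumptions):
n = 2  # assumed number of classes after switching to one-hot labels
W = tf.Variable(tf.truncated_normal([4096, n], stddev=0.01))  # small random weights instead of zeros
b = tf.Variable(tf.zeros([n]))  # biases can stay at zero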

Constructing a Neural Network for Multiple Outputs

My input data is as follows:
AT V AP RH PE
14.96 41.76 1024.07 73.17 463.26
25.18 62.96 1020.04 59.08 444.37
5.11 39.4 1012.16 92.14 488.56
20.86 57.32 1010.24 76.64 446.48
10.82 37.5 1009.23 96.62 473.9
26.27 59.44 1012.23 58.77 443.67
15.89 43.96 1014.02 75.24 467.35
9.48 44.71 1019.12 66.43 478.42
14.64 45 1021.78 41.25 475.98
....................................
I am basically working in Python using the TensorFlow library.
As of now, I have a linear model which is working fine for 4 inputs and 1 output. This is basically a regression problem.
For example, after training my neural network with sufficient data (say a dataset of around 10000 samples), if I pass the values 45, 30, 25, 32 as inputs while testing, it returns the value 46 as output.
I basically have two queries:
1. As of now, in my code, I am using the parameters training_epochs, learning_rate, etc. I am currently giving training_epochs the value 10000. When I test my neural network by passing four input values, I get an output of about 471.25, while I expect it to be 460. But if I give training_epochs the value 20000 instead of 10000, I get an output of 120.5, which is not at all close to the actual value of 460. Can you please explain how one can choose the values of training_epochs and learning_rate (or any other parameter values) in my code so that I can get good accuracy?
2. The second issue is that my neural network, as of now, works only for linear data and only for 1 output. If I want to have 3 inputs and 2 outputs, and also a non-linear model, what are the possible changes I can make in my code?
I am posting my code below:
import tensorflow as tf
import numpy as np
import pandas as pd
#import matplotlib.pyplot as plt
rng = np.random
# In[180]:
# Parameters
learning_rate = 0.01
training_epochs = 10000
display_step = 1000
# In[171]:
# Read data from CSV
df = pd.read_csv("H:\MiniThessis\Sample.csv")
# In[173]:
# Seperating out dependent & independent variable
train_x = df[['AT','V','AP','RH']]
train_y = df[['PE']]
trainx = train_x.as_matrix().astype(np.float32)
trainy = train_y.as_matrix().astype(np.float32)
# In[174]:
n_input = 4
n_classes = 1
n_hidden_1 = 5
n_samples = 9569
# tf Graph Input
#Inserts a placeholder for a tensor that will be always fed.
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
# Set model weights
W_h1 = tf.Variable(tf.random_normal([n_input, n_hidden_1]))
W_out = tf.Variable(tf.random_normal([n_hidden_1, n_classes]))
b_h1 = tf.Variable(tf.random_normal([n_hidden_1]))
b_out = tf.Variable(tf.random_normal([n_classes]))
# In[175]:
# Construct a linear model
layer_1 = tf.matmul(x, W_h1) + b_h1
layer_1 = tf.nn.relu(layer_1)
out_layer = tf.matmul(layer_1, W_out) + b_out
# In[176]:
# Mean squared error
cost = tf.reduce_sum(tf.pow(out_layer-y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
# In[177]:
# Initializing the variables
init = tf.global_variables_initializer()
# In[181]:
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Fit all training data
    for epoch in range(training_epochs):
        _, c = sess.run([optimizer, cost], feed_dict={x: trainx, y: trainy})
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c))
    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={x: trainx, y: trainy})
    print(training_cost)
    correct_prediction = tf.equal(tf.argmax(out_layer, 1), tf.argmax(y, 1))
    best = sess.run([out_layer], feed_dict={x: np.array([[14.96, 41.76, 1024.07, 73.17]])})
    print(correct_prediction)
    print(best)
1. You can adjust the following lines:
# In general, biases are initialized either as zeros or as a non-zero constant, but not from a Gaussian
b_h1 = tf.Variable(tf.zeros([n_hidden_1]))
b_out = tf.Variable(tf.zeros([n_classes]))
# MSE error
cost = tf.reduce_mean(tf.pow(out_layer-y, 2))/(2*n_samples)
Also, feed the data as mini-batches; the optimizer you are using is tuned for mini-batch optimization, and feeding the whole dataset at once doesn't give optimal performance.
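A minimal sketch of mini-batch feeding, assuming trainx and trainy as defined in the question, a session and optimizer as in the question's code, and a hypothetical batch size of 100:
batch_size = 100  # assumed value; tune as needed
n_batches = int(np.ceil(len(trainx) / batch_size))
for epoch in range(training_epochs):
    perm = np.random.permutation(len(trainx))  # reshuffle the samples each epoch
    for i in range(n_batches):
        idx = perm[i * batch_size:(i + 1) * batch_size]
        sess.run(optimizer, feed_dict={x: trainx[idx], y: trainy[idx]})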
2. For multiple outputs you need to change only n_classes and the cost function (tf.nn.softmax_cross_entropy_with_logits). Also, the model you defined here isn't linear, as you are using the non-linear activation function tf.nn.relu.
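A sketch of the shape changes for the 3-input, 2-output case mentioned in the question, reusing names from the question's code (kept with an MSE cost for the regression case; a cross-entropy cost as mentioned above would apply instead if the two outputs are class labels):
# Sketch: 3 inputs, 2 outputs, non-linear hidden layer (the shapes are the only change)
n_input = 3
n_classes = 2
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
W_h1 = tf.Variable(tf.random_normal([n_input, n_hidden_1]))
b_h1 = tf.Variable(tf.zeros([n_hidden_1]))
W_out = tf.Variable(tf.random_normal([n_hidden_1, n_classes]))
b_out = tf.Variable(tf.zeros([n_classes]))
layer_1 = tf.nn.relu(tf.matmul(x, W_h1) + b_h1)
out_layer = tf.matmul(layer_1, W_out) + b_out
cost = tf.reduce_mean(tf.pow(out_layer - y, 2))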

Simple Tensorflow Multilayer Neural Network Not Learning

I am trying to write a two layer neural network to train a class labeler. The input to the network is a 150-feature list of about 1000 examples; all features on all examples have been L2 normalized.
I only have two outputs, and they should be disjoint--I am just attempting to predict whether the example is a one or a zero.
My code is relatively simple; I am feeding the input data into the hidden layer, and then the hidden layer into the output. As I really just want to see this working in action, I am training on the entire data set with each step.
My code is below. Based on the other NN implementations I have referred to, I believe that the performance of this network should be improving over time. However, regardless of the number of epochs I set, I am getting back an accuracy of about ~20%. The accuracy does not change when the number of steps is changed, so I don't believe that my weights and biases are being updated.
Is there something obvious I am missing with my model? Thanks!
import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()
# generate data
np.random.seed(10)
inputs = np.random.normal(size=[1000,150]).astype('float32')*1.5
label = np.round(np.random.uniform(low=0,high=1,size=[1000,1])*0.8)
reverse_label = 1-label
labels = np.append(label,reverse_label,1)
# parameters
learn_rate = 0.01
epochs = 200
n_input = 150
n_hidden = 75
n_output = 2
# set weights/biases
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_output])
b0 = tf.Variable(tf.truncated_normal([n_hidden]))
b1 = tf.Variable(tf.truncated_normal([n_output]))
w0 = tf.Variable(tf.truncated_normal([n_input,n_hidden]))
w1 = tf.Variable(tf.truncated_normal([n_hidden,n_output]))
# step function
def returnPred(x, w0, w1, b0, b1):
    z1 = tf.add(tf.matmul(x, w0), b0)
    a2 = tf.nn.relu(z1)
    z2 = tf.add(tf.matmul(a2, w1), b1)
    h = tf.nn.relu(z2)
    return h  # return the first response vector from the
y_ = returnPred(x,w0,w1,b0,b1) # predict operation
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=y_,labels=y) # calculate loss between prediction and actual
model = tf.train.GradientDescentOptimizer(learning_rate=learn_rate).minimize(loss) # apply gradient descent based on loss
init = tf.global_variables_initializer()
tf.Session = sess
sess.run(init) #initialize graph
for step in range(0, epochs):
    sess.run(model, feed_dict={x: inputs, y: labels})  # train model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: inputs, y: labels})) # print accuracy
I changed your optimizer to AdamOptimizer (in many cases it performs better than GradientDescentOptimizer).
I also played a bit with the parameters. In particular, I used a smaller std for your variable initialization, decreased the learning rate (as your loss was unstable and "jumped around"), and increased the number of epochs (as I noticed that your loss continues to decrease).
I also reduced the size of the hidden layer. It is harder to train networks with a large hidden layer when you don't have that much data.
Regarding your loss, it is better to apply tf.reduce_mean on it so that the loss is a scalar. In addition, following the answer of ml4294, I used softmax instead of sigmoid, so the loss looks like:
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_,labels=y))
The code below achieves accuracy of around 99.9% on the training data:
import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()
# generate data
np.random.seed(10)
inputs = np.random.normal(size=[1000,150]).astype('float32')*1.5
label = np.round(np.random.uniform(low=0,high=1,size=[1000,1])*0.8)
reverse_label = 1-label
labels = np.append(label,reverse_label,1)
# parameters
learn_rate = 0.002
epochs = 400
n_input = 150
n_hidden = 60
n_output = 2
# set weights/biases
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_output])
b0 = tf.Variable(tf.truncated_normal([n_hidden],stddev=0.2,seed=0))
b1 = tf.Variable(tf.truncated_normal([n_output],stddev=0.2,seed=0))
w0 = tf.Variable(tf.truncated_normal([n_input,n_hidden],stddev=0.2,seed=0))
w1 = tf.Variable(tf.truncated_normal([n_hidden,n_output],stddev=0.2,seed=0))
# step function
def returnPred(x, w0, w1, b0, b1):
    z1 = tf.add(tf.matmul(x, w0), b0)
    a2 = tf.nn.relu(z1)
    z2 = tf.add(tf.matmul(a2, w1), b1)
    h = tf.nn.relu(z2)
    return h  # return the first response vector from the
y_ = returnPred(x,w0,w1,b0,b1) # predict operation
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_,labels=y)) # calculate loss between prediction and actual
model = tf.train.AdamOptimizer(learning_rate=learn_rate).minimize(loss) # apply gradient descent based on loss
init = tf.global_variables_initializer()
tf.Session = sess
sess.run(init) #initialize graph
for step in range(0, epochs):
    sess.run([model, loss], feed_dict={x: inputs, y: labels})  # train model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: inputs, y: labels})) # print accuracy
Just a suggestion in addition to the answer provided by Miriam Farber:
You use a multi-dimensional output label ([0., 1.]) for the classification. I suggest using the softmax cross entropy tf.nn.softmax_cross_entropy_with_logits() instead of the sigmoid cross entropy, since you assume the outputs to be disjoint (see softmax on Wikipedia). I achieved much faster convergence with this small modification.
This should also improve your performance once you decide to increase your output dimensionality from 2 to a higher number.
I guess you have some problem here:
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=y_,labels=y) # calculate loss between prediction and actual
It should look something like this:
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_, labels=y))
I didn't look at your code much, so if this doesn't work out, you can check the Udacity deep learning course or forum; they have good samples of what you are trying to do.
GL

How to use a trained model on different inputs

I implemented a relatively straightforward logistic regression function. I save all the necessary variables such as weights, bias, x, y, etc. and then I run the training algorithm...
# launch the graph
with tf.Session() as sess:
    sess.run(init)
    # training cycle
    for epoch in range(FLAGS.training_epochs):
        avg_cost = 0
        total_batch = int(mnist.train.num_examples / FLAGS.batch_size)
        # loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
            # compute average loss
            avg_cost += c / total_batch
        # display logs per epoch step
        if (epoch + 1) % FLAGS.display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(avg_cost))
    save_path = saver.save(sess, "/tmp/model.ckpt")
The model is saved and the prediction and accuracy of the trained model is displayed...
# list of booleans to determine the correct predictions
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
print(correct_prediction.eval({x:mnist.test.images, y:mnist.test.labels}))
# calculate total accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
This is all fine and dandy. However, now I want to be able to predict any given image using the trained model. For example, I want to feed it picture of say 7 and see what it predicts it to be.
I have another module that restores the model. First we load the variables...
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes
# set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax
# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)
# initializing the variables
init = tf.global_variables_initializer()
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "/tmp/model.ckpt")
This is good. Now I want to compare one image to the model and get a prediction. In this example, I take the first image from the test dataset mnist.test.images[0] and I attempt to compare it to the model.
classification = sess.run(tf.argmax(pred, 1), feed_dict={x: mnist.test.images[0]})
print(classification)
I know this will not work. I get the error...
ValueError: Cannot feed value of shape (784,) for Tensor 'Placeholder:0', which has shape '(?, 784)'
I am at a loss for ideas. This question is rather long, if a straightforward answer is not possible, some guidance as to the steps I may take to do this is appreciated.
Your input placeholder must be of size (?, 784), the question mark meaning variable size which is probably the batch size. You are feeding an input of size (784,) which does not work as the error message states.
In your case, during prediction time, the batch size is just 1, so the following should work:
import numpy as np
...
x_in = np.expand_dims(mnist.test.images[0], axis=0)
classification = sess.run(tf.argmax(pred, 1), feed_dict={x:x_in})
This assumes that the input image is available as a numpy array. If it is already a tensor, the corresponding function is tf.expand_dims(..).
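An equivalent alternative, assuming the image is a flat numpy array of length 784, is to reshape it into a batch of one:
x_in = mnist.test.images[0].reshape(1, 784)  # shape (1, 784) matches the (?, 784) placeholder
classification = sess.run(tf.argmax(pred, 1), feed_dict={x: x_in})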
