Why is my graph a crazy flickering monster?

I am trying to create a graph showing the correlation between mini batch accuracy and validation accuracy of a neural net.
But instead, I have a crazy graph that is flickering at a super high frequency and is zoomed in on a very small portion of the graph.
Here is my code:
num_nodes=1024
batch_size = 128
beta = 0.01
def animate(i):
    graph_data = open('NeuralNetData.txt','r').read()
    lines = graph_data.split('\n')
    xs = []
    ys = []
    for line in lines:
        if len(line) > 1:
            x, y = line.split(',')
            xs.append(x)
            ys.append(y)
    ax1.clear()
    ax1.plot(xs, ys,label='validation accuracy')
    ax1.legend(loc='lower right')
    ax1.set_ylabel("Accuracy(%)", fontsize=15)
    ax1.set_xlabel("Images Seen", fontsize=15)
    ax1.set_title("Neural Network Accuracy Data\nStochastic Gradient Descent", fontsize=10)
    plt.show()
def animate2(i):
    graph_data = open('NeuralNetData2.txt','r').read()
    lines = graph_data.split('\n')
    xs = []
    ys = []
    for line in lines:
        if len(line) > 1:
            x, y = line.split(',')
            xs.append(x)
            ys.append(y)
    ax1.plot(xs, ys, label='mini-batch accuracy')
    ax1.legend(loc='lower right')
    plt.tight_layout()
    plt.show()
style.use('fivethirtyeight')
#Creating Graph
fig = plt.figure(figsize=(50,50))
ax1 = fig.add_subplot(1,1,1)
#1 hidden layer using RELUs and trying regularization techniques
with graph.as_default():
    # Input data. For the training data, we use a placeholder that will be fed
    # at run time with a training minibatch.
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    # Variables.
    weights_1 = tf.Variable(tf.truncated_normal([image_size * image_size, num_nodes]))
    biases_1 = tf.Variable(tf.zeros([num_nodes]))
    weights_2 = tf.Variable(tf.truncated_normal([num_nodes, num_labels]))
    biases_2 = tf.Variable(tf.zeros([num_labels]))
    # Training computation.
    logits_1 = tf.matmul(tf_train_dataset, weights_1) + biases_1
    relu_layer= tf.nn.relu(logits_1)
    logits_2 = tf.matmul(relu_layer, weights_2) + biases_2
    # Normal loss function
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits_2, labels=tf_train_labels))
    # Loss function with L2 Regularization with beta=0.01
    regularizers = tf.nn.l2_loss(weights_1) + tf.nn.l2_loss(weights_2)
    loss = tf.reduce_mean(loss + beta * regularizers)
    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
    # Predictions for the training
    train_prediction = tf.nn.softmax(logits_2)
    # Predictions for validation
    logits_1 = tf.matmul(tf_valid_dataset, weights_1) + biases_1
    relu_layer= tf.nn.relu(logits_1)
    logits_2 = tf.matmul(relu_layer, weights_2) + biases_2
    valid_prediction = tf.nn.softmax(logits_2)
    # Predictions for test
    logits_1 = tf.matmul(tf_test_dataset, weights_1) + biases_1
    relu_layer= tf.nn.relu(logits_1)
    logits_2 = tf.matmul(relu_layer, weights_2) + biases_2
    test_prediction = tf.nn.softmax(logits_2)
num_steps = 3001
open("NeuralNetData.txt","w").close()
open("NeuralNetData.txt","a+")
open("NeuralNetData2.txt","w+").close()
open("NeuralNetData2.txt","a+")
with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print("Initialized")
    for step in range(num_steps):
        f= open("NeuralNetData.txt","a")
        t= open("NeuralNetData2.txt","a")
        # Pick an offset within the training data, which has been randomized.
        # Note: we could use better randomization across epochs.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        images_seen = step* batch_size
        # Generate a minibatch.
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        # Prepare a dictionary telling the session where to feed the minibatch.
        # The key of the dictionary is the placeholder node of the graph to be fed,
        # and the value is the numpy array to feed to it.
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (images_seen % 1000 == 0):
            print("Minibatch loss at step {}: {}".format(step, l))
            print("Minibatch accuracy: {:.1f}".format(accuracy(predictions, batch_labels)))
            print("Validation accuracy: {:.1f}".format(accuracy(valid_prediction.eval(), valid_labels)))
        x=str(images_seen)
        y=str(accuracy(valid_prediction.eval(), valid_labels))
        f.write(x+','+y+'\n')
        f.close()
        r=str(accuracy(predictions, batch_labels))
        t.write(x+','+r+'\n')
        t.close()
        ani = animation.FuncAnimation(fig, animate, interval=1000)
        ani2 = animation.FuncAnimation(fig, animate2, interval=1000)
    print("Test accuracy: {:.1f}".format(accuracy(test_prediction.eval(), test_labels)))

First, don't call plt.show() inside an update function that FuncAnimation calls. It should be called exactly once, at the end of the script.
Second, you are using two different FuncAnimations that work on the same axes (ax1), and one of them clears those axes. So the plot may be updated by one function while it is being cleared by the other; the outcome is probably close to chaos.
Third, because the FuncAnimations are created inside the training loop, you end up with 6002 of them (two for each of the 3001 steps) instead of only one or two. Each of them operates on the same axes. So if the above already produced chaos, this will produce 6002 times the chaos.
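A minimal sketch of how the three points could be combined, assuming the two text files from the question keep their "x,y" line format. The names read_series and start_live_plot are made up for this illustration and are not part of matplotlib. Converting the values to float is my own addition: as far as I know, recent matplotlib versions treat strings as categories, which may explain the odd zoom.

```python
import matplotlib.pyplot as plt
from matplotlib import animation


def read_series(path):
    """Parse 'x,y' lines into two lists of floats, skipping blank lines."""
    xs, ys = [], []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            x, y = line.split(",")
            xs.append(float(x))
            ys.append(float(y))
    return xs, ys


def animate(i, ax, sources):
    """Redraw every series in ONE callback, so nothing clears another's line."""
    ax.clear()
    for path, label in sources:
        xs, ys = read_series(path)
        ax.plot(xs, ys, label=label)
    ax.legend(loc="lower right")
    ax.set_xlabel("Images Seen")
    ax.set_ylabel("Accuracy(%)")


def start_live_plot(sources, interval_ms=1000):
    """Create one figure and one FuncAnimation (hypothetical helper)."""
    fig, ax = plt.subplots()
    ani = animation.FuncAnimation(fig, animate, fargs=(ax, sources),
                                  interval=interval_ms)
    # Return the animation so the caller keeps a reference to it.
    return fig, ani
```

Usage would be a single call such as fig, ani = start_live_plot([("NeuralNetData.txt", "validation accuracy"), ("NeuralNetData2.txt", "mini-batch accuracy")]) outside the training loop, followed by exactly one plt.show() at the end of the script.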

Related

Weird decision boundary using neural net in Tensorflow

I have generated a balanced dataset of 4000 examples, 2000 for the negative class and 2000 for the positive one. Then I built a neural net with a single hidden layer of 3 neurons with a ReLU activation function and an output layer with a sigmoid. The cost function is a standard cross-entropy function and I chose Adam as the optimizer. Using minibatches of 15 examples, after 1000 epochs the final accuracy is 96.37%, so I am assuming that the model is doing well on the test set. But when I display the decision boundary, this is what I get:
I cannot figure out whether the problem is a code error or the model just needs more training. This is the script I'm using:
# implement a neural network that finds a decision boundary under a constraint on the second hidden layer with tensorflow
import numpy as np
from sklearn.utils import shuffle
from sklearn.preprocessing import normalize
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tf_utils import random_mini_batches
import matplotlib.pyplot as plt
def generate_dataset():
    np.random.seed(2)
    # positive class samples
    d1_x = np.random.normal(5, 10, 1000)
    d1_y = np.random.normal(5, 2, 1000)
    d2_x = np.random.normal(40, 20, 1000)
    d2_y = np.random.normal(2, 1, 1000)
    # negative class samples
    d3_x = np.random.normal(60, 5, 2000)
    d3_y = np.random.normal(10, 1, 2000)
    plt.scatter(d1_x, d1_y, color='b')
    plt.scatter(d2_x, d2_y, color='b')
    plt.scatter(d3_x, d3_y, color='r')
    Y = np.zeros((4000, 1))
    d_x = np.concatenate([d1_x, d2_x, d3_x])
    d_y = np.concatenate([d1_y, d2_y, d3_y])
    d_x = d_x.reshape(d_x.shape[0], 1)
    d_y = d_y.reshape(d_y.shape[0], 1)
    X = np.concatenate([d_x, d_y], axis=1)
    Y[2000:] = 1
    return X, Y
# define a tensorflow model 5-3-1 with two hidden layers and the output being scalar
costs = []
print_cost = True
learning_rate = .0009
minibatch_size = 15
num_epochs = 1000
XX, YY = generate_dataset()
XX, YY = shuffle(XX, YY)
X_norm = normalize(XX)
X_train, X_test, y_train, y_test = train_test_split(X_norm, YY,
test_size=0.2, random_state=42)
X_train = np.transpose(X_train)
y_train = np.transpose(y_train)
X_test = np.transpose(X_test)
y_test = np.transpose(y_test)
# define train and test sets
m = XX.shape[1] # input dimension
n = YY.shape[1] # output dimension
X = tf.placeholder(tf.float32, shape = [m, None], name = 'X')
y = tf.placeholder(tf.float32, shape = [n, None], name = 'y')
# model parameters
n1 = 3 # output dimension of the first hidden layer
#n2 = 4 # output dimension of the second hidden layer
#n3 = 2
W1 = tf.get_variable("W1", [n1, m],
                     initializer=tf.contrib.layers.xavier_initializer(seed=1))
b1 = tf.get_variable("b1", [n1 ,1], initializer=tf.zeros_initializer)
#W2 = tf.get_variable("W2", [n2, n1],
#                     initializer=tf.contrib.layers.xavier_initializer(seed=1))
#b2 = tf.get_variable("b2", [n2, 1], initializer=tf.zeros_initializer)
#W3 = tf.get_variable("W3", [n3, n2],
#                     initializer=tf.contrib.layers.xavier_initializer(seed=1))
#b3 = tf.get_variable("b3", [n3, 1], initializer=tf.zeros_initializer)
W4 = tf.get_variable("W4", [n, n1],
                     initializer=tf.contrib.layers.xavier_initializer(seed=1))
b4 = tf.get_variable("b4", [n, 1], initializer=tf.zeros_initializer)
# forward propagation
z1 = tf.add(tf.matmul(W1, X), b1)
a1 = tf.nn.relu(z1)
#z2 = tf.add(tf.matmul(W2, a1), b2)
#a2 = tf.nn.relu(z2)
#z3 = tf.add(tf.matmul(W3, a2), b3)
#a3 = tf.nn.relu(z3)
z4 = tf.add(tf.matmul(W4, a1), b4)
pred = tf.nn.sigmoid(z4)
# cost function
cost = tf.reduce_mean(tf.losses.log_loss(labels=y, predictions=pred)) # logit is the probability estimate given by the model --> this is what is used inside the formula, not the net input z
# ADAM optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# metrics
correct_prediction = tf.less_equal(tf.abs(pred - y), 0.5)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
init = tf.global_variables_initializer()
with tf.Session() as sess:
    seed = 1
    sess.run(init)
    for epoch in range(num_epochs):
        epoch_cost = 0
        seed += 1
        num_minibatches = int(X_train.shape[0] / minibatch_size)
        minibatches = random_mini_batches(X_train, y_train, minibatch_size, seed)
        for minibatch in minibatches:
            (minibatch_X, minibatch_Y) = minibatch
            _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X:minibatch_X, y:minibatch_Y})
            epoch_cost += minibatch_cost / minibatch_size
        # Print the cost every epoch
        if print_cost == True and epoch % 100 == 0:
            print("Cost after epoch %i: %f" % (epoch, epoch_cost))
        if print_cost == True and epoch % minibatch_size == 0:
            costs.append(epoch_cost)
    #plt.plot(costs)
    #plt.show()
    cp, val_accuracy = sess.run([correct_prediction, accuracy], feed_dict={X: X_test, y: y_test})
    # plot the cost
    # plt.plot(np.squeeze(costs))
    # plt.ylabel('cost'), feed_dict={X: X_test, y: y_test})
    # plt.xlabel('iterations (per fives)')
    # plt.title("Learning rate =" + str(learning_rate))
    # plt.show()
    cmap = plt.get_cmap('Paired')
    # Define region of interest by data limits
    xmin, xmax = min(XX[:, 0]) - 1, max(XX[:, 0]) + 1
    ymin, ymax = min(XX[:, 1]) - 1, max(XX[:, 1]) + 1
    steps = 100
    x_span = np.linspace(xmin, xmax, steps)
    y_span = np.linspace(ymin, ymax, steps)
    xx, yy = np.meshgrid(x_span, y_span)
    A = np.concatenate([[xx.ravel()], [yy.ravel()]], axis=0)
    A = normalize(A, axis=0)
    # Make predictions across region of interest
    predictions = sess.run(pred, feed_dict={X: A})
    # Plot decision boundary in region of interest
    z = predictions.reshape(xx.shape)
    plt.contourf(xx, yy, z, cmap=cmap, alpha=.5)
    plt.show()
    # Get predicted labels on training data and plot
    #train_labels = model.predict(X)
    #ax.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap, lw=0)

TensorFlow model's cost is constantly at 0

So I've been learning TensorFlow with this Computer Vision project and I'm not sure if I understand it well enough. I think I got the session part right, although graph seems to be the issue here. Here is my code:
def model_train(placeholder_dimensions, filter_dimensions, strides, learning_rate, num_epochs, minibatch_size, print_cost = True):
    # for training purposes
    tf.reset_default_graph()
    # create datasets
    train_set, test_set = load_dataset() # custom function and custom-made dataset
    X_train = np.array([ex[0] for ex in train_set])
    Y_train = np.array([ex[1] for ex in train_set])
    X_test = np.array([ex[0] for ex in test_set])
    Y_test = np.array([ex[1] for ex in test_set])
    #convert to one-hot encodings
    Y_train = tf.one_hot(Y_train, depth = 10)
    Y_test = tf.one_hot(Y_test, depth = 10)
    m = len(train_set)
    costs = []
    tf.reset_default_graph()
    graph = tf.get_default_graph()
    with graph.as_default():
        # create placeholders
        X, Y = create_placeholders(*placeholder_dimensions)
        # initialize parameters
        parameters = initialize_parameters(filter_dimensions)
        # forward propagate
        Z4 = forward_propagation(X, parameters, strides)
        # compute cost
        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = Z4, labels = Y))
        # define optimizer for backpropagation that minimizes the cost function
        optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
        # initialize variables
        init = tf.global_variables_initializer()
        # start session
        with tf.Session() as sess:
            sess.run(init)
            for epoch in range(num_epochs):
                minibatch_cost = 0.
                num_minibatches = int(m / minibatch_size)
                # get random minibatch
                minibatches = random_minibatches(np.array([X_train, Y_train]), minibatch_size)
                for minibatch in minibatches:
                    minibatch_X, minibatch_Y = minibatch
                    _ , temp_cost = sess.run([optimizer, cost], {X: minibatch_X, Y: minibatch_Y})
                    minibatch_cost += temp_cost / num_minibatches
                if print_cost == True and epoch % 5 == 0:
                    print('Cost after epoch %i: %f' %(epoch, minibatch_cost))
                if print_cost == True:
                    costs.append(minibatch_cost)
            # plot the costs
            plot_cost(costs, learning_rate)
            # calculate correct predictions
            prediction = tf.argmax(Z4, 1)
            correct_prediction = tf.equal(prediction, tf.argmax(Y, 1))
            # calculate accuracy on test set
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float'))
            train_accuracy = accuracy.eval({X: X_train, Y: Y_train})
            test_accuracy = accuracy.eval({X: X_test, Y: Y_test})
            print('Training set accuracy:', train_accuracy)
            print('Test set accuracy:', test_accuracy)
            return parameters
where the initialize_parameters and forward_propagation functions are as follows:
def initialize_parameters(filter_dimensions):
    # initialize weight parameters for convolution layers
    W1 = tf.get_variable('W1', shape = filter_dimensions['W1'])
    W2 = tf.get_variable('W2', shape = filter_dimensions['W2'])
    parameters = {'W1': W1, 'W2': W2}
    return parameters
def forward_propagation(X, parameters, strides):
    with tf.variable_scope('model1'):
        # first block
        Z1 = tf.nn.conv2d(X, parameters['W1'], strides['conv1'], padding = 'VALID')
        A1 = tf.nn.relu(Z1)
        P1 = tf.nn.max_pool(A1, ksize = strides['pool1'], strides = strides['pool1'], padding = 'VALID')
        # second block
        Z2 = tf.nn.conv2d(P1, parameters['W2'], strides['conv2'], padding = 'VALID')
        A2 = tf.nn.relu(Z2)
        P2 = tf.nn.max_pool(A2, ksize = strides['pool2'], strides = strides['pool2'], padding = 'VALID')
        # flatten
        F = tf.contrib.layers.flatten(P2)
        # dense block
        Z3 = tf.contrib.layers.fully_connected(F, 50)
        A3 = tf.nn.relu(Z3)
        # output
        Z4 = tf.contrib.layers.fully_connected(A3, 10, activation_fn = None)
    return Z4
I have previous experience with Keras, yet I can't find what the problem is here.

I would check two things first.
First, check that this code outputs what you expect:
#convert to one-hot encodings
Y_train = tf.one_hot(Y_train, depth = 10)
Y_test = tf.one_hot(Y_test, depth = 10)
Second, check that the model initialization looks the way you expect.
Just my 2 cents.
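One way to run the first check is to build the encoding with NumPy, where the result can be printed and inspected directly. In TF 1.x graph mode, tf.one_hot returns a symbolic tensor rather than an array, and a tensor cannot be passed as a feed_dict value, so this may also be closer to what the training loop needs. This is only a sketch: it assumes the labels are integer class ids from 0 to 9 (the question does not show them), and one_hot_numpy is a hypothetical helper name.

```python
import numpy as np


def one_hot_numpy(labels, depth):
    """Return a (len(labels), depth) float32 array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    if labels.min() < 0 or labels.max() >= depth:
        raise ValueError("labels must lie in [0, depth)")
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(depth, dtype=np.float32)[labels]


# Assumed example labels; replace with the real Y_train values.
labels = np.array([3, 0, 9])
encoded = one_hot_numpy(labels, depth=10)
print(encoded.shape)           # (3, 10)
print(encoded.sum(axis=1))     # every row should sum to 1
print(encoded.argmax(axis=1))  # should reproduce the original labels
```

If the rows do not sum to 1, or argmax does not give back the original labels, the labels are probably not in the range the encoding expects.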

Verify validity of a feedforward network

I am new to TensorFlow and I am tasked with designing a feedforward neural network which consists of: an input layer, one hidden perceptron layer of 10 neurons and an output softmax layer. Assume a learning rate of 0.01, L2 regularization with weight decay parameter of 0.000001, and batch size of 32.
I would like to know if there is any way to check that the network I have created is what I intended to create. Something like a graph showing the nodes?
The following is my attempt at the task, but I am not sure if it is correct.
import math
import tensorflow as tf
import numpy as np
import pylab as plt
# scale data
def scale(X, X_min, X_max):
    return (X - X_min)/(X_max-X_min)
def tfvariables(start_nodes, end_nodes):
    W = tf.Variable(tf.truncated_normal([start_nodes, end_nodes], stddev=1.0/math.sqrt(float(start_nodes))))
    b = tf.Variable(tf.zeros([end_nodes]))
    return W, b
NUM_FEATURES = 36
NUM_CLASSES = 6
learning_rate = 0.01
beta = 10 ** -6
epochs = 10000
batch_size = 32
num_neurons = 10
seed = 10
np.random.seed(seed)
#read train data
train_input = np.loadtxt('sat_train.txt',delimiter=' ')
trainX, train_Y = train_input[:, :36], train_input[:, -1].astype(int)
trainX = scale(trainX, np.min(trainX, axis=0), np.max(trainX, axis=0))
# There are 6 class-labels 1,2,3,4,5,7
train_Y[train_Y == 7] = 6
trainY = np.zeros((train_Y.shape[0], NUM_CLASSES))
trainY[np.arange(train_Y.shape[0]), train_Y-1] = 1 #one matrix
# experiment with small datasets
trainX = trainX[:1000]
trainY = trainY[:1000]
n = trainX.shape[0]
# Create the model
x = tf.placeholder(tf.float32, [None, NUM_FEATURES])
y_ = tf.placeholder(tf.float32, [None, NUM_CLASSES])
# Build the graph for the deep net
W1, b1 = tfvariables(NUM_FEATURES, num_neurons)
W2, b2 = tfvariables(num_neurons, NUM_CLASSES)
logits_1 = tf.matmul(x, W1) + b1
perceptron_layer = tf.nn.sigmoid(logits_1)
logits_2 = tf.matmul(perceptron_layer, W2) + b2
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits_2)
# Standard Loss
loss = tf.reduce_mean(cross_entropy)
# Loss function with L2 Regularization with beta
regularizers = tf.nn.l2_loss(W1) + tf.nn.l2_loss(W2)
loss = tf.reduce_mean(loss + beta * regularizers)
# Create the gradient descent optimizer with the given learning rate.
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train_op = optimizer.minimize(cross_entropy)
correct_prediction = tf.cast(tf.equal(tf.argmax(logits_2, 1), tf.argmax(y_, 1)), tf.float32)
accuracy = tf.reduce_mean(correct_prediction)
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    train_acc = []
    train_loss = []
    for i in range(epochs):
        train_op.run(feed_dict={x: trainX, y_: trainY})
        train_acc.append(accuracy.eval(feed_dict={x: trainX, y_: trainY}))
        train_loss.append(loss.eval(feed_dict={x: trainX, y_: trainY}))
        if i % 500 == 0:
            print('iter %d: accuracy %g loss %g'%(i, train_acc[i], train_loss[i]))
# plot learning curves
plt.figure(1)
plt.plot(range(epochs), train_acc)
plt.xlabel(str(epochs) + ' iterations')
plt.ylabel('Train accuracy')
# plot learning curves
plt.figure(1)
plt.plot(range(epochs), train_loss)
plt.xlabel(str(epochs) + ' iterations')
plt.ylabel('Train loss')
plt.show()
plt.show()

You can utilize TensorBoard to visualize the graph you created. Basically, you have to follow a few steps to do this:
1. Declare a writer as writer = tf.summary.FileWriter('PATH/TO/A/LOGDIR').
2. Add the graph to the writer with writer.add_graph(sess.graph), with sess being the current tf.Session() in which you execute the graph.
3. Possibly call writer.flush() to write it to disk immediately.
Note that you have to add these lines AFTER building your graph.
You can view the graph by executing this command in your shell:
tensorboard --logdir=PATH/TO/A/LOGDIR
You are then shown an address (usually something like localhost:6006) at which you can view the graph in your browser (Chrome and Firefox are guaranteed to work).

TensorBoard (part of TensorFlow) is a useful tool.
Use tf.summary.FileWriter to write the graph into a folder, then run tensorboard with --logdir pointing at that folder.
Check the following links:
https://www.tensorflow.org/guide/graphs
https://www.tensorflow.org/guide/summaries_and_tensorboard

Tensorboard: unable to find named scope

I have a scope which I named 'Pred/Accuracy' that I can't seem to find in TensorBoard. My entire code is included further down, but specifically, in the definition of my cost function I have:
def compute_cost(z, Y, parameters, l2_reg=False):
    with tf.name_scope('cost'):
        logits = tf.transpose(z)
        labels = tf.transpose(Y)
        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits,
                                                                      labels = labels))
        if l2_reg == True:
            reg = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
            cost = cost + tf.reduce_sum(reg)
    with tf.name_scope('Pred/Accuracy'):
        prediction=tf.argmax(z)
        correct_prediction = tf.equal(tf.argmax(z), tf.argmax(Y))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    return cost, prediction, accuracy
But in TensorBoard I can't see it, even if I click on the cost block:
Below is basically my entire code, excluding importing and pre-processing the data:
# Create X and Y placeholders
def create_xy_placeholder(n_x, n_y):
    X = tf.placeholder(tf.float32, shape = [n_x, None], name = 'X')
    Y = tf.placeholder(tf.float32, shape = [n_y, None], name = 'Y')
    return X, Y
# initialize parameters hidden layers
def initialize_parameters(n_x, scale, hidden_units):
    hidden_units= [n_x] + hidden_units
    parameters = {}
    regularizer = tf.contrib.layers.l2_regularizer(scale)
    for i in range(0, len(hidden_units[1:])):
        with tf.variable_scope('hidden_parameters_'+str(i+1)):
            w = tf.get_variable("W"+str(i+1), [hidden_units[i+1], hidden_units[i]],
                                initializer=tf.contrib.layers.xavier_initializer(),
                                regularizer=regularizer)
            b = tf.get_variable("b"+str(i+1), [hidden_units[i+1], 1],
                                initializer = tf.constant_initializer(0.1))
            parameters.update({"W"+str(i+1): w})
            parameters.update({"b"+str(i+1): b})
    return parameters
# forward progression with batch norm and dropout
def forward_propagation(X, parameters, batch_norm=False, keep_prob=1):
    a_new = X
    for i in range(0, int(len(parameters)/2)-1):
        with tf.name_scope('forward_pass_'+str(i+1)):
            w = parameters['W'+str(i+1)]
            b = parameters['b'+str(i+1)]
            z = tf.matmul(w, a_new) + b
            if batch_norm == True:
                z = tf.layers.batch_normalization(z, momentum=0.99, axis=0)
            a = tf.nn.relu(z)
            if keep_prob < 1:
                a = tf.nn.dropout(a, keep_prob)
            a_new = a
            tf.summary.histogram('act_'+str(i+1), a_new)
    # calculating final Z before input into cost as logit
    with tf.name_scope('forward_pass_'+str(int(len(parameters)/2))):
        w = parameters['W'+str(int(len(parameters)/2))]
        b = parameters['b'+str(int(len(parameters)/2))]
        z = tf.matmul(w, a_new) + b
        if batch_norm == True:
            z = tf.layers.batch_normalization(z, momentum=0.99, axis=0)
    return z
# compute cost with option for l2 regularizatoin
def compute_cost(z, Y, parameters, l2_reg=False):
    with tf.name_scope('cost'):
        logits = tf.transpose(z)
        labels = tf.transpose(Y)
        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits,
                                                                      labels = labels))
        if l2_reg == True:
            reg = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
            cost = cost + tf.reduce_sum(reg)
    with tf.name_scope('Pred/Accuracy'):
        prediction=tf.argmax(z)
        correct_prediction = tf.equal(tf.argmax(z), tf.argmax(Y))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    return cost, prediction, accuracy
# defining the model (need to add keep_prob for dropout)
def model(X_train, Y_train, X_test, Y_test,
          hidden_units=[30, 50, 50, 30, 4], # hidden units/layers
          learning_rate = 0.0001, # Learning rate
          num_epochs = 2000, minibatch_size = 30, # minibatch/ number epochs
          keep_prob=0.5, # dropout
          batch_norm=True, # batch normalization
          l2_reg=True, scale = 0.01, # L2 regularization/scale is lambda
          print_cost = True):
    ops.reset_default_graph() # to be able to rerun the model without overwriting tf variables
    tf.set_random_seed(1) # to keep consistent results
    seed = 3 # to keep consistent results
    (n_x, m) = X_train.shape # (n_x: input size, m : number of examples in the train set)
    n_y = Y_train.shape[0] # n_y : output size
    costs = [] # To keep track of the cost
    logs_path = '/tmp/tensorflow_logs/example/'
    # Create Placeholders of shape (n_x, n_y)
    X, Y = create_xy_placeholder(n_x, n_y)
    # Initialize parameters
    parameters = initialize_parameters(n_x, scale, hidden_units)
    # Forward propagation: Build the forward propagation in the tensorflow graph
    z = forward_propagation(X, parameters, keep_prob, batch_norm)
    # Cost function: Add cost function to tensorflow graph
    cost, prediction, accuracy = compute_cost(z, Y, parameters, l2_reg)
    # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer.
    with tf.name_scope('optimizer'):
        optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate)
        # Op to calculate every variable gradient
        grads = tf.gradients(cost, tf.trainable_variables())
        grads = list(zip(grads, tf.trainable_variables()))
        # Op to update all variables according to their gradient
        apply_grads = optimizer.apply_gradients(grads_and_vars = grads)
    # Initialize all the variables
    init = tf.global_variables_initializer()
    # to view in tensorboard
    tf.summary.scalar('loss', cost)
    tf.summary.scalar('accuracy', accuracy)
    # Create summaries to visualize weights
    for var in tf.trainable_variables():
        tf.summary.histogram(var.name, var)
    # Summarize all gradients
    for grad, var in grads:
        tf.summary.histogram(var.name + '/gradient', grad)
    merged_summary_op = tf.summary.merge_all()
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    # Start the session to compute the tensorflow graph
    with tf.Session(config=config) as sess:
        # Run the initialization
        sess.run(init)
        # define writer
        summary_writer = tf.summary.FileWriter(logs_path,
                                               graph=tf.get_default_graph())
        # Do the training loop
        for epoch in range(num_epochs):
            epoch_cost = 0. # Defines a cost related to an epoch
            num_minibatches = int(m / minibatch_size) # number of minibatches of size minibatch_size in the train set
            seed = seed + 1
            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)
            count = 0
            for minibatch in minibatches:
                # Select a minibatch
                (minibatch_X, minibatch_Y) = minibatch
                # IMPORTANT: The line that runs the graph on a minibatch.
                # Run the session to execute the "optimizer" and the "cost", the feedict should contain a minibatch for (X,Y).
                _ , minibatch_cost, summary = sess.run([apply_grads, cost,
                                                        merged_summary_op],
                                                       feed_dict = {X: minibatch_X, Y: minibatch_Y})
                epoch_cost += minibatch_cost / num_minibatches
                # Write logs at every iteration
                summary_writer.add_summary(summary, epoch * num_minibatches + count)
                count += 1
            # Print the cost every epoch
            if print_cost == True and epoch % 100 == 0:
                print ("Cost after epoch %i: %f" % (epoch, epoch_cost))
                prediction1=tf.argmax(z)
                # print('Z5: ', Z5.eval(feed_dict={X: minibatch_X, Y: minibatch_Y}))
                print('prediction: ', prediction1.eval(feed_dict={X: minibatch_X,
                                                                  Y: minibatch_Y}))
                correct1=tf.argmax(Y)
                # print('Y: ', Y.eval(feed_dict={X: minibatch_X,
                #                                Y: minibatch_Y}))
                print('correct: ', correct1.eval(feed_dict={X: minibatch_X,
                                                            Y: minibatch_Y}))
            if print_cost == True and epoch % 5 == 0:
                costs.append(epoch_cost)
        # plot the cost
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('iterations (per tens)')
        plt.title("Learning rate =" + str(learning_rate))
        plt.show()
        # lets save the parameters in a variable
        parameters = sess.run(parameters)
        print ("Parameters have been trained!")
        # Calculate the correct predictions
        correct_prediction = tf.equal(tf.argmax(z), tf.argmax(Y))
        # Calculate accuracy on the test set
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        print ("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train}))
        print ("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))
        print("Run the command line:\n" \
              "--> tensorboard --logdir=/tmp/tensorflow_logs " \
              "\nThen open http://0.0.0.0:6006/ into your web browser")
        return parameters
# run model on test data
parameters = model(x_train, y_train, x_test, y_test, keep_prob=1)

TensorFlow scopes are hierarchical: you can have a scope within another scope within another scope, etc. The name "Pred/Accuracy" means exactly that: you have a top-level "Pred" scope and a nested "Accuracy" scope (this is because the slash has a special meaning in naming).
TensorBoard shows the top-level ones by default: "Pred" (at the top), "batch_normalization", etc. You can expand them to see what's inside by double-clicking. Inside "Pred" you should find "Accuracy".
If you like, just name your scope differently, e.g. "Pred_Accuracy", and the full name will appear in TensorBoard.
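To make the nesting concrete, here is a small pure-Python illustration of how slash-separated names fall into a hierarchy. It is not TensorBoard's actual implementation, only a sketch of the idea; scope_tree is a hypothetical helper and the op names in the list are invented for the example.

```python
def scope_tree(op_names):
    """Group slash-separated names into a nested dict, one level per scope."""
    tree = {}
    for name in op_names:
        node = tree
        for part in name.split("/"):
            # Each path component becomes a child of the previous one.
            node = node.setdefault(part, {})
    return tree


# Invented op names, for illustration only.
names = [
    "cost/Mean",
    "Pred/Accuracy/ArgMax",
    "Pred/Accuracy/Mean",
    "Pred_Accuracy/Mean",
]
tree = scope_tree(names)
print(sorted(tree))          # top-level boxes: ['Pred', 'Pred_Accuracy', 'cost']
print(sorted(tree["Pred"]))  # nested inside "Pred": ['Accuracy']
```

The name with a slash ends up one level down, under "Pred", while the name with an underscore stays at the top level next to "cost".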

How to do the regression by tensorflow in this example?

I am using tensorflow to do a linear regression. Here I am facing a problem:
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (8,6)
data = pd.read_csv('./data.csv')
xs = data["A"][:100]
ys = data["B"][:100]
X = tf.placeholder(tf.float32, name='X')
Y = tf.placeholder(tf.float32, name='Y')
W = tf.Variable(tf.random_normal([1]),name = 'weight')
b = tf.Variable(tf.random_normal([1]),name = 'bias')
Y_pred = tf.add(tf.multiply(X,W), b)
sample_num = xs.shape[0]
loss = tf.reduce_sum(tf.pow(Y_pred - Y,2))/sample_num
learning_rate = 0.0001
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
n_samples = xs.shape[0]
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(100):
        for x,y in zip(xs,ys):
            _, l = sess.run([optimizer, loss], feed_dict={X: x, Y:y})
    W, b = sess.run([W, b])
plt.plot(xs, ys, 'bo', label='Real data')
plt.plot(xs, xs*W + b, 'r', label='Predicted data')
plt.legend()
plt.show()
The data.csv is here.
The plot is the opposite of what I expected:
So, what is the problem? I am a beginner with Python and TensorFlow, and I just can't figure it out.

As Nipun mentioned, try AdamOptimizer instead of GradientDescentOptimizer.
You will often find that AdamOptimizer is a better optimizer than GradientDescentOptimizer and reaches the minimum much faster.
It does so by adapting the learning rate instead of keeping it constant (0.0001 in your case).
Also, the more epochs you train for, the better the model fits (not considering over-fitting here).
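To show what "adapting the learning rate" means, here is a rough sketch of the textbook Adam update for a single scalar parameter. It is not TensorFlow's exact implementation, and adam_minimize, the test function and the hyperparameter values are assumptions chosen for illustration.

```python
import math


def adam_minimize(grad_fn, theta, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a scalar function given its gradient, using textbook Adam."""
    m = 0.0  # running average of the gradient
    v = 0.0  # running average of the squared gradient
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        # The step is scaled by the recent gradient magnitude, so the
        # effective learning rate changes from step to step.
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Assumed toy problem: f(theta) = (theta - 3)^2, gradient 2 * (theta - 3).
start = 0.0
result = adam_minimize(lambda th: 2.0 * (th - 3.0), start)
print(result)
```

Plain gradient descent would instead use theta -= lr * g with the same lr at every step, which is why a very small constant rate such as 0.0001 makes such slow progress.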

Since your learning rate and the number of epochs are too small, your regression model hasn't converged. Therefore, you may need to increase the learning rate and use tf.train.AdamOptimizer.
Here I set the learning rate to 2 and the number of epochs to 10000, and got the following graph.
The code is given below, with comments where necessary.
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (8, 6)
data = pd.read_csv('./data.csv')
xs = data["A"][:100]
ys = data["B"][:100]
X = tf.placeholder(tf.float32, name='X')
Y = tf.placeholder(tf.float32, name='Y')
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
Y_pred = tf.add(tf.multiply(X, W), b)
loss = tf.reduce_mean(tf.pow(Y_pred - Y, 2))
learning_rate = 2 #increase the learning rate
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)#use the AdamOptimizer
BATCH_SIZE = 8 #Batch Size define here
n_samples = xs.shape[0]
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(10000): #increase the number of epochs
        for start, end in zip(range(0, n_samples, BATCH_SIZE), # mini-batch gradient descent
                              range(BATCH_SIZE, n_samples + 1, BATCH_SIZE)):
            _, l = sess.run([optimizer, loss], feed_dict={X: xs[start:end], Y: ys[start:end]})
    prediction = sess.run(Y_pred, feed_dict={X: xs})
    #W, b = sess.run([W, b])
plt.plot(xs, ys, 'bo', label='Real data')
plt.plot(xs, prediction, 'r', label='Predicted data')
plt.legend()
plt.show()
Also, you can use mini-batch gradient descent to accelerate convergence, as in the code above.
Moreover, you can adjust the number of epochs and the learning rate further to get the optimal result.
Hope this helps.
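The convergence point can be checked without TensorFlow. The sketch below runs plain full-batch gradient descent in NumPy, once with a small learning rate and few epochs and once with a larger rate and more epochs. The data is synthetic (y = 3x + 2), since data.csv is not available here, and fit_line is a hypothetical helper, so the numbers only illustrate the effect and say nothing about the real dataset.

```python
import numpy as np


def fit_line(x, y, learning_rate, epochs):
    """Full-batch gradient descent on mean squared error for y = w*x + b."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        err = w * x + b - y
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        w -= learning_rate * 2.0 * np.mean(err * x)
        b -= learning_rate * 2.0 * np.mean(err)
    loss = np.mean((w * x + b - y) ** 2)
    return w, b, loss


# Synthetic stand-in for the CSV columns "A" and "B" (an assumption).
x = np.linspace(0.0, 10.0, 100)
y = 3.0 * x + 2.0

w_slow, b_slow, loss_slow = fit_line(x, y, learning_rate=1e-4, epochs=100)
w_fast, b_fast, loss_fast = fit_line(x, y, learning_rate=1e-2, epochs=5000)
print("small rate, few epochs:  ", w_slow, b_slow, loss_slow)
print("larger rate, more epochs:", w_fast, b_fast, loss_fast)
print("least-squares reference: ", np.polyfit(x, y, 1))
```

On this toy data the first run stops far from the least-squares line while the second one reaches it. Comparing against np.polyfit in the same way is a quick sanity check for whether a fitted line has actually converged.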
