I trained a model as usual on train/test data and obtained the training accuracy and cost as well as the validation accuracy and cost. I therefore presume the model works, and a result of 85% is good enough.
Now that I am done with my train/test data, I have a csv file with the same type and structure of data but without one column (default, which indicates whether the client will pay or be delayed). I am trying to predict this value with the model, and I am stuck on how to feed in this data and get the missing column back.
Problem section:
This is my code for restoring the model and predicting on the new data (y_pred is [5100x41]):
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('my_test_model101.meta')
    print("Model found.")
    saver.restore(sess, tf.train.latest_checkpoint('./'))
    print("Model restored compl.")
    z = tf.placeholder(tf.float32, shape= (None,5100))
    y_pred= y_pred.as_matrix()
    output =sess.run(z,feed_dict={x: y_pred})
    print(output)
Can anyone help me understand what I am doing wrong here?
Error message is:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_4' with dtype float and shape [?,5100]
[[Node: Placeholder_4 = Placeholder[dtype=DT_FLOAT, shape=[?,5100], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Expected result:
My input is [5100 x 41], but the last column initially holds NaN values. I want it filled with the predicted value, which should be 0 or 1.
For reference, this is the trained model's architecture:
# Number of input nodes.
input_nodes = 41
# Multiplier maintains a fixed ratio of nodes between each layer.
mulitplier = 3
# Number of nodes in each hidden layer
hidden_nodes1 = 41
hidden_nodes2 = round(hidden_nodes1 * mulitplier)
hidden_nodes3 = round(hidden_nodes2 * mulitplier)
# Percent of nodes to keep during dropout.
pkeep = tf.placeholder(tf.float32)
# input
x = tf.placeholder(tf.float32, [None, input_nodes])
# layer 1
W1 = tf.Variable(tf.truncated_normal([input_nodes, hidden_nodes1], stddev = 0.15))
b1 = tf.Variable(tf.zeros([hidden_nodes1]))
y1 = tf.nn.sigmoid(tf.matmul(x, W1) + b1)
# layer 2
W2 = tf.Variable(tf.truncated_normal([hidden_nodes1, hidden_nodes2], stddev = 0.15))
b2 = tf.Variable(tf.zeros([hidden_nodes2]))
y2 = tf.nn.sigmoid(tf.matmul(y1, W2) + b2)
# layer 3
W3 = tf.Variable(tf.truncated_normal([hidden_nodes2, hidden_nodes3], stddev = 0.15))
b3 = tf.Variable(tf.zeros([hidden_nodes3]))
y3 = tf.nn.sigmoid(tf.matmul(y2, W3) + b3)
y3 = tf.nn.dropout(y3, pkeep)
# layer 4
W4 = tf.Variable(tf.truncated_normal([hidden_nodes3, 2], stddev = 0.15))
b4 = tf.Variable(tf.zeros([2]))
y4 = tf.nn.softmax(tf.matmul(y3, W4) + b4)
# output
y = y4
y_ = tf.placeholder(tf.float32, [None, 2])
After building the model, I understand that you need a placeholder to hold the values you are looking for. So:
# Parameters
training_epochs = 5 # These proved to be enough to let the network learn
training_dropout = 0.9
display_step = 1 # 10
n_samples = y_train.shape[0]
batch_size = 2048
learning_rate = 0.001
# Cost function: Cross Entropy
cost = -tf.reduce_sum(y_ * tf.log(y))
# We will optimize our model via AdamOptimizer
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
# Correct prediction if the most likely value (default or non Default) from softmax equals the target value.
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
Up to this point everything works well and I saved the model. I was able to restore it (I printed the variables and they were all there, so the restore is fine).
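The placeholder 'z' is created after the model is restored, is never fed, and is not connected to anything in the model. When you ask the session for 'z', TensorFlow has no value for it, which is exactly what the error message says; feeding 'x' does not help because 'z' does not depend on 'x'. I think you want,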
output =sess.run(y,feed_dict={x: y_pred})
Because 'y' is the output tensor.
Having said that, I think you might want to read up a little more on the flow graph used by tensorflow to understand how the calculations happen. Currently, it doesn't sound like you have fully understood the placeholder variables.
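Once sess.run(y, ...) works, the result is one row of two softmax probabilities per client, not a 0/1 column yet. Here is a minimal sketch of that last step in plain Python, so it runs without TensorFlow. The probabilities are made up, and which index means "default" depends on how the labels were one-hot encoded for training, so treat that mapping as an assumption.

```python
# Hypothetical softmax outputs, shaped like the result of
# sess.run(y, feed_dict={x: ...}): one row per client, two probabilities per row.
softmax_rows = [
    [0.91, 0.09],
    [0.30, 0.70],
    [0.55, 0.45],
]

def to_class_labels(rows):
    """Return the index of the largest probability in each row (argmax)."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in rows]

predicted = to_class_labels(softmax_rows)
print(predicted)  # [0, 1, 0]
```

With a NumPy array the same step is output.argmax(axis=1), and the resulting column can be written back into the csv data. Also note that in the architecture above 'y' depends on the dropout placeholder 'pkeep', so 'pkeep' has to be fed as well (with 1.0, i.e. no dropout, when predicting).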
Related
I had an unused placeholder in my code which was reserved for a later purpose. However, I found that adding or removing it affects the cost of each epoch and even the final result (e.g. accuracy, AUC). I have checked that it is not connected to the computational graph: the placeholder was added on top of a complete program and it appears only once in the code. Since my project is too large to reproduce here, I reproduced the behaviour with a simple piece of code copied from a blog, https://adventuresinmachinelearning.com/python-tensorflow-tutorial/.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
tf.random.set_random_seed(1234)
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Python optimisation variables
learning_rate = 0.5
epochs = 10
batch_size = 100
# declare the training data placeholders
# input x - for 28 x 28 pixels = 784
x = tf.placeholder(tf.float32, [None, 784])
# now declare the output data placeholder - 10 digits
y = tf.placeholder(tf.float32, [None, 10])
# zz = tf.placeholder(tf.float32, name='zz') <--- the added placeholder
# now declare the weights connecting the input to the hidden layer
W1 = tf.Variable(tf.random_normal([784, 300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([300]), name='b1')
# and the weights connecting the hidden layer to the output layer
W2 = tf.Variable(tf.random_normal([300, 10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random_normal([10]), name='b2')
# calculate the output of the hidden layer
hidden_out = tf.add(tf.matmul(x, W1), b1)
hidden_out = tf.nn.relu(hidden_out)
# now calculate the hidden layer output - in this case, let's use a softmax activated
# output layer
y_ = tf.nn.softmax(tf.add(tf.matmul(hidden_out, W2), b2))
y_clipped = tf.clip_by_value(y_, 1e-10, 0.9999999)
cross_entropy = -tf.reduce_mean(tf.reduce_sum(y * tf.log(y_clipped)
+ (1 - y) * tf.log(1 - y_clipped), axis=1))
# add an optimiser
optimiser = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cross_entropy)
# finally setup the initialisation operator
init_op = tf.global_variables_initializer()
# define an accuracy assessment operation
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# start the session
with tf.Session() as sess:
    # initialise the variables
    sess.run(init_op)
    total_batch = int(len(mnist.train.labels) / batch_size)
    for epoch in range(epochs):
        avg_cost = 0
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size=batch_size)
            _, c = sess.run([optimiser, cross_entropy],
                            feed_dict={x: batch_x, y: batch_y})
            avg_cost += c / total_batch
        print("Epoch:", (epoch + 1), "cost =", "{:.3f}".format(avg_cost))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))
I used tf.random.set_random_seed(1234) to make the random initialization the same in every run (I am not sure this is guaranteed, but the results are consistent across multiple runs).
With the placeholder added, the cost for epoch 1 is 0.000; with it removed, the cost is 0.001.
In this simple example the final accuracy stays the same (accuracy=0.9783), but in my own code the effect is large (adding the placeholder actually improves the AUC value for a binary classification problem). If I change the type of the placeholder, e.g. from placeholder to sparse_placeholder, the cost also changes (accuracy=0.9772 in this example). So I am wondering why an unused placeholder contributes to the result at all, and how I can debug such a problem.
Thanks in advance! Anyone can copy and paste the code, run it with python plus the file name once TensorFlow 1.12+ is installed, and observe the difference by uncommenting the placeholder zz or leaving it commented out.
Plus: I found this issue describing a similar problem and am reading through it: https://github.com/tensorflow/tensorflow/issues/19171
I got a dataset of 178 elements, and each contains 13 features and 1 label.
Label is stored as one-hot array. My training dataset is made of 158 elements.
Here is what my model looks like :
x = tf.placeholder(tf.float32, [None,training_data.shape[1]])
y_ = tf.placeholder(tf.float32, [None,training_data_labels.shape[1]])
node_1 = 300
node_2 = 300
node_3 = 300
out_n = 3
#1
W1 = tf.Variable(tf.random_normal([training_data.shape[1], node_1]))
B1 = tf.Variable(tf.random_normal([node_1]))
y1 = tf.add(tf.matmul(x,W1),B1)
y1 = tf.nn.relu(y1)
#2
W2 = tf.Variable(tf.random_normal([node_1, node_2]))
B2 = tf.Variable(tf.random_normal([node_2]))
y2 = tf.add(tf.matmul(y1,W2),B2)
y2 = tf.nn.relu(y2)
#3
W3 = tf.Variable(tf.random_normal([node_2, node_3]))
B3 = tf.Variable(tf.random_normal([node_3]))
y3 = tf.add(tf.matmul(y2,W3),B3)
y3 = tf.nn.relu(y3)
#output
W4 = tf.Variable(tf.random_normal([node_3, out_n]))
B4 = tf.Variable(tf.random_normal([out_n]))
y4 = tf.add(tf.matmul(y3,W4),B4)
y = tf.nn.softmax(y4)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(200):
        sess.run(optimizer,feed_dict={x:training_data, y_:training_data_labels})
    correct = tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
    print('Accuracy:',accuracy.eval({x:eval_data, y_:eval_data_labels}))
But the accuracy is very low. I tried increasing the number of iterations from 200 to a higher number, but it still remains low.
What could I do to improve the results?
The problem is that you're taking the softmax of y4 and then passing that to tf.nn.softmax_cross_entropy_with_logits. This error is common enough that there's actually a note about it in the documentation for softmax_cross_entropy_with_logits:
WARNING: This op expects unscaled logits, since it performs a softmax on logits internally
for efficiency. Do not call this op with the output of softmax, as it will produce
incorrect results.
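The rest of your code looks fine, so just pass the unscaled logits to the loss: either use logits=y4, or assign the output of the last layer to y directly and get rid of y = tf.nn.softmax(y4). The accuracy computation is not affected, because the argmax of the logits is the same as the argmax of their softmax.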
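To see why the double softmax hurts, here is a small sketch in plain Python (no TensorFlow needed) that applies softmax once and twice to some made-up logits. The helper functions are written for this illustration and are not the library's implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot):
    """Cross entropy between predicted probabilities and a one-hot label."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot))

logits = [8.0, 1.0, -3.0]   # made-up unscaled outputs; class 0 is clearly favoured
label = [1.0, 0.0, 0.0]

once = softmax(logits)      # what the loss op computes internally
twice = softmax(once)       # what it sees when it is handed softmax output as "logits"

print(once, cross_entropy(once, label))
print(twice, cross_entropy(twice, label))
```

Because the values entering the second softmax all lie between 0 and 1, the largest probability it can produce for three classes is e / (e + 2), roughly 0.58, so the loss stays above about 0.55 even for a perfectly confident network and the gradients become very flat.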
I am trying to write a two layer neural network to train a class labeler. The input to the network is a 150-feature list of about 1000 examples; all features on all examples have been L2 normalized.
I only have two outputs, and they should be disjoint--I am just attempting to predict whether the example is a one or a zero.
My code is relatively simple; I am feeding the input data into the hidden layer, and then the hidden layer into the output. As I really just want to see this working in action, I am training on the entire data set with each step.
My code is below. Based on the other NN implementations I have referred to, I believe that the performance of this network should improve over time. However, regardless of the number of epochs I set, I get back an accuracy of about 20%. The accuracy does not change when the number of steps is changed, so I don't believe that my weights and biases are being updated.
Is there something obvious I am missing with my model? Thanks!
import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()
# generate data
np.random.seed(10)
inputs = np.random.normal(size=[1000,150]).astype('float32')*1.5
label = np.round(np.random.uniform(low=0,high=1,size=[1000,1])*0.8)
reverse_label = 1-label
labels = np.append(label,reverse_label,1)
# parameters
learn_rate = 0.01
epochs = 200
n_input = 150
n_hidden = 75
n_output = 2
# set weights/biases
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_output])
b0 = tf.Variable(tf.truncated_normal([n_hidden]))
b1 = tf.Variable(tf.truncated_normal([n_output]))
w0 = tf.Variable(tf.truncated_normal([n_input,n_hidden]))
w1 = tf.Variable(tf.truncated_normal([n_hidden,n_output]))
# step function
def returnPred(x,w0,w1,b0,b1):
    z1 = tf.add(tf.matmul(x, w0), b0)
    a2 = tf.nn.relu(z1)
    z2 = tf.add(tf.matmul(a2, w1), b1)
    h = tf.nn.relu(z2)
    return h #return the first response vector from the
y_ = returnPred(x,w0,w1,b0,b1) # predict operation
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=y_,labels=y) # calculate loss between prediction and actual
model = tf.train.GradientDescentOptimizer(learning_rate=learn_rate).minimize(loss) # apply gradient descent based on loss
init = tf.global_variables_initializer()
tf.Session = sess
sess.run(init) #initialize graph
for step in range(0,epochs):
    sess.run(model,feed_dict={x: inputs, y: labels }) #train model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: inputs, y: labels})) # print accuracy
I changed your optimizer to AdamOptimizer (in many cases it performs better than GradientDescentOptimizer).
I also played a bit with the parameters. In particular, I took smaller std for your variable initialization, decreased learning rate (as your loss was unstable and "jumped around") and increased epochs (as I noticed that your loss continues to decrease).
I also reduced the size of the hidden layer. It is harder to train networks with large hidden layer when you don't have that much data.
Regarding your loss, it is better to apply tf.reduce_mean on it so that loss would be a number. In addition, following the answer of ml4294, I used softmax instead of sigmoid, so the loss looks like:
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_,labels=y))
The code below achieves accuracy of around 99.9% on the training data:
import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()
# generate data
np.random.seed(10)
inputs = np.random.normal(size=[1000,150]).astype('float32')*1.5
label = np.round(np.random.uniform(low=0,high=1,size=[1000,1])*0.8)
reverse_label = 1-label
labels = np.append(label,reverse_label,1)
# parameters
learn_rate = 0.002
epochs = 400
n_input = 150
n_hidden = 60
n_output = 2
# set weights/biases
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_output])
b0 = tf.Variable(tf.truncated_normal([n_hidden],stddev=0.2,seed=0))
b1 = tf.Variable(tf.truncated_normal([n_output],stddev=0.2,seed=0))
w0 = tf.Variable(tf.truncated_normal([n_input,n_hidden],stddev=0.2,seed=0))
w1 = tf.Variable(tf.truncated_normal([n_hidden,n_output],stddev=0.2,seed=0))
# step function
def returnPred(x,w0,w1,b0,b1):
    z1 = tf.add(tf.matmul(x, w0), b0)
    a2 = tf.nn.relu(z1)
    z2 = tf.add(tf.matmul(a2, w1), b1)
    h = tf.nn.relu(z2)
    return h #return the first response vector from the
y_ = returnPred(x,w0,w1,b0,b1) # predict operation
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_,labels=y)) # calculate loss between prediction and actual
model = tf.train.AdamOptimizer(learning_rate=learn_rate).minimize(loss) # apply Adam updates based on loss
init = tf.global_variables_initializer()
tf.Session = sess
sess.run(init) #initialize graph
for step in range(0,epochs):
    sess.run([model,loss],feed_dict={x: inputs, y: labels }) #train model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: inputs, y: labels})) # print accuracy
Just a suggestion in addition to the answer provided by Miriam Farber:
You use a multi-dimensional output label ([0., 1.]) for the classification. I suggest using the softmax cross entropy tf.nn.softmax_cross_entropy_with_logits() instead of the sigmoid cross entropy, since you assume the outputs to be disjoint (see softmax on Wikipedia). I achieved much faster convergence with this small modification.
This should also improve your performance once you decide to increase your output dimensionality from 2 to a higher number.
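To make the difference concrete, here is a small plain-Python sketch of both losses on made-up logits. The helper names are mine, and the formulas are the textbook ones rather than TensorFlow's actual code; the sigmoid version uses the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)).

```python
import math

def softmax_cross_entropy(logits, labels):
    """Per-example softmax cross entropy from unscaled logits (one number per example)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return -sum(t * (v - log_sum_exp) for v, t in zip(logits, labels))

def sigmoid_cross_entropy(logits, labels):
    """Per-output sigmoid cross entropy (one number per output unit)."""
    return [max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
            for x, z in zip(logits, labels)]

batch_logits = [[2.0, -1.0], [0.5, 0.5], [-3.0, 4.0]]   # made-up values
batch_labels = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

per_example = [softmax_cross_entropy(l, t) for l, t in zip(batch_logits, batch_labels)]
per_output = [sigmoid_cross_entropy(l, t) for l, t in zip(batch_logits, batch_labels)]
mean_loss = sum(per_example) / len(per_example)   # the scalar a mean reduction would give

print(per_example)
print(per_output)
print(mean_loss)
```

The sigmoid loss scores each output unit as an independent yes/no question, while the softmax loss treats the two outputs as competing for a single label, which matches the "exactly one of the two classes" setup in the question. Averaging the per-example values gives the single number that the optimizer minimizes.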
I guess you have some problem here:
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=y_,labels=y) # calculate loss between prediction and actual
It should look something like this:
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_,labels=y))
I didn't look at your code much, so if this doesn't work out you can check the Udacity deep learning course or its forum; they have good samples of what you are trying to do.
Good luck.
I am trying to understand the transfer learning through Tensorflow. But I am getting the stated error.
This is my code
def add_final_training_ops(graph, class_count, final_tensor_name,
                           ground_truth_tensor_name):
    """Adds a new softmax and fully-connected layer for training.
    We need to retrain the top layer to identify our new classes, so this function
    adds the right operations to the graph, along with some variables to hold the
    weights, and then sets up all the gradients for the backward pass.
    The set up for the softmax and fully-connected layers is based on:
    https://tensorflow.org/versions/master/tutorials/mnist/beginners/index.html
    Args:
      graph: Container for the existing model's Graph.
      class_count: Integer of how many categories of things we're trying to
          recognize.
      final_tensor_name: Name string for the new final node that produces results.
      ground_truth_tensor_name: Name string of the node we feed ground truth data
          into.
    Returns:
      Nothing.
    """
    bottleneck_tensor1 = graph.get_tensor_by_name(ensure_name_has_port(
        BOTTLENECK_TENSOR_NAME))
    bottleneck_tensor = tf.placeholder_with_default(bottleneck_tensor1, shape=[None, 2048])
    layer_weights = tf.Variable(
        tf.truncated_normal([BOTTLENECK_TENSOR_SIZE, class_count], stddev=0.001),
        name='final_weights')
    layer_biases = tf.Variable(tf.zeros([class_count]), name='final_biases')
    logits = tf.matmul(bottleneck_tensor, layer_weights,
                       name='final_matmul') + layer_biases
    tf.nn.softmax(logits, name=final_tensor_name)
    ground_truth_placeholder = tf.placeholder(tf.float64,
                                              [None, class_count],
                                              name=ground_truth_tensor_name)
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        logits=logits, labels=ground_truth_placeholder)
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    train_step = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(
        cross_entropy_mean)
    return train_step, cross_entropy_mean
def do_train(sess,X_input, Y_input, X_validation, Y_validation):
    ground_truth_tensor_name = 'ground_truth'
    mini_batch_size = 10
    n_train = X_input.shape[0]
    graph = create_graph()
    train_step, cross_entropy = add_final_training_ops(
        graph, len(classes), FLAGS.final_tensor_name,
        ground_truth_tensor_name)
    init = tf.initialize_all_variables()
    sess.run(init)
    evaluation_step = add_evaluation_step(graph, FLAGS.final_tensor_name, ground_truth_tensor_name)
    # Get some layers we'll need to access during training.
    bottleneck_tensor1 = graph.get_tensor_by_name(ensure_name_has_port(BOTTLENECK_TENSOR_NAME))
    bottleneck_tensor = tf.placeholder_with_default(bottleneck_tensor1, shape=[None, 2048])
    ground_truth_tensor1 = graph.get_tensor_by_name(ensure_name_has_port(ground_truth_tensor_name))
    ground_truth_tensor = tf.placeholder_with_default(ground_truth_tensor1, shape=[None, len(classes)])
    i=0
    epocs = 1
    for epoch in range(epocs):
        shuffledRange = np.random.permutation(n_train)
        y_one_hot_train = encode_one_hot(len(classes), Y_input)
        y_one_hot_validation = encode_one_hot(len(classes), Y_validation)
        shuffledX = X_input[shuffledRange,:]
        shuffledY = y_one_hot_train[shuffledRange]
        for Xi, Yi in iterate_mini_batches(shuffledX, shuffledY, mini_batch_size):
            print Xi.shape
            print type(Xi)
            print type(Yi)
            print Yi.shape
            print Yi.dtype
            print Yi[0]
            sess.run(train_step,
                     feed_dict={bottleneck_tensor: Xi,
                                ground_truth_tensor: Yi})
The print statements produce the following output:
(10, 2048)
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
(10, 5)
float64
[ 0. 0. 0. 1. 0.]
I am getting the error at :
sess.run(train_step,feed_dict={bottleneck_tensor: Xi,ground_truth_tensor: Yi})
Can someone tell me why I am facing this error?
The problem is that the placeholder you created in add_final_training_ops (ground_truth_placeholder) is never fed. You might think that the ground_truth_tensor you create in do_train with tf.placeholder_with_default is the same placeholder, but it is not: it is a new one, even though it takes its default value from the former. Feeding the new one does not feed the original, and the loss depends on the original.
The easiest fix is probably to return the placeholder from add_final_training_ops and feed that one instead.
I am trying to save neural network weights to a file and then restore them when initializing the network, instead of using random initialization. My code works fine with random initialization, but when I initialize the weights from a file I get the error TypeError: Input 'b' of 'MatMul' Op has type float64 that does not match type float32 of argument 'a'. I don't know how to solve this issue. Here is my code:
Model Initialization
# Parameters
training_epochs = 5
batch_size = 64
display_step = 5
batch = tf.Variable(0, trainable=False)
regualarization = 0.008
# Network Parameters
n_hidden_1 = 300 # 1st layer num features
n_hidden_2 = 250 # 2nd layer num features
n_input = model.layer1_size # Vector input (sentence shape: 30*10)
n_classes = 12 # Sentence Category detection total classes (0-11 categories)
#History storing variables for plots
loss_history = []
train_acc_history = []
val_acc_history = []
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
Model parameters
#loading Weights
def weight_variable(fan_in, fan_out, filename):
    stddev = np.sqrt(2.0/fan_in)
    if (filename == ""):
        initial = tf.random_normal([fan_in,fan_out], stddev=stddev)
    else:
        initial = np.loadtxt(filename)
    print initial.shape
    return tf.Variable(initial)
#loading Biases
def bias_variable(shape, filename):
    if (filename == ""):
        initial = tf.constant(0.1, shape=shape)
    else:
        initial = np.loadtxt(filename)
    print initial.shape
    return tf.Variable(initial)
# Create model
def multilayer_perceptron(_X, _weights, _biases):
    layer_1 = tf.nn.relu(tf.add(tf.matmul(_X, _weights['h1']), _biases['b1']))
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, _weights['h2']), _biases['b2']))
    return tf.matmul(layer_2, weights['out']) + biases['out']
# Store layers weight & bias
weights = {
    'h1': w2v_utils.weight_variable(n_input, n_hidden_1, filename="weights_h1.txt"),
    'h2': w2v_utils.weight_variable(n_hidden_1, n_hidden_2, filename="weights_h2.txt"),
    'out': w2v_utils.weight_variable(n_hidden_2, n_classes, filename="weights_out.txt")
}
biases = {
    'b1': w2v_utils.bias_variable([n_hidden_1], filename="biases_b1.txt"),
    'b2': w2v_utils.bias_variable([n_hidden_2], filename="biases_b2.txt"),
    'out': w2v_utils.bias_variable([n_classes], filename="biases_out.txt")
}
# Define loss and optimizer
#learning rate
# Optimizer: set up a variable that's incremented once per batch and
# controls the learning rate decay.
learning_rate = tf.train.exponential_decay(
    0.02*0.01, # Base learning rate. #0.002
    batch * batch_size, # Current index into the dataset.
    X_train.shape[0], # Decay step.
    0.96, # Decay rate.
    staircase=True)
# Construct model
pred = tf.nn.relu(multilayer_perceptron(x, weights, biases))
#L2 regularization
l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()])
#Softmax loss
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
#Total_cost
cost = cost+ (regualarization*0.5*l2_loss)
# Adam Optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost,global_step=batch)
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Initializing the variables
init = tf.initialize_all_variables()
print "Network Initialized!"
ERROR DETAILS
The tf.matmul() op does not perform automatic type conversions, so both of its inputs must have the same element type. The error message you are seeing indicates that you have a call to tf.matmul() where the first argument has type tf.float32, and the second argument has type tf.float64. You must convert one of the inputs to match the other, for example using tf.cast(x, tf.float32).
Looking at your code, I don't see anywhere that a tf.float64 tensor is explicitly created (the default dtype for floating-point values in the TensorFlow Python API—e.g. for tf.constant(37.0)—is tf.float32). I would guess that the errors are caused by the np.loadtxt(filename) calls, which might be loading an np.float64 array. You can explicitly change them to load np.float32 arrays (which are converted to tf.float32 tensors) as follows:
initial = np.loadtxt(filename).astype(np.float32)
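The float64 guess is easy to check outside TensorFlow. Here is a small sketch, assuming NumPy is installed; an in-memory string with made-up values stands in for the real weights file.

```python
import io
import numpy as np

# Stand-in for a saved weights file such as weights_h1.txt; the values are made up.
weights_text = "0.1 0.2 0.3\n0.4 0.5 0.6\n"

default_load = np.loadtxt(io.StringIO(weights_text))
as_float32 = np.loadtxt(io.StringIO(weights_text)).astype(np.float32)
direct_float32 = np.loadtxt(io.StringIO(weights_text), dtype=np.float32)

print(default_load.dtype)    # float64
print(as_float32.dtype)      # float32
print(direct_float32.dtype)  # float32
print(as_float32.shape)      # (2, 3)
```

Either the .astype(np.float32) call or the dtype=np.float32 argument gives an array that matches the float32 placeholders, so tf.Variable(initial) ends up with the same element type as the input to tf.matmul().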
Although this is an old question, I would like to add that I came across the same problem. I resolved it by using dtype=tf.float64 for the parameter initialization and for the X and Y placeholders as well.
Here is a snippet of my code.
X = tf.placeholder(shape=[n_x, None],dtype=tf.float64)
Y = tf.placeholder(shape=[n_y, None],dtype=tf.float64)
and
parameters['W' + str(l)] = tf.get_variable('W' + str(l), [layers_dims[l],layers_dims[l-1]],dtype=tf.float64, initializer = tf.contrib.layers.xavier_initializer(seed = 1))
parameters['b' + str(l)] = tf.get_variable('b' + str(l), [layers_dims[l],1],dtype=tf.float64, initializer = tf.zeros_initializer())
Declaring all placeholders and parameters with the float64 datatype resolves this issue.
For TensorFlow 2
You can cast one of the tensors, for example like this:
_X = tf.cast(_X, dtype='float64')
You can get rid of this error by setting all layers to have a default dtype of float64:
tf.keras.backend.set_floatx('float64')