TensorFlow error: TensorShape() must have the same rank - python

def compileActivation(self, net, layerNum):
    variable = net.x if layerNum == 0 else net.varArrayA[layerNum - 1]
    #print tf.expand_dims(net.dropOutVectors[layerNum], 1)
    #print net.varWeights[layerNum]['w'].get_shape().as_list()
    z = tf.matmul((net.varWeights[layerNum]['w']), (variable * (tf.expand_dims(net.dropOutVectors[layerNum], 1) if self.dropout else 1.0))) + tf.expand_dims(net.varWeights[layerNum]['b'], 1)
    a = self.activation(z, self.pool_size)
    net.varArrayA.append(a)
The function above computes the pre-activation z for one layer and passes it into a sigmoid activation.
When I try to execute it, I get the following error:
ValueError: Shapes TensorShape([Dimension(-2)]) and TensorShape([Dimension(None), Dimension(None)]) must have the same rank
The Theano equivalent for computing z works just fine:
z = T.dot(net.varWeights[layerNum]['w'], variable * (net.dropOutVectors[layerNum].dimshuffle(0, 'x') if self.dropout else 1.0)) + net.varWeights[layerNum]['b'].dimshuffle(0, 'x')

Mihir,
When I encountered this problem, it was because the arrays in my feed dictionary did not match the shapes of my placeholders. You also need to know how to run the graph in a session: tf.Session.run(fetches, feed_dict=None)
Here's my code to make the placeholders:
# Note: this placeholder is for the input data in the feed dict
input_placeholder = tf.placeholder(tf.float32, shape=(batch_size, FLAGS.InputLayer))
# Not sure yet what this will be used for.
desired_output_placeholder = tf.placeholder(tf.float32, shape=(batch_size, FLAGS.OutputLayer))
Here's my fill feed dictionary function:
def fill_feed_dict(data_sets_train, input_pl, output_pl):
    ti_feed, dto_feed = data_sets_train.next_batch(FLAGS.batch_size)
    feed_dict = {
        input_pl: ti_feed,
        output_pl: dto_feed
    }
    return feed_dict
Later I do this:
# Fill a feed dictionary with the actual set of images and labels
# for this particular training step.
feed_dict = fill_feed_dict(data_sets.train, input_placeholder, desired_output_placeholder)
Then to run the session and fetch the outputs I have this line
_, l = sess.run([train_op, loss], feed_dict=feed_dict)
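A mismatch between a fed array and its placeholder can also be caught before sess.run is called. The following is only a minimal sketch: check_feed_shape is a hypothetical helper (it is not part of TensorFlow), and the (None, 784) shape is an illustrative assumption in which None stands for an unspecified dimension.

```python
import numpy as np

def check_feed_shape(name, declared_shape, value):
    """Raise ValueError if `value` cannot be fed to a placeholder of `declared_shape`.

    `declared_shape` is a tuple; None means "any size" for that dimension.
    """
    actual = np.asarray(value).shape
    if len(actual) != len(declared_shape):
        raise ValueError(
            "%s: rank mismatch, declared %r but got %r" % (name, declared_shape, actual))
    for want, got in zip(declared_shape, actual):
        if want is not None and want != got:
            raise ValueError(
                "%s: shape mismatch, declared %r but got %r" % (name, declared_shape, actual))
    return actual

batch = np.zeros((32, 784), dtype=np.float32)
ok_shape = check_feed_shape("input_placeholder", (None, 784), batch)

try:
    # A single row has rank 1, but the placeholder was declared with rank 2.
    check_feed_shape("input_placeholder", (None, 784), batch[0])
    message = ""
except ValueError as err:
    message = str(err)
```

Running a check like this on every entry of the feed dictionary points at the offending array directly, instead of at the op deep in the graph where TensorFlow first notices the problem.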


How to use a batch_size of Keras tensor at the model building time?

I want to use an external program as a custom operation.
Because automatic differentiation is not available for it, I wrote code that provides the gradients numerically. However, because it has to compute batch_size derivatives,
I wrote it to get batch_size from the shape of x.
The following example uses a numpy function as the external program:
f(x) = np.sum(x**2)
(In fact, for this simple numpy function no loop over batch_size is necessary, but the code is written for a general external function.)
@tf.custom_gradient
def custom_op(x):
    # without using numpy, use external function
    # assume x shape = (batch_size,3)
    batch_size = x.shape[0]
    input_length = x.shape[1]
    # assert input_length==3
    yout = []  # shape should be (batch_size,1)
    gout = []  # shape should be (batch_size,3)
    for i in range(batch_size):
        inputs = x[i,:]  # shape (3,)
        y = np.sum(inputs**2)  # scalar
        yout.append(y)
        # compute differences
        dy = []
        for j in range(len(inputs)):
            delta = np.zeros_like(inputs)
            delta[j] = np.abs(inputs[j])*0.001
            yplus = np.sum((inputs + delta)**2)  # change only j-th input
            grad = (yplus-y)/delta[j]  # scalar
            dy.append(grad)
        gout.append(dy)
    yout = tf.convert_to_tensor(yout,dtype='float32')  # (batch_size,)
    yout = tf.reshape(yout,shape=(batch_size,1))  # (batch_size,1)
    gout = tf.convert_to_tensor(gout,dtype='float32')  # (batch_size,input_length)
    gout = tf.reshape(gout,shape=(batch_size,input_length))  # (batch_size,input_length)
    def grad(upstream):
        return upstream*gout
    return yout, grad
x = tf.Variable([[1.,2.,3.],[2.,3.,4.]],dtype='float32')
with tf.GradientTape() as tape:
    y = custom_op(x)
tape.gradient(y,x)
and found that it works.
However, when I tried to use it in a Keras model, for example,
def construct_model():
    inputs = tf.keras.Input(shape=(3,)) #input array
    x = tf.keras.layers.Dense(1)(inputs)
    outputs = custom_op(x)
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    optimizer = 'adam'
    model.compile(loss='mean_squared_error',
                  optimizer=optimizer,
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model
model = construct_model()
it raises an error
because the KerasTensor "inputs" does not have a specified batch_size.
I tried to specify the batch size with "tf.keras.Input(shape=(3,),batch_size=2)".
However, that also raises errors because of the use of a KerasTensor.
How should I change custom_op to be compatible with Keras?
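The forward-difference scheme used inside custom_op can be checked on its own, outside TensorFlow. The sketch below is an illustration only: forward_difference_grad and its rel_step and min_step parameters are hypothetical names, f stands in for the external program, and it assumes the external function can accept a whole batch at once (which a real external program may not). The min_step floor is an addition that avoids a zero step when an input is exactly 0.

```python
import numpy as np

def f(x):
    # Stand-in for the external program: one scalar per row of x.
    return np.sum(x ** 2, axis=1)

def forward_difference_grad(func, x, rel_step=1e-3, min_step=1e-6):
    """Approximate d func / d x for every row of x with forward differences."""
    x = np.asarray(x, dtype=np.float64)
    y = func(x)                        # shape (batch_size,)
    grad = np.empty_like(x)            # shape (batch_size, n_inputs)
    for j in range(x.shape[1]):
        # Relative step per sample, with a floor so the step is never zero.
        step = np.maximum(np.abs(x[:, j]) * rel_step, min_step)
        shifted = x.copy()
        shifted[:, j] += step          # perturb only the j-th input
        grad[:, j] = (func(shifted) - y) / step
    return y, grad

x = np.array([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
y, grad = forward_difference_grad(f, x)
```

For f(x) = sum(x**2) the exact gradient is 2*x, so grad should agree with 2*x up to the truncation error of the forward difference, which is on the order of the step size.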

normalize the input of relu function encountered a runtime error

When I try to pass the maximum activation value from the previous layer to normalize the input of the ReLU in the next layer, I encounter the runtime error below. However, when I pass a fixed value, it works without any error.
File "/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py", line 175, in backward allow_unreachable=True, accumulate_grad=True) # Calls into the C++ engine to run the backward pass
RuntimeError: Trying to backward through the graph a second time (or directly access saved
tensors after they have already been freed). Saved intermediate values of the graph are freed
when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to
backward through the graph a second time or if you need to access saved tensors after calling
backward.
As you can see in the code below, I pass the argument prev_layer_max from the previous layer and encounter the error:
class th_norm_ReLU(nn.Module):
    def __init__(self, modify):
        super(th_norm_ReLU, self).__init__()
        self.therelu = F.relu
    def forward(self, input, prev_layer_max):
        output = input * (prev_layer_max / input.max())
        norm_output = self.therelu(output)
        return norm_output
But if I use a fixed value instead of the passed prev_layer_max argument, as in the code below where I set it to 1, it works normally without any error:
    def forward(self, input, prev_layer_max = 1):
        output = input * (1 / input.max())
        norm_output = self.therelu(output)
The training loop is as follows:
for epoch in range(params.epochs):
    running_loss = 0
    start_time = time.time()
    for i, (images, labels) in enumerate(train_loader):
        model.train()
        model.zero_grad()
        optimizer.zero_grad()
        labels.to(device)
        images = images.float().to(device)
        outputs = model(images, epoch)
        loss = criterion(outputs.cpu(), labels)
        running_loss += loss.item()
        loss.backward()
        optimizer.step()
Here is the forward method of the model, where I record the maximum activation of each layer in a list (thresh_list):
    def forward(self, input, epoch):
        x = self.conv1(input)
        x = self.relu(x,1)
        self.thresh_list[0] = max(self.thresh_list[0], x.max()) #to get the max activation
        x = self.conv_dropout(x)
        x = self.conv2(x)
        x = self.relu(x, self.thresh_list[0])
        self.thresh_list[1] = max(self.thresh_list[1], x.max())
        x = self.pool1(x)
        x = self.conv_dropout(x)
        x = self.conv3(x)
        x = self.relu(x, self.thresh_list[1] )
        self.thresh_list[2] = max(self.thresh_list[2], x.max())
The ReLU module I call is:
self.relu = th_norm_ReLU(True)
and the th_norm_ReLU module is shown above.
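One likely cause, assuming thresh_list ends up holding the tensors returned by x.max(): those tensors still carry the autograd graph of the batch that produced them, so using them in a later batch makes backward() walk into a graph whose buffers were already freed. The sketch below is a reduced illustration of the usual remedy, which is to store only the detached value. ThNormReLU, the single Linear layer, and running_max are stand-ins invented for this example, not the code from the question.

```python
import torch
from torch import nn
import torch.nn.functional as F

class ThNormReLU(nn.Module):
    """Hypothetical stand-in for th_norm_ReLU in the question."""
    def forward(self, input, prev_layer_max):
        return F.relu(input * (prev_layer_max / input.max()))

torch.manual_seed(0)
layer = nn.Linear(4, 4)
relu = ThNormReLU()
optimizer = torch.optim.SGD(layer.parameters(), lr=0.01)
running_max = torch.tensor(1.0)   # plays the role of thresh_list[0]

for step in range(3):
    optimizer.zero_grad()
    x = layer(torch.randn(8, 4))
    out = relu(x, running_max)
    # detach() keeps only the value, so the next step does not reach back
    # into this step's graph, which is freed by backward().
    running_max = torch.maximum(running_max, x.max().detach())
    loss = out.sum()
    loss.backward()
    optimizer.step()
```

Storing x.max().item() instead would have a similar effect, at the cost of a device-to-host copy on every call.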

What is the correct way to update an input variable during training?

I have an input
inp = torch.tensor([1.0])
and a neural network
class Model_updater(nn.Module):
    def __init__(self):
        super(Model_updater, self).__init__()
        self.fc1 = nn.Linear(1, 2)
        self.fc2 = nn.Linear(2, 3)
        self.fc3 = nn.Linear(3, 2)
    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net_updater = Model_updater()
opt_updater = optim.Adam(net_updater.parameters())
I'm trying to update my input using the neural network's output:
inp = torch.tensor([1.0])
epochs = 3
for i in range(epochs):
    opt_updater.zero_grad()
    inp_copy = inp.detach().clone()
    mu, sigma = net_updater(inp_copy)
    dist1 = Normal(mu, torch.abs(sigma))
    a = dist1.rsample()
    inp += a
    loss = torch.tensor(5.0) - inp
    loss.backward(retain_graph=True)
    opt_updater.step()
But I get the error:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [3, 2]], which is output 0 of TBackward, is at version 2; expected version 1
I also tried changing the loss calculation to
loss = torch.tensor(5.0) - inp_copy
but got the error
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
I also tried without retain_graph=True, but then I get
RuntimeError: Trying to backward through the graph a second time,
but the saved intermediate results have already been freed. Specify retain_graph=True when calling .backward() or autograd.grad() the first time.
This doesn't really make sense to me, because I don't see where I'm calling backward() twice.
Most likely, this is what you want:
inp1 = inp + a # create a separate variable for updated value
inp.data = inp1.data # update the value without touching the graph
loss = torch.tensor(5.0) - inp1 # use updated value which has gradient
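For reference, here is a complete loop built around that idea, as a sketch rather than a definitive fix. It differs from the snippet above in two labeled ways: it carries the value forward with inp1.detach() instead of assigning to inp.data, and it adds a small 1e-6 offset to the scale so that Normal never receives a zero standard deviation. Both are assumptions of this example.

```python
import torch
from torch import nn, optim
from torch.distributions import Normal

class Model_updater(nn.Module):
    def __init__(self):
        super(Model_updater, self).__init__()
        self.fc1 = nn.Linear(1, 2)
        self.fc2 = nn.Linear(2, 3)
        self.fc3 = nn.Linear(3, 2)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

torch.manual_seed(0)
net_updater = Model_updater()
opt_updater = optim.Adam(net_updater.parameters())
inp = torch.tensor([1.0])

for i in range(3):
    opt_updater.zero_grad()
    mu, sigma = net_updater(inp)
    dist1 = Normal(mu, torch.abs(sigma) + 1e-6)  # offset is an addition of this sketch
    a = dist1.rsample()
    inp1 = inp + a                     # new tensor, part of this step's graph only
    loss = torch.tensor(5.0) - inp1
    loss.backward()                    # no retain_graph needed
    opt_updater.step()
    inp = inp1.detach()                # carry only the value into the next step
```

Because inp is never modified in place and never has a grad_fn, each iteration builds a fresh graph, which is why backward() no longer reaches into an earlier iteration.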

InvalidArgumentError: You must feed a value for placeholder tensor 'ground_truth' with dtype double

I am trying to understand transfer learning with TensorFlow, but I am getting the error stated in the title.
This is my code:
def add_final_training_ops(graph, class_count, final_tensor_name,
                           ground_truth_tensor_name):
    """Adds a new softmax and fully-connected layer for training.
    We need to retrain the top layer to identify our new classes, so this function
    adds the right operations to the graph, along with some variables to hold the
    weights, and then sets up all the gradients for the backward pass.
    The set up for the softmax and fully-connected layers is based on:
    https://tensorflow.org/versions/master/tutorials/mnist/beginners/index.html
    Args:
      graph: Container for the existing model's Graph.
      class_count: Integer of how many categories of things we're trying to
        recognize.
      final_tensor_name: Name string for the new final node that produces results.
      ground_truth_tensor_name: Name string of the node we feed ground truth data
        into.
    Returns:
      Nothing.
    """
    bottleneck_tensor1 = graph.get_tensor_by_name(ensure_name_has_port(
        BOTTLENECK_TENSOR_NAME))
    bottleneck_tensor = tf.placeholder_with_default(bottleneck_tensor1, shape=[None, 2048])
    layer_weights = tf.Variable(
        tf.truncated_normal([BOTTLENECK_TENSOR_SIZE, class_count], stddev=0.001),
        name='final_weights')
    layer_biases = tf.Variable(tf.zeros([class_count]), name='final_biases')
    logits = tf.matmul(bottleneck_tensor, layer_weights,
                       name='final_matmul') + layer_biases
    tf.nn.softmax(logits, name=final_tensor_name)
    ground_truth_placeholder = tf.placeholder(tf.float64,
                                              [None, class_count],
                                              name=ground_truth_tensor_name)
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        logits=logits, labels=ground_truth_placeholder)
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    train_step = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(
        cross_entropy_mean)
    return train_step, cross_entropy_mean
def do_train(sess,X_input, Y_input, X_validation, Y_validation):
    ground_truth_tensor_name = 'ground_truth'
    mini_batch_size = 10
    n_train = X_input.shape[0]
    graph = create_graph()
    train_step, cross_entropy = add_final_training_ops(
        graph, len(classes), FLAGS.final_tensor_name,
        ground_truth_tensor_name)
    init = tf.initialize_all_variables()
    sess.run(init)
    evaluation_step = add_evaluation_step(graph, FLAGS.final_tensor_name, ground_truth_tensor_name)
    # Get some layers we'll need to access during training.
    bottleneck_tensor1 = graph.get_tensor_by_name(ensure_name_has_port(BOTTLENECK_TENSOR_NAME))
    bottleneck_tensor = tf.placeholder_with_default(bottleneck_tensor1, shape=[None, 2048])
    ground_truth_tensor1 = graph.get_tensor_by_name(ensure_name_has_port(ground_truth_tensor_name))
    ground_truth_tensor = tf.placeholder_with_default(ground_truth_tensor1, shape=[None, len(classes)])
    i=0
    epocs = 1
    for epoch in range(epocs):
        shuffledRange = np.random.permutation(n_train)
        y_one_hot_train = encode_one_hot(len(classes), Y_input)
        y_one_hot_validation = encode_one_hot(len(classes), Y_validation)
        shuffledX = X_input[shuffledRange,:]
        shuffledY = y_one_hot_train[shuffledRange]
        for Xi, Yi in iterate_mini_batches(shuffledX, shuffledY, mini_batch_size):
            print Xi.shape
            print type(Xi)
            print type(Yi)
            print Yi.shape
            print Yi.dtype
            print Yi[0]
            sess.run(train_step,
                     feed_dict={bottleneck_tensor: Xi,
                                ground_truth_tensor: Yi})
The print statements produce the following output:
(10, 2048)
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
(10, 5)
float64
[ 0. 0. 0. 1. 0.]
I am getting the error at:
sess.run(train_step,feed_dict={bottleneck_tensor: Xi,ground_truth_tensor: Yi})
Can someone tell me why I am facing this error?
The problem is that you created a placeholder in add_final_training_ops that you never feed. You might think that the placeholder ground_truth_tensor that you create in do_train is the same one, but it is not: it is a new placeholder, even though it takes its default value from the original one. The training op still depends on the original placeholder, so feeding the new one does not satisfy it.
The easiest fix is probably to return the placeholder from add_final_training_ops and feed that one instead.
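A reduced sketch of that fix follows. It is not the retraining script from the question: the Inception graph is replaced by a plain features placeholder, the sizes (5 classes, 8 features) and the learning rate are made up, and it uses the tensorflow.compat.v1 module so that the TF1-style graph code runs on a TensorFlow 2 installation.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

def add_final_training_ops(class_count, feature_size, ground_truth_tensor_name):
    # Simplified stand-in for the function in the question.
    features = tf.placeholder(tf.float32, [None, feature_size], name="features")
    ground_truth = tf.placeholder(tf.float32, [None, class_count],
                                  name=ground_truth_tensor_name)
    weights = tf.Variable(tf.zeros([feature_size, class_count]), name="final_weights")
    logits = tf.matmul(features, weights)
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(
        labels=ground_truth, logits=logits)
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(cross_entropy_mean)
    # Return the placeholders so the caller feeds exactly these tensors.
    return train_step, cross_entropy_mean, features, ground_truth

train_step, loss, features, ground_truth = add_final_training_ops(5, 8, "ground_truth")

Xi = np.random.rand(10, 8).astype(np.float32)
Yi = np.eye(5, dtype=np.float32)[np.random.randint(0, 5, size=10)]  # one-hot labels

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    _, loss_value = sess.run([train_step, loss],
                             feed_dict={features: Xi, ground_truth: Yi})
```

The keys of feed_dict are the very tensor objects that the loss was built from, so there is no second placeholder that could be left unfed.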

How to fix MatMul Op has type float64 that does not match type float32 TypeError?

I am trying to save neural network weights to a file and then restore them by initializing the network from that file instead of using random initialization. My code works fine with random initialization. But when I initialize the weights from the file, I get the error TypeError: Input 'b' of 'MatMul' Op has type float64 that does not match type float32 of argument 'a'. I don't know how to solve this issue. Here is my code:
Model Initialization
# Parameters
training_epochs = 5
batch_size = 64
display_step = 5
batch = tf.Variable(0, trainable=False)
regualarization = 0.008
# Network Parameters
n_hidden_1 = 300 # 1st layer num features
n_hidden_2 = 250 # 2nd layer num features
n_input = model.layer1_size # Vector input (sentence shape: 30*10)
n_classes = 12 # Sentence Category detection total classes (0-11 categories)
#History storing variables for plots
loss_history = []
train_acc_history = []
val_acc_history = []
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
Model parameters
#loading Weights
def weight_variable(fan_in, fan_out, filename):
    stddev = np.sqrt(2.0/fan_in)
    if (filename == ""):
        initial = tf.random_normal([fan_in,fan_out], stddev=stddev)
    else:
        initial = np.loadtxt(filename)
        print initial.shape
    return tf.Variable(initial)
#loading Biases
def bias_variable(shape, filename):
    if (filename == ""):
        initial = tf.constant(0.1, shape=shape)
    else:
        initial = np.loadtxt(filename)
        print initial.shape
    return tf.Variable(initial)
# Create model
def multilayer_perceptron(_X, _weights, _biases):
    layer_1 = tf.nn.relu(tf.add(tf.matmul(_X, _weights['h1']), _biases['b1']))
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, _weights['h2']), _biases['b2']))
    return tf.matmul(layer_2, weights['out']) + biases['out']
# Store layers weight & bias
weights = {
    'h1': w2v_utils.weight_variable(n_input, n_hidden_1, filename="weights_h1.txt"),
    'h2': w2v_utils.weight_variable(n_hidden_1, n_hidden_2, filename="weights_h2.txt"),
    'out': w2v_utils.weight_variable(n_hidden_2, n_classes, filename="weights_out.txt")
}
biases = {
    'b1': w2v_utils.bias_variable([n_hidden_1], filename="biases_b1.txt"),
    'b2': w2v_utils.bias_variable([n_hidden_2], filename="biases_b2.txt"),
    'out': w2v_utils.bias_variable([n_classes], filename="biases_out.txt")
}
# Define loss and optimizer
#learning rate
# Optimizer: set up a variable that's incremented once per batch and
# controls the learning rate decay.
learning_rate = tf.train.exponential_decay(
    0.02*0.01, # Base learning rate. #0.002
    batch * batch_size, # Current index into the dataset.
    X_train.shape[0], # Decay step.
    0.96, # Decay rate.
    staircase=True)
# Construct model
pred = tf.nn.relu(multilayer_perceptron(x, weights, biases))
#L2 regularization
l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()])
#Softmax loss
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
#Total_cost
cost = cost+ (regualarization*0.5*l2_loss)
# Adam Optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost,global_step=batch)
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Initializing the variables
init = tf.initialize_all_variables()
print "Network Initialized!"
ERROR DETAILS
The tf.matmul() op does not perform automatic type conversions, so both of its inputs must have the same element type. The error message you are seeing indicates that you have a call to tf.matmul() where the first argument has type tf.float32, and the second argument has type tf.float64. You must convert one of the inputs to match the other, for example using tf.cast(x, tf.float32).
Looking at your code, I don't see anywhere that a tf.float64 tensor is explicitly created (the default dtype for floating-point values in the TensorFlow Python API—e.g. for tf.constant(37.0)—is tf.float32). I would guess that the errors are caused by the np.loadtxt(filename) calls, which might be loading an np.float64 array. You can explicitly change them to load np.float32 arrays (which are converted to tf.float32 tensors) as follows:
initial = np.loadtxt(filename).astype(np.float32)
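The dtype behaviour of np.loadtxt is easy to confirm in isolation. The sketch below reads from an in-memory io.StringIO buffer instead of a weights file, and the values in it are made up for illustration. It also shows the dtype argument of np.loadtxt, which is an alternative to calling astype afterwards.

```python
import io
import numpy as np

text = "0.1 0.2\n0.3 0.4\n"   # made-up stand-in for a saved weights file

as_default = np.loadtxt(io.StringIO(text))                    # default dtype
as_float32 = np.loadtxt(io.StringIO(text), dtype=np.float32)  # request float32 directly
cast_after = np.loadtxt(io.StringIO(text)).astype(np.float32) # cast after loading

print(as_default.dtype, as_float32.dtype, cast_after.dtype)
```

Either float32 variant yields an array that tf.Variable turns into a float32 tensor, which then matches float32 placeholders in tf.matmul.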
Although this is an old question, I would like to add that I came across the same problem. I resolved it by using dtype=tf.float64 both for parameter initialization and for the X and Y placeholders.
Here is a snippet of my code:
X = tf.placeholder(shape=[n_x, None],dtype=tf.float64)
Y = tf.placeholder(shape=[n_y, None],dtype=tf.float64)
and
parameters['W' + str(l)] = tf.get_variable('W' + str(l), [layers_dims[l],layers_dims[l-1]],dtype=tf.float64, initializer = tf.contrib.layers.xavier_initializer(seed = 1))
parameters['b' + str(l)] = tf.get_variable('b' + str(l), [layers_dims[l],1],dtype=tf.float64, initializer = tf.zeros_initializer())
Declaring all placeholders and parameters with the float64 datatype resolves this issue.
For TensorFlow 2:
You can cast one of the tensors, for example:
_X = tf.cast(_X, dtype='float64')
You can also get rid of this error by setting all layers to have a default dtype of float64:
tf.keras.backend.set_floatx('float64')
