I want to compute the reconstruction accuracy of my autoencoder using CrossEntropyLoss:
ae_criterion = nn.CrossEntropyLoss()
ae_loss = ae_criterion(X, Y)
where X is the autoencoder's reconstruction and Y is the target (since it is an autoencoder, Y is the same as the original input X).
Both X and Y have shape [42, 32, 130] = [batch_size, timesteps, number_of_classes]. When I run the code above I get the following error:
ValueError: Expected target size (42, 130), got torch.Size([42, 32, 130])
After looking at the docs, I'm still unsure how I should call nn.CrossEntropyLoss() correctly. It seems that I should change Y to have shape [42, 32, 1], with each element being a scalar in the interval [0, 129] (or [1, 130]). Am I right?
Is there a way to avoid this? Since X and Y are between 0 and 1, could I just use binary cross-entropy loss element-wise in an equivalent way?
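For CrossEntropyLoss, Y must have shape (42, 32), and each element must be a Long (int64) class index in the interval [0, 129]. The class dimension of X must also come second, so X has to be permuted to shape (42, 130, 32); this is why the error reports an expected target size of (42, 130) for the unpermuted input.
Alternatively, you may want to use BCELoss (which expects probabilities) or BCEWithLogitsLoss (which expects raw logits) for your problem; both take a target with the same shape as the input.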
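A minimal sketch of both options, assuming the reconstruction holds raw logits and the target is one-hot along the last axis (neither is stated in the question). The tensors `logits`, `class_idx` and `target_onehot` are made-up stand-ins for the real data:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
batch_size, timesteps, n_classes = 42, 32, 130

# Hypothetical stand-ins for the autoencoder output (raw logits) and a one-hot target.
logits = torch.randn(batch_size, timesteps, n_classes)
class_idx = torch.randint(0, n_classes, (batch_size, timesteps))
target_onehot = F.one_hot(class_idx, n_classes).float()

# Option 1: CrossEntropyLoss wants (N, C, d_1) logits and (N, d_1) Long class indices.
ce_target = target_onehot.argmax(dim=-1)  # shape (42, 32), dtype int64
ce_loss = nn.CrossEntropyLoss()(logits.permute(0, 2, 1), ce_target)

# Option 2: element-wise binary cross-entropy keeps the original shapes.
bce_loss = nn.BCEWithLogitsLoss()(logits, target_onehot)
```

The two losses are not numerically equivalent: cross-entropy normalizes across the 130 classes with a softmax, while the binary version treats each of the 130 outputs as an independent yes/no prediction.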
I am trying to train a pretty simple 2-layer neural network for a multi-class classification task. I am using CrossEntropyLoss and I get the following error in my training loop, at the point where I calculate the loss: ValueError: Expected target size (128, 4), got torch.Size([128]).
My last layer is a softmax so it outputs the probabilities of each of the 4 classes. My target values are a vector of dimension 128 (just the class values). Am I initializing the CrossEntropyLoss object incorrectly?
I looked up existing posts; this one seemed the most relevant:
https://discuss.pytorch.org/t/valueerror-expected-target-size-128-10000-got-torch-size-128-1/29424
However, if I had to squeeze my target values, how would that work? Right now they are just class values, e.g. [0 3 1 0]. Is that not how they are supposed to look? I would think that the loss function takes the highest probability from the last layer and associates it with the appropriate class index.
Details:
This is using PyTorch
Python version is 3.7
NN architecture is: embedding -> pool -> h1 -> relu -> h2 -> softmax
Model Def (EDITED):
self.embedding_layer = create_embedding_layer(embeddings)
self.pool = nn.MaxPool1d(1)
self.h1 = nn.Linear(embedding_dim, embedding_dim)
self.h2 = nn.Linear(embedding_dim, 4)
self.s = nn.Softmax(dim=2)
forward pass:
x = self.embedding_layer(x)
x = self.pool(x)
x = self.h1(x)
x = F.relu(x)
x = self.h2(x)
x = self.s(x)
return x
The issue is that the output of your model is a tensor shaped as (batch, seq_length, n_classes). Each sequence element in each batch is a four-element tensor corresponding to the predicted probability associated with each class (0, 1, 2, and 3). Your target tensor is shaped (batch,) which is usually the correct shape (you didn't use one-hot-encodings). However, in this case, you need to provide a target for each one of the sequence elements.
Assuming the target is the same for each element of your sequence (this might not be true though and is entirely up to you to decide), you may repeat the targets seq_length times. nn.CrossEntropyLoss allows you to provide additional axes, but you have to follow a specific shape layout:
Input: (N, C) where C = number of classes, or (N, C, d_1, d_2, ..., d_K) with K≥1 in the case of K-dimensional loss.
Target: (N) where each value is 0 ≤ targets[i] ≤ C−1 , or (N, d_1, d_2, ..., d_K) with K≥1 in the case of K-dimensional loss.
In your case, C=4 and seq_length (what you referred to as D) would be d_1.
>>> seq_length = 10
>>> out = torch.rand(128, seq_length, 4) # mocking model's output
>>> y = torch.randint(0, 4, (128,)) # target tensor of class indices in [0, 3]
>>> criterion = nn.CrossEntropyLoss()
>>> out_perm = out.permute(0, 2, 1)
>>> out_perm.shape
torch.Size([128, 4, 10]) # (N, C, d_1)
>>> y_rep = y[:, None].repeat(1, seq_length)
>>> y_rep.shape
torch.Size([128, 10]) # (N, d_1)
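Then call your loss function with criterion(out_perm, y_rep). Note that nn.CrossEntropyLoss applies log-softmax internally and expects raw scores, so the model should return the output of h2 directly rather than passing it through nn.Softmax.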
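If permuting feels awkward, an alternative (not part of the answer above, just a sketch) is to flatten the batch and sequence axes so that every sequence element becomes its own sample. With the default mean reduction this should give the same value as the permuted layout; the mock tensors below are assumptions for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_length, n_classes = 10, 4

out = torch.randn(128, seq_length, n_classes)  # mock logits, (N, d_1, C)
y = torch.randint(0, n_classes, (128,))        # one class index per sample
y_rep = y[:, None].repeat(1, seq_length)       # (N, d_1)

criterion = nn.CrossEntropyLoss()

# Layout from the documentation: (N, C, d_1) input with (N, d_1) target.
loss_perm = criterion(out.permute(0, 2, 1), y_rep)

# Flattened layout: (N * d_1, C) input with (N * d_1,) target.
loss_flat = criterion(out.reshape(-1, n_classes), y_rep.reshape(-1))
```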
Checking frozen tensorflow model:
wget https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz
I see that the input is Tensor 'input:0', which has shape '(1, 299, 299, 3)'. Is it possible to change the input to (None, 299, 299, 3) so that batch prediction with batch_size > 1 becomes available?
In the general case it may not be possible to do this, as there could be operations that rely on the first dimension being 1 (e.g. suppose tf.squeeze is used on input:0). However, you can try to replace the input with a placeholder of the desired shape. You can do this with tf.graph_util.import_graph_def. If the operations allow it, then TensorFlow should import the graph adjusting the node shapes accordingly. See the following example:
import tensorflow as tf

# First graph
with tf.Graph().as_default():
    x = tf.placeholder(tf.float32, [1, 10, 20], name='Input')
    y = tf.square(x, name='Output')
    print(y)
    # Tensor("Output:0", shape=(1, 10, 20), dtype=float32)
    gd = tf.get_default_graph().as_graph_def()

# Second graph
with tf.Graph().as_default():
    x = tf.placeholder(tf.float32, [None, 10, 20], name='Input')
    y, = tf.graph_util.import_graph_def(gd, input_map={'Input:0': x},
                                        return_elements=['Output:0'], name='')
    print(y)
    # Tensor("Output:0", shape=(?, 10, 20), dtype=float32)
In the first graph, the Output:0 node has a shape (1, 10, 20), which is inferred from the shape of the Input:0 tensor. However, when I take the graph definition from the first graph and load in the second graph, replacing the Input:0 tensor with a placeholder with undefined first dimension, the shape of Output:0 is updated to (?, 10, 20). If I run the operations in the second graph giving an input value with a first dimension greater than one, it will work as expected, because the graph is correct.
I am currently working on a neural network that takes some inputs and returns 2 outputs. I used the 2 outputs in a regression problem where they are the 2 coordinates, X and Y.
My problem doesn't need the X and Y values themselves but the angle the vector is facing, which is atan2(y, x).
I am trying to create a custom Keras metric and a loss function that perform an atan2 operation on the elements of the predicted tensor and the true tensor, so as to better train the network on my task.
The shape of the output tensor in the metric is [?, 2], and I want a function that loops through the tensor and applies atan2(tensor[itr, 1], tensor[itr, 0]) to each row to get another tensor.
I have tried using tf.split and tf.slice.
I don't want to convert it into a NumPy array and back to TensorFlow, for performance reasons.
I have tried to get the shape of the tensors using tensor.get_shape().as_list() and iterate through it.
self.model.compile(loss="mean_absolute_error",
                   optimizer=tf.keras.optimizers.Adam(lr=0.01),
                   metrics=[vect2d_to_angle_metric])

# This is the function I want to work on
def vect2d_to_angle_metric(y_true, y_predicted):
    print("y_true = ", y_true)
    print("y_predicted = ", y_predicted)
    # tf.shape() returns the dynamic shape as a tensor; .shape is a property, not a method
    print("y_true shape = ", tf.shape(y_true))
    print("y_predicted shape = ", tf.shape(y_predicted))
The printout of the above function is:
y_true = Tensor("dense_2_target:0", shape=(?, ?), dtype=float32)
y_predicted = Tensor("dense_2/BiasAdd:0", shape=(?, 2), dtype=float32)
y_true shape = Tensor("metrics/vect2d_to_angle_metric/Shape:0", shape=(2,), dtype=int32)
y_predicted shape = Tensor("metrics/vect2d_to_angle_metric/Shape_1:0", shape=(2,), dtype=int32)
Python pseudo-code of the functionality I want to apply in the TensorFlow function:
def evaluate(self):
    mean_array = []
    for i in range(len(x_test)):
        inputs = x_test[i]
        prediction = self.model.getprediction(i)
        predicted_angle = np.arctan2(result[i][1], result[i][0])
        real_angle = np.arctan2(float(self.y_test[i][1]), float(self.y_test[i][0]))
        # Percentage error of the predicted angle relative to the real angle
        mean_array.append(abs(predicted_angle - real_angle) / real_angle * 100)
I expect to slice the 2 columns of the tensor, [i][0] and [i][1], apply tf.atan2() to them, and finally build another tensor from the result so that I can continue with other calculations and pass it to the custom loss.
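A minimal sketch of one way to do this without a Python loop, assuming TensorFlow 2.x with eager execution (the question itself uses 1.x-style graph tensors, where the same slicing applies). Column slicing with `[:, 0]` and `[:, 1]` hands whole columns to `tf.math.atan2`, which works element-wise. The helper name `angle_of`, the choice of a mean absolute difference in radians, and the wrapping of the difference into [-pi, pi] are assumptions, not something stated in the question:

```python
import math
import tensorflow as tf

def angle_of(vectors):
    """Angle atan2(y, x) for each row of a [batch, 2] tensor of (x, y) pairs."""
    return tf.math.atan2(vectors[:, 1], vectors[:, 0])

def vect2d_to_angle_metric(y_true, y_predicted):
    """Mean absolute angular difference in radians, wrapped to [-pi, pi]."""
    diff = angle_of(y_predicted) - angle_of(y_true)
    # Wrap so that, e.g., +179 degrees vs -179 degrees counts as 2 degrees apart.
    wrapped = tf.math.atan2(tf.math.sin(diff), tf.math.cos(diff))
    return tf.reduce_mean(tf.abs(wrapped))

# Small check with hand-picked vectors (hypothetical data).
y_true = tf.constant([[1.0, 0.0], [0.0, 1.0]])  # angles 0 and pi/2
y_pred = tf.constant([[0.0, 1.0], [0.0, 1.0]])  # angles pi/2 and pi/2
metric_value = vect2d_to_angle_metric(y_true, y_pred)
```

Because everything stays in TensorFlow ops, the same function can be passed in metrics=[...] or used as a loss, with no round trip through NumPy.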
I'm trying to use a Lambda layer in Keras to return the Euclidean distance between two vectors. The code is:
def distance(x):
    a = x[0]
    b = x[1]
    dist = np.linalg.norm(a - b)
    return dist

dist = Lambda(distance, output_shape=(1, 1), name='dist')([x, y])
The inputs of this layer are two vectors of shape (100, 1, 8192), where 100 is the batch size. In theory, the output is a constant. I want to use dist as the output of the model:
model = Model(inputs=[probe_input_car, probe_input_sign, gallary_input_car, gallary_input_sign], outputs=dist, name='fcn')
When I run this model, I get an error:
ValueError: Input dimension mis-match. (input[0].shape[2] = 1, input[1].shape[2] = 8192)
Apply node that caused the error: Elemwise{Composite{EQ(i0, RoundHalfToEven(i1))}}(/dist_target, Elemwise{Composite{sqrt(sqr(i0))}}.0)
Toposort index: 92
Inputs types: [TensorType(float32, 3D), TensorType(float32, 3D)]
Inputs shapes: [(100, 1, 1), (100, 1, 8192)]
Inputs strides: [(4, 4, 4), (32768, 32768, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Sum{acc_dtype=int64}(Elemwise{Composite{EQ(i0, RoundHalfToEven(i1))}}.0)]]
I think this is caused by the output_shape of the Lambda layer. How should I set the output_shape of the layer? Because I use Theano as the backend, it can't infer the output shape itself.
And if it is not caused by output_shape, where is the error?
It seems you're simply getting the wrong parts of the vector.
The message says it's trying to compute something with two tensors shaped as:
(100,1,1)
(100,1,8192)
Based on your input list, [car, signal, car2, signal2], I believe you probably want some operation between either the two car inputs or the two signal inputs.
So, your lambda layer should probably start as either:
a = x[0]
b = x[2]
or:
a = x[1]
b = x[3]
Hint: if you're able to find an equivalent function in keras backend to calculate what you want, it's probably better. I wonder how you haven't got a "disconnected" error message for using a numpy function.
The error occurs because you used np.linalg.norm() on a Theano tensor. It doesn't throw an error, but the output is definitely not what you expect.
To avoid this error, use Keras backend functions instead. For example,
dist = K.sqrt(K.sum(K.square(a - b), axis=-1, keepdims=True))
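The shape this expression produces can be checked with ordinary NumPy arrays standing in for the symbolic tensors. This is only an illustration of the reduction; inside the Lambda layer the K.* calls are still what you need. The random arrays `a` and `b` are made up for the check:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the two layer inputs, shaped (batch, 1, features).
a = rng.normal(size=(100, 1, 8192)).astype(np.float32)
b = rng.normal(size=(100, 1, 8192)).astype(np.float32)

# Same reduction as the backend expression: sum of squares over the last axis.
dist = np.sqrt(np.sum(np.square(a - b), axis=-1, keepdims=True))

# On real arrays, NumPy's norm along the same axis gives the same values.
reference = np.linalg.norm(a - b, axis=-1, keepdims=True)
```

With keepdims=True the result has shape (100, 1, 1), which matches output_shape=(1, 1) per sample.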
What happened inside np.linalg.norm(x):
x = np.asarray(x) wraps a - b into a length-1 array (of dtype object) whose only element is a Theano tensor of shape (100, 1, 8192).
sqnorm = np.dot(x, x): recall the definition of dot product. When you dot a length-1 array with itself, you're actually computing (a - b) * (a - b), or an element-wise square of a - b. That's why there's sqr(i0) in the second line of your error.
np.sqrt(sqnorm) is returned. So you can see sqrt(sqr(i0)) appear in your error.
Therefore, the output of np.linalg.norm(a - b) is a tensor of shape (100, 1, 8192), not (100, 1, 1).
Also, if you look closer into the code, Elemwise{Composite{EQ(i0, RoundHalfToEven(i1))}} is just accuracy.
def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
So the error message is trying to tell you that there's a mismatch between the shapes of y_true and y_pred. While y_true is of shape (100, 1, 1), y_pred has a shape (100, 1, 8192) because np.linalg.norm() gives wrong results for Theano tensors.
I'm using tensorflow run queues to feed my data during training time:
X, Y = tf.train.batch(
    [image, label],
    batch_size=64
)
However, X and Y have fixed shapes of [64, 32, 32, 3] and [64, 10]. At evaluation time I'd like to run the loss operation on the whole test set, which has dimensions [10000, 32, 32, 3] and [10000, 10]. I would use the feed_dict argument of session.run() to override X and Y with my values, but they have incompatible shapes.
Can I somehow instruct TensorFlow to forget about the first dimension, that is, relax [64, 32, 32, 3] to [None, 32, 32, 3]? Or is there any other way to replace X and Y with other values?
Whole dataset is small enough to fit in memory, therefore I'm using similar approach as in https://github.com/tensorflow/tensorflow/blob/r0.9/tensorflow/examples/how_tos/reading_data/fully_connected_preloaded.py
This is a bit subtle: in TensorFlow terminology, you don't actually want to reshape the tensor (i.e. change the number of elements in each dimension), but instead you want TensorFlow to "forget" a specific dimension, in order to feed values with a range of sizes.
The tf.placeholder_with_default() op is designed to support this case. It takes a default input, which in your case would be the next training batch (of shape [64, ...]); and a shape, which in your case would be the same shape as the input, with the first dimension set to None. You can then feed this placeholder with values of any batch size.
Here's an example of how you'd use it:
X_batch, Y_batch = tf.train.batch([image, label], batch_size=64)
# Alternatively, `X_shape = [None, 32, 32, 3]`
X_shape = tf.TensorShape([None]).concatenate(X_batch.get_shape()[1:])
# Alternatively, `Y_shape = [None, 10]`
Y_shape = tf.TensorShape([None]).concatenate(Y_batch.get_shape()[1:])
# Create tensors that can be fed with a less specific shape
# than `X_batch`, `Y_batch`.
X = tf.placeholder_with_default(X_batch, shape=X_shape)
Y = tf.placeholder_with_default(Y_batch, shape=Y_shape)
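To see the feeding side in action, here is a self-contained sketch that assumes TensorFlow 2.x is installed and goes through the `tf.compat.v1` graph-mode API. A constant tensor stands in for the `tf.train.batch` output, and `mean_per_image` is a made-up op used only to show that the batch size follows whatever is fed:

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1

with tf.Graph().as_default():
    # Stand-in for the queue output: a fixed training batch of 64 images.
    X_batch = tf.zeros([64, 32, 32, 3])

    # Same values by default, but the first dimension is left unspecified.
    X = tf1.placeholder_with_default(X_batch, shape=[None, 32, 32, 3])

    # Any op built on X now accepts a variable batch size.
    mean_per_image = tf.reduce_mean(X, axis=[1, 2, 3])

    with tf1.Session() as sess:
        # No feed: the default (training batch) is used.
        train_out = sess.run(mean_per_image)
        # Feed a differently sized batch, e.g. part of a test set.
        test_images = np.ones((100, 32, 32, 3), dtype=np.float32)
        test_out = sess.run(mean_per_image, feed_dict={X: test_images})
```

When nothing is fed, the op runs on the default batch; when a value is fed, the default input is bypassed for that run.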