I am trying to write a custom loss function for a Keras NN model, but it seems like the loss function is outputting the wrong value. My loss function is:
def tangle_loss3(input_tensor):
    def custom_loss(y_true, y_pred):
        true_diff = y_true - input_tensor
        pred_diff = y_pred - input_tensor
        normalized_diff = K.abs(tf.math.divide(pred_diff, true_diff))
        normalized_diff = tf.reduce_mean(normalized_diff)
        return normalized_diff
    return custom_loss
Then I use it in this simple feed-forward network:
input_layer = Input(shape=(384,), name='input')
hl_1 = Dense(64, activation='elu', name='hl_1')(input_layer)
hl_2 = Dense(32, activation='elu', name='hl_2')(hl_1)
hl_3 = Dense(32, activation='elu', name='hl_3')(hl_2)
output_layer = Dense(384, activation=None, name='output')(hl_3)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model = tf.keras.models.Model(input_layer, output_layer)
model.compile(loss=tangle_loss3(input_layer), optimizer=optimizer)
Then, to test whether the loss function is working, I created a random input and target vector and computed in numpy the value I expect, but this does not seem to match the result from Keras.
X = np.random.rand(1, 384)
y = np.random.rand(1, 384)
np.mean(np.abs((model.predict(X) - X)/(y - X)))
# returns some number
model.test_on_batch(X, y)
# always returns 0.0
Why does my loss function always return zero? And should these answers match?
I misunderstood your issue and have updated my method; it should work now. I stack the input layer and the output layer to get a new tensor that I pass as the model output.
def tangle_loss3(y_true, y_pred):
    true_diff = y_true - y_pred[0]
    pred_diff = y_pred[1] - y_pred[0]
    normalized_diff = tf.abs(tf.math.divide(pred_diff, true_diff))
    normalized_diff = tf.reduce_mean(normalized_diff)
    return normalized_diff
input_layer = Input(shape=(384,), name='input')
hl_1 = Dense(64, activation='elu', name='hl_1')(input_layer)
hl_2 = Dense(32, activation='elu', name='hl_2')(hl_1)
hl_3 = Dense(32, activation='elu', name='hl_3')(hl_2)
output_layer = Dense(384, activation=None, name='output')(hl_3)
out = tf.stack([input_layer, output_layer])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model = tf.keras.models.Model(input_layer, out)
model.compile(loss=tangle_loss3, optimizer=optimizer)
Now when I calculate the loss, it works:
X = np.random.rand(1, 384)
y = np.random.rand(1, 384)
np.mean(np.abs((model.predict(X)[1] - X)/(y - X)))
# returns some number
model.test_on_batch(X, y)
Note that I have to use model.predict(X)[1], because the model now has two outputs: the input layer's and the output layer's results. This is just one hacky solution, but it works.
The custom loss works well with a single non-nested custom_loss(y_true, y_pred). You can add a Keras Subtract layer to the output and then train against a shifted label, new_label = label - input, computed right before you feed the data into the training pipeline. Then you only need the plain custom_loss, as sketched below.
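A minimal sketch of that Subtract-layer idea, reusing the layer sizes from the question (untested; the exact wiring is my assumption, not the original author's code):

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Subtract

input_layer = Input(shape=(384,), name='input')
hl_1 = Dense(64, activation='elu', name='hl_1')(input_layer)
hl_2 = Dense(32, activation='elu', name='hl_2')(hl_1)
hl_3 = Dense(32, activation='elu', name='hl_3')(hl_2)
raw_output = Dense(384, activation=None, name='output')(hl_3)
# the model now predicts (output - input) directly, so the loss no longer needs the input tensor
diff_output = Subtract(name='diff')([raw_output, input_layer])

def custom_loss(y_true, y_pred):
    # y_true is expected to be (label - input), y_pred is (prediction - input)
    return tf.reduce_mean(tf.abs(tf.math.divide(y_pred, y_true)))

model = tf.keras.models.Model(input_layer, diff_output)
model.compile(loss=custom_loss, optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))

# train and evaluate against the shifted labels:
# model.fit(X, y - X, ...)
# model.test_on_batch(X, y - X)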
I am completely new to PyTorch. I would like to move my TF code to PyTorch, and I think I am missing something.
I have X as input and Y as output. X is a time series data, on which I would like to do 1D convolution. Y is just a plain number.
X has a shape of (1050589, 81, 21): 1050589 experiments, each with 81 timestamps and 21 data points per timestamp. This is the required format for TF, but as far as I was able to work out, in PyTorch the time dimension should be the last one.
My data is in a numpy array, so first I transposed it to fit PyTorch's layout and converted it into a list.
a = []
for n, i in enumerate(X):
    a.append([X[n].T, Y[n]])

train_data = DataLoader(a, batch_size=128)
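For what it's worth, a hedged alternative to the list-of-pairs approach is to build the loader from explicit tensors (TensorDataset is a standard PyTorch utility; the transpose axes and dtypes below are my assumptions about your data, not tested against it):

import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# (1050589, 81, 21) -> (N, channels=21, length=81), since nn.Conv1d expects channels before the time dimension
X_t = torch.from_numpy(np.transpose(X, (0, 2, 1)).copy()).float()
Y_t = torch.from_numpy(Y).double()  # the forward pass below returns .double(), so keep the targets double too

train_data = DataLoader(TensorDataset(X_t, Y_t), batch_size=128, shuffle=True)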
My model looks like this:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Conv1d(EMBED_SIZE, 32, 7, padding='same'),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(81*32, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, x):
        logits = self.linear_relu_stack(x)
        return logits.double()
The architecture is simple, as I want to keep it the same as I have in Tensorflow. One convolution with a kernel of 7 and 32 channels, followed by a dense layer and a single output layer.
Same network in Tensorflow:
def conv_1d_model():
    model = Sequential(name="model_conv1D")
    model.add(Conv1D(filters=32, kernel_size=7, activation='relu', input_shape=(81, 21), padding="same"))
    model.add(Flatten())
    model.add(Dense(32, activation='relu'))
    model.add(Dense(1))
    return model
Now when I try to optimize this network in PyTorch my losses are all over the place, not decreasing at all, while in TensorFlow it runs perfectly well.
I am sure I am missing something, can anyone point me in the right direction?
My optimization function in PyTorch:
model = NeuralNetwork()
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
def train_loop(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        # Compute prediction and loss
        pred = torch.squeeze(model(X))  # I was getting a warning about pred being a different shape than y, so I squeezed it
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 10 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
Optimization in TensorFlow:
model = conv_1d_model()
opt = Adam(learning_rate=learning_rate)
model.compile(loss='mse', optimizer=opt, metrics=['mae'])
model_history = model.fit(X, Y, validation_split=0.2, epochs=epochs, batch_size=batch_size, verbose=1)
I am trying to use a custom loss function in my Keras sequential model (TensorFlow 2.6.0). This custom loss will (ideally) calculate the data loss plus the residual of a physical equation (say, the diffusion equation, Navier-Stokes, etc.). This residual error is based on the derivative of the model output with respect to its inputs, and I want to compute it with GradientTape.
In this MWE, I removed the data loss term and other equation losses, and just used the derivative of the output wrt its input. The dataset can be found here.
from numpy import loadtxt
from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf #tf.__version__ = '2.6.0'
# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8] #X.shape = (768, 8)
y = dataset[:,8]
X = tf.convert_to_tensor(X, dtype=tf.float32)
y = tf.convert_to_tensor(y, dtype=tf.float32)
def customLoss(y_true, y_pred):
    x_tensor = tf.convert_to_tensor(model.input, dtype=tf.float32)
    # x_tensor = tf.cast(x_tensor, tf.float32)
    with tf.GradientTape() as t:
        t.watch(x_tensor)
        output = model(x_tensor)
    DyDX = t.gradient(output, x_tensor)
    dy_t = DyDX[:, 5:6]
    R_pred = dy_t
    # loss_data = tf.reduce_mean(tf.square(yTrue - yPred), axis=-1)
    loss_PDE = tf.reduce_mean(tf.square(R_pred))
    return loss_PDE
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss=customLoss, optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=15)
After execution, I get this ValueError:
ValueError: Passed in object of type <class 'keras.engine.keras_tensor.KerasTensor'>, not tf.Tensor
When I change loss=customLoss to loss='mse', the model starts training, but using that customLoss is the whole point. Any ideas?
The problem seems to come from model.input in the loss function. If I understand your code correctly, you can use this loss instead:
def custom_loss_pass(model, x_tensor):
    def custom_loss(y_true, y_pred):
        with tf.GradientTape() as t:
            t.watch(x_tensor)
            output = model(x_tensor)
        DyDX = t.gradient(output, x_tensor)
        dy_t = DyDX[:, 5:6]
        R_pred = dy_t
        # loss_data = tf.reduce_mean(tf.square(yTrue - yPred), axis=-1)
        loss_PDE = tf.reduce_mean(tf.square(R_pred))
        return loss_PDE
    return custom_loss
And then:
model.compile(loss=custom_loss_pass(model, X), optimizer='adam', metrics=['accuracy'])
I am not sure it does what you want but at least it works!
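If you also want the data term the question mentions (it is commented out in the snippet above), one untested sketch is to add it inside the same closure; the pde_weight knob is my own assumption for balancing the two terms:

def custom_loss_pass(model, x_tensor, pde_weight=1.0):
    def custom_loss(y_true, y_pred):
        with tf.GradientTape() as t:
            t.watch(x_tensor)
            output = model(x_tensor)
        DyDX = t.gradient(output, x_tensor)
        dy_t = DyDX[:, 5:6]
        # PDE residual term (derivative of the output w.r.t. the 6th input feature)
        loss_PDE = tf.reduce_mean(tf.square(dy_t))
        # data term from the original post, uncommented
        loss_data = tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)
        return loss_data + pde_weight * loss_PDE
    return custom_loss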
Say you have a basic model similar to this:
input_layer = layers.Input(shape=(50,20))
layer = layers.Dense(123, activation='relu')(input_layer)
layer = layers.LSTM(128, return_sequences = True)(layer)
outputs = layers.Dense(20, activation='softmax')(layer)
model = Model(input_layer,outputs)
How would you implement CTC loss? I tried something from the keras code tutorial on OCR like this:
class CTCLayer(layers.Layer):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.loss_fn = keras.backend.ctc_batch_cost

    def call(self, y_true, y_pred):
        # Compute the training-time loss value and add it
        # to the layer using `self.add_loss()`.
        batch_len = tf.cast(tf.shape(y_true)[0], dtype="int64")
        input_length = tf.cast(tf.shape(y_pred)[1], dtype="int64")
        label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")

        input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")

        loss = self.loss_fn(y_true, y_pred, input_length, label_length)
        self.add_loss(loss)

        # At test time, just return the computed predictions
        return y_pred
labels = layers.Input(shape=(None,), dtype="float32")
input_layer = layers.Input(shape=(50,20))
layer = layers.Dense(123, activation='relu')(input_layer)
layer = layers.LSTM(128, return_sequences = True)(layer)
outputs = layers.Dense(20, activation='softmax')(layer)
output = CTCLayer()(labels,outputs)
model = Model(input_layer,outputs)
However, when it came to the model.fit part it started to fall apart, because I don't know how to feed the model the extra "labels" input. I find the approach in the tutorial quite convoluted, so what would be a better and more efficient way to implement the CTC loss?
The only thing you are doing wrong is the model creation: model = Model(input_layer,outputs) should be model = Model([input_layer,labels],output). That said, you can also compile the model with tf.nn.ctc_loss as the loss if you don't want to have two inputs:
def my_loss_fn(y_true, y_pred):
    loss_value = tf.nn.ctc_loss(y_true, y_pred, y_true_length, y_pred_length,
                                logits_time_major=False)
    return tf.reduce_mean(loss_value, axis=-1)

model.compile(optimizer='adam', loss=my_loss_fn)
Something like this. Note that the code above is not tested and you still need to supply the y_true and y_pred lengths, but you can compute them the same way it is done in the CTC layer, as sketched below.
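A hedged sketch of what that could look like, deriving the lengths the same way the CTCLayer above does (untested; the blank_index choice mirrors keras.backend.ctc_batch_cost, which uses the last class as the blank):

def my_loss_fn(y_true, y_pred):
    batch_len = tf.shape(y_true)[0]
    # assume every logit time step is valid and every label slot is used
    logit_length = tf.shape(y_pred)[1] * tf.ones(shape=(batch_len,), dtype="int32")
    label_length = tf.shape(y_true)[1] * tf.ones(shape=(batch_len,), dtype="int32")
    # note: tf.nn.ctc_loss applies softmax internally, so it expects unnormalized logits;
    # you may want to drop the softmax activation on the output layer when using it
    loss_value = tf.nn.ctc_loss(labels=tf.cast(y_true, "int32"),
                                logits=y_pred,
                                label_length=label_length,
                                logit_length=logit_length,
                                logits_time_major=False,
                                blank_index=-1)
    return tf.reduce_mean(loss_value)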
I have implemented a CNN with two output layers for the GTSRB dataset problem. One output layer classifies images into their respective classes and the second layer predicts bounding-box coordinates. In the dataset, the upper-left and lower-right coordinates are provided for the training images, and we have to predict the same for the test images. How do I define the loss (MSE or something else) and the performance metric (R-squared or something else) for the regression layer, given that it outputs 4 values (x and y coordinates for the upper-left and lower-right points)? Below is the code of the model.
def get_model():
    # Input layer
    input_layer = Input(shape=(IMG_HEIGHT, IMG_WIDTH, N_CHANNELS,), name="input_layer", dtype='float32')

    # Convolution, maxpool and dropout layers
    conv_1 = Conv2D(filters=8, kernel_size=(3, 3), activation=relu,
                    kernel_initializer=he_normal(seed=54), bias_initializer=zeros(),
                    name="first_convolutional_layer")(input_layer)
    maxpool_1 = MaxPool2D(pool_size=(2, 2), name="first_maxpool_layer")(conv_1)

    # Fully connected layers
    flat = Flatten(name="flatten_layer")(maxpool_1)
    d1 = Dense(units=64, activation=relu, kernel_initializer=he_normal(seed=45),
               bias_initializer=zeros(), name="first_dense_layer", kernel_regularizer=l2(0.001))(flat)
    d2 = Dense(units=32, activation=relu, kernel_initializer=he_normal(seed=47),
               bias_initializer=zeros(), name="second_dense_layer", kernel_regularizer=l2(0.001))(d1)
    classification = Dense(units=43, activation=None, name="classification")(d2)
    regression = Dense(units=4, activation='linear', name="regression")(d2)

    # Model
    model = Model(inputs=input_layer, outputs=[classification, regression])
    model.summary()
    return model
For classification output, you need to use softmax.
classification = Dense(units = 43, activation='softmax', name="classification")(d2)
You should use categorical_crossentropy loss for the classification output.
For regression, you can use mse loss.
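A hedged example of how the compile call could look for this two-output model, keyed by the output layer names above (the loss weights and metric choices are assumptions to tune, not part of the original code; R-squared would need a custom metric in most Keras versions, so 'mae' is used as a simple stand-in):

model = get_model()
model.compile(
    optimizer='adam',
    loss={'classification': 'categorical_crossentropy',  # assumes one-hot labels and the softmax head suggested above
          'regression': 'mse'},
    loss_weights={'classification': 1.0, 'regression': 1.0},
    metrics={'classification': 'accuracy', 'regression': 'mae'})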
Let's suppose I have specified MobileNet from Keras models this way:
base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(12, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
model.compile(loss='categorical_crossentropy', optimizer = Adam(),
metrics=['accuracy'])
But I would like to add a custom layer to preprocess the input image, this way:
def myFunc(x):
    return K.reshape(x/255, (-1, 224, 224, 3))
new_model = Sequential()
new_model.add(Lambda(myFunc,input_shape =( 224, 224, 3), output_shape=(224, 224, 3)))
new_model.add(model)
new_model.compile(loss='categorical_crossentropy', optimizer = Adam(),
metrics=['accuracy'])
new_model.summary()
It works pretty well, but now I need its input shape to be (224, 224, 3) instead of (None, 224, 224, 3). How can I do that?
In order to expand the dimension of your tensor, you can use
import tensorflow.keras.backend as K
# adds a new dimension to a tensor
K.expand_dims(tensor, 0)
However, I do not see why you would need it, just as @meonwongac mentioned.
If you still want to use a Lambda layer instead of resizing / applying other operations on images with skimage/OpenCV/ other library, one way of using the Lambda layer is the following:
import tensorflow as tf

input_ = Input(shape=(None, None, 3))
next_layer = Lambda(lambda image: tf.image.resize_images(image, (128, 128)))(input_)
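And to chain that into the earlier classifier, a rough sketch under the assumption that `model` is the MobileNetV2 classifier defined further up (tf.image.resize is the TF 2.x name for resize_images; the 224x224 target matches that model's expected input):

input_ = Input(shape=(None, None, 3))
# resize to the classifier's expected spatial size, then rescale pixel values to [0, 1]
resized = Lambda(lambda image: tf.image.resize(image, (224, 224)))(input_)
rescaled = Lambda(lambda image: image / 255.0)(resized)
outputs = model(rescaled)
full_model = Model(inputs=input_, outputs=outputs)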