For a regression task, I'd like to customize the loss function so that the network also outputs a certainty measure.
The initial, plain network would be:
model = Sequential()
model.add(Dense(15, input_dim=7, kernel_initializer='normal'))
model.add(Dense(1, kernel_initializer='normal'))
model.compile(loss='mean_squared_error', optimizer='adam')
I'd like to add a certainty indicator sigma to the loss function, so that, depending on how accurate the predictions are, different values of sigma minimize the loss:
loss = (y_pred-y_true)^2/(2*sigma^2) + log(sigma)
The final outputs of the NN would then be y_pred and sigma.
I'm a bit lost in the implementation (I'm new to Keras):
Where would we initialize and store sigma so that it is updated across recurring, similar data points during training?
How do we connect the variable sigma in the loss function to the second output of the network?
My current base structure, which is obviously missing these pieces:
def custom_loss(y_true, y_pred, sigma):
    loss = pow((y_pred - y_true), 2)/(2 * pow(sigma, 2))+math.log(sigma)
    return loss, sigma
model = Sequential()
model.add(Dense(15, input_dim=7, kernel_initializer='normal'))
model.add(Dense(2, kernel_initializer='normal'))
model.compile(loss=custom_loss, optimizer='adam')
Any tips or guidance would be highly appreciated. Thanks!
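The key is to extend y_pred from a scalar to a two-element vector per sample, so that the first column holds the prediction and the second holds sigma. Note that y_pred has shape (batch_size, 2), so the columns have to be selected with y_pred[:, 0:1] and y_pred[:, 1:2]; y_pred[0] would pick the first sample of the batch instead.
import tensorflow as tf

def custom_loss(y_true, y_pred):
    mu = y_pred[:, 0:1]     # predicted value, shape (batch, 1)
    sigma = y_pred[:, 1:2]  # predicted sigma, shape (batch, 1)
    # Squared error scaled by sigma, plus log(sigma) to penalize large sigma
    return tf.square(mu - y_true) / (2 * tf.square(sigma)) + tf.math.log(sigma)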
model = Sequential()
model.add(Dense(15, input_dim=7, kernel_initializer='normal'))
model.add(Dense(2, kernel_initializer='normal'))
model.compile(loss=custom_loss, optimizer='adam')
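The model then returns sigma alongside the prediction. Because the last layer is linear, nothing keeps sigma positive, and log(sigma) is undefined for non-positive values, so a positive activation on that output (softplus, for example) is a common safeguard.
Y = model.predict(X)  # each row of Y is [prediction, sigma]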
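To see why this loss lets the network express its own uncertainty, the per-sample term can be checked by hand. The following is a minimal, framework-free sketch (the helper name gaussian_nll and the numbers are illustrative assumptions, not part of the original answer): for a fixed error, the loss is smallest when sigma equals the absolute error, so large errors push sigma up and small errors pull it down.

```python
import math

def gaussian_nll(y_true, y_pred, sigma):
    """Per-sample loss from the question: (y_pred - y_true)^2 / (2 sigma^2) + log(sigma)."""
    return (y_pred - y_true) ** 2 / (2.0 * sigma ** 2) + math.log(sigma)

# Hypothetical sample: the prediction is off by 2.0
y_true, y_pred = 3.0, 5.0

# Scan a grid of candidate sigmas and keep the one with the lowest loss
candidates = [0.5 + 0.01 * i for i in range(400)]
best_sigma = min(candidates, key=lambda s: gaussian_nll(y_true, y_pred, s))

print(round(best_sigma, 2))
```

Setting the derivative with respect to sigma to zero gives sigma^2 = (y_pred - y_true)^2, which is what the grid search recovers up to the grid resolution.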
Related
I would like to use the following loss function:
one_weight = (1-num_of_ones)/(num_of_ones + num_of_zeros)
zero_weight = (1-num_of_zeros)/(num_of_ones + num_of_zeros)
def weighted_binary_crossentropy(zero_weight, one_weight):
    def weighted_binary_crossentropy(y_true, y_pred):
        b_ce = K.binary_crossentropy(y_true, y_pred)
        # weighted calc
        weight_vector = y_true * one_weight + (1 - y_true) * zero_weight
        weighted_b_ce = weight_vector * b_ce
        return K.mean(weighted_b_ce)
    return weighted_binary_crossentropy
I'm trying to use this loss function in my model which is:
model = Sequential()
model.add(BatchNormalization())
model.add(Conv2D(16, kernel_size=(32,1),strides=(1,1), activation='relu', input_shape=(78,64,1)))
model.add(Conv2D(16, kernel_size=(1,10),strides=(1,10), activation='relu'))
model.add(BatchNormalization())
model.add(ReLU(max_value=None))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(2, activation='sigmoid'))
model.compile(optimizer=opt, loss = weighted_binary_crossentropy , metrics = ['acc'] )
history = model.fit(Train_Data, Train_labels, batch_size =20, epochs = 450, shuffle = True , validation_data = (Val_Data, Val_labels))
My question is this: the loss function requires y_pred (the labels the model predicts) as an input. y_pred is only available once the model has been trained with my loss function, but the loss function needs y_pred while the model is being trained.
In other words, when I train the model with this loss function I get an error, because there is no y_pred to pass to it.
How can I use this loss function for training when I don't have y_pred before training starts? Note that I do have the other parameters the loss function requires.
Pass your own parameters to weighted_binary_crossentropy. This call returns the inner wrapped function (also named weighted_binary_crossentropy), which accepts y_true and y_pred. Keras supplies both arguments during training, so you don't need to provide them yourself.
model.compile(optimizer=opt,
              loss=weighted_binary_crossentropy(zero_weight, one_weight),
              metrics=['acc'])
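The closure pattern in this answer can be illustrated without Keras. The sketch below is a plain-Python stand-in under stated assumptions: make_weighted_bce, loss_fn and the weight values are hypothetical names and numbers, and the lists stand in for tensors. The outer call fixes the weights once, and the returned function has exactly the (y_true, y_pred) signature a training loop would call.

```python
import math

def make_weighted_bce(zero_weight, one_weight):
    """Return a loss(y_true, y_pred) with the two class weights baked in."""
    def loss(y_true, y_pred):
        total = 0.0
        for t, p in zip(y_true, y_pred):
            # Plain binary cross-entropy for one label/probability pair
            bce = -(t * math.log(p) + (1 - t) * math.log(1 - p))
            # Same weighting rule as weight_vector in the question
            weight = t * one_weight + (1 - t) * zero_weight
            total += weight * bce
        return total / len(y_true)
    return loss

# Hypothetical weights; a framework would later call loss_fn(y_true, y_pred) itself
loss_fn = make_weighted_bce(zero_weight=0.25, one_weight=0.75)
print(loss_fn([1, 0], [0.9, 0.2]))
```

The caller never passes y_pred when building the loss; it only appears when the returned function is invoked, which is the step Keras performs for each batch.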
I am trying to use a custom loss function for my model. I scale the y values beforehand and invert the scaling inside the loss function (using the answer from "scaling back data in customized keras training loss function"). After a random number of epochs the loss becomes NaN, and mean_absolute_error, val_mean_absolute_error and val_loss are all NaN as well. Here's my model and custom loss function:
model = Sequential()
model.add(LSTM(units=512, activation="tanh", return_sequences=True, input_shape=(X_train.shape[1],X_train.shape[2])))
model.add(Dropout(0.2))
model.add(LSTM(units=256, activation="tanh", return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=128, activation="tanh", return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=64, activation="tanh"))
model.add(Dropout(0.2))
model.add(Dense(units = 2))
model.compile(optimizer = "Adam", loss = my_loss_function , metrics=['mean_absolute_error'])
model.summary()
I have 2 outputs as you can see.
def my_loss_function(y_actual, y_predicted):
    y_actual = (y_actual - K.constant(y_scaler.min_)) / K.constant(y_scaler.scale_)
    y_predicted = (y_predicted - K.constant(y_scaler.min_)) / K.constant(y_scaler.scale_)
    a_loss = abs(y_actual[0]-y_predicted[0])*128000
    b_loss = abs(y_actual[1]-y_predicted[1])*27000
    loss= tf.math.sqrt(tf.square(a_loss) + tf.square(b_loss))
    return loss
y_scaler is used earlier:
y_scaler = MinMaxScaler(feature_range = (0, 1))
y_scaler.fit(y_data)
y_data=y_scaler.transform(y_data)
y_testdata=y_scaler.transform(y_testdata)
Can anyone help?
When I use MSE, MAE etc. it works fine
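One possible source of the NaN values (an assumption; the post does not confirm the cause) is the square root at the end of the loss. The derivative of sqrt(s) is 0.5 / sqrt(s), which is infinite at s = 0, and reverse-mode differentiation then multiplies that infinity by a zero inner derivative. The sketch below reproduces that arithmetic in plain Python; sqrt_chain_grad and eps are hypothetical names, and adding a small epsilon inside the root is a common workaround rather than a confirmed fix for this model.

```python
import math

def sqrt_chain_grad(a, b, eps=0.0):
    """Gradient of sqrt(a**2 + b**2 + eps) w.r.t. a, chained step by step
    the way reverse-mode autodiff would multiply the local derivatives."""
    s = a * a + b * b + eps
    root = math.sqrt(s)
    outer = 0.5 / root if root > 0.0 else math.inf  # d sqrt(s) / d s
    inner = 2.0 * a                                 # d s / d a
    return outer * inner

print(sqrt_chain_grad(0.0, 0.0))             # inf * 0.0 is not a number
print(sqrt_chain_grad(0.0, 0.0, eps=1e-12))  # finite once eps keeps the root away from zero
```

Separately, note that y_actual[0] and y_actual[1] in the loss above index the first two samples of the batch, not the two output columns; selecting columns would need y_actual[:, 0] and y_actual[:, 1].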
I tried to design an LSTM network using Keras, but the accuracy is 0.00 while the loss value is 0.05. The code I wrote is below.
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation = tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation = tf.nn.relu))
model.add(tf.keras.layers.Dense(1, activation = tf.nn.relu))
def percentage_difference(y_true, y_pred):
    return K.mean(abs(y_pred/y_true - 1) * 100)
model.compile(optimizer='sgd',
              loss='mse',
              metrics = ['accuracy', percentage_difference])
model.fit(x_train, y_train.values, epochs = 10)
My training and test data sets were imported using pandas. There are 5 features and 1 target. Any help is appreciated.
From what I can see, you are applying a neural network to a regression problem.
Regression is the task of predicting continuous values by learning from various independent features.
Regression problems therefore don't use metrics like accuracy, which belongs to the classification branch of supervised learning.
The closest equivalent of accuracy for regression is the coefficient of determination, or R^2 score.
from keras import backend as K

def coeff_determination(y_true, y_pred):
    SS_res = K.sum(K.square( y_true-y_pred ))
    SS_tot = K.sum(K.square( y_true - K.mean(y_true) ) )
    return ( 1 - SS_res/(SS_tot + K.epsilon()) )

model.compile(optimizer='sgd',
              loss='mse',
              metrics = [coeff_determination])
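To get a feel for the values this metric produces, here is the same formula in plain Python on made-up numbers (r2_score, eps and the data are illustrative assumptions; eps plays the role of K.epsilon()). A perfect fit scores 1, always predicting the mean scores about 0, and anything worse than the mean goes negative.

```python
def r2_score(y_true, y_pred, eps=1e-7):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / (ss_tot + eps)

y_true = [1.0, 2.0, 3.0, 4.0]
print(r2_score(y_true, [1.0, 2.0, 3.0, 4.0]))  # perfect predictions
print(r2_score(y_true, [2.5, 2.5, 2.5, 2.5]))  # always predicting the mean
print(r2_score(y_true, [4.0, 3.0, 2.0, 1.0]))  # worse than predicting the mean
```

When used as a Keras metric the value is computed per batch and then averaged, so it can differ somewhat from R^2 computed once over the whole data set.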
I am training a convolutional network with continuous output in the last layer. The last layer has 4 nodes. I am using the Mean Squared Error as a loss function. As a check I also used the Mean Squared Error from TensorFlow as a metric. This gave the same results only for the first batch of the first epoch. My question, therefore, is why the two differ. I used convolutional layers with max pooling, and at the end I flattened the output and used dropout.
Moreover, I was also wondering how the Mean Squared Error is computed for 4 nodes. Is it just the sum of the Mean Squared Error of each node? When I calculate the Mean Squared Error per node, there is no clear connection.
This is the metric.
def loss(y_true, y_pred):
    loss = tf.metrics.mean_squared_error(y_true, y_pred)[1]
    K.get_session().run(tf.local_variables_initializer())
    return loss
And here I compile the model
model.compile(loss='mse', optimizer= adam, metrics=[loss, node])
This is how I calculated the Mean squared Error for one node:
def node(y_true, y_pred):
    loss = tf.metrics.mean_squared_error(y_true[:,0], y_pred[:,0])[1]
    K.get_session().run(tf.local_variables_initializer())
    return node
And this is a simplified form of the model:
width = height = 128
model = Sequential()
model.add(Convolution2D(filters=64, kernel_size=(5, 5), activation='relu', padding='same',
                        input_shape=(width, height, 1)))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Flatten())
model.add(Dense(units=256, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(units=4, activation='linear'))
adam = Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0)
model.compile(loss='mse', optimizer= adam, metrics=[loss,node])
You are returning the function itself.
Look at your code:
def node(y_true, y_pred):
    loss = tf.metrics.mean_squared_error(y_true[:,0], y_pred[:,0])[1]
    K.get_session().run(tf.local_variables_initializer())
    return node # This is a function name. This should be "return loss"
Try correcting this first.
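On the second part of the question: Keras's built-in 'mse' averages the squared error over the last axis (here, the 4 output nodes) and then over the samples, so the overall value is the mean of the per-node values, not their sum. The plain-Python sketch below checks this on made-up numbers; the function names and data are hypothetical.

```python
def mse_per_node(y_true, y_pred, node):
    """MSE of a single output node, averaged over samples."""
    return sum((t[node] - p[node]) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mse_overall(y_true, y_pred):
    """Mean over the outputs of each sample, then mean over samples."""
    per_sample = [
        sum((a - b) ** 2 for a, b in zip(t, p)) / len(t)
        for t, p in zip(y_true, y_pred)
    ]
    return sum(per_sample) / len(per_sample)

# Two hypothetical samples with 4 outputs each
y_true = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
y_pred = [[1.0, 2.0, 3.0, 6.0], [1.0, 0.0, 0.0, 0.0]]

per_node = [mse_per_node(y_true, y_pred, n) for n in range(4)]
print(per_node)
print(sum(per_node) / 4, mse_overall(y_true, y_pred))
```

The two printed averages agree because every sample has the same number of outputs, so the order of averaging does not matter.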
I am using mean square error to compute the loss function of a multi output regressor. I used a recurrent neural network model with the one to many architecture. My output vector is of size 6 (1*6) and the values are monotonic (non decreasing).
example:
y_i = [1,3,6,13,30,57,201]
I would like to force the model to learn this dependency, and hence add a constraint to the cost function. I am getting an error of 300 on the validation set. I believe that after editing the mean squared error loss function I will be able to get better performance.
I am using keras for the implementation. Here is the core model.
batchSize = 256
epochs = 20
samplesData = trainX
samplesLabels = trainY
print("Compiling neural network model...")
Model = Sequential()
Model.add(LSTM(input_shape = (98,),input_dim=98, output_dim=128, return_sequences=True))
Model.add(Dropout(0.2))
#Model.add(LSTM(128, return_sequences=True))
#Model.add(Dropout(0.2))
Model.add(TimeDistributedDense(7))
#rmsprop = rmsprop(lr=0.0, decay=0.0)
Model.compile(loss='mean_squared_error', optimizer='rmsprop')
Model.summary()
print("Training model...")
# learning schedule callback
#lrate = LearningRateScheduler(step_decay)
#callbacks_list = [lrate]
history = Model.fit(samplesData, samplesLabels, batch_size=batchSize, nb_epoch= epochs, verbose=1,
                    validation_split=0.2, show_accuracy=True)
print("model training has been completed.")
Any other tips concerning learning rate, decay, etc.. are appreciated.
Keep the mean squared error as just a metric, and use Smooth L1 loss instead. Here's my implementation.
# Define Smooth L1 loss
def l1_smooth_loss(y_true, y_pred):
    abs_loss = tf.abs(y_true - y_pred)
    sq_loss = 0.5 * (y_true - y_pred)**2
    # Quadratic for errors below 1.0, linear above
    l1_loss = tf.where(tf.less(abs_loss, 1.0), sq_loss, abs_loss - 0.5)
    return tf.reduce_sum(l1_loss, -1)
# Then pass the function itself (not its name as a string) when compiling
Model.compile(loss=l1_smooth_loss, optimizer='rmsprop', metrics=['mean_squared_error'])
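Swapping the loss does not by itself encode the non-decreasing constraint the question asks about. One common approach (an assumption here, not something the answer proposes) is to add a penalty that is zero for non-decreasing outputs and grows with every decrease between consecutive elements. The plain-Python sketch below shows the idea; monotonic_penalty, penalized_squared_error and the weight lam are hypothetical names, and a Keras version would need the same expression written with tensor operations.

```python
def monotonic_penalty(y_pred):
    """Sum of squared decreases between consecutive outputs; zero if non-decreasing."""
    return sum(max(0.0, a - b) ** 2 for a, b in zip(y_pred, y_pred[1:]))

def penalized_squared_error(y_true, y_pred, lam=1.0):
    """Mean squared error plus lam times the monotonicity penalty."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return mse + lam * monotonic_penalty(y_pred)

print(monotonic_penalty([1, 3, 6, 13, 30, 57, 201]))  # the example vector from the question
print(monotonic_penalty([1, 3, 2]))                   # one decrease of size 1
print(penalized_squared_error([1, 2, 3], [1, 3, 2], lam=0.5))
```

The weight lam trades off fit against the constraint and would have to be tuned on the validation set.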