Keras F1 score metrics for training the model

Keras F1 score metrics for training the model - python

I am new to keras and I want to train the model with F1-score as my metrics.
I came across two things, one is that I can add callbacks and other is using the in built metrics function
Here, it says that the metrics function will not be used for training the model. So, does that mean I can anything in metrics argument while compiling the model?
Specfically,
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
In the above case even though accuracy is passed as metrics, it will not be used for training the model.
Second thing is to use callbacks as defined here,
import numpy as np
from keras.callbacks import Callback
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score
class Metrics(Callback):
def on_train_begin(self, logs={}):
self.val_f1s = []
self.val_recalls = []
self.val_precisions = []
def on_epoch_end(self, epoch, logs={}):
val_predict = (np.asarray(self.model.predict(self.model.validation_data[0]))).round()
val_targ = self.model.validation_data[1]
_val_f1 = f1_score(val_targ, val_predict)
_val_recall = recall_score(val_targ, val_predict)
_val_precision = precision_score(val_targ, val_predict)
self.val_f1s.append(_val_f1)
self.val_recalls.append(_val_recall)
self.val_precisions.append(_val_precision)
print “ — val_f1: %f — val_precision: %f — val_recall %f” %(_val_f1, _val_precision, _val_recall)
return
metrics = Metrics()
Then fit the model,
model.fit(training_data, training_target,
validation_data=(validation_data, validation_target),
nb_epoch=10,
batch_size=64,
callbacks=[metrics])
I am not sure if this will train the model on f1 score.

You can't train a neural network with f1-scores. For back propagating the error during training you need some sort of function which tells you, how far away your prediction is from the expected value. Such a function is as example the MSE loss.
F1 score on the other hand is just the harmonic mean between precision and recall from your samples. It does not tell you, in which direction you have to update the weights in order to get a better model. It also does not tell you, how far away you prediction is from the expected value.
What you could do is to print the F1 score after every epoch. An example on how to do this can be found in this blogpost

Related

How does keras.evaluate() calculate the loss?

I am building a MLP using TensorFlow 2.0. I am plotting the learning curve and also using keras.evaluate on both training and test data to see how well it performed. The code I'm using:
history = model.fit(X_train, y_train, batch_size=32,
epochs=200, validation_split=0.2, verbose=0)
# evaluate the model
eval_result_tr = model.evaluate(X_train, y_train)
eval_result_te = model.evaluate(X_test, y_test)
print("[training loss, training accuracy]:", eval_result_tr)
print("[test loss, test accuracy]:", eval_result_te)
#[training loss, training accuracy]: [0.5734676122665405, 0.9770742654800415]
#[test loss, test accuracy]: [0.7273344397544861, 0.9563318490982056]
#plot the learning rate curve
import matplotlib.pyplot as plt
plt.plot(history.history["loss"], label='eğitim')
plt.plot(history.history['val_loss'], label='doğrulama')
plt.xlabel("Öğrenme ivmesi")
plt.ylabel("Hata payı")
plt.title("Temel modelin öğrenme eğrisi")
plt.legend()
The output is:
My question is: How keras.evaluate() calculates the training loss to be 0.5734676122665405? I take the average of history.history["loss"] bu it returns different (0.7975356701016426) value.
Or, am I mistaken to begin with by trying to evaluate the model performance on training data by eval_result_tr = model.evaluate(X_train, y_train)?

For community benefit, adding #Dr. Snoopy's answer here in the answer section.
This has been asked many times before, the loss you see is with
changing weights during training, evaluating outside of training will
use fixed weights, so you will always see a different loss value.

RandomizedSearchCV with scoring accuracy leads to error:Scoring failed.The score on this train-test partition for these parameters will be set to nan

I want to find a good neural network instance by using RandomizedSearchCV with regard to accuracy, because the task is to solve a binary classification problem. Unfortunately I get the error message
Scoring failed. The score on this train-test partition for these parameters will be set to nan.
This is my implementation:
# Define neural network instance
def build_model(n_hidden_layers=2, n_neurons=77, dropout_rate=0.5 ,optimizer='adam', input_shape=77, activation_hidden="relu", activation_output="sigmoid",loss='binary_crossentropy',metrics=['binary_accuracy'],hidden_weight_initializer="he_normal",output_weight_initializer="glorot_normal",l1=0,l2=0,use_batch_norm=0):
model = keras.models.Sequential()
model.add(keras.layers.InputLayer(input_shape=input_shape))
for layer in range(n_hidden_layers):
model.add(keras.layers.Dense(n_neurons, activation=activation_hidden, kernel_initializer=hidden_weight_initializer, kernel_regularizer=tf.keras.regularizers.l1_l2(l1,l2)))
model.add(keras.layers.Dropout(dropout_rate))
if use_batch_norm == 1:
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(1,activation=activation_output, kernel_initializer=output_weight_initializer))
model.compile(loss=loss, optimizer=optimizer, metrics=metrics)
return model
# Dreate wrapper class for RandomizedSearchCV
keras_reg = keras.wrappers.scikit_learn.KerasRegressor(build_model)
# Define hyperparameter spaces for trained neural network instances
param_distribs = {
"n_hidden_layers": [1,2, 3,4,5],
"n_neurons": [x for x in range(10,100)],
"dropout_rate": [0, 0.1, 0.2, 0.3, 0.4, 0.5],
"use_batch_norm": [0,1],
# "optimizer": ['adam',],
"activation_hidden": ['relu','elu','selu'],# 'relu','','elu','selu',,'LeakyRelU(alpha=0.2)','PReLU(alpha_initializer=Constant(value=0.25))'
# "activation_output": ['relu','sigmoid'],
# "loss": ['binary_crossentropy']
# "l1":
# "l2":
}
from sklearn.metrics import make_scorer, precision_score, accuracy_score
precision = make_scorer(precision_score, pos_label="donated")
accuracy = make_scorer(accuracy_score, pos_label="donated")
# Use RandomizedSearchCV to find model instance with best performance on training data
rnd_search_cv = RandomizedSearchCV(keras_reg, param_distribs, n_iter=2, cv=2,scoring="accuracy")#, scoring=accuracy,random_state=1)#iter=10,cv=3
rnd_search_cv.fit(X_train, y_train, epochs=10,#100
validation_data=(X_test, y_test),
callbacks=[keras.callbacks.EarlyStopping(patience=5)],
batch_size=256)

I was trying to do the same thing and ran into the same problem. I found out that the problem was the scorer object supplied in RandomizedSearchCV(). The loss function should be specified within your build_model() function when compiling the model, and it must not be supplied by RandomizedSearchCV().
This should be easy if you want to use a simple loss function or metric like 'accuray' since it is already provided by keras. If you want a more specific loss function, you probably need to build it by yourself and make sure that it fulffils the requirements as described in the tensorflow documentation https://www.tensorflow.org/api_docs/python/tf/keras/Model.

Overfitting and data leakage in tensorflow/keras neural network

Good morning, I'm new in machine learning and neural networks. I am trying to build a fully connected neural network to solve a regression problem. The dataset is composed by 18 features and 1 label, and all of these are physical quantities.
You can find the code below. I upload the figure of the loss function evolution along the epochs (you can find it below). I am not sure if there is overfitting. Someone can explain me why there is or not overfitting?
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
import keras
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping
from keras import optimizers
from sklearn.metrics import r2_score
from keras import regularizers
from keras import backend
from tensorflow.keras import regularizers
from keras.regularizers import l2
# =============================================================================
# Scelgo il test size
# =============================================================================
test_size = 0.2
dataset = pd.read_csv('DataSet.csv', decimal=',', delimiter = ";")
label = dataset.iloc[:,-1]
features = dataset.drop(columns = ['Label'])
y_max_pre_normalize = max(label)
y_min_pre_normalize = min(label)
def denormalize(y):
final_value = y*(y_max_pre_normalize-y_min_pre_normalize)+y_min_pre_normalize
return final_value
# =============================================================================
# Split
# =============================================================================
X_train1, X_test1, y_train1, y_test1 = train_test_split(features, label, test_size = test_size, shuffle = True)
y_test2 = y_test1.to_frame()
y_train2 = y_train1.to_frame()
# =============================================================================
# Normalizzo
# =============================================================================
scaler1 = preprocessing.MinMaxScaler()
scaler2 = preprocessing.MinMaxScaler()
X_train = scaler1.fit_transform(X_train1)
X_test = scaler2.fit_transform(X_test1)
scaler3 = preprocessing.MinMaxScaler()
scaler4 = preprocessing.MinMaxScaler()
y_train = scaler3.fit_transform(y_train2)
y_test = scaler4.fit_transform(y_test2)
# =============================================================================
# Creo la rete
# =============================================================================
optimizer = tf.keras.optimizers.Adam(lr=0.001)
model = Sequential()
model.add(Dense(60, input_shape = (X_train.shape[1],), activation = 'relu',kernel_initializer='glorot_uniform'))
model.add(Dropout(0.2))
model.add(Dense(60, activation = 'relu',kernel_initializer='glorot_uniform'))
model.add(Dropout(0.2))
model.add(Dense(60, activation = 'relu',kernel_initializer='glorot_uniform'))
model.add(Dense(1,activation = 'linear',kernel_initializer='glorot_uniform'))
model.compile(loss = 'mse', optimizer = optimizer, metrics = ['mse'])
history = model.fit(X_train, y_train, epochs = 100,
validation_split = 0.1, shuffle=True, batch_size=250
)
history_dict = history.history
loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']
y_train_pred = model.predict(X_train)
y_test_pred = model.predict(X_test)
y_train_pred = denormalize(y_train_pred)
y_test_pred = denormalize(y_test_pred)
plt.figure()
plt.plot((y_test1),(y_test_pred),'.', color='darkviolet', alpha=1, marker='o', markersize = 2, markeredgecolor = 'black', markeredgewidth = 0.1)
plt.plot((np.array((-0.1,7))),(np.array((-0.1,7))),'-', color='magenta')
plt.xlabel('True')
plt.ylabel('Predicted')
plt.title('Test')
plt.figure()
plt.plot((y_train1),(y_train_pred),'.', color='darkviolet', alpha=1, marker='o', markersize = 2, markeredgecolor = 'black', markeredgewidth = 0.1)
plt.plot((np.array((-0.1,7))),(np.array((-0.1,7))),'-', color='magenta')
plt.xlabel('True')
plt.ylabel('Predicted')
plt.title('Train')
plt.figure()
plt.plot(loss_values,'b',label = 'training loss')
plt.plot(val_loss_values,'r',label = 'val training loss')
plt.xlabel('Epochs')
plt.ylabel('Loss Function')
plt.legend()
print("\n\nThe R2 score on the test set is:\t{:0.3f}".format(r2_score(y_test_pred, y_test1)))
print("The R2 score on the train set is:\t{:0.3f}".format(r2_score(y_train_pred, y_train1)))
from sklearn import metrics
# Measure MSE error.
score = metrics.mean_squared_error(y_test_pred,y_test1)
print("\n\nFinal score test (MSE): %0.4f" %(score))
score1 = metrics.mean_squared_error(y_train_pred,y_train1)
print("Final score train (MSE): %0.4f" %(score1))
score2 = np.sqrt(metrics.mean_squared_error(y_test_pred,y_test1))
print(f"Final score test (RMSE): %0.4f" %(score2))
score3 = np.sqrt(metrics.mean_squared_error(y_train_pred,y_train1))
print(f"Final score train (RMSE): %0.4f" %(score3))
EDIT:
I tried alse to do feature importances and to raise n_epochs, these are the results:
Feature Importance:
No Feature Importace:

Looks like you don't have overfitting! Your training and validation curves are descending together and converging. The clearest sign you could get of overfitting would be a deviation between these two curves, something like this:
Since your two curves are descending and are not diverging, it indicates your NN training is healthy.
HOWEVER! Your validation curve is suspiciously below the training curve. This hints a possible data leakage (train and test data have been mixed somehow). More info on a nice an short blog post. In general, you should split the data before any other preprocessing (normalizing, augmentation, shuffling, etc...).
Other causes for this could be some type of regularization (dropout, BN, etc..) that is active while computing the training accuracy and it's deactivated when computing the Validation/Test accuracy.

Overfitting is, when the model does not generalize to other data than the training data. When this happen you will have a very (!) low training loss but a high validation loss. You can think of it this way: if you have N points you can fit a N-1 polynomial such that you have a zero training loss (your model hits all your training points perfectly). But, if you apply that model to some other data, it will most likely produce a very high error (see the image below). Here the red line is our model and the green is the true data (+ noice), and you can see in the last picture we get zero training error. In the first, our model is too simple (high train/high validation error), the second is good (low train/low valuidation error) the third and last is too complex i.e overfitting (very low train/high validation error).
Neural network can work in the same way, so by looking at your training vs validation error, you can conclude if it overfits or not

No, this is not overfitting as your validation loss isn´t increasing.
Nevertheless, if I were you I would be a little bit skeptical. Try to train your model for even more epochs and watch out for the validation loss.
What you definitely should do, is to observe the following:
- are there duplicates or near-duplicates in the data (creates information leakage from train to test validation split)
- are there features that have a causal connection to the target variable
Edit:
Usually, you have some random component in a real-world dataset, so that rules that are observed in train data aren´t 100% true for validation data.
Your plot shows that the validation loss is even more decreasing as train loss decreases. Usually, you get to some point in training, where the rules you observe in train data are too specific to describe the whole data. That´s when overfitting begins. Hence, it is weird, that your validation loss doesn´t increase again.
Please check whether your validation loss approaches zero when you´re training for more epochs. If it´s the case I would check your database very carefully.
Let´s assume, that there is a kind of information leakage from the train set to the validation set (through duplicate records for example). Your model would change the weights to describe very specific rules. When applying your model to new data it would fail miserably since the observed connections are not really general.
Another common data problem is, that features may have an inversed causality.
The thing that validation loss is generally lower than train error is probably depending on dropout and regularization, since it´s applied while training but not for predicting/testing.
I put some emphasis on this because a tiny bug or an error in the data can "fuck up" your whole model.

How to apply some features into a deep learning model?

i am trying to build an MLP model that takes a dataset consists of 9 columns
this is a sample (patient number, time in mill/sec., normalization of X Y and Z, kurtosis, skewness, pitch, roll and yaw, label) respectively.
1,15,-0.248010047716,0.00378335508419,-0.0152548459993,-86.3738760481,0.872322164158,-3.51314800063,0
1,31,-0.248010047716,0.00378335508419,-0.0152548459993,-86.3738760481,0.872322164158,-3.51314800063,0
1,46,-0.267422664673,0.0051143782875,-0.0191247001961,-85.7662354031,1.0928406847,-4.08015176908,0
1,62,-0.267422664673,0.0051143782875,-0.0191247001961,-85.7662354031,1.0928406847,-4.08015176908,0
and this is my code, there is no error in my code but the results with and without features are the same .. so i am asking if i used the right way to fed those features into the model.
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
import pandas as pd
import itertools
import math
np.random.seed(7)
train = np.loadtxt("featwithsignalsTRAIN.txt", delimiter=",")
test = np.loadtxt("featwithsignalsTEST.txt", delimiter=",")
x_train = train[:,[2,3,4,5,6,7]]
x_test = test[:,[2,3,4,5,6,7]]
y_train = train[:,8]
y_test = test[:,8]
model = Sequential()
model.add(Dense(500, input_dim=6, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy' , optimizer='adam', metrics=['accuracy'])
# Fit the model
batch_size = 128
epochs = 10
hist = model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=2,
)
avg = np.mean(hist.history['acc'])
print('The Average Testing Accuracy is', avg)
##Evaluate the model
score=model.evaluate(x_test, y_test, verbose=2)
print(score)

There is nothing wrong with your model, but it's possible that your model doesn't learn anything useful. It could be that you are using a learning too high or too small, that you need more epochs, or that simply your features are not useful.
Here are some advices :
You can directly add a validation set to your fit method, which will compute the same metrics on this set at the end of each epoch and will allow you to see if your model learn something useful or if it's just overfitting on the training set without having to wait for the model to finish its training. (make sure you use verbose = 1 or 2 to see the training process).
model.fit( ... , validation_data = (x_test , y_test) , ...)
I see that you used the history callback. A good practice is to see how the accuracy is changing from an epoch to another instead of taking the mean. This allows you to see if your network is effectively learning something. A network rarely converge on the firsts epochs.
Do you have an idea of the 'usefulness' of your feature ? You can get an idea of that by performing an exploratory analysis before creating your model or by fitting a more 'conventional' model (linear regression, decision trees, random forest ...). It's Highly recommended before fitting a neural network, and this also allows you to compare different types of models and to see if you realy need to use neural networks.
If you are sure that your features would at least perform better than a random guess, try playing with the learning rate. A high learnign rate could cause the model to overshoot the minimum, and a learning rate too small could cause the model to learn very slowly or to get stuck in a local minima. You could also try to tune the number of epochs.

Python/Keras - How to access each epoch prediction?

I'm using Keras to predict a time series. As standard I'm using 20 epochs.
I want to check if my model is learning well, by predicting for each one of the 20 epochs.
By using model.predict() I'm getting only one prediction among all epochs (not sure how Keras selects it). I want all predictions, or at least the 10 best.
Would anyone know how to help me?

I think there is a bit of a confusion here.
An epoch is only used while training the neural network, so when training stops (in this case, after the 20th epoch), then the weights correspond to the ones computed on the last epoch.
Keras prints current loss values on the validation set during training after each epoch. If the weights after each epoch are not saved, then they are lost. You can save weights for each epoch with the ModelCheckpoint callback, and then load them back with load_weights on your model.
You can compute your predictions after each training epoch by implementing an appropriate callback by subclassing Callback and calling predict on the model inside the on_epoch_end function.
Then to use it, you instantiate your callback, make a list and use it as keyword argument callbacks to model.fit.

The following code will do the desired job:
import tensorflow as tf
import keras
# define your custom callback for prediction
class PredictionCallback(tf.keras.callbacks.Callback):
def on_epoch_end(self, epoch, logs={}):
y_pred = self.model.predict(self.validation_data[0])
print('prediction: {} at epoch: {}'.format(y_pred, epoch))
# ...
# register the callback before training starts
model.fit(X_train, y_train, batch_size=32, epochs=25,
validation_data=(X_valid, y_valid),
callbacks=[PredictionCallback()])

In case you want to make predictions on the test data, after every epoch while the training is going-on you can try this
class CustomCallback(keras.callbacks.Callback):
def __init__(self, model, x_test, y_test):
self.model = model
self.x_test = x_test
self.y_test = y_test
def on_epoch_end(self, epoch, logs={}):
y_pred = self.model.predict(self.x_test, self.y_test)
print('y predicted: ', y_pred)
You need mention the callback during model.fit
model.sequence()
# your model architecture
model.fit(x_train, y_train, epochs=10,
callbacks=[CustomCallback(model, x_test, y_test)])
Similar to on_epoch_end there are many other methods provided by keras
on_train_begin, on_train_end, on_epoch_begin, on_epoch_end, on_test_begin,
on_test_end, on_predict_begin, on_predict_end, on_train_batch_begin, on_train_batch_end,
on_test_batch_begin, on_test_batch_end, on_predict_batch_begin,on_predict_batch_end

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.