Overfitting and Underfitting With Machine Learning? - python

I worked on multilabel text classification and got the results below using Binary Relevance and Label Powerset with the algorithms listed. Is the model overfitting, or is something else going on?
Binary Relevance with LinearSVC:
------Training Model Metrics-----
Hamming Loss: 0.0628
Accuracy: 0.4922
Macro Precision: 0.8449
Macro Recall: 0.7629
Macro F1-measure: 0.7977
Micro Precision: 0.8804
Micro Recall: 0.8560
Micro F1-measure: 0.8681
------Testing Model Metrics-----
Hamming Loss: 0.0779
Accuracy: 0.4243
Macro Precision: 0.6969
Macro Recall: 0.6591
Macro F1-measure: 0.6724
Micro Precision: 0.8456
Micro Recall: 0.8263
Micro F1-measure: 0.8358
Label Powerset with LinearSVC:
------Training Model Metrics-----
Hamming Loss: 0.0399
Accuracy: 0.8266
Macro Precision: 0.8659
Macro Recall: 0.8652
Macro F1-measure: 0.8648
Micro Precision: 0.9176
Micro Recall: 0.9170
Micro F1-measure: 0.9173
------Testing Model Metrics-----
Hamming Loss: 0.0931
Accuracy: 0.6111
Macro Precision: 0.6554
Macro Recall: 0.6622
Macro F1-measure: 0.6573
Micro Precision: 0.8066
Micro Recall: 0.8053
Micro F1-measure: 0.8059
Binary Relevance with KNeighborsClassifier:
------Training Model Metrics-----
Hamming Loss: 0.0716
Accuracy: 0.5009
Macro Precision: 0.8529
Macro Recall: 0.5865
Macro F1-measure: 0.6740
Micro Precision: 0.9005
Micro Recall: 0.7903
Micro F1-measure: 0.8418
------Testing Model Metrics-----
Hamming Loss: 0.0840
Accuracy: 0.4532
Macro Precision: 0.8107
Macro Recall: 0.5487
Macro F1-measure: 0.6324
Micro Precision: 0.8748
Micro Recall: 0.7589
Micro F1-measure: 0.8127
Label Powerset with KNeighborsClassifier:
------Training Model Metrics-----
Hamming Loss: 0.0709
Accuracy: 0.6994
Macro Precision: 0.7713
Macro Recall: 0.7264
Macro F1-measure: 0.7452
Micro Precision: 0.8553
Micro Recall: 0.8499
Micro F1-measure: 0.8526
------Testing Model Metrics-----
Hamming Loss: 0.0859
Accuracy: 0.6359
Macro Precision: 0.7064
Macro Recall: 0.6689
Macro F1-measure: 0.6833
Micro Precision: 0.8238
Micro Recall: 0.8168
Micro F1-measure: 0.8203

Related

Custom loss function's results does not match with the inbuilt loss function's result

I implemented a custom binary cross-entropy loss function in TensorFlow. To test it, I compared it with TensorFlow's built-in binary cross-entropy loss, but the two give very different results, and I don't understand why.
def custom_loss(eps,w1,w2):
    def loss(y_true, y_pred):
        ans = -1*(w1*y_true*tf.log(y_pred+eps) + w2*(1-y_true)*tf.log(y_pred+eps))
        return ans
    return loss
I set eps to 1e-6, w1=1 and w2=1. With my implementation, the loss dropped to very small values, whereas the built-in TensorFlow loss function showed a steady drop.
Edit:
Here are the outputs:
1: Using the custom implementation:
1/650 [..............................] - ETA: 46:37 - loss: 0.8810 - acc: 0.50
2/650 [..............................] - ETA: 41:27 - loss: 0.4405 - acc: 0.40
3/650 [..............................] - ETA: 39:38 - loss: 0.2937 - acc: 0.41
4/650 [..............................] - ETA: 38:44 - loss: 0.2203 - acc: 0.45
5/650 [..............................] - ETA: 38:13 - loss: 0.1762 - acc: 0.46
6/650 [..............................] - ETA: 37:47 - loss: 0.1468 - acc: 0.42
7/650 [..............................] - ETA: 37:29 - loss: 0.1259 - acc: 0
2: Using the built-in loss function with eps=1e-7:
1/650 [..............................] - ETA: 48:15 - loss: 2.4260 - acc: 0.31
2/650 [..............................] - ETA: 42:09 - loss: 3.1842 - acc: 0.46
3/650 [..............................] - ETA: 40:10 - loss: 3.4615 - acc: 0.47
4/650 [..............................] - ETA: 39:06 - loss: 3.9737 - acc: 0.45
5/650 [..............................] - ETA: 38:28 - loss: 4.5173 - acc: 0.47
6/650 [..............................] - ETA: 37:58 - loss: 5.1865 - acc: 0.45
7/650 [..............................] - ETA: 37:41 - loss: 5.8239 - acc: 0.43
8/650 [..............................] - ETA: 37:24 - loss: 5.6979 - acc: 0.46
9/650 [..............................] - ETA: 37:12 - loss: 5.5973 - acc: 0.47
The input is an image from the MURA dataset. To keep the comparison fair, the same images are passed in both tests.
You have a slight error in your implementation.
You have:
ans = -1*(w1*y_true*tf.log(y_pred+eps) + w2*(1-y_true)*tf.log(y_pred + eps))
Whereas, I think you were aiming for:
ans = -1*(w1*y_true*tf.log(y_pred+eps) + w2*(1-y_true)*tf.log(1 - y_pred + eps))
Generally we also take the average of this loss so that makes our implementation:
def custom_loss(eps,w1,w2):
    def loss(y_true, y_pred):
        ans = -1*(w1*y_true*tf.log(y_pred+eps) + w2*(1-y_true)*tf.log(1-y_pred+eps))
        return tf.reduce_mean(ans)
    return loss
which we can now test against the out of the box implementation:
y_true = tf.constant([0.1, 0.2])
y_pred = tf.constant([0.11, 0.19])
custom_loss(1e-6, 1, 1)(y_true, y_pred) # == 0.41316 (eps, w1, w2 as in the question)
tf.keras.losses.binary_crossentropy(y_true, y_pred) # == 0.41317
and find that the results agree to four decimal places. I can't account for the small difference (perhaps a different epsilon value), but it is negligible.
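To see why the original version dropped toward zero so quickly, here is a minimal NumPy sketch (not TensorFlow; the names `buggy_bce` and `weighted_bce` are made up for illustration, and both versions are averaged so they can be compared directly). When both log terms use `y_pred`, the label cancels out for w1 = w2 = 1 and the loss reduces to `-log(y_pred + eps)`, so predicting 1 for every input makes it tiny regardless of the labels.

```python
import numpy as np

def buggy_bce(y_true, y_pred, eps=1e-6, w1=1.0, w2=1.0):
    """Formula from the question: both terms use log(y_pred + eps)."""
    ans = -(w1 * y_true * np.log(y_pred + eps)
            + w2 * (1 - y_true) * np.log(y_pred + eps))
    return float(np.mean(ans))

def weighted_bce(y_true, y_pred, eps=1e-6, w1=1.0, w2=1.0):
    """Corrected formula: the negative-class term uses log(1 - y_pred + eps)."""
    ans = -(w1 * y_true * np.log(y_pred + eps)
            + w2 * (1 - y_true) * np.log(1 - y_pred + eps))
    return float(np.mean(ans))

# Made-up example: one negative and one positive label,
# and a model that predicts "positive" for everything.
labels = np.array([0.0, 1.0])
always_one = np.array([0.999, 0.999])

loss_buggy = buggy_bce(labels, always_one)     # tiny, although one prediction is wrong
loss_fixed = weighted_bce(labels, always_one)  # large, because the negative is penalised
print(loss_buggy, loss_fixed)
```

With the corrected formula, the confident wrong prediction on the negative example dominates the loss, which is the behaviour expected of binary cross-entropy.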

Cannot develop a custom metric for Keras model

I am developing a Keras model for a multi-class classification problem (4 classes) with a custom metric.
The problem is that the custom metric does not work: when I run the model, the metric values are empty.
This is my model:
nb_classes = 4
model = Sequential()
model.add(LSTM(
    units=50,
    return_sequences=True,
    input_shape=(20,18),
    dropout=0.2,
    recurrent_dropout=0.2
))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(units=nb_classes,
                activation='softmax'))
model.compile(loss="categorical_crossentropy",optimizer='adadelta')
history = model.fit(np.array(X_train), y_train,
                    validation_data=(np.array(X_test), y_test),
                    epochs=50,
                    batch_size=2,
                    callbacks=[model_metrics],
                    shuffle=False,
                    verbose=1)
This is how model_metrics is defined:
class Metrics(Callback):
    def on_train_begin(self, logs={}):
        self.val_f1s = []
        self.val_recalls = []
        self.val_precisions = []

    def on_epoch_end(self, epoch, logs={}):
        val_predict = np.argmax((np.asarray(self.model.predict(self.validation_data[0]))).round(), axis=1)
        val_targ = np.argmax(self.validation_data[1], axis=1)
        _val_f1 = metrics.f1_score(val_targ, val_predict, average='weighted')
        _val_recall = metrics.recall_score(val_targ, val_predict, average='weighted')
        _val_precision = metrics.precision_score(val_targ, val_predict, average='weighted')
        self.val_f1s.append(_val_f1)
        self.val_recalls.append(_val_recall)
        self.val_precisions.append(_val_precision)
        print(" — val_f1: %f — val_precision: %f — val_recall %f".format(_val_f1, _val_precision, _val_recall))
        return

model_metrics = Metrics()
When I run fit, I get this result:
Train on 400 samples, validate on 80 samples
Epoch 1/50
400/400 [==============================] - 7s 17ms/step - loss: 0.6892 - val_loss: 4.8016
— val_f1: %f — val_precision: %f — val_recall %f
Epoch 2/50
20/400 [>.............................] - ETA: 3s - loss: 2.8010
/Users/tau/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py:1143: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
/Users/tau/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py:1143: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
400/400 [==============================] - 3s 9ms/step - loss: 0.7593 - val_loss: 4.5832
— val_f1: %f — val_precision: %f — val_recall %f
Epoch 3/50
400/400 [==============================] - 4s 9ms/step - loss: 0.6809 - val_loss: 4.9039
— val_f1: %f — val_precision: %f — val_recall %f
You can see val_f1: %f — val_precision: %f — val_recall %f. There are no values of metrics. Why? What am I doing wrong?
Your problem is not in Keras. You are using Python string formatting incorrectly: str.format fills {} placeholders and ignores %f, so the string is printed unchanged. Here is the correct usage:
print(" — val_f1: {:f} — val_precision: {:f} — val_recall {:f}".format(_val_f1, _val_precision, _val_recall))
Alternatively:
print(" — val_f1: %f — val_precision: %f — val_recall %f" % (_val_f1, _val_precision, _val_recall))
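The difference between the two formatting styles can be checked in isolation. This is a small sketch with made-up metric values, not output from the model above:

```python
# Made-up values standing in for the metrics computed in the callback.
val_f1, val_precision, val_recall = 0.5, 0.25, 0.75

# str.format only replaces {} fields, so the %f markers are left untouched.
broken = "val_f1: %f, val_precision: %f, val_recall: %f".format(
    val_f1, val_precision, val_recall)

# Either of these two styles substitutes the values.
with_format = "val_f1: {:f}, val_precision: {:f}, val_recall: {:f}".format(
    val_f1, val_precision, val_recall)
with_percent = "val_f1: %f, val_precision: %f, val_recall: %f" % (
    val_f1, val_precision, val_recall)

print(broken)
print(with_format)
print(with_percent)
```

The first line printed still contains the literal `%f` markers, which is the symptom shown in the training log.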

ValueError: Unknown metric function when using custom metric in Keras

Keras 2.x removed several useful metrics that I need, so I copied the functions from the old metrics.py file into my code and included them as follows.
def precision(y_true, y_pred): #taken from old keras source code
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def recall(y_true, y_pred): #taken from old keras source code
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall
...
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy', precision, recall])
and this results in
ValueError: Unknown metric function:precision
What am I doing wrong? I can't see anything I'm doing wrong according to Keras documentation.
edit:
Here is the full Traceback:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/keras/models.py", line 274, in load_model
    sample_weight_mode=sample_weight_mode)
  File "/Library/Python/2.7/site-packages/keras/models.py", line 824, in compile
    **kwargs)
  File "/Library/Python/2.7/site-packages/keras/engine/training.py", line 934, in compile
    handle_metrics(output_metrics)
  File "/Library/Python/2.7/site-packages/keras/engine/training.py", line 901, in handle_metrics
    metric_fn = metrics_module.get(metric)
  File "/Library/Python/2.7/site-packages/keras/metrics.py", line 75, in get
    return deserialize(str(identifier))
  File "/Library/Python/2.7/site-packages/keras/metrics.py", line 67, in deserialize
    printable_module_name='metric function')
  File "/Library/Python/2.7/site-packages/keras/utils/generic_utils.py", line 164, in deserialize_keras_object
    ':' + function_name)
ValueError: Unknown metric function:precision
<FATAL> : Failed to load Keras model from file:
model.h5
***> abort program execution
Traceback (most recent call last):
File "classification.py", line 84, in <module>
'H:!V:FilenameModel=model.h5:NumEpochs=20:BatchSize=32')
#:VarTransform=D,G
TypeError: none of the 3 overloaded methods succeeded. Full details:
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader* loader, TString theMethodName, TString methodTitle, TString theOption = "") =>
    could not convert argument 2
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader* loader, TMVA::Types::EMVA theMethod, TString methodTitle, TString theOption = "") =>
    FATAL error (C++ exception of type runtime_error)
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader*, TMVA::Types::EMVA, TString, TString, TMVA::Types::EMVA, TString) =>
    takes at least 6 arguments (4 given)
From the traceback it seems that the problem occurs when you try to load the saved model:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/keras/models.py", line 274, in load_model
    sample_weight_mode=sample_weight_mode)
...
ValueError: Unknown metric function:precision
<FATAL> : Failed to load Keras model from file:
model.h5
Take a look at this issue: https://github.com/keras-team/keras/issues/10104
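You need to pass your custom objects when loading the model; in your case the dictionary would map 'precision' and 'recall' to your functions. For example, with a custom metric called auc_roc:
dependencies = {
    'auc_roc': auc_roc
}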
model = keras.models.load_model(self.output_directory + 'best_model.hdf5', custom_objects=dependencies)
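The reason this helps is that a saved model records a custom metric only by name, so the loader needs a way to map that name back to a function. The following is a toy, standard-library sketch of such a lookup. It is not Keras's actual code: `resolve_metric` and `BUILTIN_METRICS` are hypothetical names invented for illustration.

```python
def precision(y_true, y_pred):
    """Stand-in for the custom metric from the question."""
    return 0.0

# Hypothetical stand-in for the metric names a framework ships with.
BUILTIN_METRICS = {"accuracy": lambda y_true, y_pred: 0.0}

def resolve_metric(name, custom_objects=None):
    """Toy lookup: user-supplied objects first, then built-ins, else fail."""
    custom_objects = custom_objects or {}
    if name in custom_objects:
        return custom_objects[name]
    if name in BUILTIN_METRICS:
        return BUILTIN_METRICS[name]
    raise ValueError("Unknown metric function:" + name)

# Without the mapping, the name cannot be resolved.
try:
    resolve_metric("precision")
    failed = False
except ValueError as err:
    failed = True
    message = str(err)

# With the mapping, the same name resolves to the user's function.
resolved = resolve_metric("precision", custom_objects={"precision": precision})
print(failed, message, resolved is precision)
```

In this sketch the first call fails with the same kind of message as in the traceback, and the second succeeds once the name is supplied in the dictionary.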
My suggestion would be to compute your metrics in a Keras callback, because:
It can achieve the same thing as the metrics argument does.
It also lets you implement a model-saving strategy.
class Checkpoint(keras.callbacks.Callback):
    def __init__(self, test_data, filename):
        super(Checkpoint, self).__init__()
        self.test_data = test_data
        self.filename = filename

    def on_train_begin(self, logs=None):
        self.pre = [0.]
        self.rec = [0.]
        print('Test on %s begins' % self.filename)

    def on_train_end(self, logs=None):
        print('Best Precision: %s' % max(self.pre))
        print('Best Recall: %s' % max(self.rec))

    def on_epoch_end(self, epoch, logs=None):
        x, y = self.test_data
        # Predict on the held-out data, then evaluate the backend metrics
        # (precision/recall and K = keras.backend as in the question)
        y_pred = self.model.predict(x)
        pre = K.eval(precision(K.constant(y), K.constant(y_pred)))
        rec = K.eval(recall(K.constant(y), K.constant(y_pred)))
        # print your precision or recall as you want
        print('precision: %s, recall: %s' % (pre, rec))
        # Save the model when precision beats the best value seen so far
        if pre > max(self.pre):
            self.model.save(self.filename, overwrite=True)
            print('Higher precision found. Save as %s' % self.filename)
        self.pre.append(pre)
        self.rec.append(rec)
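After that, pass the callback to model.fit (callbacks are an argument of fit, not compile):
checkpoint = Checkpoint((x_test, y_test), 'precison.h5')
model.compile(loss='categorical_crossentropy', optimizer='adam')
# x_train and y_train are placeholder names for your training data
model.fit(x_train, y_train, callbacks=[checkpoint])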
I tested your code in Python 3.6.5, TensorFlow==1.9 and Keras==2.2.2 and it worked. I think the error could be due to Python 2 usage.
import numpy as np
import tensorflow as tf
import keras
import keras.backend as K
from keras.layers import Dense
from keras.models import Sequential, Input, Model
from sklearn import datasets
print(f"TF version: {tf.__version__}, Keras version: {keras.__version__}\n")
# dummy dataset
iris = datasets.load_iris()
x, y_ = iris.data, iris.target
def one_hot(v): return np.eye(len(np.unique(v)))[v]
y = one_hot(y_)
# model
inp = Input(shape=(4,))
dense = Dense(8, activation='relu')(inp)
dense = Dense(16, activation='relu')(dense)
dense = Dense(3, activation='softmax')(dense)
model = Model(inputs=inp, outputs=dense)
# custom metrics
def precision(y_true, y_pred): #taken from old keras source code
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def recall(y_true, y_pred): #taken from old keras source code
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall
# training
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy', precision, recall])
model.fit(x=x, y=y, batch_size=8, epochs=15)
Output:
TF version: 1.9.0, Keras version: 2.2.2
Epoch 1/15
150/150 [==============================] - 0s 2ms/step - loss: 1.2098 - acc: 0.2600 - precision: 0.0000e+00 - recall: 0.0000e+00
Epoch 2/15
150/150 [==============================] - 0s 135us/step - loss: 1.1036 - acc: 0.4267 - precision: 0.0000e+00 - recall: 0.0000e+00
Epoch 3/15
150/150 [==============================] - 0s 132us/step - loss: 1.0391 - acc: 0.5733 - precision: 0.0000e+00 - recall: 0.0000e+00
Epoch 4/15
150/150 [==============================] - 0s 133us/step - loss: 0.9924 - acc: 0.6533 - precision: 0.0000e+00 - recall: 0.0000e+00
Epoch 5/15
150/150 [==============================] - 0s 108us/step - loss: 0.9379 - acc: 0.6667 - precision: 0.0000e+00 - recall: 0.0000e+00
Epoch 6/15
150/150 [==============================] - 0s 134us/step - loss: 0.8802 - acc: 0.6667 - precision: 0.0533 - recall: 0.0067
Epoch 7/15
150/150 [==============================] - 0s 167us/step - loss: 0.8297 - acc: 0.7867 - precision: 0.4133 - recall: 0.0800
Epoch 8/15
150/150 [==============================] - 0s 138us/step - loss: 0.7743 - acc: 0.8200 - precision: 0.9467 - recall: 0.3667
Epoch 9/15
150/150 [==============================] - 0s 161us/step - loss: 0.7232 - acc: 0.7467 - precision: 1.0000 - recall: 0.5667
Epoch 10/15
150/150 [==============================] - 0s 134us/step - loss: 0.6751 - acc: 0.8000 - precision: 0.9733 - recall: 0.6333
Epoch 11/15
150/150 [==============================] - 0s 134us/step - loss: 0.6310 - acc: 0.8867 - precision: 0.9924 - recall: 0.6400
Epoch 12/15
150/150 [==============================] - 0s 131us/step - loss: 0.5844 - acc: 0.8867 - precision: 0.9759 - recall: 0.6600
Epoch 13/15
150/150 [==============================] - 0s 111us/step - loss: 0.5511 - acc: 0.9133 - precision: 0.9759 - recall: 0.6533
Epoch 14/15
150/150 [==============================] - 0s 134us/step - loss: 0.5176 - acc: 0.9000 - precision: 0.9403 - recall: 0.6733
Epoch 15/15
150/150 [==============================] - 0s 134us/step - loss: 0.4899 - acc: 0.8667 - precision: 0.8877 - recall: 0.6733
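The zero precision and recall in the first epochs of this log are consistent with how these metrics round the predictions: while every softmax output is below 0.5, each one rounds to 0 and no positives are counted. Here is a NumPy re-implementation of the same arithmetic as a sketch. The names `precision_np` and `recall_np` and the sample arrays are made up, and `EPS` stands in for `K.epsilon()` at its default of 1e-7.

```python
import numpy as np

EPS = 1e-7  # stands in for K.epsilon()

def precision_np(y_true, y_pred):
    """Same arithmetic as the Keras backend version, on NumPy arrays."""
    true_positives = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    predicted_positives = np.sum(np.round(np.clip(y_pred, 0, 1)))
    return true_positives / (predicted_positives + EPS)

def recall_np(y_true, y_pred):
    """Same arithmetic as the Keras backend version, on NumPy arrays."""
    true_positives = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    possible_positives = np.sum(np.round(np.clip(y_true, 0, 1)))
    return true_positives / (possible_positives + EPS)

y_true = np.eye(3)                   # three one-hot labels, one per class
unsure = np.full((3, 3), 1.0 / 3.0)  # untrained model: every class at 1/3
confident = np.array([[0.8, 0.1, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.1, 0.1, 0.8]])

print(precision_np(y_true, unsure), recall_np(y_true, unsure))
print(precision_np(y_true, confident), recall_np(y_true, confident))
```

Both metrics stay at zero for the uniform predictions and approach one only when the correct class probability rises above 0.5, even though an argmax-based accuracy could already be well above zero.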

In Neural Networks: accuracy improvement after each epoch is GREATER than accuracy improvement after each batch. Why?

I am training a neural network in batches with the Keras 2.0 package for Python.
Below is some information about the data and the training parameters:
#samples in train: 414934
#features: 590093
#classes: 2 (binary classification problem)
batch size: 1024
#batches = 406 (414934 / 1024 = 405.2)
Below are some logs produced by the following code:
for i in range(epochs):
    print("train_model:: starting epoch {0}/{1}".format(i + 1, epochs))
    model.fit_generator(generator=batch_generator(data_train, target_train, batch_size),
                        steps_per_epoch=num_of_batches,
                        epochs=1,
                        verbose=1)
(partial) Logs:
train_model:: starting epoch 1/3
Epoch 1/1
1/406 [..............................] - ETA: 11726s - loss: 0.7993 - acc: 0.5996
2/406 [..............................] - ETA: 11237s - loss: 0.7260 - acc: 0.6587
3/406 [..............................] - ETA: 14136s - loss: 0.6619 - acc: 0.7279
404/406 [============================>.] - ETA: 53s - loss: 0.3542 - acc: 0.8917
405/406 [============================>.] - ETA: 26s - loss: 0.3541 - acc: 0.8917
406/406 [==============================] - 10798s - loss: 0.3539 - acc: 0.8918
train_model:: starting epoch 2/3
Epoch 1/1
1/406 [..............................] - ETA: 15158s - loss: 0.2152 - acc: 0.9424
2/406 [..............................] - ETA: 14774s - loss: 0.2109 - acc: 0.9419
3/406 [..............................] - ETA: 16132s - loss: 0.2097 - acc: 0.9408
404/406 [============================>.] - ETA: 64s - loss: 0.2225 - acc: 0.9329
405/406 [============================>.] - ETA: 32s - loss: 0.2225 - acc: 0.9329
406/406 [==============================] - 13127s - loss: 0.2225 - acc: 0.9329
train_model:: starting epoch 3/3
Epoch 1/1
1/406 [..............................] - ETA: 22631s - loss: 0.1145 - acc: 0.9756
2/406 [..............................] - ETA: 24469s - loss: 0.1220 - acc: 0.9688
3/406 [..............................] - ETA: 23475s - loss: 0.1202 - acc: 0.9691
404/406 [============================>.] - ETA: 60s - loss: 0.1006 - acc: 0.9745
405/406 [============================>.] - ETA: 31s - loss: 0.1006 - acc: 0.9745
406/406 [==============================] - 11147s - loss: 0.1006 - acc: 0.9745
My question is: what happens after each epoch that improves the accuracy like that? For example, the accuracy at the end of the first epoch is 0.8918, but at the beginning of the second epoch accuracy of 0.9424 is observed. Similarly, the accuracy at the end of the second epoch is 0.9329, but the third epoch starts with accuracy of 0.9756.
I would expect to find an accuracy of ~0.8918 at the beginning of the second epoch, and ~0.9329 at the beginning of the third epoch.
I know that in each batch there is one forward pass and one backward pass of training samples in the batch. Thus, in each epoch there is one forward pass and one backward pass of all training samples.
Also, from Keras documentation:
Epoch: an arbitrary cutoff, generally defined as "one pass over the entire dataset", used to separate training into distinct phases, which is useful for logging and periodic evaluation.
Why is the accuracy improvement within each epoch smaller than the accuracy improvement between the end of epoch X and the beginning of epoch X+1?
This has nothing to do with your model or your dataset; the reason for this "jump" lies in how metrics are calculated and displayed in Keras.
As Keras processes batch after batch, it saves accuracies at each one of them, and what it displays to you is not the accuracy on the latest processed batch, but the average over all batches in the current epoch. And, as the model is being trained, accuracies over successive batches tend to improve.
Now consider: in the first epoch, let's say, there are 50 batches, and the network went from 0% to 90% accuracy over these 50 batches. Then at the end of the epoch Keras will show an accuracy of, e.g., (0 + 0.1 + 0.5 + ... + 90) / 50 percent, which is obviously much less than 90%. But because your actual accuracy is 90%, the first batch of the second epoch will show about 90%, giving the impression of a sudden "jump" in quality. The same goes for loss or any other metric.
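The effect can be reproduced with a few lines of plain Python. This is a simplified sketch that assumes equally sized batches and uses made-up per-batch accuracies; `running_means` is a hypothetical helper, not a Keras function.

```python
def running_means(batch_values):
    """Return the running average after each batch, as a progress bar would show it."""
    shown, total = [], 0.0
    for count, value in enumerate(batch_values, start=1):
        total += value
        shown.append(total / count)
    return shown

# Made-up per-batch accuracies: the model keeps improving across both epochs.
epoch1_batches = [0.60, 0.70, 0.80, 0.90]
epoch2_batches = [0.92, 0.93, 0.94, 0.95]

# The running average restarts at the beginning of every epoch.
epoch1_display = running_means(epoch1_batches)
epoch2_display = running_means(epoch2_batches)

print(epoch1_display[-1])  # value shown at the end of epoch 1
print(epoch2_display[0])   # value shown after the first batch of epoch 2
```

The last batch of the first epoch already scores 0.90, but the displayed value is dragged down by the earlier, weaker batches. Once the average restarts, the display reflects the current state of the model, which looks like a jump.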
Now, if you want a more realistic and trustworthy calculation of accuracy, loss, or any other metric, I would suggest using the validation_data parameter in model.fit[_generator] to provide validation data, which is not used for training but only to evaluate the network at the end of each epoch, without averaging over various points in time.
The accuracy at the end of an epoch is the accuracy over the full dataset. The accuracy after each batch is the accuracy over all batches that are used for training at that moment. It could be the case that your first batch is predicted very well and the following batches have a lower accuracy. In that case the accuracy over your full dataset will be low compared to the accuracy of your first batch.

What does acc mean in the Keras model.fit output: the accuracy of the final iteration in an epoch or the average accuracy over the epoch?

I want to know whether the printed accuracy is the accuracy of the final iteration in an epoch or the average accuracy over the epoch.
code:
history=model.fit(data,label,nb_epoch=1,batch_size=32,validation_data=(X_test,Y_test))
the printed log:
Epoch 1/1
128/128 [==============================] - 17s - loss: 2.3152 - acc: 0.0859 - val_loss: 2.3010 - val_acc: 0.1157
According to the callback and history documentation:
acc represents the average training accuracy at the end of an epoch.
val_acc represents the accuracy on the validation set at the end of an epoch.
