I have a few questions about interpreting the performance of certain optimizers on MNIST using a LeNet5 network, and about what the validation loss/accuracy vs. training loss/accuracy graphs actually tell us.
Everything is done in Keras using a standard LeNet5 network, and it is run for 15 epochs with a batch size of 128.
There are two graphs, train acc vs val acc and train loss vs val loss. I made 4 graphs because I ran it twice, once with validation_split = 0.1 and once with validation_data = (x_test, y_test) in the model.fit call. Specifically, the difference is shown here:
train = model.fit(x_train, y_train, epochs=15, batch_size=128, validation_data=(x_test,y_test), verbose=1)
train = model.fit(x_train, y_train, epochs=15, batch_size=128, validation_split=0.1, verbose=1)
These are the graphs I produced:
using validation_data=(x_test, y_test):
using validation_split=0.1:
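(For reference, the curves were plotted from the History object returned by model.fit, roughly like the minimal sketch below; the metric keys 'accuracy'/'val_accuracy' are an assumption based on recent Keras versions.)

import matplotlib.pyplot as plt

# train = model.fit(...)  # History object from either of the two calls above
plt.figure()
plt.plot(train.history['accuracy'], label='train acc')
plt.plot(train.history['val_accuracy'], label='val acc')
plt.xlabel('epoch'); plt.ylabel('accuracy'); plt.legend()

plt.figure()
plt.plot(train.history['loss'], label='train loss')
plt.plot(train.history['val_loss'], label='val loss')
plt.xlabel('epoch'); plt.ylabel('loss'); plt.legend()
plt.show()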
So my two questions are:
1.) How do I interpret the train acc vs val acc and the train loss vs val loss graphs? What exactly do they tell me, and why do different optimizers perform differently (i.e. why do the graphs differ as well)?
2.) Why do the graphs change when I use validation_split instead? Which one is the better choice to use?
I will attempt to provide an answer.
You can see that towards the end the training accuracy is slightly higher than the validation accuracy and the training loss is slightly lower than the validation loss. This hints at overfitting, and if you train for more epochs the gap should widen.
Even if you use the same model with the same optimizer you will notice slight differences between runs, because the weights are initialized randomly and because of the randomness associated with the GPU implementation. You can look here for how to address this issue.
Different optimizers will usually produce different graphs because they update the model parameters differently. For example, vanilla SGD updates all parameters at a constant rate at every training step. If you add momentum, the rate depends on previous updates and usually results in faster convergence, which means you can reach the same accuracy as vanilla SGD in a smaller number of iterations.
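As a concrete illustration (a minimal sketch, not the asker's exact setup; build_lenet5 is a hypothetical helper that returns a fresh LeNet5), the only difference between the two runs below is the optimizer passed to compile, so any difference in the resulting curves comes from the update rule:

from tensorflow import keras

# Same architecture and data, two different optimizers for comparison.
for opt in [keras.optimizers.SGD(learning_rate=0.01),
            keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)]:
    model = build_lenet5()  # hypothetical helper returning a fresh, uncompiled LeNet5
    model.compile(optimizer=opt,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    history = model.fit(x_train, y_train, epochs=15, batch_size=128,
                        validation_data=(x_test, y_test), verbose=0)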
The graphs will also change because with validation_split part of the training data is held out for validation, so the model trains on a different (and smaller) set of examples. For MNIST, though, you should use the standard test split provided with the dataset.
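A minimal sketch of using that standard split (assuming the usual keras.datasets loader and a model that expects one-hot labels and a channel dimension):

from tensorflow import keras

# Standard MNIST train/test split shipped with Keras.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype('float32') / 255.0
x_test = x_test[..., None].astype('float32') / 255.0
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Validate against the standard test split rather than a random carve-out.
train = model.fit(x_train, y_train, epochs=15, batch_size=128,
                  validation_data=(x_test, y_test), verbose=1)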
I would like to monitor accuracy for my TensorFlow model. However, when I compile my model using metrics=['accuracy'] or metrics=[tf.keras.metrics.Accuracy()] and then train it, the following warning pops up:
WARNING:tensorflow: Early stopping conditioned on metric accuracy which is not available. Available metrics are: loss, val_loss
model.compile(optimizer='adam', loss='mean_squared_error', metrics=["tried both options i mentioned"])
callbacks = [EarlyStopping(monitor='accuracy', patience=1000)]
model.fit(x_train, y_train, epochs=5000, batch_size=100, validation_split=0.2, callbacks=callbacks)
Based on the link here:
Accuracy is one metric for evaluating classification models. Informally, accuracy is the fraction of predictions our model got right. Formally, accuracy has the following definition:
Accuracy = Number of correct predictions / Total number of predictions
So, for other problems such as regression you should use metrics other than accuracy, e.g. metrics=[tf.keras.metrics.MeanSquaredError()].
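Applied to the snippet in the question, a minimal sketch of the fix (an assumption about intent: since the model is trained with mean squared error and a validation_split is given, val_loss is a sensible quantity to condition early stopping on):

from tensorflow import keras
from tensorflow.keras.callbacks import EarlyStopping

model.compile(optimizer='adam',
              loss='mean_squared_error',
              metrics=[keras.metrics.MeanSquaredError()])

# Monitor a metric that is actually computed; with validation_split set,
# 'val_loss' is available and is a reasonable early-stopping signal here.
callbacks = [EarlyStopping(monitor='val_loss', patience=1000)]
model.fit(x_train, y_train, epochs=5000, batch_size=100,
          validation_split=0.2, callbacks=callbacks)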
In addition to Kaveh's answer, there are other metrics for regression problems. One that I think is quite useful is R-squared, the coefficient of determination (https://en.wikipedia.org/wiki/Coefficient_of_determination), and it isn't included in Keras.
The TensorFlow Addons library (https://www.tensorflow.org/addons) implements it, and it can be used in an ANN with the following code:
import tensorflow as tf
import tensorflow_addons as tfa

model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.01),
              loss="mean_squared_error",
              metrics=[tfa.metrics.RSquare(y_shape=(1,))])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, verbose=2)
In the above line of code, model is a sequential Keras model that has layers and has been compiled.
What is the use of the parameter validation_data? The model is going to train on the X_train and y_train data, so the parameters are adjusted and backpropagation is done based on y_train.
What is the use of validation_data, and why is a different dataset, in this case the testing data, provided?
During training, the (X_train, y_train) data is used to adjust the trainable parameters of the model. However, from that alone we don't know whether the model is overfit or underfit, i.e. whether it is going to do well when new data is provided. That is why we have validation data (X_test, y_test): to test the accuracy of the model on unseen data.
Depending on the training and validation accuracy, we can decide (see the sketch after this list):
- Whether the model is overfit/underfit,
- whether to collect more data,
- whether we need to implement a regularization technique,
- whether we need to use data augmentation techniques,
- whether we need to tune hyperparameters, etc.
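For illustration, a minimal sketch of that comparison (assuming the model was compiled with metrics=['accuracy'] and the arrays from the question):

history = model.fit(X_train, y_train, validation_data=(X_test, y_test),
                    epochs=10, verbose=2)

# A persistent, widening gap between the two curves is the usual sign of overfitting.
train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
for epoch, (t, v) in enumerate(zip(train_acc, val_acc), start=1):
    print(f"epoch {epoch}: train acc = {t:.3f}, val acc = {v:.3f}, gap = {t - v:.3f}")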
In Keras, when we are training a model for a fixed number of epochs using model.fit(), one of its parameters is shuffle (a boolean). The Keras documentation about it reads:
"Boolean (whether to shuffle the training data before each epoch)."
Essentially, I am training a Convolutional Neural Network and trying to get reproducible results. So, I followed the instructions and specified seeds as mentioned in this answer.
Although it worked partially (I got reproducible results on my local machine only), I thought that setting shuffle=False would also help by keeping the data inputs the same. But, leaving reproducibility aside for a second, doing that dramatically reduced the performance of the model. Specifically, after each epoch the metrics give the same results (meaning no improvement), and even increasing the number of epochs gives the same numbers (accuracy of ~75 after 3 epochs and after 30 epochs). Setting shuffle=True, however, shows gradual, normal improvement in results.
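(For reference, the seed setup referred to above usually looks something like the following; this is a minimal sketch of the commonly recommended recipe for TF 2.x, not my exact code, and full determinism on a GPU may additionally require framework-specific flags.)

import os
import random
import numpy as np
import tensorflow as tf

# Fix the usual sources of randomness in a Keras training run.
os.environ['PYTHONHASHSEED'] = '0'
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)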
Training data shape: (143256, 1, 150, 3)
Target data shape: (143256, 3)
Batch Size: 64
metrics = ['accuracy']
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=metrics)
....
model.fit(x_train, to_categorical(y_train), batch_size=batch_size,
          epochs=epochs, verbose=verbose,
          validation_data=(x_val, to_categorical(y_val)),
          shuffle=False, callbacks=[metrics],
          class_weight=class_weights)
Is this normal behavior when shuffling is set to false? Even though the data is not permuted, the weights should still be updated in each epoch, and hence the metrics should improve over time.
Assuming there is some issue with my implementation, should there be any significant difference in model performance between the two approaches (training with shuffling or without it)?
How can the results be reproducible with shuffle=True, which they apparently are, even if seeds are specified?
Any help will be really appreciated. Thanks!
I'm a new learner; I'm just trying to get the accuracy and validation accuracy using the code below:
from keras.models import Sequential
from keras.layers import LSTM, Dense
from matplotlib import pyplot

model = Sequential()
model.add(LSTM(10, input_shape=(train_X.shape[1], train_X.shape[2])))
#model.add(Dropout(0.2))
#model.add(LSTM(30, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
# fit network
history = model.fit(train_X, train_y, epochs=50, batch_size=120, validation_data=(test_X, test_y), verbose=2, shuffle=False)
# plot history
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()
pyplot.show()
print(history.history['acc'])
The loss value is very low (around 0.0136), but in spite of that I'm getting an accuracy of 6.9% and a validation accuracy of 2.3%, which is very low.
That is because accuracy is meaningful only for classification problems; for regression (i.e. numeric prediction) ones, such as yours, accuracy is meaningless.
What's more, Keras unfortunately will not "protect" you or any other user from putting such meaningless requests in your code, i.e. you will not get any error, or even a warning, that you are attempting something that does not make sense, such as requesting the accuracy in a regression setting; see my answer in What function defines accuracy in Keras when the loss is mean squared error (MSE)? for more details and a practical demonstration (the argument is identical in the case of MAE instead of MSE, since both loss functions signify regression problems).
In regression settings the performance metric is usually the same as the loss (here, MAE), so you should just remove the metrics=['accuracy'] argument from your model compilation and worry only about your loss (which, as you say, is indeed low).
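A minimal sketch of the corrected compile step (assuming the rest of the setup in the question stays unchanged):

model.compile(loss='mae', optimizer='adam')  # no metrics=['accuracy'] for a regression model

history = model.fit(train_X, train_y, epochs=50, batch_size=120,
                    validation_data=(test_X, test_y), verbose=2, shuffle=False)

# Track the loss (MAE) itself instead of a meaningless accuracy value.
print(history.history['loss'][-1], history.history['val_loss'][-1])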
I want to ask a question about how to monitor validation loss in the training process of estimators in TensorFlow. I have checked a similar question (validation during training of Estimator) asked before, but it did not help much.
If I use Estimators to build a model, I give an input function to the Estimator.train() function, but there is no way to also provide validation_x and validation_y data in the training process. Therefore, once training has started, I can only see the training loss. The training loss is expected to decrease the longer training runs, but that information is not helpful for preventing overfitting. The more valuable information is the validation loss, which is usually U-shaped over the number of epochs. To prevent overfitting, we want to find the number of epochs at which the validation loss is at its minimum.
So this is my problem: how can I get the validation loss for each epoch in the training process when using Estimators?
You need to create a validation input_fn and either use estimator.train() and estimator.evaluate() alternately, or simply use tf.estimator.train_and_evaluate().
x_train, y_train = ...
x_val, y_val = ...
...
# For example, if these are numpy arrays < 2 GB
def train_input_fn():
    # input_fn must be a callable that builds and returns the dataset;
    # the shuffle buffer and batch size here are just illustrative values
    return tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(1000).batch(128)

def val_input_fn():
    return tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(128)
...
estimator = ...
for epoch in range(n_epochs):
    estimator.train(input_fn=train_input_fn)
    estimator.evaluate(input_fn=val_input_fn)
estimator.evaluate() will compute the loss and any other metrics that are defined in your model_fn and will save the events in a new "eval" directory inside your job_dir.
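Alternatively, a minimal sketch of the tf.estimator.train_and_evaluate() route mentioned above (reusing the input functions from the snippet; the step counts are arbitrary placeholders):

train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=10000)
eval_spec = tf.estimator.EvalSpec(input_fn=val_input_fn,
                                  steps=None,        # evaluate on the full validation set
                                  throttle_secs=60)  # re-evaluate at most once per minute
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)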