I am trying to interpret these learning curves.
The model seems to overfit after the first epoch.
I built the model using TensorFlow and the BERT transformer.
Is there any way to interpret these curves other than concluding that the optimal number of epochs is one?
Accuracy learning curve
Loss learning curve
Related
How should I evaluate my MLPClassifier model? Are a confusion matrix, accuracy, and a classification report enough, or do I also need a ROC curve? Aside from that, how can I plot the loss for both the training and test sets? I used the loss_curve_ attribute, but it only shows the loss for the training set.
P.S. I'm dealing with a multi-class classification problem.
This is a very open question with no code, so I will answer with what I think is best. For a multi-class classification problem it is standard to use accuracy as the measure to track training. Another good measure is the F1 score. Sklearn's classification_report is a convenient way to see precision, recall, and F1 for each class at once.
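A minimal sketch of computing these metrics, assuming scikit-learn is available. The iris dataset, the hidden layer size, and the split ratio are placeholders for illustration, not the asker's setup.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: iris stands in for the asker's multi-class data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(scaler.transform(X_train), y_train)
y_pred = clf.predict(scaler.transform(X_test))

accuracy = accuracy_score(y_test, y_pred)
# Macro averaging weights every class equally, which matters if classes are imbalanced.
macro_f1 = f1_score(y_test, y_pred, average="macro")

print(f"accuracy={accuracy:.3f}  macro-F1={macro_f1:.3f}")
print(classification_report(y_test, y_pred))
```

The macro average is one choice among several; average="weighted" or average="micro" may suit a different class balance better.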
Confusion matrices come after you train the model. They are used to check where the model is failing by evaluating which classes are harder to predict.
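As a small illustration of reading a confusion matrix, here is a sketch with made-up labels (the arrays below are invented for the example). It finds the class with the lowest recall, i.e. the one the model most often gets wrong.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up true and predicted labels for a 3-class problem.
y_true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 2, 2, 2, 2, 2, 1])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])

# Recall per class: correct predictions divided by the number of true samples.
per_class_recall = cm.diagonal() / cm.sum(axis=1)
hardest = int(np.argmin(per_class_recall))

print(cm)
print("per-class recall:", per_class_recall)
print("hardest class:", hardest)
```

Off-diagonal entries show which pairs of classes get confused with each other; in this toy data, class 1 is mostly predicted as class 2.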
ROC curves are usually used for binary classification problems. They can be adapted to multi-class problems with a one-vs-rest approach, where each class in turn is treated as the positive class and all others as negative.
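A sketch of the one-vs-rest approach with scikit-learn, assuming a classifier that exposes predict_proba. LogisticRegression and the iris data are stand-ins here; an MLPClassifier could be used the same way.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Assumption: iris and LogisticRegression are placeholders for the real data and model.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = clf.predict_proba(X_test)  # shape (n_samples, n_classes)

# Single summary number: one-vs-rest AUC averaged over classes.
macro_auc = roc_auc_score(y_test, proba, multi_class="ovr", average="macro")

# One ROC curve per class: binarize the labels, then treat each column as a binary problem.
y_bin = label_binarize(y_test, classes=clf.classes_)
curves = {}
for k, cls in enumerate(clf.classes_):
    fpr, tpr, _ = roc_curve(y_bin[:, k], proba[:, k])
    curves[cls] = (fpr, tpr)

print(f"macro one-vs-rest AUC: {macro_auc:.3f}")
```

Each (fpr, tpr) pair in curves can then be plotted as a separate line.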
For the losses, it seems to me you might be confusing things. Training takes place over epochs; testing does not. If you train for 100 epochs, you have 100 loss values to plot. The test set is evaluated once on the final model, at most in batches, so there is no test loss curve to plot. If you mean validation data instead, then yes, you can compute the loss after every epoch and plot it just like the training loss.
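One possible way to get a validation loss curve from MLPClassifier is sketched below. It assumes one call to partial_fit per epoch and computes the log loss on both splits after each call. The dataset, the 50 epochs, and the layer size are arbitrary choices for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: iris stands in for the real data; 30% is held out as validation data.
X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
scaler = StandardScaler().fit(X_train)
X_train, X_val = scaler.transform(X_train), scaler.transform(X_val)

clf = MLPClassifier(hidden_layer_sizes=(16,), random_state=0)
classes = np.unique(y_train)

train_loss, val_loss = [], []
for epoch in range(50):
    # Each partial_fit call makes a single pass over the training data.
    clf.partial_fit(X_train, y_train, classes=classes)
    train_loss.append(log_loss(y_train, clf.predict_proba(X_train), labels=classes))
    val_loss.append(log_loss(y_val, clf.predict_proba(X_val), labels=classes))

print(f"final train loss={train_loss[-1]:.3f}  final validation loss={val_loss[-1]:.3f}")
```

The two lists can be plotted against the epoch index. A validation loss that starts rising while the training loss keeps falling is the usual sign of overfitting.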
I am working on a multiclass classification problem. I want to know whether my model is overfitting or underfitting. I am learning how to plot learning curves. My question is, is the order of steps I have done correct?
Scaling
Baseline model
Learning curve to see how well the baseline model performs
Hyperparameter tuning
Fit the model and predict on test data
Final learning curve to determine if the model is over or under fitting
The first plot was made after cross-validating the baseline model, before hyperparameter tuning. The second plot was made at the end, after hyperparameter tuning and fitting the final model with the best hyperparameters.
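For reference, a sketch of how such a learning curve could be produced with scikit-learn's learning_curve. Putting the scaler in a pipeline is an assumption about the setup; it means scaling is refit inside each cross-validation fold. The dataset and estimator are placeholders.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: iris and LogisticRegression stand in for the real data and model.
X, y = load_iris(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

train_sizes, train_scores, val_scores = learning_curve(
    model,
    X,
    y,
    cv=5,
    train_sizes=np.linspace(0.2, 1.0, 5),
    scoring="accuracy",
    shuffle=True,
    random_state=0,
)

# Average over the folds; the gap between the two curves hints at over- or underfitting.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)
gap = train_mean - val_mean

for n, tr, va in zip(train_sizes, train_mean, val_mean):
    print(f"n={n:3d}  train={tr:.3f}  validation={va:.3f}")
```

A large, persistent gap suggests overfitting, while two low curves that have converged suggest underfitting.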
I am training an LSTM to predict a price chart. To do so, I arrange the data into a 3D array using a fixed time window, for example 10 days, to predict the next day.
I am not shuffling the data before splitting into training and testing. So the most recent data is used as testing. Is that a good idea with a stateless LSTM?
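A minimal sketch of the windowing and unshuffled split described above, using NumPy only. The make_windows helper, the synthetic price series, and the 80/20 ratio are hypothetical choices for illustration.

```python
import numpy as np


def make_windows(series, window):
    """Turn a 1D series into X of shape (n, window, 1) and next-step targets y of shape (n,)."""
    series = np.asarray(series, dtype=float)
    n = len(series) - window
    X = np.stack([series[i:i + window] for i in range(n)])[..., np.newaxis]
    y = series[window:]
    return X, y


# Placeholder series; real prices would go here.
prices = np.arange(100, dtype=float)
X, y = make_windows(prices, window=10)

# Chronological split without shuffling: the most recent 20% of windows form the test set.
split = int(0.8 * len(X))
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

print(X_train.shape, X_test.shape)
```

Each sample is a self-contained window of 10 past values, and the last axis of size 1 is the feature dimension an LSTM layer expects.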
Secondly, during one epoch the training loss first decreases, but halfway through the epoch it suddenly increases by an order of magnitude. What are the possible reasons for that? Is this normal behavior, i.e. is it a misconception of mine that the loss within one epoch should decrease constantly?
I used Bayesian optimization to find the hyperparameters, and I am using the Adam optimizer with the default learning rate. The data is normalized between 0 and 1.
I am working on a CNN model for a self-driving car that computes the steering along with the speed, taking the previous speed sequences as input. When we trained the model, the initial loss was huge, but it then decreased. Does it matter that the loss starts out huge?
It does not. If you are training your network from scratch, it will predict poorly at first, so it is normal for the initial loss to be very high.
I'm new to neural networks. I built an ANN for a regression task and plotted the loss for training and validation, but I'm not sure whether it is a good curve. I have doubts because the validation loss goes down very steeply. What do you think?
Thanks.