I was wondering if it was possible to save a partly trained Keras model and continue the training after loading the model again.
The reason for this is that I will have more training data in the future and I do not want to retrain the whole model again.
The functions which I am using are:
#Partly train model
model.fit(first_training, first_classes, batch_size=32, nb_epoch=20)
#Save partly trained model
model.save('partly_trained.h5')
#Load partly trained model
from keras.models import load_model
model = load_model('partly_trained.h5')
#Continue training
model.fit(second_training, second_classes, batch_size=32, nb_epoch=20)
Edit 1: added fully working example
With the first dataset, after 10 epochs, the loss of the last epoch will be 0.0748 and the accuracy 0.9863.
After saving, deleting, and reloading the model, the loss and accuracy of the model trained on the second dataset will be 0.1711 and 0.9504 respectively.
Is this caused by the new training data or by a completely re-trained model?
"""
Model by: http://machinelearningmastery.com/
"""
# load (downloaded if needed) the MNIST dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.models import load_model
numpy.random.seed(7)
def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, init='normal', activation='relu'))
    model.add(Dense(num_classes, init='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
if __name__ == '__main__':
    # load data
    (X_train, y_train), (X_test, y_test) = mnist.load_data()

    # flatten 28*28 images to a 784 vector for each image
    num_pixels = X_train.shape[1] * X_train.shape[2]
    X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
    X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')

    # normalize inputs from 0-255 to 0-1
    X_train = X_train / 255
    X_test = X_test / 255

    # one hot encode outputs
    y_train = np_utils.to_categorical(y_train)
    y_test = np_utils.to_categorical(y_test)
    num_classes = y_test.shape[1]

    # build the model
    model = baseline_model()

    # partly train model
    dataset1_x = X_train[:3000]
    dataset1_y = y_train[:3000]
    model.fit(dataset1_x, dataset1_y, nb_epoch=10, batch_size=200, verbose=2)

    # final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100 - scores[1] * 100))

    # save partly trained model
    model.save('partly_trained.h5')
    del model

    # reload model
    model = load_model('partly_trained.h5')

    # continue training
    dataset2_x = X_train[3000:]
    dataset2_y = y_train[3000:]
    model.fit(dataset2_x, dataset2_y, nb_epoch=10, batch_size=200, verbose=2)
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100 - scores[1] * 100))
Edit 2: tensorflow.keras remarks
For tensorflow.keras, change the parameter nb_epoch to epochs in the model fit. The imports and the baseline_model function are:
import numpy
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import load_model
numpy.random.seed(7)
def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
Actually, model.save saves all the information needed for restarting training in your case. The only thing that could be spoiled by reloading the model is your optimizer state. To check that, try saving and reloading the model and training it on the training data.
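A minimal sketch of that check, reusing model, first_training, and first_classes from the question (the filename is illustrative):

from keras.models import load_model

# Evaluate, round-trip the model through disk, then evaluate again.
loss_before = model.evaluate(first_training, first_classes, verbose=0)
model.save('check.h5')
model = load_model('check.h5')
loss_after = model.evaluate(first_training, first_classes, verbose=0)

# Matching numbers mean the weights survived the round trip; a jump
# in loss when you then call fit() points at lost optimizer state.
print(loss_before, loss_after)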
Most of the above answers covered important points. If you are using a recent TensorFlow (TF2.1 or above), then the following example will help you. The model part of the code is from the TensorFlow website.
import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
def create_model():
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model
# Create a basic model instance
model = create_model()
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test), verbose=1)
Please save the model in *.tf format. From my experience, if you have any custom_loss defined, the *.h5 format will not save the optimizer status and hence will not serve your purpose if you want to retrain the model from where you left off.
# saving the model in tensorflow format
model.save('./MyModel_tf', save_format='tf')

# loading the saved model
loaded_model = tf.keras.models.load_model('./MyModel_tf')

# retraining the model
loaded_model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test), verbose=1)
This approach will restart the training where we left off before saving the model. As mentioned by others, if you want to save the weights of the best model, or to save the weights every epoch, you need to use the Keras callback ModelCheckpoint with options such as save_weights_only=True, save_freq='epoch', and save_best_only=True.
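A minimal sketch of that callback setup, reusing the model and data from the example above (the checkpoint path is illustrative):

checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath='./checkpoints/best_model.tf',  # hypothetical path
    save_weights_only=True,   # store only the weights
    save_freq='epoch',        # consider a save after every epoch
    save_best_only=True,      # keep only the best model seen so far
    monitor='val_accuracy')

model.fit(x_train, y_train, epochs=10,
          validation_data=(x_test, y_test),
          callbacks=[checkpoint], verbose=1)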
For more details, please check here and another example here.
The problem might be that you use a different optimizer, or different arguments to your optimizer. I just had the same issue with a custom pretrained model, using
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=lr_reduction_factor,
                              patience=patience, min_lr=min_lr, verbose=1)
for the pretrained model, whereby the original learning rate starts at 0.0003 and during pre-training is reduced to the minimum learning rate of 0.000003.
I just copied that line over to the script that uses the pre-trained model and got really bad accuracies, until I noticed that the last learning rate of the pretrained model was the minimum learning rate, i.e. 0.000003. If I start with that learning rate, I get exactly the same starting accuracies as the output of the pretrained model. This makes sense: starting with a learning rate 100 times bigger than the last one used in the pretrained model results in a huge overshoot of gradient descent, and hence in heavily decreased accuracies.
Notice that Keras sometimes has issues with loaded models, as in here.
This might explain cases in which you don't start from the same trained accuracy.
You might also be hitting concept drift; see Should you retrain a model when new observations are available. There's also the concept of catastrophic forgetting, which a bunch of academic papers discuss. Here's one with MNIST: Empirical investigation of catastrophic forgetting.
All of the above helps; you must resume from the same learning rate (LR) as when the model and weights were saved. Set it directly on the optimizer.
Note that improvement from there is not guaranteed, because the model may have reached a local minimum, which may be global. There is no point in resuming a model to search for another local minimum unless you intend to increase the learning rate in a controlled fashion and nudge the model into a possibly better minimum not far away.
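A minimal sketch of setting the LR directly on the optimizer after reloading, reusing the names from the question (the value 3e-6 is illustrative; use whatever LR was in effect when you saved):

from tensorflow.keras.models import load_model
import tensorflow.keras.backend as K

model = load_model('partly_trained.h5')
K.set_value(model.optimizer.learning_rate, 3e-6)  # restore the last LR used
model.fit(second_training, second_classes, batch_size=32, epochs=20)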
If you are using TF2, use the new SavedModel method (pb format). More information is available here and here.
model.fit(x=X_train, y=y_train, epochs=10, callbacks=[model_callback])  # your first training
tf.saved_model.save(model, save_to_dir_path)  # save the model
del model  # delete the model
model = tf.keras.models.load_model(save_to_dir_path)
model.fit(x=X_train, y=y_train, epochs=10, callbacks=[model_callback])  # your second training
It is completely okay to continue training from a saved model. I trained the saved model with the same data and found it was giving good accuracy. Moreover, the time taken in each epoch was considerably less.
Here is the code; have a look:
from keras.models import load_model
model = load_model('/content/drive/MyDrive/CustomResNet/saved_models/model_1.h5')
history = model.fit(train_gen, validation_data=valid_gen, epochs=5)
Related
I have trained a text classification model that works well. I wanted to dig deeper and understand which words/phrases in a sentence are most impactful on the classification outcome, i.e. which words are most important for each class.
I am using Keras for the classification and below is the code I am using to train the model. It's a simple embedding plus max-pooling text classification model that I am using.
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping
# early stopping
callbacks = tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', min_delta=0,
                                             patience=5, verbose=2, mode='auto',
                                             restore_best_weights=True)

# select optimizer
opt = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999,
                               epsilon=1e-07, amsgrad=False, name="Adam")

embedding_dim = 50

# declare model
model = Sequential()
model.add(layers.Embedding(input_dim=vocab_size,
                           output_dim=embedding_dim,
                           input_length=maxlen))
model.add(layers.GlobalMaxPool1D())
model.add(layers.Dense(10, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer=opt,
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()

# fit model
history = model.fit(X_tr, y_tr,
                    epochs=20,
                    verbose=True,
                    validation_data=(X_te, y_te),
                    batch_size=10, callbacks=[callbacks])
loss, accuracy = model.evaluate(X_tr, y_tr, verbose=False)
How do I extract the phrases/words that have the maximum impact on the classification outcome?
It seems that the keywords you need are "neural network interpretability" and "feature attribution". One of the best-known methods in this area is called Integrated Gradients; it shows how the model prediction depends on each input feature (each word embedding, in your case).
This tutorial shows how to implement IG in pure tensorflow for images, and this one uses the alibi library to highlight the words in the input text with the highest impact on a classification model.
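For a rough idea of the mechanics, here is a minimal sketch of Integrated Gradients in TensorFlow, reusing model, maxlen, and embedding_dim from the question; the zero-embedding baseline, the step count, and the integrated_gradients helper are my own choices, not part of any library API:

import tensorflow as tf

embedding_layer = model.layers[0]  # the Embedding layer of the model above

# Sub-model mapping embeddings (not token ids) to the prediction.
embed_input = tf.keras.Input(shape=(maxlen, embedding_dim))
x = embed_input
for layer in model.layers[1:]:
    x = layer(x)
core_model = tf.keras.Model(embed_input, x)

def integrated_gradients(token_ids, steps=50):
    # token_ids: list of ints of length maxlen (one padded sentence)
    emb = embedding_layer(tf.constant([token_ids]))    # (1, maxlen, dim)
    baseline = tf.zeros_like(emb)                      # all-zero baseline
    # Interpolate between the baseline and the actual embeddings.
    alphas = tf.reshape(tf.linspace(0.0, 1.0, steps), (steps, 1, 1))
    interp = baseline + alphas * (emb - baseline)      # (steps, maxlen, dim)
    with tf.GradientTape() as tape:
        tape.watch(interp)
        preds = core_model(interp)                     # (steps, 1)
    grads = tape.gradient(preds, interp)
    avg_grads = tf.reduce_mean(grads, axis=0)          # (maxlen, dim)
    ig = (emb[0] - baseline[0]) * avg_grads            # attribution per dim
    return tf.reduce_sum(ig, axis=-1).numpy()          # one score per token

Tokens with the largest positive scores pushed the prediction towards the positive class; large negative scores pushed it away.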
I'm trying to do transfer learning on MobileNetV3-Small using Tensorflow 2.5.0 to predict dog breeds (133 classes) and since it got reasonable accuracy on the ImageNet dataset (1000 classes) I thought it should have no problem adapting to my problem.
I've tried a multitude of training variations and recently had a breakthrough, but now my training stagnates at about 60% validation accuracy, with minor fluctuations in validation loss (accuracy and loss curves for training and validation were attached).
I tried using ReduceLROnPlateau (shown in the third graph), but it didn't help to improve matters. Can anyone suggest how I could improve the training?
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.layers import GlobalMaxPooling2D, Dense, Dropout, BatchNormalization
from tensorflow.keras.applications import MobileNetV3Large, MobileNetV3Small
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True # needed for working with this dataset
# define generators
train_datagen = ImageDataGenerator(vertical_flip=True, horizontal_flip=True,
                                   rescale=1.0/255, brightness_range=[0.5, 1.5],
                                   zoom_range=[0.5, 1.5], rotation_range=90)
test_datagen = ImageDataGenerator(rescale=1.0/255)

train_gen = train_datagen.flow_from_directory(train_dir, target_size=(224,224),
                                              batch_size=32, class_mode="categorical")
val_gen = test_datagen.flow_from_directory(val_dir, target_size=(224,224),
                                           batch_size=32, class_mode="categorical")
test_gen = test_datagen.flow_from_directory(test_dir, target_size=(224,224),
                                            batch_size=32, class_mode="categorical")

pretrained_model = MobileNetV3Small(input_shape=(224,224,3), classes=133,
                                    weights="imagenet", pooling=None, include_top=False)

# set all layers trainable because when I froze most of the layers the model didn't learn so well
for layer in pretrained_model.layers:
    layer.trainable = True

last_output = pretrained_model.layers[-1].output
x = GlobalMaxPooling2D()(last_output)
x = BatchNormalization()(x)
x = Dense(512, activation='relu')(x)
x = Dense(133, activation='softmax')(x)
model = Model(pretrained_model.input, x)
model.compile(optimizer=Adam(learning_rate=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])

# val_acc with min_delta 0.003; val_loss with min_delta 0.01
plateau = ReduceLROnPlateau(monitor="val_loss", mode="min", patience=5,
                            min_lr=1e-8, factor=0.3, min_delta=0.01,
                            verbose=1)
checkpointer = ModelCheckpoint(filepath=savepath, verbose=1, save_best_only=True,
                               monitor="val_accuracy", mode="max",
                               save_weights_only=True)
Your code looks good, but it seems to have one issue: you might be rescaling the inputs twice. According to the docs for MobileNetV3:
The preprocessing logic has been included in the mobilenet_v3 model implementation. Users are no longer required (...) to normalize the input data.
Now, in your code, there is:
test_datagen = ImageDataGenerator(rescale=1.0/255)
which essentially makes the first model layers rescale already-rescaled values.
The same applies for train_datagen.
You could try removing the rescale argument from both train and test loaders, or setting rescale=None.
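For example, a sketch of the generator setup without the manual rescaling (everything else as in the question):

train_datagen = ImageDataGenerator(vertical_flip=True, horizontal_flip=True,
                                   brightness_range=[0.5, 1.5],
                                   zoom_range=[0.5, 1.5], rotation_range=90)
test_datagen = ImageDataGenerator()  # no rescale: MobileNetV3 preprocesses internally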
This could also explain why the model did not learn well with the backbone frozen.
I have 4 classes and am building a Keras model for an image classification problem. I have tried a couple of adjustments, but accuracy is not going beyond 75% and the loss is stuck around 0.64.
I have 90,400 images as a training set and 20,000 images for testing.
Here is my model.
model = Sequential()
model.add(Conv2D(32, kernel_size = (3, 3),input_shape=(100,100,3),activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64, activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(4, activation = 'softmax'))
model.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
batch_size = 64
train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory('/dir/training_set', target_size=(100,100), batch_size=batch_size, class_mode='binary')
test_set = test_datagen.flow_from_directory('/dir/test_set', target_size=(100,100), batch_size=batch_size, class_mode='binary')
# 90,400 images are under the training_set directory and 20,000 under the test directory.
model.fit(training_set, steps_per_epoch=90400//batch_size, epochs=1, validation_data=test_set, validation_steps=20000//batch_size)
I tried adjusting layers and dropouts but no luck. Any ideas?
If I encounter something like this, I would do the following:

1. Split my data into training, validation, and test sets. Improve the model using validation, and use test to see the final result.
2. Remove the Dropout layers, since I don't have proof that the model is overfitting.
3. If the model is underfitting (your case):
   a. Try a different/bigger architecture and search for better hyperparameters.
   b. Train longer and try different optimization algorithms.
4. If the model is overfitting:
   a. Try to get more data.
   b. Regularization (L2, dropout, etc.).
   c. Data augmentation.
   d. Search for better hyperparameters.
Note: You can always consider transfer learning. Basically, transfer learning is using the information gained from one successful model for your own model.
Consider:
- Adding multiple convolutional layers (with max pooling in between): this enables the model to learn both "low level" and "higher level" features.
- Adding more epochs, to let the model learn more from the input pictures. Neural networks only learn "a little bit" along the gradient each time; it often takes many epochs to get a sufficiently trained model.
- Maybe starting with fewer pictures but more epochs (and a second conv/max-pool pair, as sketched below) to keep calculation time under control!
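A minimal sketch of that suggestion applied to the question's model (the imports and the 64-filter choice are assumptions; everything else follows the original code):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), input_shape=(100, 100, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))  # second, "higher level" block
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])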
You could try using one of the existing models in Keras and train it from scratch.
I have used MobileNetV2 in the past and have gotten very good results.
When you initialize the model you can load pre-trained weights or None, and start training from scratch with your images.
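A minimal sketch of that idea, assuming the 100x100 RGB inputs and 4 classes from the question (pass weights='imagenet' instead of None to start from pre-trained weights):

from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras import layers, models

base = MobileNetV2(input_shape=(100, 100, 3), include_top=False, weights=None)
x = layers.GlobalAveragePooling2D()(base.output)
outputs = layers.Dense(4, activation='softmax')(x)
model = models.Model(base.input, outputs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])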
I was able to achieve good accuracy with transfer learning using the pre-trained MobileNet model.
I'm attaching my code and confusion matrix here so it may be helpful to someone.
import pandas as pd
import numpy as np
import os
import keras
import matplotlib.pyplot as plt
from keras.layers import Dense,GlobalAveragePooling2D
from keras.applications import MobileNet
from keras.preprocessing import image
from keras.applications.mobilenet import preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras.optimizers import Adam
base_model=MobileNet(weights='imagenet',include_top=False) #imports the mobilenet model and discards the last 1000 neuron layer.
x=base_model.output
x=GlobalAveragePooling2D()(x)
x=Dense(1024,activation='relu')(x) #we add dense layers so that the model can learn more complex functions and classify for better results.
x=Dense(1024,activation='relu')(x) #dense layer 2
x=Dense(512,activation='relu')(x) #dense layer 3
preds=Dense(3,activation='softmax')(x) #final layer with softmax activation
model=Model(inputs=base_model.input,outputs=preds)
for layer in model.layers[:20]:
    layer.trainable = False
for layer in model.layers[20:]:
    layer.trainable = True

train_data_path = '../train_dataset_path'
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input, validation_split=0.2)  # included in our dependencies

train_generator = train_datagen.flow_from_directory(train_data_path,
                                                    target_size=(224,224),
                                                    color_mode='rgb',
                                                    batch_size=32,
                                                    class_mode='categorical',
                                                    shuffle=True,
                                                    subset='training')
test_generator = train_datagen.flow_from_directory(train_data_path,
                                                   target_size=(224,224),
                                                   color_mode='rgb',
                                                   batch_size=32,
                                                   class_mode='categorical',
                                                   shuffle=False,
                                                   subset='validation')

model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
step_size_train = train_generator.n // train_generator.batch_size
step_size_test = test_generator.n // test_generator.batch_size
model_history = model.fit(train_generator,
                          steps_per_epoch=step_size_train,
                          epochs=5,
                          validation_data=test_generator,
                          validation_steps=step_size_test)
model.save('tl_interior_model_2')
#Load the model
model = keras.models.load_model('tl_interior_model_2')
Today I ran into some very strange behavior of Keras. When I try a classification run on the iris dataset with a simple model, Keras version 1.2.2 gives me about 95% accuracy, whereas Keras 2.0+ predicts the same class for every training example (leading to an accuracy of about 35%, as there are three types of iris). The only thing that makes my model predict with about 95% accuracy is downgrading Keras to a version below 2.0.
I think it is a problem with Keras, as I have tried the following things, none of which make a difference:
Switching activation function in the last layer (from Sigmoid to softmax).
Switching backend (Theano and Tensorflow both give roughly same performance).
Using a random seed.
Varying the number of neurons in the hidden layer (I only have 1 hidden layer in this simple model).
Switching loss-functions.
As the model is very simple and runs on its own (you just need the easy-to-obtain iris.csv dataset), I decided to include the entire code:
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
#Load data
data_frame = pd.read_csv("iris.csv", header=None)
data_set = data_frame.values
X = data_set[:, 0:4].astype(float)
Y = data_set[:, 4]
#Encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
# convert integers to dummy variables (i.e. one hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)
def baseline_model():
    # Create & compile model
    model = Sequential()
    model.add(Dense(8, input_dim=4, init='normal', activation='relu'))
    model.add(Dense(3, init='normal', activation='sigmoid'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
#Create Wrapper For Neural Network Model For Use in scikit-learn
estimator = KerasClassifier(build_fn=baseline_model, nb_epoch=200, batch_size=5, verbose=0)
#Create kfolds-cross validation
kfold = KFold(n_splits=10, shuffle=True)
#Evaluate our model (Estimator) on dataset (X and dummy_y) using a 10-fold cross-validation procedure (kfold).
results = cross_val_score(estimator, X, dummy_y, cv=kfold)
print("Accuracy: {:2f}% ({:2f}%)".format(results.mean()*100, results.std()*100))
If anyone wants to replicate the error, here are the dependencies I used to observe the problem:
numpy=1.16.4
pandas=0.25.0
sk-learn=0.21.2
theano=1.0.4
tensorflow=1.14.0
In Keras 2.0 many parameters changed names; there is a compatibility layer to keep things working, but somehow it does not apply when using KerasClassifier.
In this part of the code:
estimator = KerasClassifier(build_fn=baseline_model, nb_epoch=200, batch_size=5, verbose=0)
You are using the old name nb_epoch instead of the modern name epochs. The default value is epochs=1, meaning that your model was only being trained for one epoch, producing very low-quality predictions.
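The corrected call:

estimator = KerasClassifier(build_fn=baseline_model, epochs=200, batch_size=5, verbose=0)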
Also note that here:
model.add(Dense(3, init='normal', activation='sigmoid'))
You should be using a softmax activation instead of sigmoid, as you are using the categorical cross-entropy loss:
model.add(Dense(3, init='normal', activation='softmax'))
I've managed to isolate the issue: if you change nb_epoch to epochs (all else being exactly equal), the model predicts very well again, in Keras 2 as well. I don't know if this is intended behavior or a bug.
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(10,activation=tf.nn.softmax))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3)
When I tried to save the model
model.save('epic_num_reader.model')
I get a NotImplementedError:
NotImplementedError Traceback (most recent call last)
<ipython-input-4-99efa4bdc06e> in <module>()
1
----> 2 model.save('epic_num_reader.model')
NotImplementedError: Currently `save` requires model to be a graph network. Consider using `save_weights`, in order to save the weights of the model.
So how can I save the model defined in the code?
You forgot the input_shape argument in the definition of the first layer, which leaves the model undefined, and saving undefined models has not been implemented yet, which triggers the error.
model.add(tf.keras.layers.Flatten(input_shape = (my, input, shape)))
Just add the input_shape to the first layer and it should work fine.
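For the MNIST model in the question, that is:

model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))  # MNIST images are 28x28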
For those who still have not solved the problem even after doing as Matias suggested, you can consider using tf.keras.models.save_model() and load_model(). In my case, it worked.
tf.keras.models.save_model
Works here (tensorflow 1.12.0), even when the input_shape is unspecified.
Reason for the error:
I was getting the same error and tried the above answers, but still got errors. So I am sharing the solution that worked for me below:
Check whether you passed input_shape when defining the input layer of the model; if not, you will get an error at the time of saving and loading the model.
How to define input_shape?
Let's consider an example. If you use the MNIST dataset:
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
It consists of images of handwritten digits 0-9, each with a resolution of 28 x 28.
For this, we can define the input shape as (28, 28), without mentioning the batch size, as follows:
model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
In this way, you can determine the input shape by looking at your input training dataset.
Save your trained model:
Now, after training and testing the model, we can save it. The following code worked for me, and the accuracy did not change after reloading the model:
by using save_model()
import tensorflow as tf
tf.keras.models.save_model(
    model,
    "your_trained_model.model",
    overwrite=True,
    include_optimizer=True
)
by using .save()
your_trained_model.save('your_trained_model.model')
del model # deletes the existing model
Now load the model which we saved:
model2 = tf.keras.models.load_model("your_trained_model.model")
For more details refer to this link: Keras input explanation: input_shape, units, batch_size, dim, etc
import tensorflow as tf
import matplotlib.pyplot as plt
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
plt.imshow(x_train[0], cmap=plt.cm.binary)
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
plt.imshow(x_train[0], cmap=plt.cm.binary)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=x_train[0].shape))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3)
val_loss, val_acc = model.evaluate(x_test, y_test)
print(val_loss)
print(val_acc)
model.save('epic_num_reader.model')