I'm using Keras to build a deep autoencoder. I used its checkpointer to save the weights and then tried to load them, but the result is always None, which I think means that the checkpoint doesn't work correctly and is not saving the weights.
Here is my code:
checkpointer = ModelCheckpoint(filepath="weights.best.h5",
                               verbose=0,
                               save_best_only=True)
tensorboard = TensorBoard(log_dir='/tmp/autoencoder',
                          histogram_freq=0,
                          write_graph=True,
                          write_images=True)
input_enc = Input(shape=(input_size,))
hidden_1 = Dense(hidden_size1, activation='relu')(input_enc)
hidden_11 = Dense(hidden_size2, activation='relu')(hidden_1)
code = Dense(code_size, activation='relu')(hidden_11)
hidden_22 = Dense(hidden_size2, activation='relu')(code)
hidden_2 = Dense(hidden_size1, activation='relu')(hidden_22)
output_enc = Dense(input_size, activation='tanh')(hidden_2)
autoencoder_yes = Model(input_enc, output_enc)
autoencoder_yes.compile(optimizer='adam',
                        loss='mean_squared_error',
                        metrics=['accuracy'])
history_yes = autoencoder_yes.fit(df_noyau_norm_y, df_noyau_norm_y,
                                  epochs=200,
                                  batch_size=batch_size,
                                  shuffle=True,
                                  validation_data=(df_test_norm_y, df_test_norm_y),
                                  verbose=1,
                                  callbacks=[checkpointer, tensorboard]).history
autoencoder_yes.save_weights("weights.best.h5")
print(autoencoder_yes.load_weights("weights.best.h5"))
Can somebody help me find out a way to resolve the problem?
Thanks
No, your interpretation of load_weights returning None is not correct. load_weights is a procedure: it restores the weights into the model in place and does not return anything. If you print or assign the return value of a procedure, you get None.
So weight saving is probably working fine; it's just your interpretation that is wrong.
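The same behavior can be reproduced without Keras. The sketch below uses a hypothetical stand-in class, TinyModel (not part of Keras), whose load_weights restores state in place and has no return statement. Printing the call therefore prints None even though the load succeeded; inspecting the object afterwards is what confirms the weights came back.

```python
import json
import os
import tempfile


class TinyModel:
    """Hypothetical stand-in for a Keras model; NOT the Keras API."""

    def __init__(self, weights):
        self.weights = list(weights)

    def save_weights(self, path):
        # Write the current weights to disk.
        with open(path, "w") as f:
            json.dump(self.weights, f)

    def load_weights(self, path):
        # Restore weights in place; there is no return statement,
        # so the call evaluates to None.
        with open(path) as f:
            self.weights = json.load(f)


path = os.path.join(tempfile.mkdtemp(), "weights.best.json")

trained = TinyModel([0.1, 0.2, 0.3])
trained.save_weights(path)

fresh = TinyModel([0.0, 0.0, 0.0])
result = fresh.load_weights(path)

print(result)         # None: the method returns nothing
print(fresh.weights)  # the weights were restored from the file anyway
```

With a real model, a comparable check would be to compare predictions or weight values before and after loading rather than printing the return value.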
You should use save_weights_only=True. Without it, the whole model is saved, not just the weights. To save only the weights, so that the file can be read back with load_weights, configure the checkpoint like this:
checkpointer = ModelCheckpoint(filepath="weights.best.h5",
verbose=0, save_weights_only=True,
save_best_only=True)
This is expected behavior, not an error. autoencoder_yes.load_weights("weights.best.h5") doesn't return anything, so if you print the result of this call you will get None as output.
Expected behavior
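In the code you provided, you trained the model and then saved the weights, so autoencoder_yes is a keras.Model object that already holds the trained weights. (Note that the final save_weights call writes to the same file as the checkpoint callback, so it replaces the best checkpoint with the weights from the last epoch.)
If you load the saved weights again in the same script, nothing visible is supposed to happen: the weights you just saved are simply loaded again.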
For clarity
Start a fresh script, build the same model architecture, reload the weights from the h5 file, and then run some predictions. In that case it will silently load the pre-trained weights and the predictions will be based on them.
Related
I was wondering how to load the latest checkpoint in TensorFlow, given its path/directory, and continue training where I left off. I would also like to know how to load the latest checkpoint and save it as a complete model. Please help me.
My code:
cp_callback = tf.keras.callbacks.ModelCheckpoint(
filepath=checkpoint_path,
verbose=1,
save_weights_only=True,
save_freq=1*batch_size)
# Create a basic model instance
model = create_model(training, output)
model.save_weights(checkpoint_path.format(epoch=0))
# Create a TensorBoard callback (for metrics monitoring)
tb_callback = tf.keras.callbacks.TensorBoard(log_dir="chatbot/training/logs", histogram_freq=1, update_freq= 1, profile_batch= 1)
# Train the model with the new callback
model.fit(training, output, epochs=500, batch_size = batch_size, validation_data=(training, output), callbacks=[cp_callback, tb_callback], verbose = 1)
The simplest solution to your problem would be to save the entire model with the ModelCheckpoint callback.
You only have to remove the save_weights_only argument for it to work.
cp_callback = tf.keras.callbacks.ModelCheckpoint(
filepath=checkpoint_path,
verbose=1,
save_freq=1*batch_size)
To load the checkpoint and continue training at a later point in time, just call
model = tf.keras.models.load_model(checkpoint_path)
If you only saved the model weights and want to load such a checkpoint, you first have to build your model and then load the saved weights into it:
model.load_weights(checkpoint_path)
If you need further information about loading and saving models, I would recommend reading the documentation: https://www.tensorflow.org/guide/keras/save_and_serialize
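When checkpoints are written to a directory under epoch-numbered names, picking up the newest one is mostly a file-selection problem. TensorFlow provides tf.train.latest_checkpoint(checkpoint_dir) for checkpoints saved in its own checkpoint format; the sketch below is only a standard-library illustration of the idea. The naming scheme cp-0001.h5, cp-0002.h5, ... and the helper name newest_checkpoint are assumptions made for the example, not something taken from the question.

```python
import os
import re
import tempfile


def newest_checkpoint(checkpoint_dir, pattern=r"cp-(\d+)\.h5$"):
    """Return the path with the highest epoch number, or None if none match.

    Hypothetical helper: the 'cp-<epoch>.h5' naming scheme is an assumption.
    """
    best_epoch, best_path = -1, None
    for name in os.listdir(checkpoint_dir):
        match = re.match(pattern, name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch = int(match.group(1))
            best_path = os.path.join(checkpoint_dir, name)
    return best_path


# Demo with empty placeholder files instead of real checkpoints.
demo_dir = tempfile.mkdtemp()
for epoch in (1, 2, 10):
    open(os.path.join(demo_dir, f"cp-{epoch:04d}.h5"), "w").close()

latest = newest_checkpoint(demo_dir)
print(os.path.basename(latest))
```

The returned path could then be passed to model.load_weights or tf.keras.models.load_model, depending on whether only the weights or the whole model was saved.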
This answer draws on the answer to:
Save and load weights in keras
I have recently been using a RobertaLarge model, which I fine-tune on a downstream task using the "Trainer" class.
All goes well: I see the loss going down, and I manually compared some results against the validation dataset.
The problem arises when I try to save the model and reload it afterwards.
I keep seeing this warning when trying to reload the model:
Some weights of the model checkpoint at Roberta_trained_1epoch were not used when initializing RobertaPreTrainedModel: ['module.roberta.encoder.layer.10.output.dense.bias', [........................................340_LAYERS_..................................]
'module.roberta.encoder.layer.6.attention.self.key.bias', 'module.roberta.encoder.layer.22.output.dense.weight', 'module.roberta.encoder.layer.3.attention.self.key.bias', 'module.roberta.encoder.layer.15.attention.self.value.bias', 'module.roberta.encoder.layer.15.attention.self.query.bias', 'module.roberta.encoder.layer.2.attention.self.value.bias']
I looked extensively for an explanation of this problem and so far couldn't find a solution. Some claim this is just a warning and nothing is wrong; however, being suspicious, I did some manual checks, and indeed the model seems... untrained.
I'm using Trainer.save_model('save_here') after training, and RobertaForTokenClassification.from_pretrained('save_here', local_files_only=True) to reload the model.
However, the results show me that the model is clearly not loading correctly.
training code:
trainer = Trainer(
model=model,
args=training_args,
compute_metrics=compute_metrics,
train_dataset=ds_train,
eval_dataset=ds_valid,
callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
trainer.evaluate()
trainer.save_model('save_here')
This results in an evaluation loss of 0.002.
Reloading and re-evaluation:
model = RobertaForTokenClassification.from_pretrained('save_here', local_files_only=True)
tokenizer = AutoTokenizer.from_pretrained('tokenizers_saved')
dl_valid = DataLoader(ds_valid, batch_size=Config.batch_size, shuffle=True)
with torch.no_grad():
    for index, data in enumerate(dl_valid):
        batch_input_ids = data['input_ids'].to(device, dtype=torch.long)
        batch_att_mask = data['attention_mask'].to(device, dtype=torch.long)
        batch_target = data['label_ids'].to(device, dtype=torch.long)
        output = model(batch_input_ids, token_type_ids=None, attention_mask=batch_att_mask, labels=batch_target)
        step_loss, eval_prediction = output['loss'], output['logits']
        eval_prediction = np.argmax(eval_prediction.detach().to('cpu').numpy(), axis=2)
        predictions.append(eval_prediction)
        reals.append(batch_target)
        eval_loss += step_loss
print(eval_loss)
This results in a loss between 0.9 and 1.2 (it varies randomly after each reload).
I found out what was wrong and will share it, since others may have the same issue.
My problem was that I had wrapped my model in DataParallel: model = nn.DataParallel(model)
The wrapper prefixes every parameter name with "module." (as the warning above shows), so it seems Trainer saves the weights under names that from_pretrained cannot match, and the model is not restored the usual way.
As a workaround:
model = trainer.model
model.module.save_pretrained('save_here')
....
# afterwards in another machine
....
model = RobertaForTokenClassification.from_pretrained('save_here')
I still think that this should be handled differently.
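The "module." prefix on every key in the warning above is what nn.DataParallel adds to parameter names. Besides saving model.module, another commonly used approach is to strip that prefix from the saved keys before loading them. The sketch below shows only the renaming step on a plain dictionary; the helper name strip_module_prefix and the placeholder values are hypothetical, and a real state dict would hold tensors.

```python
def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from keys.

    Hypothetical helper: only keys that start with the prefix are renamed.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Placeholder values stand in for real weight tensors.
wrapped = {
    "module.roberta.encoder.layer.10.output.dense.bias": "tensor-a",
    "module.roberta.encoder.layer.6.attention.self.key.bias": "tensor-b",
}

unwrapped = strip_module_prefix(wrapped)
for key in unwrapped:
    print(key)
```

Whether the renamed keys then load cleanly depends on how the checkpoint was written, so this is a sketch of the idea rather than a guaranteed fix.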
I am currently attempting to use Keras Tuner to create a model for my CNN, though I am having some issues with saving my model for future use.
Normally I would just save my model with model.save(filename) to get a .model file; however, when attempting this with code such as:
tuner = RandomSearch(
build_model,
objective = "val_accuracy",
max_trials = 5,
executions_per_trial = 1,
directory = LOG_DIR
)
tuner.search(x= x_train, y= y_train, epochs= 1, batch_size=64, validation_data= (x_test, y_test))
bestModels = tuner.get_best_models(num_models=1)
highestScoreModel = bestModels[0]
highestScoreModel.fit(x=x_train, y=y_train, batch_size=64, epochs=5, verbose=1, validation_split=0.2)
highestScoreModel.save("Trained_Model")
I received a Trained_Model folder with no model within it, only the parameters. If anyone could assist me with saving the actual trained model I'd be very thankful.
================= Edit / Update ================
I have now found a way to obtain the trial_id by sifting through the generated trial files. However, when I run:
trial_id = getID()
tuner.save_model(trial_id=trial_id, model=highestScoreModel, step=0)
nothing seems to happen and no save file appears. Again, I would be grateful for any help on this matter.
Try this
Tuner.save_model(trial_id, model, step=0)
where
trial_id is the ID of your model's trial,
model is your trained model,
and step is the epoch number.
I just trained a CNN to recognise sunspots with tensorflow. My model is pretty much the same as this.
The problem is that I cannot find a clear explanation anywhere of how to make predictions with the checkpoint generated by the training phase.
I tried using the standard restore method:
saver = tf.train.import_meta_graph('./model/model.ckpt.meta')
saver.restore(sess,'./model/model.ckpt')
but then I cannot figure out how to run it.
I also tried using tf.estimator.Estimator.predict() like this:
# Create the Estimator (should reload the last checkpoint but it doesn't)
sunspot_classifier = tf.estimator.Estimator(
model_fn=cnn_model_fn, model_dir="./model")
# Set up logging for predictions
# Log the values in the "Softmax" tensor with label "probabilities"
tensors_to_log = {"probabilities": "softmax_tensor"}
logging_hook = tf.train.LoggingTensorHook(
tensors=tensors_to_log, every_n_iter=50)
# predict with the model and print results
pred_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": pred_data},
shuffle=False)
pred_results = sunspot_classifier.predict(input_fn=pred_input_fn)
print(pred_results)
but all it does is print <generator object Estimator.predict at 0x10dda6bf8>.
If I use the same code with tf.estimator.Estimator.evaluate() instead, it works like a charm (it reloads the model, performs the evaluation and sends it to TensorBoard).
I know there are many similar questions but I couldn't really find the way that worked for me.
sunspot_classifier.predict(input_fn=pred_input_fn) returns a generator, so pred_results is a generator object. To get values out of it you need to iterate over it, for example with next(pred_results), which returns one prediction per call.
The solution is:
print(next(pred_results))
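To see why print shows a generator object, here is a small stand-in written in plain Python. fake_predict is hypothetical and only mimics the lazy behavior; it is not the Estimator API. Results are produced one at a time, on demand, so they have to be consumed with next(), a for loop, or list().

```python
def fake_predict(inputs):
    """Hypothetical stand-in for a predict method that yields lazily."""
    for x in inputs:
        # Each prediction is computed only when the caller asks for it.
        yield {"input": x, "class": int(x > 0.5)}


pred_results = fake_predict([0.2, 0.9, 0.7])
print(pred_results)        # prints a generator object, not the predictions

first = next(pred_results)         # consume one prediction
remaining = list(pred_results)     # consume whatever is left

print(first)
print(remaining)
```

Note that a generator can only be consumed once: after list(pred_results), calling next(pred_results) again raises StopIteration.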
I read this very helpful Keras tutorial on transfer learning here:
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
I think this is probably very applicable to the fish data here, and I started going down that route. I tried to follow the tutorial as closely as I could. The code is a mess, as I was just trying to figure out how everything works, but it can be found here:
https://github.com/MrChristophRivera/ClassifiyingFish/blob/master/notebooks/Anthony/Resnet50%2BTransfer%20Learning%20Attempt.ipynb
For brevity, here are the steps I took:
model = ResNet50(include_top=False, weights="imagenet")
# I would resize the image to that of the standard input size of ResNet50.
datagen = ImageDataGenerator(rescale=1./255)
generator = datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=32,
class_mode=None,
shuffle=False)
# predict on the training data
bottleneck_features_train = model.predict_generator(generator,
nb_train_samples)
print(bottleneck_features_train)
file_name = join(save_directory, 'tbottleneck_features_train.npy')
np.save(open(file_name, 'wb'), bottleneck_features_train)
# Then I would use this output to feed my top layer and train it.
# Let's say I defined it like so:
top_model = Sequential()
# Skipping some layers for brevity
top_model.add(Dense(8, activation='relu'))
top_model.fit(train_data, train_labels)
top_model.save_weights(top_model_weights_path)
At this time, I have the weights saved. The next step would be to add the top layer to ResNet50. The tutorial simply did it like so:
# VGG16 model defined via Sequential is called bottom_model.
bottom_model.add(top_model)
The problem is when I try to do that this fails because "model does not have property add". My guess is that ResNet50 was defined in a different way. At any rate, my question is: How can I add this top model with the loaded weights to the bottom model? Can anyone give helpful pointers?
Try:
input_to_model = Input(shape=shape_of_your_image)
base_model = model(input_to_model)
top_model = Flatten()(base_model)
top_model = Dense(8, activation='relu')(top_model)
...
Your problem comes from the fact that ResNet50 is defined with the so-called functional API, whose Model class has no add method (only Sequential models do). I would also advise you to use a different activation function, because having relu as the output activation might cause problems. Moreover, your model is not compiled.