I wanted to ask whether there is an easy way to visualize a Keras model built with the Functional API.
Right now, the best way I have found to debug a Sequential model at a high level is:
model = Sequential()
model.add(...
...
model.summary()  # prints the table itself; wrapping it in print() also prints None
SVG(model_to_dot(model).create(prog='dot', format='svg'))
However, I am having a hard time finding a good way to visualize a more complex, non-sequential model built with the Functional API.
Yes, there is: keras.utils has a plot_model() function, as explained in detail here. It seems you are already familiar with keras.utils.vis_utils and the model_to_dot method, but this is another option. Its usage is something like:
from keras.utils import plot_model
plot_model(model, to_file='model.png')
To be honest, that is the best I have managed to find using Keras only. Using model.summary() as you did is also useful sometimes. I also wish there were a tool for better visualization of one's models, perhaps even one that shows the weights per layer to help decide on optimal network structures and initializations (if you know of one, please tell :] ).
Probably the best option you currently have is to visualize things in TensorBoard, which you can include in Keras with the TensorBoard callback. This lets you visualize your training and the metrics of interest, as well as some information on the activations of your layers, your biases and kernels, etc. Basically, you have to add this code to your program before fitting your model:
from keras.callbacks import TensorBoard
# indicate the folder to save logs to, plus other options
tensorboard = TensorBoard(log_dir='./logs/run1', histogram_freq=1,
                          write_graph=True, write_images=False)
# put it in your callback list, where you can include other callbacks
callbacks_list = [tensorboard]
# then pass the list to fit; remember to supply validation_data as well
regressor.fit(X, Y, callbacks=callbacks_list, epochs=64,
              validation_data=(X_test, Y_test), shuffle=True)
You can then run TensorBoard (which runs locally as a web service) with the following command in your terminal:
tensorboard --logdir=./logs/run1
This will tell you which port to open in your browser to view your training. If you have several runs, you can pass --logdir=./logs instead to view them together for comparison. There are of course more options for using TensorBoard, so I suggest you check the included links if you are considering using it.
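For the multi-run comparison, each run needs its own subfolder under a common log root. A minimal sketch of one way to generate those folders, using only the standard library; make_run_dir and the run names are hypothetical, and a temporary directory stands in for ./logs so the snippet can be run anywhere:

```python
import os
import tempfile
from datetime import datetime


def make_run_dir(log_root, run_name):
    """Create and return a per-run subdirectory under log_root."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    run_dir = os.path.join(log_root, f"{run_name}-{stamp}")
    os.makedirs(run_dir, exist_ok=True)
    return run_dir


log_root = tempfile.mkdtemp()  # stand-in for ./logs
run_a = make_run_dir(log_root, "lr-0.01")
run_b = make_run_dir(log_root, "lr-0.001")
print(sorted(os.listdir(log_root)))
```

Each returned path would be passed as log_dir to its own TensorBoard callback, and tensorboard --logdir would point at the root.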
After a bit of googling and trial/error... Turns out you have to just convert the entire functional api model back into a "model format".
model = some_model()
output_layer = _build_output()
finalmodel = Model(inputs=model.input, outputs=output_layer)
Then you can run finalmodel.summary(), or any of the plotting functions used for sequential models.
However, I guess this requires careful tracking of the model, which I admittedly did not do.
tf.keras.utils.plot_model(
    model,
    to_file="model.png",
    show_shapes=False,
    show_layer_names=True,
    rankdir="TB",
    expand_nested=False,
    dpi=96,
)
Related
I've been training an EfficientNetV2 network using this repository.
Training goes well and I reach around 93-95% validation accuracy. I then run inference over a test set of new images and get an acceptable accuracy of around 88% (for example).
Once I have checked that the model works fine in PyTorch, I need to convert it to ONNX and then to a TensorRT engine. I have a script that runs inference with the ONNX model to check whether the conversion process introduces any problems.
I'm using this code to convert the model:
import torch
from timm.models import create_model
import os
# create model
base_model = create_model(
    model_arch,
    num_classes=num_classes,
    in_chans=3,
    checkpoint_path=model_path)
model = torch.nn.Sequential(
    base_model,
    torch.nn.Softmax(dim=1)
)
model.cpu()
model.eval()
dummy_input = torch.randn(1, 3, 224, 224, requires_grad=True)
torch.onnx.export(model,
                  dummy_input,
                  model_export,
                  verbose=False,
                  export_params=True,
                  do_constant_folding=True
                  )
I've tried several tutorials like this one, but unfortunately I get the same result.
I've tried different opset versions, with and without do_constant_folding. I've even trained another model with a parameter called 'exportable', a bool that tells the training script whether the model should be exportable (an experimental feature according to the repository's documentation).
Do you have any idea about this issue?
Thanks in advance.
It's hard to guess which bug you are hitting; here are a few typical ones:
Some layers are not converted properly even after eval().
You may need to call timm.create_model(..., scriptable=True, exportable=True).
Different preprocessing, e.g. the timm model's input is normalized to specific values, but the input after conversion is not.
Do the two models output nearly the same values for model(dummy_input)?
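The preprocessing and output-comparison checks can be sketched with NumPy alone. Everything here is an assumption for illustration: the MEAN/STD constants are the common ImageNet values (the checkpoint in question may use different ones, so check the model's own config), and preprocess and max_abs_diff are hypothetical helper names:

```python
import numpy as np

# Assumed normalization constants (common ImageNet values); verify against
# the configuration of the model you actually trained.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image_hwc_uint8):
    """Convert an HxWx3 uint8 image to a normalized 1x3xHxW float32 batch."""
    x = image_hwc_uint8.astype(np.float32) / 255.0
    x = (x - MEAN) / STD            # broadcasts over the last (channel) axis
    x = np.transpose(x, (2, 0, 1))  # HWC -> CHW
    return x[np.newaxis, ...]       # add the batch dimension


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two outputs."""
    return float(np.max(np.abs(np.asarray(a) - np.asarray(b))))


# Random stand-in for a real image.
image = np.random.default_rng(0).integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
batch = preprocess(image)
print(batch.shape, batch.dtype)
```

Feeding the same batch to both the PyTorch model and the ONNX session and then calling max_abs_diff on the two outputs would show whether the conversion or the preprocessing is the part that differs.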
After a lot of research, it seems like there is no good way to properly stop and resume training using a Tensorflow 2 / Keras model. This is true whether you are using model.fit() or using a custom training loop.
There seem to be 2 supported ways to save a model while training:
Save just the weights of the model, using model.save_weights() or save_weights_only=True with tf.keras.callbacks.ModelCheckpoint. This seems to be preferred by most of the examples I've seen, however it has a number of major issues:
The optimizer state is not saved, meaning training resumption will not be correct.
Learning rate schedule is reset - this can be catastrophic for some models.
Tensorboard logs go back to step 0 - making logging essentially useless unless complex workarounds are implemented.
Save the entire model, optimizer, etc. using model.save() or save_weights_only=False. The optimizer state is saved (good) but the following issues remain:
Tensorboard logs still go back to step 0
Learning rate schedule is still reset (!!!)
It is impossible to use custom metrics.
This doesn't work at all when using a custom training loop - custom training loops use a non-compiled model, and saving/loading a non-compiled model doesn't seem to be supported.
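To make the optimizer-state point concrete, here is a framework-free sketch. It is not Keras code: sgd_momentum is a hypothetical toy optimizer on a one-dimensional quadratic, used only to show that resuming with the weights alone gives a different trajectory than resuming with weights plus optimizer state.

```python
def sgd_momentum(w, v, steps, lr=0.1, mu=0.9):
    """Run `steps` updates of SGD with momentum on f(w) = w**2 / 2."""
    for _ in range(steps):
        grad = w                # derivative of w**2 / 2
        v = mu * v - lr * grad  # velocity is the optimizer state
        w = w + v
    return w, v


# Reference: 10 uninterrupted steps.
w_ref, _ = sgd_momentum(1.0, 0.0, 10)

# Interrupted after 5 steps, then resumed with weight AND velocity.
w_mid, v_mid = sgd_momentum(1.0, 0.0, 5)
w_full_state, _ = sgd_momentum(w_mid, v_mid, 5)

# Resumed with the weight only: the velocity restarts from zero.
w_weights_only, _ = sgd_momentum(w_mid, 0.0, 5)

print("full state matches reference:", w_ref == w_full_state)
print("weights-only deviation:", abs(w_ref - w_weights_only))
```

Optimizers such as Adam carry more state than a single velocity, so the same effect applies there too.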
The best workaround I've found is to use a custom training loop, manually saving the step. This fixes the tensorboard logging, and the learning rate schedule can be fixed by doing something like keras.backend.set_value(model.optimizer.iterations, step). However, since a full model save is off the table, the optimizer state is not preserved. I can see no way to save the state of the optimizer independently, at least without a lot of work. And messing with the LR schedule as I've done feels messy as well.
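The manual step tracking described above might look roughly like this. It is plain Python with no TensorFlow: the file name, the helper names and the decay constants are all assumptions, and the schedule is a generic exponential decay written as a pure function of the step.

```python
import json
import os
import tempfile


def exponential_decay(step, initial_lr=0.1, decay_steps=1000, decay_rate=0.5):
    """Learning rate as a pure function of the global step."""
    return initial_lr * decay_rate ** (step / decay_steps)


def save_train_state(path, step):
    """Persist the global step next to the saved weights."""
    with open(path, "w") as f:
        json.dump({"step": step}, f)


def load_train_state(path):
    """Return the saved step, or 0 when starting from scratch."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]


state_path = os.path.join(tempfile.mkdtemp(), "train_state.json")
save_train_state(state_path, step=3000)

resumed_step = load_train_state(state_path)
print("lr when resuming at the saved step:", exponential_decay(resumed_step))
print("lr if the step were reset to 0:", exponential_decay(0))
```

In a real loop, resumed_step is the value that would be written back into the optimizer's iteration counter and used as the TensorBoard step.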
Am I missing something? How are people out there saving/resuming using this API?
tf.keras.callbacks.experimental.BackupAndRestore API for resuming training from interruptions has been added for tensorflow>=2.3. It works great in my experience.
Reference:
https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/experimental/BackupAndRestore
You're right, there isn't builtin support for resumability - which is exactly what motivated me to create DeepTrain. It's like Pytorch Lightning (better and worse in different regards) for TensorFlow/Keras.
Why another library? Don't we have enough? You have nothing like this; if there was, I'd not build it. DeepTrain's tailored for the "babysitting approach" to training: train fewer models, but train them thoroughly. Closely monitor each stage to diagnose what's wrong and how to fix.
Inspiration came from my own use; I'd see "validation spikes" throughout a long epoch, and couldn't afford to pause as it'd restart the epoch or otherwise disrupt the train loop. And forget knowing which batch you were fitting, or how many remain.
How's it compare to Pytorch Lightning? Superior resumability and introspection, along with unique train debug utilities - but Lightning fares better in other regards. I have a comprehensive comparison list in the works, will post within a week.
Pytorch support coming? Maybe. If I convince the Lightning dev team to make up for its shortcomings relative to DeepTrain, then not - otherwise probably. In the meantime, you can explore the gallery of Examples.
Minimal example:
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from deeptrain import TrainGenerator, DataGenerator
ipt = Input((16,))
out = Dense(10, 'softmax')(ipt)
model = Model(ipt, out)
model.compile('adam', 'categorical_crossentropy')
dg = DataGenerator(data_path="data/train", labels_path="data/train/labels.npy")
vdg = DataGenerator(data_path="data/val", labels_path="data/val/labels.npy")
tg = TrainGenerator(model, dg, vdg, epochs=3, logs_dir="logs/")
tg.train()
You can KeyboardInterrupt at any time, inspect the model, train state, data generator - and resume.
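tf.keras.callbacks.BackupAndRestore can take care of this.
Just create the callback as follows (on older TensorFlow releases it lives under tf.keras.callbacks.experimental, as here) and pass it to fit:
callback = tf.keras.callbacks.experimental.BackupAndRestore(
    backup_dir="backup_directory")
model.fit(..., callbacks=[callback])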
I want to train a model in a sequential manner. That is, I want to train the model initially with a simple architecture and, once it is trained, add a couple of layers and continue training. Is it possible to do this in Keras? If so, how?
I tried to modify the model architecture. But until I compile, the changes are not effective. Once I compile, all the weights are re-initialized and I lose all the trained information.
All the questions I found on the web and on SO are either about loading a pre-trained model and continuing training, or about modifying the architecture of a pre-trained model and then only testing it. I didn't find anything related to my question. Any pointers are also highly appreciated.
PS: I'm using Keras in tensorflow 2.0 package.
Without knowing the details of your model, the following snippet might help:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
# Train your initial model
def get_initial_model():
    ...
    return model
model = get_initial_model()
model.fit(...)
model.save_weights('initial_model_weights.h5')
# Use Model API to create another model, built on your initial model
initial_model = get_initial_model()
initial_model.load_weights('initial_model_weights.h5')
nn_input = Input(...)
x = initial_model(nn_input)
x = Dense(...)(x) # This is the additional layer, connected to your initial model
nn_output = Dense(...)(x)
# Combine your model
full_model = Model(inputs=nn_input, outputs=nn_output)
# Compile and train as usual
full_model.compile(...)
full_model.fit(...)
Basically, you train your initial model and save its weights. Then you reload them and wrap the initial model together with your additional layers using the Model API. If you are not familiar with the Model API, you can check out the Keras documentation here (afaik the API remains the same for Tensorflow.Keras 2.0).
Note that you need to check if your initial model's final layer's output shape is compatible with the additional layers (e.g. you might want to remove the final Dense layer from your initial model if you are just doing feature extraction).
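A small runnable variant of the idea, assuming TensorFlow 2.x is installed. The layer sizes are arbitrary, no training is actually run, and it reuses the in-memory model instead of the save/load round trip; it only checks that wrapping the initial model and compiling the new one leaves the initial weights untouched:

```python
import numpy as np
import tensorflow as tf

# Stage 1: a small "initial" model (sizes are arbitrary, for illustration only).
inputs = tf.keras.Input(shape=(4,))
hidden = tf.keras.layers.Dense(8, activation="relu")(inputs)
initial_model = tf.keras.Model(inputs, hidden, name="initial_model")

# Training would happen here; snapshot the weights that must survive.
weights_before = [w.copy() for w in initial_model.get_weights()]

# Stage 2: call the initial model like a layer and stack new layers on top.
new_inputs = tf.keras.Input(shape=(4,))
x = initial_model(new_inputs)
x = tf.keras.layers.Dense(8, activation="relu")(x)
outputs = tf.keras.layers.Dense(1)(x)
full_model = tf.keras.Model(new_inputs, outputs, name="full_model")

# Compiling configures the optimizer and loss; it does not reset weights.
full_model.compile(optimizer="adam", loss="mse")

weights_after = initial_model.get_weights()
unchanged = all(np.array_equal(b, a) for b, a in zip(weights_before, weights_after))
print("initial weights preserved:", unchanged)
```

Weights are only re-initialized when the layers themselves are rebuilt, which is why the original layers (or their saved weights) have to be carried into the new model.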
After a training procedure, I wanted to check the accuracy by loading the created model.h5 and running an evaluation procedure. However, I am getting the following warning:
/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py:269:
UserWarning: No training configuration found in save file: the model
was not compiled. Compile it manually. warnings.warn('No training
configuration found in save file:
The warning comes from the dist-packages/keras/engine/saving.py file,
so the problem is in loading the created model, i.e. this line of code:
train_model = load_model('model.h5')
The warning says that the model was not compiled; however, I did compile it:
optimizer = Adam(lr=lr, clipnorm=0.001)
train_model.compile(loss=dummy_loss, optimizer=optimizer)
I can't understand what I am doing wrong . . .
Please help me! SOS :-(
Intro
I'd like to add to olejorgenb's answer - for a specific scenario, where you don't want to train the model, just use it (e.g. in production).
"Compile" means "prepare for training", which includes mainly setting up the optimizer. It could also have been saved before, and then you can continue the "same" training after loading the saved model.
The fix
But what about the scenario where I just want to run the model? Well, use the compile=False argument to load_model like this:
trained_model = load_model('model.h5', compile=False)
You won't be able to .fit() this model without using trained_model.compile(...) first, but most importantly - the warning will go away.
Misc Notes
Btw, in my Keras version, the argument include_optimizer has a default of True. This should also work for training callbacks like Checkpoint. This means that when loading a model saved by Keras, you can usually count on the optimizer being included (except for the situation described in Hull Gasper's answer).
But, when you have a model which was not trained by Keras (e.g. when converting a model trained by Darknet), the model is saved un-compiled. This produces the warning, and you can get rid of it in the way described above.
Do you get this warning when saving the model?
WARNING:tensorflow:TensorFlow optimizers do not make it possible to access
optimizer attributes or optimizer state after instantiation. As a result, we
cannot save the optimizer as part of the model save file.You will have to
compile your model again after loading it. Prefer using a Keras optimizer
instead (see keras.io/optimizers).
Seems tensorflow optimizers can't be preserved by keras :/
As mentioned keras can't save Tensorflow optimizers. Use the keras one:
optimizer = keras.optimizers.Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(optimizer=optimizer,
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(...)
model.save('...')
This way works for me without manual compiling after calling load.
I am trying to visualize the errors of each layer in a CNN with TensorBoard and Keras, to see how they change in every layer over time. How do I get the errors for each layer?
The loss is defined only at the output layer, to measure how well your model fits the data. Keras provides a callback called History() to keep track of relevant variables during the training process.
from keras.callbacks import History
history = History()
# define and compile your model
model.fit(..., callbacks=[history])
print(history.history)
The last command shows you all tracked values during the training process. You can access single variables via the get() method. To get the training loss, you can access it via
history.history.get('loss')
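Since history.history is a plain dict mapping each tracked name to a list with one value per epoch, it can be inspected without Keras. A small sketch; the numbers below are made up purely for illustration, and best_epoch is a hypothetical helper:

```python
# Hypothetical contents of history.history after 4 epochs (values are made up).
history_dict = {
    "loss": [0.92, 0.61, 0.48, 0.45],
    "val_loss": [0.95, 0.70, 0.66, 0.71],
}


def best_epoch(history_dict, key="val_loss"):
    """Return the 0-based epoch index with the lowest value for `key`."""
    values = history_dict.get(key)
    if not values:
        return None  # the metric was not tracked
    return min(range(len(values)), key=values.__getitem__)


print("best epoch by val_loss:", best_epoch(history_dict))
print("best epoch by loss:", best_epoch(history_dict, key="loss"))
```

Using get() rather than indexing means a metric that was never tracked yields None instead of raising a KeyError.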