TensorFlow Custom Training With Phases - python

I need to create a custom training loop with TensorFlow / Keras (because I want to have more than one optimizer and specify which weights each optimizer should act upon).
Although this tutorial and that one too are quite clear on this matter, they miss a very important point: how do I predict in the training phase and how do I predict in the validation phase?
Suppose my model has Dropout layers or BatchNormalization layers. They certainly behave in completely different ways depending on whether they are in training or validation.
How do I adapt these tutorials? This is a dummy example (may contain one or two pieces of pseudocode):
# Iterate over epochs.
for epoch in range(3):
    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        # persistent=True because tape.gradient() is called twice on this tape
        with tf.GradientTape(persistent=True) as tape:
            # model with two outputs
            # IMPORTANT: must be in training phase (use dropouts, calculate batch statistics)
            logits1, logits2 = model(x_batch_train)  # must be "training"
            loss_value1 = loss_fn1(y_batch_train[0], logits1)
            loss_value2 = loss_fn2(y_batch_train[1], logits2)
        grads1 = tape.gradient(loss_value1, model.trainable_weights[selection1])
        grads2 = tape.gradient(loss_value2, model.trainable_weights[selection2])
        optimizer1.apply_gradients(zip(grads1, model.trainable_weights[selection1]))
        optimizer2.apply_gradients(zip(grads2, model.trainable_weights[selection2]))

    # Run a validation loop at the end of each epoch.
    for x_batch_val, y_batch_val in val_dataset:
        # IMPORTANT: must be in validation phase
        # dropouts are off: calculate all neurons and divide value
        # batch norms use previously calculated statistics
        val_logits1, val_logits2 = model(x_batch_val)
        # ... do the evaluations

I think you can just pass a training parameter when you call a tf.keras.Model, and it will be passed down to the layers:
# On training
logits1, logits2 = model(x_batch_train, training=True)
# On evaluation
val_logits1, val_logits2 = model(x_batch_val, training=False)
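Putting this together with the loop from the question, a minimal sketch might look like the following. The names model, train_dataset, val_dataset, loss_fn1/loss_fn2, optimizer1/optimizer2 and selection1/selection2 are the placeholders from the question; here selection1/selection2 are assumed to be lists of indices into model.trainable_weights, and the persistent tape is needed because tape.gradient() is called twice.

for epoch in range(3):
    # Training phase: training=True enables dropout and per-batch BN statistics.
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        with tf.GradientTape(persistent=True) as tape:
            logits1, logits2 = model(x_batch_train, training=True)
            loss_value1 = loss_fn1(y_batch_train[0], logits1)
            loss_value2 = loss_fn2(y_batch_train[1], logits2)

        weights1 = [model.trainable_weights[i] for i in selection1]
        weights2 = [model.trainable_weights[i] for i in selection2]
        grads1 = tape.gradient(loss_value1, weights1)
        grads2 = tape.gradient(loss_value2, weights2)
        optimizer1.apply_gradients(zip(grads1, weights1))
        optimizer2.apply_gradients(zip(grads2, weights2))
        del tape  # release the persistent tape

    # Validation phase: training=False disables dropout and uses BN moving statistics.
    for x_batch_val, y_batch_val in val_dataset:
        val_logits1, val_logits2 = model(x_batch_val, training=False)
        # ... compute validation metrics here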

Related

PyTorch - Train imbalanced dataset (set weights) for object detection

I am quite new to PyTorch, and I am trying to use an object detection model to do transfer learning in order to learn how to detect the objects in my new dataset.
Here is how I load the dataset:
train_dataset = MyDataset(train_data_path, 512, 512, train_labels_path, get_train_transform())
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True, num_workers=4, collate_fn=collate_fn)
valid_dataset = MyDataset(test_data_path, 512, 512, test_labels_path, get_valid_transform())
valid_loader = DataLoader(valid_dataset, batch_size=8, shuffle=False, num_workers=4, collate_fn=collate_fn)
I define the model and optimizer as follows:
# load Faster RCNN pre-trained model
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="FasterRCNN_ResNet50_FPN_Weights.COCO_V1")
# get the number of input features
in_features = model.roi_heads.box_predictor.cls_score.in_features
# define a new head for the detector with the required number of classes
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
model = model.to(DEVICE)
# get the model parameters
params = [p for p in model.parameters() if p.requires_grad]
# define the optimizer
# We are using the SGD optimizer with a learning rate of 0.001 and momentum of 0.9.
optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=0.0005)
I train the model as follows:
def train(train_data_loader, model, optimizer, train_loss_hist):
    global train_itr
    global train_loss_list

    prog_bar = tqdm(train_data_loader, total=len(train_data_loader), position=0, leave=True, ascii=True)

    # Then we have the for loop iterating over the batches.
    for i, data in enumerate(prog_bar):
        optimizer.zero_grad()
        images, targets = data

        images = list(image.to(DEVICE) for image in images)
        targets = [{k: v.to(DEVICE) for k, v in t.items()} for t in targets]

        # Forward pass
        loss_dict = model(images, targets)

        # Then we sum the losses and append the current iteration's loss value to the train_loss_list list.
        losses = sum(loss for loss in loss_dict.values())
        loss_value = losses.item()

        # We also send the current loss value to train_loss_hist of the Averager class.
        train_loss_list.append(loss_value)
        train_loss_hist.send(loss_value)

        # Then we backpropagate the gradients and update parameters.
        losses.backward()
        optimizer.step()

        train_itr += 1

    return train_loss_list
I adapted code that I found, and I am not sure where the loss is defined (I have not defined any loss in the code, so I believe it uses the default loss that was used to train the original object detector). How can I update my code to train my network on such an imbalanced dataset?
It seems that you have two questions.
1. How to deal with an imbalanced dataset.
Note that Faster R-CNN is an anchor-based detector, which means the number of anchors containing an object is extremely small compared to the total number of anchors, so you don't need to do anything special to deal with the imbalanced dataset. Alternatively, you can use RetinaNet, which introduced a loss function called focal loss to improve performance on imbalanced data.
2. Where is the loss function?
torchvision integrates the loss function inside the model object; you can step through your Python code inside the torchvision package with a debugger and see the implementation details.
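A minimal sketch (with a made-up class count and dummy inputs) that shows where the losses come from: in train() mode, a torchvision detection model called with targets returns its internal loss dict, while in eval() mode it returns predictions.

import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=3)  # hypothetical class count

images = [torch.rand(3, 512, 512)]
targets = [{"boxes": torch.tensor([[10.0, 10.0, 100.0, 100.0]]),
            "labels": torch.tensor([1])}]

model.train()
loss_dict = model(images, targets)   # dict of losses, e.g. loss_classifier, loss_box_reg, ...
print({k: round(v.item(), 4) for k, v in loss_dict.items()})

model.eval()
with torch.no_grad():
    detections = model(images)       # list of dicts with 'boxes', 'labels', 'scores'
print(detections[0].keys())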

Is it necessary to use with torch.no_grad() for feature extraction?

I'm attempting feature extraction in an unorthodox way. I extract features in eval() mode to switch off the batch norm and dropout layers and use the running means and standard deviations learned from ImageNet.
I use a feature extractor to extract features from two related images and concatenate the two tensors stackwise before passing through a linear dense classifier model for training. I'm wondering whether I can avoid using with torch.no_grad() as the two models are unrelated.
Here is a simplified version:
num_classes = 2
num_epochs = 10

criterion = nn.CrossEntropyLoss().to(device)

densenet = DenseNetConv()
# set densenet to eval to switch off batch norm and dropout layers and use ImageNet running means/std devs
densenet.eval()
densenet.to(device)

classifier = nn.Linear(4416, num_classes)
classifier.to(device)

optimizer = torch.optim.Adam(classifier.parameters(), lr=0.001)

for epoch in range(num_epochs):
    classifier.train()
    for i, (inputs_1, inputs_2, labels) in enumerate(dataloaders_dict['train']):
        inputs_1 = inputs_1.to(device)
        inputs_2 = inputs_2.to(device)
        labels = labels.to(device)

        features_1 = densenet(inputs_1)  # extract features 1
        features_2 = densenet(inputs_2)  # extract features 2

        combined = torch.cat([features_1, features_2], dim=1)  # combine features
        combined = combined.view(-1, 4416)  # reshape

        optimizer.zero_grad()

        # Forward pass to get output/logits
        outputs = classifier(combined)

        # Calculate Loss: softmax --> cross entropy loss
        loss = criterion(outputs, labels)

        _, pred = torch.max(outputs, 1)
        equality_check = (labels.data == pred)

        # Getting gradients w.r.t. parameters
        loss.backward()
        optimizer.step()
As you can see, I do not call with torch.no_grad(), despite having densenet.eval() as my separate feature extractor. Is there an issue with the way this is implemented or can I assume that this will not interfere with the classifier model?
If you are running inference with a model, applying torch.no_grad() won't have any effect on the resulting output. As you've said, only nn.Module.eval will, since it modifies how the forward pass is performed (namely which statistics are used to normalize the batch elements).
It is still recommended to switch off gradient computation when backpropagation is not necessary. This avoids caching activations on the forward call, resulting in faster inference.
In your case, you can either wrap your inference call on densenet with torch.no_grad:
with torch.no_grad():
    features_1 = densenet(inputs_1)  # extract features 1
    features_2 = densenet(inputs_2)  # extract features 2
Or alternatively, switch off the requires_grad flag on your module's parameter tensors using nn.Module.requires_grad_:
densenet.eval()
densenet.requires_grad_(False)
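For completeness, here is a minimal sketch of how the two options fit into the training loop from the question (densenet, classifier, optimizer, criterion, device and dataloaders_dict are the objects defined above; the 4416 feature size is taken from the question):

densenet.eval()                 # fixed BN/dropout behaviour, pre-trained running stats
densenet.requires_grad_(False)  # no gradients tracked for the extractor parameters

for epoch in range(num_epochs):
    classifier.train()
    for inputs_1, inputs_2, labels in dataloaders_dict['train']:
        inputs_1 = inputs_1.to(device)
        inputs_2 = inputs_2.to(device)
        labels = labels.to(device)

        # feature extraction without building a graph for densenet
        with torch.no_grad():
            features_1 = densenet(inputs_1)
            features_2 = densenet(inputs_2)

        combined = torch.cat([features_1, features_2], dim=1).view(-1, 4416)

        optimizer.zero_grad()
        outputs = classifier(combined)   # gradients flow only through the classifier
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()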

How to update weights in Stochastic Weight Averaging (SWA) on tensorflow?

I'm confused about how to implement tfa's SWA optimizer. There are two points here:
When you look at the documentation, it points you to this model averaging tutorial. That tutorial uses tfa.callbacks.AverageModelCheckpoint, which allows you to
Assign the moving average weights to the model, and save them.
(or) Keep the old non-averaged weights, but the saved model uses the average weights.
Having a distinct ModelCheckpoint that allows you to save moving-average weights (rather than the current weights) makes sense. However, it seems like SWA should be managing the weight averaging. That makes me want to set update_weights=False.
Is this correct? The tutorial uses update_weights=True.
There is a note about SWA not updating the BN layers in the documentation. Following the suggestion here I did this,
# original training
model.fit(...)
# updating weights from final run
optimizer.assign_average_vars(model.variables)
# batch-norm-hack: lr=0 as suggested https://stackoverflow.com/a/64376062/607528
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0),
    loss=loss,
    metrics=metrics)
model.fit(
    data,
    validation_data=None,
    epochs=1,
    callbacks=final_callbacks)
before saving my model.
Is this correct?
Thanks!
The easiest way to deal with the batch norm is the following:
First, loop through all layers in your model and reset the moving mean and moving variance in the batch norm layers (in my example I assume the batch norm layers end with "bn"):
for l in model.layers:
    if l.name.split('_')[-1] == 'bn':  # e.g. conv1_bn
        l.moving_mean.assign(tf.zeros_like(l.moving_mean))
        l.moving_variance.assign(tf.ones_like(l.moving_variance))
After that, run your model for one epoch with training set to True to update the moving mean and variance:
count = 0
for x, _ in dataset_train:
    _ = model(x, training=True)
    count += 1
    if count > steps_per_epoch:
        break
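Putting these steps together with the assign_average_vars call from the question, the whole post-training procedure might look like the following sketch (optimizer is assumed to be the tfa.optimizers.SWA wrapper used for training, dataset_train and steps_per_epoch as above, and the filename is made up):

import tensorflow as tf

# 1. Copy the averaged SWA weights into the model.
optimizer.assign_average_vars(model.variables)

# 2. Reset the moving statistics of the batch norm layers (assumed to end with "bn").
for l in model.layers:
    if l.name.split('_')[-1] == 'bn':
        l.moving_mean.assign(tf.zeros_like(l.moving_mean))
        l.moving_variance.assign(tf.ones_like(l.moving_variance))

# 3. Re-estimate the BN statistics with one pass over the training data.
count = 0
for x, _ in dataset_train:
    _ = model(x, training=True)
    count += 1
    if count > steps_per_epoch:
        break

# 4. Save the model with averaged weights and refreshed BN statistics.
model.save('swa_model.h5')  # hypothetical filename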
There are two ways of doing this. The first is to manually update the weights before saving, like this example from the documentation:
import tensorflow as tf
import tensorflow_addons as tfa

model = tf.keras.Sequential([...])
opt = tfa.optimizers.SWA(
    tf.keras.optimizers.SGD(lr=2.0), 100, 10)
model.compile(opt, ...)
model.fit(x, y, ...)

# Update the weights to their mean before saving
opt.assign_average_vars(model.variables)
model.save('model.h5')
The second option is to update the weights through AverageModelCheckpoint if you set update_weights=True, as the Colab notebook example shows:
avg_callback = tfa.callbacks.AverageModelCheckpoint(filepath=checkpoint_dir,
                                                    update_weights=True)
...
# Build Model
model = create_model(moving_avg_sgd)
# Train the network
model.fit(fmnist_train_ds, epochs=5, callbacks=[avg_callback])
Notice that AverageModelCheckpoint also calls assign_average_vars before saving the model, as its source code shows:
def _save_model(self, epoch, logs):
    optimizer = self._get_optimizer()
    assert isinstance(optimizer, AveragedOptimizerWrapper)
    if self.update_weights:
        optimizer.assign_average_vars(self.model.variables)
    return super()._save_model(epoch, logs)
...

Why is accuracy from fit_generator different to that from evaluate_generator in Keras?

What I do:
I am training a pre-trained CNN with Keras fit_generator(). This produces evaluation metrics (loss, acc, val_loss, val_acc) after each epoch. After training the model, I produce evaluation metrics (loss, acc) with evaluate_generator().
What I expect:
If I train the model for one epoch, I would expect that the metrics obtained with fit_generator() and evaluate_generator() are the same. They both should derive the metrics based on the entire dataset.
What I observe:
Both loss and acc differ between fit_generator() and evaluate_generator().
What I don't understand:
Why is the accuracy from fit_generator() different from that of evaluate_generator()?
My code:
def generate_data(path, imagesize, nBatches):
    datagen = ImageDataGenerator(rescale=1./255)
    generator = datagen.flow_from_directory(
        directory=path,                      # path to the target directory
        target_size=(imagesize, imagesize),  # dimensions to which all images found will be resized
        color_mode='rgb',                    # whether the images will be converted to have 1, 3, or 4 channels
        classes=None,                        # optional list of class subdirectories
        class_mode='categorical',            # type of label arrays that are returned
        batch_size=nBatches,                 # size of the batches of data
        shuffle=True)                        # whether to shuffle the data
    return generator
[...]
def train_model(model, nBatches, nEpochs, trainGenerator, valGenerator, resultPath):
    history = model.fit_generator(
        generator=trainGenerator,
        steps_per_epoch=trainGenerator.samples // nBatches,  # total number of steps (batches of samples)
        epochs=nEpochs,                # number of epochs to train the model
        verbose=2,                     # verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch
        callbacks=None,                # keras.callbacks.Callback instances to apply during training
        validation_data=valGenerator,  # generator or tuple on which to evaluate the loss and any model metrics at the end of each epoch
        validation_steps=valGenerator.samples // nBatches,  # number of steps (batches of samples) to yield from validation_data generator before stopping at the end of every epoch
        class_weight=None,             # optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function
        max_queue_size=10,             # maximum size for the generator queue
        workers=32,                    # maximum number of processes to spin up when using process-based threading
        use_multiprocessing=True,      # whether to use process-based threading
        shuffle=False,                 # whether to shuffle the order of the batches at the beginning of each epoch
        initial_epoch=0)               # epoch at which to start training
    print("%s: Model trained." % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))

    # Save model
    modelPath = os.path.join(resultPath, datetime.now().strftime('%Y-%m-%d_%H-%M-%S') + '_modelArchitecture.h5')
    weightsPath = os.path.join(resultPath, datetime.now().strftime('%Y-%m-%d_%H-%M-%S') + '_modelWeights.h5')
    model.save(modelPath)
    model.save_weights(weightsPath)
    print("%s: Model saved." % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))

    return history, model
[...]
def evaluate_model(model, generator):
    score = model.evaluate_generator(
        generator=generator,                  # generator yielding tuples
        steps=generator.samples // nBatches)  # number of steps (batches of samples) to yield from generator before stopping
    print("%s: Model evaluated:"
          "\n\t\t\t\t\t\t Loss: %.3f"
          "\n\t\t\t\t\t\t Accuracy: %.3f" %
          (datetime.now().strftime('%Y-%m-%d_%H-%M-%S'),
           score[0], score[1]))
[...]
def main():
    # Create model
    modelUntrained = create_model(imagesize, nBands, nClasses)

    # Prepare training and validation data
    trainGenerator = generate_data(imagePathTraining, imagesize, nBatches)
    valGenerator = generate_data(imagePathValidation, imagesize, nBatches)

    # Train and save model
    history, modelTrained = train_model(modelUntrained, nBatches, nEpochs, trainGenerator, valGenerator, resultPath)

    # Evaluate on validation data
    print("%s: Model evaluation (valX, valY):" % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
    evaluate_model(modelTrained, valGenerator)

    # Evaluate on training data
    print("%s: Model evaluation (trainX, trainY):" % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
    evaluate_model(modelTrained, trainGenerator)
Update
I found some sites that report on this issue:
The Batch Normalization layer of Keras is broken
Strange behaviour of the loss function in keras model, with pretrained convolutional base
model.evaluate() gives a different loss on training data from the one in training process
Got different accuracy between history and evaluate
ResNet: 100% accuracy during training, but 33% prediction accuracy with the same data
I tried following some of their suggested solutions without success so far. acc and loss still differ between fit_generator() and evaluate_generator(), even when using the exact same data, generated with the same generator, for training and validation. Here is what I tried:
statically setting the learning_phase for the entire script or before adding new layers to the pre-trained ones
K.set_learning_phase(0) # testing
K.set_learning_phase(1) # training
unfreezing all batch normalization layers from the pre-trained model
for i in range(len(model.layers)):
    if str.startswith(model.layers[i].name, 'bn'):
        model.layers[i].trainable = True
not adding dropout or batch normalization as untrained layers
# Create pre-trained base model
basemodel = ResNet50(include_top=False,       # exclude final pooling and fully connected layer in the original model
                     weights='imagenet',      # pre-training on ImageNet
                     input_tensor=None,       # optional tensor to use as image input for the model
                     input_shape=(imagesize,  # shape tuple
                                  imagesize,
                                  nBands),
                     pooling=None,            # output of the model will be the 4D tensor output of the last convolutional layer
                     classes=nClasses)        # number of classes to classify images into

# Create new untrained layers
x = basemodel.output
x = GlobalAveragePooling2D()(x)               # global spatial average pooling layer
x = Dense(1024, activation='relu')(x)         # fully-connected layer
y = Dense(nClasses, activation='softmax')(x)  # logistic layer making sure that probabilities sum up to 1

# Create model combining pre-trained base model and new untrained layers
model = Model(inputs=basemodel.input,
              outputs=y)

# Freeze weights on pre-trained layers
for layer in basemodel.layers:
    layer.trainable = False

# Define learning optimizer
learningRate = 0.01
optimizerSGD = optimizers.SGD(lr=learningRate,             # learning rate
                              momentum=0.9,                # parameter that accelerates SGD in the relevant direction and dampens oscillations
                              decay=learningRate/nEpochs,  # learning rate decay over each update
                              nesterov=True)               # whether to apply Nesterov momentum

# Compile model
model.compile(optimizer=optimizerSGD,           # stochastic gradient descent optimizer
              loss='categorical_crossentropy',  # objective function
              metrics=['accuracy'],             # metrics to be evaluated by the model during training and testing
              loss_weights=None,                # scalar coefficients to weight the loss contributions of different model outputs
              sample_weight_mode=None,          # sample-wise weights
              weighted_metrics=None,            # metrics to be evaluated and weighted by sample_weight or class_weight during training and testing
              target_tensors=None)              # tensors to which the target data will be fed during training
using different pre-trained CNNs as base model (VGG19, InceptionV3, InceptionResNetV2, Xception)
from keras.applications.vgg19 import VGG19

basemodel = VGG19(include_top=False,       # exclude final pooling and fully connected layer in the original model
                  weights='imagenet',      # pre-training on ImageNet
                  input_tensor=None,       # optional tensor to use as image input for the model
                  input_shape=(imagesize,  # shape tuple
                               imagesize,
                               nBands),
                  pooling=None,            # output of the model will be the 4D tensor output of the last convolutional layer
                  classes=nClasses)        # number of classes to classify images into
Please let me know if there are other solutions around that I am missing.
I now managed to get the same evaluation metrics. I changed the following:
I set a seed in flow_from_directory() as suggested by @Anakin
def generate_data(path, imagesize, nBatches):
    datagen = ImageDataGenerator(rescale=1./255)
    generator = datagen.flow_from_directory(
        directory=path,                      # path to the target directory
        target_size=(imagesize, imagesize),  # dimensions to which all images found will be resized
        color_mode='rgb',                    # whether the images will be converted to have 1, 3, or 4 channels
        classes=None,                        # optional list of class subdirectories
        class_mode='categorical',            # type of label arrays that are returned
        batch_size=nBatches,                 # size of the batches of data
        shuffle=True,                        # whether to shuffle the data
        seed=42)                             # random seed for shuffling and transformations
    return generator
I set use_multiprocessing=False in fit_generator() according to the warning: "use_multiprocessing=True and multiple workers may duplicate your data"
history = model.fit_generator(
    generator=trainGenerator,
    steps_per_epoch=trainGenerator.samples // nBatches,  # total number of steps (batches of samples)
    epochs=nEpochs,                # number of epochs to train the model
    verbose=2,                     # verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch
    callbacks=callback,            # keras.callbacks.Callback instances to apply during training
    validation_data=valGenerator,  # generator or tuple on which to evaluate the loss and any model metrics at the end of each epoch
    validation_steps=valGenerator.samples // nBatches,  # number of steps (batches of samples) to yield from validation_data generator before stopping at the end of every epoch
    class_weight=None,             # optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function
    max_queue_size=10,             # maximum size for the generator queue
    workers=1,                     # maximum number of processes to spin up when using process-based threading
    use_multiprocessing=False,     # whether to use process-based threading
    shuffle=False,                 # whether to shuffle the order of the batches at the beginning of each epoch
    initial_epoch=0)               # epoch at which to start training
I unified my Python setup as suggested in the Keras documentation on how to obtain reproducible results using Keras during development:
import numpy as np
import tensorflow as tf
import random as rn
from keras import backend as K

np.random.seed(42)
rn.seed(12345)

session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)
tf.set_random_seed(1234)

sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)
Instead of rescaling input images with datagen = ImageDataGenerator(rescale=1./255), I now generate my data with:
from keras.applications.resnet50 import preprocess_input
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
With this, I managed to get similar accuracy and loss from fit_generator() and evaluate_generator(). Also, using the same data for training and testing now results in similar metrics. Reasons for the remaining differences are provided in the Keras documentation.
Setting use_multiprocessing=False at the fit_generator level fixes the problem, BUT at the cost of slowing down training significantly. A better but still imperfect workaround is to set use_multiprocessing=False only for the validation generator, as in the code below, modified from Keras' fit_generator function.
...
try:
    if do_validation:
        if val_gen and workers > 0:
            # Create an Enqueuer that can be reused
            val_data = validation_data
            if isinstance(val_data, Sequence):
                val_enqueuer = OrderedEnqueuer(val_data,
                                               use_multiprocessing=False)  # changed
                validation_steps = len(val_data)
            else:
                val_enqueuer = GeneratorEnqueuer(val_data,
                                                 use_multiprocessing=False)  # changed
            val_enqueuer.start(workers=workers,
                               max_queue_size=max_queue_size)
            val_enqueuer_gen = val_enqueuer.get()
...
Training for one epoch might not be informative enough in this case. Also, your train and test data may not be exactly the same, since you are not setting a random seed for the flow_from_directory method. Have a look here.
Maybe you can set a seed, remove augmentations (if any), and save the trained model weights so you can load them later to check, as sketched below.
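A minimal sketch of such a check, reusing the helper functions from the question (create_model, generate_data, train_model, evaluate_model) and assuming a seeded, non-augmented, non-shuffled validation generator; the weights filename is made up:

# Train briefly, save the weights, reload them into a freshly built model,
# and compare the two evaluations on the same validation generator.
history, modelTrained = train_model(modelUntrained, nBatches, 1, trainGenerator, valGenerator, resultPath)
modelTrained.save_weights('check_weights.h5')  # hypothetical filename

modelReloaded = create_model(imagesize, nBands, nClasses)
modelReloaded.compile(optimizer=optimizerSGD,
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])
modelReloaded.load_weights('check_weights.h5')

# With a fixed seed and no augmentation, both evaluations should agree closely.
evaluate_model(modelTrained, valGenerator)
evaluate_model(modelReloaded, valGenerator)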

TensorFlow 2.0: Eager execution of training either returns bad results or doesn't learn at all

I am experimenting with TensorFlow 2.0 (alpha). I want to implement a simple feed-forward network with two output nodes for binary classification (it's a 2.0 version of this model).
This is a simplified version of the script. After I defined a simple Sequential() model, I set:
# import layers + dropout & activation
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.activations import elu, softmax

# Neural Network Architecture
n_input = X_train.shape[1]
n_hidden1 = 15
n_hidden2 = 10
n_output = y_train.shape[1]

model = tf.keras.models.Sequential([
    Dense(n_input, input_shape=(n_input,), activation=elu),  # Input layer
    Dropout(0.2),
    Dense(n_hidden1, activation=elu),  # hidden layer 1
    Dropout(0.2),
    Dense(n_hidden2, activation=elu),  # hidden layer 2
    Dropout(0.2),
    Dense(n_output, activation=softmax)  # Output layer
])
# define loss and accuracy
bce_loss = tf.keras.losses.BinaryCrossentropy()
accuracy = tf.keras.metrics.BinaryAccuracy()

# define optimizer
optimizer = tf.optimizers.Adam(learning_rate=0.001)

# save training progress in lists
loss_history = []
accuracy_history = []

# loop over 1000 epochs
for epoch in range(1000):
    with tf.GradientTape() as tape:
        # take binary cross-entropy (bce_loss)
        current_loss = bce_loss(model(X_train), y_train)
        # Update weights based on the gradient of the loss function
        gradients = tape.gradient(current_loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    # save in history vectors
    current_loss = current_loss.numpy()
    loss_history.append(current_loss)

    accuracy.update_state(model(X_train), y_train)
    current_accuracy = accuracy.result().numpy()
    accuracy_history.append(current_accuracy)

    # print loss and accuracy scores each 100 epochs
    if (epoch + 1) % 100 == 0:
        print(str(epoch + 1) + '.\tTrain Loss: ' + str(current_loss) + ',\tAccuracy: ' + str(current_accuracy))
        accuracy.reset_states()

print('\nTraining complete.')
print('\nTraining complete.')
Training runs without errors; however, strange things happen:
Sometimes, the Network doesn't learn anything. All loss and accuracy scores are constant throughout all the epochs.
Other times, the network is learning, but very, very badly. Accuracy never went beyond 0.4 (while in TensorFlow 1.x I got an effortless 0.95+). Such low performance suggests to me that something went wrong in the training.
Other times, the accuracy is very slowly improving, while the loss remains constant all the time.
What can cause these problems? Please help me understand my mistakes.
UPDATE:
After some corrections, I can make the network learn. However, its performance is extremely poor. After 1000 epochs, it reaches about 40% accuracy, which clearly means something is still wrong. Any help is appreciated.
tf.GradientTape records every operation that happens inside its scope.
You don't want to record the gradient calculation in the tape; you only want to compute the forward pass and the loss inside it.
with tf.GradientTape() as tape:
    # take binary cross-entropy (bce_loss)
    current_loss = bce_loss(model(df), classification)

# End of tape scope

# Update weights based on the gradient of the loss function
gradients = tape.gradient(current_loss, model.trainable_variables)
# The tape is now consumed
optimizer.apply_gradients(zip(gradients, model.trainable_variables))
More importantly, I don't see a loop over the training set, therefore I suppose the complete code should look like:
for epoch in range(n_epochs):
    for df, classification in dataset:
        # your code that computes loss and trains
Moreover, the usage of the metrics is wrong.
You want to accumulate, i.e. update the internal state of, the accuracy metric at every training step, and measure the overall accuracy at the end of every epoch.
Thus you have to:
# Measure the accuracy inside the training loop; update_state expects (y_true, y_pred)
accuracy.update_state(classification, model(df))
And call accuracy.result() only at the end of the epoch, when all the accuracy values have been accumulated into the metric.
Remember to call the .reset_states() method to clear the metric's internal state, resetting it to zero at the end of every epoch.
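Putting these points together, a minimal sketch of the corrected loop might look like this (X_train, y_train and model come from the question; the batch size and the use of tf.data are my own assumptions, and note that Keras losses and metrics take the labels as the first argument):

import tensorflow as tf

train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train)).batch(32)

bce_loss = tf.keras.losses.BinaryCrossentropy()
accuracy = tf.keras.metrics.BinaryAccuracy()
optimizer = tf.optimizers.Adam(learning_rate=0.001)

for epoch in range(1000):
    for x_batch, y_batch in train_dataset:
        with tf.GradientTape() as tape:
            # only the forward pass and the loss are recorded on the tape
            predictions = model(x_batch, training=True)
            current_loss = bce_loss(y_batch, predictions)

        # gradient computation happens outside the tape scope
        gradients = tape.gradient(current_loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))

        # accumulate the metric at every step
        accuracy.update_state(y_batch, predictions)

    # read the accumulated accuracy once per epoch, then reset it
    if (epoch + 1) % 100 == 0:
        print(f'{epoch + 1}.\tTrain Loss: {current_loss.numpy():.4f}'
              f',\tAccuracy: {accuracy.result().numpy():.4f}')
    accuracy.reset_states()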
