I am using model.fit() several times; each call trains one block of layers while the other layers are frozen.
CODE
# create the base pre-trained model
base_model = efn.EfficientNetB0(input_tensor=input_tensor,weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# add a fully-connected layer
x = Dense(x.shape[1], activation='relu',name='first_dense')(x)
x=Dropout(0.5)(x)
x = Dense(x.shape[1], activation='relu',name='output')(x)
x=Dropout(0.5)(x)
no_classes=10
predictions = Dense(no_classes, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional layers
for layer in base_model.layers:
    layer.trainable = False
#FIRST COMPILE
model.compile(optimizer='Adam', loss=loss_function,
metrics=['accuracy'])
#FIRST FIT
model.fit(features[train], labels[train],
batch_size=batch_size,
epochs=top_epoch,
verbose=verbosity,
validation_split=validation_split)
# Generate generalization metrics
scores = model.evaluate(features[test], labels[test], verbose=1)
print(scores)
#Let all layers be trainable
for layer in model.layers:
    layer.trainable = True
from tensorflow.keras.optimizers import SGD
#SECOND COMPILE
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss=loss_function,
metrics=['accuracy'])
#SECOND FIT
model.fit(features[train], labels[train],
batch_size=batch_size,
epochs=no_epochs,
verbose=verbosity,
validation_split=validation_split)
What is weird is that in the second fit, the accuracy of the first epoch is much lower than the accuracy of the last epoch of the first fit.
RESULT
Epoch 40/40
6286/6286 [==============================] - 14s 2ms/sample - loss: 0.2370 - accuracy: 0.9211 - val_loss: 1.3579 - val_accuracy: 0.6762
874/874 [==============================] - 2s 2ms/sample - loss: 0.4122 - accuracy: 0.8764
Train on 6286 samples, validate on 1572 samples
Epoch 1/40
6286/6286 [==============================] - 60s 9ms/sample - loss: 5.9343 - accuracy: 0.5655 - val_loss: 2.4981 - val_accuracy: 0.5115
I think the weights at the start of the second fit are not carried over from the first fit.
Thanks in advance!!!
I think this is the result of using a different optimizer. You used Adam in the first fit and SGD in the second. Try using Adam in the second fit as well and see if it works correctly.
I solved this by removing the second compile() call.
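A minimal sketch of that workaround, reusing the names from the question (and, per the previous suggestion, implicitly keeping the Adam optimizer): unfreeze the layers and call fit() again without a second compile(), so the weights and optimizer state from the first fit carry over.
#Let all layers be trainable
for layer in model.layers:
    layer.trainable = True
#SECOND FIT -- no second compile()
model.fit(features[train], labels[train],
          batch_size=batch_size,
          epochs=no_epochs,
          verbose=verbosity,
          validation_split=validation_split)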
I am getting zero validation accuracy on my LSTM model.
As my model is a many-to-one model, I am using one unit in the last dense layer. But it is giving me this accuracy:
536/536 [==============================] - 6s 8ms/step - loss: nan -
accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
<keras.callbacks.History at 0x7efd6b9bc5d0>
My model is:
classifier1 = Sequential()
classifier1.add(CuDNNLSTM(100, input_shape = (x1_train.shape[1], x1_train.shape[2]), return_sequences= True))
# classifier1.add(Dropout(0.02))
classifier1.add(CuDNNLSTM(100))
classifier1.add(Dropout(0.02))
classifier1.add(Dense(100, activation = 'sigmoid'))
# classifier1.add(Dense(300))
classifier1.add(Dropout(0.02))
classifier1.add(Dense(1, activation='softmax'))
# classifier1.add(Dropout(0.02))
# classifier1.add(Dense(1))
early_stopping = EarlyStopping(monitor='val_loss',patience=3, verbose = 1)
callback = [early_stopping]
classifier1.compile(
loss='sparse_categorical_crossentropy',
# loss = 'mean_squared_error',
# optimizer=Adam(learning_rate=0.05, decay= 1e-6),
optimizer='rmsprop',
metrics=['accuracy'])
classifier1.fit(x1_train, y1_train, epochs=1 ,
validation_data=(x1_test, y1_test),
batch_size=50
# class_weight= 'balanced'
# callbacks = callback)
)
The output of your network has dimension 1, so in the last layer you should use a sigmoid activation instead of a softmax, and in particular replace the categorical cross-entropy with a binary cross-entropy, because it is a two-class problem. Then it should work.
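A minimal sketch of that change, keeping the rest of the model and training setup from the question unchanged:
# one unit with a sigmoid for the binary output
classifier1.add(Dense(1, activation='sigmoid'))
classifier1.compile(
    loss='binary_crossentropy',  # two-class problem => binary cross-entropy
    optimizer='rmsprop',
    metrics=['accuracy'])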
I have a time-series data set with 1 feature that represents multiple games. The goal is to classify each game as a win or loss - binary classification. Each game has 61 rows, and the feature has been scaled to be between 0 and 1:
x_train = array([[0.55340617],
[0.54956823],
[0.54588505],
...,
[0.87483364],
[0.8956947 ],
[0.90343248]])
y_train = array([0, 0, 0, ..., 0, 0, 0])
The problem should be quite easy, and I was expecting around 70% accuracy based on other models.
I'm trying to train an LSTM with the data. I think I should be resetting the state on every game, and so the batch shape is defined by 61 timesteps, and 1 feature:
timesteps = 61
n_features = 1
# Reshape data for LSTM
x_train = x_train.reshape(len(x_train)//timesteps, timesteps, n_features)
# Get class of each game
y_train = x_test[0: len(y_train): timesteps]
model = Sequential()
# Hidden layer
n_neurons = 8
model.add(LSTM(n_neurons,
input_shape=(timesteps, n_features),
stateful=False))
model.add(Dense(1, activation='softmax'))
model.compile(loss='binary_crossentropy',
optimizer='adam', metrics=['accuracy'])
model.fit(x_train,
y_train,
epochs=3,
batch_size=1)
But when I train the model, the accuracy remains constant:
Epoch 1/3
301/301 [==============================] - 7s 23ms/step - loss: 8.2524 - accuracy: 0.4618
Epoch 2/3
301/301 [==============================] - 6s 21ms/step - loss: 8.2524 - accuracy: 0.4618
Epoch 3/3
301/301 [==============================] - 6s 21ms/step - loss: 8.2524 - accuracy: 0.4618
I have tried switching the optimiser to RMSprop, but I get the exact same result. Could the problem lie with the batch shape?
Any help would be greatly appreciated!
EDIT: Fixed some typos in the code. Sorry!
I am playing around with custom loss functions on Keras models. My "custom" loss seems to fail (in terms of accuracy score), even though I am only using a wrapper that returns an original keras loss.
As a toy example, I am using the "Basic classification" Tensorflow/Keras tutorial that uses a simple NN on the fashion-MNIST data set and I am following the related Keras documentation and this SO post.
This is the model:
model = keras.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(128, activation='relu'),
keras.layers.Dense(10, activation='softmax')
])
Now, if I leave sparse_categorical_crossentropy as a string argument in the compile() function, training results in ~87% accuracy, which is fine:
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=10)
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('\nTest accuracy:', test_acc)
But when I just create a trivial wrapper function to call keras' cross-entropy I get a ~ 10% accuracy both on training and test sets:
from tensorflow.keras import losses
def my_loss(y_true, y_pred):
    return losses.sparse_categorical_crossentropy(y_true, y_pred)
model.compile(optimizer='adam',
loss=my_loss,
metrics=['accuracy'])
Epoch 1/10 60000/60000 [==============================] - 3s 51us/sample - loss: 0.5030 - accuracy: 0.1032
Epoch 2/10 60000/60000 [==============================] - 3s 45us/sample - loss: 0.3766 - accuracy: 0.1035
...
Test accuracy: 0.1013
By plotting a few images and checking their classified labels, it doesn't look like the results differ in each case, but accuracies printed are very different. So, is it the case that the default metrics do not play nicely with custom losses? Can it be the case that what I see is the error rather than the accuracy? Am I missing something from the documentation?
Edit: The values of the loss functions in both cases end up roughly the same, so training indeed takes place. The accuracy is the point of failure.
Here's the reason:
When you use the built-in loss via loss='sparse_categorical_crossentropy', the metric resolved from 'accuracy' is sparse_categorical_accuracy. But when you use a custom loss function, the metric resolved from 'accuracy' is categorical_accuracy.
Example:
model.compile(optimizer='adam',
loss=losses.sparse_categorical_crossentropy,
metrics=['categorical_accuracy', 'sparse_categorical_accuracy'])
model.fit(train_images, train_labels, epochs=1)
'''
Train on 60000 samples
60000/60000 [==============================] - 5s 86us/sample - loss: 0.4955 - categorical_accuracy: 0.1045 - sparse_categorical_accuracy: 0.8255
'''
model.compile(optimizer='adam',
loss=my_loss,
metrics=['accuracy', 'sparse_categorical_accuracy'])
model.fit(train_images, train_labels, epochs=1)
'''
Train on 60000 samples
60000/60000 [==============================] - 5s 87us/sample - loss: 0.4956 - acc: 0.1043 - sparse_categorical_accuracy: 0.8256
'''
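As a small follow-up sketch (not part of the original answer), one way to avoid the mismatch with a custom loss is to name the metric explicitly instead of relying on the 'accuracy' shortcut:
model.compile(optimizer='adam',
              loss=my_loss,
              metrics=['sparse_categorical_accuracy'])  # matches the integer labels expected by the loss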
I am training an image classification model to classify certain images containing mountain, waterfall and people classes. Currently, I am using Vgg-net (transfer learning) to train this data. While training, I am getting almost 93% accuracy on training data and 90% accuracy on validation data. However, when I want to check the classes being wrongly classified during training by using confusion matrix on the training data, the accuracy seems to be much less. Only about 30% of the data is classified correctly.
I have tried checking the confusion matrix on other similar image classification problems but there it seems to be showing the accuracy that we see while training.
Code to create ImageDataGenerator objects for training and validation
from keras.applications.inception_v3 import preprocess_input, decode_predictions
#Train DataSet Generator with Augmentation
print("\nTraining Data Set")
train_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
train_flow = train_generator.flow(
train_images, train_labels,
batch_size = BATCH_SIZE
)
#Validation DataSet Generator with Augmentation
print("\nValidation Data Set")
val_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
val_flow = val_generator.flow(
validation_images,validation_labels,
batch_size = BATCH_SIZE
)
Code to build the model and compile it
# Initialize VGG16 with transfer learning
base_model = applications.vgg16.VGG16(weights='imagenet',
include_top=False,
input_shape=(WIDTH, HEIGHT,3))
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# and a dense layer
x = Dense(1024, activation='relu')(x)
predictions = Dense(len(categories), activation='softmax')(x)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional VGG16 layers
for layer in base_model.layers:
    layer.trainable = False
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer=optimizers.RMSprop(lr=1e-4), metrics=['accuracy'], loss='categorical_crossentropy')
model.summary()
Code to fit the model to the training dataset and validate using validation dataset
import math
top_layers_file_path=r"/content/drive/My Drive/intel-image-classification/intel-image-classification/top_layers.iv3.hdf5"
checkpoint = ModelCheckpoint(top_layers_file_path, monitor='loss', verbose=1, save_best_only=True, mode='min')
tb = TensorBoard(log_dir=r'/content/drive/My Drive/intel-image-classification/intel-image-classification/logs', batch_size=BATCH_SIZE, write_graph=True, update_freq='batch')
early = EarlyStopping(monitor="loss", mode="min", patience=5)
csv_logger = CSVLogger(r'/content/drive/My Drive/intel-image-classification/intel-image-classification/logs/iv3-log.csv', append=True)
history = model.fit_generator(train_flow,
epochs=30,
verbose=1,
validation_data=val_flow,
validation_steps=math.ceil(validation_images.shape[0]/BATCH_SIZE),
steps_per_epoch=math.ceil(train_images.shape[0]/BATCH_SIZE),
callbacks=[checkpoint, early, tb, csv_logger])
training steps show the following accuracy:
Epoch 1/30
91/91 [==============================] - 44s 488ms/step - loss: 0.6757 - acc: 0.7709 - val_loss: 0.4982 - val_acc: 0.8513
Epoch 2/30
91/91 [==============================] - 32s 349ms/step - loss: 0.4454 - acc: 0.8395 - val_loss: 0.3980 - val_acc: 0.8557
.
.
Epoch 20/30
91/91 [==============================] - 32s 349ms/step - loss: 0.2026 - acc: 0.9238 - val_loss: 0.2684 - val_acc: 0.8940
.
.
Epoch 30/30
91/91 [==============================] - 32s 349ms/step - loss: 0.1739 - acc: 0.9364 - val_loss: 0.2616 - val_acc: 0.8984
Ran predictions on the training dataset itself:
import math
import numpy as np
predictions = model.predict_generator(
train_flow,
verbose=1,
steps=math.ceil(train_images.shape[0]/BATCH_SIZE))
predicted_classes = [x[0] for x in enc.inverse_transform(predictions)]
true_classes = [x[0] for x in enc.inverse_transform(train_labels)]
enc - OneHotEncoder
However, the confusion matrix looks like the following:
import sklearn
sklearn.metrics.confusion_matrix(
true_classes,
predicted_classes,
labels=['mountain','people','waterfall'])
([[315, 314, 283],
[334, 309, 273],
[337, 280, 263]])
The complete code has been uploaded on https://nbviewer.jupyter.org/github/paridhichoudhary/scene-image-classification/blob/master/Classification_v7.ipynb
I believe it's because train_generator.flow has shuffle=True by default, so predicted_classes doesn't line up with train_labels.
Setting shuffle=False on train_generator.flow should help, or instead you can use something like this, which might be easier to understand:
predicted_classes = []
true_classes = []
for i in range(len(train_flow)):  # len(train_flow) replaces math.ceil(train_images.shape[0]/BATCH_SIZE)
    x, y = train_flow[i]          # indexing the generator gives one batch of (images, labels)
    z = model.predict(x)
    for j in range(x.shape[0]):
        predicted_classes.append(z[j])
        true_classes.append(y[j])
# z[j] and y[j] are still probability/one-hot vectors; pass them through
# enc.inverse_transform (or argmax) before building the confusion matrix
I haven't tried this yet, but it should work.
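For the first suggestion, a minimal sketch of the shuffle=False variant, reusing the names from the question:
# keep the sample order fixed so predictions line up with train_labels
train_flow = train_generator.flow(
    train_images, train_labels,
    batch_size=BATCH_SIZE,
    shuffle=False)
predictions = model.predict_generator(
    train_flow,
    steps=math.ceil(train_images.shape[0]/BATCH_SIZE))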
I'm new to tensorflow keras and dataset. Can anyone help me understand why the following code doesn't work?
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
from tensorflow.python.data.ops import dataset_ops
from tensorflow.python.data.ops import iterator_ops
from tensorflow.python.keras.utils import multi_gpu_model
from tensorflow.python.keras import backend as K
data = np.random.random((1000,32))
labels = np.random.random((1000,10))
dataset = tf.data.Dataset.from_tensor_slices((data,labels))
print( dataset)
print( dataset.output_types)
print( dataset.output_shapes)
dataset.batch(10)
dataset.repeat(100)
inputs = keras.Input(shape=(32,)) # Returns a placeholder tensor
# A layer instance is callable on a tensor, and returns a tensor.
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(64, activation='relu')(x)
predictions = keras.layers.Dense(10, activation='softmax')(x)
# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)
# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
loss='categorical_crossentropy',
metrics=['accuracy'])
# Trains for 5 epochs
model.fit(dataset, epochs=5, steps_per_epoch=100)
It failed with the following error:
model.fit(x=dataset, y=None, epochs=5, steps_per_epoch=100)
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 1510, in fit
validation_split=validation_split)
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 994, in _standardize_user_data
class_weight, batch_size)
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 1113, in _standardize_weights
exception_prefix='input')
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training_utils.py", line 325, in standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (32,)
According to tf.keras guide, I should be able to directly pass the dataset to model.fit, as this example shows:
Input tf.data datasets
Use the Datasets API to scale to large datasets or multi-device training. Pass a tf.data.Dataset instance to the fit method:
# Instantiates a toy dataset instance:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)
dataset = dataset.repeat()
Don't forget to specify steps_per_epoch when calling fit on a dataset.
model.fit(dataset, epochs=10, steps_per_epoch=30)
Here, the fit method uses the steps_per_epoch argument—this is the number of training steps the model runs before it moves to the next epoch. Since the Dataset yields batches of data, this snippet does not require a batch_size.
Datasets can also be used for validation:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
val_dataset = val_dataset.batch(32).repeat()
model.fit(dataset, epochs=10, steps_per_epoch=30,
validation_data=val_dataset,
validation_steps=3)
What's the problem with my code, and what's the correct way of doing it?
To your original question as to why you're getting the error:
Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (32,)
The reason your code breaks is because you haven't applied the .batch() back to the dataset variable, like so:
dataset = dataset.batch(10)
You simply called dataset.batch().
This breaks because without the batch() the output tensors are not batched, i.e. you get shape (32,) instead of (1,32).
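A tiny sketch of that fix, with the values from the question; both transformations need to be assigned back:
dataset = dataset.batch(10)    # assign the batched dataset back to the variable
dataset = dataset.repeat(100)  # same for repeat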
You are missing the definition of an iterator, which is the reason for the error.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Dense

data = np.random.random((1000,32))
labels = np.random.random((1000,10))
dataset = tf.data.Dataset.from_tensor_slices((data,labels))
dataset = dataset.batch(10).repeat()
inputs = Input(shape=(32,)) # Returns a placeholder tensor
# A layer instance is callable on a tensor, and returns a tensor.
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)
# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
loss='categorical_crossentropy',
metrics=['accuracy'])
# Trains for 5 epochs
model.fit(dataset.make_one_shot_iterator(), epochs=5, steps_per_epoch=100)
Epoch 1/5
100/100 [==============================] - 1s 8ms/step - loss: 11.5787 - acc: 0.1010
Epoch 2/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4846 - acc: 0.0990
Epoch 3/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4690 - acc: 0.1270
Epoch 4/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4611 - acc: 0.1300
Epoch 5/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4546 - acc: 0.1360
This is the result on my system.