Working on the IMDB dataset in Google Colab, the model's accuracy refuses to go above 50%.
The dataset has already been tokenized and cleaned before this point.
Any suggestions on how the accuracy can be improved are welcome.
le = LabelEncoder()
df['sentiment'] = le.fit_transform(df['sentiment'])
labels = to_categorical(df['sentiment'], num_classes=2)  # one-hot encoded output labels

num_words = 10000
embeddings = 256
max_len = 400

sequences = tokenizer.texts_to_sequences(df['review'])
sequences_padded = pad_sequences(sequences, maxlen=max_len, padding='post', truncating='post')

X_train, X_test, y_train, y_test = train_test_split(sequences_padded, labels, test_size=0.20, random_state=42)
model = keras.Sequential()
model.add(Embedding(num_words, embeddings, input_length=max_len))
model.add(Conv1D(256, 10, activation='relu'))
model.add(keras.layers.Bidirectional(LSTM(128, return_sequences=True,
                                          kernel_regularizer=tf.keras.regularizers.l1(0.01),
                                          activity_regularizer=tf.keras.regularizers.l2(0.01))))
model.add(LSTM(64))
model.add(keras.layers.Dropout(0.4))
model.add(Dense(2, activation='softmax'))
model.summary()

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=3, batch_size=128, verbose=1)
Epoch 1/3
310/310 [==============================] - 157s 398ms/step - loss: 35.7756 - accuracy: 0.5007
Epoch 2/3
310/310 [==============================] - 123s 395ms/step - loss: 1.0212 - accuracy: 0.5003
Epoch 3/3
310/310 [==============================] - 123s 397ms/step - loss: 1.0211 - accuracy: 0.5015
Update:
Model accuracy started improving when I changed from post- to pre-padding. Any leads on why this happens would be highly appreciated.
You are using binary_crossentropy with a softmax output and one-hot labels; for that combination the loss should be categorical_crossentropy.
Change to:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
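An alternative worth noting (a sketch, not the asker's exact architecture): keep the integer labels produced by LabelEncoder and use sparse_categorical_crossentropy, which removes the to_categorical step entirely. The tiny model below is a hypothetical stand-in for the question's network; it just illustrates the loss/label pairing and that compile() must run before fit().

```python
import numpy as np
from tensorflow import keras

num_words, embed_dim, max_len = 10000, 256, 400

# Minimal stand-in model: embedding + pooling + softmax head.
model = keras.Sequential([
    keras.Input(shape=(max_len,)),
    keras.layers.Embedding(num_words, embed_dim),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(2, activation='softmax'),
])

# sparse_categorical_crossentropy accepts integer class labels directly,
# so the one-hot encoding step is unnecessary.
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

probs = model.predict(np.zeros((1, max_len)), verbose=0)
```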
Related
I'm trying to build and train an LSTM neural network.
Here is my code (summary version):
X_train, X_test, y_train, y_test = train_test_split(np.array(sequences), to_categorical(labels).astype(int), test_size=0.2)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2)
log_dir = os.path.join('Logs')
tb_callback = TensorBoard(log_dir=log_dir)
model = Sequential()
model.add(LSTM(64, return_sequences=True, activation='tanh', input_shape=(60,1662)))
model.add(LSTM(128, return_sequences=True, activation='tanh', dropout=0.31))
model.add(LSTM(64, return_sequences=False, activation='tanh'))
model.add(Dense(32, activation='relu'))
model.add(Dense(len(actions), activation='softmax'))
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
val_dataset = tf.data.Dataset.from_tensor_slices((X_val, y_val)) # default slice percentage check
val_dataset = val_dataset.batch(256)
model.fit(X_train, y_train, batch_size=256, epochs=250, callbacks=[tb_callback], validation_data=val_dataset)
And model fit result:
Epoch 248/250
8/8 [==============================] - 2s 252ms/step - loss: 0.4563 - categorical_accuracy: 0.8641 - val_loss: 2.1406 - val_categorical_accuracy: 0.6104
Epoch 249/250
8/8 [==============================] - 2s 255ms/step - loss: 0.4542 - categorical_accuracy: 0.8672 - val_loss: 2.2365 - val_categorical_accuracy: 0.5667
Epoch 250/250
8/8 [==============================] - 2s 234ms/step - loss: 0.4865 - categorical_accuracy: 0.8562 - val_loss: 2.1668 - val_categorical_accuracy: 0.5875
I want to reduce the gap between categorical_accuracy and val_categorical_accuracy.
Can you tell me how to do it?
Thank you for reading my question.
When there is such a large difference between your training and validation metrics, your model is overfitting.
So look at how to prevent overfitting. Usually the first thing to try is adding more data to your dataset.
It won't work every time, but training with more data can help the model detect the signal better.
Another option is to stop training before the model overfits (early stopping),
and to reduce the learning rate when progress stalls.
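Both suggestions (stopping before overfitting sets in, and reducing the learning rate) can be wired up with standard Keras callbacks. A sketch, assuming the model, tb_callback, and val_dataset from the question:

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Stop training when validation loss stops improving, and keep the best weights.
early_stop = EarlyStopping(monitor='val_loss', patience=10,
                           restore_best_weights=True)

# Halve the learning rate when validation loss plateaus.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5,
                              min_lr=1e-5)

# model.fit(X_train, y_train, batch_size=256, epochs=250,
#           validation_data=val_dataset,
#           callbacks=[tb_callback, early_stop, reduce_lr])
```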
I am trying to train DenseNet121 (among other models) in tensorflow/keras and I need to keep track of accuracy and val_accuracy. However, running this does not log the val_accuracy in the model's history:
clf_model = tf.keras.models.Sequential()
clf_model.add(pre_trained_model)
clf_model.add(tf.keras.layers.Dense(num_classes, activation='softmax'))
clf_model.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = clf_model.fit(train_processed, epochs=10, validation_data=validation_processed)
output (no val_accuracy, I need val_accuracy):
Epoch 1/10
1192/1192 [==============================] - 75s 45ms/step - loss: 2.3908 - accuracy: 0.4374
Epoch 2/10
451/1192 [==========>...................] - ETA: 22s - loss: 1.3556 - accuracy: 0.6217
When I tried to pass val_accuracy to the metrics as follows:
clf_model = tf.keras.models.Sequential()
clf_model.add(pre_trained_model)
clf_model.add(tf.keras.layers.Dense(num_classes, activation='softmax'))
clf_model.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy','val_accuracy'])
history = clf_model.fit(train_processed, epochs=10, validation_data=validation_processed)
I get the following error:
ValueError: Unknown metric function: val_accuracy. Please ensure this object is passed to the `custom_objects` argument.
See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.
Any idea what I am doing wrong?
Update
It turned out the validation dataset was empty.
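Given that update, a quick sanity check before training can catch an empty validation split. A sketch, assuming validation_processed is a tf.data.Dataset (the dataset built here is a hypothetical stand-in for the question's pipeline):

```python
import tensorflow as tf

# Hypothetical stand-in for the question's validation_processed pipeline:
# 8 samples, batched by 4.
validation_processed = tf.data.Dataset.from_tensor_slices(
    (tf.zeros((8, 4)), tf.zeros((8,), dtype=tf.int32))
).batch(4)

# cardinality() reports the number of batches (or UNKNOWN/INFINITE sentinels).
# If it is 0, fit() has nothing to validate on and no val_* metrics appear.
n_batches = int(tf.data.experimental.cardinality(validation_processed).numpy())
assert n_batches != 0, "validation dataset is empty; val_* metrics will be missing"
```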
I want to train my model with different batch sizes, e.g. [128, 256].
I am doing it with a for loop like below:
epochs = 2
batch_sizes = [128, 256]
for i in range(len(batch_sizes)):
    history = model.fit(x_train, y_train, batch_sizes[i], epochs=epochs,
                        callbacks=[early_stopping, chk], validation_data=(x_test, y_test))
For the above code my model produces the following results:
Epoch 1/2
311/311 [==============================] - 157s 494ms/step - loss: 0.2318 -
f1: 0.0723
Epoch 2/2
311/311 [==============================] - 152s 488ms/step - loss: 0.1402 -
f1: 0.4360
Epoch 1/2
156/156 [==============================] - 137s 877ms/step - loss: 0.1197 -
f1: **0.5450**
Epoch 2/2
156/156 [==============================] - 136s 871ms/step - loss: 0.1132 -
f1: 0.5756
It looks like the model continues training after completing the run for the first batch size; I want the model trained from scratch for each batch size. How can I do it?
P.S. What I have tried:
epochs = 2
batch_sizes = [128, 256]
for i in range(len(batch_sizes)):
    history = model.fit(x_train, y_train, batch_sizes[i], epochs=epochs,
                        callbacks=[early_stopping, chk], validation_data=(x_test, y_test))
    keras.backend.clear_session()
It also did not work.
You can write a function that defines the model, and call it before each subsequent fit call. The weights held in model are updated during training and stay that way after fit returns; that is why you need to redefine the model. This can help you:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np

X = np.random.rand(1000, 5)
Y = np.random.rand(1000, 1)

def build_model():
    model = Sequential()
    model.add(Dense(64, input_shape=(X.shape[1],)))
    model.add(Dense(Y.shape[1]))
    model.compile(loss='mse', optimizer='Adam')
    return model

epoch = 2
batch_sizes = [128, 256]
for i in range(len(batch_sizes)):
    model = build_model()
    history = model.fit(X, Y, batch_sizes[i], epochs=epoch, verbose=2)
    model.save('Model_' + str(batch_sizes[i]) + '.h5')
Then, the output looks like:
Epoch 1/2
8/8 - 0s - loss: 0.3164
Epoch 2/2
8/8 - 0s - loss: 0.1367
Epoch 1/2
4/4 - 0s - loss: 0.7221
Epoch 2/2
4/4 - 0s - loss: 0.4787
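A variation on the same idea, assuming the toy setup above: instead of rebuilding the model each iteration, snapshot the freshly initialized weights once and restore them before each run, so every batch size trains from scratch.

```python
import numpy as np
from tensorflow import keras

X = np.random.rand(1000, 5)
Y = np.random.rand(1000, 1)

model = keras.Sequential([
    keras.Input(shape=(X.shape[1],)),
    keras.layers.Dense(64),
    keras.layers.Dense(Y.shape[1]),
])
model.compile(loss='mse', optimizer='adam')
initial_weights = model.get_weights()  # snapshot of the untrained weights

for batch_size in [128, 256]:
    model.set_weights(initial_weights)  # reset to scratch for this run
    model.fit(X, Y, batch_size=batch_size, epochs=2, verbose=0)
```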
I have created a CNN model for classifying text data. Please help me interpret my results and tell me why my training accuracy is lower than my validation accuracy.
I have a total of 2619 samples, all text, split into two classes. Here is a sample of my dataset.
The validation set has 34 samples; the remaining 2585 are training data.
I have done RepeatedKFold cross-validation. Here is my code:
from sklearn.model_selection import RepeatedKFold

kf = RepeatedKFold(n_splits=75, n_repeats=1, random_state=42)
for train_index, test_index in kf.split(X, Y):
    # print("Train:", train_index, "Validation:", test_index)
    x_train, x_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = Y.iloc[train_index], Y.iloc[test_index]
I have used CNN. Here is my model.
model = Sequential()
model.add(Embedding(2900, 2, input_length=1))
model.add(Conv1D(filters=2, kernel_size=3, kernel_regularizer=l2(0.0005), bias_regularizer=l2(0.0005), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.3))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1, kernel_regularizer=l2(0.0005), bias_regularizer=l2(0.0005), activation='sigmoid'))
model.add(Dropout(0.25))

adam = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
print(model.summary())

history = model.fit(x_train, y_train, epochs=300, validation_data=(x_test, y_test), batch_size=128, shuffle=False)

# Final evaluation of the model
scores = model.evaluate(x_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1] * 100))
And here is the result.
Epoch 295/300
2585/2585 [==============================] - 0s 20us/step - loss: 1.6920 - acc: 0.7528 - val_loss: 0.5839 - val_acc: 0.8235
Epoch 296/300
2585/2585 [==============================] - 0s 20us/step - loss: 1.6532 - acc: 0.7617 - val_loss: 0.5836 - val_acc: 0.8235
Epoch 297/300
2585/2585 [==============================] - 0s 27us/step - loss: 1.5328 - acc: 0.7551 - val_loss: 0.5954 - val_acc: 0.8235
Epoch 298/300
2585/2585 [==============================] - 0s 20us/step - loss: 1.6289 - acc: 0.7524 - val_loss: 0.5897 - val_acc: 0.8235
Epoch 299/300
2585/2585 [==============================] - 0s 21us/step - loss: 1.7000 - acc: 0.7582 - val_loss: 0.5854 - val_acc: 0.8235
Epoch 300/300
2585/2585 [==============================] - 0s 25us/step - loss: 1.5475 - acc: 0.7451 - val_loss: 0.5934 - val_acc: 0.8235
Accuracy: 82.35%
Please help me with my problem. Thank you.
You may have too much regularization for your model, causing it to underfit your data.
A good way to start is with no regularization at all (no dropout, no weight decay, ...) and check whether it overfits:
If not, regularization is useless.
If it overfits, add regularization little by little: start with a small dropout rate / weight decay, then increase it if the model continues to overfit.
Moreover, don't put Dropout as the final layer, and don't put two Dropout layers in succession.
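Following that advice, a stripped-down starting point for the model above might look like this sketch: no dropout, no weight decay, and nothing after the sigmoid output. Layer sizes are kept from the question; everything else is an assumption.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(1,)),
    keras.layers.Embedding(2900, 2),
    keras.layers.Conv1D(filters=2, kernel_size=3, padding='same', activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Flatten(),
    # The sigmoid output is the last layer: no Dropout after it.
    keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy',
              optimizer=keras.optimizers.Adam(learning_rate=0.0005),
              metrics=['accuracy'])
```

If this bare model overfits, regularization can then be reintroduced one piece at a time, as the answer suggests.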
Your training accuracy is lower than your validation accuracy likely because of dropout: it "turns off" some neurons during training to prevent overfitting. During validation dropout is disabled, so the network uses all of its neurons, which (in this particular case) leads to more accurate predictions.
In general, I agree with the advice from Thibault Bacqueyrisses, and want to add that it's also usually bad practice to put dropout before batch normalization (though that is not the issue in this particular case).
I am playing around with custom loss functions on Keras models. My "custom" loss seems to fail (in terms of accuracy score), even though it is only a wrapper that returns an original Keras loss.
As a toy example, I am using the "Basic classification" TensorFlow/Keras tutorial, which uses a simple NN on the Fashion-MNIST data set, and I am following the related Keras documentation and this SO post.
This is the model:
model = keras.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(128, activation='relu'),
keras.layers.Dense(10, activation='softmax')
])
Now, if I leave sparse_categorical_crossentropy as a string argument in the compile() function, training reaches ~87% accuracy, which is fine:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=10)
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('\nTest accuracy:', test_acc)
But when I just create a trivial wrapper function that calls Keras' cross-entropy, I get ~10% accuracy on both the training and test sets:
from tensorflow.keras import losses

def my_loss(y_true, y_pred):
    return losses.sparse_categorical_crossentropy(y_true, y_pred)

model.compile(optimizer='adam',
              loss=my_loss,
              metrics=['accuracy'])
Epoch 1/10
60000/60000 [==============================] - 3s 51us/sample - loss: 0.5030 - accuracy: 0.1032
Epoch 2/10
60000/60000 [==============================] - 3s 45us/sample - loss: 0.3766 - accuracy: 0.1035
...
Test accuracy: 0.1013
By plotting a few images and checking their predicted labels, the results do not look different in either case, yet the printed accuracies are very different. So, do the default metrics not play nicely with custom losses? Could what I am seeing be the error rate rather than the accuracy? Am I missing something in the documentation?
Edit: The loss values in both cases end up roughly the same, so training does take place. The accuracy is the point of failure.
Here's the reason:
When you use the built-in loss via the string loss='sparse_categorical_crossentropy', the 'accuracy' metric is resolved to sparse_categorical_accuracy. But when you use a custom loss function, 'accuracy' falls back to categorical_accuracy.
Example:
model.compile(optimizer='adam',
              loss=losses.sparse_categorical_crossentropy,
              metrics=['categorical_accuracy', 'sparse_categorical_accuracy'])
model.fit(train_images, train_labels, epochs=1)
'''
Train on 60000 samples
60000/60000 [==============================] - 5s 86us/sample - loss: 0.4955 - categorical_accuracy: 0.1045 - sparse_categorical_accuracy: 0.8255
'''

model.compile(optimizer='adam',
              loss=my_loss,
              metrics=['accuracy', 'sparse_categorical_accuracy'])
model.fit(train_images, train_labels, epochs=1)
'''
Train on 60000 samples
60000/60000 [==============================] - 5s 87us/sample - loss: 0.4956 - acc: 0.1043 - sparse_categorical_accuracy: 0.8256
'''
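The practical takeaway from the answer above: when using a custom loss, name the metric explicitly instead of relying on the 'accuracy' alias. A sketch, rebuilding the tutorial model from the question:

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(28, 28)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),
])

def my_loss(y_true, y_pred):
    return keras.losses.sparse_categorical_crossentropy(y_true, y_pred)

# Naming the metric explicitly sidesteps the alias resolution entirely.
model.compile(optimizer='adam',
              loss=my_loss,
              metrics=['sparse_categorical_accuracy'])
```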