I was just following a TensorFlow example from the book Hands-On Machine Learning with Scikit-Learn and TensorFlow but got weird results.
The example:
import tensorflow as tf
from tensorflow import keras
tf.__version__
keras.__version__
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000] / 255.0, y_train_full[5000:] / 255.0
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
"Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
model = keras.models.Sequential([
keras.layers.Flatten(input_shape=[28, 28]),
keras.layers.Dense(300, activation="relu"),
keras.layers.Dense(100, activation="relu"),
keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy",
optimizer='sgd',
metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=50, validation_data=(X_valid, y_valid))
As the epochs progress, we should see the accuracy improve, as shown in the book:
Train on 55000 samples, validate on 5000 samples
Epoch 1/30
55000/55000 [==========] - 3s 55us/sample - loss: 1.4948 - acc: 0.5757 - val_loss: 1.0042 - val_acc: 0.7166
Epoch 2/30
55000/55000 [==========] - 3s 55us/sample - loss: 0.8690 - acc: 0.7318 - val_loss: 0.7549 - val_acc: 0.7616
[...]
Epoch 50/50
55000/55000 [==========] - 4s 72us/sample - loss: 0.3607 - acc: 0.8752 - val_loss: 0.3706 - val_acc: 0.8728
But when I ran it, I got the following:
Epoch 1/30
1719/1719 [==============================] - 3s 2ms/step - loss: 0.0623 - accuracy: 0.1005 - val_loss: 0.0011 - val_accuracy: 0.0914
Epoch 2/30
1719/1719 [==============================] - 3s 2ms/step - loss: 8.7637e-04 - accuracy: 0.1011 - val_loss: 5.2079e-04 - val_accuracy: 0.0914
Epoch 3/30
1719/1719 [==============================] - 3s 2ms/step - loss: 4.9200e-04 - accuracy: 0.1019 - val_loss: 3.4211e-04 - val_accuracy: 0.0914
[...]
Epoch 49/50
1719/1719 [==============================] - 3s 2ms/step - loss: 3.1710e-05 - accuracy: 0.0992 - val_loss: 3.2966e-05 - val_accuracy: 0.0914
Epoch 50/50
1719/1719 [==============================] - 3s 2ms/step - loss: 2.7711e-05 - accuracy: 0.1022 - val_loss: 3.1833e-05 - val_accuracy: 0.0914
So, as you can see, my run reached a much lower accuracy that never improved: the validation accuracy stayed at 0.0914 instead of 0.8728.
Is there something wrong in my TensorFlow installation, setup or even in the code?
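You should not divide the labels by 255: the line y_valid, y_train = y_train_full[:5000] / 255.0, y_train_full[5000:] / 255.0 turns the integer class ids into fractions. Only the pixel values need scaling. The corrected code is as follows: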
import tensorflow as tf
from tensorflow import keras
tf.__version__
keras.__version__
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
X_train_full = X_train_full / 255.0
X_test = X_test / 255.0
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
"Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
model = keras.models.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(128, activation="relu"),
keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy",
optimizer='sgd',
metrics=['accuracy'])
history = model.fit(X_train_full, y_train_full, epochs=5, validation_data=(X_test, y_test))
It gives accuracy like this:
Epoch 1/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.9880 - accuracy: 0.6923 - val_loss: 0.5710 - val_accuracy: 0.8054
Epoch 2/5
1875/1875 [==============================] - 2s 944us/step - loss: 0.5281 - accuracy: 0.8227 - val_loss: 0.5112 - val_accuracy: 0.8228
Epoch 3/5
1875/1875 [==============================] - 2s 913us/step - loss: 0.4720 - accuracy: 0.8391 - val_loss: 0.4782 - val_accuracy: 0.8345
Epoch 4/5
1875/1875 [==============================] - 2s 915us/step - loss: 0.4492 - accuracy: 0.8462 - val_loss: 0.4568 - val_accuracy: 0.8410
Epoch 5/5
1875/1875 [==============================] - 2s 935us/step - loss: 0.4212 - accuracy: 0.8550 - val_loss: 0.4469 - val_accuracy: 0.8444
Also, the adam optimizer may give better results than sgd.
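To see why scaling the labels breaks training, here is a small sketch in plain Python (no TensorFlow needed). It rests on a simplifying assumption about Keras internals: that the sparse loss truncates each float label to an integer class index, while the accuracy metric compares the predicted index with the raw float label. The variable names are made up for illustration.

```python
# Fashion-MNIST labels are integer class ids 0..9.
labels = list(range(10))

# The mistake from the question: scaling labels like pixel values.
scaled = [label / 255.0 for label in labels]

# Assumption: the sparse loss truncates each label to an integer index.
as_class_index = [int(value) for value in scaled]
print(as_class_index)  # every label collapses to class 0

# A model that always predicts class 0 can then drive the loss toward zero,
# but the raw float labels equal 0 only for the original class 0.
predicted = 0
matches = sum(1 for value in scaled if value == predicted)
accuracy = matches / len(scaled)
print(accuracy)  # 0.1 when the ten classes are equally frequent
```

Under that assumption the pattern in the question's log (loss shrinking toward zero while accuracy sits near 0.10) is what one would expect.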
Related
I am learning Python deep learning tools from the official TensorFlow website.
I am trying to build several text-classification networks by following the tutorials, but the LSTM does not work as expected.
import numpy as np
import tensorflow_datasets as tfds
import tensorflow as tf
from tensorflow.keras import utils
from tensorflow.keras import losses
import matplotlib.pyplot as plt
seed = 42
BATCH_SIZE = 64
train_ds = utils.text_dataset_from_directory(
'stack_overflow_16k/train',
validation_split=0.2,
subset='training',
batch_size=BATCH_SIZE,
seed=seed)
val_ds = utils.text_dataset_from_directory(
'stack_overflow_16k/train',
validation_split=0.2,
subset='validation',
batch_size=BATCH_SIZE,
seed=seed)
test_ds = utils.text_dataset_from_directory(
'stack_overflow_16k/test',
batch_size=BATCH_SIZE)
class_names = train_ds.class_names
train_ds = train_ds.prefetch(buffer_size=tf.data.AUTOTUNE)
val_ds = val_ds.prefetch(buffer_size=tf.data.AUTOTUNE)
test_ds = test_ds.prefetch(buffer_size=tf.data.AUTOTUNE)
VOCAB_SIZE = 1000
MAX_SEQUENCE_LENGTH = 500
encoder = tf.keras.layers.TextVectorization(
max_tokens=VOCAB_SIZE,
output_sequence_length=MAX_SEQUENCE_LENGTH)
encoder.adapt(train_ds.map(lambda text, label: text))
model = tf.keras.Sequential([
encoder,
tf.keras.layers.Embedding(VOCAB_SIZE, 64, mask_zero=True),
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
# tf.keras.layers.LSTM(128),
# tf.keras.layers.Dense(64, activation='relu'),
# tf.keras.layers.Conv1D(64, 5, padding="valid", activation="relu", strides=2),
# tf.keras.layers.GlobalMaxPooling1D(),
# tf.keras.layers.GRU(64),
# tf.keras.layers.SimpleRNN(64),
# tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(4)
])
model.summary()
model.compile(loss='sparse_categorical_crossentropy',
optimizer='sgd',
metrics=['accuracy'])
history = model.fit(train_ds, epochs=10,
validation_data=val_ds)
This is my complete code; the core part is the same as in the tutorials.
But the training output is as follows:
Epoch 1/10
100/100 [==============================] - 33s 273ms/step - loss: 9.6882 - accuracy: 0.2562 - val_loss: 11.9475 - val_accuracy: 0.2587
Epoch 2/10
100/100 [==============================] - 25s 250ms/step - loss: 12.1238 - accuracy: 0.2478 - val_loss: 11.9475 - val_accuracy: 0.2587
Epoch 3/10
100/100 [==============================] - 25s 252ms/step - loss: 12.1238 - accuracy: 0.2478 - val_loss: 11.9475 - val_accuracy: 0.2587
Epoch 4/10
100/100 [==============================] - 25s 254ms/step - loss: 12.1238 - accuracy: 0.2478 - val_loss: 11.9475 - val_accuracy: 0.2587
Epoch 5/10
100/100 [==============================] - 25s 255ms/step - loss: 12.1238 - accuracy: 0.2478 - val_loss: 11.9475 - val_accuracy: 0.2587
Epoch 6/10
100/100 [==============================] - 26s 256ms/step - loss: 12.1238 - accuracy: 0.2478 - val_loss: 11.9475 - val_accuracy: 0.2587
Epoch 7/10
100/100 [==============================] - 26s 257ms/step - loss: 12.1238 - accuracy: 0.2478 - val_loss: 11.9475 - val_accuracy: 0.2587
Epoch 8/10
100/100 [==============================] - 26s 258ms/step - loss: 12.1238 - accuracy: 0.2478 - val_loss: 11.9475 - val_accuracy: 0.2587
Epoch 9/10
100/100 [==============================] - 26s 258ms/step - loss: 12.1238 - accuracy: 0.2478 - val_loss: 11.9475 - val_accuracy: 0.2587
Epoch 10/10
100/100 [==============================] - 26s 256ms/step - loss: 12.1238 - accuracy: 0.2478 - val_loss: 11.9475 - val_accuracy: 0.2587
The accuracy does not increase and the loss does not decrease at all, as if the model were not being trained. The accuracy also equals the reciprocal of the number of classes (e.g. for a binary classification problem it stays around 0.5, and for a four-class problem around 0.25).
Later I compared with a CNN by swapping the LSTM layers for CNN layers as in the tutorials, and it works as expected (same datasets, same parameters for model.compile() and model.fit()).
I also tried a GRU; the same problem occurs.
I don't get it.
Am I missing some configuration for RNN-like models? Can somebody help me with this problem? Thanks!
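As a sanity check on the numbers above, this sketch computes the cross-entropy of a classifier that spreads its probability evenly over K classes, which is ln(K). The function name is illustrative only and nothing here comes from the tutorial.

```python
import math

def uniform_guess_loss(num_classes):
    """Cross-entropy of a model that assigns probability 1/K to every class."""
    return -math.log(1.0 / num_classes)

# Chance-level loss for a few class counts.
for k in (2, 4, 10):
    print(k, round(uniform_guess_loss(k), 4))
```

For four classes this is about 1.39, so a loss stuck near 12 at 25% accuracy points to saturated or numerically broken outputs, not just an uninformed model.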
P.S.
tensorflow-macos 2.9
tensorflow-metal 0.5.0
Chip: Apple M1
Datasets: https://storage.googleapis.com/download.tensorflow.org/data/stack_overflow_16k.tar.gz Same as tutorials.
I tried configuring the optimizer (sgd, adam) and the learning rate; it did not help. It does not look like overfitting.
Methods I tried:
Keras accuracy does not change
Update 2023-01-30
I ran the same code on my Linux server and it worked as expected. It may be a bug in tensorflow-macos.
Update 2023-02-01
I tried the official version of TensorFlow for macOS on M1, with just
conda install tensorflow
and it works. The problem is presumably in the GPU support of tensorflow-macos. I also tried again with tensorflow-macos using the CPU only, and it works.
Conclusion:
RNN-like models have problems on tensorflow-macos with the GPU.
I use the Keras API to train a CNN on Cifar10.
Here is my code :
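import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense
from tensorflow.keras.preprocessing.image import ImageDataGenerator
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()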
conv_network = Input(shape=(32, 32, 3), name="img")
x = Conv2D(filters=32, kernel_size=(3,3), strides=2, activation="relu")(conv_network)
x = Conv2D(filters=64, kernel_size=(3,3), strides=2, activation="relu")(x)
x = Conv2D(filters=128, kernel_size=(3,3), strides=2, activation="relu")(x)
x = Flatten()(x)
x = Dense(1024, activation='relu')(x)
output = Dense(10, activation='softmax')(x)
model = tf.keras.Model(conv_network, output, name="convolutional_network")
model.compile(loss='sparse_categorical_crossentropy',optimizer='Adam', metrics=['accuracy'])
I train my model using the following :
r = model.fit(x_train, y_train, epochs=25,validation_data=(x_test, y_test))
It trains successfully :
Epoch 1/25
1563/1563 [==============================] - 7s 4ms/step - loss: 1.7196 - accuracy: 0.4259 - val_loss: 1.3780 - val_accuracy: 0.5105
Epoch 2/25
1563/1563 [==============================] - 6s 4ms/step - loss: 1.2711 - accuracy: 0.5519 - val_loss: 1.2598 - val_accuracy: 0.5600
Epoch 3/25
1563/1563 [==============================] - 7s 4ms/step - loss: 1.1004 - accuracy: 0.6137 - val_loss: 1.2390 - val_accuracy: 0.5776
Epoch 4/25
1563/1563 [==============================] - 7s 4ms/step - loss: 0.9520 - accuracy: 0.6678 - val_loss: 1.2774 - val_accuracy: 0.5767
Epoch 5/25
1563/1563 [==============================] - 7s 4ms/step - loss: 0.7858 - accuracy: 0.7257 - val_loss: 1.3226 - val_accuracy: 0.5921
Epoch 6/25
1563/1563 [==============================] - 6s 4ms/step - loss: 0.6334 - accuracy: 0.7791 - val_loss: 1.5789 - val_accuracy: 0.5586
Epoch 7/25
1563/1563 [==============================] - 6s 4ms/step - loss: 0.5178 - accuracy: 0.8227 - val_loss: 1.7296 - val_accuracy: 0.5730
Epoch 8/25
1563/1563 [==============================] - 6s 4ms/step - loss: 0.4163 - accuracy: 0.8589 - val_loss: 2.0499 - val_accuracy: 0.5682
Epoch 9/25
1563/1563 [==============================] - 6s 4ms/step - loss: 0.3794 - accuracy: 0.8739 - val_loss: 2.0991 - val_accuracy: 0.5820
Epoch 10/25
1563/1563 [==============================] - 7s 4ms/step - loss: 0.3453 - accuracy: 0.8901 - val_loss: 2.3261 - val_accuracy: 0.5697
Now, when I train with an ImageDataGenerator that doesn't do any kind of augmentation, the predictions are random and the model doesn't train at all:
datagen = ImageDataGenerator()
model.fit(datagen.flow(x_train, y_train, batch_size=32),
steps_per_epoch=50000 / 32,
epochs=10)
Results in :
Epoch 1/10
1562/1562 [==============================] - 7s 4ms/step - loss: 1.6822 - accuracy: 0.1010
Epoch 2/10
1562/1562 [==============================] - 7s 4ms/step - loss: 1.2881 - accuracy: 0.0982
Epoch 3/10
1562/1562 [==============================] - 7s 4ms/step - loss: 1.1302 - accuracy: 0.0987
Epoch 4/10
1562/1562 [==============================] - 7s 4ms/step - loss: 0.9817 - accuracy: 0.1001
Epoch 5/10
1562/1562 [==============================] - 7s 4ms/step - loss: 0.8215 - accuracy: 0.1011
Epoch 6/10
1562/1562 [==============================] - 7s 4ms/step - loss: 0.6760 - accuracy: 0.1000
Epoch 7/10
1562/1562 [==============================] - 7s 4ms/step - loss: 0.5445 - accuracy: 0.1005
Epoch 8/10
1562/1562 [==============================] - 7s 4ms/step - loss: 0.4660 - accuracy: 0.1006
Epoch 9/10
1562/1562 [==============================] - 7s 4ms/step - loss: 0.4048 - accuracy: 0.1002
Epoch 10/10
1562/1562 [==============================] - 7s 4ms/step - loss: 0.3641 - accuracy: 0.1006
What am I doing wrong here ?
I found a solution after trial and error but I still don't fully understand why my previous code didn't work.
conv_network = Input(shape=(32, 32, 3), name="img")
x = Conv2D(filters=32, kernel_size=(3,3), strides=2, activation="relu")(conv_network)
x = Conv2D(filters=64, kernel_size=(3,3), strides=2, activation="relu")(x)
x = Conv2D(filters=128, kernel_size=(3,3), strides=2, activation="relu")(x)
x = Flatten()(x)
x = Dense(1024, activation='relu')(x)
output = Dense(10, activation='softmax')(x)
model = tf.keras.Model(conv_network, output, name="convolutional_network")
model.summary()
model.compile(loss='categorical_crossentropy',
optimizer='Adam', metrics=['accuracy'])
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)
datagen = ImageDataGenerator()
# fits the model on batches with real-time data augmentation:
model.fit(datagen.flow(x_train, y_train, batch_size=32),
validation_data=(x_test, y_test),
steps_per_epoch=len(x_train) / 32,
epochs=10)
What changed:
instead of using sparse_categorical_crossentropy, I used categorical_crossentropy
instead of training the network with the raw integer y values, I one-hot encoded them.
If someone has a clear explanation of why it works now, I would be glad to hear it.
Also, is there a way to successfully train the model without using one-hot encoding?
Thank you
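One possible explanation (an assumption, not confirmed anywhere in this thread): cifar10.load_data() returns labels with shape (50000, 1), and if a metric compares that column against a flat vector of predicted class ids, broadcasting turns the comparison into an N x N grid instead of N pairs. The sketch below emulates that effect in plain Python with synthetic labels; the function names are hypothetical.

```python
import random

random.seed(0)
num_classes = 10
n = 200

# Synthetic labels, and a "perfect" model whose predictions equal the labels.
labels = [random.randrange(num_classes) for _ in range(n)]
predictions = list(labels)

def elementwise_accuracy(y_true, y_pred):
    """Compare label i with prediction i (both treated as flat vectors)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def broadcast_accuracy(y_true_column, y_pred):
    """Emulate comparing shape (N, 1) with shape (N,): an N x N grid."""
    hits = sum(t == p for t in y_true_column for p in y_pred)
    return hits / (len(y_true_column) * len(y_pred))

print(elementwise_accuracy(labels, predictions))  # 1.0
print(broadcast_accuracy(labels, predictions))    # close to 1 / num_classes
```

If this is what happens, the loss would still decrease while the reported accuracy hovers near 0.10, as in the log above. Flattening the labels with y_train.flatten(), or naming the metric explicitly as 'sparse_categorical_accuracy', are two things worth trying to avoid one-hot encoding; neither is verified here.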
I want to perform binary classification of breast cancer histopathological images from the BreakHis dataset (https://www.kaggle.com/ambarish/breakhis) using transfer learning and Inception ResNet v2. The goal is to freeze all layers and train the fully connected layer by adding two neurons to the model. Initially I want to consider only the images at magnification factor 40X (Benign: 625, Malignant: 1370). Here is a summary of what I do:
I read the images and resize them to 150x150
I partition the dataset into training, validation and test set
I load the pre-trained network Inception Resnet v2
I freeze all the layers
I add the two neurons for binary classification (1 = "benign", 0 = "malignant")
I compile the model using Adam as the optimizer
I carry out the training
I make the prediction
I calculate the accuracy
This is the code:
data = dataset[dataset["Magnificant"]=="40X"]
def preprocessing(dataset, img_size):
# images
X = []
# labels
y = []
i = 0
for image in list(dataset["Path"]):
# Resize and read the images
X.append(cv2.resize(cv2.imread(image, cv2.IMREAD_COLOR),
(img_size, img_size), interpolation=cv2.INTER_CUBIC))
basename = os.path.basename(image)
# Get labels
if dataset.loc[i][2] == "benign":
y.append(1)
else:
y.append(0)
i = i+1
return X, y
X, y = preprocessing(data, 150)
X = np.array(X)
y = np.array(y)
# Splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify = y_40, shuffle=True, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=1)
conv_base = InceptionResNetV2(weights='imagenet', include_top=False, input_shape=[150, 150, 3])
# Freezing
for layer in conv_base.layers:
layer.trainable = False
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(1, activation='sigmoid'))
opt = tf.keras.optimizers.Adam(learning_rate=0.0002)
loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)
model.compile(loss=loss, optimizer=opt, metrics = ["accuracy", tf.metrics.AUC()])
batch_size = 32
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow(X_train, y_train, batch_size=batch_size)
val_generator = val_datagen.flow(X_val, y_val, batch_size=batch_size)
ntrain =len(X_train)
nval = len(X_val)
len(y_train)
epochs = 70
history = model.fit_generator(train_generator,
steps_per_epoch=ntrain // batch_size,
epochs=epochs,
validation_data=val_generator,
validation_steps=nval // batch_size)
This is the output of the training at the last epoch:
Epoch 70/70
32/32 [==============================] - 3s 84ms/step - loss: 0.0499 - accuracy: 0.9903 - auc_5: 0.9996 - val_loss: 0.5661 - val_accuracy: 0.8250 - val_auc_5: 0.8521
I make the prediction:
test_datagen = ImageDataGenerator(rescale=1./255)
x = X_test
y_pred = model.predict(test_datagen.flow(x))
y_p = []
for i in range(len(y_pred)):
if y_pred[i] > 0.5:
y_p.append(1)
else:
y_p.append(0)
I calculate the accuracy:
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_p)
print(accuracy)
This is the accuracy value I get: 0.5459098497495827
Why do I get such low accuracy? I have run several tests but always get similar results.
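One thing worth checking, offered as a hypothesis and not something established in this thread: ImageDataGenerator.flow() shuffles by default (shuffle=True), so model.predict(test_datagen.flow(x)) may return predictions in a different order than y_test. If labels and predictions are paired at random, the expected accuracy depends only on the class ratio. The sketch below computes it from the class counts quoted in the question (625 benign, 1370 malignant), assuming the predicted class frequencies match the true ones.

```python
import random

benign, malignant = 625, 1370
p = benign / (benign + malignant)

# Probability that a randomly paired label and prediction agree.
expected_accuracy = p * p + (1 - p) * (1 - p)
print(round(expected_accuracy, 4))  # about 0.57

# Simulation: perfect predictions, then shuffled out of alignment.
random.seed(1)
labels = [1] * benign + [0] * malignant
shuffled_predictions = labels[:]
random.shuffle(shuffled_predictions)
simulated = sum(a == b for a, b in zip(labels, shuffled_predictions)) / len(labels)
print(round(simulated, 4))
```

That is close to the 0.55 to 0.57 reported in this question, so passing shuffle=False to flow() at prediction time may be worth a try.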
Update
I have made the following changes but I still get the same results (I show only the modified parts of the code):
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify = y, shuffle=True, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, stratify = y_train, shuffle=True, random_state=1)
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))
callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)
ntrain =len(X_train)
nval = len(X_val)
len(y_train)
epochs = 70
history = model.fit_generator(train_generator,
steps_per_epoch=ntrain // batch_size,
epochs=epochs,
validation_data=val_generator,
validation_steps=nval // batch_size, callbacks=[callback])
Update 2
I also changed from_logits from True to False, but evidently that was not the problem. I always get 57% accuracy.
This is the model.fit output over 30 epochs:
Epoch 1/30
32/32 [==============================] - 23s 202ms/step - loss: 0.7994 - accuracy: 0.6010 - auc: 0.5272 - val_loss: 0.5338 - val_accuracy: 0.7688 - val_auc: 0.7943
Epoch 2/30
32/32 [==============================] - 3s 87ms/step - loss: 0.5778 - accuracy: 0.7206 - auc: 0.7521 - val_loss: 0.4763 - val_accuracy: 0.7781 - val_auc: 0.8155
Epoch 3/30
32/32 [==============================] - 3s 85ms/step - loss: 0.5311 - accuracy: 0.7581 - auc: 0.7710 - val_loss: 0.4740 - val_accuracy: 0.7719 - val_auc: 0.8212
Epoch 4/30
32/32 [==============================] - 3s 85ms/step - loss: 0.4684 - accuracy: 0.7718 - auc: 0.8219 - val_loss: 0.4270 - val_accuracy: 0.8031 - val_auc: 0.8611
Epoch 5/30
32/32 [==============================] - 3s 83ms/step - loss: 0.4280 - accuracy: 0.7943 - auc: 0.8617 - val_loss: 0.4496 - val_accuracy: 0.7969 - val_auc: 0.8468
Epoch 6/30
32/32 [==============================] - 3s 88ms/step - loss: 0.4237 - accuracy: 0.8250 - auc: 0.8673 - val_loss: 0.3993 - val_accuracy: 0.7937 - val_auc: 0.8840
Epoch 7/30
32/32 [==============================] - 3s 85ms/step - loss: 0.4130 - accuracy: 0.8513 - auc: 0.8767 - val_loss: 0.4207 - val_accuracy: 0.7781 - val_auc: 0.8692
Epoch 8/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3446 - accuracy: 0.8485 - auc: 0.9077 - val_loss: 0.4229 - val_accuracy: 0.7937 - val_auc: 0.8730
Epoch 9/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3690 - accuracy: 0.8514 - auc: 0.9003 - val_loss: 0.4300 - val_accuracy: 0.8062 - val_auc: 0.8696
Epoch 10/30
32/32 [==============================] - 3s 100ms/step - loss: 0.3204 - accuracy: 0.8533 - auc: 0.9270 - val_loss: 0.4235 - val_accuracy: 0.7969 - val_auc: 0.8731
Epoch 11/30
32/32 [==============================] - 3s 86ms/step - loss: 0.3555 - accuracy: 0.8508 - auc: 0.9124 - val_loss: 0.4124 - val_accuracy: 0.8000 - val_auc: 0.8797
Epoch 12/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3243 - accuracy: 0.8481 - auc: 0.9308 - val_loss: 0.3979 - val_accuracy: 0.7969 - val_auc: 0.8908
Epoch 13/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3017 - accuracy: 0.8744 - auc: 0.9348 - val_loss: 0.4239 - val_accuracy: 0.8094 - val_auc: 0.8758
Epoch 14/30
32/32 [==============================] - 3s 89ms/step - loss: 0.3317 - accuracy: 0.8521 - auc: 0.9221 - val_loss: 0.4238 - val_accuracy: 0.8094 - val_auc: 0.8704
Epoch 15/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2840 - accuracy: 0.8908 - auc: 0.9490 - val_loss: 0.4131 - val_accuracy: 0.8281 - val_auc: 0.8858
Epoch 16/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2583 - accuracy: 0.8905 - auc: 0.9511 - val_loss: 0.3841 - val_accuracy: 0.8375 - val_auc: 0.9007
Epoch 17/30
32/32 [==============================] - 3s 87ms/step - loss: 0.2810 - accuracy: 0.8648 - auc: 0.9470 - val_loss: 0.3928 - val_accuracy: 0.8438 - val_auc: 0.8972
Epoch 18/30
32/32 [==============================] - 3s 89ms/step - loss: 0.2622 - accuracy: 0.8923 - auc: 0.9550 - val_loss: 0.3732 - val_accuracy: 0.8438 - val_auc: 0.9089
Epoch 19/30
32/32 [==============================] - 3s 84ms/step - loss: 0.2486 - accuracy: 0.8990 - auc: 0.9579 - val_loss: 0.4077 - val_accuracy: 0.8250 - val_auc: 0.8924
Epoch 20/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2412 - accuracy: 0.9074 - auc: 0.9635 - val_loss: 0.4249 - val_accuracy: 0.8219 - val_auc: 0.8787
Epoch 21/30
32/32 [==============================] - 3s 84ms/step - loss: 0.2386 - accuracy: 0.9095 - auc: 0.9657 - val_loss: 0.4177 - val_accuracy: 0.8094 - val_auc: 0.8904
Epoch 22/30
32/32 [==============================] - 3s 99ms/step - loss: 0.2313 - accuracy: 0.8996 - auc: 0.9668 - val_loss: 0.4089 - val_accuracy: 0.8406 - val_auc: 0.8890
Epoch 23/30
32/32 [==============================] - 3s 86ms/step - loss: 0.2424 - accuracy: 0.9067 - auc: 0.9654 - val_loss: 0.4033 - val_accuracy: 0.8500 - val_auc: 0.8953
Epoch 24/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2315 - accuracy: 0.9045 - auc: 0.9626 - val_loss: 0.3903 - val_accuracy: 0.8250 - val_auc: 0.9030
Epoch 25/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2001 - accuracy: 0.9321 - auc: 0.9788 - val_loss: 0.4276 - val_accuracy: 0.8000 - val_auc: 0.8855
Epoch 26/30
32/32 [==============================] - 3s 87ms/step - loss: 0.2118 - accuracy: 0.9212 - auc: 0.9695 - val_loss: 0.4335 - val_accuracy: 0.8125 - val_auc: 0.8897
Epoch 27/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2463 - accuracy: 0.8941 - auc: 0.9665 - val_loss: 0.4112 - val_accuracy: 0.8438 - val_auc: 0.8882
Epoch 28/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2130 - accuracy: 0.9033 - auc: 0.9771 - val_loss: 0.3834 - val_accuracy: 0.8406 - val_auc: 0.9021
Epoch 29/30
32/32 [==============================] - 3s 86ms/step - loss: 0.2021 - accuracy: 0.9229 - auc: 0.9754 - val_loss: 0.3855 - val_accuracy: 0.8469 - val_auc: 0.9008
Epoch 30/30
32/32 [==============================] - 3s 88ms/step - loss: 0.1859 - accuracy: 0.9314 - auc: 0.9824 - val_loss: 0.4018 - val_accuracy: 0.8375 - val_auc: 0.8928
You have to change from_logits=True to from_logits=False in your loss function. Credit once again to Frightera.
It seems your model is overfitting; it would be best to check for that.
Run 10-fold cross-validation. It gives a more reliable estimate of the real performance.
Add the F1 score to your metrics. It reflects the true positives relative to both false positives and false negatives.
Add some augmentations (apart from the rescaling) to make the model robust to changes in the dataset.
Tweak the training parameters (if you feel it is needed).
If these changes fail, the model may be unable to learn the relevant features of the images. In that case you should try a different model.
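The from_logits point can be illustrated without TensorFlow. With activation='sigmoid' the last layer already outputs probabilities, and from_logits=True asks the loss to apply a sigmoid a second time. This is a minimal sketch with a hand-written binary cross-entropy; the function names are illustrative and are not Keras API.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_probability(y_true, p):
    """Binary cross-entropy when p is already a probability."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def bce_from_logit(y_true, z):
    """Binary cross-entropy when z is a raw logit (sigmoid applied here)."""
    return bce_from_probability(y_true, sigmoid(z))

p = 0.99  # confident, correct probability coming out of a sigmoid layer
y = 1

correct = bce_from_probability(y, p)
# The mistake: treating the probability 0.99 as if it were a logit.
double_sigmoid = bce_from_logit(y, p)

print(round(correct, 4))
print(round(double_sigmoid, 4))
```

Because a probability lies between 0 and 1, the second sigmoid squashes it into roughly 0.5 to 0.73, so the loss on a positive example cannot fall below about 0.31 however confident the model is.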
I'm new to machine learning and I'm building an RNN classifier for a problem similar to Named Entity Recognition (NER) but with only two tags.
I followed a tutorial to build the classifier, and now when fitting the model, I get a constant validation accuracy for all the epochs, and some part of me thinks this may be a mistake. Is it normal to have a constant val_accuracy?
This is my model:
input = Input(shape=(66,))
word_embedding_size = 66
model = Embedding(input_dim=n_words, output_dim=word_embedding_size, input_length=66)(input)
model = Bidirectional(LSTM(units=word_embedding_size,
return_sequences=True,
dropout=0.5,
recurrent_dropout=0.5,
kernel_initializer=k.initializers.he_normal()))(model)
model = LSTM(units=word_embedding_size * 2,
return_sequences=True,
dropout=0.5,
recurrent_dropout=0.5,
kernel_initializer=k.initializers.he_normal())(model)
model = TimeDistributed(Dense(n_tags, activation="sigmoid"))(model)
out = model
model = Model(input, out)
adam = k.optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
history = model.fit(X, np.array(Y), batch_size=256, epochs=10, validation_split=0.3, verbose=1)
And this is how the epochs look:
Epoch 1/10
2/2 [==============================] - 2s 801ms/step - loss: 0.6990 - accuracy: 0.3123 - val_loss: 0.5732 - val_accuracy: 0.9675
Epoch 2/10
2/2 [==============================] - 1s 334ms/step - loss: 0.5552 - accuracy: 0.9713 - val_loss: 0.4202 - val_accuracy: 0.9675
Epoch 3/10
2/2 [==============================] - 1s 310ms/step - loss: 0.3997 - accuracy: 0.9723 - val_loss: 0.2377 - val_accuracy: 0.9675
Epoch 4/10
2/2 [==============================] - 1s 303ms/step - loss: 0.2260 - accuracy: 0.9723 - val_loss: 0.1168 - val_accuracy: 0.9675
Epoch 5/10
2/2 [==============================] - 1s 312ms/step - loss: 0.1126 - accuracy: 0.9723 - val_loss: 0.0851 - val_accuracy: 0.9675
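A constant val_accuracy is not necessarily a bug. One possible reading, which is an assumption because the tag distribution is not shown: if one tag (or padding) dominates the sequences, a model that always predicts it scores the same accuracy every epoch. The sketch below uses made-up tag counts purely for illustration.

```python
from collections import Counter

# Synthetic token tags: 1 marks the rare tag, 0 everything else (made-up counts).
tags = [0] * 970 + [1] * 30

def majority_baseline_accuracy(y):
    """Accuracy of always predicting the most frequent tag."""
    most_common_count = Counter(y).most_common(1)[0][1]
    return most_common_count / len(y)

print(majority_baseline_accuracy(tags))  # 0.97
```

Comparing the model against this baseline, or reporting per-tag precision and recall, shows whether it has learned anything beyond the majority tag.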
I have written a CNN that takes in MFCC spectrograms and is meant to classify the images into five different classes. I trained the model for 30 epochs and after the first epoch, no metrics change. Could it be a problem with imbalanced classification, and if so, how would I bias the model for the dataset, if possible? Below is the data generator code, the model definition, and the outputs. The original model had two additional layers; however, I started tweaking things while trying to troubleshoot the issue.
Data Generator Definition:
path = 'path_to_dataset'
CLASS_NAMES = ['belly_pain', 'burping', 'discomfort', 'hungry', 'tired']
CLASS_NAMES = np.array(CLASS_NAMES)
BATCH_SIZE = 32
IMG_HEIGHT = 150
IMG_WIDTH = 150
# 457 is the number of images total
STEPS_PER_EPOCH = np.ceil(457/BATCH_SIZE)
img_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, validation_split=0.2, horizontal_flip=True, rotation_range=45, width_shift_range=.15, height_shift_range=.15)
train_data_gen = img_generator.flow_from_directory( directory=path, batch_size=BATCH_SIZE, shuffle=True, target_size=(IMG_HEIGHT, IMG_WIDTH), classes = list(CLASS_NAMES), subset='training', class_mode='categorical')
validation_data_gen = img_generator.flow_from_directory( directory=path, batch_size=BATCH_SIZE, shuffle=True, target_size=(IMG_HEIGHT, IMG_WIDTH), classes = list(CLASS_NAMES), subset='validation', class_mode='categorical')
Model Definition:
EPOCHS = 30
model = Sequential([
Conv2D(128, 3, activation='relu',
input_shape=(IMG_HEIGHT, IMG_WIDTH ,3)),
MaxPooling2D(),
Flatten(),
Dense(512, activation='sigmoid'),
Dense(1)
])
opt = tf.keras.optimizers.Adamax(lr=0.001)
model.compile(optimizer=opt,
loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
metrics=['accuracy'])
First 5 Epochs:
Epoch 1/30
368/368 [==============================] - 371s 1s/step - loss: 0.6713 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Epoch 2/30
368/368 [==============================] - 235s 640ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Epoch 3/30
368/368 [==============================] - 233s 633ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Epoch 4/30
368/368 [==============================] - 236s 641ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Epoch 5/30
368/368 [==============================] - 234s 636ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Last Five Epochs:
Epoch 25/30
368/368 [==============================] - 231s 628ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Epoch 26/30
368/368 [==============================] - 227s 617ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Epoch 27/30
368/368 [==============================] - 228s 620ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Epoch 28/30
368/368 [==============================] - 234s 636ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Epoch 29/30
368/368 [==============================] - 235s 638ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
Epoch 30/30
368/368 [==============================] - 234s 636ms/step - loss: 0.5004 - accuracy: 0.8000 - val_loss: 0.5004 - val_accuracy: 0.8000
You are trying to solve a classification task with 5 classes, but your last layer contains only one neuron.
It should be a dense layer with 5 neurons and a softmax activation:
Dense(5, activation="softmax")
You also need to change the loss function to a multi-class classification loss, for example categorical_crossentropy.
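The constant 0.8000 accuracy in the question is consistent with this diagnosis. A rough sketch, assuming the metric thresholds the single output and compares it element-wise against every entry of the five-way one-hot label: a model that outputs "negative" for everything matches the four zeros in each label.

```python
num_classes = 5

def one_hot(index, size):
    """Return a one-hot list with a 1 at the given index."""
    return [1 if i == index else 0 for i in range(size)]

# One example per class, encoded the way class_mode='categorical' encodes labels.
labels = [one_hot(c, num_classes) for c in range(num_classes)]

# Assumption: the single output unit is thresholded to 0 for every sample
# and compared against all five entries of each label.
prediction = 0
matches = sum(entry == prediction for label in labels for entry in label)
accuracy = matches / (len(labels) * num_classes)
print(accuracy)  # 0.8
```

With one output unit per class and a softmax, the accuracy instead measures whether the highest-scoring class is the true one.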