I am really new to Data Science/ML and have been working on Tensorflow to implement Linear Regression on California Housing Prices from Kaggle.
I tried to train a model in two different ways:
Using a Sequential model
Custom implementation
In both cases, the model's loss was really high, and I have not been able to figure out how to improve it.
Dataset prep
df = pd.read_csv('california-housing-prices.zip')
df = df[['total_rooms', 'total_bedrooms', 'median_house_value', 'housing_median_age', 'median_income']]
print('Shape of dataset before removing NAs and duplicates {}'.format(df.shape))
df.dropna(inplace=True)
df.drop_duplicates(inplace=True)
input_train, input_test, target_train, target_test = train_test_split(df['total_rooms'].values, df['median_house_value'].values, test_size=0.2)
scaler = MinMaxScaler()
input_train = input_train.reshape(-1,1)
input_test = input_test.reshape(-1,1)
input_train = scaler.fit_transform(input_train)
input_test = scaler.fit_transform(input_test)
target_train = target_train.reshape(-1,1)
target_train = scaler.fit_transform(target_train)
target_test = target_test.reshape(-1,1)
target_test = scaler.fit_transform(target_test)
print('Number of training input elements {}'.format(input_train.shape))
print('Number of training target elements {}'.format(target_train.shape))
Using Sequential API:
BATCH_SIZE = 10
BUFFER = 5000
dataset = tf.data.Dataset.from_tensor_slices((input_train, target_train))
dataset = dataset.shuffle(BUFFER).batch(BATCH_SIZE)
DENSE_UNITS = 64
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(DENSE_UNITS, activation='relu'),
    tf.keras.layers.Dense(DENSE_UNITS, activation='relu'),
    tf.keras.layers.Dense(1)
])
EPOCH = 5000
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001), loss='mean_squared_error', metrics=['accuracy', 'mse'])
history = model.fit(dataset, epochs=EPOCH, callbacks=[early_stopping])
Final training metrics -
Epoch 1/5000
1635/1635 [==============================] - 13s 8ms/step - loss: 0.0564 - accuracy: 0.0013 - mse: 0.0564
Epoch 2/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0552 - accuracy: 0.0016 - mse: 0.0552
Epoch 3/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0551 - accuracy: 0.0012 - mse: 0.0551
Epoch 4/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0551 - accuracy: 9.1766e-04 - mse: 0.0551
Epoch 5/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0551 - accuracy: 0.0013 - mse: 0.0551
Epoch 6/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0551 - accuracy: 0.0013 - mse: 0.0551
Epoch 7/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0013 - mse: 0.0549
Epoch 8/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0550 - accuracy: 0.0012 - mse: 0.0550
Epoch 9/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0011 - mse: 0.0549
Epoch 10/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0550 - accuracy: 0.0012 - mse: 0.0550
Epoch 11/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0010 - mse: 0.0549
Epoch 12/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0011 - mse: 0.0549
Epoch 13/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0013 - mse: 0.0549
Epoch 14/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0016 - mse: 0.0549
Epoch 15/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0011 - mse: 0.0549
Epoch 16/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0017 - mse: 0.0549
Epoch 17/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0013 - mse: 0.0549
Epoch 18/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 6.1177e-04 - mse: 0.0549
Epoch 19/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 6.1177e-04 - mse: 0.0549
Epoch 20/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 6.1177e-04 - mse: 0.0549
Epoch 21/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0012 - mse: 0.0550
Epoch 22/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0548 - accuracy: 9.7883e-04 - mse: 0.0549
Epoch 23/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0550 - accuracy: 7.3412e-04 - mse: 0.0549
Epoch 24/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 7.9530e-04 - mse: 0.0549
Epoch 25/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0548 - accuracy: 0.0013 - mse: 0.0548
Epoch 26/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 7.9530e-04 - mse: 0.0549
Epoch 27/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 6.7295e-04 - mse: 0.0549
Epoch 28/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0548 - accuracy: 0.0012 - mse: 0.0548
Epoch 29/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0013 - mse: 0.0549
Epoch 30/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0548 - accuracy: 9.7883e-04 - mse: 0.0549
Using custom training
class Linear(object):
    def __init__(self):
        """
        Y = mX + C
        Initializing the slope and the intercept
        """
        self.m = tf.Variable(tf.random.normal(shape=()))
        self.C = tf.Variable(tf.random.normal(shape=()))

    def __call__(self, x):
        return self.m * x + self.C

# Defining a MSE loss function
def loss(predicted_y, target_y):
    return tf.reduce_mean(tf.square(predicted_y - target_y))

def train(model, input, output, learning_rate):
    with tf.GradientTape() as tape:
        predicted_y = model(input)
        current_loss = loss(predicted_y, output)
    df_m, df_C = tape.gradient(current_loss, [model.m, model.C])
    model.m.assign_sub(learning_rate * df_m)
    model.C.assign_sub(learning_rate * df_C)
epochs = 5000
model = Linear()
print(model.m.assign_sub(1))
ms, Cs, losses = [], [], []
target_train = target_train.astype('float32')
for epoch in range(epochs):
    ms.append(model.m.numpy())
    Cs.append(model.C.numpy())
    predicted_y = model(input_train)
    current_loss = loss(predicted_y, target_train)
    losses.append(current_loss)
    train(model, input_train, target_train, 0.1)
    if epoch % 500 == 0:
        print('Epoch %2d: W=%1.2f b=%1.2f, loss=%2.5f' %
              (epoch, ms[-1], Cs[-1], current_loss))
predicted_test = model(input_test[:10])
print(np.argmax(predicted_test.numpy()))
print(scaler.inverse_transform(predicted_test))
print(scaler.inverse_transform(target_test[:10]))
predicted_loss = loss(predicted_test, target_test[:10])
print(predicted_loss.numpy())
Final training metrics
Epoch 0: W=-1.86 b=-0.09, loss=0.44381
Epoch 500: W=-1.19 b=0.47, loss=0.06470
Epoch 1000: W=-0.73 b=0.44, loss=0.06034
Epoch 1500: W=-0.39 b=0.42, loss=0.05799
Epoch 2000: W=-0.13 b=0.40, loss=0.05671
Epoch 2500: W=0.05 b=0.39, loss=0.05602
Epoch 3000: W=0.19 b=0.38, loss=0.05565
Epoch 3500: W=0.29 b=0.38, loss=0.05545
Epoch 4000: W=0.36 b=0.37, loss=0.05534
Epoch 4500: W=0.41 b=0.37, loss=0.05528
In your first example, you shouldn't reshape your input into 1D: you collapsed your feature matrix into one long 1-D array. So, remove these lines:
input_train = input_train.reshape(-1,1)
input_test = input_test.reshape(-1,1)
That way you keep all 8 features of your input data. Then change the first lines of your model like this:
model = tf.keras.Sequential([
tf.keras.Input(shape=(8,)),
Your loss will decrease. After a few epochs I get this:
1644/1652 [============================>.] - ETA: 0s - loss: 0.0144 - accuracy: 0.0434 - mse: 0.0144
I couldn't use your .zip file so I did it differently. Here is all my reproducible code:
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import tensorflow as tf
from sklearn.datasets import fetch_california_housing
x, y = fetch_california_housing(return_X_y=True)
input_train, _, target_train, _ = train_test_split(x, y)
scaler = MinMaxScaler()
input_train = scaler.fit_transform(input_train)
target_train = target_train.reshape(-1,1)
target_train = scaler.fit_transform(target_train)
dataset = tf.data.Dataset.from_tensor_slices((input_train, target_train))
dataset = dataset.shuffle(5000).batch(32)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1)])
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
loss='mean_squared_error', metrics=['accuracy', 'mse'])
history = model.fit(dataset, epochs=50, callbacks=[early_stopping])
Related
I'm doing a species classification task from kaggle (https://www.kaggle.com/competitions/yum-or-yuck-butterfly-mimics-2022/overview). I decided to use transfer learning to tackle this problem since there aren't that many images. The model is as follows:
inputs = tf.keras.layers.Input(shape=(224, 224, 3))
base_model = tf.keras.applications.resnet50.ResNet50(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet")
for layer in base_model.layers:
    layer.trainable = False
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(1024, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(512, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(64, activation="relu")(x)
output = tf.keras.layers.Dense(6, activation="softmax")(x)
model = tf.keras.Model(inputs=inputs, outputs=output)
As per the transfer learning guidelines (https://keras.io/guides/transfer_learning/), I'm freezing the ResNet layers and running the base model in inference mode (training=False). However, the results show that the model is not learning properly. Convergence doesn't seem possible even after nearly 200 epochs:
model.compile(
optimizer=tf.keras.optimizers.Adam(),
loss="categorical_crossentropy",
metrics="accuracy",
)
stop_early = tf.keras.callbacks.EarlyStopping(
monitor='val_loss',
min_delta=0.0001,
patience=20,
restore_best_weights=True
)
history = model.fit(train_generator,
validation_data = val_generator,
epochs = 200,
callbacks=[stop_early])
22/22 [==============================] - 19s 442ms/step - loss: 1.9317 - accuracy: 0.1794 - val_loss: 1.8272 - val_accuracy: 0.1618
Epoch 2/200
22/22 [==============================] - 9s 398ms/step - loss: 1.8250 - accuracy: 0.1882 - val_loss: 1.7681 - val_accuracy: 0.2197
Epoch 3/200
22/22 [==============================] - 9s 402ms/step - loss: 1.7927 - accuracy: 0.2294 - val_loss: 1.7612 - val_accuracy: 0.2139
Epoch 4/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7930 - accuracy: 0.2000 - val_loss: 1.7640 - val_accuracy: 0.2139
Epoch 5/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7872 - accuracy: 0.2132 - val_loss: 1.7489 - val_accuracy: 0.3121
Epoch 6/200
22/22 [==============================] - 9s 389ms/step - loss: 1.7700 - accuracy: 0.2574 - val_loss: 1.7378 - val_accuracy: 0.2543
Epoch 7/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7676 - accuracy: 0.2353 - val_loss: 1.7229 - val_accuracy: 0.3064
Epoch 8/200
22/22 [==============================] - 9s 427ms/step - loss: 1.7721 - accuracy: 0.2353 - val_loss: 1.7225 - val_accuracy: 0.2948
Epoch 9/200
22/22 [==============================] - 9s 399ms/step - loss: 1.7522 - accuracy: 0.2588 - val_loss: 1.7267 - val_accuracy: 0.2948
Epoch 10/200
22/22 [==============================] - 9s 395ms/step - loss: 1.7434 - accuracy: 0.2735 - val_loss: 1.7151 - val_accuracy: 0.2948
Epoch 11/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7500 - accuracy: 0.2632 - val_loss: 1.7083 - val_accuracy: 0.3064
Epoch 12/200
22/22 [==============================] - 9s 425ms/step - loss: 1.7307 - accuracy: 0.2721 - val_loss: 1.6899 - val_accuracy: 0.3179
Epoch 13/200
22/22 [==============================] - 9s 407ms/step - loss: 1.7439 - accuracy: 0.2794 - val_loss: 1.7045 - val_accuracy: 0.2948
Epoch 14/200
22/22 [==============================] - 9s 404ms/step - loss: 1.7376 - accuracy: 0.2706 - val_loss: 1.7118 - val_accuracy: 0.2659
Epoch 15/200
22/22 [==============================] - 9s 419ms/step - loss: 1.7588 - accuracy: 0.2647 - val_loss: 1.6684 - val_accuracy: 0.3237
Epoch 16/200
22/22 [==============================] - 9s 394ms/step - loss: 1.7289 - accuracy: 0.2824 - val_loss: 1.6733 - val_accuracy: 0.3064
Epoch 17/200
22/22 [==============================] - 9s 387ms/step - loss: 1.7184 - accuracy: 0.2809 - val_loss: 1.7185 - val_accuracy: 0.2659
Epoch 18/200
22/22 [==============================] - 9s 408ms/step - loss: 1.7242 - accuracy: 0.2765 - val_loss: 1.6961 - val_accuracy: 0.2717
Epoch 19/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7218 - accuracy: 0.2853 - val_loss: 1.6757 - val_accuracy: 0.3006
Epoch 20/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7248 - accuracy: 0.2882 - val_loss: 1.6716 - val_accuracy: 0.3064
Epoch 21/200
22/22 [==============================] - 9s 401ms/step - loss: 1.7134 - accuracy: 0.2838 - val_loss: 1.6666 - val_accuracy: 0.2948
Epoch 22/200
22/22 [==============================] - 9s 393ms/step - loss: 1.7140 - accuracy: 0.2941 - val_loss: 1.6427 - val_accuracy: 0.3064
I need to unfreeze the layers and turn off inference mode in order for the model to learn. I tested the same scenario with EfficientNet and the same thing happened. Finally, I also tried Xception, and freezing the layers and running in inference mode was fine there. So it seems they behave differently, even though they all have batchnorm layers.
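To be concrete, what I mean by "unfreezing the layers and turning inference off" is roughly the following (a sketch only, since I did not include that exact code above; base_model and inputs are the same objects as in the model definition):
# Unfreeze the ResNet50 weights and drop training=False,
# so the BatchNorm layers run in training mode.
for layer in base_model.layers:
    layer.trainable = True
x = base_model(inputs)  # no training=False here
x = tf.keras.layers.GlobalAveragePooling2D()(x)
# ... rest of the head unchanged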
I'm not understanding what is going on here. Why would I need to turn inference off? Could anyone have a clue about this?
EDIT:
results from Resnet50:
results from Xception:
So I am trying to create an LSTM that can predict the next time step of a double pendulum. The data I am trying to train with is a (2001, 4) numpy array (e.g. the first 5 rows look like:
array([[ 1.04719755, 0. , 1.04719755, 0. ],
[ 1.03659984, -0.42301933, 1.04717544, -0.00178865],
[ 1.00508218, -0.83475539, 1.04682248, -0.01551541],
[ 0.95354768, -1.22094052, 1.04514269, -0.05838011],
[ 0.88372305, -1.56345555, 1.04009056, -0.15443162]])
where each row is a unique representation of the state of the double pendulum.)
So I wanted to create an LSTM that could learn to predict the next state given the current one.
Here is my code for it so far (full_sol is the (2001, 4) matrix):
import numpy as np
from tensorflow import keras
import tensorflow as tf
# full_sol = np.random.rand(2001, 4)
full_sol = full_sol.reshape((full_sol.shape[0], 1, full_sol.shape[1]))
model = keras.Sequential()
model.add(keras.layers.LSTM(100, input_shape=(None, 4), return_sequences=True, dropout=0.2))
model.add(keras.layers.TimeDistributed(
    keras.layers.Dense(4, activation=tf.keras.layers.LeakyReLU(alpha=0.3))))
model.compile(loss="mean_squared_error", optimizer="adam", metrics="accuracy")
history = model.fit(full_sol[:-1,:,:], full_sol[1:,:,:], epochs=20)
Then when I train, I get the following results:
Epoch 1/20
63/63 [==============================] - 3s 4ms/step - loss: 1.7181 - accuracy: 0.4200
Epoch 2/20
63/63 [==============================] - 0s 4ms/step - loss: 1.0481 - accuracy: 0.5155
Epoch 3/20
63/63 [==============================] - 0s 5ms/step - loss: 0.7584 - accuracy: 0.5715
Epoch 4/20
63/63 [==============================] - 0s 5ms/step - loss: 0.5134 - accuracy: 0.6420
Epoch 5/20
63/63 [==============================] - 0s 5ms/step - loss: 0.3944 - accuracy: 0.7260
Epoch 6/20
63/63 [==============================] - 0s 5ms/step - loss: 0.3378 - accuracy: 0.7605
Epoch 7/20
63/63 [==============================] - 0s 5ms/step - loss: 0.3549 - accuracy: 0.7825
Epoch 8/20
63/63 [==============================] - 0s 4ms/step - loss: 0.3528 - accuracy: 0.7995
Epoch 9/20
63/63 [==============================] - 0s 5ms/step - loss: 0.3285 - accuracy: 0.8020
Epoch 10/20
63/63 [==============================] - 0s 5ms/step - loss: 0.2874 - accuracy: 0.8030
Epoch 11/20
63/63 [==============================] - 0s 4ms/step - loss: 0.3072 - accuracy: 0.8135
Epoch 12/20
63/63 [==============================] - 0s 4ms/step - loss: 0.3075 - accuracy: 0.8035
Epoch 13/20
63/63 [==============================] - 0s 4ms/step - loss: 0.2942 - accuracy: 0.8030
Epoch 14/20
63/63 [==============================] - 0s 4ms/step - loss: 0.2637 - accuracy: 0.8170
Epoch 15/20
63/63 [==============================] - 0s 4ms/step - loss: 0.2675 - accuracy: 0.8150
Epoch 16/20
63/63 [==============================] - 0s 4ms/step - loss: 0.2644 - accuracy: 0.8085
Epoch 17/20
63/63 [==============================] - 0s 5ms/step - loss: 0.2479 - accuracy: 0.8200
Epoch 18/20
63/63 [==============================] - 0s 4ms/step - loss: 0.2475 - accuracy: 0.8215
Epoch 19/20
63/63 [==============================] - 0s 4ms/step - loss: 0.2243 - accuracy: 0.8340
Epoch 20/20
63/63 [==============================] - 0s 5ms/step - loss: 0.2430 - accuracy: 0.8240
So, quite high accuracy. But when I test it on the training set, the predictions aren't very good.
E.g. when I predict the first value:
model.predict(tf.expand_dims(full_sol[0], axis = 0))
I get array([[[ 1.0172144 , -0.3535697 , 1.1287913 , -0.23707283]]], dtype=float32) instead of array([[ 1.03659984, -0.42301933, 1.04717544, -0.00178865]]).
Where have I gone wrong?
I don't think you are doing anything wrong. What you are getting is still fairly close to the actual value. You could either change your choice of metric so that it accurately reflects the degree of error in your predictions (accuracy is a classification metric and says little about a continuous target), or you could try to reduce the error further.
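For the first option, a minimal sketch (assuming the same model as above): track error-based metrics such as MAE/MSE rather than accuracy, since the target here is a continuous state vector rather than a class label:
# Regression metrics instead of 'accuracy'
model.compile(loss="mean_squared_error",
              optimizer="adam",
              metrics=["mae", "mse"])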
I want to perform binary classification of breast cancer histopathological images from the BreakHis dataset (https://www.kaggle.com/ambarish/breakhis) using transfer learning and Inception ResNet v2. The goal is to freeze all layers and train the fully connected layer by adding two neurons to the model. In particular, initially I want to consider the images with magnification factor 40X (Benign: 625, Malignant: 1370). Here is a summary of what I do:
I read the images and resize them to 150x150
I partition the dataset into training, validation and test set
I load the pre-trained network Inception Resnet v2
I freeze all the layers
I add the two neurons for binary classification (1 = "benign", 0 = "malignant")
I compile the model using the Adam optimizer
I carry out the training
I make the prediction
I calculate the accuracy
This is the code:
data = dataset[dataset["Magnificant"]=="40X"]
def preprocessing(dataset, img_size):
    # images
    X = []
    # labels
    y = []
    i = 0
    for image in list(dataset["Path"]):
        # Resize and read the images
        X.append(cv2.resize(cv2.imread(image, cv2.IMREAD_COLOR),
                            (img_size, img_size), interpolation=cv2.INTER_CUBIC))
        basename = os.path.basename(image)
        # Get labels
        if dataset.loc[i][2] == "benign":
            y.append(1)
        else:
            y.append(0)
        i = i + 1
    return X, y
X, y = preprocessing(data, 150)
X = np.array(X)
y = np.array(y)
# Splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify = y_40, shuffle=True, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=1)
conv_base = InceptionResNetV2(weights='imagenet', include_top=False, input_shape=[150, 150, 3])
# Freezing
for layer in conv_base.layers:
    layer.trainable = False
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(1, activation='sigmoid'))
opt = tf.keras.optimizers.Adam(learning_rate=0.0002)
loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)
model.compile(loss=loss, optimizer=opt, metrics = ["accuracy", tf.metrics.AUC()])
batch_size = 32
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow(X_train, y_train, batch_size=batch_size)
val_generator = val_datagen.flow(X_val, y_val, batch_size=batch_size)
ntrain =len(X_train)
nval = len(X_val)
len(y_train)
epochs = 70
history = model.fit_generator(train_generator,
steps_per_epoch=ntrain // batch_size,
epochs=epochs,
validation_data=val_generator,
validation_steps=nval // batch_size)
This is the output of the training at the last epoch:
Epoch 70/70
32/32 [==============================] - 3s 84ms/step - loss: 0.0499 - accuracy: 0.9903 - auc_5: 0.9996 - val_loss: 0.5661 - val_accuracy: 0.8250 - val_auc_5: 0.8521
I make the prediction:
test_datagen = ImageDataGenerator(rescale=1./255)
x = X_test
y_pred = model.predict(test_datagen.flow(x))
y_p = []
for i in range(len(y_pred)):
    if y_pred[i] > 0.5:
        y_p.append(1)
    else:
        y_p.append(0)
I calculate the accuracy:
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_p)
print(accuracy)
This is the accuracy value I get: 0.5459098497495827
Why do I get such low accuracy? I have done several tests but I always get similar results.
Update
I have made the following changes but I always get the same results (showing only the modified parts of the code):
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify = y, shuffle=True, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, stratify = y_train, shuffle=True, random_state=1)
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))
callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)
ntrain =len(X_train)
nval = len(X_val)
len(y_train)
epochs = 70
history = model.fit_generator(train_generator,
steps_per_epoch=ntrain // batch_size,
epochs=epochs,
validation_data=val_generator,
validation_steps=nval // batch_size, callbacks=[callback])
Update 2
I also changed from_logits from True to False, but of course that has not fixed the problem yet; I still get 57% accuracy.
This is the model.fit output over 30 epochs:
Epoch 1/30
32/32 [==============================] - 23s 202ms/step - loss: 0.7994 - accuracy: 0.6010 - auc: 0.5272 - val_loss: 0.5338 - val_accuracy: 0.7688 - val_auc: 0.7943
Epoch 2/30
32/32 [==============================] - 3s 87ms/step - loss: 0.5778 - accuracy: 0.7206 - auc: 0.7521 - val_loss: 0.4763 - val_accuracy: 0.7781 - val_auc: 0.8155
Epoch 3/30
32/32 [==============================] - 3s 85ms/step - loss: 0.5311 - accuracy: 0.7581 - auc: 0.7710 - val_loss: 0.4740 - val_accuracy: 0.7719 - val_auc: 0.8212
Epoch 4/30
32/32 [==============================] - 3s 85ms/step - loss: 0.4684 - accuracy: 0.7718 - auc: 0.8219 - val_loss: 0.4270 - val_accuracy: 0.8031 - val_auc: 0.8611
Epoch 5/30
32/32 [==============================] - 3s 83ms/step - loss: 0.4280 - accuracy: 0.7943 - auc: 0.8617 - val_loss: 0.4496 - val_accuracy: 0.7969 - val_auc: 0.8468
Epoch 6/30
32/32 [==============================] - 3s 88ms/step - loss: 0.4237 - accuracy: 0.8250 - auc: 0.8673 - val_loss: 0.3993 - val_accuracy: 0.7937 - val_auc: 0.8840
Epoch 7/30
32/32 [==============================] - 3s 85ms/step - loss: 0.4130 - accuracy: 0.8513 - auc: 0.8767 - val_loss: 0.4207 - val_accuracy: 0.7781 - val_auc: 0.8692
Epoch 8/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3446 - accuracy: 0.8485 - auc: 0.9077 - val_loss: 0.4229 - val_accuracy: 0.7937 - val_auc: 0.8730
Epoch 9/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3690 - accuracy: 0.8514 - auc: 0.9003 - val_loss: 0.4300 - val_accuracy: 0.8062 - val_auc: 0.8696
Epoch 10/30
32/32 [==============================] - 3s 100ms/step - loss: 0.3204 - accuracy: 0.8533 - auc: 0.9270 - val_loss: 0.4235 - val_accuracy: 0.7969 - val_auc: 0.8731
Epoch 11/30
32/32 [==============================] - 3s 86ms/step - loss: 0.3555 - accuracy: 0.8508 - auc: 0.9124 - val_loss: 0.4124 - val_accuracy: 0.8000 - val_auc: 0.8797
Epoch 12/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3243 - accuracy: 0.8481 - auc: 0.9308 - val_loss: 0.3979 - val_accuracy: 0.7969 - val_auc: 0.8908
Epoch 13/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3017 - accuracy: 0.8744 - auc: 0.9348 - val_loss: 0.4239 - val_accuracy: 0.8094 - val_auc: 0.8758
Epoch 14/30
32/32 [==============================] - 3s 89ms/step - loss: 0.3317 - accuracy: 0.8521 - auc: 0.9221 - val_loss: 0.4238 - val_accuracy: 0.8094 - val_auc: 0.8704
Epoch 15/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2840 - accuracy: 0.8908 - auc: 0.9490 - val_loss: 0.4131 - val_accuracy: 0.8281 - val_auc: 0.8858
Epoch 16/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2583 - accuracy: 0.8905 - auc: 0.9511 - val_loss: 0.3841 - val_accuracy: 0.8375 - val_auc: 0.9007
Epoch 17/30
32/32 [==============================] - 3s 87ms/step - loss: 0.2810 - accuracy: 0.8648 - auc: 0.9470 - val_loss: 0.3928 - val_accuracy: 0.8438 - val_auc: 0.8972
Epoch 18/30
32/32 [==============================] - 3s 89ms/step - loss: 0.2622 - accuracy: 0.8923 - auc: 0.9550 - val_loss: 0.3732 - val_accuracy: 0.8438 - val_auc: 0.9089
Epoch 19/30
32/32 [==============================] - 3s 84ms/step - loss: 0.2486 - accuracy: 0.8990 - auc: 0.9579 - val_loss: 0.4077 - val_accuracy: 0.8250 - val_auc: 0.8924
Epoch 20/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2412 - accuracy: 0.9074 - auc: 0.9635 - val_loss: 0.4249 - val_accuracy: 0.8219 - val_auc: 0.8787
Epoch 21/30
32/32 [==============================] - 3s 84ms/step - loss: 0.2386 - accuracy: 0.9095 - auc: 0.9657 - val_loss: 0.4177 - val_accuracy: 0.8094 - val_auc: 0.8904
Epoch 22/30
32/32 [==============================] - 3s 99ms/step - loss: 0.2313 - accuracy: 0.8996 - auc: 0.9668 - val_loss: 0.4089 - val_accuracy: 0.8406 - val_auc: 0.8890
Epoch 23/30
32/32 [==============================] - 3s 86ms/step - loss: 0.2424 - accuracy: 0.9067 - auc: 0.9654 - val_loss: 0.4033 - val_accuracy: 0.8500 - val_auc: 0.8953
Epoch 24/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2315 - accuracy: 0.9045 - auc: 0.9626 - val_loss: 0.3903 - val_accuracy: 0.8250 - val_auc: 0.9030
Epoch 25/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2001 - accuracy: 0.9321 - auc: 0.9788 - val_loss: 0.4276 - val_accuracy: 0.8000 - val_auc: 0.8855
Epoch 26/30
32/32 [==============================] - 3s 87ms/step - loss: 0.2118 - accuracy: 0.9212 - auc: 0.9695 - val_loss: 0.4335 - val_accuracy: 0.8125 - val_auc: 0.8897
Epoch 27/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2463 - accuracy: 0.8941 - auc: 0.9665 - val_loss: 0.4112 - val_accuracy: 0.8438 - val_auc: 0.8882
Epoch 28/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2130 - accuracy: 0.9033 - auc: 0.9771 - val_loss: 0.3834 - val_accuracy: 0.8406 - val_auc: 0.9021
Epoch 29/30
32/32 [==============================] - 3s 86ms/step - loss: 0.2021 - accuracy: 0.9229 - auc: 0.9754 - val_loss: 0.3855 - val_accuracy: 0.8469 - val_auc: 0.9008
Epoch 30/30
32/32 [==============================] - 3s 88ms/step - loss: 0.1859 - accuracy: 0.9314 - auc: 0.9824 - val_loss: 0.4018 - val_accuracy: 0.8375 - val_auc: 0.8928
You have to change from_logits=True to from_logits=False in your loss function. Once again, credits to @Frightera.
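In other words, a sketch of the change (the rest of the compile call from the question stays the same):
# Dense(1, activation='sigmoid') already outputs probabilities,
# so the loss should not expect raw logits.
loss = tf.keras.losses.BinaryCrossentropy(from_logits=False)
model.compile(loss=loss, optimizer=opt, metrics=["accuracy", tf.metrics.AUC()])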
It seems like your model is over-fitting somewhere; it would be best to check for that.
Do a K-Fold test with 10 folds; it will give a more reliable picture of the results.
Add the F1 score to your metrics. The F1 value gives you a real look at the true positives in terms of both false positives and false negatives.
Add some augmentations (apart from the rescaling one) to make the model robust to changes in the dataset; a sketch follows after this list.
Tweak the training parameters (if you feel it is needed).
If these changes fail, then it may be that the model fails to learn the relevant features of the images, and you should go ahead with a different model!
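A rough sketch of the augmentation and F1 suggestions, assuming the same ImageDataGenerator/NumPy pipeline as in the question (the augmentation values are illustrative, not tuned):
from sklearn.metrics import f1_score
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmented training generator (validation/test keep only the rescaling)
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,       # small random rotations
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    zoom_range=0.1,
    horizontal_flip=True)
train_generator = train_datagen.flow(X_train, y_train, batch_size=batch_size)

# F1 on the held-out predictions (y_p as computed in the question)
print(f1_score(y_test, y_p))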
I have a classification model that is clearly overfitting and the validation accuracy doesn't change.
I've tried using feature selection and feature extraction methods but they didn't help.
feature selection method:
fs = SelectKBest(f_classif, k=10)
fs.fit(x, y)
feature_index = fs.get_support(True)
feature_index = feature_index.tolist()
best_features = []
# makes list of best features
for index in feature_index:
    best_features.append(x[:, index])
x = (np.array(best_features, dtype=float)).T
model:
def model_and_print(x, y, Epochs, Batch_Size, loss, opt, class_weight, callback):
    # fix random seed for reproducibility
    seed = 7
    np.random.seed(seed)
    # define 10-fold cross validation test harness
    kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    # K-fold Cross Validation model evaluation
    fold_no = 1
    for train, test in kfold.split(x, y):
        # create model
        model = Sequential()
        model.add(Dropout(0.2, input_shape=(len(x[0]),)))
        model.add(Dense(6, activation='relu', kernel_constraint=maxnorm(3)))
        model.add(Dropout(0.2))
        model.add(Dense(8, activation='relu', kernel_constraint=maxnorm(3)))
        model.add(Dropout(0.4))
        model.add(Dense(1, activation=tf.nn.sigmoid))
        # compile model
        model.compile(optimizer=opt,
                      loss=loss, metrics=['accuracy'])
        history = model.fit(x[train], y[train], validation_data=(x[test], y[test]), epochs=Epochs,
                            batch_size=Batch_Size, verbose=1)
def main():
    data = ["data.pkl", "data_list.pkl", "data_mean.pkl"]
    df = pd.read_pickle(data[2])
    x, y = data_frame_to_feature_and_target_arrays(df)
    # hyperparameters
    Epochs = 200
    Batch_Size = 1
    learning_rate = 0.003
    optimizer = optimizers.Adam(learning_rate=learning_rate)
    loss = "binary_crossentropy"
    model_and_print(x, y, Epochs, Batch_Size, loss, optimizer, class_weight, es_callback)

if __name__ == "__main__":
    main()
output for part of one fold:
1/73 [..............................] - ETA: 0s - loss: 0.6470 - accuracy: 1.0000
62/73 [========================>.....] - ETA: 0s - loss: 0.5665 - accuracy: 0.7097
73/73 [==============================] - 0s 883us/step - loss: 0.5404 - accuracy: 0.7534 - val_loss: 0.5576 - val_accuracy: 0.5000
Epoch 100/200
1/73 [..............................] - ETA: 0s - loss: 0.4743 - accuracy: 1.0000
69/73 [===========================>..] - ETA: 0s - loss: 0.6388 - accuracy: 0.6522
73/73 [==============================] - 0s 806us/step - loss: 0.6316 - accuracy: 0.6575 - val_loss: 0.5592 - val_accuracy: 0.5000
Epoch 101/200
1/73 [..............................] - ETA: 0s - loss: 0.6005 - accuracy: 1.0000
69/73 [===========================>..] - ETA: 0s - loss: 0.5656 - accuracy: 0.7101
73/73 [==============================] - 0s 806us/step - loss: 0.5641 - accuracy: 0.7123 - val_loss: 0.5629 - val_accuracy: 0.5000
Epoch 102/200
1/73 [..............................] - ETA: 0s - loss: 0.2126 - accuracy: 1.0000
65/73 [=========================>....] - ETA: 0s - loss: 0.5042 - accuracy: 0.8000
73/73 [==============================] - 0s 847us/step - loss: 0.5340 - accuracy: 0.7671 - val_loss: 0.5608 - val_accuracy: 0.5000
Epoch 103/200
1/73 [..............................] - ETA: 0s - loss: 0.8801 - accuracy: 0.0000e+00
68/73 [==========================>...] - ETA: 0s - loss: 0.5754 - accuracy: 0.6471
73/73 [==============================] - 0s 819us/step - loss: 0.5780 - accuracy: 0.6575 - val_loss: 0.5639 - val_accuracy: 0.5000
Epoch 104/200
1/73 [..............................] - ETA: 0s - loss: 0.0484 - accuracy: 1.0000
70/73 [===========================>..] - ETA: 0s - loss: 0.5711 - accuracy: 0.7571
73/73 [==============================] - 0s 806us/step - loss: 0.5689 - accuracy: 0.7534 - val_loss: 0.5608 - val_accuracy: 0.5000
Epoch 105/200
1/73 [..............................] - ETA: 0s - loss: 0.1237 - accuracy: 1.0000
69/73 [===========================>..] - ETA: 0s - loss: 0.5953 - accuracy: 0.7101
73/73 [==============================] - 0s 820us/step - loss: 0.5922 - accuracy: 0.7260 - val_loss: 0.5672 - val_accuracy: 0.5000
Epoch 106/200
1/73 [..............................] - ETA: 0s - loss: 0.3360 - accuracy: 1.0000
67/73 [==========================>...] - ETA: 0s - loss: 0.5175 - accuracy: 0.7313
73/73 [==============================] - 0s 847us/step - loss: 0.5320 - accuracy: 0.7397 - val_loss: 0.5567 - val_accuracy: 0.5000
Epoch 107/200
1/73 [..............................] - ETA: 0s - loss: 0.1384 - accuracy: 1.0000
67/73 [==========================>...] - ETA: 0s - loss: 0.5435 - accuracy: 0.6866
73/73 [==============================] - 0s 833us/step - loss: 0.5541 - accuracy: 0.6575 - val_loss: 0.5629 - val_accuracy: 0.5000
Epoch 108/200
1/73 [..............................] - ETA: 0s - loss: 0.2647 - accuracy: 1.0000
69/73 [===========================>..] - ETA: 0s - loss: 0.6047 - accuracy: 0.6232
73/73 [==============================] - 0s 820us/step - loss: 0.5948 - accuracy: 0.6301 - val_loss: 0.5660 - val_accuracy: 0.5000
Epoch 109/200
1/73 [..............................] - ETA: 0s - loss: 0.8837 - accuracy: 0.0000e+00
66/73 [==========================>...] - ETA: 0s - loss: 0.5250 - accuracy: 0.7576
73/73 [==============================] - 0s 861us/step - loss: 0.5357 - accuracy: 0.7397 - val_loss: 0.5583 - val_accuracy: 0.5000
Epoch 110/200
final accuracy:
Score for fold 10: loss of 0.5600861310958862; accuracy of 50.0%
My question is: what can I do about the overfitting, given that I've already tried feature extraction and dropout layers, and why is the validation accuracy not changing?
I am trying to build a deep learning model in Keras for a test, and I am not very good at this. I have a scaled dataset with 128 features, and these correspond to 6 different classes.
I have already tried adding/deleting layers and using regularisation like dropout/L1/L2. My model learns and the training accuracy goes up very high, but the accuracy on the test set is around 15%.
from tensorflow.keras.layers import Dense, Dropout
model = Sequential()
model.add(Dense(128, activation='tanh', input_shape=(128,)))
model.add(Dropout(0.5))
model.add(Dense(60, activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(20, activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(6, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='Nadam', metrics=['accuracy'])
model.fit(train_X, train_y, epochs=20, batch_size=32, verbose=1)
6955/6955 [==============================] - 1s 109us/sample - loss: 1.5805 - acc: 0.3865
Epoch 2/20
6955/6955 [==============================] - 0s 71us/sample - loss: 1.1512 - acc: 0.6505
Epoch 3/20
6955/6955 [==============================] - 0s 71us/sample - loss: 0.9191 - acc: 0.7307
Epoch 4/20
6955/6955 [==============================] - 0s 67us/sample - loss: 0.7819 - acc: 0.7639
Epoch 5/20
6955/6955 [==============================] - 0s 66us/sample - loss: 0.6939 - acc: 0.7882
Epoch 6/20
6955/6955 [==============================] - 0s 69us/sample - loss: 0.6284 - acc: 0.8099
Epoch 7/20
6955/6955 [==============================] - 0s 70us/sample - loss: 0.5822 - acc: 0.8240
Epoch 8/20
6955/6955 [==============================] - 1s 73us/sample - loss: 0.5305 - acc: 0.8367
Epoch 9/20
6955/6955 [==============================] - 1s 75us/sample - loss: 0.5130 - acc: 0.8441
Epoch 10/20
6955/6955 [==============================] - 1s 75us/sample - loss: 0.4703 - acc: 0.8591
Epoch 11/20
6955/6955 [==============================] - 1s 73us/sample - loss: 0.4679 - acc: 0.8650
Epoch 12/20
6955/6955 [==============================] - 1s 77us/sample - loss: 0.4399 - acc: 0.8705
Epoch 13/20
6955/6955 [==============================] - 1s 80us/sample - loss: 0.4055 - acc: 0.8904
Epoch 14/20
6955/6955 [==============================] - 1s 77us/sample - loss: 0.3965 - acc: 0.8874
Epoch 15/20
6955/6955 [==============================] - 1s 77us/sample - loss: 0.3964 - acc: 0.8877
Epoch 16/20
6955/6955 [==============================] - 1s 77us/sample - loss: 0.3564 - acc: 0.9048
Epoch 17/20
6955/6955 [==============================] - 1s 80us/sample - loss: 0.3517 - acc: 0.9087
Epoch 18/20
6955/6955 [==============================] - 1s 78us/sample - loss: 0.3254 - acc: 0.9133
Epoch 19/20
6955/6955 [==============================] - 1s 78us/sample - loss: 0.3367 - acc: 0.9116
Epoch 20/20
6955/6955 [==============================] - 1s 76us/sample - loss: 0.3165 - acc: 0.9192
The result I am receiving is 39%. With other models like GBM or XGBoost I can reach up to 85%.
What am I doing wrong? Any suggestions?