I'm developing a neural network for the Spaceship Titanic competition, a binary classification problem. However, I keep getting an accuracy of 0.0000 on both the training and validation data, and I can't figure out why. KNN, LightGBM and random forest models have all worked on the same data, so I don't think it's a data issue.
My code is below:
print(X_train_scaled.shape)
print(y_train2.shape)
(6085, 23)
(6085, 1)
# Create the model
model1 = Sequential()
model1.add(Dense(18, activation='relu', kernel_initializer='he_uniform', input_dim=X_train_scaled.shape[1]))
model1.add(Dense(9, activation='relu', kernel_initializer='he_uniform'))
model1.add(Dense(1, activation='sigmoid'))

optimizer = Adam(learning_rate=0.001)
model1.compile(loss='binary_crossentropy',
               optimizer=optimizer,
               metrics=[tf.keras.metrics.Accuracy()])

history = model1.fit(X_train_scaled, y_train2, batch_size=100, epochs=30, validation_split=0.3)
Epoch 1/30
43/43 [==============================] - 1s 7ms/step - loss: 0.7348 - accuracy: 0.0000e+00 - val_loss: 0.6989 - val_accuracy: 0.0000e+00
Epoch 2/30
43/43 [==============================] - 0s 4ms/step - loss: 0.6603 - accuracy: 0.0000e+00 - val_loss: 0.6324 - val_accuracy: 0.0000e+00
Epoch 3/30
43/43 [==============================] - 0s 3ms/step - loss: 0.5994 - accuracy: 0.0000e+00 - val_loss: 0.5784 - val_accuracy: 0.0000e+00
Epoch 4/30
43/43 [==============================] - 0s 3ms/step - loss: 0.5539 - accuracy: 0.0000e+00 - val_loss: 0.5401 - val_accuracy: 0.0000e+00
tf.keras.metrics.Accuracy() compares y_true and y_pred for exact equality, which essentially never happens with raw sigmoid outputs, so it always reports 0. The string 'accuracy' lets Keras choose the metric that matches your loss (here BinaryAccuracy, which thresholds the sigmoid output at 0.5). So, in place of:
metrics=[tf.keras.metrics.Accuracy()]
try:
metrics=['accuracy']
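A minimal sketch of the corrected compile and fit, reusing the names from your code (model1, X_train_scaled, y_train2) and assuming the usual tensorflow.keras imports:
from tensorflow.keras.optimizers import Adam

optimizer = Adam(learning_rate=0.001)
model1.compile(loss='binary_crossentropy',
               optimizer=optimizer,
               metrics=['accuracy'])  # resolves to BinaryAccuracy for binary_crossentropy

history = model1.fit(X_train_scaled, y_train2, batch_size=100, epochs=30, validation_split=0.3)
With this change the logged accuracy should start tracking the real fraction of correct predictions instead of staying at 0.0000.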
I have this code in which I am trying to fit a neural network model with just three layers: the input layer, a hidden layer and, at the end, the output layer, which must have just one neuron for the single output. The problem is that during fitting I always obtain the same values for the accuracy (null) and the loss (it remains constant). I've tried changing the optimizer from 'sgd' to 'adam' and still nothing works as it should. What would you recommend?
Layer (type)                 Output Shape              Param #
=================================================================
data_in (InputLayer)         [(None, 4, 256)]          0
dense (Dense)                (None, 4, 124)            31868
dense_1 (Dense)              (None, 4, 1)              125
=================================================================
Total params: 31,993
Trainable params: 31,993
Non-trainable params: 0
_________________________________________________________________
Epoch 1/20
20/20 [==============================] - 8s 350ms/step - loss: 0.3170 - accuracy: 1.7361e-05
Epoch 2/20
20/20 [==============================] - 7s 348ms/step - loss: 0.2009 - accuracy: 6.7817e-08
Epoch 3/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0513 - accuracy: 0.0000e+00
Epoch 4/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0437 - accuracy: 0.0000e+00
Epoch 5/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 6/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 7/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 8/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 9/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 10/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 11/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 12/20
20/20 [==============================] - 7s 344ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 13/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 14/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0433 - accuracy: 0.0000e+00
Epoch 15/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 16/20
20/20 [==============================] - 7s 347ms/step - loss: 0.0432 - accuracy: 0.0000e+00
Epoch 17/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 18/20
20/20 [==============================] - 7s 347ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 19/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 20/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0428 - accuracy: 0.0000e+00
*TEST*
1800/1800 [==============================] - 3s 2ms/step - loss: 0.0449 - accuracy: 0.0000e+00
accuracy: 0%
My input_shape is (4, 256) and my training data array has shape (57600, 4, 256), meaning I have 57600 samples of shape (4, 256). I also have my training labels array (the values I should obtain from the data), which has shape (57600,). Finally, the library I am using is TensorFlow.
My code is the following:
import numpy as np
from keras.layers import Input, Dense, concatenate, Conv2D, MaxPooling2D, Flatten
from keras.models import Model, Sequential
from tensorflow import keras
from sklearn.preprocessing import MinMaxScaler

div_n = 240

# DIVIDING THE DATA WE WANT TO CLASSIFY AND ITS LABELS - the scaled data is used
data = np.array([self_mbyy_scaled,
                 self_mvyy_scaled,
                 self_mtpr_scaled,
                 self_mrho_scaled])
labels = self_iout_scaled

print(np.shape(data))
print(np.shape(labels))
# TRAINING SET AND TEST SET
# Here I'm dividing the whole data in half along the nx, nz dimensions:
# the first half is the training set and the second half is the test set
tr_data = []
tr_labels = []
for j in range(div_n):
    for k in range(div_n):
        # one row with the magnetic field, velocity, temperature and density values,
        # giving 240x240 = 57600 rows in total
        tr_data.append([data[0][j, :, k],
                        data[1][j, :, k],
                        data[2][j, :, k],
                        data[3][j, :, k]])
        tr_labels.append(labels[j, k])  # the corresponding target value

tr_data = np.array(tr_data)
tr_data = tr_data.reshape(div_n*div_n, len(data), self_ny, 1)
tr_labels = np.array(tr_labels)

print('\n training data shape')
print(np.shape(tr_data))
print('\n training labels shape')
print(np.shape(tr_labels))
te_data = []
te_labels = []
for j in range(div_n):
    for k in range(div_n):
        # same layout as the training set, taken from the second half of the grid
        te_data.append([data[0][div_n+j, :, div_n+k],
                        data[1][div_n+j, :, div_n+k],
                        data[2][div_n+j, :, div_n+k],
                        data[3][div_n+j, :, div_n+k]])
        te_labels.append(labels[div_n+j, div_n+k])  # the corresponding target value

te_data = np.array(te_data)
te_data = te_data.reshape(div_n*div_n, len(data), self_ny, 1)
te_labels = np.array(te_labels)

print('\n test data shape')
print(np.shape(te_data))
print('\n test labels shape')
print(np.shape(te_labels))
print('\n')
# NEURAL NETWORK MODEL
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(4, 256, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(1))
model.summary()

model.compile(
    optimizer=keras.optimizers.Adam(0.001),
    loss=keras.losses.MeanSquaredError(),
    metrics=['accuracy'],
)

model.fit(
    tr_data, tr_labels,
    epochs=6,
    validation_data=ds_valid,
)
Since your data has shape (57600, 4, 256) --> (samples, timesteps, features), i.e. a sequence dimension rather than a genuine 2-D image, I would recommend using Conv1D layers instead of Conv2D. Here is a simple working example:
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(128, 2, activation='relu', input_shape=(4, 256)))
model.add(tf.keras.layers.Conv1D(64, 2, activation='relu'))
model.add(tf.keras.layers.Conv1D(32, 2, activation='relu'))
model.add(tf.keras.layers.GlobalMaxPool1D())
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1))
model.summary()

model.compile(
    optimizer=tf.keras.optimizers.Adam(0.001),
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=['mse'],
)

# random dummy data, just to show that the model trains
samples = 50
x = tf.random.normal((samples, 4, 256))
y = tf.random.normal((samples,))
model.fit(x, y, batch_size=10, epochs=6)
And note that accuracy is a classification metric: it is not meaningful with a regression loss such as tf.keras.losses.MeanSquaredError (which is why it sits at zero in your logs); use a regression metric such as MSE or MAE instead.
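For instance, if you want an error reported in the same units as the target, a small sketch of a regression-appropriate compile could look like this (MeanAbsoluteError is a standard tf.keras metric):
model.compile(
    optimizer=tf.keras.optimizers.Adam(0.001),
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.MeanAbsoluteError()],  # mean absolute error, in target units
)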
import pandas as pd

data = pd.read_csv("tesdata.csv")
data

x = data.iloc[:, 0:2].values
y = data.iloc[:, 2].values

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x = sc.fit_transform(x)
y = sc.fit_transform(y.reshape(1, -1))

from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(2, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', optimizer='Adam', metrics=['accuracy'])
model.fit(x, y.reshape(1, -1), epochs=30, batch_size=21)
ValueError: Data cardinality is ambiguous:
  x sizes: 21
  y sizes: 1
Please provide data which shares the same first dimension.
I have replicated the same issue with a sample CSV file. I found that you do not need to standardize or reshape the label here. Check the code snippet below:
x = data.iloc[:, 0:7].values
y = data.iloc[:, 7].values

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x = sc.fit_transform(x)
# y = sc.fit_transform(y.reshape(1, -1))  # no need to standardize the label

from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(2, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', optimizer='Adam', metrics=['accuracy'])
model.fit(x, y, epochs=10, batch_size=21)  # pass y instead of y.reshape(1, -1)
Output:
Epoch 1/10
159/159 [==============================] - 4s 8ms/step - loss: 97.8257 - accuracy: 3.0120e-04
Epoch 2/10
159/159 [==============================] - 1s 9ms/step - loss: 94.2889 - accuracy: 3.0120e-04
Epoch 3/10
159/159 [==============================] - 1s 8ms/step - loss: 90.9619 - accuracy: 3.0120e-04
Epoch 4/10
159/159 [==============================] - 1s 8ms/step - loss: 89.7787 - accuracy: 3.0120e-04
Epoch 5/10
159/159 [==============================] - 1s 8ms/step - loss: 89.5762 - accuracy: 3.0120e-04
Epoch 6/10
159/159 [==============================] - 1s 7ms/step - loss: 89.5045 - accuracy: 3.0120e-04
Epoch 7/10
159/159 [==============================] - 1s 8ms/step - loss: 89.4718 - accuracy: 3.0120e-04
Epoch 8/10
159/159 [==============================] - 1s 7ms/step - loss: 89.4559 - accuracy: 3.0120e-04
Epoch 9/10
159/159 [==============================] - 1s 7ms/step - loss: 89.4472 - accuracy: 3.0120e-04
Epoch 10/10
159/159 [==============================] - 1s 9ms/step - loss: 89.4420 - accuracy: 3.0120e-04
<keras.callbacks.History at 0x7f59f0b1c7d0>
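To see why the original call failed (a minimal sketch with a dummy NumPy vector, not your CSV): with 21 rows, y.reshape(1, -1) has shape (1, 21), so Keras counts a single sample for y while x still has 21, hence the cardinality error. If you ever do want to scale the target, reshape it to a column with reshape(-1, 1) instead.
import numpy as np

y = np.arange(21, dtype=float)   # stands in for data.iloc[:, 2].values
print(y.reshape(1, -1).shape)    # (1, 21) -> Keras sees 1 sample: cardinality mismatch with x
print(y.reshape(-1, 1).shape)    # (21, 1) -> 21 samples, matches x if you really want to scale the target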
This is my CNN model structure.
def make_dcnn_model():
    model = models.Sequential()
    model.add(layers.Conv2D(5, (5, 5), input_shape=(9, 128, 1), padding='same', strides=(1, 2),
                            activity_regularizer=tf.keras.regularizers.l1(0.001)))
    model.add(layers.LeakyReLU())
    model.add(BatchNormalization())
    model.add(layers.AveragePooling2D((4, 4), strides=(2, 4)))
    model.add(layers.Conv2D(10, (5, 5), padding='same', activity_regularizer=tf.keras.regularizers.l1(0.001)))
    model.add(layers.LeakyReLU())
    model.add(BatchNormalization())
    model.add(layers.AveragePooling2D((2, 2), strides=(1, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(50, activity_regularizer=tf.keras.regularizers.l1(0.001)))
    model.add(layers.LeakyReLU())
    model.add(BatchNormalization())
    model.add(layers.Dense(6, activation='softmax'))
    return model
The result shows that this model fits the training data well, but for the validation data the validation accuracy fluctuates greatly.
Train on 7352 samples, validate on 2947 samples
Epoch 1/3000
7352/7352 [==============================] - 3s 397us/sample - loss: 0.1016 - accuracy: 0.9698 - val_loss: 4.0896 - val_accuracy: 0.5816
Epoch 2/3000
7352/7352 [==============================] - 2s 214us/sample - loss: 0.0965 - accuracy: 0.9727 - val_loss: 1.2296 - val_accuracy: 0.7384
Epoch 3/3000
7352/7352 [==============================] - 1s 198us/sample - loss: 0.0930 - accuracy: 0.9727 - val_loss: 0.9901 - val_accuracy: 0.7855
Epoch 4/3000
7352/7352 [==============================] - 2s 211us/sample - loss: 0.1013 - accuracy: 0.9701 - val_loss: 0.5319 - val_accuracy: 0.9114
Epoch 5/3000
7352/7352 [==============================] - 1s 201us/sample - loss: 0.0958 - accuracy: 0.9721 - val_loss: 0.6938 - val_accuracy: 0.8388
Epoch 6/3000
7352/7352 [==============================] - 2s 205us/sample - loss: 0.0925 - accuracy: 0.9743 - val_loss: 1.4033 - val_accuracy: 0.7472
Epoch 7/3000
7352/7352 [==============================] - 1s 203us/sample - loss: 0.0948 - accuracy: 0.9740 - val_loss: 0.8375 - val_accuracy: 0.7998
Reducing overfitting is a matter of trial and error. There are many ways to deal with it:
Try adding more data to the model, or augment your data if you're dealing with images (very helpful).
Try reducing the complexity of the model by tweaking the parameters of the layers.
Try stopping the training earlier.
Regularization and batch normalization are very helpful, but your model might already be performing much worse without them in terms of overfitting. Try different types of regularization, e.g. Dropout (see the sketch below).
My guess is that by adding more variety to the data your model will overfit less.
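As an illustration only (not your exact model), here is a minimal sketch of where Dropout could be inserted; the layer sizes are copied from your question, the l1 activity regularizers are left out to show Dropout as the alternative kind of regularization, and the 0.3 rate is just an assumed starting point to tune:
import tensorflow as tf
from tensorflow.keras import layers, models

def make_dcnn_model_with_dropout(rate=0.3):  # hypothetical helper, rate is a guess to tune
    model = models.Sequential()
    model.add(layers.Conv2D(5, (5, 5), input_shape=(9, 128, 1), padding='same', strides=(1, 2)))
    model.add(layers.LeakyReLU())
    model.add(layers.BatchNormalization())
    model.add(layers.AveragePooling2D((4, 4), strides=(2, 4)))
    model.add(layers.Conv2D(10, (5, 5), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.BatchNormalization())
    model.add(layers.AveragePooling2D((2, 2), strides=(1, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dropout(rate))   # randomly zeroes units during training only
    model.add(layers.Dense(50))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(rate))
    model.add(layers.Dense(6, activation='softmax'))
    return model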
I've got an image classification model with a CNN, using transfer learning from MobileNet. At the end of MobileNet I added 4 layers to learn weights for my images; MobileNet's own weights are not updated. With this model I got about 91% accuracy during training, but when I evaluated it on the same training set (train_generator) I got a much lower accuracy of about 41%. Why are the results different when I used the same training set? Is there a difference between model.fit_generator's accuracy and model.evaluate_generator's, or is something wrong with the data? Please help: how can I improve the accuracy? Here is my entire code below.
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.mobilenet import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

base_model = MobileNet(weights='imagenet', include_top=False)

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
x = Dense(1024, activation='relu')(x)
x = Dense(512, activation='relu')(x)
preds = Dense(7, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=preds)

# Freeze the MobileNet base; train only the 4 new layers
for layer in model.layers[:-4]:
    layer.trainable = False
for layer in model.layers[-4:]:
    layer.trainable = True

train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_generator = train_datagen.flow_from_directory('/Users/LG/Desktop/finger',
                                                    target_size=(224, 224),
                                                    color_mode='rgb',
                                                    batch_size=32,
                                                    class_mode='categorical',
                                                    shuffle=True)

model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])

step_size_train = train_generator.n // train_generator.batch_size
model.fit_generator(generator=train_generator,
                    steps_per_epoch=step_size_train,
                    epochs=10)
Epoch 1/10
17/17 [==============================] - 53s 3s/step - loss: 1.9354 - acc: 0.3026
Epoch 2/10
17/17 [==============================] - 52s 3s/step - loss: 1.1933 - acc: 0.5276
Epoch 3/10
17/17 [==============================] - 52s 3s/step - loss: 0.8936 - acc: 0.6787
Epoch 4/10
17/17 [==============================] - 54s 3s/step - loss: 0.6040 - acc: 0.7843
Epoch 5/10
17/17 [==============================] - 53s 3s/step - loss: 0.5367 - acc: 0.8080
Epoch 6/10
17/17 [==============================] - 55s 3s/step - loss: 0.2676 - acc: 0.9099
Epoch 7/10
17/17 [==============================] - 52s 3s/step - loss: 0.4531 - acc: 0.8387
Epoch 8/10
17/17 [==============================] - 53s 3s/step - loss: 0.3580 - acc: 0.8747
Epoch 9/10
17/17 [==============================] - 55s 3s/step - loss: 0.1963 - acc: 0.9301
Epoch 10/10
17/17 [==============================] - 53s 3s/step - loss: 0.2237 - acc: 0.9133
model.evaluate_generator(train_generator, steps=5)
[2.169835996627808, 0.41875]
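Not an authoritative diagnosis, but one thing worth checking: the evaluation above only runs steps=5 (about 160 images) drawn from a generator with shuffle=True, so it is not measuring the same thing as the per-epoch training accuracy. A minimal sketch of evaluating over the whole directory with a fresh, non-shuffled generator (same assumed path and preprocessing as in the question):
import math

eval_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
eval_generator = eval_datagen.flow_from_directory('/Users/LG/Desktop/finger',
                                                  target_size=(224, 224),
                                                  color_mode='rgb',
                                                  batch_size=32,
                                                  class_mode='categorical',
                                                  shuffle=False)  # fixed order, every image seen exactly once

eval_steps = math.ceil(eval_generator.n / eval_generator.batch_size)
loss, acc = model.evaluate_generator(eval_generator, steps=eval_steps)
print(loss, acc)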