Getting different results from Keras model.evaluate and model.predict - python

I have trained a model to predict topic categories using word2vec and an LSTM model in Keras, and got about 98% accuracy during training. I saved the model, then loaded it in another file to try it on the test set. I used model.evaluate and model.predict, and the results were very different.
I'm using Keras with TensorFlow as the backend. The model summary is:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_1 (LSTM) (None, 22) 19624
_________________________________________________________________
dropout_1 (Dropout) (None, 22) 0
_________________________________________________________________
dense_1 (Dense) (None, 40) 920
_________________________________________________________________
activation_1 (Activation) (None, 40) 0
=================================================================
Total params: 20,544
Trainable params: 20,544
Non-trainable params: 0
_________________________________________________________________
None
The code:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.load_weights(os.path.join('model', 'lstm_model_weights.hdf5'))
score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print()
print('Score: %1.4f' % score)
print('Evaluation Accuracy: %1.2f%%' % (acc*100))
predicted = model.predict(x_test, batch_size=batch_size)
acc2 = np.count_nonzero(predicted.argmax(1) == y_test.argmax(1))/y_test.shape[0]
print('Prediction Accuracy: %1.2f%%' % (acc2*100))
The output of this code is
39680/40171 [============================>.] - ETA: 0s
Score: 0.1192
Evaluation Accuracy: 97.50%
Prediction Accuracy: 9.03%
Can anyone tell me what I missed?

I think model evaluation works on the dev set (or the average of the dev-set accuracy if you use cross-validation), but prediction works on the test set.
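Note that with loss='binary_crossentropy' and metrics=['accuracy'], older Keras versions resolve 'accuracy' to element-wise binary accuracy, which on one-hot multi-class targets reads far higher than argmax accuracy. A minimal sketch to check both numbers by hand on the same x_test (assuming one-hot y_test as in the question):
import numpy as np
probs = model.predict(x_test, batch_size=batch_size)
# Element-wise "binary" accuracy: the fraction of the 40 per-class outputs
# that round to the correct 0/1 target, averaged over all samples.
binary_acc = np.mean((probs > 0.5) == y_test.astype(bool))
# Argmax ("categorical") accuracy: the fraction of samples whose
# highest-scoring class matches the true class.
categorical_acc = np.mean(probs.argmax(axis=1) == y_test.argmax(axis=1))
print('Binary-style accuracy: %1.2f%%' % (binary_acc * 100))
print('Argmax-style accuracy: %1.2f%%' % (categorical_acc * 100))
If these two numbers reproduce the evaluate/predict gap, compiling with loss='categorical_crossentropy' (or metrics=['categorical_accuracy']) should make evaluate report the argmax-style figure.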

Related

model training starts all over again after unfreezing weights in tensorflow

I am training an image classifier using Large EfficientNet:
base_model = EfficientNetV2L(input_shape = (300, 500, 3),
                             include_top = False,
                             weights = 'imagenet',
                             include_preprocessing = True)
model = tf.keras.Sequential([base_model,
                             layers.GlobalAveragePooling2D(),
                             layers.Dropout(0.2),
                             layers.Dense(128, activation = 'relu'),
                             layers.Dropout(0.3),
                             layers.Dense(6, activation = 'softmax')])
base_model.trainable = False
model.compile(optimizer = optimizers.Adam(learning_rate = 0.001),
              loss = losses.SparseCategoricalCrossentropy(),
              metrics = ['accuracy'])
callback = [callbacks.EarlyStopping(monitor = 'val_loss', patience = 2)]
history = model.fit(ds_train, batch_size = 28, validation_data = ds_val, epochs = 20, verbose = 1, callbacks = callback)
it is working properly.
model summary:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
efficientnetv2-l (Functiona (None, 10, 16, 1280) 117746848
l)
global_average_pooling2d (G (None, 1280) 0
lobalAveragePooling2D)
dropout (Dropout) (None, 1280) 0
dense (Dense) (None, 128) 163968
dropout_1 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 6) 774
=================================================================
Total params: 117,911,590
Trainable params: 164,742
Non-trainable params: 117,746,848
_________________________________________________________________
output:
Epoch 4/20
179/179 [==============================] - 203s 1s/step - loss: 0.1559 - accuracy: 0.9474 - val_loss: 0.1732 - val_accuracy: 0.9428
But, while fine-tuning it, I am unfreezing some weights:
base_model.trainable = True
fine_tune_at = 900
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False
model.compile(optimizer = optimizers.Adam(learning_rate = 0.0001),
              loss = losses.SparseCategoricalCrossentropy(),
              metrics = ['accuracy'])
history = model.fit(ds_train, batch_size = 28, validation_data = ds_val, epochs = 20, verbose = 1, callbacks = callback)
model summary:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
efficientnetv2-l (Functiona (None, 10, 16, 1280) 117746848
l)
global_average_pooling2d (G (None, 1280) 0
lobalAveragePooling2D)
dropout (Dropout) (None, 1280) 0
dense (Dense) (None, 128) 163968
dropout_1 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 6) 774
=================================================================
Total params: 117,911,590
Trainable params: 44,592,230
Non-trainable params: 73,319,360
_________________________________________________________________
And it is starting the training all over again. The first time, when I trained it with frozen weights, the loss decreased to 0.1559; after unfreezing the weights, the model started training again from loss = 0.444. Why is this happening? I think fine-tuning shouldn't reset the weights.
When training starts again, Adam's per-parameter state is reinitialized and the learning rate is reset to the initial value; maybe that is the reason for the big jump after you restart the learning. You can also save and load the optimizer state along with the model when saving/loading it. Maybe look here. You are also retraining a lot of parameters; maybe reduce the number of trainable parameters. If you keep more of the old parameters frozen, the jump might not be as high.
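For reference, a minimal sketch of the save/load route (the filename is illustrative): saving the full model, rather than only its weights, also stores the optimizer state, so resuming fit does not reset Adam's moment estimates. Note that calling compile again would still reset them.
import tensorflow as tf
# Save architecture + weights + optimizer state in one file.
model.save('efficientnet_stage1.h5')
# Later: restore and resume training without recompiling, so Adam's
# per-parameter moments carry over from the previous run.
model = tf.keras.models.load_model('efficientnet_stage1.h5')
history = model.fit(ds_train, validation_data = ds_val, epochs = 20, callbacks = callback)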

Error in shape (dimension) and type of Keras model input

I am desperate to set the input shape of this simple Keras model :(
Both X and Y are numpy.ndarray, but I don't know what's wrong with it! I tried different X shapes, but the error is still there! The info about the datasets (dimensions, number of samples, etc.) is available in the code.
The .pkl file for X_train comes from the hidden state of a pre-trained model.
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from keras.utils import np_utils
from keras import Input, Model
from keras.layers import Dense
import numpy as np
############################## X_Train ############################
X_Train_3embed1 = pd.read_pickle("XX_Train_3embeding.pkl")
X_Train_3embed = np.array(X_Train_3embed1)
print("X-Train")
print(X_Train_3embed.shape) # (230, 1, 128)
print(type(X_Train_3embed)) # <class 'numpy.ndarray'>
print(X_Train_3embed[0].shape) # (1, 128)
print(type(X_Train_3embed[0])) # <class 'numpy.ndarray'>
############################## Y_Train ############################
Y_Train_labels_list = pd.read_pickle("lis_Y_all_Train.pkl")
print(type(Y_Train_labels_list)) #<class 'numpy.ndarray'>
print(type(Y_Train_labels_list[0])) #<class 'str'>
encoder = LabelEncoder()
encoder.fit(Y_Train_labels_list)
encoded_Y = encoder.transform(Y_Train_labels_list)
Y_my_Train = np_utils.to_categorical(encoded_Y)
print("Y-Train")
print(Y_my_Train.shape) #(230, 83)
print(type(Y_my_Train)) # <class 'numpy.ndarray'>
print(Y_my_Train[0].shape) # (83,)
print(type(Y_my_Train[0])) # <class 'numpy.ndarray'>
################################## Model ##################################
first_input = Input(shape=(1, 128))
first_dense = Dense(128)(first_input)
output_layer = Dense(83, activation='softmax')(first_dense)
model = Model(inputs=first_input, outputs=output_layer)
model.summary()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])
history = model.fit((X_Train_3embed, Y_my_Train), epochs=2, batch_size=32)
Here is the result:
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 1, 128) 0
_________________________________________________________________
dense_1 (Dense) (None, 1, 128) 16512
_________________________________________________________________
dense_2 (Dense) (None, 1, 83) 10707
=================================================================
Total params: 27,219
Trainable params: 27,219
Non-trainable params: 0
_________________________________________________________________
Traceback (most recent call last):
  File "/home/vahideh/PycharmProjects/3KArgen-master/MyTransferClassifier2.py", line 63, in <module>
    history = model.fit((X_Train_3embed, Y_my_Train), epochs=2, batch_size=32)
  File "/home/vahideh/PycharmProjects/MyVirtualEnvs/MyKargo/lib/python3.6/site-packages/keras/engine/training.py", line 1154, in fit
    batch_size=batch_size)
  File "/home/vahideh/PycharmProjects/MyVirtualEnvs/MyKargo/lib/python3.6/site-packages/keras/engine/training.py", line 579, in _standardize_user_data
    exception_prefix='input')
  File "/home/vahideh/PycharmProjects/MyVirtualEnvs/MyKargo/lib/python3.6/site-packages/keras/engine/training_utils.py", line 99, in standardize_input_data
    data = [standardize_single_array(x) for x in data]
  File "/home/vahideh/PycharmProjects/MyVirtualEnvs/MyKargo/lib/python3.6/site-packages/keras/engine/training_utils.py", line 99, in <listcomp>
    data = [standardize_single_array(x) for x in data]
  File "/home/vahideh/PycharmProjects/MyVirtualEnvs/MyKargo/lib/python3.6/site-packages/keras/engine/training_utils.py", line 34, in standardize_single_array
    elif x.ndim == 1:
AttributeError: 'tuple' object has no attribute 'ndim'
How can I feed these datasets to the model? Or how can I change the input shape of the model?
Your model's output has shape (None, 1, 83), i.e. each sample's output is 1 x 83, but each ground-truth label is a flat 83-dimensional vector. There are two ways to deal with this problem:
Flatten the model's outputs and continue using your data as-is.
Remove the unnecessary dimension from your data, i.e. flatten each sample from 1 x 128 to just 128, and change the model architecture to deal with 1D inputs, which makes the output 1D as well.
Fixed code:
Approach 1
from keras import Input, Model
from keras.layers import Dense, Flatten
import numpy as np
# Dummy data
X_Train_3embed = np.random.randn(230, 1, 128)
Y_my_Train = np.random.randn(230, 83)
#model
first_input = Input(shape=(1, 128))
first_dense = Dense(128)(first_input)
output_layer = Dense(83, activation='softmax')(first_dense)
outputs = Flatten()(output_layer)
model = Model(inputs=first_input, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])
model.summary()
model.fit(X_Train_3embed, Y_my_Train, epochs=2, batch_size=32)
Output:
Layer (type) Output Shape Param #
=================================================================
input_10 (InputLayer) [(None, 1, 128)] 0
_________________________________________________________________
dense_18 (Dense) (None, 1, 128) 16512
_________________________________________________________________
dense_19 (Dense) (None, 1, 83) 10707
_________________________________________________________________
flatten_6 (Flatten) (None, 83) 0
=================================================================
Total params: 27,219
Trainable params: 27,219
Non-trainable params: 0
_________________________________________________________________
Epoch 1/2
8/8 [==============================] - 1s 3ms/step - loss: 6.2275 - acc: 0.0162
Epoch 2/2
8/8 [==============================] - 0s 2ms/step - loss: 0.2639 - acc: 0.0150
Approach 2
# Dummy data
X_Train_3embed = np.random.randn(230, 1, 128)
Y_my_Train = np.random.randn(230, 83)
#model
first_input = Input(shape=(128))
first_dense = Dense(128)(first_input)
outputs = Dense(83, activation='softmax')(first_dense)
model = Model(inputs=first_input, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])
model.summary()
model.fit(X_Train_3embed.reshape(-1,128), Y_my_Train, epochs=2, batch_size=32)
Output:
Layer (type) Output Shape Param #
=================================================================
input_13 (InputLayer) [(None, 128)] 0
_________________________________________________________________
dense_24 (Dense) (None, 128) 16512
_________________________________________________________________
dense_25 (Dense) (None, 83) 10707
=================================================================
Total params: 27,219
Trainable params: 27,219
Non-trainable params: 0
_________________________________________________________________
Epoch 1/2
8/8 [==============================] - 0s 2ms/step - loss: -1.1705 - acc: 0.0100
Epoch 2/2
8/8 [==============================] - 0s 2ms/step - loss: -6.3587 - acc: 0.0015
Basically, MarcoCerliani is asking you to remove the tuple:
Your code: model.fit((X_Train_3embed, Y_my_Train), epochs=2, batch_size=32)
Changed code: model.fit(X_Train_3embed, Y_my_Train, epochs=2, batch_size=32)

"Processus stopped "during prediction after model with vgg model computed

I'm currently facing an issue with my TensorFlow pipeline.
I don't know if it's specific to TensorFlow or to Python.
I'm trying to compute a confusion matrix after training my VGG16 model.
So I used the model object returned by the fit method and tried to predict on the same features to compute my CM.
But the message "Processus arrêté" ("process stopped" in English) appears and the script stops running.
Here is the output :
Using TensorFlow backend.
Load audio features and labels : 100% Time: 0:00:50 528.41 B/s
VGG16 model with last layer changed
Number of label: 17322
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 4, 13, 512) 14713536
_________________________________________________________________
flatten (Flatten) (None, 26624) 0
_________________________________________________________________
dense (Dense) (None, 256) 6816000
_________________________________________________________________
dropout (Dropout) (None, 256) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 257
=================================================================
Total params: 21,529,793
Trainable params: 13,895,681
Non-trainable params: 7,634,112
_________________________________________________________________
2772/2772 [==============================] - 121s 44ms/step - loss: 0.2315 - acc: 0.9407 - val_loss: 0.0829 - val_acc: 0.9948
Processus arrêté
Here is the model :
def launch2(self):
    print("VGG16 model with last layer changed")
    x = np.array(self.getFeatures())[..., np.newaxis]
    print("Number of label: " + str(len(self.getLabels())))
    vgg_conv = VGG16(weights=None, include_top=False, input_shape=(128, 431, 1))
    # Freeze the layers except the last 4 layers
    for layer in vgg_conv.layers[:-4]:
        layer.trainable = False
    # Create the model
    model = tensorflow.keras.Sequential()
    # Add the vgg convolutional base model
    model.add(vgg_conv)
    opt = Adam(lr=1e-4)
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['acc'])
    model.summary()
    model.fit(x=x, y=self.getLabels(), shuffle=True, batch_size=5, epochs=1, validation_split=0.2, verbose=1)
    model.save('vggModelLastLayer.h5')
    self.testModel(model, x)
Here is the function which allows me to compute the CM:
def testModel(self, model, x):
    print("Informations about model still processing. Last step is long")
    y_labels = [int(i) for i in self.getLabels().tolist()]
    # predict_classes on a single sigmoid output returns an (n, 1) array of
    # 0/1 labels; flatten it to 1-D for the sklearn metrics below.
    classes = model.predict_classes(x)
    predicted_classes = classes.ravel()
    # Call model info (true labels, predicted labels)
    # self.modelInfo(y_labels, predicted_classes)
    from sklearn.metrics import classification_report
    from sklearn.metrics import confusion_matrix
    cm = confusion_matrix(y_labels, predicted_classes)
    target_names = ["Bulls", "No bulls"]
    print(classification_report(y_labels, predicted_classes, target_names=target_names))
    print(cm)
How could I fix this? Is it a memory leak or something?
Thank you in advance
I've found why this happened:
It was simply a memory issue. My RAM wasn't big enough to handle the total amount of data I had.
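For reference, a common workaround when the feature array is too large to predict on in one call is to run the prediction in chunks; a minimal sketch (the chunk count of 20 is an arbitrary illustration):
import numpy as np
# Predict chunk by chunk so only one slice of the feature array is being
# processed at a time; model.predict also batches internally via batch_size.
preds = []
for chunk in np.array_split(x, 20):
    preds.append(model.predict(chunk, batch_size=5))
predicted = np.concatenate(preds, axis=0)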

How to prevent overfitting in Keras sequential model?

I am trying to build a multiclass text-classification multilayer perceptron model, and I am already adding dropout regularization.
My model:
model = Sequential([
    Dropout(rate=0.2, input_shape=features),
    Dense(units=64, activation='relu'),
    Dropout(rate=0.2),
    Dense(units=64, activation='relu'),
    Dropout(rate=0.2),
    Dense(units=16, activation='softmax')])
My model.summary():
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dropout_1 (Dropout) (None, 20000) 0
_________________________________________________________________
dense_1 (Dense) (None, 64) 1280064
_________________________________________________________________
dropout_2 (Dropout) (None, 64) 0
_________________________________________________________________
dense_2 (Dense) (None, 64) 4160
_________________________________________________________________
dropout_3 (Dropout) (None, 64) 0
_________________________________________________________________
dense_3 (Dense) (None, 16) 1040
=================================================================
Total params: 1,285,264
Trainable params: 1,285,264
Non-trainable params: 0
_________________________________________________________________
None
Train on 6940 samples, validate on 1735 samples
I am getting:
Epoch 16/1000
- 4s - loss: 0.4926 - acc: 0.8719 - val_loss: 1.2640 - val_acc: 0.6640
Validation accuracy: 0.6639769498140736, loss: 1.2639631692545559
The validation accuracy is ~20% lower than the training accuracy, and the validation loss is much higher than the training loss.
I am already using dropout regularization, with epochs = 1000, batch size = 512, and early stopping on val_loss.
Any suggestions?
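For reference, one commonly suggested addition on top of dropout is L2 weight regularization on the dense layers; a minimal sketch of the model above with it added (the 0.001 factor is an arbitrary starting point, not a tuned value, and features is the input shape from the question):
from keras import regularizers
from keras.models import Sequential
from keras.layers import Dense, Dropout
model = Sequential([
    Dropout(rate=0.2, input_shape=features),
    # l2(0.001) penalizes large weights, which tends to narrow the
    # train/validation gap at some cost in training accuracy.
    Dense(units=64, activation='relu',
          kernel_regularizer=regularizers.l2(0.001)),
    Dropout(rate=0.2),
    Dense(units=64, activation='relu',
          kernel_regularizer=regularizers.l2(0.001)),
    Dropout(rate=0.2),
    Dense(units=16, activation='softmax')])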

Why does this neural network have zero accuracy and very low loss?

I have this network:
Tensor("input_1:0", shape=(?, 5, 1), dtype=float32)
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 5, 1) 0
_________________________________________________________________
bidirectional_1 (Bidirection (None, 5, 64) 2176
_________________________________________________________________
activation_1 (Activation) (None, 5, 64) 0
_________________________________________________________________
bidirectional_2 (Bidirection (None, 5, 128) 16512
_________________________________________________________________
activation_2 (Activation) (None, 5, 128) 0
_________________________________________________________________
bidirectional_3 (Bidirection (None, 1024) 656384
_________________________________________________________________
activation_3 (Activation) (None, 1024) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 1025
_________________________________________________________________
p_re_lu_1 (PReLU) (None, 1) 1
=================================================================
Total params: 676,098
Trainable params: 676,098
Non-trainable params: 0
_________________________________________________________________
None
Train on 27496 samples, validate on 6875 samples
I compile and fit it with:
model.compile(loss='mse',optimizer=Adamx,metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=10,validation_data=(x_test,y_test),verbose=2)
When I run it and also evaluate it on unseen data, it returns 0.0 accuracy with very low loss. I can't figure out what the problem is.
Epoch 10/10
- 29s - loss: 1.6972e-04 - acc: 0.0000e+00 - val_loss: 1.7280e-04 - val_acc: 0.0000e+00
What you are getting is expected. Your model is working correctly; it is your metric that is inappropriate. The aim of the optimization function is to minimize loss, not to increase accuracy.
Since you are using PReLU as the activation function of your last layer, you always get float outputs from the network. Comparing these float outputs with the actual labels as a measure of accuracy is not meaningful: because the outputs and labels are continuous, the probability of an exact match is zero. Even if the model predicts values very close to the true label, the accuracy will still be zero unless the model predicts exactly the same value as the true label, which is improbable.
E.g. if y_true is 1.0 and the model predicts 0.99999, this still adds nothing to the accuracy of the model, since 1.0 != 0.99999.
Update
The choice of metric depends on the type of problem. Keras also provides functionality for implementing custom metrics.
Assuming the problem in question is linear regression, and treating two values as equal if their difference is less than 0.01, a custom metric can be defined as:
import keras.backend as K
import tensorflow as tf

accepted_diff = 0.01
def linear_regression_equality(y_true, y_pred):
    diff = K.abs(y_true - y_pred)
    return K.mean(K.cast(diff < accepted_diff, tf.float32))
Now you can use this metric for your model:
model.compile(loss='mse',optimizer=Adamx,metrics=[linear_regression_equality])
