What Does the Accuracy Metric Mean in Keras' Sample Denoising Autoencoder? - python

I am working with Keras' sample denoising autoencoder:
https://keras.io/examples/mnist_denoising_autoencoder/
When I compile it, I use the following options:
autoencoder.compile(loss='mse', optimizer='adadelta', metrics=['accuracy'])
I then trained it, deliberately WITHOUT using the noisy training data (x_train_noisy), and merely tried to recover x_train:
autoencoder.fit(x_train, x_train, epochs=30, batch_size=128)
After training on the 60,000 MNIST digits, it reports an accuracy of 81.25%. Does that mean 60000 * 81.25% of the images are PERFECTLY recovered (equal to the original input pixel by pixel), i.e. that 81.25% of the autoencoder's outputs are IDENTICAL to their input counterparts, or does it mean something else?
Furthermore, I also ran a manual check, comparing the output with the original data (60,000 28x28 matrices) pixel by pixel, counting the non-zero elements of their difference:
x_decoded = autoencoder.predict(x_train)
temp = x_train * 255
x_train_uint8 = temp.astype('uint8')
temp = x_decoded * 255
x_decoded_uint8 = temp.astype('uint8')
c = np.count_nonzero(x_train_uint8 - x_decoded_uint8)
cp = 1 - c / 60000 / 28 / 28
Yet cp is only about 71%. Could anyone tell me why there is a difference?

Accuracy doesn't make sense for a regression problem, which is why the Keras sample doesn't use that metric in autoencoder.compile.
In this case, Keras calculates the accuracy with the following metric:
binary_accuracy
def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
Using the following numpy implementation, you should get the same value that Keras reports for validation accuracy at the end of training:
x_decoded = autoencoder.predict(x_test_noisy)
acc = np.mean(np.equal(x_test, np.round(x_decoded)))
print(acc)
Refer to this answer for more details:
What function defines accuracy in Keras when the loss is mean squared error (MSE)?
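To see why the rounding-based metric can sit well above an exact pixel-by-pixel match, here is a small self-contained numpy illustration with made-up pixel values (my own example, not taken from the Keras sample):
import numpy as np

# Hypothetical pixel values in [0, 1]; most MNIST background pixels are exactly 0
y_true = np.array([0.0, 0.0, 1.0, 0.9, 0.0])
y_pred = np.array([0.02, 0.0, 0.97, 0.80, 0.10])

# Keras' binary_accuracy: compare y_true to the rounded prediction
binary_acc = np.mean(np.equal(y_true, np.round(y_pred)))

# Exact 8-bit match, as in the manual uint8 check in the question
exact = np.mean((y_true * 255).astype('uint8') == (y_pred * 255).astype('uint8'))

print(binary_acc)  # 0.8 -> four of the five pixels agree after rounding
print(exact)       # 0.2 -> only the pixel that is reproduced exactly matches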

Related

Why do I need a very high learning rate for this model to converge?

I have a simple model in TensorFlow which is being trained on the first 1000 images of the MNIST dataset. From my previous experience, the learning rates I used were on the order of 0.001; however, for my model to converge the learning rate needs to be far higher, at least larger than 1. The model is shown below.
def gen_model():
    return tf.keras.models.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='sigmoid'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

model = gen_model()
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=5), loss='mean_squared_error')
model.summary()
model.fit(x_train, y_train, batch_size=1000, epochs=10000)
Is it expected for models of this form to require an extremely high learning rate, or is there something I have missed? When I use a learning rate of around 0.001, the loss changes incredibly slowly.
The dataset was created with the following code:
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype("float32") / 255.0
x_train = x_train.reshape(60000, 28, 28)[:1000]
y_train = y_train[:1000]
y_train = tf.one_hot(y_train, 10)
Generally speaking, models that require learning rates larger than 1 raise a red flag for me. It seems like your model is a vanilla multilayer perceptron, so there's nothing overly complicated about that, but there are a couple things about your setup that stand out:
The output from your model uses a softmax, which is normally used to represent values from a categorical distribution (i.e., 1-of-k); this is typical for a classification model. But the loss you're using is typically used for optimizing Gaussian or regression outputs. You might want to try a cross-entropy loss to see whether that helps (see the sketch after these points).
The output from your model is in probability space, so the values you get out of it lie in [0, 1]. The loss you're using averages the squared differences between the model output and the target one-hot vector (whose values are in {0, 1}). The value you get for this loss is always smaller than 1, so with a learning rate that is also less than 1, the weight updates you apply are always going to be small. Sometimes that's a good thing, but my guess is that in this case, particularly at the start of training when the model weights aren't near their optimal values, training is going to be quite slow.
Related to the above point, you might try initializing your model weights with a larger range of values than the default. This would help make the gradient values larger, but could also make the model more likely to diverge.
You could also try replacing your softmax output activation with a plain linear activation, in effect converting your model's output to (unnormalized) log-probability space. You would then need your dataset labels to also represent target log-probability values, which isn't possible exactly, but you could get close with something like -1e8 * (1 - one_hot). If you wanted to go this route, though, you'd effectively be implementing a cross-entropy loss yourself; see the first point.
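As a sketch of the first suggestion, reusing gen_model and the data built in the question, with the loss swapped for categorical cross-entropy and a conventional learning rate (the exact rate and epoch count here are assumptions to be tuned, not prescribed values):
import tensorflow as tf

model = gen_model()  # same architecture as defined in the question
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # assumed, conventional value
    loss='categorical_crossentropy',  # matches the softmax output and the one-hot labels
    metrics=['accuracy'],
)
model.fit(x_train, y_train, batch_size=1000, epochs=100)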

Keras neural network accuracy problem (100% in a few epochs)

I'm writing here hoping to solve a problem I'm having with a neural network developed in Python using Keras.
I'm new to the deep learning world; I'm studying the theory and trying to implement some code.
Goal: develop a net that allows me to recognize 2 different spoken words (commands) that I say [in the future they will be used to drive a small robot-car]. For now I only want to achieve the identification of "yes/no".
Implementation: I'm trying to implement a binary classification network.
Here is my code:
I used librosa to convert the audio training and test sets into a matrix input with 193 features.
To avoid possible normalization problems I scaled the data using the preprocessing package (I saw that this effectively improves and affects performance). I noticed that if I don't normalize the training, test and to-be-analyzed data with the same normalization, it doesn't work.
I read that Keras accepts numpy arrays as input, so I converted the target y into a numpy array.
I proceeded with the construction of the model, training and testing (I know that I'm using methods that are actually deprecated).
I use one audio file to perform another test, because in the future I assume the net will receive (and judge) one audio sample at a time.
import support.myutilities as utilNP
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from sklearn.preprocessing import StandardScaler

# READ AND BUILD INPUT
X_si = utilNP.np_array_features_dir('DIRECTORY PATH')
X_no = utilNP.np_array_features_dir('DIRECTORY PATH')
X_tot = np.concatenate((X_si, X_no), axis=0)

# Scale the train set
scaler = StandardScaler().fit(X_tot)
X_train = scaler.transform(X_tot)

# 0=si 1=no
y = []
for i in range(len(X_si)):
    y.append(0)
for i in range(len(X_no)):
    y.append(1)
Y = np.array(y)

# READ AND BUILD TEST SET
X_si_test = utilNP.np_array_features_dir('DIRECTORY PATH')
X_no_test = utilNP.np_array_features_dir('DIRECTORY PATH')
X_tot_test = np.concatenate((X_si_test, X_no_test), axis=0)

# Scale the test set
scaler2 = StandardScaler().fit(X_tot_test)
X_test = scaler2.transform(X_tot_test)

y_test = []
for i in range(len(X_si_test)):
    y_test.append(0)
for i in range(len(X_no_test)):
    y_test.append(1)
Y_test = np.array(y_test)

###### BUILD MODEL
model = Sequential()
model.add(Dense(100, input_dim=len(X_tot[0]), activation='relu'))  # 193 features as input
model.add(Dense(50, activation='relu'))
model.add(Dense(1, activation='sigmoid'))  # 1 output

# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
model.fit(X_train, Y, epochs=300, verbose=1)

# test
scores = model.evaluate(X_test, Y_test, verbose=0)
print('Accuracy on training data: {}% \n Error on training data: {}'.format(scores[1], 1 - scores[1]))

predictions = model.predict(X_test)
for i in range(len(predictions)):
    print('=> %d (expected %d)' % (predictions[i], y_test[i]))

# TEST WITH A PRACTICAL NEW SOUND: supposed acquired
file_name = 'PATH AUDIO'
X = utilNP.np_array_features(file_name)

# Normalize according to input data
X_analyze = scaler2.transform(X)

y_analysis = []
y_analysis.append(1)  # I assume that this audio is the one that returns 1

pred_test = model.predict(X_analyze)
scores2 = model.evaluate(X_analyze, np.array(y_analysis), verbose=0)
print('Accuracy on test data: {}% \n Error on test data: {}'.format(scores2[1], 1 - scores2[1]))
Problems:
Accuracy goes to 100% in very few epochs. It's true that the training set is not very big (300 samples in total, plus 40 for testing), but this result is clearly wrong. That said, if I use more than 100 epochs, the net works well and does its job (in practice, the single case study is recognized correctly).
If the number of epochs is low (20, for example), accuracy still reaches 100% after a few iterations, but the training is affected by errors in the results (why are some samples not recognized?) and the final prediction is wrong. This is not normal: I would expect a low accuracy to justify the wrong answers, but it stays at 100%.
I tried a lot of solutions, from setting 'training=True/False' to reading many answers here and on Stack Exchange, but I haven't solved anything.
Is there something wrong in my code?
Thanks in advance.
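For reference, the usual scikit-learn pattern is to fit a single scaler on the training features and reuse it for the test set and any newly acquired audio. This is only a generic sketch with made-up array shapes, not necessarily the cause of the behaviour above:
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train_raw = np.random.rand(300, 193)  # hypothetical training features
X_test_raw = np.random.rand(40, 193)    # hypothetical test features
x_new = np.random.rand(1, 193)          # one newly acquired audio sample

scaler = StandardScaler().fit(X_train_raw)  # fit on the training data only
X_train = scaler.transform(X_train_raw)
X_test = scaler.transform(X_test_raw)       # reuse the same statistics
X_new = scaler.transform(x_new)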

Selecting validation metric for `categorical_crossentropy` in Keras

I am looking at these two questions and documentation:
Whats the output for Keras categorical_accuracy metrics?
Categorical crossentropy need to use categorical_accuracy or accuracy as the metrics in keras?
https://keras.io/api/metrics/probabilistic_metrics/#categoricalcrossentropy-class
For classification of X-ray images (15 classes) I do:
# Compile a model
model1.compile(optimizer='adam', loss='categorical_crossentropy',
               metrics=['accuracy'])

# Fit the model
history1 = model1.fit_generator(train_generator, epochs=10,
                                steps_per_epoch=10, verbose=1,
                                validation_data=valid_generator)
My model works and I have an output:
But I am not sure how to add validation accuracy here to compare results and avoid over/underfitting.
I hope the following can help you:
The use of "categorical_crossentropy" tells me that your labels are a one-hot encoding over the different classes.
Let's say you have 15 classes; the correct label is then a vector with 14 zeros and a one at the corresponding index. In this context plain "accuracy" will be very high, since your model will be correctly predicting mostly zeros everywhere, so the accuracy should easily be at least 13/15 ≈ 0.87.
A more suitable metric is "categorical_accuracy", which gives you 1 if the model predicts the correct index, and 0 otherwise.
If your validation "categorical_accuracy" is better than 1/15 ≈ 0.067 (assuming your classes are correctly balanced), your model is better than random.
You can find a list of metrics at keras metrics.
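A minimal sketch of tracking categorical accuracy on the validation generator, reusing model1, train_generator and valid_generator from the question (fit_generator is kept only to match the original code; newer Keras versions accept generators in fit directly):
# Track categorical accuracy on both the training and validation generators
model1.compile(optimizer='adam',
               loss='categorical_crossentropy',
               metrics=['categorical_accuracy'])

history1 = model1.fit_generator(train_generator,
                                epochs=10,
                                steps_per_epoch=10,
                                verbose=1,
                                validation_data=valid_generator)

# Keras records the validation metric under the 'val_' prefix
print(history1.history['val_categorical_accuracy'])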

Keras RNN accuracy doesn't improve

I'm trying to improve my model so it becomes a bit more accurate. Right now I'm training the model and this is what I get as my training and validation accuracy.
For every epoch I get a training accuracy of 0.0003 and a validation accuracy of 0. I know this isn't good, but I don't know how I can fix it.
The data is normalized with the MinMax scaler. 4 of the 8 features are normalized (the other 4 are hour, day, day_of_week and month).
Update:
I've also tried to normalize the entire dataset and it doesn't make a difference:
scaling = MinMaxScaler(feature_range=(0, 1)).fit(df[cols])
df[cols] = scaling.transform(df[cols])
My model: the data shape is (5351, 1, 8) and the input_shape is (1, 8).
model = keras.Sequential()
model.add(keras.layers.Bidirectional(keras.layers.LSTM(2, input_shape=(X_train.shape[1], X_train.shape[2]), return_sequences=True, activation='linear')))
model.add(keras.layers.Dense(1))
model.compile(loss='mean_squared_error', optimizer='Adamax', metrics=['acc'])

history = model.fit(
    X_train, y_train,
    epochs=200,
    batch_size=24,
    validation_split=0.35,
    shuffle=False,
)
I tried using the answer to this question:
Keras model accuracy not improving
but it didn't work.
A mean_squared_error loss is for regression tasks, while the acc metric is for classification problems, so it makes no sense to use them together.
If you are working on a classification problem, use binary_crossentropy or categorical_crossentropy as the loss and keep the metrics parameter as you did.
If it is a regression task, change the metric to ['mse'] (mean squared error) instead of ['acc'].
Your model "works" and you have applied the standard backpropagation formula by using the mean squared error loss. But measuring accuracy makes Keras check whether your model's output is EXACTLY equal to the expected values. Since the loss function is made for regression, it will hardly ever be equal.
Three last points, because that little change won't correct everything (see the sketch after these points).
Firstly, your last Dense layer should have an activation function. (It's safer.)
Secondly, I'm pretty sure a Bidirectional+LSTM layer placed before a Dense layer should have return_sequences=False. An LSTM layer (with or without Bidirectional) can return the full sequence of vectors (like a matrix), but a Dense layer takes vectors as input. In this case it will still work, because of the third point.
The last point is about the shape of your data. You have 5351 examples of shape (1, 8), each of which is a sequence of length one containing a vector of size 8. But an LSTM layer takes a sequence of vectors, and here the length of your sequence is one, so I don't know whether an RNN-type layer is relevant here at all.
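Putting these points together, one possible regression setup based on the question's X_train and y_train might look like the following; the explicit linear activation and the ['mse'] metric are suggestions, not the only valid choices:
from tensorflow import keras

model = keras.Sequential()
model.add(keras.layers.Bidirectional(
    keras.layers.LSTM(2, input_shape=(X_train.shape[1], X_train.shape[2]),
                      return_sequences=False)))        # one output vector per sample
model.add(keras.layers.Dense(1, activation='linear'))  # explicit linear output for regression
model.compile(loss='mean_squared_error',
              optimizer='Adamax',
              metrics=['mse'])                          # regression metric instead of 'acc'

history = model.fit(X_train, y_train,
                    epochs=200, batch_size=24,
                    validation_split=0.35, shuffle=False)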

R-squared results of test and validation differ by a huge margin

I am working on a regression problem with Keras and TensorFlow using a neural network. The data is split so that 282774 samples are for training, 70694 for validation and 88367 for testing. To evaluate my models I print out the mean squared error (MSE), the mean absolute error (MAE) and the R-squared score. These are some examples of the results I get:
             MSE           MAE           R-squared
Training     1.562072899   0.958128839     0.849787137
Validation   0.687871457   0.62066941      0.935365564
Test         0.683918759   0.618674863   -16.22829222
I do not understand the value of R-squared on the test data. I know that R-squared can be negative, but how can there be such a big difference between validation and test if both fall into the category of unseen data? Can someone give me a hint?
Some background information:
Since Keras does not have an R-squared metric built in, I implemented it with some code I found on the web, which seems logical to me:
def r2_keras(y_true, y_pred):
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1 - SS_res / (SS_tot + K.epsilon())
And if it helps: this is my model:
model = Sequential()
model.add(Dense(75, input_shape=(7,)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='linear'))

adam = optimizers.Adam(lr=0.001)
model.compile(loss='mse',
              optimizer=adam,
              metrics=['mse', 'mae', r2_keras])

history = model.fit(x_train, y_train,
                    epochs=50,
                    batch_size=32,
                    validation_split=0.2)

score = model.evaluate(x_test, y_test, batch_size=32)
One strange thing I noticed is that not all of the testing data seems to be considered. The console prints out the following:
86304/88367 [============================>.] - ETA: 0s-----
Maybe this leads to a miscalculation of R-squared?
I am thankful for any help/hint I can get on understanding this issue.
Update:
I checked for outliers but could not find any significant ones. The min and max values for test and train are close, considering the standard deviation, and the histograms look very much alike.
So in the next step I let my model predict the values for the test data again and used pandas + numpy to calculate the r2_score. This time I got a value that is approximately equal to the r2_score for validation.
Below is how I did it. Do you see any flaws in the way I performed the calculation? (I just want to be sure that the old r2_score for "test" was indeed a calculation error.)
# "test" is a dataframe with input data and the real outputs
# "inputs" is a list of the input column names
# The real/true outputs are contained in the column "output"
test['output_pred'] = model.predict(x=np.array(test[inputs]))
output_mean = test['output'].mean()  # Is this the correct mean value for r2 here?
test['SSres'] = np.square(test['output'] - test['output_pred'])
test['SStot'] = np.square(test['output'] - output_mean)
r2 = 1 - (test['SSres'].sum() / test['SStot'].sum())
TensorFlow's built-in evaluate method runs over your test set batch by batch and therefore calculates r2 on each batch. The metric reported by model.evaluate() is then the simple average of the r2 values from all batches. In model.fit(), by contrast, r2 (and all metrics on the validation set) is calculated once per epoch rather than per batch and then averaged.
You may slice your output and output_pred into batches of the same batch size you used in model.evaluate() and calculate r2 on each batch. My guess is that the model produces a high r2 on batches with a high total sum of squares (SS_tot) and a bad r2 on batches with a low one, so when you take the average the result is poor (whereas when you calculate r2 on the entire dataset, samples with a higher SS_tot usually dominate the result).
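A small numpy sketch with made-up numbers shows how averaging per-batch R-squared values can differ sharply from R-squared computed on the full set, especially when individual batches have little target variance:
import numpy as np

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / (ss_tot + 1e-7)

rng = np.random.default_rng(0)
y_true = np.sort(rng.normal(size=1024))              # ordered targets: each batch has little spread
y_pred = y_true + rng.normal(scale=0.3, size=1024)   # predictions with moderate error

print(r2(y_true, y_pred))   # R-squared on the full set: roughly 0.9

batch_size = 32
scores = [r2(y_true[i:i + batch_size], y_pred[i:i + batch_size])
          for i in range(0, len(y_true), batch_size)]
print(np.mean(scores))      # average of per-batch scores: far lower, strongly negative here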
