Keras Custom Objective requires Tensor Evaluation - python

I want to create a custom objective function for training a Keras deep net. I'm researching classification of imbalanced data, and I use the F1 score a lot in scikit-learn. I therefore had the idea of inverting the F1 metric (1 - F1 score) to use it as a loss function/objective for Keras to minimise while training:
from sklearn.metrics import f1_score

def F1Loss(y_true, y_pred):
    return 1. - f1_score(y_true, y_pred)
However, this f1_score method from scikit-learn requires numpy arrays or lists to calculate the F1 score. I found that Tensors need to be evaluated to their numpy array counterparts using .eval(), which requires a TensorFlow session to perform this task.
I do not know the session object that Keras uses. I have tried using the code below, assuming the Keras backend has its own session object defined somewhere, but this also did not work.
from keras import backend as K
K.eval(y_true)
Admittedly, this was a shot in the dark since I don't really understand the deeper workings of Keras or Tensorflow at the moment.
My question is: how do I evaluate the y_true and y_pred tensors to their numpy array counterparts?

Your problem is a classic problem with implementing a discontinuous objective in Theano. It's impossible for two reasons:
F1-score is discontinuous: here you can read what should be expected from an objective function in neural network training. The F1-score doesn't satisfy these conditions, so it cannot be used to train a neural network.
There is no equivalence between a Tensor and a numpy array: this is a fundamental issue. A Theano tensor is like the x in school equations. You cannot expect an algebraic variable to be equivalent to every object it might be assigned to. On the other hand, as part of a computational graph, the objective must be provided in terms of tensor operations. If it isn't, you cannot differentiate it w.r.t. the parameters, which makes the usual way of training a neural network impossible.
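If you still want to optimise something F1-like, a common workaround (not part of the original answer) is a "soft" F1 surrogate built entirely from differentiable backend operations. The sketch below assumes binary labels in y_true and sigmoid probabilities in y_pred; the function name is illustrative:
from keras import backend as K

def soft_f1_loss(y_true, y_pred):
    # Replace counts with sums of probabilities so the expression stays
    # differentiable w.r.t. the network parameters.
    tp = K.sum(y_true * y_pred)            # "soft" true positives
    fp = K.sum((1. - y_true) * y_pred)     # "soft" false positives
    fn = K.sum(y_true * (1. - y_pred))     # "soft" false negatives
    soft_f1 = 2. * tp / (2. * tp + fp + fn + K.epsilon())
    return 1. - soft_f1                    # minimise 1 - F1

model.compile(optimizer='adam', loss=soft_f1_loss)
Because this is a smooth approximation, its value will not match sklearn's f1_score exactly, but it can be minimised by gradient descent.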

If you have the predicted and actual tensors in numpy array format, then I guess you can use this code snippet:
import tensorflow as tf

correct_prediction = tf.equal(tf.argmax(actual_tensor, 1), tf.argmax(predicted_tensor, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
And in Keras, I think you can use this:
model.fit_generator(train_generator, validation_data=val_generator,
                    nb_val_samples=X_val.shape[0],
                    samples_per_epoch=X_train.shape[0],
                    nb_epoch=nb_epoch, verbose=1,
                    callbacks=[model_checkpoint, reduce_lr, tb], max_q_size=1000)
Here train_generator and val_generator generate the training and validation data during training, and this also prints the loss and accuracy while training.
Hope this helps...

Related

Best Practice for Transforming y_pred in Tensorflow's Metric

In my previous project, I needed to frame an image classification task as a regression problem. I implemented the regression model using Tensorflow, with a standard Sequential model whose last layer is a 1-node Dense layer with no activation function. In order to measure the performance, I need to use standard classification metrics, such as accuracy and Cohen's kappa.
However, I can't directly use those metrics because my model is a regression model, so I need to clip and round the output before feeding it to the metrics. I use a workaround by defining my own metric, but that workaround is not practical. Therefore, I'm thinking about contributing to Tensorflow by implementing a custom transformation_function that transforms y_pred with a Tensor lambda function before it is stored in the __update_state method. After reading the source code, I have doubts about this idea. So I'm asking you, fellow Tensorflow users/contributors: what is the best practice for transforming y_pred before feeding it to a metric? Is this functionality already implemented in the newest version?
Thank you!
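For reference, the kind of workaround described above could look like the following hedged sketch: subclass a built-in metric and transform y_pred inside update_state. The class name, the num_classes argument and the metric name are illustrative, not an existing TensorFlow API:
import tensorflow as tf

class RoundedAccuracy(tf.keras.metrics.Accuracy):
    """Clip and round a regression output before delegating to Accuracy."""
    def __init__(self, num_classes, name="rounded_accuracy", **kwargs):
        super(RoundedAccuracy, self).__init__(name=name, **kwargs)
        self.num_classes = num_classes

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Map the continuous regression output onto valid class indices.
        y_pred = tf.clip_by_value(tf.round(y_pred), 0, self.num_classes - 1)
        return super(RoundedAccuracy, self).update_state(y_true, y_pred, sample_weight)

model.compile(loss='mse', optimizer='adam', metrics=[RoundedAccuracy(num_classes=5)])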

Keras built-in MSE loss on 2D data returns 2D matrix, not scalar loss

I'm trying to evaluate the MSE loss for single 2D test samples with an autoencoder (AE) in Keras once the model is trained, and I'm surprised that the built-in Keras MSE function returns 2D tensors when I call it to get individual samples' losses. That means the loss function computes one loss per pixel for each sample, and not one loss per sample as it should (?). To be perfectly clear, I expected MSE to associate to each 2D sample the mean of the squared errors computed over all pixels (as I've read in this SO post).
Since I didn't manage to get an array of scalar MSE errors, one per test sample, after training my AE using .predict() and .evaluate() (perhaps I missed something there as well), I went on to use keras.losses.mean_squared_error() directly, sample by sample. This returned a 2D tensor as the loss for each sample (input tensors are of size (N,M,1)). Looking at Keras' original implementation of the MSE loss, one finds:
def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)
The axis=-1 explains why multiple dimensions aren't immediately reduced to a scalar when computing the loss.
I therefore wonder:
What exactly has my model been using during training? Was it the mean of squared errors over all pixels for each sample, as I expected? That isn't what the built-in code suggests.
Do I absolutely need to re-define the MSE loss to get the individual MSE loss for each test sample? To obtain a scalar, I would then have to flatten the samples and the associated predictions, and then re-apply the built-in MSE (sample by sample).
Manually flattening before computing the MSE seems to be what needs to be done, according to this SO answer on Keras' MSE loss. Using MSE for an AE model with 2D data seemed fine to me, since I read this keras.io MNIST denoising tutorial.
My code:
import keras

AE_testOutputs = autoencoder.predict(samplesList)
samplesMSE = []
for testSampleIndex in range(samplesList.shape[0]):
    AE_output = AE_testOutputs[testSampleIndex, :, :, :]
    samplesMSE.append(keras.losses.mean_squared_error(samplesList[testSampleIndex, :, :, :], AE_output))
Which returns a list samplesMSE of Tensor("Mean:0", shape=(15, 800), dtype=float64) objects.
I'm sorry if I missed a similar question, I did actively research before posting, and I'm still afraid there is a very simple explanation/I must have missed a built-in function somewhere.
Although it is not absolutely required, Keras loss functions are conventionally defined "per-sample", where "sample" is basically each element in the output tensor of the model. The loss function is then passed through a wrapping function, weighted_masked_objective, which adds support for masking and sample weighting. By default, the total loss is the average of the per-sample losses.
If you want to get the mean of some value across every dimension but the first one, you can simply apply K.mean to the value that you get.
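For instance, here is a minimal sketch of that reduction, reusing the samplesList and AE_testOutputs names from the question's code (names and shapes are assumptions):
import keras
from keras import backend as K

# The built-in MSE only reduces the last axis; averaging the remaining
# non-batch axes yields one scalar loss per sample.
elementwise_mse = keras.losses.mean_squared_error(samplesList, AE_testOutputs)
per_sample_mse = K.mean(elementwise_mse, axis=list(range(1, K.ndim(elementwise_mse))))
samplesMSE = K.eval(per_sample_mse)   # numpy array with one MSE per test sample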

The Accuracy Metric Purpose

I am using Keras to build a CNN and I have come to a misunderstanding about what the Accuracy metric does exactly.
I have done some research and it appears that it returns the accuracy of the model. Where is this information stored exactly? Does this metric affect the epoch results?
I cannot find any resources that actually describe in depth what the Accuracy metric does. How are my results affected by using this metric?
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer='adam',
    metrics=['accuracy']
)
The Keras documentation does not explain the purpose of this metric.
In the case of your question it is easier to check the Keras source code, because any deep learning framework has poor documentation.
Firstly, you need to find how string representations are processed:
if metric in ('accuracy', 'acc'):
    metric_fn = metrics_module.categorical_accuracy
This leads to the metrics module, where the categorical_accuracy function is defined:
def categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())
It is clear that the function returns a tensor, while only a single number is presented in the logs, so there is a wrapper function for processing the tensor of comparison results:
weighted_metric_fn = weighted_masked_objective(metric_fn)
This wrapper function contains the logic for calculating the final value. As no weights or masks are defined, simple averaging is used:
return K.mean(score_array)
So the resulting value is simply the fraction of correct predictions: accuracy = number_of_correct_predictions / total_number_of_predictions.
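As a tiny worked example of that averaging (plain numpy, illustrative values):
import numpy as np

# Four samples, three predicted correctly: casting the element-wise
# comparison to float and averaging gives the reported accuracy of 0.75.
y_true = np.array([2, 0, 1, 1])
y_pred_classes = np.array([2, 0, 1, 0])
print(np.mean((y_true == y_pred_classes).astype(np.float32)))  # 0.75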
P.S. I slightly disagree with @VnC, because accuracy and precision are different terms. Accuracy is the rate of correct predictions in a classification task, while precision is the fraction of positive predictions that are actually positive (more).
It is only used to report on your model's performance (e.g. how accurate your predictions are) and shouldn't affect the training in any way.
Accuracy basically means precision:
precision = true_positives / ( true_positives + false_positives )
I would recommend using f1_score (link) as it combines precision and recall.
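For example, a small sketch of doing that on held-out data (model, X_val and y_val are placeholders; this assumes integer labels and a softmax output):
import numpy as np
from sklearn.metrics import f1_score

# Take the arg-max of the predicted probabilities and score it against the
# true integer labels; 'macro' averages the per-class F1 scores.
y_prob = model.predict(X_val)
y_pred = np.argmax(y_prob, axis=1)
print(f1_score(y_val, y_pred, average='macro'))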
Hope that clears it up.
Any metric is a function of the model's predictions and the ground truth, the same as a loss. The accuracy of a model by itself makes no sense; it's not a property of the model alone, but also of the dataset on which the model is evaluated.
Accuracy in particular is a metric used for classification, and it is just the ratio between the number of correct predictions (prediction equal to label), and the total number of data points in the dataset.
Any metric that is evaluated during training/evaluation is information for you; it's not used to train the model. Only the loss function is used for the actual training of the weights.

Keras: Is it possible to use model.evaluate() for checkpoints rather than model.fit metrics

It's well known that the metrics reported while fitting a Keras model are computed with all layers active, including dropout. As such, the reported metrics (precision, recall, F1, even accuracy) will differ between what is reported during the fit and what you get when using the predictions from model.predict().
I'm curious whether it is possible to have Keras perform checkpoints based on a user-defined routine that uses the current model.predict() rather than the internal metric.
Example:
model.fit(X, y, validation_data=(Xv, yv),
          callbacks=[K.ModelCheckpoint('/tmp/dump.h5',
                                       monitor='val_acc')])
will perform checkpoints on the validation accuracy that's been measured during the fit (and reported using verbose=1). But suppose I want to perform the checkpoint with F1-score using the actual predictions that model would make on the validation set. Is it possible to get ModelCheckpoint to use something like sklearn.metrics.f1_score(yv,model.predict(Xv))?
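One hedged sketch of how this could be done is a custom callback that runs model.predict() on the validation set itself and saves the weights whenever the sklearn F1-score improves. The class name, the file path, and the argmax step (which assumes a softmax output) are assumptions, not an established Keras API:
import numpy as np
from sklearn.metrics import f1_score
from keras.callbacks import Callback

class F1Checkpoint(Callback):
    """Checkpoint on the F1-score of actual predictions on (Xv, yv)."""
    def __init__(self, Xv, yv, filepath):
        super(F1Checkpoint, self).__init__()
        self.Xv, self.yv, self.filepath = Xv, yv, filepath
        self.best_f1 = -np.inf

    def on_epoch_end(self, epoch, logs=None):
        # Use real predictions (inference mode), not the metrics logged by fit().
        y_pred = np.argmax(self.model.predict(self.Xv), axis=1)
        current_f1 = f1_score(self.yv, y_pred, average='macro')
        if current_f1 > self.best_f1:
            self.best_f1 = current_f1
            self.model.save_weights(self.filepath)

model.fit(X, y, validation_data=(Xv, yv), callbacks=[F1Checkpoint(Xv, yv, '/tmp/dump.h5')])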

Weighing Training Data for Keras

Problem
I want to train a Keras 2 neural network (Theano backend) with data of variable relevance. That means some of the samples are less important than others; they should affect the training less. However, I'm not able to simply omit them completely (I have a time series that goes into Conv1D layers).
Question
How can I tell keras to weigh certain training data samples less than others during the training?
Idea
I'm thinking about defining my own loss function that takes y_true, y_pred and a third argument y_weight. Something like:
def mean_squared_error_weighted(y_true, y_pred, y_weight):
    return y_weight * K.mean(K.square(y_pred - y_true), axis=-1)
But how would I let keras know about that third argument?
The fit function of a Keras model accepts an optional argument sample_weight that does exactly what you're looking for. More specifically, from the Keras documentation:
sample_weight: Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only).
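For example, a minimal illustration (X_train, y_train, the chosen weight value and less_relevant_indices are placeholders):
import numpy as np

# One weight per training sample; samples with weight 0.2 contribute
# five times less to the loss than samples with weight 1.0.
sample_weights = np.ones(len(X_train))
sample_weights[less_relevant_indices] = 0.2   # illustrative indices of less relevant samples

model.fit(X_train, y_train, sample_weight=sample_weights, epochs=10, batch_size=32)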
