I found a PyTorch implementation of Monte Carlo Dropout. The main idea of the method is to set the model's dropout layers to train mode at test time, so that a different dropout mask is applied on each forward pass.
The implementation shows how the predictions from the multiple forward passes are stacked together and used to compute different uncertainty metrics.
import sys
import numpy as np
import torch
import torch.nn as nn
def enable_dropout(model):
    """ Function to enable the dropout layers during test-time """
    for m in model.modules():
        if m.__class__.__name__.startswith('Dropout'):
            m.train()

def get_monte_carlo_predictions(data_loader,
                                forward_passes,
                                model,
                                n_classes,
                                n_samples):
""" Function to get the monte-carlo samples and uncertainty estimates
through multiple forward passes
Parameters
----------
data_loader : object
data loader object from the data loader module
forward_passes : int
number of monte-carlo samples/forward passes
model : object
keras model
n_classes : int
number of classes in the dataset
n_samples : int
number of samples in the test set
"""
    dropout_predictions = np.empty((0, n_samples, n_classes))
    softmax = nn.Softmax(dim=1)
    for i in range(forward_passes):
        predictions = np.empty((0, n_classes))
        model.eval()
        enable_dropout(model)
        # use a separate variable name for the batch index to avoid shadowing the outer loop variable
        for batch_idx, (image, label) in enumerate(data_loader):
            image = image.to(torch.device('cuda'))
            with torch.no_grad():
                output = model(image)
                output = softmax(output)  # shape (batch_size, n_classes)
            predictions = np.vstack((predictions, output.cpu().numpy()))
        dropout_predictions = np.vstack((dropout_predictions,
                                         predictions[np.newaxis, :, :]))
    # dropout_predictions - shape (forward_passes, n_samples, n_classes)

    # Calculating mean across multiple MCD forward passes
    mean = np.mean(dropout_predictions, axis=0)  # shape (n_samples, n_classes)

    # Calculating variance across multiple MCD forward passes
    variance = np.var(dropout_predictions, axis=0)  # shape (n_samples, n_classes)

    epsilon = sys.float_info.min
    # Calculating entropy across multiple MCD forward passes
    entropy = -np.sum(mean * np.log(mean + epsilon), axis=-1)  # shape (n_samples,)

    # Calculating mutual information across multiple MCD forward passes
    mutual_info = entropy - np.mean(np.sum(-dropout_predictions * np.log(dropout_predictions + epsilon),
                                           axis=-1), axis=0)  # shape (n_samples,)
What I'm trying to do is calculate the accuracy across the different forward passes. Can anyone please help me work out how to get the accuracy, and whether any changes are needed to the dimensions used in this implementation?
I am using the CIFAR10 dataset and would like to apply dropout at test time. The code for the data_loader:
testset = torchvision.datasets.CIFAR10(root='./data', train=False,download=True, transform=test_transform)
#loading the test set
data_loader = torch.utils.data.DataLoader(testset, batch_size=n_samples, shuffle=False, num_workers=4)
Accuracy is the percentage of correctly classified samples. You can create a boolean array that indicates whether each prediction is equal to its corresponding reference value, and take the mean of those values to get the accuracy. I have provided a code example of this below.
import numpy as np
# 2 forward passes, 4 samples, 3 classes
# shape is (2, 4, 3)
dropout_predictions = np.asarray([
[[0.2, 0.1, 0.7], [0.1, 0.5, 0.4], [0.9, 0.05, 0.05], [0.25, 0.74, 0.01]],
[[0.1, 0.5, 0.4], [0.2, 0.6, 0.2], [0.8, 0.10, 0.10], [0.25, 0.01, 0.74]]
])
# Get the predicted value for each sample in each forward pass.
# Shape of output is (2, 4).
classes = dropout_predictions.argmax(-1)
# array([[2, 1, 0, 1],
# [1, 1, 0, 2]])
# Test equality among the reference values and predicted classes.
# Shape is unchanged.
y_true = np.asarray([2, 1, 0, 1])
elementwise_equal = np.equal(y_true, classes)
# array([[ True, True, True, True],
# [False, True, True, False]])
# Calculate the accuracy for each forward pass.
# Shape is (2,).
elementwise_equal.mean(axis=1)
# array([1. , 0.5])
In the example above, you can see that the accuracy for the first forward pass was 100%, and the accuracy for the second forward pass was 50%.
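To apply this to the Monte Carlo dropout code from the question, you would also need the ground-truth labels, which the posted loop does not collect. A minimal sketch (assuming the labels are gathered from the same data_loader, which uses shuffle=False, so their order matches dropout_predictions):
import numpy as np

# Collect the labels in loader order (shuffle=False in the question's data_loader).
labels = np.concatenate([label.numpy() for _, label in data_loader])   # shape (n_samples,)

# Predicted class per sample for every forward pass: shape (forward_passes, n_samples)
per_pass_classes = dropout_predictions.argmax(axis=-1)

# Accuracy of each individual forward pass: shape (forward_passes,)
per_pass_accuracy = (per_pass_classes == labels).mean(axis=1)

# Accuracy of the mean (ensembled) prediction across passes: a single number
ensemble_accuracy = (dropout_predictions.mean(axis=0).argmax(axis=-1) == labels).mean()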
@jakub's answer is correct. However, I wanted to propose an alternative approach that may be easier, especially for newer researchers.
Scikit-learn comes with many built-in performance measurement functions, including accuracy. To get these to work with PyTorch, you only need to convert your torch tensors to numpy arrays:
x = torch.Tensor(...) # Fill-in as needed
x_np = x.numpy() # Convert to numpy
Then, you simply import scikit-learn:
from sklearn.metrics import accuracy_score
y_pred = [0, 2, 1, 3]
y_true = [0, 1, 2, 3]
accuracy_score(y_true, y_pred)
This simply returns 0.5. Easy peasy and less likely to have a bug.
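For the Monte Carlo dropout case above, a sketch of the same idea might look like this (assuming dropout_predictions and a labels array collected from the unshuffled data_loader, as in the sketch in the previous answer):
from sklearn.metrics import accuracy_score

# Accuracy of a single forward pass (here, the first one)
acc_first_pass = accuracy_score(labels, dropout_predictions[0].argmax(axis=1))

# Accuracy of the mean prediction over all forward passes
acc_mean = accuracy_score(labels, dropout_predictions.mean(axis=0).argmax(axis=1))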
Related
I am trying to find the output of the BatchNormalization layer in Keras.
My model is:
#Import libraries
import numpy as np
import keras
from keras import layers
from keras.layers import Input, Dense, Activation, BatchNormalization, Flatten, Conv2D
from keras.models import Model
#Model
def HappyModel3(input_shape):
    X_input = Input(input_shape, name='input_layer')
    X = BatchNormalization(axis=1, name='batchnorm_layer')(X_input)
    X = Dense(1, activation='sigmoid', name='sigmoid_layer')(X)
    model = Model(inputs=X_input, outputs=X, name='HappyModel3')
    return model
Compiling and fitting the model; here the number of epochs is 1:
X_train=np.array([[1,1,-1],[2,1,1]])
Y_train=np.array([0,1])
happyModel_1=HappyModel3(X_train[0].shape)
happyModel_1.compile(optimizer=keras.optimizers.RMSprop(), loss=keras.losses.mean_squared_error)
happyModel_1.fit(x = X_train, y = Y_train, epochs = 1 , batch_size = 2, verbose=0 )
Finding the batch normalization layer's output for the model trained with epochs=1:
for i in range(0, len(happyModel_1.layers)):
    tmp_model = Model(happyModel_1.layers[0].input, happyModel_1.layers[i].output)
    tmp_output = tmp_model.predict(X_train)
    if i in (0, 1):
        print(happyModel_1.layers[i].name)
        print(tmp_output.shape)
        print(tmp_output)
        print('\n')
Code Output is:
input_layer
(2, 3)
[[ 1. 1. -1.]
[ 2. 1. 1.]]
batchnorm_layer
(2, 3)
[[ 0.99003249 0.99388224 -0.99551398]
[ 1.99647105 0.99388224 0.9971655 ]]
We've normalized at axis=1. For the batch norm layer's output, the mean across the batch is roughly 1.5 for the 1st dimension, 1 for the 2nd dimension, and 0 for the 3rd dimension.
Since it's batch norm, I expect the mean to be close to 0 for all 3 dimensions.
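For reference, a quick check of the per-column means of the printed batchnorm output above:
import numpy as np

bn_out = np.array([[0.99003249, 0.99388224, -0.99551398],
                   [1.99647105, 0.99388224, 0.9971655]])
print(bn_out.mean(axis=0))   # approx [1.49, 0.99, 0.0] -- clearly not zero-mean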
The means only get close to 0 when I increase the number of epochs to 1000:
happyModel_2=HappyModel3(X_train[0].shape)
happyModel_2.compile(optimizer=keras.optimizers.RMSprop(), loss=keras.losses.mean_squared_error)
happyModel_2.fit(x = X_train, y = Y_train, epochs = 1000 , batch_size = 2, verbose=0 )
Finding the batch normalization layer's output for the model trained with epochs=1000:
for i in range(0, len(happyModel_2.layers)):
    tmp_model = Model(happyModel_2.layers[0].input, happyModel_2.layers[i].output)
    tmp_output = tmp_model.predict(X_train)
    if i in (0, 1):
        print(happyModel_2.layers[i].name)
        print(tmp_output.shape)
        print(tmp_output)
        print('\n')
Code output:
input_layer
(2, 3)
[[ 1. 1. -1.]
[ 2. 1. 1.]]
batchnorm_layer
(2, 3)
[[ -1.95576239e+00 8.08715820e-04 -1.86621261e+00]
[ 1.95795488e+00 8.08715820e-04 1.86590290e+00]]
We've normalized at axis=1. Now the batch norm layer's output has a mean of roughly 0 in all 3 dimensions. THIS IS THE EXPECTED OUTPUT.
My question is: does the output of batch normalization in Keras depend on the number of epochs?
(Probably yes: since we do backpropagation, the batch normalization parameters will be affected by an increasing number of epochs.)
The keras documentation for BatchNormalization gives an answer to your question:
Importantly, batch normalization works differently during training and
during inference.
What happens during training, i.e. when calling model.fit()?
During training [...], the layer normalizes its output
using the mean and standard deviation of the current batch of inputs.
But what happens during inference, i.e. when calling model.predict() as in your examples?
During inference [...], the layer normalizes its output using a moving average of
the mean and standard deviation of the batches it has seen during
training. That is to say, it returns (batch - self.moving_mean) / (self.moving_var + epsilon) * gamma + beta.
self.moving_mean and self.moving_var are non-trainable variables that
are updated each time the layer is called in training mode [...].
It's important to understand that batch normalization estimates the statistics (mean and variance) of your whole training data during training by looking at the statistics of single batches and internally updating the moving_mean and moving_variance parameters with a running average computed from those single-batch statistics. Therefore they're not affected by backpropagation. Ideally, after your model has seen enough training examples (or training epochs), moving_mean and moving_variance will correspond to the statistics of your whole training set. These two parameters are then used during inference to normalize test examples. At the start of training, moving_mean is initialized to 0 and moving_variance to 1. Batch norm also has two more parameters, gamma and beta, which are updated by the optimizer and therefore depend on your loss.
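As a rough NumPy sketch of the inference-time computation described above (note that the layer divides by the square root of the moving variance; the parameter values here are the fresh initial ones, not taken from the trained model):
import numpy as np

def batchnorm_inference(batch, moving_mean, moving_var, gamma=1.0, beta=0.0, epsilon=1e-3):
    # gamma * (batch - moving_mean) / sqrt(moving_var + epsilon) + beta
    return gamma * (batch - moving_mean) / np.sqrt(moving_var + epsilon) + beta

X_train = np.array([[1., 1., -1.],
                    [2., 1., 1.]])

# With freshly initialized statistics (moving_mean=0, moving_var=1) the output is
# almost the raw input, which is why the epochs=1 output above barely changes the data.
print(batchnorm_inference(X_train, moving_mean=0.0, moving_var=1.0))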
In essence, yes: the output of batch normalization during inference depends on the number of epochs you have trained your model, firstly because of the changing moving averages for mean and variance, and secondly because of the learned parameters gamma and beta.
For a deeper understanding of how batch normalization works and why it is needed, have a look at the original publication.
I want to evaluate a classification TensorFlow model.
To compute the accuracy, I have the following code:
predictions = tf.argmax(logits, axis=-1, output_type=tf.int32)
accuracy = tf.metrics.accuracy(labels=label_ids, predictions=predictions)
It works well for single-label classification, but now I want to do multi-label classification, where my labels are arrays of integers instead of single integers.
Here is an example of a label stored in label_ids, [0, 1, 1, 0, 1, 0], and an example of the corresponding predictions from the tensor logits, [0.1, 0.8, 0.9, 0.1, 0.6, 0.2].
What function should I use instead of argmax to do this? (My labels are arrays of 6 integers, each with a value of either 0 or 1.)
If needed, we can assume a threshold of 0.5.
It is probably better to do this type of post-processing evaluation outside of tensorflow, where it is more natural to try several different thresholds.
If you want to do it in tensorflow, you can consider:
predictions = tf.math.greater(logits, tf.constant(0.5))
This will return a tensor of the original logits shape with True for all entries greater than 0.5. You can then calculate accuracy as before. This is suitable for cases where many labels can be simultaneously true for a given sample.
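A short sketch of how the accuracy could then be computed for this multi-label case, built on the question's example values (an illustration, not the exact code from the answer):
import tensorflow as tf

# Example values in the shape described in the question (one sample, six labels).
logits = tf.constant([[0.1, 0.8, 0.9, 0.1, 0.6, 0.2]])
label_ids = tf.constant([[0, 1, 1, 0, 1, 0]])

# Threshold at 0.5 and compare element-wise against the integer labels.
predictions = tf.math.greater(logits, tf.constant(0.5))
correct = tf.equal(tf.cast(predictions, tf.int32), label_ids)

# Fraction of individual labels predicted correctly.
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))  # 1.0 for this example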
Use the code below to calculate accuracy in multi-class classification:
tf.argmax returns the index of the largest value along the given axis, for both y_pred and y_true (the actual y).
tf.equal is then used to check which predictions match the labels (it returns True/False).
Convert the booleans into floats (i.e. 0 or 1) and use tf.reduce_mean to calculate the accuracy.
correct_mask = tf.equal(tf.argmax(y_pred,1), tf.argmax(y_true,1))
accuracy = tf.reduce_mean(tf.cast(correct_mask, tf.float32))
Edit
Example with data:
import numpy as np
y_pred = np.array([[0.1,0.5,0.4], [0.2,0.6,0.2], [0.9,0.05,0.05]])
y_true = np.array([[0,1,0],[0,0,1],[1,0,0]])
correct_mask = tf.equal(tf.argmax(y_pred,1), tf.argmax(y_true,1))
accuracy = tf.reduce_mean(tf.cast(correct_mask, tf.float32))
with tf.Session() as sess:
    # print(sess.run([correct_mask]))
    print(sess.run([accuracy]))
Output:
[0.6666667]
I'm coding a simple neural network from scratch. The neural network is implemented in the method def simple_1_layer_classification_NN, which accepts an input matrix and output labels, among other parameters. Before looping through the epochs, I wanted to shuffle the input matrix by its rows only (i.e. its observations), as one measure to avoid over-fitting. I tried random.shuffle(dataset_input_matrix). Two strange things happened. I took a snapshot of the matrix before and after the shuffle step (using the code below with breakpoints to see the value of the matrix before and after, expecting it to be shuffled). So input_matrix should give the value of the matrix before the shuffle, and input_matrix1 should give the value after, i.e. of the shuffled matrix.
input_matrix = dataset_input_matrix
# shuffle our matrix observation samples, to decrease the chance of overfitting
random.shuffle(dataset_input_matrix)
input_matrix1 = dataset_input_matrix
When I printed both values, I got the same matrix, with no changes.
ipdb> input_matrix
array([[3. , 1.5],
[3. , 1.5],
[2. , 1. ],
[3. , 1.5],
[3. , 1.5],
[3. , 1. ]])
ipdb> input_matrix1
array([[3. , 1.5],
[3. , 1.5],
[2. , 1. ],
[3. , 1.5],
[3. , 1.5],
[3. , 1. ]])
ipdb>
Not sure if I'm doing something wrong here.
The second strange thing is that when I ran the neural network (after the shuffle), its accuracy dropped dramatically. Before, I was getting accuracies ranging from 60% to 95% (with very few at 50%).
After adding the shuffle step for the input matrix, I barely get an accuracy above 50%, no matter how many times I run the model. This is strange considering that, judging by the breakpoints, the shuffle doesn't even appear to have worked. And in any case, why should the network accuracy drop this badly, unless I'm doing the shuffling completely wrong?
So, 2 questions:
1- How do I shuffle only the rows of a matrix (I only need to randomise the observations (rows), not the features (columns) of the dataset)?
2- Why did the shuffle drop the accuracy so much that the neural network can't get anything above 50%? After all, shuffling data is a recommended pre-processing step to avoid over-fitting.
Please refer to the full code below, and apologies for the large portion of code.
Many thanks in advance for any help.
# --- neural network structure diagram ---
# O output prediction
# / \ w1, w2, b
# O O datapoint 1, datapoint 2
def simple_1_layer_classification_NN(self, dataset_input_matrix, output_data_labels, input_dimension, epochs, activation_func='sigmoid', learning_rate=0.2, cost_func='squared_error'):
    weights = []
    bias = int()
    cost = float()
    costs = []
    dCost_dWeights = []
    chosen_activation_func_derivation = None
    chosen_cost_func = None
    chosen_cost_func_derivation = None
    correct_pred = int()
    incorrect_pred = int()
    # Store the chosen activation function to use later on in the activation calculation section and in the 'predict' method.
    # The same goes for its derivative.
    if activation_func == 'sigmoid':
        self.chosen_activation_func = NN_classification.sigmoid
        chosen_activation_func_derivation = NN_classification.sigmoid_derivation
    elif activation_func == 'relu':
        self.chosen_activation_func = NN_classification.relu
        chosen_activation_func_derivation = NN_classification.relu_derivation
    else:
        print("Exception error - no activation function utilised, in training method", file=sys.stderr)
        return
    # Store the chosen cost function to use later on in the cost calculation section.
    # The same goes for its derivative.
    if cost_func == 'squared_error':
        chosen_cost_func = NN_classification.squared_error
        chosen_cost_func_derivation = NN_classification.squared_error_derivation
    else:
        print("Exception error - no cost function utilised, in training method", file=sys.stderr)
        return
    # Set initial network parameters (weights & bias):
    # Initialise the weights to small random values close to 0.
    # We need to loop through all the weights to set them to a random value initially.
    for i in range(input_dimension):
        # create random numbers for our initial weights (connections) to begin with. 'rand' creates small random numbers.
        w = np.random.rand()
        weights.append(w)
    # create a random number for our initial bias to begin with.
    bias = np.random.rand()
    '''
    I tried adding the shuffle step, where the matrix is shuffled only in terms of its observations (i.e. rows),
    but this dropped the accuracy dramatically, to the point where the 50% range was the best the model could achieve.
    '''
    input_matrix = dataset_input_matrix
    # shuffle our matrix observation samples, to decrease the chance of overfitting
    random.shuffle(dataset_input_matrix)
    input_matrix1 = dataset_input_matrix
    # We perform the training based on the number of epochs specified
    for i in range(epochs):
        # reset average accuracy with every epoch
        self.train_average_accuracy = 0
        for ri in range(len(dataset_input_matrix)):
            # reset the weighted sum at the beginning of every observation to avoid accumulating the previous observations' weighted sums.
            weighted_sum = 0
            input_observation_vector = dataset_input_matrix[ri]
            # Loop through all the independent variables (x) in the observation
            for x in range(len(input_observation_vector)):
                # Weighted sum: take each independent variable in the observation, multiply it by its weight, and add it to the subtotal.
                weighted_sum += input_observation_vector[x] * weights[x]
            # Add bias to the weighted sum
            weighted_sum += bias
            # Activation: pass the weighted sum through the activation function
            activation_func_output = self.chosen_activation_func(weighted_sum)
            # Prediction: because this is a single-layer neural network, the activation output is the prediction
            pred = activation_func_output
            # Cost: the cost function calculates the prediction error margin
            cost = chosen_cost_func(pred, output_data_labels[ri])
            # Also calculate the derivative of the cost function with respect to the prediction
            dCost_dPred = chosen_cost_func_derivation(pred, output_data_labels[ri])
            # Derivative of the prediction with respect to the weighted sum, through the activation function used
            dPred_dWeightSum = chosen_activation_func_derivation(weighted_sum)
            # Bias is just a number added to the weighted sum, so its derivative is 1
            dWeightSum_dB = 1
            # The derivative of the weighted sum with respect to each weight is the input data point / independent variable it's multiplied by.
            # Therefore I simply assigned the input data array to another variable called 'dWeightedSum_dWeights'
            # to represent the array of derivatives with respect to all the weights. I could have used the input vector
            # variable itself, but for the sake of readability I created a separate variable to represent the derivative of each of the weights.
            dWeightedSum_dWeights = input_observation_vector
            # Chain rule: chain all the derivative functions together.
            # Loop through all the weights to work out the derivative of the cost with respect to each weight:
            for dWeightedSum_dWeight in dWeightedSum_dWeights:
                dCost_dWeight = dCost_dPred * dPred_dWeightSum * dWeightedSum_dWeight
                dCost_dWeights.append(dCost_dWeight)
            dCost_dB = dCost_dPred * dPred_dWeightSum * dWeightSum_dB
            # Backpropagation: update the weights and bias according to the derivatives calculated above.
            # In other words, we update the parameters of the neural network so that its
            # predictions become as close to the real output as possible.
            # We loop through each weight and update it with its derivative with respect to the cost function value.
            for ind in range(len(weights)):
                weights[ind] = weights[ind] - learning_rate * dCost_dWeights[ind]
            bias = bias - learning_rate * dCost_dB
            # Compare prediction to target
            error_margin = np.sqrt(np.square(pred - output_data_labels[ri]))
            accuracy = (1 - error_margin) * 100
            self.train_average_accuracy += round(accuracy)
            # Evaluate whether the guess was correct based on the binary classification outcome (0 or 1).
            # If the error margin is below 0.5 the prediction counts as correct; at or above 0.5 it counts as incorrect,
            # since a prediction of exactly 0.5 is not a good guess for either 0 or 1. We need to set a good standard for the model.
            if (error_margin < 0.5) and (error_margin >= 0):
                correct_pred += 1
            elif (error_margin >= 0.5) and (error_margin <= 1):
                incorrect_pred += 1
            else:
                print("Exception error - 'margin error' for 'predict' method is out of range. Must be between 0 and 1, in training method", file=sys.stderr)
                return
            costs.append(cost)
    # Calculate the average accuracy from the predictions of all observations in the training dataset
    self.train_average_accuracy = round(self.train_average_accuracy / len(dataset_input_matrix), 1)
    # Store the final optimised weights in the weights instance variable so they can be used in the predict method.
    self.weights = weights
    # Store the final optimised bias in the bias instance variable so it can be used in the predict method.
    self.bias = bias
    # Print out results
    print('Average Accuracy: {}'.format(self.train_average_accuracy))
    print('Correct predictions: {}, Incorrect Predictions: {}'.format(correct_pred, incorrect_pred))
from numpy import array
#define array of dataset
# each observation vector has 3 datapoints or 3 columns: length, width, and outcome label (0, 1 to represent blue flower and red flower respectively).
data = array([[3, 1.5, 1],
[2, 1, 0],
[4, 1.5, 1],
[3, 1, 0],
[3.5, 0.5, 1],
[2, 0.5, 0],
[5.5, 1, 1],
[1, 1, 0]])
# separate data: split input, output, train and test data.
X_train, y_train, X_test, y_test = data[:6, :-1], data[:6, -1], data[6:, :-1], data[6:, -1]
nn_model = NN_classification()
nn_model.simple_1_layer_classification_NN(X_train, y_train, 2, 10000, learning_rate=0.2)
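For reference, a common NumPy idiom for question 1 above, shuffling only the rows while keeping the inputs and labels paired (a sketch, not part of the original code):
import numpy as np

# Draw one random row permutation and apply it to both the inputs and the labels,
# so that each observation stays aligned with its label.
permutation = np.random.permutation(len(X_train))
X_train_shuffled = X_train[permutation]
y_train_shuffled = y_train[permutation]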
I created an LSTM model for intraday stock prediction. The training data has shape (290, 4). I did all the preprocessing: normalizing the data, taking the differences, and using a window size of 4.
This is a sample of my input data.
X = array([[0, 0, 0, 0],
[array([ 0.19]), 0, 0, 0],
[array([-0.35]), array([ 0.19]), 0, 0],
...,
[array([ 0.11]), array([-0.02]), array([-0.13]), array([-0.09])],
[array([-0.02]), array([ 0.11]), array([-0.02]), array([-0.13])],
[array([ 0.07]), array([-0.02]), array([ 0.11]), array([-0.02])]], dtype=object)
y = array([[array([ 0.19])],
[array([-0.35])],
[array([-0.025])],
.....,
[array([-0.02])],
[array([ 0.07])],
[array([-0.04])]], dtype=object)
Note: I am feeding in, as well as predicting, the differenced values, so the input values are in the range (-0.5, 0.5).
Here is my Keras LSTM model:
from keras.models import Sequential            # imports assumed; not shown in the original snippet
from keras.layers import LSTM, Dropout, Dense

dim_in = 4
dim_out = 1
model = Sequential()                           # assumed; the model object is not shown in the original snippet
model.add(LSTM(input_shape=(1, dim_in),
               return_sequences=True,
               units=6))
model.add(Dropout(0.2))
model.add(LSTM(batch_input_shape=(1, features.shape[1], features.shape[2]), return_sequences=False, units=6))
model.add(Dropout(0.3))
model.add(Dense(activation='linear', units=dim_out))
model.compile(loss='mse', optimizer='rmsprop')
for i in range(300):
    # print("Completed :", i+1, "/", 300, "Steps")
    model.fit(X, y, epochs=1, batch_size=1, verbose=2, shuffle=False)
    model.reset_states()
I feed in the last sequence value of shape (1, 4) and predict the output.
This is my prediction code:
base_value = df.iloc[290]['Close']
prediction = []
orig_pred = []
input_data = np.copy(test[0,:])
input_data = input_data.reshape(len(input_data),1)
for i in range(100):
    inp = input_data[i:, :]
    inp = inp.reshape(1, 1, inp.shape[0])
    y = model.predict(inp)
    orig_pred.append(y[0][0])
    input_data = np.insert(input_data, [i+4], y[0][0], axis=0)
    base_value = base_value + y
    prediction.append(base_value[0][0])  # accumulate the reconstructed price in the 'prediction' list defined above
sqrt(mean_squared_error(test_output, orig_pred))
RMSE = 0.10592485833344527
Here are visualizations of the predicted differences and of the reconstructed stock price.
fig 1: the LSTM (difference) prediction
fig 2: the stock price prediction
I am not sure why it predicts the same output value after about 10 iterations. Maybe it is the vanishing gradient problem, or I am feeding too little input data (around 290 samples), or there is a problem in the model architecture; I am not sure.
Please help me understand how to get a reasonable result.
Thank you!
I don't work with Keras, but looking through your code and plots it seems like the complexity of your network might not be high enough to fit the data. Try enlarging the network with more units and also try larger window sizes.
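A sketch of what a larger configuration might look like, reusing the names from the question (the window size of 8 and the unit counts here are arbitrary choices, not values from the original post):
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

window_size = 8            # larger window than the original 4
dim_in, dim_out = window_size, 1

model = Sequential()
model.add(LSTM(units=32, return_sequences=True, input_shape=(1, dim_in)))
model.add(Dropout(0.2))
model.add(LSTM(units=32, return_sequences=False))
model.add(Dropout(0.3))
model.add(Dense(units=dim_out, activation='linear'))
model.compile(loss='mse', optimizer='rmsprop')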
This happens because your regressor minimizes the cost function by simply replicating the feature you give it as input. For example, if the BTC closing value is $6340 at time t, it will predict that value (or something close to it) at t+1. Make sure you are not giving the regressor a direct numerical hint of what the predicted label might be, especially when working with time-series data.
I'm learning AI with Python and have this situation: I created a deep learning model that has 10 neurons in its input layer and 3 neurons in the output layer. I split my data into 80% for training and 20% for testing.
The trained model is ready for testing.
Until now, I have always had only one neuron in the output layer, so I tested the accuracy this way:
classifier = Sequential()
# ...
classifier.add(Dense(units = 3, kernel_initializer = 'uniform', activation = 'sigmoid'))
# ...
y_pred = classifier.predict(np.array(X_test))
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
This works great when the output layer has only ONE value in each prediction.
In my case, I have 3 values in each prediction.
y_pred = array ([[3.142904686503911194e-11, 1.000000000000000000e+00, 1.729809626091548085e-16],
[7.398544450698540942e-12, 1.000000000000000000e+00, 1.776427415878292515e-22],
[4.224535246066807304e-07, 1.000000000000000000e+00, 7.929732391553923065e-12]])
And I want to compare it to my expected values, which are:
y_test = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
So, I have the option to make this work manually:
Put a 1 at the highest value in each prediction vector, and 0 everywhere else.
Compare the two vectors row by row.
It seems like there must be a better way to do it?
You want to measure how "close" the prediction vector is to the expected vector. A good formula that describes the "amount of difference" between two vectors is to check the magnitude (or square magnitude) of the delta vector (prediction - expected).
In this case, you can do something like this:
def square_magnitude(vector):
    return sum(x * x for x in vector)

def inaccuracy(pred, test):  # should only get equal-length items
    return square_magnitude([pred[i] - test[i] for i in range(len(pred))]) / len(pred)
Since you have three samples:
total_inaccuracy = sum(inaccuracy(y_pred[i], y_test[i]) for i in range(len(y_pred))) / len(y_pred)
This should be 0 when it's perfectly accurate and higher (positive) when it's less accurate.
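If you specifically want the hard argmax comparison described in the question (count a prediction as correct only when its highest value sits at the expected position), a short NumPy sketch using rounded values from the question would be:
import numpy as np

y_pred = np.array([[3.14e-11, 1.0, 1.73e-16],
                   [7.40e-12, 1.0, 1.78e-22],
                   [4.22e-07, 1.0, 7.93e-12]])
y_test = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]])

# Compare the index of the largest predicted value with the index of the 1 in each row.
accuracy = (y_pred.argmax(axis=1) == y_test.argmax(axis=1)).mean()
print(accuracy)   # 1.0 -- all three predictions pick the correct class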