ValueError: No Gradients provided for any variable:

ValueError: No Gradients provided for any variable: - python

I have been trying to understand where I am getting this error, and went through the forms here in stack overflow. Cant seem to fix it.
Tensorflow 2.3
I'm trying to build a multiple-input LSTM model
Input data:
print(plot_data)
tensor_stack = tf.stack([plot_data,plot_data2])
print(tensor_stack)
tf.Tensor(
[[39.0991 39.09879 39.09845 ... 0. 0. 0. ]
[39.09889 39.09914 39.09887 ... 0. 0. 0. ]
[39.09889 39.09847 39.09858 ... 0. 0. 0. ]
[39.09889 39.09886 39.0989 ... 39.09906 39.09923 39.09919]], shape=(4, 484), dtype=float64)
tf.Tensor(
[[[ 39.0991 39.09879 39.09845 ... 0. 0. 0. ]
[ 39.09889 39.09914 39.09887 ... 0. 0. 0. ]
[ 39.09889 39.09847 39.09858 ... 0. 0. 0. ]
[ 39.09889 39.09886 39.0989 ... 39.09906 39.09923 39.09919]]
[[-77.57008 -77.5702 -77.57081 ... 0. 0. 0. ]
[-77.57014 -77.56999 -77.57015 ... 0. 0. 0. ]
[-77.57012 -77.5699 -77.57022 ... 0. 0. 0. ]
[-77.57033 -77.56998 -77.56963 ... -77.5701 -77.56991 -77.56992]]], shape=(2, 4, 484), dtype=float64)
This is my model:
import tensorflow as tf
from tensorflow.keras.layers import *
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.optimizers import Adam, RMSprop
import numpy as np
#from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from tensorflow import keras
from tensorflow.keras import layers
input1 = Input(shape=(plot_data.shape[0],plot_data.shape[1],))
input2 = Input(shape=(plot_data2.shape[0],plot_data.shape[1], ))
input = Concatenate()([input1, input2])
x = Dense(2)(input)
x = Dense(1)(x)
model = Model(inputs=[input1, input2], outputs=x)
model.summary()
print(input.shape[1])
print(input.shape[2])
maxlen = 484
#x_train = keras.preprocessing.sequence.pad_sequences(plot_data)
#x_train = x_train.reshape(-1, input.shape[1], input.shape[2])
# design network
model = Sequential()
model.add(LSTM(2, input_shape=( tensor_stack.shape[1],tensor_stack.shape[2]), return_sequences=True))
model.add(Dense(1))
model.compile(optimizer='adam',loss='mae', metrics=['mae'])
# fit network
history = model.fit(tensor_stack, epochs=50, batch_size=72, verbose=1, shuffle=False)

Related

MinMax inverse_transform after LSTM + Dense(1) layers

My initial data is:
[[ 8375.5 0. 8374.14285714 8374.14285714]
[ 8354.5 0. 8383.39285714 8371.52380952]
...
[11060. 0. 11055.21428571 11032.53702732]
[11076.5 0. 11061.60714286 11038.39875701]]
I create MinMax scaler to transform data to values from 0 to 1
scaler = MinMaxScaler(feature_range = (0, 1))
T = scaler.fit_transform(T)
So data now is:
[[0.5186697 , 0. , 0.46812344, 0.46950912],
[0.5161844 , 0. , 0.46935928, 0.46915412],
...,
[0.72264636, 0. , 0.6767292 , 0.6807525 ],
[0.7198651 , 0. , 0.6785377 , 0.6833385 ]]
I do some magic to prepare this data for LSTM layer and this is the result:
X_train variable of shape (6989, 4, 200)
[[[0.5186697 0. 0.46812344 ... 0. 0.45496237 0.45219505]
[0.48742527 0. 0.45273864 ... 0. 0.43144143 0.431924 ]
[0.4800284 0. 0.43054438 ... 0. 0.425362 0.4326681 ]
[0.5007989 0. 0.4290794 ... 0. 0.4696839 0.47831726]]
...
[[0.61240304 0. 0.57254803 ... 0. 0.5749577 0.57792616]
[0.61139715 0. 0.5746571 ... 0. 0.5971378 0.6017289 ]
[0.6365465 0. 0.59772 ... 0. 0.62671924 0.63145673]
[0.65719867 0. 0.62684333 ... 0. 0.6757128 0.6772785 ]]]
I process the data using this model with Dense(1) layer at the end:
model = Sequential()
model.add(LSTM(units = 50, activation = 'relu', #return_sequences = True,
input_shape = (X_train.shape[1], window_size)))
model.add(Dropout(0.2))
model.add(Dense(1, activation = 'linear'))
model.compile(loss = 'mean_squared_error', optimizer = 'adam')
And when I set return_sequences to false the shape of new data after fit is (6989, 1) and when I want to inverse_transform scaler.inverse_transform(train_predict) using this scalar I get an error:
ValueError: non-broadcastable output operand with shape (6989,1) doesn't match the broadcast shape (6989,4)
When I do set return_sequences to true the new shape is (6989, 4, 1) and when I inverse_transform I get other error:
ValueError: Found array with dim 3. None expected <= 2.
============
I think I know why I get these errors, because scaler requires shape of (6989,4), but what and how can I do to transform this data so that I will be able to inverse_transform?
How can I inverse_transform the data of new shape of (6989, 1)?
How can I inverse_transform the data of new shape of (6989, 4, 1)?
Is it doable? My scaler can be used? Or should I create new scaler? Can you suggest something? What am I missing?
I will appreciate any help, thanks!

How to define a new Tensor with a dynamic shape to support batching in a custom layer

I am trying to implement a custom layer that would preprocess a tokenized sequence of words into a matrix with a predefined number of elements equal to the size of vocabulary. Essentially, I'm trying to implement a 'bag of words' layer. This is the closest I could come up with:
def get_encoder(vocab_size=args.vocab_size):
encoder = TextVectorization(max_tokens=vocab_size)
encoder.adapt(train_dataset.map(lambda text, label: text))
return encoder
class BagOfWords(tf.keras.layers.Layer):
def __init__(self, vocab_size=args.small_vocab_size, batch_size=args.batch_size):
super(BagOfWords, self).__init__()
self.vocab_size = vocab_size
self.batch_size = batch_size
def build(self, input_shape):
super().build(input_shape)
def call(self, inputs):
if inputs.shape[-1] == None:
return tf.constant(np.zeros([self.batch_size, self.vocab_size])) # 32 is the batch size
outputs = tf.zeros([self.batch_size, self.vocab_size])
if inputs.shape[-1] != None:
for i in range(inputs.shape[0]):
for ii in range(inputs.shape[-1]):
ouput_idx = inputs[i][ii]
outputs[i][ouput_idx] = outputs[i][ouput_idx] + 1
return outputs
model = keras.models.Sequential()
model.add(encoder)
model.add(bag_of_words)
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
No surprise that I get an error when calling fit() on the model: "Incompatible shapes: [8,1] vs. [32,1]". This happens on the last steps, when the batch size is less than 32.
My question is: Putting aside performance, how do I define the outputs Tensor for my bag of words matrix so that it has a dynamic shape for batching and get my code working?
Edit 1
After the comment, I realised that the code doesn't work indeed because it never goes to the 'else' branch.
I edited it a bit so that it uses only tf functions:
class BagOfWords(tf.keras.layers.Layer):
def __init__(self, vocab_size=args.small_vocab_size, batch_size=args.batch_size):
super(BagOfWords, self).__init__()
self.vocab_size = vocab_size
self.batch_size = batch_size
self.outputs = tf.Variable(tf.zeros([batch_size, vocab_size]))
def build(self, input_shape):
super().build(input_shape)
def call(self, inputs):
if tf.shape(inputs)[-1] == None:
return tf.zeros([self.batch_size, self.vocab_size])
self.outputs.assign(tf.zeros([self.batch_size, self.vocab_size]))
for i in range(tf.shape(inputs)[0]):
for ii in range(tf.shape(inputs)[-1]):
output_idx = inputs[i][ii]
if output_idx >= tf.constant(self.vocab_size, dtype=tf.int64):
output_idx = tf.constant(1, dtype=tf.int64)
self.outputs[i][output_idx].assign(self.outputs[i][output_idx] + 1)
return outputs
It didn't help though: AttributeError: 'Tensor' object has no attribute 'assign'.

Here is an example of a Bag-of-Words custom keras layer without using any additional preprocessing layers:
import tensorflow as tf
class BagOfWords(tf.keras.layers.Layer):
def __init__(self, vocabulary_size):
super(BagOfWords, self).__init__()
self.vocabulary_size = vocabulary_size
def call(self, inputs):
batch_size = tf.shape(inputs)[0]
outputs = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True)
for i in range(batch_size):
string = inputs[i]
string_length = tf.shape(tf.where(tf.math.not_equal(string, b'')))[0]
string = string[:string_length]
string_array = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True)
for s in string:
string_array = string_array.write(string_array.size(), tf.where(tf.equal(s, self.vocabulary_size), 1.0, 0.0))
outputs = outputs.write(i, tf.cast(tf.reduce_any(tf.cast(string_array.stack(), dtype=tf.bool), axis=0), dtype=tf.float32))
return outputs.stack()
And here are the manual preprocessing steps and the model:
labels = [[1], [0], [1], [0]]
texts = ['All my cats in a row',
'When my cat sits down, she looks like a Furby toy!',
'The cat from the outer space',
'Sunshine loves to sit like this for some reason.']
DEFAULT_STRIP_REGEX = r'[!"#$%&()\*\+,-\./:;<=>?#\[\\\]^_`{|}~\']'
tensor_of_strings = tf.constant(texts)
tensor_of_strings = tf.strings.lower(tensor_of_strings)
tensor_of_strings = tf.strings.regex_replace(tensor_of_strings, DEFAULT_STRIP_REGEX, "")
split_strings = tf.strings.split(tensor_of_strings).to_tensor()
flattened_split_strings = tf.reshape(split_strings, (split_strings.shape[0] * split_strings.shape[1]))
unique_words, _ = tf.unique(flattened_split_strings)
unique_words = tf.random.shuffle(unique_words)
bag_of_words = BagOfWords(vocabulary_size = unique_words)
train_dataset = tf.data.Dataset.from_tensor_slices((split_strings, labels))
model = tf.keras.Sequential()
model.add(bag_of_words)
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss = tf.keras.losses.BinaryCrossentropy())
model.fit(train_dataset.batch(2), epochs=2)
Epoch 1/2
4/4 [==============================] - 2s 7ms/step - loss: 0.7081
Epoch 2/2
4/4 [==============================] - 0s 6ms/step - loss: 0.7008
<keras.callbacks.History at 0x7f5ba844bad0>
And this is what the 4 encoded sentences look like:
print(bag_of_words(split_strings))
tf.Tensor(
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 0.
1. 1. 1. 0.]
[1. 1. 1. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 1. 1. 0. 0. 0. 1. 0. 0.
0. 1. 1. 0.]
[0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 0.
0. 0. 0. 0.]
[0. 1. 0. 1. 1. 0. 0. 1. 1. 1. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
0. 0. 0. 1.]], shape=(4, 28), dtype=float32)

Correct me if I am wrong, but I think that using the output_mode="multi_hot" of the TextVectorization layer would be sufficient to do what you want to do. According to the docs, the multi_hot output mode:
Outputs a single int array per batch, of either vocab_size or max_tokens size, containing 1s in all elements where the token mapped to that index exists at least once in the batch item
So it could be as simple as this:
import tensorflow as tf
def get_encoder():
encoder = tf.keras.layers.TextVectorization(output_mode="multi_hot")
encoder.adapt(train_dataset.map(lambda text, label: text))
return encoder
texts = [
'All my cats in a row',
'When my cat sits down, she looks like a Furby toy!',
'The cat from outer space',
'Sunshine loves to sit like this for some reason.']
labels = [[1], [0], [1], [1]]
train_dataset = tf.data.Dataset.from_tensor_slices((texts, labels))
model = tf.keras.Sequential()
model.add(get_encoder())
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss = tf.keras.losses.BinaryCrossentropy())
model.fit(train_dataset.batch(2), epochs=2)
This is how your texts would be encoded:
import tensorflow as tf
texts = ['All my cats in a row',
'When my cat sits down, she looks like a Furby toy!',
'The cat from outer space',
'Sunshine loves to sit like this for some reason.']
encoder = get_encoder()
inputs = encoder(texts)
print(inputs)
tf.Tensor(
[[0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0.
0. 0. 1. 1.]
[0. 1. 1. 1. 1. 1. 1. 0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 0. 1. 0. 1. 0.
0. 1. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1.
0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0. 0. 1. 1. 0. 1. 0. 1. 0. 1. 0. 0. 1. 0. 1. 0. 0. 0. 0.
1. 0. 0. 0.]], shape=(4, 28), dtype=float32)
So just as you tried in your custom layer, the presence of words in a sequence is marked with 1 and the absence of words is marked with 0.

The answer above by #AloneTogether is perfectly relevant. Just wanted to publish the working code that I came up with in the first place without manual processing.
import tensorflow_datasets as tfds
ds, info = tfds.load('imdb_reviews', with_info=True, as_supervised=True, data_dir='/tmp/imdb')
train_dataset = ds['train']
def get_encoder(vocab_size=args.vocab_size):
encoder = TextVectorization(max_tokens=vocab_size)
encoder.adapt(train_dataset.map(lambda text, label: text))
return encoder
class BagOfWords(tf.keras.layers.Layer):
def __init__(self, vocabulary_size):
super(BagOfWords, self).__init__()
self.vocabulary_size = vocabulary_size
def call(self, inputs):
batch_size = tf.shape(inputs)[0]
outputs = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True)
for i in range(batch_size):
int_string = inputs[i]
array_string = tf.TensorArray(dtype=tf.float32, size=self.vocabulary_size)
array_string.unstack(tf.zeros(self.vocabulary_size))
for int_word in int_string:
idx = int_word
idx = tf.cond(idx >= self.vocabulary_size, lambda: 1, lambda: tf.cast(idx, tf.int32))
array_string = array_string.write(idx, array_string.read(idx) + 1.0)
outputs = outputs.write(i, array_string.stack())
return outputs.stack()
encoder = get_encoder(args.small_vocab_size)
bag_of_words = BagOfWords(args.small_vocab_size)
model = keras.models.Sequential()
model.add(encoder)
model.add(bag_of_words)
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
for d in train_dataset.batch(args.batch_size).take(1):
model(d[0])
model.compile(optimizer=keras.optimizers.Nadam(learning_rate=1e-3),
loss='binary_crossentropy',
metrics=['accuracy'])
model.summary()

Multi class confusion matrix Keras tool

I'm doing a neural network designed to classify between 10 different compounds, the data set is something like:
array([[400. , 23. , 52.38, ..., 1. , 0. , 0. ],
[400. , 21.63, 61.61, ..., 0. , 0. , 0. ],
[400. , 21.49, 61.95, ..., 0. , 0. , 0. ],
...,
[400. , 21.69, 41.98, ..., 0. , 0. , 0. ],
[400. , 22.48, 65.2 , ..., 0. , 0. , 0. ],
[400. , 22.02, 58.91, ..., 0. , 0. , 1. ]])
where the 10 last numbers are the one hot encoded for the compounds I want to identify. This is the code I'm using:
dataset=numpy.asfarray(dataset[1:,0:],float)
x = dataset[0:,0:30]
y = dataset[0:,30:40]
x_train, x_test, y_train, y_test = train_test_split(
x, y, test_size=0.20, random_state=1) #siempre ha sido 42
standard=preprocessing.StandardScaler().fit(x_train)
x_train=standard.transform(x_train)
x_test=standard.transform(x_test)
dump(standard, 'std_modelo_400.bin', compress=True)
model = Sequential()
model.add(Dense(50, input_dim = x_test.shape[1], activation = 'relu',kernel_regularizer=keras.regularizers.l1(0.01)))
model.add(Dense(30, input_dim = x_test.shape[1], activation = 'relu',kernel_regularizer=keras.regularizers.l1(0.01)))
model.add(Dense(15, input_dim = x_test.shape[1], activation = 'relu',kernel_regularizer=keras.regularizers.l1(0.01)))
model.add(Dense(10, activation='softmax',kernel_initializer='normal', bias_initializer=keras.initializers.Constant(value=0)))
model.summary()
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy']
)
history=model.fit(x_train,y_train,validation_data=(x_test,y_test),verbose=2,epochs=epochs,batch_size=batch_size)#callbacks=[monitor] , verbose=2
I try to get the confusion matrix using the command multilabel_confusion_matrix(y_test,pred) and I get in this form:
array([[[929681, 158],
[ 308, 102180]],
[[930346, 407],
[ 6677, 94897]],
[[930740, 38],
[ 477, 101072]],
[[929287, 1522],
[ 69, 101449]],
[[929703, 8843],
[ 12217, 81564]],
[[902624, 474],
[ 1565, 127664]],
[[931152, 2236],
[ 12140, 86799]],
[[929085, 10],
[ 0, 103232]],
[[911158, 22378],
[ 5362, 93429]],
[[930412, 689],
[ 617, 100609]]], dtype=int64)
When I use multilabel_confusion_matrix(y_test,pred,labels=["Comp1","Comp2","Comp3", "Comp4", "Comp5", "Comp6", "Comp7", "Comp8", "Comp9", "Comp10",]) I get an error:
elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
mask &= (ar1 != a)
Traceback (most recent call last):
File "<ipython-input-18-00af06ffcbef>", line 1, in <module>
multilabel_confusion_matrix(y_test,pred,labels=["Comp1","Comp2","Comp3", "Comp4", "Comp5", "Comp6", "Comp7", "Comp8", "Comp9", "Comp10",])
File "C:\Users\fmarin\Anaconda3\lib\site-packages\sklearn\metrics\_classification.py", line 485, in multilabel_confusion_matrix
if np.max(labels) > np.max(present_labels):
I have no idea how to fix it. I also like to get the graphic version of the confusion matrix, I'm using scikit-learn toolbox.
Thank you!

Keras masking zero before softmax

Suppose that I have the following output from an LSTM layer
[0. 0. 0. 0. 0.01843184 0.01929785 0. 0. 0. 0. 0. 0. ]
and I want to apply softmax on this output but I want to mask the 0's first.
When I used
mask = Masking(mask_value=0.0)(lstm_hidden)
combined = Activation('softmax')(mask)
It didnt work. Any ideas?
Update: The output from the LSTM hidden is (batch_size, 50, 4000)

You can define custom activation to achieve it. This is equivalent to mask 0.
from keras.layers import Activation,Input
import keras.backend as K
from keras.utils.generic_utils import get_custom_objects
import numpy as np
import tensorflow as tf
def custom_activation(x):
x = K.switch(tf.is_nan(x), K.zeros_like(x), x) # prevent nan values
x = K.switch(K.equal(K.exp(x),1),K.zeros_like(x),K.exp(x))
return x/K.sum(x,axis=-1,keepdims=True)
lstm_hidden = Input(shape=(12,))
get_custom_objects().update({'custom_activation': Activation(custom_activation)})
combined = Activation(custom_activation)(lstm_hidden)
x = np.array([[0.,0.,0.,0.,0.01843184,0.01929785,0.,0.,0.,0.,0.,0. ]])
with K.get_session()as sess:
print(combined.eval(feed_dict={lstm_hidden:x}))
[[0. 0. 0. 0. 0.49978352 0.50021654
0. 0. 0. 0. 0. 0. ]]

Returning probabilities in a classification prediction in Keras?

I am trying to make a simple proof-of-concept where I can see the probabilities of different classes for a given prediction.
However, everything I try seems to only output the predicted class, even though I am using a softmax activation. I am new to machine learning, so I'm not sure if I am making a simple mistake or if this is a feature not available in Keras.
I'm using Keras + TensorFlow. I have adapted one of the basic examples given by Keras for classifying the MNIST dataset.
My code below is exactly the same as the example, except for a few (commented) extra lines that exports the model to a local file.
'''Trains a simple deep NN on the MNIST dataset.
Gets to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
import h5py # added import because it is required for model.save
model_filepath = 'test_model.h5' # added filepath config
batch_size = 128
num_classes = 10
epochs = 20
# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
optimizer=RMSprop(),
metrics=['accuracy'])
history = model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
model.save(model_filepath) # added saving model
print('Model saved') # added log
Then the second part of this is a simple script that should import the model, predict the class for some given data, and print out the probabilities for each class. (I am using the same mnist class included with the Keras codebase to make an example as simple as possible).
import keras
from keras.datasets import mnist
from keras.models import Sequential
import keras.backend as K
import numpy
# loading model saved locally in test_model.h5
model_filepath = 'test_model.h5'
prev_model = keras.models.load_model(model_filepath)
# these lines are copied from the example for loading MNIST data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
# for this example, I am only taking the first 10 images
x_slice = x_train[slice(1, 11, 1)]
# making the prediction
prediction = prev_model.predict(x_slice)
# logging each on a separate line
for single_prediction in prediction:
print(single_prediction)
If I run the first script to export the model, then the second script to classify some examples, I get the following output:
[ 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
This is great for seeing which class each is predicted to be, but what if I want to see the relative probabilities of each class for each example? I am looking for something more like this:
[ 0.94 0.01 0.02 0. 0. 0.01 0. 0.01 0.01 0.]
[ 0. 0. 0. 0. 0.51 0. 0. 0. 0.49 0.]
...
In other words, I need to know how sure each prediction is, not just the prediction itself. I thought seeing the relative probabilities was a part of using a softmax activation in the model, but I can't seem to find anything in the Keras documentation that would give me probabilities instead of the predicted answer. Am I making some kind of silly mistake, or is this feature not available?

So it turns out that the problem was I was not fully normalizing the data in the prediction script.
My prediction script should have had the following lines:
# these lines are copied from the example for loading MNIST data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_train = x_train.astype('float32') # this line was missing
x_train /= 255 # this line was missing too
Because the data was not cast to float, and divided by 255 (so it would be between 0 and 1), it was just showing up as 1s and 0s.

Keras predict indeed returns probabilities, and not classes.
Cannot reproduce your issue with my system configuration:
Python version 2.7.12
Tensorflow version 1.3.0
Keras version 2.0.9
Numpy version 1.13.3
Here is my prediction output for your x_slice with the loaded model (trained for 20 epochs, as in your code):
print(prev_model.predict(x_slice))
# Result:
[[ 1.00000000e+00 3.31656316e-37 1.07806675e-21 7.11765177e-30
2.48000320e-31 5.34837679e-28 3.12470132e-24 4.65175406e-27
8.66994134e-31 5.26426367e-24]
[ 0.00000000e+00 5.34361977e-30 3.91144999e-35 0.00000000e+00
1.00000000e+00 0.00000000e+00 1.05583665e-36 1.01395577e-29
0.00000000e+00 1.70868685e-29]
[ 3.99137559e-38 1.00000000e+00 1.76682222e-24 9.33333581e-31
3.99846307e-15 1.17745576e-24 1.87529709e-26 2.18951752e-20
3.57518280e-17 1.62027896e-28]
[ 6.48006586e-26 1.48974980e-17 5.60530329e-22 1.81973780e-14
9.12573406e-10 1.95987500e-14 8.08566866e-27 1.17901132e-12
7.33970447e-13 1.00000000e+00]
[ 2.01602060e-16 6.58242856e-14 1.00000000e+00 6.84244084e-09
1.19809885e-16 7.94907624e-14 3.10690434e-19 8.02848586e-12
4.68330721e-11 5.14736501e-15]
[ 2.31014903e-35 1.00000000e+00 6.02224725e-21 2.35928828e-23
7.50006509e-15 4.06930881e-22 1.13288827e-24 4.20440718e-17
4.95182972e-17 1.85492109e-18]
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00
0.00000000e+00 6.30200370e-27 0.00000000e+00 5.19937755e-33
1.63205659e-31 1.21508034e-20]
[ 1.44608573e-26 1.00000000e+00 1.78712268e-18 6.84598301e-19
1.30042071e-11 2.53873986e-14 5.83169942e-17 1.20201071e-12
2.21844570e-14 3.75015198e-15]
[ 0.00000000e+00 6.29184453e-34 9.22474943e-29 0.00000000e+00
1.00000000e+00 3.05067233e-34 1.43097161e-28 1.34234082e-29
4.28647272e-36 9.29760838e-34]
[ 4.68828449e-30 5.55172479e-20 3.26705529e-19 9.99999881e-01
3.49577992e-22 1.27715460e-11 4.99185615e-36 1.19164204e-20
4.21086124e-16 1.52631387e-07]]
I suspect some rounding issue when printing (or you have trained for much more epochs, and your probabilities for the training set have gotten very close to 1)...
To convince yourself that you indeed get probabilities and not class predictions, I suggest to try getting predictions from your model trained for a single epoch; normally you should see much less 1.0's - here is the case here for a model trained for epochs=1:
print(model.predict(x_slice))
# Result:
[[ 9.99916673e-01 5.36548761e-08 6.10747229e-05 8.21199933e-07
6.64725164e-08 6.78853041e-07 9.09637220e-06 4.56192402e-06
1.62688798e-06 5.23997733e-06]
[ 7.59836894e-07 1.78043920e-05 1.79073555e-04 2.95592145e-05
9.98031914e-01 1.75839632e-05 5.90557102e-06 1.27705920e-03
3.94643757e-06 4.36416740e-04]
[ 4.48473330e-08 9.99895334e-01 2.82608235e-05 5.33154832e-07
9.78453227e-06 1.58954310e-06 3.38150176e-06 5.26260410e-05
8.09341054e-06 3.28643267e-07]
[ 7.38236849e-07 4.80247072e-05 2.81726116e-05 4.77648537e-05
7.21933879e-03 2.52177160e-05 3.88786475e-07 3.56770557e-04
2.83472677e-04 9.91990149e-01]
[ 5.03611082e-05 2.69402866e-04 9.92011130e-01 4.68175858e-03
9.57477605e-05 4.26214538e-04 7.66683661e-05 7.05923303e-04
1.45670515e-03 2.26032615e-04]
[ 1.36330849e-10 9.99994516e-01 7.69141934e-07 1.44130311e-07
9.52201333e-07 1.45219332e-07 4.43408908e-07 6.93398249e-07
2.18685204e-06 1.50741769e-07]
[ 2.39427478e-09 3.75754922e-07 3.89349816e-06 9.99889374e-01
1.85837867e-09 1.16176770e-05 1.89989760e-11 3.12301523e-07
1.13220040e-05 8.29571582e-05]
[ 1.45760115e-08 9.99900222e-01 3.67058942e-06 4.04857201e-06
1.97999962e-05 7.85745397e-06 8.13850420e-06 1.87294081e-05
2.81870762e-05 9.38157609e-06]
[ 7.52560858e-09 8.84437856e-09 9.71140025e-07 5.20911703e-10
9.99986649e-01 3.12135370e-07 1.06521384e-05 1.25693066e-06
7.21853368e-08 5.21001624e-08]
[ 8.67672298e-08 2.17907742e-04 2.45352840e-06 9.95455265e-01
1.43749105e-06 1.51766278e-03 1.83744309e-08 3.83995541e-07
9.90309782e-05 2.70584645e-03]]

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

ValueError: No Gradients provided for any variable: - python

Related

MinMax inverse_transform after LSTM + Dense(1) layers

How to define a new Tensor with a dynamic shape to support batching in a custom layer

Multi class confusion matrix Keras tool

Keras masking zero before softmax

Returning probabilities in a classification prediction in Keras?

Categories

Resources