I have a GRU model
GRU = keras.models.Sequential([keras.layers.GRU(32),
keras.layers.Dense(32, activation= 'relu'),
keras.layers.Dense(1, activation=None)])
GRU.compile(loss="mae", optimizer="adam")
resultsGRU = GRU.fit_generator(generator = train, validation_data = train, epochs = 3, verbose= 1, shuffle = False)
If I convert the training data to a NumPy array, I can see that it contains no zeros or NaN values (I also called dropna beforehand):
trainArray= np.array(train)
print(trainArray)
I copied only part of the array, so you can see the values:
[[array([[[-0.86286026, 0.51805955, 1.0427724 , ..., 0.27464896,
0.08823532, -1.1183959 ],
[-0.3186916 , 0.00295895, 0.740636 , ..., 0.27464896,
0.08823532, -1.1304985 ],
[-0.31057638, 0.00295895, 0.5593542 , ..., 0.27464896,
-0.5521559 , -1.1183959 ],
...,
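Printing a large array only shows a truncated view, so counting the problematic entries is more reliable than reading the output. A minimal sketch, assuming each batch can be converted to a numeric NumPy array; `summarize_array` and the sample `batch` are hypothetical and not part of the original code:

```python
import numpy as np

def summarize_array(arr):
    """Count NaN, infinite, and exactly-zero entries in a numeric array."""
    arr = np.asarray(arr, dtype="float64")
    return {
        "nan": int(np.isnan(arr).sum()),
        "inf": int(np.isinf(arr).sum()),
        "zero": int((arr == 0).sum()),
    }

# Hypothetical batch standing in for one batch taken from the generator.
batch = np.array([[-0.86, 0.52, 1.04],
                  [-0.32, 0.0, np.nan]])
report = summarize_array(batch)
print(report)
```

Running the same check on every batch, and on the targets as well as the inputs, would show whether a NaN is hiding in a part of the data that the printed excerpt skips.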
If I print resultsGRU
print(resultsGRU.history.values())
I get
dict_values([[0.597104012966156, 0.5652544498443604, 0.5574262142181396], [0.6241905093193054, 0.6183988451957703, 0.6134349703788757]])
Then I call predict, but every returned value is 0:
predictGRU = GRU.predict(test)
print(predictGRU)
[0. 0. 0. ... 0. 0. 0.]
I then save this model and use it in an API, where the predicted values are NaN.
What is the problem here? How do I get the model to predict a different, reasonable value?
I also use metrics later on
print(metrics.mean_absolute_error(test, predictGRU))
print(metrics.mean_squared_error(test, predictGRU))
print(metrics.explained_variance_score(test, predictGRU))
And I get normal numbers
0.6471065
0.50334525
0.23076766729354858
I don't know how to fix this on my own.
My initial data is:
[[ 8375.5 0. 8374.14285714 8374.14285714]
[ 8354.5 0. 8383.39285714 8371.52380952]
...
[11060. 0. 11055.21428571 11032.53702732]
[11076.5 0. 11061.60714286 11038.39875701]]
I create a MinMaxScaler to scale the data to values between 0 and 1:
scaler = MinMaxScaler(feature_range = (0, 1))
T = scaler.fit_transform(T)
So data now is:
[[0.5186697 , 0. , 0.46812344, 0.46950912],
[0.5161844 , 0. , 0.46935928, 0.46915412],
...,
[0.72264636, 0. , 0.6767292 , 0.6807525 ],
[0.7198651 , 0. , 0.6785377 , 0.6833385 ]]
I do some magic to prepare this data for the LSTM layer, and this is the result:
X_train variable of shape (6989, 4, 200)
[[[0.5186697 0. 0.46812344 ... 0. 0.45496237 0.45219505]
[0.48742527 0. 0.45273864 ... 0. 0.43144143 0.431924 ]
[0.4800284 0. 0.43054438 ... 0. 0.425362 0.4326681 ]
[0.5007989 0. 0.4290794 ... 0. 0.4696839 0.47831726]]
...
[[0.61240304 0. 0.57254803 ... 0. 0.5749577 0.57792616]
[0.61139715 0. 0.5746571 ... 0. 0.5971378 0.6017289 ]
[0.6365465 0. 0.59772 ... 0. 0.62671924 0.63145673]
[0.65719867 0. 0.62684333 ... 0. 0.6757128 0.6772785 ]]]
I process the data using this model with Dense(1) layer at the end:
model = Sequential()
model.add(LSTM(units = 50, activation = 'relu', #return_sequences = True,
input_shape = (X_train.shape[1], window_size)))
model.add(Dropout(0.2))
model.add(Dense(1, activation = 'linear'))
model.compile(loss = 'mean_squared_error', optimizer = 'adam')
When I set return_sequences to False, the predictions have shape (6989, 1), and calling scaler.inverse_transform(train_predict) with this scaler raises an error:
ValueError: non-broadcastable output operand with shape (6989,1) doesn't match the broadcast shape (6989,4)
When I set return_sequences to True, the predictions have shape (6989, 4, 1), and inverse_transform raises a different error:
ValueError: Found array with dim 3. None expected <= 2.
============
I think I know why I get these errors: the scaler expects an array of shape (6989, 4). But how can I transform the predictions so that I can call inverse_transform on them?
How can I inverse_transform data of shape (6989, 1)?
How can I inverse_transform data of shape (6989, 4, 1)?
Is it doable? Can my scaler be used, or should I create a new one? Can you suggest something? What am I missing?
I will appreciate any help, thanks!
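One possible approach, sketched with plain NumPy rather than the scaler object from the question: min-max scaling works column by column, so a single predicted column can be inverted with that column's own minimum and range, or padded back to four columns first. The data values, `target_col = 0`, and the hand-written scaling are assumptions for illustration. With scikit-learn's MinMaxScaler, the per-column values are exposed as `scaler.data_min_` and `scaler.data_max_`, and the padded array could be passed to `scaler.inverse_transform`.

```python
import numpy as np

# Hypothetical raw data with 4 feature columns, shaped like T in the question.
T = np.array([
    [8375.5, 0.0, 8374.0, 8374.0],
    [8354.5, 0.0, 8383.0, 8371.5],
    [11060.0, 0.0, 11055.0, 11032.5],
    [11076.5, 0.0, 11061.5, 11038.5],
])

# Per-column min-max scaling to [0, 1], written by hand for illustration.
col_min = T.min(axis=0)
col_max = T.max(axis=0)
col_range = np.where(col_max > col_min, col_max - col_min, 1.0)  # constant columns: avoid /0
T_scaled = (T - col_min) / col_range

target_col = 0  # assumption: the model predicts column 0

# Stand-in for model output of shape (n, 1) in the scaled space.
train_predict = T_scaled[:, [target_col]]

# Option 1: invert only the target column with its own minimum and range.
restored = train_predict * col_range[target_col] + col_min[target_col]

# Option 2: pad to (n, 4), invert every column, then keep the target column.
padded = np.zeros((train_predict.shape[0], T.shape[1]))
padded[:, target_col] = train_predict[:, 0]
restored_via_padding = (padded * col_range + col_min)[:, [target_col]]

# For predictions of shape (n, 4, 1): flatten to 2-D, invert, restore the shape.
seq_predict = np.repeat(train_predict[:, np.newaxis, :], 4, axis=1)  # (n, 4, 1)
flat = seq_predict.reshape(-1, 1)
seq_restored = (flat * col_range[target_col] + col_min[target_col]).reshape(seq_predict.shape)

print(restored.ravel())
```

A separate scaler fitted only on the target column would give the same result as option 1. Which column is the target depends on how the windows were built, which the question does not state.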
Based on a matrix, I am trying to approximate a value (regression). However, the CNN always predicts a matrix which is identical to the input of predict.
I am not getting any errors.
The data (matrices) used for training are stored in a numpy array but I only have around 9000 samples available. The values for each matrix are stored in a one dimensional array (one value for each matrix).
This is my model:
model = keras.Sequential([
layers.Conv2D(64, kernel_size=3, activation='selu', input_shape=(8, 8, 1)),
layers.Conv2D(64, kernel_size=3, activation='selu'),
layers.MaxPooling2D(pool_size=(2, 2)),
layers.Conv2D(64, kernel_size=2, activation='selu'),
layers.Flatten(),
layers.Dense(1, activation='linear')
])
optimizer = keras.optimizers.RMSprop(0.001)
model.compile(optimizer=optimizer,
loss='mean_squared_error',
metrics=['mean_squared_error'])
model.fit(matrices, values, epochs=10)
test_loss = model.evaluate(test_boards, test_values, verbose=2)
Example output when calling prediction = model.predict(some_matrix) can be found below. In this case some_matrix is equal to the output below.
[[ 51. 0. 33. 0. 100. 33. 0. 51.]
[ 10. 10. 10. 0. 0. 10. 10. 10.]
[ 0. 0. 32. 0. 0. 32. 0. 0.]
[ 0. 0. 0. 88. 10. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. -10. 0. -32. 0. 0.]
[ -10. -10. -10. 0. 0. -10. -10. -10.]
[ -51. -32. -33. -88. -100. -33. 0. -51.]]
What am I missing to get a single value as output? Or at least a modified version of the input?
Edit:
My matrix data (did not fit in a free pastebin account, sorry)
My values
An example google colab file
I did not find a way to provide the data into Google Colab and include them in the link, I'm sorry for the inconvenience.
I did get an error this time which I did not get when running the code in my own environment. This is definitely the issue but I am still unaware of how to fix this.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-595f98617fa0> in <module>()
97 [ -51, -32, -33, -88, -100, -33, 0, -51,]])
98 print(test_boards[0])
---> 99 prediction = model.predict(test_boards[0])
100 print("Prediction:")
101 print(prediction)
3 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
561 ': expected ' + names[i] + ' to have ' +
562 str(len(shape)) + ' dimensions, but got array '
--> 563 'with shape ' + str(data_shape))
564 if not check_batch_axis:
565 data_shape = data_shape[1:]
ValueError: Error when checking input: expected conv2d_12_input to have 4 dimensions, but got array with shape (8, 8, 1)
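You need to add the batch dimension to the test sample. For a matrix of shape (8, 8), this adds both the batch axis and the channel axis, giving shape (1, 8, 8, 1):
some_matrix = some_matrix[np.newaxis,:,:,np.newaxis]
If the sample already has shape (8, 8, 1), as in the traceback above, add only the batch axis with some_matrix[np.newaxis].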
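The traceback above reports a sample of shape (8, 8, 1), while the printed matrix has shape (8, 8), so it may help to handle both cases. A minimal sketch; `to_batch` is a hypothetical helper, not a Keras function:

```python
import numpy as np

def to_batch(sample):
    """Return a single sample as a 4-D batch: (1, height, width, channels)."""
    sample = np.asarray(sample, dtype="float32")
    if sample.ndim == 2:
        # (H, W): add the batch axis in front and the channel axis at the end.
        return sample[np.newaxis, :, :, np.newaxis]
    if sample.ndim == 3:
        # (H, W, C): only the batch axis is missing.
        return sample[np.newaxis, ...]
    raise ValueError(f"unexpected sample shape {sample.shape}")

print(to_batch(np.zeros((8, 8))).shape)     # (1, 8, 8, 1)
print(to_batch(np.zeros((8, 8, 1))).shape)  # (1, 8, 8, 1)
```

Slicing instead of indexing, for example `test_boards[0:1]`, also keeps the batch axis, so the result can be passed to `model.predict` directly.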
I have recently started learning how to build an LSTM model for multivariate time series data. I have looked here and here on how to pad sequences and implement a many-to-many LSTM model. I have created a dataframe to test the model, but I keep getting an error (below).
d = {'ID':['a12', 'a12','a12','a12','a12','b33','b33','b33','b33','v55','v55','v55','v55','v55','v55'], 'Exp_A':[2.2,2.2,2.2,2.2,2.2,3.1,3.1,3.1,3.1,1.5,1.5,1.5,1.5,1.5,1.5],
'Exp_B':[2.4,2.4,2.4,2.4,2.4,1.2,1.2,1.2,1.2,1.5,1.5,1.5,1.5,1.5,1.5],
'A':[0,0,1,0,1,0,1,0,1,0,1,1,1,0,1], 'B':[0,0,1,1,1,0,0,1,1,1,0,0,1,0,1],
'Time_Interval': ['11:00:00', '11:10:00', '11:20:00', '11:30:00', '11:40:00',
'11:00:00', '11:10:00', '11:20:00', '11:30:00',
'11:00:00', '11:10:00', '11:20:00', '11:30:00', '11:40:00', '11:50:00']}
df = pd.DataFrame(d)
df.set_index('Time_Interval', inplace=True)
I tried to pad using brute force:
from keras.preprocessing.sequence import pad_sequences
x1 = df['A'][df['ID']== 'a12']
x2 = df['A'][df['ID']== 'b33']
x3 = df['A'][df['ID']== 'v55']
mx = df.groupby('ID').size().max() # Find the largest group
seq1 = [x1, x2, x3]
padded1 = np.array(pad_sequences(seq1, maxlen=6, dtype='float32')).reshape(-1,mx,1)
In similar ways I have created padded2, padded3 and padded4 for each feature:
padded_data = np.dstack((padded1, padded1, padded3, padded4))
padded_data.shape = (3, 6, 4)
padded_data
array([[[0. , 0. , 0. , 0. ],
[0. , 0. , 2.2, 2.4],
[0. , 0. , 2.2, 2.4],
[1. , 1. , 2.2, 2.4],
[0. , 0. , 2.2, 2.4],
[1. , 1. , 2.2, 2.4]],
[[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0. , 3.1, 1.2],
[1. , 1. , 3.1, 1.2],
[0. , 0. , 3.1, 1.2],
[1. , 1. , 3.1, 1.2]],
[[0. , 0. , 1.5, 1.5],
[1. , 1. , 1.5, 1.5],
[1. , 1. , 1.5, 1.5],
[1. , 1. , 1.5, 1.5],
[0. , 0. , 1.5, 1.5],
[1. , 1. , 1.5, 1.5]]], dtype=float32)
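The per-ID, per-feature padding can also be done in one pass by grouping the dataframe. A minimal sketch, assuming pandas and NumPy are available; `pad_groups` is a hypothetical helper. It pads at the front, which matches the default `padding='pre'` of `pad_sequences`:

```python
import numpy as np
import pandas as pd

def pad_groups(df, group_col, feature_cols, pad_value=0.0):
    """Stack each group's rows into (n_groups, max_len, n_features),
    padding shorter groups with pad_value at the front."""
    groups = [g[feature_cols].to_numpy(dtype="float32")
              for _, g in df.groupby(group_col, sort=False)]
    max_len = max(len(g) for g in groups)
    out = np.full((len(groups), max_len, len(feature_cols)), pad_value, dtype="float32")
    for i, g in enumerate(groups):
        out[i, max_len - len(g):, :] = g  # the last len(g) steps hold the real data
    return out

# Same values as the test dataframe in the question.
df = pd.DataFrame({
    "ID": ["a12"] * 5 + ["b33"] * 4 + ["v55"] * 6,
    "A": [0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1],
    "B": [0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1],
    "Exp_A": [2.2] * 5 + [3.1] * 4 + [1.5] * 6,
    "Exp_B": [2.4] * 5 + [1.2] * 4 + [1.5] * 6,
})
padded = pad_groups(df, "ID", ["A", "B", "Exp_A", "Exp_B"])
print(padded.shape)  # (3, 6, 4)
```

Because 0 is also a legitimate value of A and B here, a pad value that cannot occur in the data (together with a matching `mask_value`) may be a safer choice when the result feeds a Masking layer.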
edit
#split into train/test
train = pad_1[:2] # train on the 1st two samples.
test = pad_1[-1:]
train_X = train[:,:-1] # one step ahead prediction.
train_y = train[:,1:]
test_X = test[:,:-1] # test on the last sample
test_y = test[:,1:]
# check shapes
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)
#(2, 5, 4) (2, 5, 4) (1, 5, 4) (1, 5, 4)
# design network
model = Sequential()
model.add(Masking(mask_value=0., input_shape=(train.shape[1], train.shape[2])))
model.add(LSTM(32, input_shape=(train.shape[1], train.shape[2]), return_sequences=True))
model.add(Dense(4))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
model.summary()
# fit network
history = model.fit(train, test, epochs=300, validation_data=(test_X, test_y), verbose=2, shuffle=False)
So my questions are:
Surely, there must be an efficient way of transforming the data?
Say I want a single time-step prediction for a future sequence, and I have
first time-step = array([[[0.5 , 0.9 , 2.5, 3.5]]], dtype=float32)
where the first time-step is a single 'frame' of a sequence.
How do I adjust the model to incorporate this?
To resolve the error, remove return_sequences=True from the LSTM layer arguments (with the architecture you have defined, you only need the output of the last timestep) and use train[:, -1] and test[:, -1] (instead of train[:, -1:] and test[:, -1:]) to extract the labels. Dropping the : removes the second axis, which makes the label shape consistent with the output shape of the model.
As a side note, wrapping a Dense layer inside a TimeDistributed layer is redundant, since the Dense layer is applied on the last axis.
Update: As for the new question, either pad the input sequence which has only one timestep to make it have a shape of (5,4), or alternatively set the input shape of the first layer (i.e. Masking) to input_shape=(None, train.shape[2]) so the model can work with inputs of varying length.
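The first option, padding the single time step up to the length the model was trained on, can be sketched with NumPy alone. `pad_to_length` is a hypothetical helper; it pads at the front with zeros, on the assumption that the Masking layer uses `mask_value=0.` as in the question:

```python
import numpy as np

def pad_to_length(seq, length, pad_value=0.0):
    """Left-pad a (batch, timesteps, features) array along the time axis."""
    batch, steps, features = seq.shape
    if steps > length:
        raise ValueError("sequence is longer than the target length")
    out = np.full((batch, length, features), pad_value, dtype=seq.dtype)
    out[:, length - steps:, :] = seq  # real data goes in the last `steps` positions
    return out

first_time_step = np.array([[[0.5, 0.9, 2.5, 3.5]]], dtype="float32")
padded_step = pad_to_length(first_time_step, 5)
print(padded_step.shape)  # (1, 5, 4)
```

With the second option, `input_shape=(None, train.shape[2])`, no padding is needed for a single sequence, although sequences batched together still have to share one length.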
I am trying to make a simple proof-of-concept where I can see the probabilities of different classes for a given prediction.
However, everything I try seems to only output the predicted class, even though I am using a softmax activation. I am new to machine learning, so I'm not sure if I am making a simple mistake or if this is a feature not available in Keras.
I'm using Keras + TensorFlow. I have adapted one of the basic examples given by Keras for classifying the MNIST dataset.
My code below is exactly the same as the example, except for a few (commented) extra lines that export the model to a local file.
'''Trains a simple deep NN on the MNIST dataset.
Gets to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
import h5py # added import because it is required for model.save
model_filepath = 'test_model.h5' # added filepath config
batch_size = 128
num_classes = 10
epochs = 20
# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
optimizer=RMSprop(),
metrics=['accuracy'])
history = model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
model.save(model_filepath) # added saving model
print('Model saved') # added log
Then the second part of this is a simple script that should import the model, predict the class for some given data, and print out the probabilities for each class. (I am using the same mnist class included with the Keras codebase to make an example as simple as possible).
import keras
from keras.datasets import mnist
from keras.models import Sequential
import keras.backend as K
import numpy
# loading model saved locally in test_model.h5
model_filepath = 'test_model.h5'
prev_model = keras.models.load_model(model_filepath)
# these lines are copied from the example for loading MNIST data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
# for this example, I am only taking 10 images (indices 1 to 10)
x_slice = x_train[1:11]
# making the prediction
prediction = prev_model.predict(x_slice)
# logging each on a separate line
for single_prediction in prediction:
print(single_prediction)
If I run the first script to export the model, then the second script to classify some examples, I get the following output:
[ 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
This is great for seeing which class each is predicted to be, but what if I want to see the relative probabilities of each class for each example? I am looking for something more like this:
[ 0.94 0.01 0.02 0. 0. 0.01 0. 0.01 0.01 0.]
[ 0. 0. 0. 0. 0.51 0. 0. 0. 0.49 0.]
...
In other words, I need to know how sure each prediction is, not just the prediction itself. I thought seeing the relative probabilities was a part of using a softmax activation in the model, but I can't seem to find anything in the Keras documentation that would give me probabilities instead of the predicted answer. Am I making some kind of silly mistake, or is this feature not available?
So it turns out that the problem was I was not fully normalizing the data in the prediction script.
My prediction script should have had the following lines:
# these lines are copied from the example for loading MNIST data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_train = x_train.astype('float32') # this line was missing
x_train /= 255 # this line was missing too
Because the data was not cast to float and divided by 255 (so that it lies between 0 and 1), the predictions were showing up as just 1s and 0s.
Keras predict indeed returns probabilities, and not classes.
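Why unscaled inputs produce one-hot-looking output can be illustrated with softmax alone. This is a simplified sketch: it assumes that feeding inputs 255 times larger scales the pre-softmax values by roughly the same factor, which ignores biases. The logit values are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical pre-softmax values for one sample with inputs scaled to [0, 1].
logits = np.array([2.0, 1.0, 0.5])
probs_scaled = softmax(logits)

# The same sample with inputs left in [0, 255] (simplifying assumption: logits scale too).
probs_unscaled = softmax(255.0 * logits)

with np.printoptions(precision=4, suppress=True):
    print(probs_scaled)    # a spread of probabilities
    print(probs_unscaled)  # saturates to what looks like a hard class label
```

The second result is still a probability distribution, but the largest logit dominates so strongly that the other entries print as zero.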
Cannot reproduce your issue with my system configuration:
Python version 2.7.12
Tensorflow version 1.3.0
Keras version 2.0.9
Numpy version 1.13.3
Here is my prediction output for your x_slice with the loaded model (trained for 20 epochs, as in your code):
print(prev_model.predict(x_slice))
# Result:
[[ 1.00000000e+00 3.31656316e-37 1.07806675e-21 7.11765177e-30
2.48000320e-31 5.34837679e-28 3.12470132e-24 4.65175406e-27
8.66994134e-31 5.26426367e-24]
[ 0.00000000e+00 5.34361977e-30 3.91144999e-35 0.00000000e+00
1.00000000e+00 0.00000000e+00 1.05583665e-36 1.01395577e-29
0.00000000e+00 1.70868685e-29]
[ 3.99137559e-38 1.00000000e+00 1.76682222e-24 9.33333581e-31
3.99846307e-15 1.17745576e-24 1.87529709e-26 2.18951752e-20
3.57518280e-17 1.62027896e-28]
[ 6.48006586e-26 1.48974980e-17 5.60530329e-22 1.81973780e-14
9.12573406e-10 1.95987500e-14 8.08566866e-27 1.17901132e-12
7.33970447e-13 1.00000000e+00]
[ 2.01602060e-16 6.58242856e-14 1.00000000e+00 6.84244084e-09
1.19809885e-16 7.94907624e-14 3.10690434e-19 8.02848586e-12
4.68330721e-11 5.14736501e-15]
[ 2.31014903e-35 1.00000000e+00 6.02224725e-21 2.35928828e-23
7.50006509e-15 4.06930881e-22 1.13288827e-24 4.20440718e-17
4.95182972e-17 1.85492109e-18]
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00
0.00000000e+00 6.30200370e-27 0.00000000e+00 5.19937755e-33
1.63205659e-31 1.21508034e-20]
[ 1.44608573e-26 1.00000000e+00 1.78712268e-18 6.84598301e-19
1.30042071e-11 2.53873986e-14 5.83169942e-17 1.20201071e-12
2.21844570e-14 3.75015198e-15]
[ 0.00000000e+00 6.29184453e-34 9.22474943e-29 0.00000000e+00
1.00000000e+00 3.05067233e-34 1.43097161e-28 1.34234082e-29
4.28647272e-36 9.29760838e-34]
[ 4.68828449e-30 5.55172479e-20 3.26705529e-19 9.99999881e-01
3.49577992e-22 1.27715460e-11 4.99185615e-36 1.19164204e-20
4.21086124e-16 1.52631387e-07]]
I suspect a rounding issue when printing (or you have trained for many more epochs, and your probabilities for the training set have gotten very close to 1).
To convince yourself that you do get probabilities and not class predictions, I suggest getting predictions from your model trained for a single epoch; you should normally see far fewer 1.0s. Here is the output for a model trained with epochs=1:
print(model.predict(x_slice))
# Result:
[[ 9.99916673e-01 5.36548761e-08 6.10747229e-05 8.21199933e-07
6.64725164e-08 6.78853041e-07 9.09637220e-06 4.56192402e-06
1.62688798e-06 5.23997733e-06]
[ 7.59836894e-07 1.78043920e-05 1.79073555e-04 2.95592145e-05
9.98031914e-01 1.75839632e-05 5.90557102e-06 1.27705920e-03
3.94643757e-06 4.36416740e-04]
[ 4.48473330e-08 9.99895334e-01 2.82608235e-05 5.33154832e-07
9.78453227e-06 1.58954310e-06 3.38150176e-06 5.26260410e-05
8.09341054e-06 3.28643267e-07]
[ 7.38236849e-07 4.80247072e-05 2.81726116e-05 4.77648537e-05
7.21933879e-03 2.52177160e-05 3.88786475e-07 3.56770557e-04
2.83472677e-04 9.91990149e-01]
[ 5.03611082e-05 2.69402866e-04 9.92011130e-01 4.68175858e-03
9.57477605e-05 4.26214538e-04 7.66683661e-05 7.05923303e-04
1.45670515e-03 2.26032615e-04]
[ 1.36330849e-10 9.99994516e-01 7.69141934e-07 1.44130311e-07
9.52201333e-07 1.45219332e-07 4.43408908e-07 6.93398249e-07
2.18685204e-06 1.50741769e-07]
[ 2.39427478e-09 3.75754922e-07 3.89349816e-06 9.99889374e-01
1.85837867e-09 1.16176770e-05 1.89989760e-11 3.12301523e-07
1.13220040e-05 8.29571582e-05]
[ 1.45760115e-08 9.99900222e-01 3.67058942e-06 4.04857201e-06
1.97999962e-05 7.85745397e-06 8.13850420e-06 1.87294081e-05
2.81870762e-05 9.38157609e-06]
[ 7.52560858e-09 8.84437856e-09 9.71140025e-07 5.20911703e-10
9.99986649e-01 3.12135370e-07 1.06521384e-05 1.25693066e-06
7.21853368e-08 5.21001624e-08]
[ 8.67672298e-08 2.17907742e-04 2.45352840e-06 9.95455265e-01
1.43749105e-06 1.51766278e-03 1.83744309e-08 3.83995541e-07
9.90309782e-05 2.70584645e-03]]
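When both the predicted class and its confidence are needed, they can be read from the same probability array. A minimal sketch with NumPy; the two rows are the illustrative values from the desired output in the question, not results from a real model:

```python
import numpy as np

# Illustrative softmax rows (taken from the desired output in the question).
probs = np.array([
    [0.94, 0.01, 0.02, 0.0, 0.0, 0.01, 0.0, 0.01, 0.01, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.51, 0.0, 0.0, 0.0, 0.49, 0.0],
])

predicted_class = probs.argmax(axis=1)  # most likely class per row
confidence = probs.max(axis=1)          # probability assigned to that class

# Fixed-point printing avoids scientific notation, so near-saturated rows are easy to spot.
with np.printoptions(precision=4, suppress=True):
    print(probs)
print(predicted_class, confidence)
```

In the second row the top class has a probability of only 0.51, which is the kind of low-confidence prediction that an argmax alone would hide.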