I would like to build a recurrent network using the functional API in Keras, but change the output shape. At the moment the output shape is (n, 1), where n is the number of input vectors, and if I understood correctly the extra dimension of size 1 comes from the final Dense(1) layer. I would like model.predict to return an output of shape (n,), so that it has the same shape as y_test. I know how to reshape the output after running model.predict, but is there a way to change the network so that model.predict already returns the desired shape?
I also tried to use a Reshape layer but this did not change the output shape.
Would really appreciate any ideas and help! Here is a toy example:
# random toy example
import numpy as np
from tensorflow.keras.layers import Input, Conv1D, MaxPooling1D, Flatten, Dense, Reshape
from tensorflow.keras.models import Model

X = np.random.rand(100, 3)
y = np.random.rand(100)
X_train, X_test = np.vsplit(X, [80])
y_train, y_test = np.split(y, [80])

# define model
inputs = Input(shape=(3, 1))
h = Conv1D(filters=64, kernel_size=2, activation='relu')(inputs)
h = MaxPooling1D(pool_size=2)(h)
h = Flatten()(h)
h = Dense(50, activation='relu')(h)
outputs = Dense(1)(h)
model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='mse')

# fit model
model.fit(X_train, y_train, verbose=0)
print(model.predict(X_test).shape)
which prints the shape
(20, 1)
Using a Reshape layer does not change the output shape either. Am I using it wrong?
inputs = Input(shape=(3, 1))
h = Conv1D(filters=64, kernel_size=2, activation='relu')(inputs)
h = MaxPooling1D(pool_size=2)(h)
h = Flatten()(h)
h = Dense(50, activation='relu')(h)
h = Dense(1)(h)
outputs = Reshape(target_shape=(1,))(h)
model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='mse')

# fit model
model.fit(X_train, y_train, verbose=0)
print(model.predict(X_test).shape)
As you noticed, most tf.keras.layers layers work with an undefined batch dimension of size None, so Reshape only changes the per-sample shape and never touches the batch axis. If you need to reshape your output this way, you need to use a Lambda layer.
outputs_1d = Lambda(lambda x: tf.squeeze(x))(outputs)
and,
model=Model(inputs=inputs, outputs=outputs_1d)
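Putting that together with the toy model from the question, here is a minimal sketch (axis=-1 is added here so only the trailing size-1 dimension is squeezed; the imports and the X_train/y_train arrays from the toy example above are assumed):

import tensorflow as tf
from tensorflow.keras.layers import Input, Conv1D, MaxPooling1D, Flatten, Dense, Lambda
from tensorflow.keras.models import Model

inputs = Input(shape=(3, 1))
h = Conv1D(filters=64, kernel_size=2, activation='relu')(inputs)
h = MaxPooling1D(pool_size=2)(h)
h = Flatten()(h)
h = Dense(50, activation='relu')(h)
outputs = Dense(1)(h)
# squeeze away the trailing dimension of size 1, keeping the batch dimension
outputs_1d = Lambda(lambda x: tf.squeeze(x, axis=-1))(outputs)

model = Model(inputs=inputs, outputs=outputs_1d)
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, verbose=0)
print(model.predict(X_test).shape)  # (20,)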
I've just begun using Keras to train a simple DNN and I'm struggling to set up my custom loss function. Here is the code of the model:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import regularizers

X_train = train_dataframe.to_numpy()[:, 0:4]
Y_train = train_dataframe.to_numpy()[:, 4]

model = Sequential()
model.add(Dense(1000, input_shape=(4,), activation='relu'))
model.add(Dense(1000, activation='relu'))
model.add(Dense(Y_train.shape[0], activation='linear', activity_regularizer=regularizers.l1(0.02)))

def custom_loss(y_true, y_pred):
    mse_loss = tf.keras.losses.mean_squared_error(y_true, np.ones((450, 4)) * y_pred)
    return mse_loss + y_pred

model.compile("adam", custom_loss(X_train, model.layers[2].output), metrics=["accuracy"])
model.fit(X_train, Y_train, epochs=5, batch_size=1)
I will briefly explain. I have a training set of 450 samples with 4 features each as input, and a (450, 1) numerical vector paired with the training set as the target.
Now, what I would like to obtain is a sort of LASSO regression: I apply the activity regularizer on the last layer and then build a custom loss function that computes an MSE between y_true (which is the input) and y_pred, where y_pred is not the network output directly but the output layer values multiplied by a (450, 4) matrix (for simplicity it is filled with ones).
My problem is that I got this error when I run the script:
ValueError: Dimensions must be equal, but are 4 and 450 for 'mul' (op: 'Mul') with input shapes:
[450,4], [?,450].
And maybe it is because I'm not extracting the values of the output layer correctly with model.layers[2].output. So how can I do this properly in Keras?
I think you are making 2 crucial mistakes:
Don't pass arguments to the loss in .compile; Keras is smart enough to call it with y_true and y_pred for you:
model.compile(loss=custom_loss, optimizer='adam', metrics=["accuracy"])
If you want to apply some multiplication to the last layer, then create a custom layer for that; don't do it inside the loss function. The loss function's job is only to measure how far the predicted values are from the real ones.
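A minimal sketch of that idea follows. One way to read the intended multiplication is a matrix product that maps the 450 activations down to 4 values so they can be compared with the 4 input features; the MultiplyByMatrix layer, its name, and the matmul interpretation are assumptions for illustration, not the original poster's exact code:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Layer
from tensorflow.keras.models import Model
from tensorflow.keras import regularizers

class MultiplyByMatrix(Layer):
    """Multiplies the incoming activations by a fixed (non-trainable) matrix."""
    def __init__(self, matrix, **kwargs):
        super().__init__(**kwargs)
        self.matrix = tf.constant(matrix, dtype=tf.float32)

    def call(self, inputs):
        # inputs: (batch, 450), matrix: (450, 4) -> output: (batch, 4)
        return tf.matmul(inputs, self.matrix)

inputs = Input(shape=(4,))
h = Dense(1000, activation='relu')(inputs)
h = Dense(1000, activation='relu')(h)
h = Dense(450, activation='linear', activity_regularizer=regularizers.l1(0.02))(h)
outputs = MultiplyByMatrix(np.ones((450, 4)))(h)  # all-ones matrix, as in the question

model = Model(inputs, outputs)
# the loss is now a plain MSE between the 4 input features and the multiplied output;
# the L1 penalty is handled by the activity regularizer, not by the loss function
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, X_train, epochs=5, batch_size=1)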
I am using the MNIST dataset (digits) and would like to implement a mean squared error loss function, but I get the following error:
ValueError: A target array with shape (60000, 1) was passed for an output of shape (None, 10) while using as loss mean_squared_error. This loss expects targets to have the same shape as the output.
this is my code:
Originally, I tried sparse_categorical_crossentropy
Code modified from: https://www.youtube.com/watch?v=wQ8BIBpya2k
import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='sigmoid'),
    tf.keras.layers.Dense(10, activation='sigmoid')
])

model.compile(optimizer='SGD',
              loss='mean_squared_error',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=3)
How can I reshape my data so that it works with MSE?
I guess you are missing something very important here. You are trying to use a loss designed for regression (mean squared error) for a classification task (predicting classes). These are two different objectives in the machine-learning world.
If you'd like to try it anyway, just change your last layer to one output neuron with ReLU activation:
tf.keras.layers.Dense(1, activation='relu')
One output neuron and ReLU activation, since your labels are just the (integer) numbers from 0 to 9. Sigmoid gives you continuous values between 0 and 1, so it won't bring you any success in this case.
Keep in mind that your model no longer does classification; it will give you a continuous number between 0 and inf. So don't be surprised if you get e.g. 3.1415 as output when you feed an image of a 3 into your model. The model now tries to produce outputs as close as possible to the number in the label.
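For completeness, here is a sketch of what that suggestion looks like in the question's code. The reshape of y_train to a column and the dropped accuracy metric are added assumptions: the column shape matches the (None, 1) output, and accuracy is meaningless once this is regression rather than classification:

import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='sigmoid'),
    tf.keras.layers.Dense(1, activation='relu')   # single output neuron regressing the digit value
])

model.compile(optimizer='SGD', loss='mean_squared_error')
# column-shaped float targets match the (None, 1) output of the last layer
model.fit(x_train, y_train.reshape(-1, 1).astype('float32'), epochs=3)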
I am trying to use an LSTM for multi-class classification of time series data.
The training set has dimensions (390, 179), i.e. 390 objects with 179 time steps each.
There are 37 possible classes.
I would like to use a Keras model with just an LSTM and activation layer to classify input data.
I also need the hidden states for all the training data and test data passed through the model, at every step of the LSTM (not just the final state).
I know return_sequences=True is needed, but I'm having trouble getting dimensions to match.
Below is some code I've tried, along with a ton of other combinations of calls from a motley of Stack Exchange posts and GitHub issues; in all of them I get one dimension mismatch or another.
I don't know how to extract the hidden state representations from the model.
We have X_train.shape = (390, 1, 179), Y_train.shape = (390, 37) (one-hot binary vectors).
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

n_units = 8
n_sequence = 179
n_class = 37

x = Input(shape=(1, n_sequence))
y = LSTM(n_units, return_sequences=True)(x)
z = Dense(n_class, activation='softmax')(y)

model = Model(inputs=[x], outputs=[z])
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X_train, Y_train, epochs=100, batch_size=128)

Y_test_predict = model.predict(X_test, batch_size=128)
This is what the above gives me:
ValueError: A target array with shape (390, 37) was passed for an output of shape (None, 1, 37) while using as loss 'categorical_crossentropy'. This loss expects targets to have the same shape as the output.
Your input shape should look like this: (samples, timesteps, features)
where samples is how many sequences you have, timesteps is how long each sequence is, and features is how many inputs you have at each timestep.
If you set return_sequences=True, your label array should have the shape (samples, timesteps, output features).
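Concretely, for this question that means treating each of the 179 measurements as its own timestep with a single feature (a sketch, plain numpy reshape assumed):

X_train = X_train.reshape(390, 179, 1)   # (samples, timesteps, features)
X_test = X_test.reshape(-1, 179, 1)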
There didn't seem to be any way to build a working trainable model while also returning the hidden states with return_sequences=True.
The fix I found was to build a predictor model, train it, and save its weights. Then I built a new model that ended with my LSTM layer and loaded the trained weights into it. So, using return_sequences=True, I was able to predict on new data and get the data's representation at each hidden state.
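A minimal sketch of that two-model approach, assuming the data has already been reshaped to (samples, timesteps, features) = (390, 179, 1) as described above (the layer name 'lstm' and the variable names are illustrative, not from the original post):

from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

n_units, n_class = 8, 37

# 1) trainable classifier: only the final LSTM state feeds the softmax
x = Input(shape=(179, 1))
h = LSTM(n_units, name='lstm')(x)                       # return_sequences=False by default
out = Dense(n_class, activation='softmax')(h)
clf = Model(x, out)
clf.compile(loss='categorical_crossentropy', optimizer='adam')
clf.fit(X_train, Y_train, epochs=100, batch_size=128)

# 2) extractor: same LSTM architecture, but returning the hidden state at every timestep
x2 = Input(shape=(179, 1))
h2 = LSTM(n_units, return_sequences=True, name='lstm')(x2)
extractor = Model(x2, h2)
extractor.get_layer('lstm').set_weights(clf.get_layer('lstm').get_weights())

hidden_states = extractor.predict(X_train)              # shape (390, 179, n_units)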
Note: I already read keras forward pass with tensorflow variable as input but it did not help.
I'm training an unsupervised auto-encoder neural network with Keras on the MNIST database:
import keras, cv2
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255.0
x_test = x_test.reshape(10000, 784).astype('float32') / 255.0
model = Sequential()
model.add(Dense(100, activation='sigmoid', input_shape=(784,)))
model.add(Dense(10, activation='sigmoid'))
model.add(Dense(100, activation='sigmoid'))
model.add(Dense(784, activation='sigmoid'))
model.compile(loss='mean_squared_error', optimizer='sgd')
history = model.fit(x_train, x_train, batch_size=1, epochs=1, verbose=0)
Then I would like to get the output vector when the input vector is x_test[i]:
for i in range(100):
    x = x_test[i]
    a = model(x)
    cv2.imshow('img', a.reshape(28, 28))
    cv2.waitKey(0)
but I get this error:
All inputs to the layer should be tensors.
How should I modify this code to do a forward pass of an input vector in the neural network, and get a vector in return?
Also how to get the activation after, say, the 2nd layer? i.e. don't propagate until the last layer, but get the output after the 2nd layer.
Example: input: vector of size 784, output: vector of size 10
To run a model after you've finished training it, you need to use Keras predict(). This evaluates the graph given your input data. Note that the input data must have the same dimensions as the specified model inputs, which in your case looks to be [None, 784]. Keras does not require you to specify the batch dimension, but you still need a 2D array going in. Do something like:
import numpy

x = x_test[5]
x = x[numpy.newaxis, :]   # add the batch dimension -> shape (1, 784)
out_val = model.predict(x)[0]
if you just want to process a single value.
The numpy.newaxis is required to make a 2D array and thus match your input size. You can skip this if you pass in an array of values to evaluate all at once.
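For example, evaluating the whole test set in one call (a sketch using the x_test array from the question):

out_all = model.predict(x_test)   # x_test already has shape (10000, 784), so no newaxis is needed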
With Keras/TensorFlow, your model is a graph/function, not standard Python procedural code. You can't call it with data directly. You need to create functions and then call the functions. To get the output of an intermediate layer you can do something like:
from keras import backend as K

OutFunc = K.function([model.input], [model.layers[2].output])
out_val = OutFunc([x])[0]
again, keep in mind there is a batch dimension on the input which will also be present in the output. There are a number of posts on getting data from intermediate layers if you need additional examples; for instance, see Keras, How to get the output of each layer?
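An alternative to K.function is to build a second Model that reuses the trained layers and stops at the layer you want. This is a sketch: layers[1] here is the Dense(10) bottleneck of the autoencoder above, matching the "vector of size 10" example from the question, and the encoder name is illustrative:

import numpy
from keras.models import Model

# model.layers[1] is the Dense(10) layer in the autoencoder defined in the question
encoder = Model(inputs=model.input, outputs=model.layers[1].output)

x = x_test[5][numpy.newaxis, :]    # shape (1, 784)
code = encoder.predict(x)[0]       # shape (10,): the activation after the 2nd layer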
Another way to do this than the accepted answer: when x is just a (784,) or (784,1) numpy array, we can use this:
model.predict([[x]])
with a double [[...]].
I am trying to predict the next value in the time series using the previous 20 values. Here is a sample from my code:
X_train.shape is (15015, 20)
Y_train.shape is (15015,)
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.layers.recurrent import LSTM

EMB_SIZE = 1
HIDDEN_RNN = 3

model = Sequential()
model.add(LSTM(input_shape=(EMB_SIZE,), input_dim=EMB_SIZE, output_dim=HIDDEN_RNN, return_sequences=True))
model.add(LSTM(input_shape=(EMB_SIZE,), input_dim=EMB_SIZE, output_dim=HIDDEN_RNN, return_sequences=False))
model.add(Dense(1))
model.add(Activation('softmax'))

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(X_train,
          Y_train,
          nb_epoch=5,
          batch_size=128,
          verbose=1,
          validation_split=0.1)

score = model.evaluate(X_test, Y_test, batch_size=128)
print score
Though when I ran my code I got the following error:
TypeError: ('Bad input argument to theano function with name "/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.py:484" at index 0(0-based)', 'Wrong number of dimensions: expected 3, got 2 with shape (32, 20).')
I was trying to replicate the results in this post: neural networks for algorithmic trading. Here is a link to the git repo: link
It seems to be a conceptual error. Please post any sources where I can get a better understanding of LSTMs for time series prediction. Also, please explain how I can fix this error so that I can reproduce the results from the article mentioned above.
If I understand your problem correctly, your input data is a set of 15015 1D sequences of length 20. According to the Keras doc, the input is a 3D tensor with shape (nb_samples, timesteps, input_dim). In your case, the shape of X should then be (15015, 20, 1).
Also, you only need to give input_dim to the first LSTM layer; input_shape is redundant, and the second layer will infer its input shape automatically:
model = Sequential()
model.add(LSTM(input_dim=EMB_SIZE, output_dim=HIDDEN_RNN, return_sequences=True))
model.add(LSTM(output_dim=HIDDEN_RNN, return_sequences=False))
LSTM in Keras has an input tensor shape of (nb_samples, timesteps, feature_dim)
In your case, X_train should probably have an input shape of (15015, 20, 1). Just reshape it accordingly and the model should run.
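A minimal sketch of the reshape both answers describe (the rest of the training call is unchanged from the question):

# add a trailing feature dimension: each of the 20 previous values becomes one timestep
X_train = X_train.reshape(X_train.shape[0], 20, 1)   # (15015, 20, 1)
X_test = X_test.reshape(X_test.shape[0], 20, 1)

model.fit(X_train, Y_train, nb_epoch=5, batch_size=128, verbose=1, validation_split=0.1)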