The input shape and fitting in Keras LSTM model - python

I am learning to fit an LSTM model for multi-class classification (eight genres of music), but I am unsure about the input shape in the Keras model.
I've followed the tutorials here:
How to reshape input data for LSTM model
Multi-Class Classification Tutorial with the Keras Deep Learning Library
Sequence Classification with LSTM Recurrent Neural Networks in Python with Keras
My data is like this:
vector_1,vector_2,...vector_30,genre
23.5 20.5 3 pop
.
.
.
(7678 rows)
I reshaped my data to (7678, 1, 30), i.e. 7678 pieces of music, 1 timestep, and 30 features. For the genre labels I used train_labels = pd.get_dummies(df['genre'])
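For reference, the reshape described above could look roughly like this (a sketch only; df with the vector_1..vector_30 and genre columns is taken from the data sample above, and any feature scaling is omitted):
import pandas as pd

# df has columns vector_1 ... vector_30 plus 'genre' (per the data sample above)
features = df.drop(columns=['genre']).values        # (7678, 30)
scaled_train = features.reshape(-1, 1, 30)          # (7678, 1, 30): samples, timesteps, features
train_labels = pd.get_dummies(df['genre']).values   # (7678, 8) one-hot label matrix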
Here is my model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

# build a sequential model
model = Sequential()
# Keras convention: input_shape is (timesteps, features) = (1, 30) from scaled_train
model.add(LSTM(32, input_shape=(1, 30), return_sequences=True))
model.add(LSTM(32, return_sequences=True))
model.add(LSTM(32))
# to avoid overfitting
model.add(Dropout(0.3))
# output layer
model.add(Dense(8, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
Fitting the model
model.fit(scaled_train, train_labels, epochs=5, validation_data=(scaled_validation, valid_labels))
But when trying to fit the model, I got the error ValueError: Shapes (None, 8) and (None, 1, 8) are incompatible. Is there anything I did wrong in the code? Any help is highly appreciated.
The shape of my data
print(scaled_train.shape)
print(train_labels.shape)
print(scaled_validation.shape)
print(valid_labels.shape)
(7678, 1, 30)
(7678, 8)
(450, 30)
(450, 8)
EDIT
I've tried How to stack multiple lstm in keras?
But still, get the error ValueError: Input 0 of layer sequential_21 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 30]

As the name suggests, return_sequences=True returns the full sequence, one output per time step. That's why your output shape is (None, 1, 8): the time-step dimension is kept, and it is not flattened automatically when it passes through the Dense layer. Try:
model = Sequential()
model.add(LSTM(32, input_shape=(1, 30), return_sequences=False))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(8, activation='softmax'))
I guess this doesn't happen if you uncomment the second LSTM layer?
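Regarding the error in the EDIT: stacked LSTMs need return_sequences=True on every layer except the last, and the validation features must be 3D as well. A sketch (reshaping scaled_validation is an assumption based on the (450, 30) shape printed in the question, and assumes it is a NumPy array):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

# The validation features are printed as (450, 30); the LSTM expects
# (samples, timesteps, features), hence the ndim=3 vs ndim=2 error.
scaled_validation = scaled_validation.reshape(-1, 1, 30)   # (450, 1, 30)

model = Sequential()
model.add(LSTM(32, input_shape=(1, 30), return_sequences=True))  # keep the sequence for the next LSTM
model.add(LSTM(32))                                               # last LSTM: output (None, 32)
model.add(Dropout(0.3))
model.add(Dense(8, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(scaled_train, train_labels, epochs=5,
          validation_data=(scaled_validation, valid_labels))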

Related

Hyperparameter Tuning with Keras Tuner RandomSearch Error

I am using Keras Tuner to optimize hyperparameters: hidden layers, neurons, activation function, and learning rate. I have a time series regression problem with 31 inputs, 32 outputs, and N data samples.
My original X_train shape is (N, 31) and Y_train shape is (N, 32). To match the Keras input shape, I reshape X_train and Y_train as follows:
X_train.shape: (N,31,1)
Y_train.shape: (N,32).
In the above code, X_train.shape[1] is 31 and Y_train.shape[1] is 32. When I run the hyperparameter tuning, it fails with ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 20).
What am I missing, and what is the issue?
LSTM layers expect a 3D input tensor with shape [batch, timesteps, features]. Since the number of layers is one of your tuning parameters, whenever two or more LSTM layers are stacked, every LSTM layer after the first also expects a 3D input. That means each LSTM layer that feeds into another LSTM needs return_sequences=True, so its output keeps the 3D shape (batch size, timesteps, hidden state) expected by the next LSTM layer.
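A minimal sketch of such a build function with keras_tuner (the (31, 1) input and 32 outputs come from the question; the hyperparameter names and search ranges are hypothetical):
import keras_tuner as kt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def build_model(hp):
    n_layers = hp.Int('num_lstm_layers', 1, 3)    # hypothetical search range
    units = hp.Int('units', 16, 128, step=16)
    model = Sequential()
    for i in range(n_layers):
        # Every LSTM except the last must return the full sequence so the
        # next LSTM still receives a 3D (batch, timesteps, features) tensor.
        return_seq = i < n_layers - 1
        if i == 0:
            model.add(LSTM(units, return_sequences=return_seq, input_shape=(31, 1)))
        else:
            model.add(LSTM(units, return_sequences=return_seq))
    model.add(Dense(32))                          # 32 regression outputs
    model.compile(optimizer='adam', loss='mse')
    return model

tuner = kt.RandomSearch(build_model, objective='val_loss', max_trials=10)
# tuner.search(X_train, Y_train, validation_split=0.2)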

Python Keras Sequential model input

I have an array for a time series sliding-window forecasting approach with tf.keras:
X.shape
(8779, 6, 1)
to fit the MLP model:
# define model
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(6,)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
Could anyone give me a tip on how to correct this model input?
input_shape=(6,)
I can't figure out how to get past this error:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 6 but received input with shape (None, 6, 1)
Even though this was solved by a recommendation in the comments, here is the solution:
Changing:
input_shape=(6,)
Into:
input_shape=(6,1)
worked.
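For completeness, a sketch of the model with that one-line change applied (note that with a 3D input the Dense layers act on the last axis, so the output is (None, 6, 1); depending on the target shape, a Flatten layer may also be needed):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# define model with the (6, 1) input shape matching X of shape (8779, 6, 1)
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(6, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')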

Add GlobalAveragePooling2D (before ResNet50)

I'm trying to build a model using ResNet50 for image classification into 6 classes, and I want to reduce the dimensions of the images before using them to train the ResNet50 model. To do this I start by creating a ResNet50 model using the one provided in Keras:
ResNet = ResNet50(
    include_top=None, weights='imagenet', input_tensor=None,
    input_shape=([64, 109, 3]), pooling=None, classes=6)
And then I create a sequential model that includes ResNet50 but adding some final layers for the classification and also the first layer for dimensionality reduction before using ResNet50:
(About the input shape: the images I'm using are 128x217, and the 3 is for the channels that ResNet needs.)
model = models.Sequential()
model.add(GlobalAveragePooling2D(input_shape = ([128, 217, 3])))
model.add(ResNet)
model.add(GlobalAveragePooling2D())
model.add(Dense(units=512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=6, activation='softmax'))
But this doesn't work, because the shape produced by the first global average pooling doesn't match the input shape expected by the ResNet. The error I get is:
WARNING:tensorflow:Model was constructed with shape (None, 64, 109, 3) for input Tensor("input_6:0", shape=(None, 64, 109, 3), dtype=float32), but it was called on an input with incompatible shape (None, 3).
ValueError: Input 0 of layer conv1_pad is incompatible with the layer: expected ndim=4, found ndim=2. Full shape received: [None, 3]
I think I understand what is the problem but I don't know how to fix it since (None, 3) is not a valid input shape for ResNet50. How can I fix this? Thank you!:)
You should first understand what GlobalAveragePooling actually does. This layer cannot be applied right after the input, because it only gives the average value over all spatial positions for each channel (in your case 3 values, because you have 3 channels).
You have to use another method to reduce the size of the images (e.g. simply resizing them to a smaller size).
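One way to do that shrinking inside the model is sketched below; this is a suggestion, not part of the answer above, and assumes a TensorFlow version where tf.keras.layers.Resizing is available (2.6+):
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

# ResNet50 built for the reduced 64x109 resolution, as in the question
resnet = ResNet50(include_top=False, weights='imagenet', input_shape=(64, 109, 3))

model = models.Sequential([
    # Shrink 128x217 images to 64x109 while keeping the 4D (batch, h, w, channels) shape
    layers.Resizing(64, 109, input_shape=(128, 217, 3)),
    resnet,
    layers.GlobalAveragePooling2D(),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(6, activation='softmax'),
])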

Conversion from VGG19 to ResNet in Keras

I have the following code, which works with the pre-trained VGG model but fails with the ResNet and Inception models.
vgg_model = keras.applications.vgg16.VGG16(weights='imagenet')
type(vgg_model)
vgg_model.summary()
model = Sequential()
for layer in vgg_model.layers:
    model.add(layer)
Now, changing the model to ResNet as follows:
resnet_model=keras.applications.resnet50.ResNet50(weights='imagenet')
type(resnet_model)
resnet_model.summary()
model = Sequential()
for layer in resnet_model.layers:
    model.add(layer)
gives the following error:
ValueError: Input 0 is incompatible with layer res2a_branch1: expected axis -1 of input shape to have value 64 but got shape (None, 56, 56, 256)
The problem is due to the fact that, unlike VGG, ResNet does not have a sequential architecture (e.g. some layers are connected to more than one layer, there are skip connections, etc.). Therefore you cannot iterate over the layers in the model one after another and connect each layer to the previous one (i.e. sequentially). You can plot the architecture of the model using plot_model() to get a better understanding of this point.
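A sketch of the plotting call, plus one common alternative that is not part of the answer above: if the goal is simply to extend ResNet50 with new layers, the whole pretrained model can be added to a Sequential model as a single building block instead of copying its layers one by one (the classification head below is hypothetical; plot_model needs pydot and graphviz installed):
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.utils import plot_model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

resnet_model = ResNet50(weights='imagenet')
# Visualise the skip connections / branching structure the answer refers to
plot_model(resnet_model, to_file='resnet50.png', show_shapes=True)

# Instead of iterating over resnet_model.layers, add the whole model as one block
model = Sequential()
model.add(ResNet50(include_top=False, weights='imagenet', pooling='avg', input_shape=(224, 224, 3)))
model.add(Dense(10, activation='softmax'))  # hypothetical classification head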

Saving LSTM hidden states while training and predicting for multi-class time series classification

I am trying to use an LSTM for multi-class classification of time series data.
The training set has dimensions (390, 179), i.e. 390 objects with 179 time steps each.
There are 37 possible classes.
I would like to use a Keras model with just an LSTM and activation layer to classify input data.
I also need the hidden states for all the training data and test data passed through the model, at every step of the LSTM (not just the final state).
I know return_sequences=True is needed, but I'm having trouble getting dimensions to match.
Below is some code I've tried; I've also tried a ton of other combinations of calls from a motley of StackExchange and GitHub issues, and in all of them I get some dimension mismatch or other.
I don't know how to extract the hidden state representations from the model.
We have X_train.shape = (390, 1, 179) and Y_train.shape = (390, 37) (one-hot binary vectors).
n_units = 8
n_sequence = 179
n_class = 37
x = Input(shape=(1, n_sequence))
y = LSTM(n_units, return_sequences=True)(x)
z = Dense(n_class, activation='softmax')(y)
model = Model(inputs=[x], outputs=[y])
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X_train, Y_train, epochs=100, batch_size=128)
Y_test_predict = model.predict(X_test, batch_size=128)
This is what the above gives me:
ValueError: A target array with shape (390, 37) was passed for an output of shape (None, 1, 37) while using as loss 'categorical_crossentropy'. This loss expects targets to have the same shape as the output.
Your input shape should look like this: (samples, timesteps, features),
where samples is how many sequences you have, timesteps is how long each sequence is, and features is how many inputs you feed in at each time step.
If you set return_sequences=True, your label array should have the shape (samples, timesteps, output features).
There didn't seem to be any way to build a working trainable model while also returning the hidden states with return_sequences=True.
The fix I found was to build a predictor model and train it, and save the weights. Then I built a new model which ended with my LSTM layer, and fed it the trained weights. So, using return_sequences=True, I was able to predict on new data and get the data's representations at each hidden state.
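A rough sketch of that two-model approach, using the shapes from the question (the weight transfer via get_weights/set_weights is one way to do the "fed it the trained weights" step described above; X_train, Y_train, and X_test are assumed to exist as described):
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

n_units, n_sequence, n_class = 8, 179, 37

# Model 1: trainable classifier that only uses the final hidden state
x = Input(shape=(1, n_sequence))
h = LSTM(n_units)(x)                       # return_sequences defaults to False
out = Dense(n_class, activation='softmax')(h)
clf = Model(inputs=x, outputs=out)
clf.compile(loss='categorical_crossentropy', optimizer='adam')
clf.fit(X_train, Y_train, epochs=100, batch_size=128)

# Model 2: same architecture up to the LSTM, but returning every hidden state
x2 = Input(shape=(1, n_sequence))
h2 = LSTM(n_units, return_sequences=True)(x2)
extractor = Model(inputs=x2, outputs=h2)
extractor.layers[1].set_weights(clf.layers[1].get_weights())   # copy trained LSTM weights
hidden_states = extractor.predict(X_test, batch_size=128)      # (samples, timesteps, n_units)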
