merging recurrent layers with dense layer in Keras - python

I want to build a neural network where the first two layers are feedforward and the last one is recurrent.
Here is my code:
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN
from keras import optimizers as OP

model = Sequential()
model.add(Dense(150, input_dim=23, init='normal', activation='relu'))
model.add(Dense(80, activation='relu', init='normal'))
model.add(SimpleRNN(2, init='normal'))
adam = OP.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.compile(loss='mse', optimizer=adam)

and I get this error:

Exception: Input 0 is incompatible with layer simplernn_11: expected ndim=3, found ndim=2.

It is correct that in Keras a recurrent layer expects its input as (nb_samples, time_steps, input_dim). However, if you want to add an RNN layer after a Dense layer, you can still do so by reshaping the input for the RNN layer. Reshape can be used both as the first layer and as an intermediate layer in a Sequential model. Examples are given below:
Reshape as first layer in a Sequential model
model = Sequential()
model.add(Reshape((3, 4), input_shape=(12,)))
# now: model.output_shape == (None, 3, 4)
# note: `None` is the batch dimension
Reshape as an intermediate layer in a Sequential model
model.add(Reshape((6, 2)))
# now: model.output_shape == (None, 6, 2)
For example, if you change your code in the following way, there will be no error. I have checked it, and the model compiled without any error being reported. You can change the dimensions as per your needs.
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN, Reshape
from keras.optimizers import Adam
model = Sequential()
model.add(Dense(150, input_dim=23, init='normal', activation='relu'))
model.add(Dense(80, activation='relu', init='normal'))
model.add(Reshape((1, 80)))  # reshape the 2-D Dense output into (batch, 1 time step, 80 features)
model.add(SimpleRNN(2, init='normal'))
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(loss="mean_squared_error", optimizer=adam)

In Keras, you cannot put a recurrent layer directly after a Dense layer, because a Dense layer produces output of shape (nb_samples, output_dim), while a recurrent layer expects input of shape (nb_samples, time_steps, input_dim). In other words, a Dense layer gives a 2-D output, but a recurrent layer expects a 3-D input. You can, however, do the reverse, i.e. put a Dense layer after a recurrent layer.
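For completeness, here is a minimal sketch of that reverse ordering (a hypothetical example, reusing the 23-feature input from the question as a single time step): a recurrent layer followed by a Dense layer needs no reshaping, because Dense accepts the RNN's 2-D output directly.

from keras.models import Sequential
from keras.layers import Dense, SimpleRNN

model = Sequential()
model.add(SimpleRNN(80, input_shape=(1, 23), activation='relu'))  # 3-D input in, 2-D (nb_samples, 80) output out
model.add(Dense(2))                                               # Dense accepts the 2-D output directly
model.compile(loss='mean_squared_error', optimizer='adam')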

Related

LSTM Text generation Input_shape

I am attempting to add another LSTM layer to my model, but I am only a beginner and not very good at this. I am using the (Better) - Donald Trump Tweets! dataset on Kaggle for LSTM text generation.
I am struggling to get it to run, as it returns an error:

ValueError: Input 0 of layer lstm_16 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 128]
My model is:
print('Building model...')
model2 = Sequential()
model2.add(LSTM(128, input_shape=(maxlen, len(chars)),return_sequences=True))
model2.add(Dropout(0.5))
model2.add(LSTM(128))
model2.add(Dropout(0.5))
model2.add(LSTM(128))
model2.add(Dropout(0.2))
model2.add(Dense(len(chars), activation='softmax'))
# optimizer = RMSprop(lr=0.01)
optimizer = Adam()
model2.compile(loss='categorical_crossentropy', optimizer=optimizer)
print('model built')
The model works with only two LSTM layers, two Dropout layers, and one Dense layer. I think something is wrong with my setup for input_shape, but I could be wrong. My model is based on a notebook from the above dataset.
In order to stack RNNs you will have to use return_sequences=True.
From the error you can see that the layer was expecting a 3-dimensional tensor but received a 2-dimensional one. Here you can read that the return_sequences=True flag makes a layer output a 3-dimensional tensor:
If True the full sequence of successive outputs for each timestep is returned (a 3D tensor of shape (batch_size, timesteps, output_features)).
Assuming that there are no issues with your input layer and that the input data is passed in correctly, I propose trying the following model.
print('Building model...')
model2 = Sequential()
model2.add(LSTM(128, input_shape=(maxlen, len(chars)), return_sequences=True))
model2.add(Dropout(0.5))
model2.add(LSTM(128, return_sequences=True))  # return_sequences=True so the next LSTM receives a 3-D tensor
model2.add(Dropout(0.5))
model2.add(LSTM(128))  # the last LSTM returns only its final output, which is what Dense expects
model2.add(Dropout(0.2))
model2.add(Dense(len(chars), activation='softmax'))
# optimizer = RMSprop(lr=0.01)
optimizer = Adam()
model2.compile(loss='categorical_crossentropy', optimizer=optimizer)
print('model built')
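As a quick sanity check (a hypothetical illustration, assuming for instance maxlen = 40 and a 50-character vocabulary), model2.summary() should report output shapes along these lines, with the Dropout layers keeping the shape of their input:

# lstm   (LSTM)  -> (None, 40, 128)  return_sequences=True keeps the time axis
# lstm_1 (LSTM)  -> (None, 40, 128)  return_sequences=True keeps the time axis
# lstm_2 (LSTM)  -> (None, 128)      the last LSTM returns only its final output
# dense  (Dense) -> (None, 50)       one probability per character
model2.summary()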

How to add an attention layer to LSTM autoencoder built as sequential keras model in python?

So I want to build an autoencoder model for sequence data. I have started to build a sequential Keras model in Python, and now I want to add an attention layer in the middle, but I have no idea how to approach this. My model so far:
from keras.layers import LSTM, TimeDistributed, RepeatVector, Layer, Dense
from keras.models import Sequential
import keras.backend as K
model = Sequential()
model.add(LSTM(20, activation="relu", input_shape=(time_steps,n_features), return_sequences=False))
model.add(RepeatVector(time_steps, name="bottleneck_output"))
model.add(LSTM(30, activation="relu", return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
model.compile(optimizer="adam", loss="mae")
So far I have tried to add an attention layer copied from here:
class attention(Layer):
    def __init__(self, **kwargs):
        super(attention, self).__init__(**kwargs)

    def build(self, input_shape):
        self.W = self.add_weight(name="att_weight", shape=(input_shape[-1], 1), initializer="normal")
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1), initializer="zeros")
        super(attention, self).build(input_shape)

    def call(self, x):
        et = K.squeeze(K.tanh(K.dot(x, self.W) + self.b), axis=-1)
        at = K.softmax(et)
        at = K.expand_dims(at, axis=-1)
        output = x * at
        return K.sum(output, axis=1)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])

    def get_config(self):
        return super(attention, self).get_config()
and added it after the first LSTM, before the RepeatVector, i.e.:
model = Sequential()
model.add(LSTM(20, activation="relu", input_shape=(time_steps,n_features), return_sequences=False))
model.add(attention()) # this is added
model.add(RepeatVector(time_steps, name="bottleneck_output"))
model.add(LSTM(30, activation="relu", return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
model.compile(optimizer="adam", loss="mae")
but the code gives an error, because the dimensions somehow do not fit; the problem is in feeding the output of attention() into the RepeatVector:

ValueError: Input 0 is incompatible with layer bottleneck_output: expected ndim=2, found ndim=1

... yet according to model.summary() the output dimension of the attention layer is (None, 20), which is the same as for the first lstm_1 layer. The code works without the attention layer.
I would also appreciate some explanation of why the solution solves the problem; I am fairly new to Python and have trouble understanding what the attention() class is doing. I just copied it and tried to use it, which is of course not working.
OK, I solved it. There has to be return_sequences=True in the first LSTM layer; then it works as it is.
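For reference, a minimal sketch of the corrected model (assuming time_steps, n_features and the attention class shown above): with return_sequences=True the first LSTM hands a 3-D tensor to the attention layer, which reduces it back to (batch, 20) before the RepeatVector.

model = Sequential()
model.add(LSTM(20, activation="relu", input_shape=(time_steps, n_features), return_sequences=True))  # full sequence out
model.add(attention())                               # reduces (batch, time_steps, 20) to (batch, 20)
model.add(RepeatVector(time_steps, name="bottleneck_output"))
model.add(LSTM(30, activation="relu", return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
model.compile(optimizer="adam", loss="mae")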

How to fix: ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2

I have a question concerning time series data. My training dataset has the shape (3183, 1, 6).
My model:
model = Sequential()
model.add(LSTM(100, input_shape = (training_input_data.shape[1], training_input_data.shape[2])))
model.add(Dropout(0.2))
model.add(LSTM(100, input_shape = (training_input_data.shape[1], training_input_data.shape[2])))
model.add(Dense(1))
model.compile(optimizer = 'adam', loss='mse')
I get the following error at the second LSTM layer:
ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2

But there is no ndim parameter.
The problem is that the first LSTM layer is returning something with shape (batch_size, 100). If you want to feed it into a 2nd LSTM layer, you should add the option return_sequences=True to the first LSTM layer (which will then return an object of shape (batch_size, training_input_data.shape[1], 100)).
Note that passing input_shape = (..) in the 2nd LSTM is not mandatory, as the input shape of this layer is automatically computed based on the output shape of the first one.
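A minimal sketch of the fix described in this answer (assuming the same training_input_data as in the question): return_sequences=True on the first LSTM, and no input_shape on the second, since Keras infers it.

model = Sequential()
model.add(LSTM(100, input_shape=(training_input_data.shape[1], training_input_data.shape[2]), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(100))  # input shape is inferred from the previous layer's output
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')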
You need to set the parameter return_sequences=True to stack LSTM layers.
model = Sequential()
model.add(LSTM(
    100,
    input_shape=(training_input_data.shape[1], training_input_data.shape[2]),
    return_sequences=True
))
model.add(Dropout(0.2))
model.add(LSTM(100, input_shape=(training_input_data.shape[1], training_input_data.shape[2])))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
See also How to stack multiple lstm in keras?

Unexpected output from Embedding layer

I've been trying to implement an LSTM in Keras for several hours (using a sequential model with an embedding layer, two LSTM layers, and a dense layer), but I wind up getting different error messages.
From what I can tell, the problem is that the output of the embedding layer has two dimensions instead of three, because I get this value error (ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2) when adding the second LSTM layer, and I get the error assert len(input_shape) >= 3 AssertionError when I delete the line that adds the second LSTM layer (which means the Dense layer has the same issue).
These errors occur before I call the model's "train" method.
My code is below:
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.layers import Embedding
from keras.layers import TimeDistributed
from keras.preprocessing import text
from keras.preprocessing.sequence import pad_sequences
# The data in X was preprocessed using Keras' built-in pad_sequences.
# Before preprocessing, it consisted of plain lists of integers (which
# were just integers with a one-to-one map to plain words as strings)
X = pad_sequences(X)
model = Sequential()
model.add(Embedding(batch_size=32, input_dim=len(filtered_vocabulary)+1, output_dim=256, input_length=38))
model.add(LSTM(128))
model.add(LSTM(128)) # error occurs in this line
model.add(TimeDistributed(Dense(len(filtered_vocabulary)+1, activation="softmax")))
model.compile(optimizer = "rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, X, epochs=60, batch_size=32)
I'd be glad if any of you could help me out with this.
The error occurs because the first LSTM layer only returns its last output, since you haven't specified return_sequences=True in the LSTM layers. This looks like a setup that needs an output at every time step, so you would need to return the LSTM output at each time step by using that argument for both layers.
Just to be clear: without return_sequences=True the output shape is (None, 128); with it, the shape is (None, 38, 128).
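Under that reading, a minimal sketch of the corrected model (reusing filtered_vocabulary and the padded length of 38 from the question) would pass return_sequences=True to both LSTM layers so that TimeDistributed(Dense) receives a 3-D tensor:

model = Sequential()
model.add(Embedding(input_dim=len(filtered_vocabulary)+1, output_dim=256, input_length=38))
model.add(LSTM(128, return_sequences=True))   # keep the time axis for the next LSTM
model.add(LSTM(128, return_sequences=True))   # keep the time axis for TimeDistributed(Dense)
model.add(TimeDistributed(Dense(len(filtered_vocabulary)+1, activation="softmax")))
model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])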

Popping upper layers in keras

Suppose I have the following pretrained model:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(3, activation='relu', input_dim=5))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
When I run it through the following data (X), I get the shape as expected:
import numpy as np
X = np.random.rand(20, 5)
model.predict(X).shape
giving the shape (20,1)
However, for transfer learning purposes I wish to pop the top layer and run it through the same data.
model.layers.pop()
model.summary()
>>>
Layer (type)                 Output Shape              Param #
=================================================================
dense_3 (Dense)              (None, 3)                 18
=================================================================
Total params: 18
Trainable params: 18
Non-trainable params: 0
Looking at model.summary() after model.layers.pop(), the top layer seems to have been popped off. However, running model.predict(X).shape still results in a (20, 1) shape and not (20, 3) as expected.
Question: How am I supposed to correctly pop off the last few layers? This is an artificial example; in my case I need to delete the last 3 layers.
Found the answer here: https://github.com/keras-team/keras/issues/8909
The following is the answer that is needed. Unfortunately a second model had to be created, and for some reason @Eric's answer, as suggested in the other GitHub issue, doesn't seem to work anymore.
from keras.models import Model

model.layers.pop()
model2 = Model(model.input, model.layers[-1].output)  # rebuild a model that ends at the new last layer
model2.predict(X).shape  # now gives (20, 3)
Simply, you can reconstruct the Sequential model:

import keras
from keras.models import Sequential
from keras.layers import Dense

loaded_model = keras.models.load_model(fname)
# remove the last 2 layers
sliced_loaded_model = Sequential(loaded_model.layers[:-2])
# set trainable=False for the layers taken from loaded_model
for layer in sliced_loaded_model.layers:
    layer.trainable = False
# add new layers
sliced_loaded_model.add(Dense(32, activation='relu'))  # trainable=True is the default
sliced_loaded_model.add(Dense(1))
# compile
sliced_loaded_model.compile(loss='mse', optimizer='adam', metrics=[])
# fit
...
