Merging LSTM and 1D CNN layers - python

I'm trying to test a Non-Intrusive Load Monitoring algorithm proposed by Kelly (2015). The proposed architecture is:
Input (length determined by appliance duration)
1D conv (filter size=4, stride=1, number of filters=16, activation function=linear, border mode=same)
bidirectional LSTM (N=128, with peepholes)
bidirectional LSTM (N=256, with peepholes)
Fully connected (N=128, activation function=TanH)
Fully connected (N=1, activation function=linear)
Now I'm going to test this architecture in TensorFlow. I have an initial training set as follows:
x_train_temp = (243127,)
y_train_temp = (243127,)
Then I converted these data into windows as below, where each window is a 250 × 1 array.
x_train = (972, 250, 1)
y_train = (972, 250, 1)
When I implement my model, it gives an error. Could you please help me understand where the error is?
Model
input_shape=x_train.shape[1:]
model = Sequential()
model.add(tf.keras.layers.Conv1D(16,4, strides=1,activation='linear',input_shape=(x_train.shape[1:])))
model.add(tf.keras.layers.LSTM(128))
model.add(tf.keras.layers.LSTM(256))
model.add(tf.keras.layers.LSTM(128, activation='relu'))
model.add(tf.keras.layers.LSTM(128, activation='linear'))
print(model.summary())
Error
ValueError Traceback (most recent call last)
<ipython-input-47-4f5e2441909e> in <module>()
4 model.add(tf.keras.layers.Conv1D(16,4, strides=1,activation='linear',input_shape=(x_train.shape[1:])))
5 model.add(tf.keras.layers.LSTM(128))
----> 6 model.add(tf.keras.layers.LSTM(256))
7 model.add(tf.keras.layers.LSTM(128, activation='relu'))
8 model.add(tf.keras.layers.LSTM(128, activation='linear'))
5 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
178 'expected ndim=' + str(spec.ndim) + ', found ndim=' +
179 str(ndim) + '. Full shape received: ' +
--> 180 str(x.shape.as_list()))
181 if spec.max_ndim is not None:
182 ndim = x.shape.ndims
ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 128]

When stacking LSTM layers, set return_sequences=True on every LSTM that feeds another LSTM, as mentioned by @Ather Cheema. LSTM expects a 3D input tensor with shape [batch, timesteps, feature], but by default an LSTM layer returns only its last output (a 2D tensor), which is why the next LSTM fails.
return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False.
Working sample code snippet
Without return_sequences
import tensorflow as tf
inputs = tf.random.normal([32, 10, 8])
lstm = tf.keras.layers.LSTM(4)
output = lstm(inputs)
print(output.shape)
Output
(32, 4)
With return_sequences=True
import tensorflow as tf
inputs = tf.random.normal([32, 10, 8])
lstm = tf.keras.layers.LSTM(4, return_sequences=True, return_state=True)
whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)
print(whole_seq_output.shape)
Output
(32, 10, 4)
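Applying this to the question's architecture, a minimal sketch of the stacked model (an approximation: it assumes the 250 × 1 windows described above, keeps the full sequence with return_sequences=True so the output matches the per-timestep y_train, and uses plain Bidirectional LSTMs since the standard Keras LSTM has no peephole connections):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, 4, strides=1, activation='linear',
                           padding='same', input_shape=(250, 1)),
    # return_sequences=True keeps the 3D (batch, timesteps, features) output
    # that the next recurrent layer needs
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(256, return_sequences=True)),
    tf.keras.layers.Dense(128, activation='tanh'),
    tf.keras.layers.Dense(1, activation='linear'),
])
model.summary()

With these shapes the final layer produces (None, 250, 1), which lines up with y_train of shape (972, 250, 1).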


Tensorflow Keras ValueError on input shape

I am doing a simple Conv1D using TensorFlow Keras to try out a time-series dataset.
Data:
train_df = dff[:177] #get train data
tdf = train_df.shape #get shape = (177,4)
test = tf.convert_to_tensor(train_df)
Model:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(filters=32,
                           kernel_size=1,
                           strides=1,
                           padding="causal",
                           activation="relu",
                           input_shape=tdf),
    tf.keras.layers.MaxPooling1D(pool_size=2, strides=1, padding="valid")
])
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(5e-4,
                                                             decay_steps=1000000,
                                                             decay_rate=0.98,
                                                             staircase=False)
model.compile(loss=tf.keras.losses.MeanSquaredError(),
              optimizer=tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.8),
              metrics=['mae'])
model.summary()
Summary:
Model: "sequential_13"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_16 (Conv1D) (None, 177, 32) 160
_________________________________________________________________
max_pooling1d_8 (MaxPooling1 (None, 176, 32) 0
=================================================================
Total params: 160
Trainable params: 160
Non-trainable params: 0
Fit:
trainedModel = model.fit(test,
                         epochs=100,
                         steps_per_epoch=1,
                         verbose=1)
Error raised at fit:
ValueError: Input 0 of layer sequential_13 is incompatible with the layer: : expected min_ndim=3, found ndim=2. Full shape received: (2, 1)
From various SO answers, this appears to be due to the input data shape, so I tried a recommendation from another SO post to reshape my data and feed it in again.
Reshape:
X_train=np.reshape(test,(test.shape[0], test.shape[1],1))
Error raised at fit after reshaping:
ValueError: Input 0 of layer sequential_14 is incompatible with the layer: expected axis -1 of input shape to have value 4 but received input with shape (177, 4, 1)
I am at a loss here. What is the way to tackle this?
Current parameter values:
tdf = (177, 4)
My assumption: "You have 4 features for 177 training samples."
The reason for the current error: the model assumes that each sample has shape (177, 4), but when you try to pass your data to the model you get
ValueError: Input 0 of layer sequential_13 is incompatible with the layer: :
expected min_ndim=3, found ndim=2. Full shape received: (2, 1)
This error says that the model expected a 3D input, i.e. it also wants a batch dimension: say a batch of 16 images of height=177 and width=4. (You don't have images, but the model expects this because of how you specified your input shape.) That means the input should have shape (batch_size, 177, 4).
You could try passing a batch_size parameter to model.fit, as below (without reshaping the data):
trainedModel = model.fit(data,
                         epochs=100,
                         steps_per_epoch=1,
                         batch_size=16,
                         verbose=1)
But this will give another error, shown below:
ValueError: Input 0 of layer sequential_1 is incompatible with the layer: :
expected min_ndim=3, found ndim=2. Full shape received: (None, 4)
Now this error means that the input passed to the model had some batch size (represented by None) and a feature vector of length 4, but the model expects the input to have shape (batch_size, height, width). The batch_size is expected by every model, while the other two dimensions are specified by us when defining the input shape. We defined this here:
tf.keras.layers.Conv1D(filters=32,
                       kernel_size=1,
                       strides=1,
                       padding="causal",
                       activation="relu",
                       input_shape=tdf),  # Here we set input_shape = (177, 4)
As you can see, the input_shape has been defined as height=177, width=4. (I use height and width for easier explanation; strictly there is no height/width, only dimension sizes.) But we wanted the model to take an input of 4 features, so we have to change this as below:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(filters=32,
                           kernel_size=1,
                           strides=1,
                           padding="causal",
                           activation="relu",
                           input_shape=(4,)),
    tf.keras.layers.MaxPooling1D(pool_size=2, strides=1, padding="valid")
])
But now when you try to run this you will get another error as below:
ValueError: Input 0 of layer conv1d_10 is incompatible with the layer: :
expected min_ndim=3, found ndim=2. Full shape received: (None, 4)
The thing to note is that the error now arises from the Conv1D layer: it expects a 3D input including the batch size, but we never specify the batch size when creating the model, so the input_shape parameter should look like input_shape = (dim1, dim2), whereas we only have 4 features and hence only dim1 and no dim2. In this case we reshape our input so the 4 features become (4, 1), giving dim1 = 4 and dim2 = 1, and update the model as below:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(filters=32,
                           kernel_size=1,
                           strides=1,
                           padding="causal",
                           activation="relu",
                           input_shape=(4, 1)),
    tf.keras.layers.MaxPooling1D(pool_size=2, strides=1, padding="valid")
])
Now we also reshape our inputs to have shape (177,4, 1) as below:
train_df = dff[:177]
train_df = train_df.values.reshape(177, 4, 1)
test = tf.convert_to_tensor(train_df)
Now we can use it to pass into the model.
trainedModel = model.fit(test,
                         epochs=100,
                         steps_per_epoch=1,
                         verbose=1)
Sadly this will give yet another error like below.
ValueError: No gradients provided for any variable: ['conv1d_14/kernel:0',
'conv1d_14/bias:0'].
This is because the model doesn't get any Y corresponding to your input X, so it can't compute gradients from the loss function and therefore can't train. But it can still be used to get outputs, as below:
preds = model(test)
preds.shape # Result -> TensorShape([177, 3, 32])
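If training is the goal, the fix is to pass targets along with the inputs. A minimal sketch, assuming made-up placeholder data and a hypothetical Flatten + Dense(1) head added so there is one output per sample for the loss to compare against:

import numpy as np
import tensorflow as tf

# Hypothetical inputs/targets: 177 samples, 4 timesteps, 1 channel,
# and one regression target per sample (placeholder values).
X = np.random.rand(177, 4, 1).astype("float32")
y = np.random.rand(177, 1).astype("float32")

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(filters=32, kernel_size=1, strides=1,
                           padding="causal", activation="relu",
                           input_shape=(4, 1)),
    tf.keras.layers.MaxPooling1D(pool_size=2, strides=1, padding="valid"),
    tf.keras.layers.Flatten(),   # collapse to one vector per sample
    tf.keras.layers.Dense(1)     # one output so it can match y
])

model.compile(loss="mse", optimizer="adam")
model.fit(X, y, epochs=2, batch_size=16, verbose=1)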

How to solve "Input 0 of layer conv1d is incompatible with the layer:" error?

Please can you help me understand the error I am getting from the model I am trying to build?
I have a training, validation and test set. The training data has the following shape:
input_shape = train.shape[1:] #(1500,)
I have written the following model using Keras:
input = Input(shape=(input_shape))
# Conv1D + global max pooling
x = layers.Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(input)
x = layers.Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(x)
x = layers.GlobalMaxPooling1D()(x)
x = layers.Dense(128, activation="relu")(x)
x = layers.Dropout(0.5)(x)
predictions = layers.Dense(1,kernel_initializer='normal', name="predictions")(x)
model = tf.keras.Model(input, predictions)
model.compile(loss="mean squared error", optimizer="adam", metrics=[concordance_index])
I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-60-59c3578104d3> in <module>()
6
7 # Conv1D + global max pooling
----> 8 x = layers.Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(protein_input)
9 x = layers.Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(x)
10 x = layers.GlobalMaxPooling1D()(x)
5 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
230 ', found ndim=' + str(ndim) +
231 '. Full shape received: ' +
--> 232 str(tuple(shape)))
233 # Check dtype.
234 if spec.dtype is not None:
ValueError: Input 0 of layer conv1d_49 is incompatible with the layer: : expected min_ndim=3, found ndim=2. Full shape received: (None, 1500)
Is my Input layer incorrect? Or is it due to the ordering of Conv1d, MaxPooling layers?
Since a Conv1D layer expects input as batch_shape + (steps, input_dim), you need to add a new dimension. So:
X = tf.expand_dims(X,axis=2)
print(X.shape) # X.shape=(Samples, 1500, 1)
Then X has shape (Samples, 1500, 1).
Now, let's specify the input shape as:
input = Input(shape=X.shape[1:])
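Putting the pieces together, a minimal sketch with hypothetical random data in place of the real training set (note that the compile call should also use the identifier "mean_squared_error" rather than "mean squared error"; the custom concordance_index metric is omitted here):

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Input

X = np.random.rand(8, 1500).astype("float32")  # hypothetical: 8 samples, 1500 features
X = tf.expand_dims(X, axis=2)                  # -> (8, 1500, 1)

inp = Input(shape=(1500, 1))                   # i.e. X.shape[1:]
x = layers.Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(inp)
x = layers.Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(x)
x = layers.GlobalMaxPooling1D()(x)
x = layers.Dense(128, activation="relu")(x)
x = layers.Dropout(0.5)(x)
predictions = layers.Dense(1, kernel_initializer='normal', name="predictions")(x)

model = tf.keras.Model(inp, predictions)
model.compile(loss="mean_squared_error", optimizer="adam")
model.summary()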

ValueError: Input 0 of layer sequential is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 7]

I'm building an RNN using LSTM layers.
The X matrix has shape (1824, 7), while Y has shape (1824, 1).
This is my model:
num_units = 64
learning_rate = 0.0001
activation_function = 'sigmoid'
adam = Adam(lr=learning_rate)
loss_function = 'mse'
batch_size = 5
num_epochs = 50
# Initialize the RNN
model = Sequential()
model.add(LSTM(units = num_units, activation=activation_function, input_shape=(1824, 7, )))
model.add(LeakyReLU(alpha=0.5))
model.add(Dropout(0.1))
model.add(Dense(units = 1))
# Compiling the RNN
model.compile(optimizer=adam, loss=loss_function, metrics=['accuracy'])
history = model.fit(
    X,
    y,
    validation_split=0.1,
    batch_size=batch_size,
    epochs=num_epochs,
    shuffle=False
)
I know the error is in the input_shape parameter. When I try to fit the model I get this error:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 7]
I have seen similar questions and tried to apply some of those changes, such as:
input_dim = X.shape
input_dim=(7,)
input_dim=(1824, 7, 1)
But in every case I got this kind of error. How can I fix it?
As commented by @Nicolas Gervais,
TensorFlow Keras LSTM expects its inputs as a 3D tensor with shape [batch, timesteps, feature].
Working sample code
import tensorflow as tf
inputs = tf.random.normal([32, 10, 8])
print(inputs.shape)
lstm = tf.keras.layers.LSTM(4)
output = lstm(inputs)
print(output.shape)
Output
(32, 10, 8)
(32, 4)
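For the question's (1824, 7) matrix, one option (an assumption about the data: treating the 7 columns as 7 timesteps of a single feature each) is to add a trailing feature axis and set input_shape=(7, 1):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, LeakyReLU, Dropout, Dense

# Hypothetical stand-ins for the question's X (1824, 7) and y (1824, 1)
X = np.random.rand(1824, 7).astype("float32")
y = np.random.rand(1824, 1).astype("float32")

X = X.reshape((X.shape[0], X.shape[1], 1))  # -> (1824, 7, 1): [batch, timesteps, feature]

model = Sequential([
    LSTM(units=64, activation='sigmoid', input_shape=(7, 1)),  # no batch dim in input_shape
    LeakyReLU(alpha=0.5),
    Dropout(0.1),
    Dense(units=1),
])
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, validation_split=0.1, batch_size=5, epochs=1, shuffle=False)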

Using pre-trained BERT embeddings as input to CNN with tensorflow.keras results in ValueError

I am brand new to NLP and deep learning, so I have what is (likely) a very basic problem.
I am trying to create a binary classifier based on pre-trained BERT embeddings as the features. So far I have successfully created the embeddings, and have built a simple Sequential() model with tensorflow.keras. The code below works:
model = tf.keras.Sequential([
    Dense(4, activation = 'relu', input_shape = (768,)),
    Dense(4, activation = 'relu'),
    Dense(1, activation = 'sigmoid')])
model.compile(optimizer = 'adam',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])
What I would like to do is adapt this code to now be a CNN. However, when I add a convolutional layer, I get an error:
model = tf.keras.Sequential([
    Conv1D(filters = 250, kernel_size = 3, padding='valid', activation='relu', strides=1, input_shape = (768,)),
    GlobalMaxPooling1D(),
    Dense(4, activation = 'relu'),
    Dense(1, activation = 'sigmoid')])
model.compile(optimizer = 'adam',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-23-59695050a94e> in <module>()
3 GlobalMaxPooling1D(),
4 Dense(4, activation = 'relu'),
----> 5 Dense(1, activation = 'sigmoid')])
6
7 model.compile(optimizer = 'adam',
5 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
178 'expected ndim=' + str(spec.ndim) + ', found ndim=' +
179 str(ndim) + '. Full shape received: ' +
--> 180 str(x.shape.as_list()))
181 if spec.max_ndim is not None:
182 ndim = x.shape.ndims
ValueError: Input 0 of layer conv1d_2 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 768]
Here is what the data I am using look like.
Features:
train_features[0]
array([-4.97862399e-01, 1.49541467e-01, 5.81708886e-02, 1.63668215e-01,
-2.77605206e-01, 3.57868642e-01, 1.70950562e-01, 2.69330859e-01,
-3.29369396e-01, 2.12891083e-02, -4.02462274e-01, -1.98120754e-02,
-2.18944401e-01, 4.34780568e-01, -2.75409579e-01, 2.03015730e-01,...
train_features[0].shape
(768,)
Labels:
train_labels.iloc[0:3]
turnout
0 73446 0
1 53640 1
16895 1
Name: turnout, dtype: int64
Any advice is much appreciated. Thank you so much!
1D convolutions need 3D inputs: (batch_size, length, channels). 2D convolutions need 4D inputs: (batch_size, width1, width2, channels).
Your data is a single array with shape (batch_size, 768), so it has neither a length nor a channel axis yet. If you really want to use a convolution (i.e. if you think there might be a spatial relation in your data), you need to reshape it properly before feeding it to your model.
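A minimal sketch for the Conv1D case, assuming the 768 embedding dimensions are treated as the length axis with a single channel (whether that spatial interpretation is meaningful for BERT vectors is a separate modelling question; hypothetical arrays stand in for train_features and train_labels):

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Conv1D, GlobalMaxPooling1D, Dense

# Hypothetical features/labels in place of the real BERT embeddings
train_features = np.random.rand(100, 768).astype("float32")
train_labels = np.random.randint(0, 2, size=(100,))

# Add a channel axis: (batch, 768) -> (batch, 768, 1)
train_features = train_features[..., np.newaxis]

model = tf.keras.Sequential([
    Conv1D(filters=250, kernel_size=3, padding='valid', activation='relu',
           strides=1, input_shape=(768, 1)),
    GlobalMaxPooling1D(),
    Dense(4, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_features, train_labels, epochs=1, batch_size=32)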

How to ensure arrays have proper dimensions for softmax layer in tensorflow python

I am building my first neural net from scratch with tensorflow and am getting stuck. I am trying to solve a multiclass text sequence classification problem (3 classes). I have been following tf tutorials, but I am not sure what is going wrong.
My input data is as follows:
samples = list of strings of text sequences ranging from 1 to 295 words long
labels = list of 295 integers (one per sample) that are either 0, 1 or 2.
My code:
import tensorflow as tf
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.layers import Embedding
from keras import preprocessing
from keras.models import Sequential
from keras.layers import Flatten, Dense
from keras.utils import to_categorical
import numpy as np
# tokenize and vectorize text data to prepare for embedding
tokenizer = Tokenizer()
tokenizer.fit_on_texts(samples)
sequences = tokenizer.texts_to_sequences(samples)
word_index = tokenizer.word_index
print(f'Found {len(word_index)} unique tokens.')
# one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')
# setting variables
num_samples = len(word_index) # 1499
# Input_dim: This is the size of the vocabulary in the text data.
input_dim = num_samples # 1499
# output_dim: This is the size of the vector space in which words will be embedded.
output_dim = 32 # recommended by tutorial
# max_sequence_length: This is the length of input sequences
max_sequence_length = len(max(sequences, key=len)) # 295
# train/test index splice variable
training_samples = round(len(samples)*.8)
data = pad_sequences(sequences, maxlen=max_sequence_length)
# preprocess labels
labels = np.asarray(labels)
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices] # shape (499, 295)
labels = to_categorical(labels, num_classes=None, dtype='float32')
# Create test/train data (80% train, 20% test)
x_train = data[:training_samples]
y_train = labels[:training_samples]
x_test = data[training_samples:]
y_test = labels[training_samples:]
# creating embedding layer with shape [amount of unique words in samples, length of longest sample in samples]
# embedding_layer = Embedding(len(word_index)+1, len(max(samples, key=len)))
model = Sequential()
model.add(Embedding(input_dim, output_dim, input_length=max_sequence_length))
model.add(Dense(32, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.summary()
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train,
          y_train,
          epochs=10,
          batch_size=32,
          validation_data=(x_test, y_test))
Essentially, I am just trying to feed in my text data as tokens into an embedding layer, and dense layer, and then the 3 node softmax layer for classification. When I run this, I get the following error:
Found 1499 unique tokens.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:66: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4432: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 295, 32) 47968
_________________________________________________________________
dense_1 (Dense) (None, 295, 32) 1056
_________________________________________________________________
dense_2 (Dense) (None, 295, 3) 99
=================================================================
Total params: 49,123
Trainable params: 49,123
Non-trainable params: 0
_________________________________________________________________
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:793: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3576: The name tf.log is deprecated. Please use tf.math.log instead.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-8-94cb754412aa> in <module>()
49 epochs=10,
50 batch_size=32,
---> 51 validation_data=(x_test, y_test))
2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
129 ': expected ' + names[i] + ' to have ' +
130 str(len(shape)) + ' dimensions, but got array '
--> 131 'with shape ' + str(data_shape))
132 if not check_batch_axis:
133 data_shape = data_shape[1:]
ValueError: Error when checking target: expected dense_2 to have 3 dimensions, but got array with shape (399, 3)
I need more information to solve the problem you're facing. But I can tell you what's wrong.
You have a target set of size [399, 3]. However, your model summary looks like follows.
Model: "sequential_9"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_5 (Embedding) (None, 295, 50) 5000
_________________________________________________________________
dense_12 (Dense) (None, 295, 32) 1632
_________________________________________________________________
dense_13 (Dense) (None, 295, 3) 99
=================================================================
You see that even though you have a list of labels with 2 dimensions, your model expects a set of targets with 3 dimensions.
Now solving that, well... that's a different story. Let's see our options.
Option 1: you have exactly 295 words in each sample, and 399 such samples.
If you are 100% certain that each sequence in your data is 295 elements long, you can solve this problem by simply adding a Flatten() layer. However, keep in mind that this will increase the number of parameters in your model in proportion to:
the length of your sequence, and
the embedding size (a rough worked count follows the code below).
model = Sequential()
model.add(Embedding(input_dim, output_dim, input_length=max_sequence_length, input_shape=(295,)))
#model.add(lambda x: tf.reduce_mean(x, axis=1))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.summary()
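For a rough sense of that growth, using the numbers above: the Embedding output is (295, 32), so Flatten() produces 295 × 32 = 9,440 features, and the following Dense(32) layer then has 9,440 × 32 + 32 = 302,112 parameters, versus only 32 × 32 + 32 = 1,056 if the sequence is averaged away first as in the next option.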
Option 2: you have 399 samples, but each sample can have anywhere between 1 and 295 items.
Now your input can have varying lengths, so you need to define the input_shape argument of the first layer accordingly.
The second problem is that your Embedding layer produces variable-size outputs (depending on the sequence length), but a Dense layer cannot take variable-size inputs. Therefore you need some "squashing" transformation in between; one that usually works well with embeddings is taking the mean over the time axis. You can do this the following way.
from keras.layers import Lambda  # needed for the mean-pooling step below

model = Sequential()
model.add(Embedding(input_dim, output_dim, input_length=max_sequence_length, input_shape=(None,)))
model.add(Lambda(lambda x: tf.reduce_mean(x, axis=1)))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.summary()
You have this Dense layer in your model, which is trained to generate the classes:
model.add(Dense(3, activation='softmax'))
So it produces a 3-element output vector (one value per class). However, your dataset has plain integer labels with 3 different values (0, 1, 2). You can convert these classes to categorical (one-hot) vectors, for example:
0 --- (1,0,0)
1 --- (0,1,0)
2 --- (0,0,1)
If you want to automate this process, you can use Keras to_categorical function:
keras.utils.to_categorical(y, num_classes=None, dtype='float32')
For more information about to_categorical, see the Keras documentation.
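A small usage sketch of that conversion (with made-up label values):

import numpy as np
from keras.utils import to_categorical

y = np.array([0, 2, 1, 0])              # hypothetical integer labels, 3 classes
print(to_categorical(y, num_classes=3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [1. 0. 0.]]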
