Keras: How to concatenate input layer with output layer? - python

I am trying to replicate the network from:
https://arxiv.org/pdf/1604.07176.pdf
I see their implementation in
https://github.com/wentaozhu/protein-cascade-cnn-lstm/blob/master/cb6133.py
I am trying to train on the Q8 task only, with no solvent task. I am trying to concatenate the [512, 50] embedding layer to the [512, 22] output layer, but I keep getting various errors. This is how I am trying to concatenate at the moment:
main_input = Input(shape=(maxlen_seq,), dtype='int32', name='main_input')
# Defining an embedding layer mapping from the words (n_words) to a vector of len 50
# input_orig = K.reshape(input, (maxlen_seq, n_words))
x = Embedding(input_dim=n_words, output_dim=50, input_length=maxlen_seq)(main_input)
aux_input = Input(shape=(maxlen_seq, n_words), name='aux_input')
x = concatenate([x, aux_input], axis=-1)
# ... rest of model ...
The model compiles fine with model.compile():
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
main_input (InputLayer) (None, 512) 0
__________________________________________________________________________________________________
embedding_27 (Embedding) (None, 512, 50) 1100 main_input[0][0]
__________________________________________________________________________________________________
aux_input (InputLayer) (None, 512, 22) 0
__________________________________________________________________________________________________
concatenate_54 (Concatenate) (None, 512, 72) 0 embedding_27[0][0]
aux_input[0][0]
________________________________________________________________________________________
...
But I get a ValueError: Error when checking input: expected aux_input to have 3 dimensions, but got array with shape (4464, 512)
The model is defined as:
model = Model([main_input, aux_input], [y1, y2])
And fit as:
model.fit({'main_input':X_train,
'aux_input': X_train},
{'main_output': y_train,
'aux_output': y_train},
batch_size=128, epochs=20, callbacks=[early, best_model],
validation_data=({'main_input':X_val,
'aux_input': X_val},
{'main_output': y_val,
'aux_output': y_val}),
verbose=1)
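The error message points at the data rather than the model: aux_input is declared with shape (maxlen_seq, n_words), so it expects batches of shape (samples, 512, 22), while X_train has shape (4464, 512). One possible fix, assuming X_train holds integer indices in the range [0, n_words), is to one-hot encode it before feeding it to aux_input. The sketch below uses plain NumPy; `one_hot_encode` and the toy array are hypothetical names used for illustration only.

```python
import numpy as np

def one_hot_encode(int_sequences, n_classes):
    """Turn integer indices of shape (samples, steps) into one-hot
    vectors of shape (samples, steps, n_classes)."""
    int_sequences = np.asarray(int_sequences)
    # Row i of the identity matrix is the one-hot vector for index i.
    return np.eye(n_classes, dtype="float32")[int_sequences]

# Toy stand-in for X_train (assumed: integer-encoded, 0 <= index < n_words).
n_words, maxlen_seq = 22, 512
X_train_demo = np.random.randint(0, n_words, size=(8, maxlen_seq))

X_train_aux = one_hot_encode(X_train_demo, n_words)
print(X_train_demo.shape)  # (8, 512)      -> feeds main_input
print(X_train_aux.shape)   # (8, 512, 22)  -> feeds aux_input
```

With such an array, the fit call would pass the one-hot version under 'aux_input' and keep the integer version under 'main_input'. The `to_categorical` utility in Keras should produce an equivalent array.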

Related

How to fix ValueError: Input 0 is incompatible with layer CNN: expected shape=(None, 35), found shape=(None, 31)

I am using a convolutional neural network (Keras, Conv1D) for a multi-class text classification task. When I run the model below, I get the error shown further down. I have spent time trying to understand the error, but I don't know how to fix it. Can anyone help me, please?
The shapes of the training and evaluation sets are as follows:
df_train shape: (7198,)
df_val shape: (1800,)
np.random.seed(42)
# You need to reshape your input data according to the Conv1D layer input format - (batch_size, steps, input_dim). Try
# set parameters of matrices and convolution
embedding_dim = 300
nb_filter = 64
filter_length = 5
hidden_dims = 32
stride_length = 1
from keras import regularizers
from keras.layers import Input, Embedding, Conv1D, MaxPool1D, Flatten, concatenate
embedding_layer = Embedding(len(tokenizer.word_index) + 1,
embedding_dim,
input_length=35,
name="Embedding")
inp = Input(shape=(35,), dtype='int32')
embeddings = embedding_layer(inp)
conv1 = Conv1D(filters=32, # Number of filters to use
kernel_size=filter_length, # n-gram range of each filter.
padding='same', #valid: don't go off edge; same: use padding before applying filter
activation='relu',
name="CONV1",
kernel_regularizer=regularizers.l2(l=0.0367))(embeddings)
conv2 = Conv1D(filters=32, # Number of filters to use
kernel_size=filter_length, # n-gram range of each filter.
padding='same', #valid: don't go off edge; same: use padding before applying filter
activation='relu',
name="CONV2",kernel_regularizer=regularizers.l2(l=0.02))(embeddings)
conv3 = Conv1D(filters=32, # Number of filters to use
kernel_size=filter_length, # n-gram range of each filter.
padding='same', #valid: don't go off edge; same: use padding before applying filter
activation='relu',
name="CONV3",kernel_regularizer=regularizers.l2(l=0.01))(embeddings)
max1 = MaxPool1D(10, strides=1,name="MaxPool1D1")(conv1)
max2 = MaxPool1D(10, strides=1,name="MaxPool1D2")(conv2)
max3 = MaxPool1D(10, strides=1,name="MaxPool1D3")(conv3)
conc = concatenate([max1, max2,max3])
flat = Flatten(name="FLATTEN")(max1)
....
The error is as follows:
ValueError: Input 0 is incompatible with layer CNN: expected shape=(None, 35), found shape=(None, 31)
The model:
Model: "CNN"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_19 (InputLayer) [(None, 35)] 0
_________________________________________________________________
Embedding (Embedding) (None, 35, 300) 4094700
_________________________________________________________________
CONV1 (Conv1D) (None, 35, 32) 48032
_________________________________________________________________
MaxPool1D1 (MaxPooling1D) (None, 26, 32) 0
_________________________________________________________________
FLATTEN (Flatten) (None, 832) 0
_________________________________________________________________
Dropout (Dropout) (None, 832) 0
_________________________________________________________________
Dense (Dense) (None, 3) 2499
=================================================================
Total params: 4,145,231
Trainable params: 4,145,231
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
This error occurs when the shape of the network's input layer does not match the shape of the dataset. If you are receiving an error like this, try one of the following:
Set the network input shape to (None, 31) so that it matches the dataset's shape.
Make sure that the dataset's shape is equal to (num_of_examples, 35). (Preferable)
If all of this information is correct and there is no problem with the dataset, the error may come from the network itself, where the shapes of two adjacent layers don't match.
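As a rough illustration of the second option, the sketch below pads or truncates tokenized texts so that every example has exactly 35 entries. It is a NumPy-only sketch with made-up token ids; `pad_to_length` is a hypothetical helper, not part of the question's code.

```python
import numpy as np

def pad_to_length(sequences, length, pad_value=0):
    """Pad (or truncate) each integer sequence at the end so that the
    result has shape (num_examples, length)."""
    out = np.full((len(sequences), length), pad_value, dtype="int32")
    for i, seq in enumerate(sequences):
        trimmed = list(seq)[:length]
        out[i, :len(trimmed)] = trimmed
    return out

# Hypothetical tokenized texts of varying length (3, 31 and 40 tokens).
token_ids = [[5, 9, 2], list(range(1, 32)), list(range(1, 41))]
X = pad_to_length(token_ids, length=35)
print(X.shape)  # (3, 35), which matches Input(shape=(35,))
```

Keras ships a `pad_sequences` utility with a `maxlen` argument for the same purpose. Note that its default padding side differs from this helper. Whichever is used, the training and validation sets need the same length.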

ValueError: Shape mismatch: The shape of labels (received (1,)) should equal the shape of logits except for the last dimension (received (10, 30))

I'm fairly new to TensorFlow and would appreciate answers a lot.
I'm trying to use a transformer model as an embedding layer and feed the data to a custom model.
import tensorflow as tf
from transformers import TFAutoModel
from tensorflow.keras import layers
def build_model():
    transformer_model = TFAutoModel.from_pretrained(MODEL_NAME, config=config)
    input_ids_in = layers.Input(shape=(MAX_LEN,), name='input_ids', dtype='int32')
    input_masks_in = layers.Input(shape=(MAX_LEN,), name='attention_mask', dtype='int32')
    embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]
    X = layers.Bidirectional(tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))(embedding_layer)
    X = layers.GlobalMaxPool1D()(X)
    X = layers.Dense(64, activation='relu')(X)
    X = layers.Dropout(0.2)(X)
    X = layers.Dense(30, activation='softmax')(X)
    model = tf.keras.Model(inputs=[input_ids_in, input_masks_in], outputs = X)
    # freeze the two input layers and the transformer
    for layer in model.layers[:3]:
        layer.trainable = False
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model
model = build_model()
model.summary()
r = model.fit(
    train_ds,
    steps_per_epoch=train_steps,
    epochs=EPOCHS,
    verbose=3)
I have 30 classes and the labels are not one-hot encoded, so I'm using sparse_categorical_crossentropy as my loss function, but I keep getting the following error:
ValueError: Shape mismatch: The shape of labels (received (1,)) should equal the shape of logits except for the last dimension (received (10, 30)).
How can I solve this?
And why is the (10, 30) shape required? I know the 30 comes from the last Dense layer with 30 units, but why the 10? Is it because of MAX_LEN, which is 10?
My model summary:
Model: "model_16"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_ids (InputLayer) [(None, 10)] 0
__________________________________________________________________________________________________
attention_mask (InputLayer) [(None, 10)] 0
__________________________________________________________________________________________________
tf_bert_model_21 (TFBertModel) TFBaseModelOutputWit 162841344 input_ids[0][0]
attention_mask[0][0]
__________________________________________________________________________________________________
bidirectional_17 (Bidirectional (None, 10, 100) 327600 tf_bert_model_21[0][0]
__________________________________________________________________________________________________
global_max_pooling1d_15 (Global (None, 100) 0 bidirectional_17[0][0]
__________________________________________________________________________________________________
dense_32 (Dense) (None, 64) 6464 global_max_pooling1d_15[0][0]
__________________________________________________________________________________________________
dropout_867 (Dropout) (None, 64) 0 dense_32[0][0]
__________________________________________________________________________________________________
dense_33 (Dense) (None, 30) 1950 dropout_867[0][0]
==================================================================================================
Total params: 163,177,358
Trainable params: 336,014
Non-trainable params: 162,841,344
The 10 is the number of sequences in one batch. I suspect that it is also the number of sequences in your dataset.
Your model acts as a sequence classifier, so you should have one label for every sequence.
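To make the shape requirement concrete, here is a small NumPy sketch of a sparse categorical cross-entropy. It is not TensorFlow's implementation, only an illustration of the rule in the error message: predictions of shape (batch, num_classes) need integer labels of shape (batch,). The function name and the toy arrays are assumptions made for this example.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs):
    """Mean cross-entropy for integer labels of shape (batch,) and
    predicted probabilities of shape (batch, num_classes)."""
    labels = np.asarray(labels)
    probs = np.asarray(probs)
    if labels.shape != probs.shape[:-1]:
        raise ValueError(
            f"labels shape {labels.shape} must equal probs shape "
            f"except for the last dimension {probs.shape}"
        )
    # Pick, for every row, the probability assigned to the true class.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

batch_size, num_classes = 10, 30
probs = np.full((batch_size, num_classes), 1.0 / num_classes)  # uniform predictions

# One integer label per sequence: shapes (10,) and (10, 30) are compatible.
good_labels = np.random.randint(0, num_classes, size=(batch_size,))
loss = sparse_categorical_crossentropy(good_labels, probs)

# A single label for ten predictions reproduces the kind of mismatch reported.
try:
    sparse_categorical_crossentropy(np.array([3]), probs)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

If the dataset yields one label per ten rows of predictions, it is worth checking how train_ds is built and batched, so that each batch contains as many labels as sequences.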

TensorFlow input shape error at Dense output layer is contradictory to what model.summary() says

I am playing around with an NLP problem (sentence classification) and decided to use HuggingFace's TFBertModel along with Conv1D, Flatten, and Dense layers. I am using the functional API and my model compiles. However, during model.fit(), I get a shape error at the output Dense layer.
Model definition:
# Build model with a max length of 50 words in a sentence
max_len = 50
def build_model():
    bert_encoder = TFBertModel.from_pretrained(model_name)
    input_word_ids = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    input_type_ids = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_type_ids")
    # Create a conv1d model. The model may not really be useful or make sense, but that's OK (for now).
    embedding = bert_encoder([input_word_ids, input_mask, input_type_ids])[0]
    conv_layer = tf.keras.layers.Conv1D(32, 3, activation='relu')(embedding)
    dense_layer = tf.keras.layers.Dense(24, activation='relu')(conv_layer)
    flatten_layer = tf.keras.layers.Flatten()(dense_layer)
    output_layer = tf.keras.layers.Dense(3, activation='softmax')(flatten_layer)
    model = tf.keras.Model(inputs=[input_word_ids, input_mask, input_type_ids], outputs=output_layer)
    model.compile(tf.keras.optimizers.Adam(lr=1e-5), loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
# View model architecture
model = build_model()
model.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_word_ids (InputLayer) [(None, 50)] 0
__________________________________________________________________________________________________
input_mask (InputLayer) [(None, 50)] 0
__________________________________________________________________________________________________
input_type_ids (InputLayer) [(None, 50)] 0
__________________________________________________________________________________________________
tf_bert_model (TFBertModel) ((None, 50, 768), (N 177853440 input_word_ids[0][0]
input_mask[0][0]
input_type_ids[0][0]
__________________________________________________________________________________________________
conv1d (Conv1D) (None, 48, 32) 73760 tf_bert_model[0][0]
__________________________________________________________________________________________________
dense (Dense) (None, 48, 24) 792 conv1d[0][0]
__________________________________________________________________________________________________
flatten (Flatten) (None, 1152) 0 dense[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 3) 3459 flatten[0][0]
==================================================================================================
Total params: 177,931,451
Trainable params: 177,931,451
Non-trainable params: 0
__________________________________________________________________________________________________
# Fit model on input data
model.fit(train_input, train['label'].values, epochs = 3, verbose = 1, batch_size = 16,
validation_split = 0.2)
And this is the error message:
ValueError: Input 0 of layer dense_1 is incompatible with the layer: expected axis -1 of input shape to have value 1152 but received
input with shape [16, 6168]
I am unable to understand how the input shape to layer dense_1 (the output dense layer) can be 6168? As per the model summary, it should always be 1152.
The shape of your input is likely not as you expect. Check the shape of train_input.
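The number 6168 can be traced back to a sequence length with a little arithmetic. Assuming the Conv1D layer uses its default 'valid' padding and stride 1, as in the question, a sequence of length L yields L - 2 steps, each with 24 features after the Dense layer, so Flatten produces (L - 2) * 24 values. The helper names below are hypothetical and only illustrate that calculation.

```python
def flatten_size(seq_len, kernel_size=3, dense_units=24):
    """Features after Conv1D ('valid' padding, stride 1) -> Dense -> Flatten."""
    conv_steps = seq_len - kernel_size + 1
    return conv_steps * dense_units

def seq_len_from_flatten(flat_size, kernel_size=3, dense_units=24):
    """Invert flatten_size to recover the sequence length that was fed in."""
    conv_steps, remainder = divmod(flat_size, dense_units)
    if remainder:
        raise ValueError("flat_size is not a multiple of dense_units")
    return conv_steps + kernel_size - 1

print(flatten_size(50))            # 1152, as in model.summary()
print(seq_len_from_flatten(6168))  # 259, the length implied by the error
```

If this arithmetic applies, the arrays in train_input have 259 columns rather than 50, so padding or truncating to max_len during tokenization would be the place to look.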

How to feed and build a "Input->Dense->Conv2D->Dense" network in keras?

This is a simple example that reproduces my issue in a network I am trying to deploy.
I have an image input layer (which I need to maintain), then a Dense layer, Conv2D layer and a dense layer.
The idea is that the inputs are 10x10 images and the labels are 10x10 images. Inspired by my code and this example.
import numpy as np
from keras.models import Model
from keras.layers import Input, Conv2D, Dense
#Building model
size=10
a = Input(shape=(size,size,1))
hidden = Dense(size)(a)
hidden = Conv2D(kernel_size = (3,3), filters = size*size, activation='relu', padding='same')(hidden)
outputs = Dense(size, activation='sigmoid')(hidden)
model = Model(inputs=a, outputs=outputs)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
#Create random data and accounting for 1 channel of data
n_images=55
data = np.random.randint(0,2,(n_images,size,size,1))
labels = np.random.randint(0,2,(n_images,size,size,1))
#Fit model
model.fit(data, labels, verbose=1, batch_size=10, epochs=20)
print(model.summary())
I get the following error: ValueError: Error when checking target: expected dense_92 to have shape (10, 10, 10) but got array with shape (10, 10, 1)
I don't get an error if I change:
outputs = Dense(size, activation='sigmoid')(hidden)
with:
outputs = Dense(1, activation='sigmoid')(hidden)
No idea how Dense(1) is even valid and how it allows 10x10 output signal as model.summary() indicates:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_26 (InputLayer) (None, 10, 10, 1) 0
_________________________________________________________________
dense_93 (Dense) (None, 10, 10, 10) 20
_________________________________________________________________
conv2d_9 (Conv2D) (None, 10, 10, 100) 9100
_________________________________________________________________
dense_94 (Dense) (None, 10, 10, 1) 101
=================================================================
Total params: 9,221
Trainable params: 9,221
Non-trainable params: 0
_________________________________________________________________
None
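On why Dense(1) is valid here: when a Keras Dense layer receives an input with more than two dimensions, it is applied along the last axis only, so all leading axes (batch, height, width) are preserved. The NumPy sketch below imitates that behaviour with random weights. `dense_forward` is a hypothetical helper, and the arrays only stand in for the Conv2D output of the model above.

```python
import numpy as np

def dense_forward(x, kernel, bias):
    """Apply a Dense layer's affine map to the last axis of x."""
    return x @ kernel + bias

rng = np.random.default_rng(0)
x = rng.random((55, 10, 10, 100))  # stand-in for the Conv2D output of 55 images

# Dense(1): kernel (100, 1), i.e. 100 weights + 1 bias = 101 parameters
out_1 = dense_forward(x, rng.random((100, 1)), np.zeros(1))
# Dense(10): kernel (100, 10), which gives 10 values per pixel
out_10 = dense_forward(x, rng.random((100, 10)), np.zeros(10))

print(out_1.shape)   # (55, 10, 10, 1)
print(out_10.shape)  # (55, 10, 10, 10)
```

This is why Dense(size) produces a (10, 10, 10) output per image, which does not match labels of shape (10, 10, 1), while Dense(1) produces one value per pixel and matches them.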
Well, according to your comments:
what I am trying to do isn't standard. I have set of images and for
each image I want to find a binary image of the same size that if the
value of its pixel is 1 it means the feature exists in the input image
the insight whether a pixel has a feature should be taken both from
local information (extracted by a convolution layers) and global
information extracted by Dense layers.
I guess you are looking to create a two-branch model, where one branch consists of convolution layers and the other is simply one or more dense layers stacked on top of each other. (I should mention that, in my opinion, a single convolutional network may achieve what you are looking for, because the combination of pooling and convolution layers, perhaps followed by some up-sampling layers at the end, preserves both local and global information to some extent.) To define such a model, you can use the Keras functional API like this:
from keras import models
from keras import layers
input_image = layers.Input(shape=(10, 10, 1))
# branch one: dense layers
b1 = layers.Flatten()(input_image)
b1 = layers.Dense(64, activation='relu')(b1)
b1_out = layers.Dense(32, activation='relu')(b1)
# branch two: conv + pooling layers
b2 = layers.Conv2D(32, (3,3), activation='relu')(input_image)
b2 = layers.MaxPooling2D((2,2))(b2)
b2 = layers.Conv2D(64, (3,3), activation='relu')(b2)
b2_out = layers.MaxPooling2D((2,2))(b2)
# merge two branches
flattened_b2 = layers.Flatten()(b2_out)
merged = layers.concatenate([b1_out, flattened_b2])
# add a final dense layer
output = layers.Dense(10*10, activation='sigmoid')(merged)
output = layers.Reshape((10,10))(output)
# create the model
model = models.Model(input_image, output)
model.compile(optimizer='rmsprop', loss='binary_crossentropy')
model.summary()
Model summary:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 10, 10, 1) 0
__________________________________________________________________________________________________
conv2d_1 (Conv2D) (None, 8, 8, 32) 320 input_1[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 4, 4, 32) 0 conv2d_1[0][0]
__________________________________________________________________________________________________
flatten_1 (Flatten) (None, 100) 0 input_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D) (None, 2, 2, 64) 18496 max_pooling2d_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 64) 6464 flatten_1[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D) (None, 1, 1, 64) 0 conv2d_2[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 32) 2080 dense_1[0][0]
__________________________________________________________________________________________________
flatten_2 (Flatten) (None, 64) 0 max_pooling2d_2[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 96) 0 dense_2[0][0]
flatten_2[0][0]
__________________________________________________________________________________________________
dense_3 (Dense) (None, 100) 9700 concatenate_1[0][0]
__________________________________________________________________________________________________
reshape_1 (Reshape) (None, 10, 10) 0 dense_3[0][0]
==================================================================================================
Total params: 37,060
Trainable params: 37,060
Non-trainable params: 0
__________________________________________________________________________________________________
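The shapes and parameter counts in this summary can be checked by hand. The sketch below assumes 'valid' padding with stride 1 for the convolutions and non-overlapping 2x2 pooling, which are the Keras defaults used in the code above. The helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    """Spatial size after a convolution with 'valid' padding and stride 1."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Spatial size after non-overlapping max pooling (remainder dropped)."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D."""
    return kernel * kernel * in_channels * filters + filters

s = 10
s = conv_valid(s)  # 8 -> conv2d_1
s = max_pool(s)    # 4 -> max_pooling2d_1
s = conv_valid(s)  # 2 -> conv2d_2
s = max_pool(s)    # 1 -> max_pooling2d_2

conv_branch_features = s * s * 64            # flatten_2
merged_features = 32 + conv_branch_features  # concatenate_1
final_dense_params = merged_features * 100 + 100  # dense_3

print(merged_features)                                # 96
print(conv_params(3, 1, 32), conv_params(3, 32, 64))  # 320 18496
print(final_dense_params)                             # 9700
```

Removing a pooling layer or changing the kernel size changes these numbers, so a quick calculation like this helps when modifying the architecture.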
Note that this is only one way of achieving what you are looking for, and it may or may not work for your specific problem and data. You may modify this model (e.g. remove the pooling layers or add more dense layers) or use a completely different architecture with other kinds of layers (e.g. up-sampling or Conv2DTranspose) to reach a better accuracy. In the end, you must experiment to find a solution that works.
Edit:
For completeness, here is how to generate data and fit the network:
import numpy as np
size = 10
n_images = 10
data = np.random.randint(0, 2, (n_images, size, size, 1))
# labels have no channel axis, to match the model's (None, 10, 10) output
labels = np.random.randint(0, 2, (n_images, size, size))
model.fit(data, labels, verbose=1, batch_size=32, epochs=20)

Keras dense layer shape mismatch

I am trying to make a multiclass classifier in Keras, but I am getting a dimension mismatch in the Dense layer.
MAX_SENT_LENGTH = 100
MAX_SENTS = 15
EMBEDDING_DIM = 100
x_train = data[:-nb_validation_samples]
y_train = labels[:-nb_validation_samples]
x_val = data[-nb_validation_samples:]
y_val = labels[-nb_validation_samples:]
embedding_layer = Embedding(len(word_index) + 1,
EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=MAX_SENT_LENGTH,
trainable=True)
sentence_input = Input(shape=(MAX_SENT_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sentence_input)
l_lstm = Bidirectional(LSTM(100))(embedded_sequences)
sentEncoder = Model(sentence_input, l_lstm)
review_input = Input(shape=(MAX_SENTS,MAX_SENT_LENGTH), dtype='int32')
review_encoder = TimeDistributed(sentEncoder)(review_input)
l_lstm_sent = Bidirectional(LSTM(100))(review_encoder)
preds = Dense(7, activation='softmax')(l_lstm_sent)
model = Model(review_input, preds)
model.compile(loss='sparse_categorical_crossentropy',
optimizer='rmsprop',
metrics=['acc'])
model.fit(x_train, y_train, validation_data=(x_val, y_val),
epochs=10, batch_size=50)
The class labels are transformed into a 1-hot vector correctly, but when trying to fit the model, I am getting this mismatch error:
('Shape of data tensor:', (5327, 15, 100))
('Shape of label tensor:', (5327, 7))
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 15, 100) 0
_________________________________________________________________
time_distributed_1 (TimeDist (None, 15, 200) 351500
_________________________________________________________________
bidirectional_2 (Bidirection (None, 200) 240800
_________________________________________________________________
dense_1 (Dense) (None, 7) 1407
=================================================================
Total params: 592,501
Trainable params: 592,501
Non-trainable params: 0
_________________________________________________________________
None
ValueError: Error when checking target: expected dense_1 to have
shape (None, 1) but got array with shape (4262, 7)
Where does this (None, 1) dimension come from and how can I solve this error?
You should use loss='categorical_crossentropy' instead of loss='sparse_categorical_crossentropy' if your labels are one-hot encoded. 'sparse_categorical_crossentropy' expects integer class labels, which is why a target of shape (None, 1) is expected.
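The two label formats are easy to convert between. The sketch below uses made-up labels for 5 samples and 7 classes and shows both options: keep the one-hot array and switch the loss, or keep the sparse loss and convert the labels back to integers.

```python
import numpy as np

# Hypothetical labels for 5 samples and 7 classes.
class_ids = np.array([0, 3, 6, 2, 3])
y_one_hot = np.eye(7, dtype="float32")[class_ids]  # shape (5, 7)

# Option 1: keep y_one_hot and compile with loss='categorical_crossentropy'.
# Option 2: keep 'sparse_categorical_crossentropy' and feed integer labels:
y_sparse = y_one_hot.argmax(axis=-1)               # shape (5,)

print(y_one_hot.shape, y_sparse.shape)
```

Either option should remove the mismatch. The choice mostly depends on which label format the rest of the pipeline already uses.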
