I would like to add a skip connection between residual blocks in Keras. This is my current implementation; it does not work because the tensors have different shapes.
The function looks like this:
def build_res_blocks(net, x_in, num_res_blocks, res_block, num_filters, res_block_expansion, kernel_size, scaling):
    net_next_in = net
    for i in range(num_res_blocks):
        net = res_block(net_next_in, num_filters, res_block_expansion, kernel_size, scaling)
        # net tensor shape: (None, None, 32)
        # x_in tensor shape: (None, None, 3)
        # Error here: net_next_in should have shape (None, None, 32) to be fed into the next layer
        net_next_in = Add()([net, x_in])
    return net
But I get
ValueError: Operands could not be broadcast together with shapes (None, None, 32) (None, None, 3)
How can I add or merge these tensors into the correct shape (None, None, 32)? If this is not the correct approach, how else could I achieve the intended result?
This is what the res_block looks like:
def res_block(x_in, num_filters, expansion, kernel_size, scaling):
    x = Conv2D(num_filters * expansion, kernel_size, padding='same')(x_in)
    x = Activation('relu')(x)
    x = Conv2D(num_filters, kernel_size, padding='same')(x)
    x = Add()([x_in, x])
    return x
You can't add tensors of different shapes. You could concatenate them with keras.layers.Concatenate, but this would leave you with a tensor of shape (None, None, 35).
Alternatively, have a look at the ResNet50 implementation in Keras. Its residual block features a 1x1xC convolution in the shortcut for those cases where the dimensions to be added differ.
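For reference, a minimal sketch of how that could look in your build_res_blocks (an illustration, assuming you want the same x_in shortcut added after every block; the 1x1 convolution projection is the part added here):

from tensorflow.keras.layers import Add, Conv2D

def build_res_blocks(net, x_in, num_res_blocks, res_block, num_filters, res_block_expansion, kernel_size, scaling):
    # Project x_in from 3 to num_filters channels once, with a 1x1 convolution,
    # so it can be added to the (None, None, num_filters) block outputs
    x_in_proj = Conv2D(num_filters, 1, padding='same')(x_in)
    net_next_in = net
    for i in range(num_res_blocks):
        net = res_block(net_next_in, num_filters, res_block_expansion, kernel_size, scaling)
        net_next_in = Add()([net, x_in_proj])
    return net

Since x_in_proj has num_filters channels, the Add inside the loop no longer sees mismatched shapes.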
I was trying to use the class method (not something I always use) for building a conv network, but for some reason the input dimension does not match the training dimensions. I had seen another similar question whose answer suggested using NumPy to expand the dimensions. I tried both NumPy and TensorFlow for expanding, but each gives another new error.
So, if someone can help me out, it'll be really helpful.
Model:
class ANet(tf.keras.Model):
    def __init__(self):
        super(ANet, self).__init__()
        # self.shape = Input(shape=(256, 256, 3))
        self.conv1 = Conv2D(32, 3, activation='relu')
        self.pooling = MaxPool2D()
        self.flatten = Flatten()
        self.d1 = Dense(64)
        self.d2 = Dense(10)

    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)
model_net = ANet()
input_shape=(256, 256, 3)
model_net.build(input_shape)
model_net.summary()
Error Code
ValueError: Input 0 of layer "conv2d_8" is incompatible with the layer: expected min_ndim=4, found ndim=3. Full shape received: (256, 256, 3)
I have solved the problem. TensorFlow raised the error because I didn't include None (the batch dimension) in the input_shape.
I changed my code to this:
# input_shape=(256, 256, 3)
model_net.build(input_shape=(None, 256, 256, 3))
model_net.summary()
And it worked!
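As a side note, a subclassed model can also be built by calling it once on a dummy batch, which achieves the same thing (a sketch; the zeros tensor is just a placeholder input):

import tensorflow as tf

model_net = ANet()
# Calling the model on one dummy batch creates all the weights;
# the leading 1 is the batch dimension
_ = model_net(tf.zeros((1, 256, 256, 3)))
model_net.summary()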
I am new to TensorFlow and trying to figure out how to build a simple text classification model. Taking a basic model from this tutorial, I am trying to adapt it to my own custom dataset.
I have tensors with shape=(32, 2, 500) grouped into training and validation datasets with shape=(None, 2, 500).
def get_model(max_features=20000, embedding_dim=128):
    # An integer input for vocab indices.
    inputs = tf.keras.Input(shape=(None,), dtype="int64")

    # Next, we add a layer to map those vocab indices into a space of
    # dimensionality 'embedding_dim'.
    x = layers.Embedding(max_features, embedding_dim)(inputs)
    x = layers.Dropout(0.5)(x)

    # Conv1D + global max pooling
    x = layers.Conv1D(128, 7, padding="valid", activation="relu", strides=3)(x)
    x = layers.Conv1D(128, 7, padding="valid", activation="relu", strides=3)(x)
    x = layers.GlobalMaxPooling1D()(x)

    # We add a vanilla hidden layer:
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.5)(x)

    # We project onto a single unit output layer, and squash it with a sigmoid:
    predictions = layers.Dense(1, activation="sigmoid", name="predictions")(x)

    model = tf.keras.Model(inputs, predictions)
    # Compile the model with binary crossentropy loss and an adam optimizer.
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model
I get the following warning:
WARNING:tensorflow:Model was constructed with shape (None, None) for input KerasTensor(type_spec=TensorSpec(shape=(None, None), dtype=tf.int64, name='input_16'), name='input_16', description="created by layer 'input_16'"), but it was called on an input with incompatible shape (None, 2, 500).
And the following error message:
Input 0 of layer "global_max_pooling1d_6" is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 2, 53, 128)
Call arguments received by layer "model_7" (type Functional):
• inputs=tf.Tensor(shape=(None, 2, 500), dtype=int64)
• training=True
• mask=None
What do I need to change to get rid of this error and get the model working?
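One possible way to reconcile those shapes, assuming each (2, 500) sample should be treated as a single flat token sequence (train_ds below is a hypothetical stand-in for your batched training dataset; if the two rows of each sample mean different things, this is not the right fix):

import tensorflow as tf

# Hypothetical: flatten each (2, 500) sample into one sequence of 1000 ids,
# so batches become (batch_size, 1000) and match Input(shape=(None,))
train_ds = train_ds.map(
    lambda x, y: (tf.reshape(x, (tf.shape(x)[0], -1)), y))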
I am using a convolutional neural network for a text classification task, using Keras and Conv1D. When I run the model below on my multi-class text classification task, I get an error such as the following. I have put time into understanding the error, but I don't know how to fix it. Can anyone help me, please?
The shapes of the training set and evaluation set are as follows:
df_train shape: (7198,)
df_val shape: (1800,)
np.random.seed(42)

# You need to reshape your input data to match the Conv1D input
# format: (batch_size, steps, input_dim)

# set parameters of matrices and convolution
embedding_dim = 300
nb_filter = 64
filter_length = 5
hidden_dims = 32
stride_length = 1

from keras.layers import Embedding
embedding_layer = Embedding(len(tokenizer.word_index) + 1,
                            embedding_dim,
                            input_length=35,
                            name="Embedding")

inp = Input(shape=(35,), dtype='int32')
embeddings = embedding_layer(inp)

conv1 = Conv1D(filters=32,                 # number of filters to use
               kernel_size=filter_length,  # n-gram range of each filter
               padding='same',             # 'valid': don't go off edge; 'same': pad before applying filter
               activation='relu',
               name="CONV1",
               kernel_regularizer=regularizers.l2(l=0.0367))(embeddings)

conv2 = Conv1D(filters=32,
               kernel_size=filter_length,
               padding='same',
               activation='relu',
               name="CONV2",
               kernel_regularizer=regularizers.l2(l=0.02))(embeddings)

conv3 = Conv1D(filters=32,
               kernel_size=filter_length,
               padding='same',
               activation='relu',
               name="CONV3",  # layer names must be unique
               kernel_regularizer=regularizers.l2(l=0.01))(embeddings)

max1 = MaxPool1D(10, strides=1, name="MaxPool1D1")(conv1)
max2 = MaxPool1D(10, strides=1, name="MaxPool1D2")(conv2)
max3 = MaxPool1D(10, strides=1, name="MaxPool1D3")(conv3)

conc = concatenate([max1, max2, max3])
flat = Flatten(name="FLATTEN")(max1)
....
The error is the following:
ValueError: Input 0 is incompatible with layer CNN: expected shape=(None, 35), found shape=(None, 31)
The model:
Model: "CNN"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_19 (InputLayer)        [(None, 35)]              0
_________________________________________________________________
Embedding (Embedding)        (None, 35, 300)           4094700
_________________________________________________________________
CONV1 (Conv1D)               (None, 35, 32)            48032
_________________________________________________________________
MaxPool1D1 (MaxPooling1D)    (None, 26, 32)            0
_________________________________________________________________
FLATTEN (Flatten)            (None, 832)               0
_________________________________________________________________
Dropout (Dropout)            (None, 832)               0
_________________________________________________________________
Dense (Dense)                (None, 3)                 2499
=================================================================
Total params: 4,145,231
Trainable params: 4,145,231
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
That error occurs when the network's input layer shape does not match the dataset's shape. If you are receiving an error like this, you should try the following (a padding sketch for the second option is shown below):
Set the network's input shape to (None, 31) so that it matches the dataset's shape.
Check that the dataset's shape is equal to (num_of_examples, 35). (Preferable)
If all of this information is correct and there is no problem with the dataset, it might be an error in the net itself, where the shapes of two adjacent layers don't match.
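For instance, if the 31-vs-35 mismatch comes from the padding length of the tokenized text, padding every sequence to the expected length is one way to achieve the second option (a sketch; train_sequences and val_sequences are hypothetical names for your tokenized data):

from keras.preprocessing.sequence import pad_sequences

# Pad/truncate every sequence to the 35 steps the Input layer expects,
# so the dataset shape becomes (num_of_examples, 35)
X_train = pad_sequences(train_sequences, maxlen=35)
X_val = pad_sequences(val_sequences, maxlen=35)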
I am trying to feed variable-sized sequences into an LSTM, so I am using a generator and a batch size of 1.
I have a (sequence_length,) input tensor which is embedded, yielding a (batch_size, sequence_length, embedding_dimension) tensor.
In parallel, the other input data I have is of size (sequence_length, features), i.e. (None, 10), which I want to reshape to (batch_size, sequence_length, features), i.e. (None, None, 10).
I have tried using the Keras Reshape layer with target_shape=(-1, 10), which should be equivalent to unfolding (None, 10) into (None, None, 10); but what I receive is a (None, 1, 10) tensor, which makes it impossible to Concatenate it with the embedded data in order to feed both to an LSTM.
My code:
cluster = Input(shape=(None,))
embeded = Embedding(115, 25, input_length = None)(cluster)
features = Input(shape=(10,)) #variable
reshape = Reshape(target_shape=(-1, 10))(features)
merged = Concatenate(axis=-1)([embeded, reshape])
[...]
model.fit_generator(generator(), steps_per_epoch=1, epochs=5)
Output:
[...]
ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, None, 25), (None, 1, 10)]
How can I reshape a (None, 10) to a (None, None, 10) tensor in Keras?
Doing this in Keras won't have any benefit over reshaping in NumPy. You can instead use:
# perform the reshaping prior to passing the data to Keras
features = Input(shape=(None, 10))
and do the reshaping before the data is passed to Keras, in your generator, where the actual batch_size and sequence_length are available.
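A sketch of what the generator could then yield (samples is a hypothetical container of your variable-length examples; seq is the (sequence_length,) index array, feats the (sequence_length, 10) feature array, and target is assumed to be a NumPy array):

import numpy as np

def generator():
    while True:
        for seq, feats, target in samples:  # hypothetical data source
            # Add the batch dimension of size 1 in NumPy instead of a
            # Reshape layer: (seq_len,) -> (1, seq_len) and
            # (seq_len, 10) -> (1, seq_len, 10)
            yield [seq[np.newaxis], feats[np.newaxis]], target[np.newaxis]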
In TensorFlow, layers.dense(inputs, units, activation) implements a fully connected layer with an arbitrary activation function:
Output = activation(matmul(input, weights) + bias)
Typically, the input has shape=[batch_size, input_size] and might look like this (units=128 and activation=tf.nn.relu are chosen arbitrarily):
inputx = tf.placeholder(float, shape=[batch_size, input_size])
dense_layer = tf.layers.dense(inputx, 128, tf.nn.relu)
I have not found any documentation on what would happen if I fed in higher-dimensional input, e.g. because one might have time_steps, resulting in a tensor of shape=[time_step, batch_size, input_size]. What one would want here is that the layer is applied to each single input vector, for each time step, for each element of the batch. To put it a bit differently, the internal matmul of layers.dense() should simply use broadcasting in NumPy style. Is the behaviour I expect here what actually happens? I.e., is:
inputx = tf.placeholder(float, shape=[time_step, batch_size, input_size])
dense_layer = tf.layers.dense(inputx, 128, tf.nn.relu)
applying the dense layer to each input of size input_size, for each time_step, for each element in batch_size? This should then result in a tensor (dense_layer above) of shape=[time_step, batch_size, 128].
I'm asking because, e.g., tf.matmul does not support broadcasting in the NumPy style, so I'm not sure how TensorFlow handles these cases.
Edit: This post is related, but does not ultimately answer my question.
You can verify your expectation by checking the shape of the dense kernel as follows.
>>> inputx = tf.placeholder(float, shape=[2,3,4])
>>> dense_layer = tf.layers.dense(inputx, 128, tf.nn.relu)
>>> g=tf.get_default_graph()
>>> g.get_collection('variables')
[<tf.Variable 'dense/kernel:0' shape=(4, 128) dtype=float32_ref>, <tf.Variable 'dense/bias:0' shape=(128,) dtype=float32_ref>]
The behavior of the dense layer here is the same as that of a conv layer: you can consider inputx as an image with width=2, height=3, and 4 channels, and the dense layer as a conv layer with 128 filters of size 1x1.
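The same behaviour is easy to check end to end; here is a small sketch in tf.keras (where Dense likewise acts on the last axis only), confirming the expected output shape:

import tensorflow as tf

x = tf.zeros([2, 3, 4])  # [time_step, batch_size, input_size]
y = tf.keras.layers.Dense(128, activation='relu')(x)
print(y.shape)  # (2, 3, 128): the layer is applied independently per time step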