TensorFlow/Keras: tf.reshape to Concatenate after multiple Conv2D

I am implementing multiple Conv2D layers, then I concatenate the outputs.
x = Conv2D(f, kernel_size=(3,3), strides=(1,1))(input)
y = Conv2D(f, kernel_size=(5,5), strides=(2,2))(input)
output = Concatenate()([x, y])
As you know, different kernel sizes and strides produce different output shapes. I could do this instead:
x = Conv2D(f, kernel_size=(3,3), strides=(1,1), padding="same")(input)
y = Conv2D(f, kernel_size=(5,5), strides=(2,2), padding="same")(input)
output = Concatenate()([x, y])
However, that would increase the number of channels a lot, which makes me run out of memory. I could also calculate the output shape by hand, but that would be inconvenient whenever I change the kernel size.
I tried:
y = tf.reshape(y, x.shape)
But I got the error:
ValueError: Cannot convert a partially known TensorShape to a Tensor
Is there an easy way to concatenate the outputs from multiple Conv2D layers?

You cannot concatenate outputs of two layers if their shapes don't match. You can utilize the ZeroPadding2D layer to add rows and columns with 0 values in order to match the shapes of outputs.
Here is the shortest example along with the shapes.
from tensorflow.keras.layers import *
from tensorflow.keras import *
import tensorflow as tf
input = Input(shape = (28,28,3))
x = Conv2D(3, kernel_size=(3,3), strides=(1,1))(input)
y = Conv2D(3, kernel_size=(5,5), strides=(2,2))(input)
z = tf.keras.layers.ZeroPadding2D(padding=(7,7))(y)
output = Concatenate()([x, z])
model = Model(inputs = input, outputs = output)
tf.keras.utils.plot_model(model, 'my_first_model.png', show_shapes=True)
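For this example, I used an input shape of (28,28,3). With the default "valid" padding, x comes out as 26x26 and y as 12x12, so padding y with 7 rows and columns on every side brings it to 26x26. You can use your own input shape and change the number of padding rows and columns accordingly.
For more details, see the ZeroPadding2D documentation.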
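The padding amount can also be derived instead of hard-coded. The following is a minimal sketch, assuming "valid" padding and square inputs and kernels; the helper names conv_out_size and symmetric_padding are made up for illustration and are not part of Keras.

```python
def conv_out_size(size, kernel, stride=1):
    """Output length of a 'valid' convolution along one spatial axis."""
    return (size - kernel) // stride + 1


def symmetric_padding(small, large):
    """Split the size difference into (before, after) padding amounts."""
    diff = large - small
    if diff < 0:
        raise ValueError("`small` must not exceed `large`")
    return diff // 2, diff - diff // 2


x_size = conv_out_size(28, kernel=3, stride=1)   # 3x3 kernel, stride 1
y_size = conv_out_size(28, kernel=5, stride=2)   # 5x5 kernel, stride 2
pad = symmetric_padding(y_size, x_size)
print(x_size, y_size, pad)  # 26 12 (7, 7)
```

When the difference is odd, the two padding amounts differ by one; ZeroPadding2D also accepts a pair of (before, after) pairs, so padding=(pad, pad) should cover that case.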


How do I correctly use Keras Embedding layer?

I have written the following multi-input Keras TensorFlow model:
CHARPROTLEN = 25 #size of vocab
CHARCANSMILEN = 62 #size of vocab
protein_input = Input(shape=(train_protein.shape[1:]))
compound_input = Input(shape=(train_smile.shape[1:]))
#protein layers
x = Embedding(input_dim=CHARPROTLEN+1,output_dim=128, input_length=maximum_amino_acid_sequence_length) (protein_input)
x = Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(x)
x = Conv1D(filters=64, padding="valid", activation="relu", strides=1, kernel_size=8)(x)
x = Conv1D(filters=96, padding="valid", activation="relu", strides=1, kernel_size=12)(x)
final_protein = GlobalMaxPooling1D()(x)
#compound layers
y = Embedding(input_dim=CHARCANSMILEN+1,output_dim=128, input_length=maximum_SMILES_length) (compound_input)
y = Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(y)
y = Conv1D(filters=64, padding="valid", activation="relu", strides=1, kernel_size=6)(y)
y = Conv1D(filters=96, padding="valid", activation="relu", strides=1, kernel_size=8)(y)
final_compound = GlobalMaxPooling1D()(y)
join = tf.keras.layers.concatenate([final_protein, final_compound], axis=-1)
x = Dense(1024, activation="relu")(join)
x = Dropout(0.1)(x)
x = Dense(1024, activation='relu')(x)
x = Dropout(0.1)(x)
x = Dense(512, activation='relu')(x)
predictions = Dense(1,kernel_initializer='normal')(x)
model = Model(inputs=[protein_input, compound_input], outputs=[predictions])
The inputs have the following shapes:
TensorShape([5411, 1500, 1])
TensorShape([5411, 100, 1])
I get the following error message:
ValueError: One of the dimensions in the output is <= 0 due to downsampling in conv1d. Consider increasing the input size. Received input shape [None, 1500, 1, 128] which would produce output shape with a zero or negative value in a dimension.
Is this due to the Embedding layer having the incorrect output_dim? How do I correct this? Thanks.

A Conv1D layer requires the input shape (batch_size, timesteps, features), which train_protein and train_smile already have. For example, train_protein consists of 5411 samples, where each sample has 1500 timesteps and each timestep has one feature. Applying an Embedding layer to them adds another dimension, giving the 4D shape [None, 1500, 1, 128] from the error message, which Conv1D layers cannot work with.
You have two options. Either leave out the Embedding layer altogether and feed your inputs directly to the Conv1D layers, or reshape your data to (5411, 1500) for train_protein and (5411, 100) for train_smile. You can use tf.reshape, tf.squeeze, or tf.keras.layers.Reshape to reshape the data. Afterwards you can use the Embedding layer as planned. Note that output_dim is the length of the vector to which each timestep is mapped.
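The shape change can be reproduced without Keras. The sketch below uses a plain NumPy lookup table as a stand-in for the Embedding weights (an assumption for illustration, not the Keras implementation) and a batch of 4 samples instead of 5411.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 26, 128   # CHARPROTLEN + 1 and output_dim from the question
table = rng.normal(size=(vocab_size, embed_dim))   # stand-in for the Embedding weights

# Integer token ids shaped like train_protein, but with only 4 samples.
ids_3d = rng.integers(0, vocab_size, size=(4, 1500, 1))

embedded_3d = table[ids_3d]       # the lookup appends embed_dim as a new last axis
print(embedded_3d.shape)          # (4, 1500, 1, 128): one axis too many for Conv1D

ids_2d = np.squeeze(ids_3d, axis=-1)   # drop the trailing singleton axis
embedded_2d = table[ids_2d]
print(embedded_2d.shape)          # (4, 1500, 128): (batch, timesteps, features)
```

The second shape is the (batch_size, timesteps, features) layout that Conv1D expects.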

How do I run an iterative 2D convolution for each slice of a tensor?

I'm working on a machine learning project with convolutional neural networks using TF/Keras in Python, and my goal is to split an image into patches, run a convolution on each one separately, and then put it back together.
What I can't figure out is how to run a convolution for each slice of a 3D array.
For example, if I have a tensor of size (500,100,100), I want to do a separate convolution for each of the 500 slices of size (100 x 100). I'm implementing this within a custom Keras layer and want these to be trainable weights. I've tried a few different things:
Using tf.map_fn() to run a convolution for each slice of the array:
This doesn't seem to attach weights to each layer separately.
Using the DepthwiseConv2D layer:
This works well for the first call of the layer, but fails when I call the layer a second time with more filters, because it wants to perform the depthwise convolution on each of the previously filtered layers.
This, of course, isn't what I want, because I want one convolution for each of the sets of filters from the previous layer.
Any ideas are appreciated, as I'm truly stuck here. Thank you!

If you have a tensor with shape (500,100,100) and want to feed subsets of it to separate Conv2D layers at the same time, you can define the Conv2D layers at the same level of the model. First define Lambda layers to split the input, then feed their outputs to the Conv2D layers, then concatenate the results.
Let's take a tensor with shape (100,28,28,1) as an example. We split each 28x28 image along its height into two 14-row halves and apply a separate Conv2D layer to each half:
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Conv2D, Input, concatenate, Lambda
from tensorflow.keras.models import Model
# define a sample dataset
x = tf.random.uniform((100, 28, 28, 1))
y = tf.random.uniform((100, 1), dtype=tf.int32, minval=0, maxval=9)
ds = tf.data.Dataset.from_tensor_slices((x, y))
ds = ds.batch(16)
def create_nn_model():
    input = Input(shape=(28,28,1))
    b1 = Lambda(lambda a: a[:,:14,:,:], name="first_slice")(input)
    b2 = Lambda(lambda a: a[:,14:,:,:], name="second_slice")(input)
    d1 = Conv2D(64, 2, padding='same', activation='relu', name="conv1_first_slice")(b1)
    d2 = Conv2D(64, 2, padding='same', activation='relu', name="conv2_second_slice")(b2)
    x = concatenate([d1,d2], axis=1)
    x = Flatten()(x)
    x = Dense(64, activation='relu')(x)
    out = Dense(10, activation='softmax')(x)
    model = Model(input, out)
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
model = create_nn_model()
tf.keras.utils.plot_model(model, show_shapes=True)
[Figure: plotted model architecture]
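To make "one convolution per slice" concrete, here is a NumPy-only sketch of the arithmetic: each slice is cross-correlated with its own independent kernel. It is an illustration under stated assumptions, not a trainable Keras layer; the function name per_slice_conv2d is hypothetical, and in Keras each kernel would instead live in its own Conv2D layer as in the model above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def per_slice_conv2d(x, kernels):
    """'Valid' cross-correlation of each slice x[s] with its own kernel kernels[s].

    x:       (slices, height, width)
    kernels: (slices, k, k), one independent kernel per slice
    """
    k = kernels.shape[-1]
    # windows[s, i, j] is the k x k patch of slice s whose top-left corner is (i, j)
    windows = sliding_window_view(x, (k, k), axis=(1, 2))
    # Multiply every patch by the kernel of its own slice and sum over the patch.
    return np.einsum("shwij,sij->shw", windows, kernels)


x = np.ones((5, 10, 10))
# Kernel s is filled with the value s, so every slice gives a visibly different result.
kernels = np.stack([np.full((3, 3), float(s)) for s in range(5)])
out = per_slice_conv2d(x, kernels)
print(out.shape)      # (5, 8, 8)
print(out[:, 0, 0])   # slice s yields 9 * s, i.e. 0, 9, 18, 27, 36
```

Because no kernel is shared between slices, the number of weights grows linearly with the number of slices, which is worth keeping in mind for 500 of them.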

How to convert a tensorflow model to a pytorch model?

I'm new to PyTorch. Here's the architecture of a TensorFlow model that I'd like to convert into a PyTorch model.
I have written most of the code but am confused about a few places.
1) In TensorFlow, the Conv2D function takes the number of filters as an input. However, in PyTorch, the function takes the number of input channels and output channels as inputs. So how do I find the equivalent number of input and output channels, given the number of filters?
2) In TensorFlow, the dense layer has a parameter called 'nodes'. However, in PyTorch, the same layer has 2 different inputs (the size of the input parameters and the size of the targeted parameters). How do I determine them based on the number of nodes?
Here's the tensorflow code.
from keras.utils import to_categorical
from keras.models import Sequential, load_model
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=X_train.shape[1:]))
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dense(256, activation='relu'))
model.add(Dense(43, activation='softmax'))
Here's my code:
import torch
import torch.nn as nn
import torch.nn.functional as F
# The network should inherit from nn.Module
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Define 2D convolution layers
        # 3: input channels, 32: output channels, 5: kernel size, 1: stride
        self.conv1 = nn.Conv2d(3, 32, 5, 1)  # The number of input channels is 3 because all images are coloured
        self.conv2 = nn.Conv2d(32, 64, 5, 1)
        self.conv3 = nn.Conv2d(64, 128, 3, 1)
        self.conv4 = nn.Conv2d(128, 256, 3, 1)
        # Dropout 'filters' out some of the input by randomly assigning zeros with the given probability
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        # Fully connected layer: input size, output size
        self.fc1 = nn.Linear(36864, 128)
        self.fc2 = nn.Linear(128, 10)
    # forward() links all layers together
    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = self.conv3(x)
        x = F.relu(x)
        x = self.conv4(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
Thanks in advance!

1) In PyTorch, a convolution layer takes the number of input channels and output channels as arguments. In your first layer, the input channels are the color channels of your image. After that, the input channels always equal the output channels of the previous layer (output channels are what the filters parameter specifies in TensorFlow).
2) PyTorch is slightly annoying in that, when flattening your conv outputs, you have to calculate the shape yourself. You can either use the equation Out = (W - F + 2P)/S + 1 (W: input size, F: kernel size, P: padding, S: stride), or write a shape-calculating function that passes a dummy image through the conv part of the network and reads off the resulting shape. This value is the in_features argument of the first nn.Linear; out_features is simply the number of nodes you want in that fully connected layer.
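Both points can be worked through with plain arithmetic. The sketch below derives the channel pairs from the Keras filters and then applies the equation layer by layer. The 30x30 input size is an assumption (the question does not state X_train's shape), and conv_out is a hypothetical helper name.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Out = (W - F + 2P) / S + 1, using floor division."""
    return (size - kernel + 2 * padding) // stride + 1


# Filters of the Keras model in the question; 3 input channels assumes RGB images.
filters = [32, 32, 64, 64]
channels = [3] + filters
# Each (in_channels, out_channels) pair is what one nn.Conv2d layer would take.
pairs = list(zip(channels[:-1], channels[1:]))
print(pairs)  # [(3, 32), (32, 32), (32, 64), (64, 64)]

size = 30                      # ASSUMPTION: 30x30 input images
size = conv_out(size, 5)       # Conv2D 5x5
size = conv_out(size, 5)       # Conv2D 5x5
size = conv_out(size, 2, 2)    # MaxPool 2x2, stride 2
size = conv_out(size, 3)       # Conv2D 3x3
size = conv_out(size, 3)       # Conv2D 3x3
size = conv_out(size, 2, 2)    # MaxPool 2x2, stride 2
in_features = filters[-1] * size * size
print(size, in_features)  # 3 576
```

With a different image size, only the starting value of size changes; the resulting in_features is what the first nn.Linear after the flatten would receive.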

How to pass sequence of image through Conv2D in Keras?

I have a sequence of 5 images that I want to pass through a CNN sequentially. A single input will have size: (5, width, height, channels) and I want to pass each image in the sequence in order to a 2D CNN, concatenate all 5 outputs at some layer and then feed to an LSTM. My model looks something like this:
from keras.models import Model
from keras.layers import Dense, Input, LSTM, Flatten, Conv2D, MaxPooling2D
# Feed images in sequential order here
inputs = Input(shape=(128, 128, 3))
x = Conv2D(16, 3, activation='relu')(inputs)
x = MaxPooling2D((2, 2))(x)
# Concatenate sequence outputs here
x = LSTM(8)(x)
x = Flatten()(x)
outputs = Dense(5, activation='sigmoid')(x)
model = Model(inputs=inputs, outputs=outputs)
Eventually I want to concatenate all 5 outputs at some point in the network and feed them to an LSTM, but I am having trouble figuring out how to feed a sequence of images, in order, to a 2D convolutional layer. I have looked into 3D convolutional layers and the ConvLSTM2D layer, but I want to figure out how I can do it this way instead.
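One common way to apply the same 2D network to every frame is to fold the sequence axis into the batch axis, run the per-image computation, and unfold again. Keras provides the TimeDistributed wrapper for applying a layer to each temporal slice. The NumPy sketch below only illustrates that reshaping idea; frame_features is a hypothetical stand-in for the CNN, and the batch size of 2 is arbitrary.

```python
import numpy as np


def frame_features(frames):
    """Stand-in for the 2D CNN: maps (n, height, width, channels) to (n, channels).

    Hypothetical placeholder; a real model would use Conv2D/MaxPooling2D here.
    """
    return frames.mean(axis=(1, 2))


def apply_per_frame(sequences, fn):
    """Apply `fn` to every frame of a (batch, time, height, width, channels) array."""
    batch, time = sequences.shape[:2]
    # Fold the time axis into the batch axis so `fn` sees ordinary image batches.
    folded = sequences.reshape((batch * time,) + sequences.shape[2:])
    features = fn(folded)
    # Unfold again: one feature vector per frame, in the original order.
    return features.reshape((batch, time) + features.shape[1:])


sequences = np.random.default_rng(0).random((2, 5, 128, 128, 3))
per_frame = apply_per_frame(sequences, frame_features)
print(per_frame.shape)  # (2, 5, 3): (batch, timesteps, features)
```

The resulting (batch, timesteps, features) layout is the 3D input a recurrent layer such as LSTM expects, so the five per-image outputs stay in sequence order.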

Something wrong when computing the receptive field using non-zero gradient in Keras

I'm trying to compute the receptive field of some specific neurons based on the non-zero gradient but found one strange thing.
The following is a simple NN model built in Keras. The rest of the code computes the gradient of the output of conv2d_4 (here the target neuron at position (0,2) in the first channel) with respect to the layer's input. The non-zero values of the gradient map locate the receptive field of that neuron. Since the kernel size of conv2d_4 is 3x3, the receptive field of one neuron in its output with respect to its input should be 3x3, but the non-zero gradient map is a 4x5 patch (given by the TRUE values in f_sum).
import numpy as np
import keras.backend as K
import matplotlib.pyplot as plt
from keras.models import load_model, Model
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D,Input, AveragePooling2D, Lambda
def model_build_func(input_shape=(25,25,1)):
    inp = Input(shape=input_shape, name='input')
    x = Conv2D(32, (3,3), activation='linear', name='conv2d_1')(inp)
    x = Conv2D(32, (3,3), activation='linear', name='conv2d_2')(x)
    x = AveragePooling2D(pool_size=(2,2))(x)
    x = Conv2D(64, (3,3), activation='linear', name='conv2d_3')(x)
    x = Conv2D(64, (3,3), activation='linear', name='conv2d_4')(x)
    x = AveragePooling2D(pool_size=(2,2))(x)
    x = Flatten()(x)
    x = Dense(units=64, name='dense_1')(x)
    x = Dense(units=2, name='dense_2')(x)
    model = Model(inputs=inp, outputs=x)
    return model
# used for building the Lambda layer
def get_mask_tensor(input_tensors, x_pos, y_pos, channel_idx):
    mask_tensor = K.tf.gradients(input_tensors[0][:,x_pos,y_pos,channel_idx], input_tensors[1])[0]
    return mask_tensor
#specify the position of the neuron that we want to compute the RF
x_pos = 0
y_pos = 2
channel_idx = 0
layer_idx = 5 # the layer: conv2d_4
model = model_build_func()
current_layer = model.layers[layer_idx]
#get the gradient tensor
mask_tensor = Lambda(get_mask_tensor, output_shape=K.int_shape(model.input),
                     arguments={'x_pos':x_pos, 'y_pos':y_pos, 'channel_idx':channel_idx})([current_layer.output, current_layer.input])
#create a keras model
new_model = Model(inputs=[model.input], outputs=[mask_tensor])
#get the value of the gradient map
gradient_map = new_model.predict(0.1*(np.random.random(size=(32,25,25,1))-0.05))
f_sum = np.sum(np.abs(gradient_map), axis=-1)
f_sum = np.sum(np.abs(f_sum), axis=0)
# f_sum is a binary array.
# It should have a 3x3 patch of TRUE values, but here it is 4x5.
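The expected receptive field sizes for this architecture can be checked by hand with the usual recurrence: each layer grows the field by (kernel - 1) times the current spacing between neighbouring outputs. The sketch below is only that arithmetic, with a hypothetical helper name; it gives reference values to compare the gradient map against and does not by itself explain the 4x5 result.

```python
def receptive_field(layers):
    """Receptive field size along one axis for a stack of (kernel, stride) layers."""
    size, jump = 1, 1   # one output pixel; spacing between neighbours in input pixels
    for kernel, stride in layers:
        size += (kernel - 1) * jump
        jump *= stride
    return size


conv, pool = (3, 1), (2, 2)   # 3x3 conv with stride 1, 2x2 average pool with stride 2

print(receptive_field([conv]))                          # 3: conv2d_4 w.r.t. its own input
print(receptive_field([conv, conv, pool, conv, conv]))  # 14: conv2d_4 w.r.t. the model input
```

Comparing which of these two reference points the gradient is actually taken against (the layer's own input or the model input) may help narrow down where the unexpected patch size comes from.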
