Setting custom kernel for CNN in pytorch - python

Is there a way to specify our own custom kernel values for a convolutional neural network in PyTorch? Something like kernel_initializer in TensorFlow? E.g., I want a 3x3 kernel in nn.Conv2d initialized so that it acts as an identity kernel -
0 0 0
0 1 0
0 0 0
(this will effectively return the same output as my input in the very first iteration)
My non-exhaustive research on the subject -
I could use nn.init, but it only has some pre-defined kernel initialisation values.
I tried to follow the discussion on their official thread, but it doesn't suit my needs.
I might have missed something in my research; please feel free to point it out.

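I think an easier solution is to copy the desired weights into the layer:
deconv = nn.ConvTranspose2d(
    in_channels=channel_dim, out_channels=channel_dim,
    kernel_size=kernel_size, stride=stride,
    bias=False, padding=1, output_padding=1
)
# get_upsampling_weight is a user-defined helper (not shown here) that returns the weight tensor
deconv.weight.data.copy_(
    get_upsampling_weight(channel_dim, channel_dim, kernel_size)
)
In other words, use copy_.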

Thanks to ptrblck I was able to solve it.
I can define a new convolution layer as conv and, as per the example, set the identity kernel using -
import torch as ch
import torch.nn as nn

# 3x3 identity kernel, reshaped to (out_channels, in_channels, kH, kW) = (1, 1, 3, 3)
weights = ch.Tensor([[0, 0, 0], [0, 1, 0], [0, 0, 0]]).unsqueeze(0).unsqueeze(0)
weights.requires_grad = True
conv = nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1, bias=False)
with ch.no_grad():
    conv.weight = nn.Parameter(weights)
I can then continue to use conv as my regular nn.Conv2d layer.
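As a quick sanity check, here is a minimal sketch of the same idea using copy_ on the existing parameter instead of replacing it. The 5x5 random input is an assumption made only for illustration; with padding=1 and no bias, the output should match the input.

```python
import torch
import torch.nn as nn

# Single-channel 3x3 convolution that preserves the spatial size (padding=1).
conv = nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1, bias=False)

# Identity kernel: only the centre tap is 1.
identity = torch.tensor([[0.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0],
                         [0.0, 0.0, 0.0]])

# Overwrite the randomly initialised weights in place; the weight layout is
# (out_channels, in_channels, kH, kW).
with torch.no_grad():
    conv.weight.copy_(identity.view(1, 1, 3, 3))

x = torch.rand(1, 1, 5, 5)  # assumed example input
y = conv(x)
print(torch.allclose(x, y))
```

The weight stays an nn.Parameter, so it will still be updated by the optimizer after this first iteration.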


How to update a pretrained model after Pruning of filters in its conv layer in PyTorch?

I have a pretrained model LeNet5 defined from scratch. I am performing pruning over filters in the convolution layers present in the model shown below.
class LeNet5(nn.Module):
    def __init__(self, n_classes):
        super(LeNet5, self).__init__()
        self.feature_extractor = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5, stride=1),
            nn.Conv2d(in_channels=20, out_channels=50, kernel_size=5, stride=1),
        )
        self.classifier = nn.Sequential(
            nn.Linear(in_features=800, out_features=500),
            nn.Linear(in_features=500, out_features=10),  # 10 - possible classes
        )

    def forward(self, x):
        #x = x.view(x.size(0), -1)
        x = self.feature_extractor(x)
        x = torch.flatten(x, 1)
        logits = self.classifier(x)
        probs = F.softmax(logits, dim=1)
        return logits, probs
I have successfully removed 2 filters from 20 in layer 1 (now 18 filters in conv2d layer1) and 5 filters from 50 in layer 2 (now 45 filters in conv2d layer3). So, now I need to update the model with the changes done as follows -
out_channel of layer 1 - 20 to 18
in_channel of layer 3 - 20 to 18
out_channel of layer 3 - 50 to 45
However, I'm unable to run the model as it gives dimension error.
RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x720 and 800x500)
How do I update the number of filters in the model's layers using PyTorch after pruning? Is there any library I can use for this?
Assuming you do not want the model to automatically change structure during runtime, you can easily update the structure of the model by simply changing the input parameters to the constructor. For instance:
nn.Conv2d(in_channels = 1, out_channels = 18, kernel_size = 5, stride = 1),
nn.Conv2d(in_channels = 18, out_channels = 45, kernel_size = 5, stride = 1),
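and so on. In particular, the first nn.Linear must take in_features=720 instead of 800, which is what the error message (32x720 vs. 800x500) is complaining about.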
If you are retraining from scratch every time you change the model structure, that's all you need to do. However, if you would like to maintain portions of the already learned parameters when you change the model, you'll need to select these relevant values and reassign them to the model parameters. For instance, consider the parameters associated with the first convolutional layer: 1 input channel, 20 output channels, and a kernel size of 5. PyTorch stores Conv2d weights as [out_channels, in_channels, kH, kW], so the weights and biases for this layer have size [20,1,5,5] and [20]. You need to modify these parameters such that they have size [18,1,5,5] and [18]. You'd thus need the indices of the particular kernels/filters you want to keep and of those you'd like to prune. The code syntax for doing this is roughly:
params = net.state_dict()
# state_dict keys are flat strings such as "feature_extractor.0.weight"
params["feature_extractor.0.weight"] = params["feature_extractor.0.weight"][:18]
params["feature_extractor.0.bias"] = params["feature_extractor.0.bias"][:18]
# and so on for the other layers (the next convolution must also drop the matching input channels along dim 1)
Here, I simply drop the last two kernels/bias values for the first convolutional layer. (Note that the actual dictionary key names may differ slightly; I didn't code this up to check because, as indicated in the comments above, you included a picture of code rather than real, copy-able code, so try to do the latter in the future.)
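To make the reassignment concrete, here is a minimal sketch that copies the kept filters from an old feature extractor into a smaller one. Several things here are assumptions, not taken from the question: the make_extractor helper is hypothetical, the MaxPool2d layers and the 28x28 input are guesses that happen to be consistent with the 800 and 720 in the error message, and "keep the first 18 and 45 filters" merely stands in for whatever indices the pruning criterion selects.

```python
import torch
import torch.nn as nn

def make_extractor(c1, c2):
    # Hypothetical helper: LeNet-style stack with c1 and c2 filters.
    # The pooling layers are an assumption, not shown in the question.
    return nn.Sequential(
        nn.Conv2d(1, c1, kernel_size=5),
        nn.MaxPool2d(2),
        nn.Conv2d(c1, c2, kernel_size=5),
        nn.MaxPool2d(2),
    )

old = make_extractor(20, 50)   # stands in for the pretrained layers
new = make_extractor(18, 45)   # pruned structure

# Indices of the filters to keep (assumed; use your pruning criterion here).
keep1 = torch.arange(18)
keep2 = torch.arange(45)

with torch.no_grad():
    # First conv: select output filters along dim 0.
    new[0].weight.copy_(old[0].weight[keep1])
    new[0].bias.copy_(old[0].bias[keep1])
    # Second conv: select output filters (dim 0) and the input
    # channels that survived pruning of the first conv (dim 1).
    new[2].weight.copy_(old[2].weight[keep2][:, keep1])
    new[2].bias.copy_(old[2].bias[keep2])

x = torch.rand(32, 1, 28, 28)          # assumed input size
flat = torch.flatten(new(x), 1)
print(flat.shape)                      # number of features the first nn.Linear must accept
```

Under these assumptions the flattened size is 45 * 4 * 4 = 720, so the first linear layer also has to be rebuilt (and its weight columns selected) to match.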

Constraint on dimensions of activation/feature map in convolutional network

Let's say input to intermediate CNN layer is of size 512×512×128 and that in the convolutional layer we apply 48 7×7 filters at stride 2 with no padding. I want to know what is the size of the resulting activation map?
I checked some previous posts (e.g., here or here), which point to this Stanford course page. The formula given there is (W − F + 2P)/S + 1 = (512 - 7)/2 + 1, which would imply that this setup is not possible, as the value we get is not an integer.
However if I run the following snippet in Python 2.7, the code seems to suggest that the size of activation map was computed via (512 - 6)/2, which makes sense but does not match the formula above:
>>> import torch
>>> conv = torch.nn.Conv2d(in_channels=128, out_channels=48, kernel_size=7, stride=2, padding=0)
>>> conv
Conv2d(128, 48, kernel_size=(7, 7), stride=(2, 2))
>>> img = torch.rand((1, 128, 512, 512))
>>> out = conv(img)
>>> out.shape
(1, 48, 253, 253)
Any help in understanding this conundrum is appreciated.
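Here is the formula being used in PyTorch: conv2d (go to the shape section). The key difference from the course-page formula is that PyTorch takes the floor of the division, so the output size is floor((512 - 7)/2) + 1 = 253; the last row and column of the input, which do not fit a full window, are simply ignored.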
Also, as far as I know, this is the best tutorial on this subject.
Bonus: here is a neat visualizer for conv calculations.
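For reference, the shape rule from the Conv2d documentation can be written as a small helper. The function name conv_output_size is hypothetical; the arithmetic follows the documented formula, with integer division playing the role of the floor.

```python
def conv_output_size(in_size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension, per the Conv2d shape formula."""
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# The case from the question: 512 input, 7x7 kernel, stride 2, no padding.
print(conv_output_size(512, kernel_size=7, stride=2, padding=0))  # 253
```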

What's the difference between conv2d(SAME) and tf.pad + conv2d(VALID)?

I'm fairly new to TensorFlow, and while working through some tutorials I came across the following code:
if stride == 1:
    return slim.conv2d(inputs, num_outputs, kernel_size, stride=1, padding='SAME', scope=scope)
pad_total = kernel_size - 1
pad_beg = pad_total // 2
pad_end = pad_total - pad_beg
inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end], [pad_beg, pad_end], [0, 0]])
return slim.conv2d(inputs, num_outputs, kernel_size, stride=stride, padding='VALID', scope=scope)
However, I have also learned that "SAME" padding means the output has the same size as the input, while "VALID" means it does not, and that tf.pad also pads with zeros manually. So is there any difference between these two methods? Or what is the purpose of this tf.pad?
In many real-world use cases, there is no difference.
For instance, in some imagenet architectures, we often pad with 1, then do a 3x3 convolution. Behaviour of the network would be the same if you first zero-pad with 1, then convolve, or if you convolve with "same" padding.
However, behaviour will be different in non-standard situations. Remember that you can define kernel size AND stride AND dilation rate at a convolution layer.
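A counterexample where there is a difference between conv2d(SAME) and a symmetric tf.pad + conv2d(VALID):
Input: (7,7,1)
Kernel: (4,4)
Stride: (2,2)
conv2d(SAME) here produces ceil(7/2) = 4 outputs per dimension, which needs (4 - 1)*2 + 4 - 7 = 3 pixels of padding in total. TensorFlow splits an odd total unevenly, so this is the same as tf.pad(1 pixel left/top, 2 pixels right/bottom) followed by conv2d(VALID), and yields a (4,4,1) output. A symmetric tf.pad of 1 pixel on every side followed by conv2d(VALID) would instead yield a (3,3,1) output.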
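The difference can also be checked with plain arithmetic. The sketch below follows the 'SAME' padding rule as described in TensorFlow's documentation and compares it with the fixed kernel_size - 1 padding from the snippet in the question; the function names are hypothetical and no TensorFlow import is needed.

```python
import math

def same_padding(in_size, kernel_size, stride):
    """(before, after) padding that 'SAME' applies along one dimension."""
    out_size = math.ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    pad_beg = pad_total // 2
    return pad_beg, pad_total - pad_beg

def fixed_padding(kernel_size):
    """(before, after) padding used by the explicit tf.pad in the question."""
    pad_total = kernel_size - 1
    pad_beg = pad_total // 2
    return pad_beg, pad_total - pad_beg

def valid_out_size(in_size, kernel_size, stride, pad):
    """Output length of a VALID convolution after padding by `pad`."""
    return (in_size + pad[0] + pad[1] - kernel_size) // stride + 1

# With a 3x3 kernel and stride 2, 'SAME' padding depends on the input size,
# while the explicit padding does not.
for in_size in (7, 8):
    print(in_size, same_padding(in_size, 3, 2), fixed_padding(3))
```

For an input of 7 both give (1, 1), but for an input of 8 'SAME' pads (0, 1) while the explicit version still pads (1, 1). That input-independent alignment appears to be the purpose of the tf.pad branch for strides greater than 1.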

Understanding PyTorch CNN Channels

I'm a bit confused at how CNNs and channels work. Specifically, how come these two implementations are not equal? Isn't the # of output channels just applying however many # of filters?
self.conv1 = nn.Conv2d(1, 10, kernel_size=(3, self.embeds_size))
self.conv2 = nn.ModuleList([nn.Conv2d(1, 1, kernel_size=(3, self.embeds_size)) for f in range(10)])
conv1s = self.conv1(x)
conv2s = [conv(x) for conv in self.conv2]
conv2s = torch.stack(conv2s, 1).squeeze(2)
print(torch.equal(conv1s, conv2s))
Check the state dicts of the different modules. Unless you're doing something fancy that you didn't tell us about, PyTorch will initialize the weights randomly, so if you compare the weights of conv1 with the weights of the modules in conv2, they will be different.
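To see that the two implementations do agree once they share weights, here is a minimal sketch that copies each filter of the 10-channel layer into the corresponding single-channel layer. The embedding width of 8 and the input shape are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embeds_size = 8  # assumed embedding width

conv1 = nn.Conv2d(1, 10, kernel_size=(3, embeds_size))
conv2 = nn.ModuleList(
    [nn.Conv2d(1, 1, kernel_size=(3, embeds_size)) for _ in range(10)]
)

# Give the i-th single-filter layer the i-th filter (and bias) of conv1.
with torch.no_grad():
    for i, conv in enumerate(conv2):
        conv.weight.copy_(conv1.weight[i:i + 1])
        conv.bias.copy_(conv1.bias[i:i + 1])

x = torch.rand(4, 1, 12, embeds_size)  # assumed (batch, channel, seq_len, embed)
out1 = conv1(x)
# Each conv(x) has one channel, so concatenating along dim 1 rebuilds 10 channels.
out2 = torch.cat([conv(x) for conv in conv2], dim=1)
same = torch.allclose(out1, out2, atol=1e-5)
print(same)
```

torch.allclose is used rather than torch.equal because the two code paths may differ by floating-point rounding.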

Implementing CNN with tensorflow

I'm new to convolutional neural networks and to TensorFlow, and I need to implement a conv layer with the following parameters:
Conv. layer1: filter=11, channel=64, stride=4, Relu.
The API is as follows:
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)
I understand what the stride is and that it should be [1, 4, 4, 1] in my case. But I do not understand how I should pass the filter and padding parameters.
Could someone help with it?
First, you need to create a filter variable:
W = tf.Variable(tf.truncated_normal(shape=[11, 11, 3, 64], stddev=0.1), dtype=tf.float32)
The first two fields of the shape parameter stand for the filter size, the third for the number of input channels (I guess your images have 3 channels), and the fourth for the number of output channels.
Now the output of the convolutional layer can be computed as follows:
conv1 = tf.nn.conv2d(input, W, strides=[1, 4, 4, 1], padding='SAME')
Here padding='SAME' zero-pads the input so that the output height and width are ceil(input size / stride); the size stays the same only for stride 1, and with stride 4 it shrinks to about a quarter. input should have size [batch, size1, size2, 3].
ReLU application is pretty straightforward:
conv1 = tf.nn.relu(conv1)
