I've been searching for a way to visualize parameters in Caffe after training the network, and I found this link. It passes a transpose of the parameters:
filters = net.params['conv1'][0].data
vis_square(filters.transpose(0, 2, 3, 1))
I don't understand why it transposes the data. Also, inside vis_square it uses this code:
data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])
which is too compressed for me to understand; any explanation would be appreciated. Then, when I changed the code to use conv2 instead of conv1:
filters = net.params['conv2'][0].data
vis_square(filters.transpose(0, 2, 3, 1))
I get
TypeError: Invalid dimensions for image data
Is there any difference between conv1 and conv2 that causes this error? How can the code be changed so that it works for all layers?
Some debugging data:
net.params['conv1'][0].data.shape : (96, 3, 11, 11)
net.params['conv1'][1].data.shape : (96,)
net.params['conv2'][0].data.shape : (256, 48, 5, 5)
net.params['conv2'][1].data.shape : (256,)
net.params['conv3'][0].data.shape : (384, 256, 3, 3)
net.params['conv3'][1].data.shape : (384,)
for conv2:
data.shape[0] : 256
np.sqrt(data.shape[0]) : 16.0
np.ceil(np.sqrt(data.shape[0])) : 16.0
data.shape[0] : 256
data.shape[0:] : (256, 6, 6, 48)
data.shape[1] : 6
data.shape[1:] : (6, 6, 48)
data.ndim : 4
range(4, data.ndim + 1) : [4]
tuple(range(4, data.ndim + 1)) : (4,)
And after
data = np.pad(data, padding, mode='constant', constant_values=1)
for conv2:
data.shape : (10, 12, 10, 12, 3)
and after
data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
data becomes:
data.shape : (120, 120, 3)
The code you inspected is written to visualize (i.e., convert to an RGB image) convolutional filters.
The shape of the conv1 filters (in your example) is (96, 3, 11, 11), which means:
- 96 : you have 96 filters in conv1 of your net (i.e., num_output: 96), therefore you would wish to view 96 different filters.
- 3 : the input dimension of each filter is 3, because the input to conv1 in your net is an RGB image with three channels.
- 11, 11: the spatial size of each kernel/filter in your case is 11x11 (i.e., kernel_size: 11).
Therefore, the code visualizes the 96 filters as 96 separate 11x11x3 RGB thumbnails; that is exactly why it transposes the data, moving the channel axis to the end as matplotlib expects.
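To see concretely what the transpose does, here is a small shape-only sketch (plain NumPy, with random data standing in for the real filters):
import numpy as np

# stand-in for net.params['conv1'][0].data: (num_filters, channels, height, width)
filters = np.random.rand(96, 3, 11, 11)
thumbs = filters.transpose(0, 2, 3, 1)      # -> (96, 11, 11, 3): 96 HxWx3 RGB thumbnails
print(thumbs.shape)
# vis_square then pads each thumbnail with a 1-pixel border and tiles all of them into a
# roughly square grid: (n*n, H, W, C) -> (n, n, H, W, C) -> (n, H, n, W, C) -> (n*H, n*W, C),
# which is a single displayable image
n = int(np.ceil(np.sqrt(thumbs.shape[0])))  # n = 10 for 96 filters
print(n)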
However, when trying to visualize conv2 (or any other deeper layer) you have a problem: the filter dimensions no longer have an RGB meaning. The filters of conv2 operate on the output features of conv1 (in your case a 96-dimensional space; because this net uses grouping in conv2, each filter actually sees 48 of those channels, hence the shape (256, 48, 5, 5)). To date, AFAIK, there is no straightforward way to convert such high-dimensional data to a simple 3-D RGB representation.
So, you cannot use the same code to visualize conv2 filters. You must use some other method for visualization.
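One common workaround (just a sketch, assuming vis_square is the helper from the Caffe example notebook, which also accepts single-channel data) is to drop the RGB interpretation and show each (filter, input-channel) slice as its own grayscale tile:
filters = net.params['conv2'][0].data                             # (256, 48, 5, 5)
# flatten the (filter, input-channel) pairs into a list of 2-D grayscale tiles
tiles = filters.reshape(-1, filters.shape[2], filters.shape[3])   # (256*48, 5, 5)
# 256*48 tiles is too much to look at at once, so show e.g. the 48 slices of the first filter
vis_square(tiles[:48])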
I'm a beginner with the PyTorch library, and I got stuck on an exercise.
The code below works for an input image of size 2x2. I'm trying to do the same thing for an input image of size 4x4.
The code:
import torch
Assume that we have a 2x2 input image
inputs = torch.tensor([[[[1., 2.],
[3., 4.]]]])
inputs.shape
Output: torch.Size([1, 1, 2, 2])
A fully connected layer, which maps the 4 input features to 2 outputs, would be computed as follows:
fc = torch.nn.Linear(4, 2)
weights = torch.tensor([[1.1, 1.2, 1.3, 1.4],
[1.5, 1.6, 1.7, 1.8]])
bias = torch.tensor([1.9, 2.0])
fc.weight.data = weights
fc.bias.data = bias
torch.relu(fc(inputs.view(-1, 4)))
Output: tensor([[14.9000, 19.0000]], grad_fn=<ReluBackward0>)
We can obtain the same outputs with a convolutional layer whose kernel size is the same as the size of the input feature array:
conv = torch.nn.Conv2d(in_channels=1,
out_channels=2,
kernel_size=inputs.squeeze(dim=(0)).squeeze(dim=(0)).size())
print(conv.weight.size())
print(conv.bias.size())
Output: torch.Size([2, 1, 2, 2])
Output: torch.Size([2])
conv.weight.data = weights.view(2, 1, 2, 2)
conv.bias.data = bias
torch.relu(conv(inputs))
Output: tensor([[[[14.9000]],
[[19.0000]]]], grad_fn=<ReluBackward0>)
We can also replace the fully connected layer with a convolutional layer if we reshape the input image into a num_inputs x 1 x 1 image:
conv = torch.nn.Conv2d(in_channels=4,
out_channels=2,
kernel_size=(1, 1))
conv.weight.data = weights.view(2, 4, 1, 1)
conv.bias.data = bias
torch.relu(conv(inputs.view(1, 4, 1, 1)))
Output: tensor([[[[14.9000]],
[[19.0000]]]], grad_fn=<ReluBackward0>)
So, based on this code, how do I feed an image of size 4x4 and replace the fully connected layer with a convolutional layer?
You simply need to change the shape of the input and reshape the weights to match the 4x4 size.
inputs = torch.randn(1, 1, 4, 4)
fc = torch.nn.Linear(16, 2)
torch.relu(fc(inputs.view(-1, 16)))
# output
tensor([[0.0000, 0.2525]], grad_fn=<ReluBackward0>)
Now, for the conv layer:
conv = torch.nn.Conv2d(in_channels=1,
out_channels=2,
kernel_size=inputs.squeeze(dim=(0)).squeeze(dim=(0)).size())
conv.weight.data = fc.weight.data.view(2, 1, 4, 4)
conv.bias.data = fc.bias.data
torch.relu(conv(inputs))
# output
tensor([[[[0.0000]],
[[0.2525]]]], grad_fn=<ReluBackward0>)
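For completeness, the 1x1-kernel variant from the question can be adapted the same way; this is a sketch reusing the fc layer and inputs defined above, so the output values match the fully connected result:
conv1x1 = torch.nn.Conv2d(in_channels=16, out_channels=2, kernel_size=(1, 1))
conv1x1.weight.data = fc.weight.data.view(2, 16, 1, 1)   # the 16 pixels become 16 input channels
conv1x1.bias.data = fc.bias.data
torch.relu(conv1x1(inputs.view(1, 16, 1, 1)))
# output: the same two values as the fc layer above, now shaped (1, 2, 1, 1)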
You can read Converting FC layers to CONV layers if you are not sure how the conv layer's parameters are derived.
I am using a Taylor expansion in an image classification task. Basically, a pixel vector is generated from the RGB image, and each pixel value in that vector is approximated with the Taylor series expansion of sin(x). I tried to code this up in TensorFlow, but I still have a problem when I try to create feature maps by stacking the tensors of the expansion terms. Can anyone suggest how to make my current attempt work, or make it more efficient?
Here are the expansion terms of the Taylor series of sin(x): sin(x) = x - x^3/3! + x^5/5! - ...
Here is my current attempt:
term = 2
c = tf.constant([1, -1/6])
power = tf.constant([1, 3])
x = tf.keras.Input(shape=(32, 32, 3))
res = []
for x in range(term):
    expansion = c * tf.math.pow(tf.tile(x[..., None], [1, 1, 1, 1, term]), power)
    m_ij = tf.math.cumsum(expansion, axis=-1)
    res.append(m_i)
This is not quite working, because I want to create input feature maps from each expansion neuron: delta_1 and delta_2 need to be stacked, which I did not do correctly in my attempt above, and my code is also not well generalized. How can I refine it into a correct implementation? Can anyone give me ideas or a canonical answer to improve my current attempt?
Doing the series expansion as described, if the input has C channels and the expansion has T terms, the expanded input should have C*T channels and otherwise keep the same shape. Thus, the approximations of the function up to each term should be concatenated along the channel dimension. It is a bit easier to do this with a transpose and reshape than with an actual concatenate.
Here is example code for a convolutional network trained on CIFAR10:
inputs = tf.keras.Input(shape=(32, 32, 3))
x = inputs
n_terms = 2
c = tf.constant([1, -1/6])
p = tf.constant([1, 3], dtype=tf.float32)
terms = []
for i in range(n_terms):
    m = c[i] * tf.math.pow(x, p[i])
    terms.append(m)
expansion = tf.math.cumsum(terms)
expansion_terms_last = tf.transpose(expansion, perm=[1, 2, 3, 4, 0])
x = tf.reshape(expansion_terms_last, tf.constant([-1, 32, 32, 3*n_terms]))
x = Conv2D(32, (3, 3), input_shape=(32,32,3*n_terms))(x)
This assumes the original network (without expansion) would have a first layer that looks like this:
x = Conv2D(32, (3, 3), input_shape=(32,32,3))(inputs)
and the rest of the network is exactly the same as it would be without expansion.
terms contains a list of c_i * x ^ p_i from the original; expansion contains the sum of the terms (1st, then 1st and 2nd, etc), in a single tensor (where T is the first dimension). expansion_terms_last moves the T dimension to be last, and the reshape changes the shape from (..., C, T) to (..., C*T)
The output of model.summary() then looks like this:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_4 (InputLayer) [(None, 32, 32, 3)] 0
__________________________________________________________________________________________________
tf_op_layer_Pow_6 (TensorFlowOp [(None, 32, 32, 3)] 0 input_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_Pow_7 (TensorFlowOp [(None, 32, 32, 3)] 0 input_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul_6 (TensorFlowOp [(None, 32, 32, 3)] 0 tf_op_layer_Pow_6[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul_7 (TensorFlowOp [(None, 32, 32, 3)] 0 tf_op_layer_Pow_7[0][0]
__________________________________________________________________________________________________
tf_op_layer_x_3 (TensorFlowOpLa [(2, None, 32, 32, 3 0 tf_op_layer_Mul_6[0][0]
tf_op_layer_Mul_7[0][0]
__________________________________________________________________________________________________
tf_op_layer_Cumsum_3 (TensorFlo [(2, None, 32, 32, 3 0 tf_op_layer_x_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Transpose_3 (Tensor [(None, 32, 32, 3, 2 0 tf_op_layer_Cumsum_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_3 (TensorFl [(None, 32, 32, 6)] 0 tf_op_layer_Transpose_3[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D) (None, 30, 30, 32) 1760 tf_op_layer_Reshape_3[0][0]
On CIFAR10, this network trains slightly better with expansion - maybe 1% accuracy gain (from 71 to 72%).
Step by step explanation of the code using sample data:
# create a sample input
x = tf.convert_to_tensor([[1,2,3],[4,5,6],[7,8,9]], dtype=tf.float32) # start with H=3, W=3
x = tf.expand_dims(x, axis=0) # add batch dimension N=1
x = tf.expand_dims(x, axis=3) # add channel dimension C=1
# x is now NHWC or (1, 3, 3, 1)
n_terms = 2 # expand to T=2
c = tf.constant([1, -1/6])
p = tf.constant([1, 3], dtype=tf.float32)
terms = []
for i in range(n_terms):
    # this simply calculates m = c_i * x ^ p_i
    m = c[i] * tf.math.pow(x, p[i])
    terms.append(m)
print(terms)
# list of two tensors with shape NHWC or (1, 3, 3, 1)
# calculate each partial sum
expansion = tf.math.cumsum(terms)
print(expansion.shape)
# tensor with shape TNHWC or (2, 1, 3, 3, 1)
# move the T dimension last
expansion_terms_last = tf.transpose(expansion, perm=[1, 2, 3, 4, 0])
print(expansion_terms_last.shape)
# tensor with shape NHWCT or (1, 3, 3, 1, 2)
# stack the last two dimensions together
x = tf.reshape(expansion_terms_last, tf.constant([-1, 3, 3, 1*2]))
print(x.shape)
# tensor with shape NHW and C*T or (1, 3, 3, 2)
# if the input had 3 channels for example, this would be (1, 3, 3, 6)
# now use this as though it was the input
Key assumptions:
(1) The c_i and p_i are not learned parameters, so the "expansion neurons" are not actually neurons; they are just multiply-and-sum nodes (although "neurons" sounds cooler :).
(2) The expansion happens for each input channel independently: C input channels expanded to T terms each produce C*T input features, but the T features from each channel are computed completely independently of the other channels (it looks like that in the diagram).
(3) The input contains all the partial sums (i.e. c_1 * x ^ p_1, then c_1 * x ^ p_1 + c_2 * x ^ p_2, and so forth) but does not contain the individual terms on their own (again, that is how it looks in the diagram).
I am working on a capsule network implementation in TensorFlow 2 (GPU). While reshaping the output of a convolution layer (a tensor), I get an "attempt to convert a value" error. The error and my code are below.
conv1_params = {"filters": 256, "kernel_size": 9, "strides": 1, "padding": "valid",
                "activation": tf.nn.relu}
conv2_params = {"filters": caps1_n_maps * caps1_n_dims, "kernel_size": 9, "strides": 2,
                "padding": "valid", "activation": tf.nn.relu}
conv1 = tf.keras.layers.Conv2D(input_shape=(None, 28, 28, 1), name="conv1", **conv1_params)
conv2 = tf.keras.layers.Conv2D(name="conv2", **conv2_params)
# output shape of conv1 = TensorShape([None, 20, 20, 256])
# output shape of conv2 = TensorShape([None, 6, 6, 256])
caps1_raw = tf.keras.backend.reshape(conv2, shape=[-1, caps1_n_caps, caps1_n_dims])
Error:
ValueError: Attempt to convert a value () with an unsupported type () to a Tensor.
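A minimal sketch of how these layers are usually wired in tf.keras, assuming the standard CapsNet sizes caps1_n_maps = 32 and caps1_n_dims = 8 (the question does not define them): the Conv2D objects are layers, so they need to be called on tensors before their output can be reshaped.
import tensorflow as tf

caps1_n_maps = 32                        # assumed value, not given in the question
caps1_n_dims = 8                         # assumed value
caps1_n_caps = caps1_n_maps * 6 * 6      # 6x6 is conv2's spatial output for a 28x28 input

conv1_params = {"filters": 256, "kernel_size": 9, "strides": 1, "padding": "valid",
                "activation": tf.nn.relu}
conv2_params = {"filters": caps1_n_maps * caps1_n_dims, "kernel_size": 9, "strides": 2,
                "padding": "valid", "activation": tf.nn.relu}

inputs = tf.keras.Input(shape=(28, 28, 1))
conv1_out = tf.keras.layers.Conv2D(name="conv1", **conv1_params)(inputs)      # a tensor
conv2_out = tf.keras.layers.Conv2D(name="conv2", **conv2_params)(conv1_out)   # a tensor
# reshape the output tensor, not the layer object
caps1_raw = tf.keras.layers.Reshape((caps1_n_caps, caps1_n_dims))(conv2_out)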
I have a text file that is ~10k lines long. Every 216 lines describe one fact, with a total of 17 values. I want to build a tensor that is 216 lines high, 13 columns wide, and about 1000 layers deep. That would be the input.
The output would be one line high, 4 columns wide and also about 1000 layers deep.
Current status:
x_train = x_train.reshape(1308, 13, 216)
y_train = y_train.reshape(1308, 4, 216)
result = y_train[:, :, 0]
Conv:
model.add(Convolution2D(1, kernel_size=(13, 5), activation='relu', input_shape=(1308, 13, 216)))
Afterwards a little max pooling, etc., which should not matter here. I simply cannot get the reshapes right. It would be very nice if someone could help me.
Current error message:
Input arrays should have the same number of samples as target arrays.
Found 1 input samples and 1308 target samples.
Many Thanks
I needed to change it to
input_shape = (13, 216, 1)
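That also means adding the matching channel axis to the data itself. A minimal sketch of the shapes fitting together, using the numbers from the question (the random arrays below just stand in for the real data):
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D as Convolution2D   # Convolution2D is the older name for Conv2D

x_train = np.random.rand(1308, 13, 216).reshape(1308, 13, 216, 1)   # add the channel axis
y_train = np.random.rand(1308, 4, 216)
result = y_train[:, :, 0]                                           # (1308, 4) targets

model = Sequential()
# input_shape is the per-sample shape; Keras adds the batch dimension itself
model.add(Convolution2D(1, kernel_size=(13, 5), activation='relu',
                        input_shape=(13, 216, 1)))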
I think changing from input_shape = (1308, 13, 216) to input_shape = (13, 216) should work.
I am new to Keras.
My goal is to have a total of 4 max pooling layers. All of them take the same input with shape (N, 256). The first layer does global max pooling and gives 1 output. The second layer, with pool size N/2 and stride N/2, gives 2 outputs. The third gives 4 outputs and the fourth gives 8 outputs. Here is my code.
test_x = np.random.rand(N, 256, 1)
model = Sequential()
input1 = Input(shape=test_x.shape, name='input1')
input2 = Input(shape=test_x.shape, name='input2')
input3 = Input(shape=test_x.shape, name='input3')
input4 = Input(shape=test_x.shape, name='input4')
max1 = MaxPooling2D(pool_size=(N, 256), strides=N)(input1)
max2 = MaxPooling2D(pool_size=(N / 2, 256), strides=N / 2)(input2)
max3 = MaxPooling2D(pool_size=(N / 4, 256), strides=N / 4)(input3)
max4 = MaxPooling2D(pool_size=(N / 8, 256), strides=N / 8)(input4)
mrg = Merge(mode='concat')([max1, max2, max3, max4])
After creating the 4 max pooling layers, I try to merge them together, but Keras gives this error:
ValueError: Dimension 1 in both shapes must be equal, but are 4 and 8 for 'merge_1/concat' (op: 'ConcatV2') with input shapes: [?,1,1,1], [?,2,1,1], [?,4,1,1], [?,8,1,1], [] and with computed input tensors: input[4] = <3>.
How can I solve this issue? Is merging the correct way to achieve my goal in Keras?
For concatenation, all dimensions must have the same number of elements, except for the concat dimension itself.
As you can see, your results have shape:
(?, 1, 1, 1)
(?, 2, 1, 1)
(?, 4, 1, 1)
(?, 8, 1, 1)
Naturally, the only possible way to concatenate them is along the second axis (axis=1):
mrg = Concatenate(axis=1)([max1,max2,max3,max4])
But notice that (unless you have specific reasons for that and know exactly what you're doing) this will result in a very weird image, since you're concatenating along a spatial dimension, not the channel dimension.
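Putting it together, a minimal functional-API sketch (assuming N = 64, integer pool sizes, and tf.keras's Concatenate in place of the old Merge layer; a single shared input stands in for the four separate ones in the question):
from tensorflow.keras.layers import Input, MaxPooling2D, Concatenate
from tensorflow.keras.models import Model

N = 64                                                              # assumed; the question leaves N unspecified
inp = Input(shape=(N, 256, 1))
max1 = MaxPooling2D(pool_size=(N, 256), strides=N)(inp)             # -> (None, 1, 1, 1)
max2 = MaxPooling2D(pool_size=(N // 2, 256), strides=N // 2)(inp)   # -> (None, 2, 1, 1)
max3 = MaxPooling2D(pool_size=(N // 4, 256), strides=N // 4)(inp)   # -> (None, 4, 1, 1)
max4 = MaxPooling2D(pool_size=(N // 8, 256), strides=N // 8)(inp)   # -> (None, 8, 1, 1)
mrg = Concatenate(axis=1)([max1, max2, max3, max4])                 # -> (None, 15, 1, 1)
model = Model(inputs=inp, outputs=mrg)
model.summary()
If the four separate Input tensors from the question are kept instead, the same Concatenate call works as long as all four inputs are fed the same data.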