torch Tensor add dimension - python

I have a tensor with this size:
torch.Size([128, 64])
How do I add one "dummy" dimension, such as:
torch.Size([1, 128, 64])

There are several ways:
torch.unsqueeze:
torch.unsqueeze(x, 0)
Using None (or np.newaxis):
x[None, ...]
# or
x[np.newaxis, ...]
reshape or view:
x.reshape(1, *x.shape)
# or
x.view(1, *x.shape)
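All of these give the same result; a quick check:
import torch

x = torch.zeros(128, 64)

print(torch.unsqueeze(x, 0).shape)   # torch.Size([1, 128, 64])
print(x[None, ...].shape)            # torch.Size([1, 128, 64])
print(x.reshape(1, *x.shape).shape)  # torch.Size([1, 128, 64])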

Related

Pytorch Conv1D gives different size to ConvTranspose1d

I am trying to build a basic/shallow CNN auto-encoder for 1D time series data in pytorch/pytorch-lightning.
Currently, my encoding block is:
class encodingBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1d_1 = nn.Conv1d(1, 64, kernel_size=32)
        self.relu = nn.ReLU()
        self.batchnorm = nn.BatchNorm1d(64)
        self.maxpool = nn.MaxPool1d(kernel_size=2, stride=2, return_indices=True)
        self.fc = nn.Linear(64, 4)

    def forward(self, x):
        cnn_out1 = self.conv1d_1(x)
        norm_out1 = self.batchnorm(cnn_out1)
        relu_out1 = self.relu(norm_out1)
        maxpool_out, indices = self.maxpool(relu_out1)
        gap_out = torch.mean(maxpool_out, dim=2)
        fc_out = self.relu(self.fc(gap_out))
        return fc_out, indices
And my decoding block is:
class decodingBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.Tconv1d_1 = nn.ConvTranspose1d(64, 1, kernel_size=32, output_padding=1)
        self.relu = nn.ReLU()
        self.batchnorm = nn.BatchNorm1d(1)
        self.maxunpool = nn.MaxUnpool1d(kernel_size=2, stride=2)
        self.upsamp = nn.Upsample(size=59, mode='nearest')
        self.fc = nn.Linear(4, 64)

    def forward(self, x, indices):
        fc_out = self.fc(x)
        relu_out = self.relu(fc_out)
        relu_out = relu_out.unsqueeze(dim=2)
        upsamp_out = self.upsamp(relu_out)
        maxpool_out = self.maxunpool(upsamp_out, indices)
        cnnT_out = self.Tconv1d_1(maxpool_out)
        norm_out = self.batchnorm(cnnT_out)
        relu_out = self.relu(norm_out)
        return relu_out
However, looking at the outputs:
Input size: torch.Size([1, 1, 150])
Conv1D out size: torch.Size([1, 64, 119])
Maxpool out size: torch.Size([1, 64, 59])
Global average pooling out size: torch.Size([1, 64])
Encoder dense out size: torch.Size([1, 4])
...
Decoder input: torch.Size([1, 4])
Decoder dense out size: torch.Size([1, 64])
Unsqueeze out size: torch.Size([1, 64, 1])
Upsample out size: torch.Size([1, 64, 59])
Decoder maxunpool out size: torch.Size([1, 64, 118])
Transpose Conv out size: torch.Size([1, 1, 149])
The outputs from the MaxUnpool1d and ConvTranspose1d layers are not the expected dimension.
I have two questions that I was hoping to get some help on:
Why are the dimensions wrong?
Is there a better way to "reverse" the global average pooling than the upsampling procedure I have used?
1. Regarding input and output shapes:
PyTorch's docs give the explicit formulas relating input and output sizes.
For convolution (Conv1d):
L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
Similarly for pooling (MaxPool1d):
L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
For transposed convolution (ConvTranspose1d):
L_out = (L_in - 1)*stride - 2*padding + dilation*(kernel_size - 1) + output_padding + 1
And for unpooling (MaxUnpool1d):
L_out = (L_in - 1)*stride - 2*padding + kernel_size
Make sure your padding and output_padding values add up to the proper output shape.
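As a sanity check, here is a small sketch (my own, using the hyperparameters from the question) that plugs the numbers into these formulas. It shows where the mismatch comes from: MaxPool1d floors the odd length 119 down to 59, so MaxUnpool1d can only reconstruct 118, and the transposed convolution then lands one sample short as well.
import math

def conv1d_len(l_in, kernel, stride=1, padding=0, dilation=1):
    # output length of Conv1d / MaxPool1d
    return math.floor((l_in + 2 * padding - dilation * (kernel - 1) - 1) / stride + 1)

def convtranspose1d_len(l_in, kernel, stride=1, padding=0, dilation=1, output_padding=0):
    # output length of ConvTranspose1d
    return (l_in - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

def maxunpool1d_len(l_in, kernel, stride, padding=0):
    # output length of MaxUnpool1d
    return (l_in - 1) * stride - 2 * padding + kernel

l = conv1d_len(150, kernel=32)               # 119
l = conv1d_len(l, kernel=2, stride=2)        # 59  (119 is floored here)
l = maxunpool1d_len(l, kernel=2, stride=2)   # 118, not 119
l = convtranspose1d_len(l, kernel=32)        # 149 (with output_padding=0), not 150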
2. Is there a better way?
Transposed convolution has its faults, as you already noticed. It also tends to produce "checkerboard artifacts".
One solution is to use pixelshuffle: that is, predict for each low-res point twice the number of channels, and then split them into two points with the desired number of features.
Alternatively, you can interpolate using a fixed method from the low resolution to the higher one. Apply regular convolutions to the upsampled vectors.
If you choose this path, you might consider using ResizeRight instead of pytorch's interpolate - it has better handling of edge cases.
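For illustration, a minimal sketch of the interpolate-then-convolve option (my own code, not the asker's; the names are made up): upsample with a fixed method and let a regular Conv1d do the learning.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpsampleConv1d(nn.Module):
    # Fixed interpolation followed by a learned convolution,
    # as an alternative to ConvTranspose1d.
    def __init__(self, in_channels, out_channels, target_len, kernel_size=3):
        super().__init__()
        self.target_len = target_len
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # (N, C, L) -> (N, C, target_len), then refine with a learned conv
        x = F.interpolate(x, size=self.target_len, mode='nearest')
        return self.conv(x)

up = UpsampleConv1d(64, 1, target_len=150)
print(up(torch.randn(1, 64, 59)).shape)  # torch.Size([1, 1, 150])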

Concatenate differently shaped keras layer outputs

The keras model is like this:
input_x = Input(shape=input_shape)
x=Conv2D(...)(input_x)
...
y_pred1 = Conv2D(...)(x) # shape of (None, 80, 80, 2)
y_pred2 = Dense(...)(x) # shape of (None, 4)
y_merged = Concatenate(...)([y_pred1, y_pred2])
model = Model(input_x, y_merged)
y_pred1 and y_pred2 are the results I want the model to learn to predict.
But the loss function fcn1 for the y_pred1 branch needs the y_pred2 prediction results, so I have to concatenate the results of the two branches into y_merged so that fcn1 has access to y_pred2.
The problem is that I want to use the Concatenate layer to concatenate the y_pred2 (None, 4) output with the y_pred1 (None, 80, 80, 2) output, but I don't know how to do that.
How can I reshape the (None, 4) to (None, 80, 80, 1)? For example, by filling a (None, 80, 80, 1) tensor with the 4 elements of y_pred2 and zeros.
Is there any better solutions than using the Concatenate layer?
Maybe this extracted piece of code could help you:
tf.print(condi_input.shape)
# shape is TensorShape([None, 1])
condi_i_casted = tf.expand_dims(condi_input, 2)
tf.print(condi_i_casted.shape)
# shape is TensorShape([None, 1, 1])
broadcasted_val = tf.broadcast_to(condi_i_casted, shape=tf.shape(decoder_outputs))
tf.print(broadcasted_val.shape)
# shape is TensorShape([None, 23, 256])
When you want to broadcast a value, first think about what exactly you want to broadcast. In this example, condi_input has shape (None, 1) and served as a condition for my encoder-decoder LSTM network. To match the dimensionality of the encoder states of the LSTM, I first had to use tf.expand_dims() to expand the condition value from a shape like [[1]] to [[[1]]].
This is what you need to do first. If you have a prediction as a softmax from the dense layers, you might want to use tf.argmax() first, so you only have one value, which is much easier to broadcast. However, it is also possible with 4, but keep in mind that the dimensions need to match: you cannot broadcast shape (None, 4) to shape (None, 6), but you can to shape (None, 8), since 8 is divisible by 4.
Then you can use tf.broadcast_to() to broadcast your value into the desired shape. You then have two tensors you can concatenate together.
Hope this helps you out.
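Applied to the shapes in this question, a possible sketch (my own illustration, with made-up names) is to reshape y_pred2 to (None, 1, 1, 4), broadcast it over the spatial grid of y_pred1, and then concatenate:
import tensorflow as tf
from tensorflow.keras.layers import Concatenate, Lambda

def tile_vector(tensors):
    vec, ref = tensors                             # vec: (None, 4), ref: (None, 80, 80, 2)
    vec = tf.reshape(vec, [-1, 1, 1, 4])           # (None, 1, 1, 4)
    target = tf.concat([tf.shape(ref)[:3], [4]], axis=0)
    return tf.broadcast_to(vec, target)            # (None, 80, 80, 4)

y_pred2_map = Lambda(tile_vector)([y_pred2, y_pred1])
y_merged = Concatenate(axis=-1)([y_pred1, y_pred2_map])  # (None, 80, 80, 6)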
Figured it out, the code is like this:
input_x = Input(shape=input_shape)
x=Conv2D(...)(input_x)
...
y_pred1 = Conv2D(...)(x) # shape of (None, 80, 80, 2)
y_pred2 = Dense(4)(x) # (None, 4)
# =========transform to concatenate:===========
y_pred2_matrix = Lambda(lambda x: K.expand_dims(K.expand_dims(x, -1)))(y_pred2) # (None,4, 1,1)
y_pred2_matrix = ZeroPadding2D(padding=((0,76),(0,79)))(y_pred2_matrix) # (None, 80, 80,1)
y_merged = Concatenate(axis=-1)([y_pred1, y_pred2_matrix]) # (None, 80, 80, 3)
The 4 elements of y_pred2 can be indexed as y_merged[:, :4, 0, 2]
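Inside a custom loss, both branches can then be sliced back out of y_merged; a hypothetical sketch (my own, not the asker's code):
import tensorflow as tf

def fcn1(y_true, y_merged):
    # Recover the two branches from the merged tensor (channel 2 holds the zero-padded y_pred2).
    y_pred1 = y_merged[..., :2]       # (None, 80, 80, 2) - the Conv2D branch
    y_pred2 = y_merged[:, :4, 0, 2]   # (None, 4)         - the Dense branch
    # ... compute the actual loss from y_true, y_pred1 and y_pred2 here
    return tf.reduce_mean(tf.square(y_true[..., :2] - y_pred1))  # placeholder objective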

Tensor reshape from (256, 256, 3) to (1,256, 256, 3)

My question is: how can I reshape the tensor?
Here is the syntax: tf.reshape(tensor, shape, name=None)
I do not know where I am going wrong, but I am not able to reshape it.
How can I do that?
Thanks.
# image is your tensor
axis = 0
tf.expand_dims(your_image, axis)
You can use tf.expand_dims:
output = tf.expand_dims(input, 0)
Try the following code:
#assuming `img` contains the data which is in the format (256,256,3)
output = tf.reshape(img, [1, 256, 256, 3]) # 1 is batch size
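For completeness, indexing with tf.newaxis (TensorFlow's analogue of None) also works:
output = img[tf.newaxis, ...]  # (256, 256, 3) -> (1, 256, 256, 3)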

Why can't I reshape (None, 375) to (25, 15) using tf.reshape()

There is a 25*15 image, and I want to identify what it is using a CNN.
When training my CNN, I input a NumPy array named 'imgs' as the dataset, whose shape is (200, 375):
sess.run(train, feed_dict={X: imgs, Y: labels})
This array contains 200 samples, each of which has 375 features.
But when I reshape the placeholder to a (-1, 25, 15, 1) tensor:
X = tf.placeholder(tf.float32, [None, 375])
X = tf.reshape(X,[-1,25,15,1])
this error occurs:
Cannot feed value of shape (200, 375) for Tensor 'Reshape:0', which has shape '(?, 25, 15, 1)'
I don't know why it doesn't work; 25*15 is indeed 375.
Thank you!
You don't reshape the array you are feeding to the placeholder. You have to reshape your imgs variable as well, into shape (-1, 25, 15, 1).
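A minimal sketch of that fix (assuming imgs is the NumPy array being fed):
# reshape the data before feeding it, so it matches the (?, 25, 15, 1) target
imgs = imgs.reshape(-1, 25, 15, 1)
sess.run(train, feed_dict={X: imgs, Y: labels})
Alternatively, assigning the reshaped tensor to a new variable (e.g. X_img = tf.reshape(X, [-1, 25, 15, 1])) keeps the original placeholder available, so the flat (200, 375) array can still be fed.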

Tensorflow conv2d_transpose output_shape

I want to implement a generative adversarial network (GAN) with an unfixed input size, i.e. a 4-D tensor (batch_size, None, None, 3).
But when I use conv2d_transpose, there is an output_shape parameter, and it must be given the true size after the deconvolution operation.
For example, if the size of batch_img is (64, 32, 32, 128) and w is a weight of shape (3, 3, 64, 128), then after
deconv = tf.nn.conv2d_transpose(batch_img, w, output_shape=[64, 64, 64, 64],stride=[1,2,2,1], padding='SAME')
I get deconv with size (64, 64, 64, 64); it's fine if I pass the true size as output_shape.
But I want to use an unfixed input size (64, None, None, 128) and get deconv of shape (64, None, None, 64).
However, it raises the error below.
TypeError: Failed to convert object of type <type 'list'> to Tensor...
So, what can I do to avoid this parameter in deconv? Or is there another way to implement a GAN with an unfixed input size?
The output_shape list cannot contain None, because None cannot be converted to a Tensor object; None is only allowed in the shapes of tf.placeholder.
For a varying-size output_shape, instead of None try -1. For example, if you want size (64, None, None, 128), try [64, -1, -1, 128]. I am not exactly sure whether this will work; it worked for me for the batch size, i.e. my first argument was not of fixed size, so I used -1.
However, there is also a high-level API for transposed convolution, tf.layers.conv2d_transpose().
I am sure the high-level API tf.layers.conv2d_transpose() will work for you, because it takes tensors of varying input sizes.
You do not even need to specify the output shape; you just need to specify the output channels and the kernel to be used.
For more details: https://www.tensorflow.org/api_docs/python/tf/layers/conv2d_transpose... I hope this helps.
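A rough sketch of what that could look like for the shapes in the question (my own illustration; the kernel size and stride are placeholders):
# batch size is fixed at 64, spatial dimensions are unknown, channels are 128
batch_img = tf.placeholder(tf.float32, [64, None, None, 128])
deconv = tf.layers.conv2d_transpose(batch_img, filters=64, kernel_size=3, strides=2, padding='same')
# deconv has shape (64, ?, ?, 64); the spatial size is resolved at run time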
I ran into this problem too. Using -1, as suggested in the other answer here, doesn't work. Instead, you have to grab the shape of the incoming tensor and construct the output_shape argument from it. Here's an excerpt from a test I wrote. In this case it's the first dimension that's unknown, but it should work for any combination of known and unknown dimensions.
output_shape = [8, 8, 4] # width, height, channels-out. Handle batch size later
xin = tf.placeholder(dtype=tf.float32, shape = (None, 4, 4, 2), name='input')
filt = tf.placeholder(dtype=tf.float32, shape = filter_shape, name='filter')
## Find the batch size of the input tensor and add it to the front
## of output_shape
dimxin = tf.shape(xin)
ncase = dimxin[0:1]
oshp = tf.concat([ncase,output_shape], axis=0)
z1 = tf.nn.conv2d_transpose(xin, filt, oshp, strides=[1,2,2,1], name='xpose_conv')
I found a solution: use tf.shape for the unspecified dimensions and get_shape() for the specified ones.
def get_deconv_lens(H, k, d):
    return tf.multiply(H, d) + k - 1

def deconv2d(x, output_shape, k_h=2, k_w=2, d_h=2, d_w=2, stddev=0.02, name='deconv2d'):
    # output_shape: the output_shape of deconv op
    shape = tf.shape(x)
    H, W = shape[1], shape[2]
    N, _, _, C = x.get_shape().as_list()
    H1 = get_deconv_lens(H, k_h, d_h)
    W1 = get_deconv_lens(W, k_w, d_w)

    with tf.variable_scope(name):
        w = tf.get_variable('weights', [k_h, k_w, C, x.get_shape()[-1]], initializer=tf.random_normal_initializer(stddev=stddev))
        biases = tf.get_variable('biases', shape=[C], initializer=tf.zeros_initializer())
        deconv = tf.nn.conv2d_transpose(x, w, output_shape=[N, H1, W1, C], strides=[1, d_h, d_w, 1], padding='VALID')
        deconv = tf.nn.bias_add(deconv, biases)

    return deconv
