Apply dilation rate individually to each dimension with tf.nn.atrous_conv2d? - python

I have a tensor with shape B x H x W x C and I'd like to apply dilation only along H. Do you know any way to achieve this with tf.nn.atrous_conv2d?

There is no such feature in tf.nn.atrous_conv2d, but you can use tf.layers.conv2d and set dilation_rate=(2, 1) to achieve the same effect.
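A minimal sketch of that call (TF 1.x; the filter count, kernel size and padding below are only illustrative):

import tensorflow as tf

# Illustrative input: batch of 2, 32x32 spatial, 3 channels.
x = tf.placeholder(tf.float32, [2, 32, 32, 3])

# dilation_rate=(2, 1): dilate along H only, leave W undilated.
y = tf.layers.conv2d(x, filters=16, kernel_size=3,
                     dilation_rate=(2, 1), padding='same')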

Related

Python: [PyTorch] Selectively add along axes without using a loop

Let's say I have a tensor x with dimensions [batch, channels, H, W], and another tensor b that holds a bias value for each channel, with dims [channels,].
I want y = x + b (per sample).
Is there a nice way to broadcast this over H and W for each channel, for each sample in the batch, without using a loop?
If I'm convolving, I know I can use the bias field of the convolution to achieve this, but I'm wondering whether it can be done with primitive ops alone (no explicit looping).
Link to PyTorch forum question
y = x + b[None, :, None, None] (basically, insert singleton axes so b lines up with x's axis layout)
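For example (sizes are illustrative):

import torch

batch, channels, H, W = 4, 3, 8, 8
x = torch.randn(batch, channels, H, W)
b = torch.randn(channels)

# Insert singleton axes so b broadcasts over batch, H and W.
y = x + b[None, :, None, None]          # shape (batch, channels, H, W)

# Equivalent spelling with view:
y2 = x + b.view(1, channels, 1, 1)
assert torch.allclose(y, y2)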

numpy non canonical dot product in higher dimension

I'm trying to vectorize a loop with NumPy, but I'm stuck.
I have a matrix A of shape (NN, NN) and I define the A-dot product by
def scalA(u, v):
    return v.T @ A @ u
Then I have two matrices B and C (B has shape (N, NN) and C has shape (K, NN)). The loop I'm trying to vectorize is
res = np.zeros((N, K))
for n in range(N):
    for k in range(K):
        res[n, k] = scalA(B[n, :], C[k, :])
During my research I found functions like np.tensordot and np.einsum, but I haven't really understood how they work, and (if I've understood correctly) tensordot computes the canonical dot product (which would correspond to A = np.eye(NN) in my case).
Thanks!
np.einsum('ni,ji,kj->nk', B,A,C)
I think this works. I wrote it 'by eye' without testing.
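A quick way to convince yourself it matches the loop (small random sizes, chosen just for the check):

import numpy as np

NN, N, K = 5, 3, 4
A = np.random.rand(NN, NN)
B = np.random.rand(N, NN)
C = np.random.rand(K, NN)

# Loop version from the question: res[n, k] = C[k].T @ A @ B[n]
res_loop = np.zeros((N, K))
for n in range(N):
    for k in range(K):
        res_loop[n, k] = C[k, :].T @ A @ B[n, :]

res_einsum = np.einsum('ni,ji,kj->nk', B, A, C)
print(np.allclose(res_loop, res_einsum))   # True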
Probably you're looking for this:
def scalA(u, v):
    return u @ A @ v.T
If shape of A is (NN,NN), shape of B is (N,NN), and shape of C is (K,NN), the result of scalA(B,C) has shape (N,K)
If shape of A is (NN,NN), shape of B is (NN,), and shape of C is (NN,), the result of scalA(B,C) is just a scalar.
However, if you're expecting B and C to have even higher dimensionality (greater than 2), this may need further tweaking. (I could not tell from your question whether that's the case)
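For the 2-D case, a small self-contained check of this variant (note it matches the original loop when A is symmetric, the usual situation for a matrix defining an inner product; for a non-symmetric A the loop effectively uses A transposed relative to this formula):

import numpy as np

NN, N, K = 5, 3, 4
A = np.random.rand(NN, NN)
A = A + A.T                              # symmetric A
B = np.random.rand(N, NN)
C = np.random.rand(K, NN)

res_matmul = B @ A @ C.T                 # scalA(B, C) from this answer, shape (N, K)
res_loop = np.array([[C[k] @ A @ B[n] for k in range(K)] for n in range(N)])
print(np.allclose(res_matmul, res_loop))   # True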

What are b, y, x and c which get flattened and returned along with the max-pooled features in tf.nn.max_pool_with_argmax?

I went through the documentation of tf.nn.max_pool_with_argmax where it is written
Performs max pooling on the input and outputs both max values and indices.
The indices in argmax are flattened, so that a maximum value at position [b, y, x, c] becomes flattened index ((b * height + y) * width + x) * channels + c.
The indices returned are always in [0, height) x [0, width) before flattening, even if padding is involved and the mathematically correct answer is outside (either negative or too large). This is a bug, but fixing it is difficult to do in a safe backwards compatible way, especially due to flattening.
The variables b, y, x and c aren't explicitly defined, so I was having trouble using this method. Can someone please clarify what they refer to?
I am unable to comment due to reputation.
But I think the variables describe the position of each maximum in the input tensor: b is the batch index, y and x are the row and column of the maximum value, and c is its channel index. The formula above simply turns that 4-D position [b, y, x, c] into a single index into the flattened input.
If you are having a problem implementing max pooling with argmax it has little to do with these variables. You might want to specify the issue you are having with Max Pooling.
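A small sketch that makes the flattening formula concrete (TF 1.x; note that in some older versions this op only had a GPU kernel):

import numpy as np
import tensorflow as tf

# Toy input: batch=1, height=4, width=4, channels=1, values 0..15 row-major.
inp = tf.constant(np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1))

pooled, argmax = tf.nn.max_pool_with_argmax(
    inp, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

with tf.Session() as sess:
    p, a = sess.run([pooled, argmax])

# Each entry of `a` is ((b * height + y) * width + x) * channels + c,
# i.e. the flat position of the max within the input tensor.
# The first 2x2 window's max (value 5) sits at b=0, y=1, x=1, c=0:
# ((0 * 4 + 1) * 4 + 1) * 1 + 0 = 5.
print(a[0, 0, 0, 0])   # 5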

PyTorch Linear layer input dimension mismatch

I'm getting this error when passing the input data to the Linear (fully connected) layer in PyTorch:
matrices expected, got 4D, 2D tensors
I fully understand the problem since the input data has a shape (N,C,H,W) (from a Convolutional+MaxPool layer) where:
N: Data Samples
C: Channels of the data
H,W: Height and Width
Nevertheless, I was expecting PyTorch to do the "reshaping" of the data from:
[N, D1, ..., Dn] --> [N, D] where D = D1 * D2 * ... * Dn
I tried to reshape the Variable.data, but I've read that this approach is not recommended, since the gradients will keep the previous shape and, in general, you should not mutate a Variable's .data shape.
I am pretty sure there is a simple solution that goes along with the framework, but I haven't found it.
Is there a good solution for this?
PS: The fully connected layer has input size C * H * W.
After reading some examples I found the solution. Here is how you do it without messing up the forward/backward pass flow:
(_, C, H, W) = x.data.size()
x = x.view(-1, C * H * W)
A more general solution (would work regardless of how many dimensions x has) is to take the product of all dimension sizes but the first one (the "batch size"):
import numpy as np

n_features = np.prod(x.size()[1:])
x = x.view(-1, n_features)
It is common to save the batch size and infer the other dimension in a flatten:
batch_size = x.shape[0]
...
x = x.view(batch_size, -1)
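Putting it together, a minimal sketch of where the flatten sits in a model's forward pass (the layer sizes are only illustrative):

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)
        # assuming 3 x 32 x 32 inputs, conv+pool leaves 8 x 16 x 16 features
        self.fc = nn.Linear(8 * 16 * 16, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv(x)))
        x = x.view(x.size(0), -1)        # flatten everything except the batch dim
        return self.fc(x)

net = Net()
out = net(torch.randn(4, 3, 32, 32))     # shape (4, 10)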

Dot product of patches in tensorflow

I have two square matrices of the same size and the dimensions of a square patch. I'd like to compute the dot product between every pair of patches. Essentially I would like to implement the following operation:
def patch_dot(A, B, patch_dim):
    res_dim = A.shape[0] - patch_dim + 1
    res = np.zeros([res_dim, res_dim, res_dim, res_dim])
    for i in xrange(res_dim):
        for j in xrange(res_dim):
            for k in xrange(res_dim):
                for l in xrange(res_dim):
                    res[i, j, k, l] = (A[i:i + patch_dim, j:j + patch_dim] *
                                       B[k:k + patch_dim, l:l + patch_dim]).sum()
    return res
Obviously this would be an extremely inefficient implementation. Tensorflow's tf.nn.conv2d seems like a natural solution to this as I'm essentially doing a convolution, however my filter matrix isn't fixed. Is there a natural solution to this in Tensorflow, or should I start looking at implementing my own tf-op?
The natural way to do this is to first extract the overlapping image patches of matrix B using tf.extract_image_patches, then apply tf.nn.conv2d to A with each B sub-patch as the filter, using tf.map_fn.
Note that before using tf.extract_image_patches and tf.nn.conv2d you need to reshape your matrices into 4-D tensors of shape [1, width, height, 1] with tf.reshape.
Also, before using tf.map_fn, you need to rearrange the patches (e.g. with tf.transpose or tf.reshape) so that the B sub-patches are indexed by the first dimension of the tensor you pass as the elems argument of tf.map_fn.
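A sketch of that recipe in TF 1.x (assuming A and B are square 2-D tensors with static shapes; the function name is mine, not an existing API):

import tensorflow as tf

def patch_dot_tf(A, B, patch_dim):
    size = int(A.shape[0])
    res_dim = size - patch_dim + 1

    # conv2d / extract_image_patches expect 4-D tensors: [1, height, width, 1].
    A4 = tf.reshape(A, [1, size, size, 1])
    B4 = tf.reshape(B, [1, size, size, 1])

    # All overlapping patch_dim x patch_dim patches of B:
    # shape [1, res_dim, res_dim, patch_dim * patch_dim]
    patches = tf.extract_image_patches(
        B4, ksizes=[1, patch_dim, patch_dim, 1],
        strides=[1, 1, 1, 1], rates=[1, 1, 1, 1], padding='VALID')

    # One conv filter per B patch: [res_dim * res_dim, patch_dim, patch_dim, 1, 1]
    filters = tf.reshape(patches, [res_dim * res_dim, patch_dim, patch_dim, 1, 1])

    def conv_with_patch(f):
        # Dot product of this B patch with every patch of A.
        out = tf.nn.conv2d(A4, f, strides=[1, 1, 1, 1], padding='VALID')
        return tf.reshape(out, [res_dim, res_dim])

    # Map over the B patches (first dimension of `filters`).
    res = tf.map_fn(conv_with_patch, filters)                     # [k*l, i, j]
    res = tf.reshape(res, [res_dim, res_dim, res_dim, res_dim])   # [k, l, i, j]
    return tf.transpose(res, [2, 3, 0, 1])                        # [i, j, k, l], as in the loop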
