I have a tf.Tensor of, for example, shape (31, 6, 6, 3).
I want to perform tf.signal.fft2d over the two axes of size 6, in other words the middle two. However, the documentation says:
Computes the 2-dimensional discrete Fourier transform over the inner-most 2 dimensions of input
I could do it with a for loop, but I fear it would be very inefficient. Is there a faster way?
The result must have the same shape as the input, of course.
Thanks to this I implemented this solution using tf.transpose:
in_pad = tf.transpose(in_pad, perm=[0, 3, 1, 2])
out = tf.signal.fft2d(tf.cast(in_pad, tf.complex64))
out = tf.transpose(out, perm=[0, 2, 3, 1])
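The transpose-then-transform pattern above can be sanity-checked in NumPy, used here only as a stand-in for TensorFlow. The array x and its random contents are assumptions for illustration; np.fft.fft2 accepts an axes argument, which gives a reference result to compare against.

```python
import numpy as np

# Hypothetical input with the shape from the question: (batch, 6, 6, channels).
rng = np.random.default_rng(0)
x = rng.standard_normal((31, 6, 6, 3))

# Move the two axes to transform to the end, transform, then move them back.
moved = np.transpose(x, (0, 3, 1, 2))    # (31, 3, 6, 6)
out = np.fft.fft2(moved)                 # FFT over the last two axes
out = np.transpose(out, (0, 2, 3, 1))    # back to (31, 6, 6, 3)

# Reference: NumPy can transform the middle axes directly.
reference = np.fft.fft2(x, axes=(1, 2))
print(out.shape, np.allclose(out, reference))
```

The second permutation, [0, 2, 3, 1], is the inverse of the first, which is why the output shape matches the input shape.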
Gist
Basically I want to increase the dimension of two axes of an n-dimensional tensor.
For some reason this operation seems very slow on bigger tensors.
If someone can give me a reason or better method I'd be very happy.
Goal
Going from shape (4, 8, 8, 4, 4, 4, 4, 4, 16, 8, 4, 4, 1) to (4, 32, 8, 4, 4, 4, 4, 4, 4, 8, 4, 4, 1) takes roughly 170 seconds, and I'd like to improve on that. Below is a smaller example; finding the correct indices is not necessary here.
Example Code
Increase dimension (0,2) of tensor
import numpy as np

tensor = np.arange(16).reshape(2,2,4,1)
I = np.identity(4)
I tried 3 different methods:
np.kron
indices = [1,3,0,2]
result = np.kron(
I, tensor.transpose(indices)
).transpose(np.argsort(indices))
print(result.shape) # should be (8,2,16,1)
manual stacking
col = []
for i in range(4):
    # One block row: zeros everywhere except the i-th (diagonal) position.
    row = [np.zeros_like(tensor)]*4
    row[i] = tensor
    col.append(row)
result = np.array(col).transpose(0,2,3,1,4,5).reshape(8,2,16,1)
print(result.shape) # should be (8,2,16,1)
np.einsum
result = np.einsum("ij, abcd -> iabjcd", I, tensor).reshape(8,2,16,1)
print(result.shape) # should be (8,2,16,1)
Results
On my machine they performed the following (on the big example with complex entries):
np.einsum ~ 170s
manual stacking ~ 185s
np.kron ~ 580s
As Jérôme pointed out:
all your operations seems to involve a transposition which is known to be very expensive on modern hardware.
I reworked my algorithm to not rely on the dimensional increase by doing certain preprocessing steps. This indeed speeds up the overall process substantially.
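If the enlarged tensor is still needed, one option that avoids both the multiplication by I and the transposition is to allocate the result once and write the tensor into its diagonal blocks. The following is a minimal sketch on the small example from the question; the names n, out and expected are introduced here for illustration, and whether this is faster on the large tensor was not measured.

```python
import numpy as np

tensor = np.arange(16).reshape(2, 2, 4, 1)
n = 4  # size of the identity factor

# Reference result from the einsum variant in the question.
expected = np.einsum("ij,abcd->iabjcd", np.identity(n), tensor).reshape(8, 2, 16, 1)

# Allocate the 6-axis result once; its axes are (i, a, b, j, c, d).
out = np.zeros((n,) + tensor.shape[:2] + (n,) + tensor.shape[2:], dtype=tensor.dtype)
for i in range(n):
    # Only the blocks with i == j are non-zero.
    out[i, :, :, i] = tensor
result = out.reshape(8, 2, 16, 1)

print(np.array_equal(result, expected))
```

Because out is already laid out in the target axis order, the final reshape does not need to copy any data.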
I have a 512x512 image array and I want to perform operations on 8x8 blocks. At the moment I have something like this:
output = np.zeros((512, 512))
for i in range(0, 512, 8):
    for j in range(0, 512, 8):
        a = input[i:i+8, j:j+8]
        b = some_other_array[i:i+8, j:j+8]
        output[i:i+8, j:j+8] = np.dot(a, b)
where a & b are 8x8 blocks derived from the original array. I would like to speed up this code by using vectorised operations. I have reshaped my inputs like this:
input = input.reshape(64, 8, 64, 8)
some_other_array = some_other_array.reshape(64, 8, 64, 8)
How could I perform a dot product on only axes 1 & 3 to output an array of shape (64, 8, 64, 8)?
I have tried np.tensordot(input, some_other_array, axes=([0, 1], [2, 3])) which gives the correct output shape, but the values do not match the output from the loop above. I've also looked at np.einsum but I haven't come across a simple example with what I'm trying to achieve.
As you suspected, np.einsum can take care of this. If input and some_other_array have shapes (64, 8, 64, 8), then if you write
output = np.einsum('ijkl,ilkm->ijkm', input, some_other_array)
then output will also have shape (64, 8, 64, 8), where matrix multiplication (i.e. np.dot) has been done only on axes 1 and 3.
The string argument to np.einsum looks complicated, but really it's a combination of two things. First, matrix multiplication is given by jl,lm->jm (see e.g. this answer on einsum). Second, we don't want to do anything to axis 0 and 2, so for them I just write ik,ik->ik. Combining the two gives ijkl,ilkm->ijkm.
They'll work if you reorder them a bit. If input and some_other_array are both shaped (64,8,64,8), then:
input = input.transpose(0,2,1,3)
some_other_array = some_other_array.transpose(0,2,1,3)
This will reorder them to 64,64,8,8. At this point you can compute a matrix multiplication. Do note that you need matmul to compute the block products, and not dot, which will try to multiply the entire matrices.
output = input @ some_other_array
output = output.transpose(0,2,1,3)
output = output.reshape(512,512)
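Both answers can be checked against the original loop. The sketch below uses a smaller 32x32 array so that it runs quickly; the names a_img, b_img, n and blk are assumptions for illustration and replace input and some_other_array so that the Python builtin input is not shadowed.

```python
import numpy as np

n, blk = 32, 8    # smaller than 512 to keep the check fast
rng = np.random.default_rng(0)
a_img = rng.standard_normal((n, n))
b_img = rng.standard_normal((n, n))

# Reference: the explicit loop over 8x8 blocks.
expected = np.zeros((n, n))
for i in range(0, n, blk):
    for j in range(0, n, blk):
        expected[i:i+blk, j:j+blk] = a_img[i:i+blk, j:j+blk] @ b_img[i:i+blk, j:j+blk]

a4 = a_img.reshape(n // blk, blk, n // blk, blk)
b4 = b_img.reshape(n // blk, blk, n // blk, blk)

# Variant 1: einsum contracting only the within-block axes.
via_einsum = np.einsum("ijkl,ilkm->ijkm", a4, b4).reshape(n, n)

# Variant 2: move the block axes to the end and use batched matmul.
prod = a4.transpose(0, 2, 1, 3) @ b4.transpose(0, 2, 1, 3)
via_matmul = prod.transpose(0, 2, 1, 3).reshape(n, n)

print(np.allclose(via_einsum, expected), np.allclose(via_matmul, expected))
```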
I am trying to create a custom layer that is similar to Max Pooling or the first step of a separable convolution.
For example with a 2-Tensor in which I want to extract the non-overlapping 2x2 patches:
if I have the [4,4] tensor
[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9,10,11],
[12,13,14,15]]
I want to end up with the following [2,2,4] Tensor
[[[ 0, 1, 4, 5],[ 2, 3, 6, 7]],
[[ 8, 9,12,13],[10,11,14,15]]]
For a 3-Tensor, I want something similar but to also separate out the 3rd dimension. tf.extract_image_patches almost does what I want, but it folds the "depth" dimension into each patch.
Ideally if I had a tensor of shape [32,64,7] and wanted to extract all the [2,2] patches out of it: I would end up with a shape of [16,32,7,4]
To be clear, I just want to extract the patches, not to actually do max pooling nor separable convolution.
Since I am not actually augmenting the data, I suspect that you can do it with some tf.reshape trickery... Is there any nice way to achieve this in tensorflow without resorting to slicing+stitching/for loops?
Also, what is the correct terminology for this operation? Windowing? Tiling?
Turns out this is really easy to do with tf.transpose. The solution that ended up working for me is:
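# Assume x is in BHWC form with statically known height, width and channels
def pool(x, size=2):
    channels = x.get_shape().as_list()[-1]
    x = tf.extract_image_patches(
        x,
        ksizes=[1, size, size, 1],
        strides=[1, size, size, 1],
        rates=[1, 1, 1, 1],
        padding="SAME"
    )
    # Each patch comes back flattened as (size*size, channels); split it again.
    x = tf.reshape(x, [-1] + x.get_shape().as_list()[1:3] + [size**2, channels])
    # Put channels ahead of the patch elements: [B, H/size, W/size, C, size*size]
    x = tf.transpose(x, [0, 1, 2, 4, 3])
    return x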
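The "reshape trickery" the question asks about can also be sketched with a plain reshape and transpose. The example below uses NumPy as a stand-in (tf.reshape and tf.transpose follow the same row-major semantics); the function name extract_patches is hypothetical, and the sketch assumes the height and width are divisible by the patch size.

```python
import numpy as np

def extract_patches(x, size=2):
    """Split an (H, W, C) array into non-overlapping size x size patches.

    Returns shape (H // size, W // size, C, size * size).
    Assumes H and W are divisible by size.
    """
    h, w, c = x.shape
    x = x.reshape(h // size, size, w // size, size, c)
    x = x.transpose(0, 2, 4, 1, 3)    # (H/size, W/size, C, size, size)
    return x.reshape(h // size, w // size, c, size * size)

small = np.arange(16).reshape(4, 4, 1)
patches = extract_patches(small)[:, :, 0, :]    # drop the single channel
print(patches.tolist())
# [[[0, 1, 4, 5], [2, 3, 6, 7]], [[8, 9, 12, 13], [10, 11, 14, 15]]]

print(extract_patches(np.zeros((32, 64, 7))).shape)    # (16, 32, 7, 4)
```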
I use tensorflow in python easily for math ops such as reduce_sum or reduce_mean like this
array = np.ndarray(shape=(2, 2, 3), buffer=np.array([[[1, 2, 3], [4, 5, 6]],
[[7, 8, 9], [10, 11, 12]]]),
dtype=int)
mean = tf.reduce_mean(array)
sum = tf.reduce_sum(array)
with tf.Session() as sess:
    print(sess.run(mean))
    print(sess.run(sum))
From this I get the mean and the sum of a tensor as a single value. However, when I do these ops in C++, I run into a problem, for example:
Sum(root.WithOpName("sum"), tensor_input, 1)
In this example, the second param tensor_input is a tensor of shape [1, 160, 160, 3].
Unlike in Python, I have to set the third parameter to an axis number in the range [-rank, rank). This does not give me what I want, which is the sum of all values in the tensor as in Python; instead it computes the sum of elements across the given dimension only. How can I get the same result as in Python, summing all values into a single value?
It would be helpful if anyone can help me
I have solved it. To reduce the sum or mean over the whole tensor, for example a tensor of shape [1, 160, 160, 3], pass all of its axes like this:
Sum(root.WithOpName("sum"), tensor_input, {0, 1, 2, 3})
The last parameter lists every axis index in the range [0, rank(tensor_input)).
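The difference between reducing one axis and reducing every axis can be illustrated in NumPy, used here only as an analogy for the C++ op. The small array x is a stand-in for the [1, 160, 160, 3] tensor and is an assumption for illustration.

```python
import numpy as np

x = np.arange(24).reshape(1, 2, 4, 3)    # small stand-in for the real tensor

partial = np.sum(x, axis=1)                     # reduces axis 1 only
total = np.sum(x, axis=tuple(range(x.ndim)))    # reduces axes 0, 1, 2, 3

print(partial.shape)    # (1, 4, 3)
print(total)            # 276
```

Listing every axis from 0 to rank - 1 is what collapses the tensor to a single value.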
I'm trying to efficiently replicate numpy's ndarray.choose() method.
Here's a numpy example of what I'm looking for:
b = np.arange(15).reshape(3, 5)
c = np.array([1,0,4])
c.choose(b.T) # trying to replicate in tensorflow
-> array([ 1, 5, 14])
The best I've been able to do with this is generate a batch_size square matrix (which is huge if batch size is huge) and take the diagonal of it:
tf_b = tf.constant(b)
tf_c = tf.constant(c)
sess.run(tf.diag_part(tf.gather(tf.transpose(tf_b), tf_c)))
-> array([ 1, 5, 14])
Is there a way to do this that is just linear in the first dimension (instead of squared)?
Yeah, there's an easier way to do this. Flatten your b array to 1-d, so it's [0, 1, 2, ..., 13, 14]. Take an array of indices that are in the range of the number of 'choices' you are taking (3 in your case). That will be [0, 1, 2]. Multiply this range by the second dimension of your original shape, which is the number of options for each choice (5 in your case). That gives you [0, 5, 10]. Then add your indices to this to obtain [1, 5, 14]. Now you're good to call tf.gather().
Here is some code that I've taken from here that does a similar thing for RNN outputs. Yours will be slightly different, but the idea is the same.
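def last_relevant(output, length):  # assumed wrapper; output is [batch, max_length, out_size]
    batch_size = tf.shape(output)[0]
    max_length = tf.shape(output)[1]
    out_size = int(output.get_shape()[2])
    index = tf.range(0, batch_size) * max_length + (length - 1)
    flat = tf.reshape(output, [-1, out_size])
    relevant = tf.gather(flat, index)
    return relevant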
In the big picture, the operation is pretty straightforward. You use the range operation to get the index of the beginning of each row, then add the index of where you are in each row. I think doing it in 1D is easiest, so that's why we flatten it.
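Applied to the arrays from the question, the flat-index recipe looks like the sketch below. It is written in NumPy as a stand-in; in TensorFlow, tf.range, tf.reshape and tf.gather would play the roles of np.arange, reshape and the final indexing. The names num_rows, num_cols, flat_index and picked are introduced here for illustration.

```python
import numpy as np

b = np.arange(15).reshape(3, 5)
c = np.array([1, 0, 4])

num_rows, num_cols = b.shape
# Start of each row in the flattened array, plus the chosen column in that row.
flat_index = np.arange(num_rows) * num_cols + c    # [0, 5, 10] + [1, 0, 4]
picked = b.reshape(-1)[flat_index]

print(flat_index.tolist())    # [1, 5, 14]
print(picked.tolist())        # [1, 5, 14]
```

The work is linear in the number of rows, since only one element per row is ever gathered.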