Equivalent to torch.rfft() in the newest PyTorch version

I want to compute the Fourier transform of an image of size BxCxWxH.
In a previous torch version, the following did the job:
fft_im = torch.rfft(img, signal_ndim=2, onesided=False)
and the output was of size:
BxCxWxHx2
However, with the new version of rfft:
fft_im = torch.fft.rfft2(img, dim=2, norm=None)
I do not get the same results. Am I missing something?

A few issues
The dim argument you passed is an invalid type: it should be a tuple of two ints, or omitted entirely. Arguably the fact that this ran without raising an exception is itself a bug in PyTorch (I opened a ticket saying as much).
PyTorch now supports complex tensor types, so FFT functions return complex tensors instead of adding a new dimension for the real/imaginary parts. You can use torch.view_as_real to convert to the old representation. Note that view_as_real doesn't copy data, since it returns a view, so it shouldn't slow things down in any noticeable way.
PyTorch no longer gives the option of disabling the one-sided calculation in RFFT, probably because disabling one-sided makes the result identical to torch.fft.fft2, which conflicts with the 13th aphorism of PEP 20. The whole point of providing a special real-valued version of the FFT is that you only need to compute roughly half the values along the last transformed dimension, since the rest can be inferred via Hermitian symmetry (see the quick shape check below).
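For illustration, a quick shape check (assuming a real-valued img of shape BxCxWxH, here 2x3x8x8):
import torch
img = torch.randn(2, 3, 8, 8)       # B x C x W x H
torch.fft.rfft2(img).shape          # torch.Size([2, 3, 8, 5]) -> one-sided: H//2 + 1 along the last dim
torch.fft.fft2(img).shape           # torch.Size([2, 3, 8, 8]) -> full two-sided spectrum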
So, from all that, you should be able to use:
fft_im = torch.view_as_real(torch.fft.fft2(img))
Important: If you're going to pass fft_im to other functions in torch.fft (like fft.ifft or fft.fftshift), you'll need to convert back to the complex representation using torch.view_as_complex, so those functions don't interpret the last dimension as a signal dimension.
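Putting it together, a minimal sketch of the round trip (img is assumed to be a real float tensor of shape BxCxWxH):
import torch

img = torch.randn(4, 3, 32, 32)                  # B x C x W x H
fft_complex = torch.fft.fft2(img)                # complex tensor, B x C x W x H
fft_im = torch.view_as_real(fft_complex)         # real view, B x C x W x H x 2 (old torch.rfft layout)

# Before calling other torch.fft functions, restore the complex view:
img_back = torch.fft.ifft2(torch.view_as_complex(fft_im)).real
assert torch.allclose(img_back, img, atol=1e-5)  # round trip recovers the image (up to numerical error)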

Related

Parallelizing tensor operations in TensorFlow

I'm trying to parallelize different tensor operations. I'm aware that tf.vectorized_map and/or tf.map_fn can parallelize input tensor(s) with respect to their first axis, but that's not what I'm looking for. I'm looking for a way to parallelize a for loop on a set of tensors with possibly different shapes.
a = tf.ones((2))
b = tf.ones((2,2))
list_of_tensors = [a,b*2,a*3]
for t in list_of_tensors:
    # some operation on t, which may vary depending on its shape
Is there a possible way to parallelize this for loop on GPU with TensorFlow? (I'm open to any other library if possible i.e. JAX, numba etc.)
Thanks!
According to the tf.vectorized_map documentation,
The shape and dtype of any intermediate or output tensors in the
computation of fn should not depend on the input to fn.
I'm struggling with this problem myself. I think the answer is the one suggested in the comments: if you know the maximum length your tensors can have, represent each variable-length tensor by a fixed maximum-length tensor plus an integer giving its actual length. Whether this helps at all depends on what counts as "any intermediate", because at some point you may still need the result for the actual, shorter tensor in your computation, which turns into a bit of a tail-chasing exercise. This part of TensorFlow is extremely frustrating: it's very hacky to get things to work that should be easy, especially when you want true GPU parallelism for deterministic matrix algorithms outside the context of machine learning.
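For what it's worth, here is a minimal sketch of that padding idea. It assumes the tensors can be flattened to 1-D, the maximum number of elements (max_len) is known up front, and the per-tensor operation is a masked sum, a stand-in for whatever you actually compute:
import tensorflow as tf

a = tf.ones((2,))
b = tf.ones((2, 2))
list_of_tensors = [a, b * 2, a * 3]
max_len = 4  # known maximum number of elements

flat = [tf.reshape(t, [-1]) for t in list_of_tensors]
lengths = tf.stack([tf.size(t) for t in flat])                       # actual lengths: [2, 4, 2]
padded = tf.stack([tf.pad(t, [[0, max_len - t.shape[0]]]) for t in flat])  # shape (3, max_len)

def masked_op(args):
    values, length = args
    mask = tf.cast(tf.range(max_len) < length, values.dtype)
    return tf.reduce_sum(values * mask)  # ignore the padded entries

result = tf.vectorized_map(masked_op, (padded, lengths))  # [2., 8., 6.]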
This might work inside the loop:
tf.autograph.experimental.set_loop_options(
    shape_invariants=[(v, tf.TensorShape([None]))]  # v: the loop-carried tensor whose shape varies
)
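Note that set_loop_options only has an effect inside a loop that AutoGraph converts, i.e. inside a tf.function, and v is the loop-carried tensor whose shape changes between iterations. A minimal sketch of that usage:
import tensorflow as tf

@tf.function
def grow(n):
    v = tf.zeros([0], dtype=tf.float32)
    for _ in tf.range(n):
        tf.autograph.experimental.set_loop_options(
            shape_invariants=[(v, tf.TensorShape([None]))]  # v may change length each iteration
        )
        v = tf.concat([v, tf.ones([1])], axis=0)
    return v

grow(tf.constant(3))  # tf.Tensor([1. 1. 1.], shape=(3,), dtype=float32)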

Why does PyTorch gather function require index argument to be of type LongTensor?

I'm writing some code in PyTorch and I came across the gather function. Checking the documentation, I saw that the index argument must be a LongTensor. Why is that? Why does it need a LongTensor instead of another type, such as an IntTensor? What are the benefits?
By default, all indices in PyTorch are represented as 64-bit long tensors, which allows indexing tensors with more than 2^31 - 1 elements (the maximum value of a regular 32-bit int).
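For example (a small sketch; if your indices arrive as int32, cast them with .long() before gathering):
import torch

x = torch.arange(12).reshape(3, 4)
idx = torch.tensor([[0], [2], [1]])     # dtype is torch.int64 (long) by default
torch.gather(x, 1, idx)                 # tensor([[0], [6], [9]])

idx32 = idx.to(torch.int32)             # e.g. indices loaded from an int32 source
torch.gather(x, 1, idx32.long())        # cast back to long before gathering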

Can I input a Byte Tensor to my RNN/LSTM model?

I am developing an RNN/LSTM model for which I want to encode the input sequence as a ByteTensor to save memory, since I am working under a very tight memory budget. However, when I do so, the model returns the following error:
Expected object of scalar type Byte but got scalar type Float for argument #2 'mat2'
So there seems to be something else that needs to be a ByteTensor as well, but I do not know what it is, since the console only shows an error at the line:
output = model(predictor)
It means that inside the model there are float tensors being used to operate on your byte tensor (most likely as operands in matrix multiplications, additions, etc.). Technically you could cast them to byte by executing model.type(torch.uint8), but that approach will fail sooner or later anyway: since integers are discrete, there is no way to use them in the gradient calculations needed for backpropagation. uint8 values can be used in deep learning to improve the performance and memory footprint of inference in a network that is already trained, but that is an advanced technique. For this task your best bet is regular float32. If your GPU supports it, you could also use float16 (aka half), though it introduces additional complexity and I wouldn't suggest it for beginners.
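A small sketch of the usual pattern, using the model and predictor names from your code: store the data compactly as uint8, but cast each batch to float32 (or float16 on a supporting GPU) just before the forward pass:
import torch

# stored compactly as bytes (e.g. token ids or quantized features)
predictor_uint8 = torch.randint(0, 256, (32, 100), dtype=torch.uint8)

predictor = predictor_uint8.float()   # cast on the fly; only the current batch lives as float32
# output = model(predictor)

# Optional: half precision on a GPU that supports it
# model.half()
# output = model(predictor_uint8.cuda().half())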

What's the difference between tf.expand_dims and tf.newaxis in Tensorflow?

Hi, I am new to TensorFlow.
I want to change the dimension of a Tensor, and I found three methods to do it, like below:
a = tf.constant([[1,2,3],[4,5,6]]) # shape (2,3)
# change the dimension of a to (2,3,1)
b = tf.expand_dims(a,2) # shape(2,3,1)
c = a[:,:,tf.newaxis] # shape(2,3,1)
d = tf.reshape(a,(2,3,1)) # shape(2,3,1)
Is there any difference among the three methods, e.g. in terms of performance?
Which method should I use?
There is no real difference between the three, but sometimes one or the other may be more convenient:
tf.expand_dims(a, 2): Convenient when you want to add one dimension and its index is variable (for example the result of another TensorFlow operation, or some function parameter). Depending on your style you may find it more readable, since it clearly expresses the intention of adding a dimension.
a[:,:,tf.newaxis]: Personally I use this a lot because I find it readable (maybe because I'm used to it from NumPy), although not in every case. Especially convenient if you want to add multiple dimensions (instead of calling tf.expand_dims multiple times). Also (obviously) if you want to take a slice and add new dimensions at the same time. However it is not usable with variable axis indices, and if you have many dimensions tf.expand_dims may be less confusing.
tf.reshape(a,(2,3,1)): Personally I rarely or never use this just to add a dimension, because it requires me to know and specify all (or all but one) of the remaining dimension sizes, and it may also be misleading when reading the code. However, if I need to reshape and add a dimension, I usually do it in the same operation.
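A quick sketch showing all three side by side, plus the variable-axis and multiple-dimension cases mentioned above:
import tensorflow as tf

a = tf.constant([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)

axis = 2                                    # expand_dims accepts a variable axis
b = tf.expand_dims(a, axis)                 # shape (2, 3, 1)
c = a[:, :, tf.newaxis]                     # shape (2, 3, 1)
d = tf.reshape(a, (2, 3, 1))                # shape (2, 3, 1), sizes spelled out explicitly

e = a[tf.newaxis, :, :, tf.newaxis]         # add several dims at once: shape (1, 2, 3, 1)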

Redundant multiple implementations of one function in scipy?

I'm using scipy to do some image processing work, and I found something quite confusing: some functions, say scipy.signal.convolve and scipy.ndimage.filters.convolve, have the same name and functionality, but they belong to different modules of scipy, so I wonder why they aren't just implemented once?
They do slightly different things, mostly related to how they handle the convolution when the two arrays being convolved don't fully overlap.
scipy.ndimage.filters.convolve always returns an array of the same size as its first parameter. To handle areas near the boundaries, where the second array may not fully overlap with the first, it fills in those values using one of these modes: reflect, constant, nearest, mirror or wrap.
scipy.signal.convolve always pads the arrays with zeros as needed and offers three mode options, full, valid or same, which determine the size of the returned array, depending on whether the values computed from the zero padding are kept or discarded.
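A tiny 1-D example of the boundary handling (scipy.ndimage.convolve is the same function, reachable directly from scipy.ndimage):
import numpy as np
from scipy import ndimage, signal

x = np.array([1.0, 2.0, 3.0])
k = np.array([1.0, 1.0, 1.0])

signal.convolve(x, k, mode='full')       # [1. 3. 6. 5. 3.]  zero-padded, longer output
signal.convolve(x, k, mode='same')       # [3. 6. 5.]        input length, still zero-padded
ndimage.convolve(x, k, mode='nearest')   # [4. 6. 8.]        edges filled with the nearest input value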
