Doubling input array when using numpy for fast Fourier transforms - python

I wrote a function that returns the fast Fourier transform of a real-valued grid.
import numpy as np

def take_FFT(x):
    # some arbitrary field for a 1D grid
    y = abs(1.0/x)
    # compute the (in general multi-dimensional) FFT of the real-valued array
    y_k = np.fft.rfftn(y)
    # compute the inverse FFT
    y_invk = np.fft.irfftn(y_k)
    return y, y_k, y_invk  # return the field, its Fourier transform, and the inverse transform
# initialize sample x
x_test = np.arange(-5,5,0.001)
field,FFT_test, inv_test = take_FFT(x_test)
How do I make an appropriate new "x array" to plot against the FFT? It is not clear to me how to make an array of length (n/2)+1, like the one that np.fft.rfftn returns.

Welcome to StackOverflow, @Messier!
If I understand your question correctly, you want to slice a numpy.array.
Suppose we have a numpy.array arr of length N. To slice it up to some length M (with M <= N), or up to (N/2)+1:
sliced_arr = arr[:M]
slice_half = arr[:N//2+1]
where N//2 performs integer (floor) division, so N//2+1 is a valid integer index.

The easiest way to get an array of frequencies to be used with np.fft.rfft is to make use of the convenient helper function np.fft.rfftfreq:
freqs = np.fft.rfftfreq(x_test.size, d=x_test[1] - x_test[0])
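With those frequencies in hand, plotting is straightforward. Here is a minimal sketch (assuming matplotlib is available, and reusing freqs from above together with FFT_test from the question):
import matplotlib.pyplot as plt
plt.plot(freqs, np.abs(FFT_test))  # magnitude of the transform against its frequencies
plt.xlabel("frequency")
plt.show()
Both freqs and FFT_test have length n//2 + 1, so they line up point for point.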
The multi-dimensional equivalent for np.fft.rfftn is slightly more complicated. You will need to get the frequencies along each axis, then use np.meshgrid:
per_axis_freq = [np.fft.fftfreq(N) for N in x_test.shape[0:-1]]
per_axis_freq.append(np.fft.rfftfreq(x_test.shape[-1]))
freqs = np.meshgrid(*per_axis_freq, indexing='ij')
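As a quick sanity check (a toy 2D example, not from the original answer), the frequency grids come out with the same shape as the rfftn output:
img = np.random.rand(8, 6)                        # hypothetical 2D real-valued field
img_k = np.fft.rfftn(img)                         # shape (8, 4) == (8, 6//2 + 1)
axes_freq = [np.fft.fftfreq(8), np.fft.rfftfreq(6)]
fx, fy = np.meshgrid(*axes_freq, indexing='ij')
print(img_k.shape, fx.shape, fy.shape)            # all (8, 4)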

Related

2D Integration over a flattened Array

I'm hoping to find a way around the solution offered here to use 2D arrays in order to do 2D numerical integration.
import numpy as np
ksize = 50
a = 1.0
kdom = np.pi / a
x = np.linspace(- kdom, kdom, ksize)
y = np.linspace(- kdom, kdom, ksize)
dk = x[1]-x[0]
X,Y = np.meshgrid(x,y)
eigval = np.cos(X)+np.cos(Y)
eigvalflat = eigval.flatten()
intval = np.trapz(np.trapz(eigval,x),y)
sumval = np.sum(eigvalflat)*dk/ksize
print(intval,sumval)
Given my dummy example above, I'd like to find a way to properly integrate the 1D array (eigvalflat) while it is still flattened, even though it represents a double integral.
Computationally, if the integrand is not separable, then the answer is that you can't recast the double integral as a single integral, unless you compute the integral one dimension at a time, which is what the assignment to intval is essentially doing.
Analytically, you'll have a better chance by asking yourself the question: given the 2d region of the integral (a rectangle in your example), can one find an integral over the boundary of that region? For that, Green's theorem has you covered with necessary and sufficient conditions.
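As an aside (not part of the original answer): the toy integrand in the question happens to be separable, a sum of one-variable terms, so in that special case the double integral does reduce to 1D integrals. A sketch using the question's variables:
# the double integral of cos(x) + cos(y) splits into two 1D integrals:
# integral = (y-range) * integral of cos(x) dx + (x-range) * integral of cos(y) dy
y_range = y[-1] - y[0]
x_range = x[-1] - x[0]
separable = y_range * np.trapz(np.cos(x), x) + x_range * np.trapz(np.cos(y), y)
print(intval, separable)  # the two agree: both are the same trapezoidal approximation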

Given a 2D Numpy array representing a 2D distribution, how to sample data from this distribution with the aid of Numpy or Scipy functions?

Given a 2D numpy array dist with shape (200, 200), where each entry of the array represents the joint probability of (x1, x2) for all x1, x2 ∈ {0, 1, ..., 199}. How do I sample bivariate data x = (x1, x2) from this probability distribution with the aid of the Numpy or Scipy API?
This solution works with probability distributions of any number of dimensions, assuming it is a valid probability distribution (its entries must be non-negative and sum to 1, etc.). It flattens the distribution, samples from that, and adjusts the random index to match the original array shape.
# Create a flat copy of the array
flat = array.flatten()
# Then, sample an index from the 1D array with the
# probability distribution from the original array
sample_index = np.random.choice(a=flat.size, p=flat)
# Take this index and adjust it so it matches the original array
adjusted_index = np.unravel_index(sample_index, array.shape)
print(adjusted_index)
Also, to get multiple samples, add a size keyword argument to the np.random.choice call, and modify adjusted_index before printing it:
adjusted_index = np.array(zip(*adjusted_index))
This is necessary because np.random.choice with a size argument outputs a list of indices for each coordinate dimension, so this zips them into a list of coordinate tuples. This is also much more efficient than simply repeating the first code.
Relevant documentation:
np.random.choice
np.unravel_index
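Putting the pieces together, here is a small self-contained sketch of the same approach (using a made-up 200x200 distribution, and applying np.unravel_index directly to the array of sampled indices, which sidesteps the zip issue discussed further down):
import numpy as np

dist = np.random.random((200, 200))   # toy joint distribution
dist /= dist.sum()                    # normalize so it sums to 1

flat = dist.flatten()
sample_index = np.random.choice(a=flat.size, p=flat, size=10)   # 10 flat indices
coords = np.unravel_index(sample_index, dist.shape)             # tuple of per-axis index arrays
points = np.column_stack(coords)                                # shape (10, 2): one (x1, x2) pair per row
print(points)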
Here's a way, but I'm sure there's a much more elegant solution using scipy.
numpy.random doesn't deal with 2d pmfs, so you have to do some reshaping gymnastics to go this way.
import numpy as np
# construct a toy joint pmf
dist=np.random.random(size=(200,200)) # here's your joint pmf
dist/=dist.sum() # it has to be normalized
# generate the set of all x,y pairs represented by the pmf
pairs=np.indices(dimensions=(200,200)).T # here are all of the x,y pairs
# make n random selections from the flattened pmf without replacement
# whether you want replacement depends on your application
n=50
inds=np.random.choice(np.arange(200**2),p=dist.reshape(-1),size=n,replace=False)
# inds is the set of n randomly chosen indices into the flattened dist array...
# therefore the random x,y selections
# come from selecting the associated elements
# from the flattened pairs array
selections = pairs.reshape(-1,2)[inds]
I can't comment either, but @applemonkey496's suggestion for getting multiple samples doesn't work as written. It's an excellent solution otherwise.
Instead of
adjusted_index = np.array(zip(*adjusted_index))
adjusted_index should be converted to a python list before being put into a numpy array (numpy arrays do not accept zipped objects), e.g.:
adjusted_index = np.array(list(zip(*adjusted_index)))
I can't comment, but to improve kevinkayaks' answer: the lines
pairs=np.indices(dimensions=(200,200)).T
selections = pairs.reshape(-1,2)[inds]
are not needed and can be replaced by
np.array([inds//m, inds%m]).T
where m is the number of columns (200 here). The "pairs" matrix is then no longer needed.

Reshaping numpy array

What I am trying to do is take a numpy array representing 3D image data and calculate the hessian matrix for every voxel. My input is a matrix of shape (Z,X,Y) and I can easily take a slice along z and retrieve a single original image.
gx, gy, gz = np.gradient(imgs)
gxx, gxy, gxz = np.gradient(gx)
gyx, gyy, gyz = np.gradient(gy)
gzx, gzy, gzz = np.gradient(gz)
And I can access the hessian for an individual voxel as follows:
x = 100
y = 100
z = 63
H = [[gxx[z][x][y], gxy[z][x][y], gxz[z][x][y]],
[gyx[z][x][y], gyy[z][x][y], gyz[z][x][y]],
[gzx[z][x][y], gzy[z][x][y], gzz[z][x][y]]]
But this is cumbersome and I can't easily slice the data.
I have tried using reshape as follows
H = H.reshape(Z, X, Y, 3, 3)
But when I test this by retrieving the hessian for a specific voxel, the value returned from the reshaped array is completely different from the one in the original arrays.
I think I could use zip somehow but I have only been able to find that for making lists of tuples.
Bonus: If there's a faster way to accomplish this, please let me know. I essentially need to calculate the three eigenvalues of the hessian matrix for every voxel in the 3D data set. Calculating the hessian values is really fast, but finding the eigenvalues for a single 2D image slice takes about 20 seconds. Are there any GPU- or TensorFlow-accelerated libraries for image processing?
We can use a list comprehension to get the hessians -
H_all = np.array([np.gradient(i) for i in np.gradient(imgs)]).transpose(2,3,4,0,1)
Just to give it a bit of explanation : [np.gradient(i) for i in np.gradient(imgs)] loops through the two levels of outputs from np.gradient calls, resulting in a (3 x 3) shaped tensor at the outer two axes. We need these two as the last two axes in the final output. So, we push those at the end with the transpose.
Thus, H_all holds all the hessians and hence we can extract our specific hessian given x,y,z, like so -
x = 100
y = 100
z = 63
H = H_all[z, x, y]  # the leading axes are ordered (Z, X, Y), matching the input array
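For the bonus part, one option (a sketch, not from the original answer) is numpy's batched symmetric eigensolver: np.linalg.eigvalsh broadcasts over leading axes and operates on the trailing (3, 3) matrices, so it can return the eigenvalues for every voxel in one call (the numerical hessian is symmetric up to finite-difference error, and eigvalsh only reads one triangle):
# eigenvalues of the (3, 3) hessian at every voxel, computed in one vectorized call
# H_all has shape (Z, X, Y, 3, 3), so the result has shape (Z, X, Y, 3)
eigvals = np.linalg.eigvalsh(H_all)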

scipy parallel interpolation of multiple arrays

I have multiple arrays of the same dimension, or rather a matrix, say
data.shape
# (n, m)
I want to interpolate the m-axis and leave the n-axis. Ideally I would get a function which I can call with an x-array of length n.
interpolated(x)
x.shape
# (n,)
I tried
from scipy import interpolate
interpolated = interpolate.interp1d(x=x_points, y=data)
interpolated(x).shape
# (n, n)
but this evaluates every array at every given point. Is there a better way to do it than ugly loops like
interpolated = array(interpolate.interp1d(x=x_points, y=array_) for array_ in data)
array(func_(xi) for func_, xi in zip(interpolated, x))
Your (n,m)-shaped data is, as you said, a collection of n datasets, each of length m. You're trying to pass this an n-length x array, and expect to obtain an n-length result. That is, you're querying the n independent datasets at n unrelated points.
This makes me believe that you need to use n independent interpolators. There is no real benefit in trying to get away with a single call to an interpolation routine. Interpolation routines as far as I know assume that the target of the interpolation is a single object. Either a multivariate function, or a function that has an array-shaped value; in either case you can query the function one (optionally higher-dimensional) point at a time. For instance, multilinear interpolation works across rows of the input, so there's (again, as far as I know) no way to "interpolate linearly along an axis". In your case, there is absolutely no relationship between the rows of your data, and there's no relationship between query points, so it's also semantically motivated to use n independent interpolators for your problem.
As for convenience, you can shove all those interpolating functions into a single function for ease of use:
interpolated = [interpolate.interp1d(x=x_points, y=array_) for array_ in data]
def common_interpolator(x):
    '''interpolate n separate datasets at n separate input points'''
    return np.array([fun(xx) for fun, xx in zip(interpolated, x)])
This will allow you to use a single call to common_interpolator with an input array_like of length n.
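For concreteness, a minimal self-contained usage sketch with made-up shapes (n = 4 datasets sampled at m = 10 common points; the data values are placeholders):
import numpy as np
from scipy import interpolate

x_points = np.linspace(0.0, 1.0, 10)   # common sample grid of length m = 10
data = np.random.rand(4, 10)           # n = 4 independent datasets

interpolated = [interpolate.interp1d(x=x_points, y=array_) for array_ in data]

def common_interpolator(x):
    '''interpolate n separate datasets at n separate input points'''
    return np.array([fun(xx) for fun, xx in zip(interpolated, x)])

x = np.random.rand(4)                  # one query point per dataset, inside [0, 1]
print(common_interpolator(x).shape)    # (4,)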
But since you mentioned it in comments, you can actually make use of np.vectorize if you want to feed multiple sets of query points to this function. Here's a complete example with three trivial dummy functions:
import numpy as np
# three scalar (well, or vectorized) functions:
funs = [lambda x,i=i: x+i for i in range(3)]
# define a wrapper for calling them together
def allfuns(xs):
    '''bundled call to functions: n-length input to n-length output'''
    return np.array([fun(x) for fun, x in zip(funs, xs)])
# define a vectorized version of the wrapper, (...,n) to (...,n)-shape
allfuns_vector = np.vectorize(allfuns,signature='(n)->(n)')
# print some examples
x = np.arange(3)
print([fun(xx) for fun,xx in zip(funs,x)])
# [0, 2, 4]
print(allfuns(x))
# [0 2 4]
print(allfuns_vector(x))
# [0 2 4]
print(allfuns_vector([x,x+10]))
#[[ 0 2 4]
# [10 12 14]]
As you can see, all of the above work the same way for a 1d input array. But we can pass a (k,n)-shaped array to the vectorized version and it will evaluate the functions row-wise, that is, each length-n row will be fed to the original function bundle. As far as I know np.vectorize is essentially a wrapper for a for loop, but at least it makes calling your functions more convenient.

How to generate a number of random vectors starting from a given one

I have an array of values and would like to create a matrix from that, where each row is my starting point vector multiplied by a sample from a (normal) distribution.
The number of rows of this matrix will then vary in dependence from the number of samples I want.
%pylab
my_vec = array([1,2,3])
my_rand_vec = my_vec*randn(100)
The last command does not work because the array shapes do not match.
I could think of using a for loop, but I am trying to leverage array operations.
Try this
my_rand_vec = my_vec[None,:]*randn(100)[:,None]
For a small number of samples I get, for example,
import numpy as np
my_vec = np.array([1,2,3])
my_rand_vec = my_vec[None,:]*np.random.randn(5)[:,None]
my_rand_vec
# array([[ 0.45422416, 0.90844831, 1.36267247],
# [-0.80639766, -1.61279531, -2.41919297],
# [ 0.34203295, 0.6840659 , 1.02609885],
# [-0.55246431, -1.10492863, -1.65739294],
# [-0.83023829, -1.66047658, -2.49071486]])
Your solution my_vec*randn(100) does not work because * performs element-wise multiplication, which only works if both arrays have compatible shapes.
What you have to do is add an additional dimension using [None,:] and [:,None] so that numpy's broadcasting works.
As a side note, I would recommend not using pylab. Instead, use explicit imports (e.g. import numpy as np) to include modules, as pointed out here.
It is the outer product of vectors:
my_rand_vec = numpy.outer(randn(100), my_vec)
You can pass the dimensions of the array you require to numpy.random.randn:
my_rand_vec = my_vec*np.random.randn(100,3)
To multiply each vector by the same random number, you need to add an extra axis:
my_rand_vec = my_vec*np.random.randn(100)[:,np.newaxis]
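For what it's worth, the three broadcasting-based answers produce the same result when fed the same samples; a small sanity-check sketch (not from the original answers):
import numpy as np

my_vec = np.array([1, 2, 3])
samples = np.random.randn(100)

a = my_vec[None, :] * samples[:, None]       # broadcasting with explicit new axes
b = np.outer(samples, my_vec)                # outer product
c = my_vec * samples[:, np.newaxis]          # np.newaxis is just an alias for None

print(a.shape, b.shape, c.shape)             # all (100, 3)
print(np.allclose(a, b), np.allclose(a, c))  # True True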
