How do I apply FFT on a 3D Array - python

I have a 3D array that has the shape (features, timestep, samples). I would like to apply the numpy fft function on each feature for the length of timestep for each sample. I have this, but I am uncertain whether this is the best way or whether there needs to be a loop to iterate through each sample.
import numpy as np
x_train_fft = np.fft.fft(x_train, axis=0) #selected axis 0 as this is the axis of features

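Looks like this was the way to do it:
X_transform_FFT = []
for i in range(x_train.shape[0]):
    # magnitude of the FFT of each 2D slice, taken along axis 1 of that slice
    f = np.abs(np.fft.fft(x_train[i, :, :], axis=1))
    X_transform_FFT.append(f)
# stack the per-slice results back into a single 3D array
X_transform_FFT = np.asarray(X_transform_FFT)
print(X_transform_FFT)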
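
For the (features, timestep, samples) layout described in the question, the timestep axis is axis 1 of the full array, which is axis 0 of each x_train[i] slice. Under that assumption the loop may not be needed at all, since np.fft.fft accepts an axis argument. A minimal sketch with made-up array sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up sizes for illustration: 4 features, 16 timesteps, 3 samples
x_train = rng.standard_normal((4, 16, 3))

# One call transforms every (feature, sample) series along the timestep axis
fft_vectorized = np.abs(np.fft.fft(x_train, axis=1))

# Loop for comparison: inside each 2D slice, timestep is axis 0
fft_loop = np.asarray(
    [np.abs(np.fft.fft(x_train[i], axis=0)) for i in range(x_train.shape[0])]
)

print(fft_vectorized.shape)                   # (4, 16, 3)
print(np.allclose(fft_vectorized, fft_loop))  # True
```

If the data is actually stored with a different axis order, only the axis argument needs to change.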

Related

Is there any way to calculate the L2 norm of multiple matrices at once in Python?

For example, I have an array of dimensions (a,b,c,d). I want to calculate the L2 norm of each of the d sub-arrays of dimensions (a,b,c). Is there any way to use numpy.linalg.norm without any looping structure?
I mean, the resulting array should be 1 x d.
How about this?
import numpy as np
mat = np.arange(2*3*4*5).reshape(2,3,4,5) # create 4d array
mat2 = np.moveaxis(mat,-1,0) # bring last axis to the front
*outarr, = map(np.linalg.norm,mat2) # use map
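
As a possible alternative without map, the first three axes can be flattened so that each of the d sub-arrays becomes one column, and the norm is then taken down the columns. This is a sketch that assumes the "L2 norm" wanted is the square root of the sum of squares of all entries, which is what np.linalg.norm returns by default for an array with more than two dimensions:

```python
import numpy as np

mat = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)  # same 4D example array
d = mat.shape[-1]

# Flatten the first three axes: each column holds one (a, b, c) sub-array
norms = np.linalg.norm(mat.reshape(-1, d), axis=0)

# Reference: one norm per sub-array, computed with an explicit loop
reference = np.array([np.linalg.norm(mat[..., k]) for k in range(d)])

print(norms.shape)                    # (5,)
print(np.allclose(norms, reference))  # True
```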

vectorized "by-layer" scaling of numpy array

I have a numpy array (let's say 100x64x64).
My goal is to scale each 64x64 layer independently and store a scaler for later use.
This is how it can be achieved with a for-loop solution:
scalers_dict={}
for i in range(X.shape[0]):
    scalers_dict[i] = MinMaxScaler()
    #fitting the scaler
    X[i, :, :] = scalers_dict[i].fit_transform(X[i, :, :])
#saving dict of scalers
joblib.dump(value=scalers_dict,filename="dict_of_scalers.scaler")
My real array is much bigger, and it takes quite a while to iterate through it.
Is there a more vectorized solution for this, or is a for-loop the only way?
If I understand correctly how MinMaxScaler works, it scales each column independently, computing the minimum and maximum along axis=0.
To make this useful for your case, you'd need to transform X into a (64 * 64, 100) array:
s = X.shape
X = np.moveaxis(X, 0, -1).reshape(-1, s[0])
Alternatively, you can write
X = X.reshape(s[0], -1).T
Now you can do the scaling with
M = MinMaxScaler()
X = M.fit_transform(X)
Since the fit is computed along the first dimension, the fitted minima and maxima each have 100 entries, one per layer. These broadcast cleanly against X because its last dimension now has the same size.
To get the original shape back, invert the original transformation:
X = X.T.reshape(s)
When you are done, M will be a scaler calibrated for 100 features. There is no need for a dictionary here. Remember that a dictionary keyed by a sequence of integers can better be expressed as a list or array, which is what happens here.
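
To check the reshape round trip without scikit-learn, the column-wise min-max step can be emulated in plain NumPy. This is only a sketch: the small array size is made up, and the manual formula stands in for MinMaxScaler with its default (0, 1) range. It compares the result against scaling each layer by its own overall minimum and maximum:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((5, 4, 4))  # small stand-in for the 100x64x64 array
s = X.shape

# Each layer becomes one column: shape (4 * 4, 5)
cols = X.reshape(s[0], -1).T
same_as_moveaxis = np.array_equal(cols, np.moveaxis(X, 0, -1).reshape(-1, s[0]))

# Column-wise min-max scaling (reduction along axis=0)
col_min = cols.min(axis=0)
col_max = cols.max(axis=0)
scaled_cols = (cols - col_min) / (col_max - col_min)

# Invert the reshape to get the layers back in their original positions
scaled = scaled_cols.T.reshape(s)

# Reference: scale every layer on its own in a loop
reference = np.stack(
    [(layer - layer.min()) / (layer.max() - layer.min()) for layer in X]
)

print(same_as_moveaxis)                 # True
print(np.allclose(scaled, reference))   # True
```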
IIUC, you can manually scale:
mm, MM = inputs.min(axis=(1,2)), inputs.max(axis=(1,2))
# save these for later use
joblib.dump((mm,MM), 'minmax.joblib')
def scale(inputs, mm, MM):
    return (inputs - mm[:,None,None])/(MM-mm)[:,None,None]
# load pre-saved min & max
mm, MM = joblib.load('minmax.joblib')
# scaled inputs
scale(inputs, mm, MM)
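
Because only the per-layer minima and maxima are stored, the scaling can also be undone later. A minimal sketch, where unscale is a hypothetical helper name (not part of any library) and the array sizes are made up:

```python
import numpy as np

def scale(inputs, mm, MM):
    # per-layer min-max scaling to [0, 1]
    return (inputs - mm[:, None, None]) / (MM - mm)[:, None, None]

def unscale(scaled, mm, MM):
    # hypothetical inverse of scale(): map [0, 1] back to the original range
    return scaled * (MM - mm)[:, None, None] + mm[:, None, None]

rng = np.random.default_rng(1)
inputs = rng.random((6, 8, 8)) * 50.0  # made-up data, 6 layers of 8x8

mm, MM = inputs.min(axis=(1, 2)), inputs.max(axis=(1, 2))
scaled = scale(inputs, mm, MM)
restored = unscale(scaled, mm, MM)

print(np.allclose(restored, inputs))  # True
```

Note that a layer with a constant value would give MM - mm == 0 and a division by zero, so such layers would need separate handling.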

Calculating 3D pixel variance from 4D array

Let there be some 4D array [x,y,z,k] comprised of k 3D images [x,y,z].
Is there any way to calculate the variance of each individual pixel in 3D from the 4D array?
E.g. I have a 10x10x10x5 array and would like to return a 10x10x10 variance array; the variance is calculated for each pixel (or voxel, really) along k
If this doesn't make sense, let me know and I'll try explaining better.
Currently, my code is:
tensors = []
while error > threshold:
    for _ in range(5): #arbitrary
        new_tensor = foo(bar) #always returns array of same size
        tensors.append(new_tensor)
    tensors = np.stack(tensors, axis = 3)
    #tensors.shape
And I would like to calculate a variance array for tensors.
There is a simple way to do that if you're using numpy:
variance = tensors.var(axis=3)
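
A small sketch with random data in place of foo(bar), using the 10x10x10x5 shape from the question. By default var computes the population variance (ddof=0); passing ddof=1 gives the sample variance instead, which may be what is wanted with only 5 images:

```python
import numpy as np

rng = np.random.default_rng(0)
# Five random 10x10x10 volumes standing in for the outputs of foo(bar)
images = [rng.standard_normal((10, 10, 10)) for _ in range(5)]
tensors = np.stack(images, axis=3)  # shape (10, 10, 10, 5)

variance = tensors.var(axis=3)                 # population variance per voxel
sample_variance = tensors.var(axis=3, ddof=1)  # sample variance per voxel

print(tensors.shape)   # (10, 10, 10, 5)
print(variance.shape)  # (10, 10, 10)
```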

Pairwise Euclidean distances between two binary tensors

I am trying to compute the pairwise distances between all points in two binary areas/volume/hypervolume in Tensorflow.
E.g. In 2D the areas are defined as binary tensors with ones and zeros:
input1 = tf.constant(np.array([[1,0,0], [0,1,0], [0,0,1]]))
input2 = tf.constant(np.array([[0,1,0], [0,0,1], [0,0,0]]))
input1 has 3 points and input2 has 2 points.
So far I have managed to convert the binary tensors into arrays of spatial coordinates:
coord1 = tf.where(tf.cast(input1, tf.bool))
coord2 = tf.where(tf.cast(input2, tf.bool))
Here, coord1 has shape=(3,2) and coord2 has shape=(2,2). The first dimension is the number of points and the second is their spatial coordinates (in this case 2D).
The result that I want is a tensor with shape=(6, ) with the pairwise Euclidean distances between all of the points in the areas.
Example (the order of the distances might be incorrect):
output = [1, sqrt(5), 1, 1, sqrt(5), 1]
Since TensorFlow isn't great with loops and in my real application the number of points in each tensor is unknown, I think I might be missing some linear algebra here.
I'm not familiar with Tensorflow, but my understanding from reading this is that the underlying NumPy arrays should be easy to extract from your data. So I will provide a solution which shows how to calculate pairwise Euclidean distances between points of 3x2 and 2x2 NumPy arrays, and hopefully it helps.
Generating random NumPy arrays in the same shape as your data:
coord1 = np.random.random((3, 2))
coord2 = np.random.random((2, 2))
Import the relevant SciPy function and run:
from scipy.spatial.distance import cdist
distances = cdist(coord1, coord2, metric='euclidean')
This will return a 3x2 array, but you can use distances.flatten() to get your desired 1-dimensional array of length 6.
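
If SciPy is not available, the same distances can be sketched with NumPy broadcasting alone. This example assumes input2 contains exactly two set pixels, at (0,1) and (1,2), to match the two points and the example output described in the question; np.argwhere plays the role of tf.where here:

```python
import numpy as np

input1 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
# Assumption: two set pixels, matching the "2 points" stated in the question
input2 = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])

coord1 = np.argwhere(input1)  # shape (3, 2)
coord2 = np.argwhere(input2)  # shape (2, 2)

# Broadcast to all pairs: (3, 1, 2) - (1, 2, 2) -> (3, 2, 2)
diff = coord1[:, None, :] - coord2[None, :, :]
distances = np.sqrt((diff ** 2).sum(axis=-1)).ravel()  # shape (6,)

print(distances)
```

The flattened result is ordered with coord1 varying slowest, so the first two entries are the distances from the first point of coord1 to both points of coord2.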
I have come up with an answer using only matrix multiplication and transposition. This makes use of the fact that squared distances can be expressed with inner products (|x - y|^2 = |x|^2 + |y|^2 - 2 x·y):
import numpy as np
import tensorflow as tf

input1 = np.array([[1,0,0],[0,1,0],[0,0,1]])
input2 = np.array([[1,1,0],[0,0,1],[1,0,0]])
# coordinates of the nonzero entries, cast to float for the matmul
c1 = tf.cast(tf.where(tf.cast(input1, tf.bool)), tf.float32)
c2 = tf.cast(tf.where(tf.cast(input2, tf.bool)), tf.float32)
# |x|^2 as a column, |y|^2 as a row, minus twice the inner products
distances = tf.sqrt(-2 * tf.matmul(c1, tf.transpose(c2)) + tf.reduce_sum(tf.square(c2), axis=1)
                    + tf.expand_dims(tf.reduce_sum(tf.square(c1), axis=1), axis=1))
# TensorFlow 1.x session API
with tf.Session() as sess:
    d = sess.run(distances)
Since TensorFlow broadcasts by default, it doesn't matter that the two coordinate arrays contain different numbers of points.
Hope it helps somebody.
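
The inner-product identity can be checked in plain NumPy against a direct computation. This is a sketch with random coordinates; the np.maximum clamp is an addition not present in the answer, included because floating-point rounding can leave tiny negative values that would turn into NaN under the square root:

```python
import numpy as np

rng = np.random.default_rng(0)
c1 = rng.random((3, 2))  # 3 points in 2D
c2 = rng.random((2, 2))  # 2 points in 2D

sq1 = (c1 ** 2).sum(axis=1)[:, None]  # |x|^2 as a column, shape (3, 1)
sq2 = (c2 ** 2).sum(axis=1)[None, :]  # |y|^2 as a row, shape (1, 2)

# |x - y|^2 = |x|^2 + |y|^2 - 2 x.y, broadcast to shape (3, 2)
sq_dist = sq1 + sq2 - 2.0 * (c1 @ c2.T)
sq_dist = np.maximum(sq_dist, 0.0)    # guard against small negative round-off
distances = np.sqrt(sq_dist)

# Direct computation for comparison
direct = np.linalg.norm(c1[:, None, :] - c2[None, :, :], axis=-1)
print(np.allclose(distances, direct))  # True
```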

Efficient way to Reshape Data for Time Series Prediction Machine Learning (Numpy)

Let's say I have a data set (NumPy array) X of N time series samples, each with T time steps of a D-dimensional vector, so that:
X.shape == (N,T,D)
Now I want to reshape it into x (data set) and y (labels) so that I can apply machine learning to predict the next step in the time series.
I want to take every subseries of length n from each sample, so that
x.shape==(N*(T-n),n,D) and y.shape==(N*(T-n),D)
with
X[k,j:j+n,:]
being one of my samples in x and
X[k,j+n,:]
its label in y.
Is a for-loop the only way to do that?
I have the following method, but it uses a for-loop, and I am not sure whether I can do better:
def reshape_data(self, X, n):
    """
    Reshape a data set of N time series samples of T time steps each.
    Args:
        X: time series data of shape (N,T,D)
        n: int, length of the time window used to predict x[t+1]
    Returns:
        x: windows of shape (N*(T-n),n,D)
        y: labels of shape (N*(T-n),D)
    """
    N,T,D = X.shape
    x = np.zeros((N*(T-n),n,D))
    y = np.zeros((N*(T-n),D))
    for i in range(T-n):
        # window starting at step i for all N samples, and the step right after it
        x[N*i:N*(i+1),:,:] = X[:,i:i+n,:]
        y[N*i:N*(i+1),:] = X[:,i+n,:]
    return x,y
You are looking for the pandas Panel (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Panel.html). Just put the NumPy array into it, transpose on the minor axis, and get its NumPy representation (.as_matrix() or simply .values). Note that Panel was deprecated and later removed from pandas, so this only applies to old versions. If you want to do it in NumPy alone, see numpy.transpose (https://docs.scipy.org/doc/numpy/reference/generated/numpy.transpose.html).
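
One NumPy-only way to drop the loop is sliding_window_view, available in NumPy 1.20 and later. This is a sketch rather than a drop-in replacement: reshape_data_vectorized is a hypothetical name, and the axis shuffling is chosen to reproduce the row order of the loop in the question (all samples for window 0, then all samples for window 1, and so on):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def reshape_data_vectorized(X, n):
    """Hypothetical loop-free version of reshape_data from the question."""
    N, T, D = X.shape
    # windows[k, i, d, j] == X[k, i + j, d]; shape (N, T - n + 1, D, n)
    windows = sliding_window_view(X, n, axis=1)
    # Drop the last window (it has no following step), then reorder the axes
    # to (window start, sample, position in window, feature)
    x = windows[:, :T - n].transpose(1, 0, 3, 2).reshape(N * (T - n), n, D)
    # Label for the window starting at i is the step i + n
    y = X[:, n:, :].transpose(1, 0, 2).reshape(N * (T - n), D)
    return x, y

rng = np.random.default_rng(0)
X = rng.random((3, 7, 2))  # made-up sizes: N=3, T=7, D=2
x, y = reshape_data_vectorized(X, n=3)
print(x.shape, y.shape)  # (12, 3, 2) (12, 2)
```

The window array is a view, so no data is copied until the final reshape.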
