Parseval's theorem with NumPy FFT is not fulfilled - Python

I am trying to determine the total energy recorded by a detector in the time domain by means of its spectrum.
The first step after performing the Fast Fourier Transform with NumPy's FFT library was to confirm Parseval's theorem.
According to the theorem, the total energy in the time domain and in the frequency domain must be the same. I have two problems that I am not able to solve.
I can confirm the theorem when I don't use the proper units for the x-axis during the np.trapz() integration. As soon as I use the actual sample points/frequencies, the result is off. I do not understand why this is the case and am wondering if I can apply a normalization to fix the error.
I cannot confirm the theorem when I apply a DC offset to the signal (uncomment the f = np.sin(np.pi*t)+1 line).
Below is my code with an example sine function.
# Python code
import matplotlib.pyplot as plt
import numpy as np
# Create a Sine function
dt = 0.001 # Time steps
t = np.arange(0,10,dt) # Time array
f = np.sin(np.pi*t) # Sine function
# f = np.sin(np.pi*t)+1 # Sine function with DC offset
N = len(t) # Number of samples
# Energy of function in time domain
energy_t = np.trapz(abs(f)**2)
# Energy of function in frequency domain
FFT = np.sqrt(2) * np.fft.rfft(f) # only positive frequencies; correct magnitude due to discarding of negative frequencies
FFT[0] /= np.sqrt(2) # DC magnitude does not have to be corrected
FFT[-1] /= np.sqrt(2) # Nyquist frequency does not have to be corrected
frq = np.fft.rfftfreq(N,d=dt) # FFT frequencies
# Energy of function in frequency domain
energy_f = np.trapz(abs(FFT)**2) / N
print('Parsevals theorem fulfilled: ' + str(energy_t - energy_f))
# Parsevals theorem with proper sample points
energy_t = np.trapz(abs(f)**2, x=t)
energy_f = np.trapz(abs(FFT)**2, x=frq) / N
print('Parsevals theorem NOT fulfilled: ' + str(energy_t - energy_f))

The FFT computes the Discrete Fourier Transform (DFT), which is not the same as the (continuous-domain) Fourier Transform.
For the DFT, Parseval’s theorem states that the sum of the square magnitude of the discrete signal equals the sum of the square magnitude of the DFT of the signal. There is no integration involved, and therefore you should not use trapz. Just use sum.
Note that a discrete signal is a set of samples x[n] at n=0..N-1. Fourier analysis in the discrete domain, and all related operations, only consider n, not t. The sampling frequency and the actual times those samples were recorded are irrelevant in these analyses. Likewise, the DFT produces a set of samples X[k] at k=0..N-1, not at any specific f or ω related to any sampling frequency.
Now it is possible to relate n to t, and k to f, because we know the sampling frequency. But these conversions should not make us think that X[k] is a sampling of the continuous-domain Fourier transform of the original continuous-domain signal. And they should especially not make us think that we can interpolate X[k].
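For illustration, the k-to-f conversion is just a relabeling of the bin index; a minimal NumPy sketch (with example values for N and dt of my own):
import numpy as np
N, dt = 1000, 0.001          # example values: sample count and sampling interval
f = np.fft.fftfreq(N, d=dt)  # bin k maps to k/(N*dt) Hz (negative in the upper half)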
Reconstructing the samples x[n] is accomplished by adding N sinusoids with parameters given by X[k]. There should not be anything “in between” those DFT components; interpolating them would mean we add sinusoids that do not exist in the samples x[n].
trapz uses linear interpolation to obtain an estimate of the integral, and therefore is inappropriate to use in discrete Fourier analysis.
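As a minimal sketch (with an arbitrary test signal of my own), the discrete form of Parseval's theorem can be verified with sum rather than trapz:
import numpy as np
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)                      # arbitrary test signal
X = np.fft.fft(x)
# Discrete Parseval: sum of |x[n]|^2 equals (1/N) * sum of |X[k]|^2
print(np.allclose(np.sum(np.abs(x)**2),
                  np.sum(np.abs(X)**2) / len(x)))  # True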

Related

Interpolating measured sine wave using python

I have 2 sampled sine waves obtained as a measurement from a DSO. The sampling rate of the DSO is 160 GSa/s and my signal is 60 GHz. I need to find the phase difference between the two sine waves. Both are the same frequency. However, the sampling rate is not enough to accurately determine the phase. Is there any way to interpolate the measured signal to get a better sine wave and then calculate the phase difference?
You may fit sine functions, but for the phase difference (delta_phi = 2*pi*frequency*delta_t) it would be sufficient to detect and compare the zero-crossings (or, if there is a constant offset, the crossings of that offset level), which may be found from a segment of your series by an interpolation like
import numpy as np
from scipy import interpolate

w = 6.38                        # some radian frequency
t = np.linspace(0, 0.5)         # time interval containing ONE zero-crossing
delta_phi = 0.1                 # some phase difference
x = np.sin(w*t - delta_phi)     # x(t)
f = interpolate.interp1d(x, t)  # interpolate t(x); default is linear
delta_t = f(0)                  # zero-crossing time referred to t=0
delta_phi_detected = w * delta_t
You need to relate two adjacent zero-crossings, one from each of your signals.
Alternatively, you may obtain an average value by multiplying both signals and numerically integrating over a time T; this converges to (T/2)*cos(delta_phi) if both signals have (or are made to have) zero mean value.
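A minimal sketch of that alternative (the frequency and phase values are my own example; T is chosen as an integer number of periods so the cross terms average out):
import numpy as np
w = 6.38                            # radian frequency
delta_phi = 0.1                     # phase difference to recover
T = 10 * 2 * np.pi / w              # integrate over 10 full periods
t = np.linspace(0, T, 100000)
s1 = np.sin(w * t)
s2 = np.sin(w * t - delta_phi)
integral = np.trapz(s1 * s2, t)     # converges to (T/2)*cos(delta_phi)
print(np.arccos(2 * integral / T))  # ~0.1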

Prefactors computing PSD of a signal with numpy.fft vs. scipy.signal.welch

The power spectral density St of a signal u may be computed as the product of the FFT of the signal, u_fft, with its complex conjugate, u_fft_c. In Python, this would be written as:
import numpy as np
u = ... # some numpy array containing the signal
u_fft = np.fft.rfft(u-np.nanmean(u))
St = np.multiply(u_fft, np.conj(u_fft))
However, the FFT definition in Numpy requires the multiplication of the result with a factor of 1/N, where N=u.size in order to have an energetically consistent transformation between u and its FFT. This leads to the corrected definition of the PSD using numpy's fft:
St = np.multiply(u_fft, np.conj(u_fft))
St = np.divide(St, u.size)
On the other hand, Scipy's function signal.welch computes the PSD directly from input u:
from scipy.signal import welch
freqs_st, St_welch = welch(u-np.nanmean(u),
                           return_onesided=True, nperseg=seg_size, axis=0)
The resulting PSD, St_welch, is obtained by performing several FFTs in segments of the array u with size seg_size. Thus, my question is:
Should St_welch be multiplied by a factor of 1/seg_size to give an energetically consistent PSD? Should it be multiplied by 1/N? Should it not be multiplied at all?
P.S.: Comparing the two by performing both operations on a signal is not straightforward, since the Welch method also introduces smoothing of the signal and changes the display in the frequency domain.
Information on the necessity of the prefactor when using numpy.fft :
Journal article on the matter
The definition of the parameter scaling of scipy.signal.welch suggests that the appropriate scaling is performed by the function:
scaling : { ‘density’, ‘spectrum’ }, optional
Selects between computing the power spectral density (‘density’) where Pxx has units of V^2/Hz and computing the power spectrum (‘spectrum’) where Pxx has units of V^2, if x is measured in V and fs is measured in Hz. Defaults to ‘density’
The correct sampling frequency is to be provided as argument fs to retrieve the correct frequencies and an accurate power spectral density.
To recover a power spectrum similar to that computed using np.multiply(u_fft, np.conj(u_fft)), nperseg must be set to the length of the frame and the window to 'boxcar' (equivalent to no window at all). The fact that scipy.signal.welch applies the correct scaling can be checked by testing a sine wave:
import numpy as np
import scipy.signal
import matplotlib.pyplot as plt

def abs2(x):
    return x.real**2 + x.imag**2

if __name__ == '__main__':
    framelength = 1.0
    N = 1000
    x = np.linspace(0, framelength, N, endpoint=False)
    y = np.sin(44 * 2 * np.pi * x)
    # y = y - np.mean(y)
    ffty = np.fft.fft(y)
    # power spectrum, after real-to-complex transform (factor 2)
    scale = 2.0 / (len(y) * len(y))
    power = scale * abs2(ffty)
    freq = np.fft.fftfreq(len(y), framelength / len(y))
    # power spectrum via scipy welch; 'boxcar' means no window,
    # nperseg=len(y) so that the fft is computed on the whole signal
    freq2, power2 = scipy.signal.welch(y, fs=len(y) / framelength, window='boxcar',
                                       nperseg=len(y), scaling='spectrum',
                                       axis=-1, average='mean')
    for i in range(len(freq2)):
        print(i, freq2[i], power2[i], freq[i], power[i])
    print(np.sum(power2))
    plt.figure()
    plt.plot(freq[0:len(y)//2 + 1], power[0:len(y)//2 + 1], label='np.fft.fft()')
    plt.plot(freq2, power2, label='scipy.signal.welch()')
    plt.legend()
    plt.xlim(0, np.max(freq[0:len(y)//2 + 1]))
    plt.show()
For a real-to-complex transform, the correct scaling of np.multiply(u_fft, np.conj(u_fft)) is 2./(u.size*u.size). Indeed, the scaling of u_fft is 1./u.size. Furthermore, real-to-complex transforms only report half of the frequencies, because the coefficient of bin N-k is the complex conjugate of that of bin k. The energy of that bin is therefore equal to that of bin k, and is added to it. Hence the factor 2. For the tested sine wave signal of amplitude 1, the energy is reported as 0.5: it is indeed the average of a squared sine wave of amplitude 1.
Windowing is useful if the length of the frame is not a multiple of the period of the signal or if the signal is not periodic. Using smaller FFT frames is useful if the signal is made of damped waves: the signal could be considered periodic over some characteristic time, so choosing an FFT frame smaller than that characteristic time but larger than the period of the waves seems judicious.
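For instance, a sketch of that strategy (the damped test signal and all parameter values below are my own invention):
import numpy as np
from scipy.signal import welch
fs = 1000.0
t = np.arange(0, 10, 1/fs)
u = np.exp(-t / 2.0) * np.sin(2 * np.pi * 50 * t)  # damped 50 Hz wave
# Hann window against leakage; segments shorter than the ~2 s damping time
# but much longer than the 0.02 s wave period.
freqs, Pxx = welch(u, fs=fs, window='hann', nperseg=1024, scaling='density')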

IFFT of a Gaussian power spectrum - Python

I want to calculate the inverse Fourier transform of a Gaussian power spectrum, thus obtaining a Gaussian again. I want to use this fact to check that the IFFT of my Gaussian power spectrum is sensible, in the sense that it produces an array of data effectively distributed in a Gaussian way.
Now, it turns out that the IFFT must be multiplied by a factor 2*pi*N, where N is the dimension of the array, in order to recover the analytic correlation function (which is the Inverse Fourier Transform of the power spectrum). Can someone explain why?
Here is the piece of code that first fills an array with the Gaussian power spectrum and then does the IFFT of the power spectrum.
import math
import numpy as np

n = 1024      # array size (example value; not given in the question)
sigma = 1.0   # Gaussian width (example value; not given in the question)

power_spectrum_k = np.zeros(n, float)
for k in range(1, int(n/2 + 1)):
    power_spectrum_k[k] = math.exp(-(2*math.pi*k*sigma/n)*(2*math.pi*k*sigma/n))
for k in range(int(n/2 + 1), n):
    power_spectrum_k[k] = power_spectrum_k[int(k - n/2)]
inverse_transform2 = np.fft.ifft(power_spectrum_k)
where the symmetry of the power spectrum comes from the need to get a real correlation function, at the same time following the rules for the use of numpy.ifft (quoting from the documentation:
"The input should be ordered in the same way as is returned by fft, i.e., a[0] should contain the zero frequency term, a[1:n/2+1] should contain the positive-frequency terms, and a[n/2+1:] should contain the negative-frequency terms, in order of decreasingly negative frequency".)
The reason is the Plancherel theorem, which states that the Fourier transform conserves the signal's energy, i.e., the integral over |x(t)|² equals the integral over |X(f)|². If you have more samples (e.g., caused by a higher sampling rate or a longer interval), you have more energy. For that reason your IFFT result is scaled by a factor of N. Your factor depends, on the one hand, on the convention of the Fourier integral used, as @pv already noted, and, on the other hand, on the length of your interval, since the integral over the power of the sampled signal and that of the continuous signal need to be the same.
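To see NumPy's particular convention (the forward transform is unscaled and ifft carries the 1/N), a minimal sketch:
import numpy as np
x = np.random.randn(8)
X = np.fft.fft(x)              # forward transform: no 1/N factor
x_back = np.fft.ifft(X)        # inverse transform: applies the 1/N factor
print(np.allclose(x, x_back))  # True
# Plancherel in this convention: sum|x|^2 == (1/N) * sum|X|^2
print(np.allclose(np.sum(np.abs(x)**2), np.sum(np.abs(X)**2) / len(x)))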
I'd recommend using an existing library for the FFT. Not that it's particularly difficult, but there are some well-optimised solutions.
Try scipy http://docs.scipy.org/doc/scipy/reference/fftpack.html or my favourite, FFTW, via pyFFTW: https://hgomersall.github.io/pyFFTW/

How do I match Lomb-Scargle and FFT plots of the same dataset?

I am doing some work comparing the interpolated FFT of the concentrations of some gases over a period, which is unevenly sampled, with the Lomb-Scargle periodogram of the same data. I am using scipy's fft function to calculate the Fourier transform and then squaring the modulus of this to give what I believe to be the power spectral density, in units of parts per billion (ppb) squared.
I can get the Lomb-Scargle plot to match almost the exact pattern of the FFT, but never the same scale of magnitude; the FFT power spectral density is always higher, even though I thought the Lomb-Scargle power was a power spectral density. Now the Lomb code I am using (http://www.astropython.org/snippet/2010/9/Fast-Lomb-Scargle-algorithm) normalises the dataset by taking away the average and dividing by 2 times the variance of the data; therefore I have normalised the FFT data in the same manner, but still the magnitudes do not match.
Therefore I did some more research and found that the normalised Lomb-Scargle power could be unitless, and therefore I cannot make the plots match. This leads me to two questions:
What units (if any) is the power spectral density of a normalised Lomb-Scargle periodogram in?
How would I proceed to match my FFT plot with my Lomb-Scargle plot, in terms of magnitude and pattern?
Thank you.
The squared modulus of the Fourier transform of a series is defined as the energy spectral density (ESD). You need to divide the ESD by the length of the series to convert to an estimate of power spectral density (PSD).
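As a minimal sketch of that conversion (with an example series of my own; subtleties such as dividing by the sampling frequency for physical units are set aside):
import numpy as np
y = np.random.randn(1024)        # example series
esd = np.abs(np.fft.fft(y))**2   # energy spectral density
psd = esd / len(y)               # divide by the series length for a PSD estimate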
Units
The units of a PSD are [units]**2/[frequency] where [units] represents the units of your original series.
Normalization
To check for proper normalization, one can numerically integrate the PSD of a white noise (with known variance). If the integrated spectrum equals the variance of the series, the normalization is correct. A factor of 2 (too low) is not incorrect, though, and may indicate the PSD is normalized to be double-sided; in that case, just multiply by 2 and you have a properly normalized, single-sided PSD.
Using numpy, the randn function generates pseudo-random numbers that are Gaussian distributed. For example
10 * np.random.randn(1, 100)
produces a 1-by-100 array with mean=0 and variance=100. If the sampling frequency is, say, 1 Hz, the single-sided PSD will theoretically be flat at 200 units**2/Hz, from [0,0.5] Hz; the integrated spectrum would thus be 100, equaling the variance of the series.
Update
I modified the example included in the python code you linked to demonstrate the normalization for a normally distributed series of length 20, with variance 1, and sampling frequency 10:
import numpy
import lomb
numpy.random.seed(999)
nd = 20
fs = 10
x = numpy.arange(nd)
y = numpy.random.randn(nd)
fx, fy, nout, jmax, prob = lomb.fasper(x, y, 1., fs)
fNy = fx[-1]
fy = fy/fs
Si = numpy.mean(fy)*fNy
print(fNy, Si, Si*2)
This gives, for me:
5.26315789474 0.482185882163 0.964371764327
which shows you a few things:
The "Nyquist" frequency asked for is actually the sampling frequency.
The result needs to be divided by the sampling frequency.
The output is normalized for a double-sided PSD, so multiplying by 2 makes the integrated spectrum nearly 1.
In the time since this question was asked and answered, the AstroPy project has gained a Lomb-Scargle method, and this question is addressed in the documentation: http://docs.astropy.org/en/stable/stats/lombscargle.html#psd-normalization-unnormalized
In brief, you can compute a Fourier periodogram and compare it to the astropy Lomb-Scargle periodogram as follows
import numpy as np
from astropy.stats import LombScargle

def fourier_periodogram(t, y):
    N = len(t)
    frequency = np.fft.fftfreq(N, t[1] - t[0])
    y_fft = np.fft.fft(y)
    positive = (frequency > 0)
    return frequency[positive], (1. / N) * abs(y_fft[positive]) ** 2

t = np.arange(100)
y = np.random.randn(100)
frequency, PSD_fourier = fourier_periodogram(t, y)
PSD_LS = LombScargle(t, y).power(frequency, normalization='psd')
np.allclose(PSD_fourier, PSD_LS)
# True
Since AstroPy is a common tool used in astronomy, I thought this might be more useful than an answer based on the code snippet mentioned above.

Can you compute the amplitude/power of original signal from Fourier transform?

After taking the Discrete Fourier Transform of some samples with scipy.fftpack.fft() and plotting the magnitude of these I notice that it doesn't equal the amplitude of the original signal. Is there a relationship between the two?
Is there a way to compute the amplitude of the original signal from the Fourier coefficients without reversing the transform?
Here's an example of a sine wave with amplitude 7.0 and FFT amplitude 3.5.
from numpy import sin, linspace, pi, arange
from pylab import plot, show, title, xlabel, ylabel, subplot
from scipy.fftpack import fft

def plotSpectrum(y, Fs):
    """
    Plots a single-sided amplitude spectrum of y(t)
    """
    n = len(y)          # length of the signal
    k = arange(n)
    T = n / Fs
    frq = k / T         # two-sided frequency range
    frq = frq[:n // 2]  # one-sided frequency range
    Y = fft(y) / n      # fft computation and normalization
    Y = Y[:n // 2]
    plot(frq, abs(Y), 'r')  # plotting the spectrum
    xlabel('Freq (Hz)')
    ylabel('|Y(freq)|')

Fs = 150.0             # sampling rate
Ts = 1.0 / Fs          # sampling interval
t = arange(0, 1, Ts)   # time vector
ff = 5                 # frequency of the signal
y = 7.0 * sin(2 * pi * ff * t)

subplot(2, 1, 1)
plot(t, y)
xlabel('Time')
ylabel('Amplitude')
subplot(2, 1, 2)
plotSpectrum(y, Fs)
show()
Yes, Parseval's Theorem tells us that the total power in the frequency domain is equal to the total power in the time domain.
What you may be seeing though is the result of a scaling factor in the forward FFT. The size of this scaling factor is a matter of convention, but most commonly it's a factor of N, where N is the number of data points. However it can also be equal to 1 or sqrt(N). Check your FFT documentation for this.
Also note that if you only take the power from half of the frequency domain bins (commonly done when the time domain signal is purely real and you have complex conjugate symmetry in the frequency domain) then there will be a factor of 2 to take care of.
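Putting those two points together, a minimal sketch (assuming NumPy's convention, where the forward FFT is unscaled) that recovers the 7.0 amplitude from the question's signal:
import numpy as np
Fs = 150.0
t = np.arange(0, 1, 1/Fs)
y = 7.0 * np.sin(2 * np.pi * 5 * t)
Y = np.fft.rfft(y)                  # one-sided spectrum of a real signal
amplitude = 2 * np.abs(Y) / len(y)  # 1/N undoes the scaling; 2 restores the discarded
                                    # negative-frequency bins (DC/Nyquist excepted, both zero here)
print(amplitude.max())              # ~7.0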
