Why is the amplitude I compute so far from the original after a fast Fourier transform (FFT)?
I have a signal with 1024 points and a sampling interval of 1/120000 s (a sampling rate of 120 kHz). I apply the fast Fourier transform in Python with scipy.fftpack. I normalize the calculated magnitude by the number of bins and multiply by 2, as I plot only the positive frequencies.
My initial signal amplitude is around 64 dB, but I get very low amplitude values, less than 1.
Please see my code.
import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft, fftfreq
Signal = well.ReadWellData(SignalNDB)
y, x = Signal.GetData(numpy=np)
N = y.size # Number of sample points 1024 ...
T = 1.0/120000 # Sampling interval (s), i.e. a 120 kHz sampling rate
x = np.linspace(0.0, N*T, N)
yf = abs(fft(y)) # Perform FFT, returning the magnitude
xf = np.linspace(0.0, 1.0/(2.0*T), N//2) # Calculate frequency bins
freqs = fftfreq(N, T)
ax1=plt.subplot(211)
ax1.plot(x,y)
plt.grid()
ax2=plt.subplot(212)
yf2 = 2.0/N * np.abs(yf[0:N//2]) # Normalize magnitude by number of bins and multiply by 2
ax2.semilogy(xf, yf2) # freq vs ampl - positive only freq
plt.grid()
ax1.set_title(["check"])
#ax2.set_xlim([0,4000])
plt.show()
Please see my plot:
EDIT:
Finally, my signal amplitude after the FFT is exactly what I expected. Here is what I did.
First, I computed the FFT of the signal in mV. Then I converted the results to dB using the formula 20*log10(mV)+60, where 60 represents 1 mV, as provided by the tool manufacturer. The dB values are therefore presented on a linear scale on the bottom plot rather than on a log scale.
Please see the resulting plot below.
Results
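The procedure described in the edit can be sketched roughly as follows. This is only an illustration under stated assumptions: the test tone, its 2 mV amplitude, and every variable name are made up for the example, and the +60 offset is simply the manufacturer's convention quoted above.

```python
import numpy as np

fs = 120000.0          # sampling rate in Hz (from the question)
N = 1024               # number of samples (from the question)
t = np.arange(N) / fs

# Hypothetical test signal in mV: a tone placed exactly on FFT bin 8
# so that spectral leakage does not blur the amplitude.
k0 = 8
f0 = k0 * fs / N       # 937.5 Hz
amp_mv = 2.0
y_mv = amp_mv * np.sin(2 * np.pi * f0 * t)

# FFT in linear units first, scaled to a single-sided amplitude in mV.
# (The factor 2 is not appropriate for the DC and Nyquist bins.)
spectrum = np.fft.rfft(y_mv)
freqs = np.fft.rfftfreq(N, d=1.0 / fs)
amp_spectrum_mv = 2.0 / N * np.abs(spectrum)

# Only then convert to dB with the formula from the edit: 20*log10(mV) + 60.
floor = 1e-12          # avoids log10(0) in empty bins
amp_spectrum_db = 20.0 * np.log10(np.maximum(amp_spectrum_mv, floor)) + 60.0

peak = int(np.argmax(amp_spectrum_mv))
print(freqs[peak], amp_spectrum_mv[peak], amp_spectrum_db[peak])
```

The point of the ordering is that the FFT is linear in mV, not in dB, so the dB conversion has to come after the transform.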
Looks good to me. The FFT, or the Fourier transform in general, gives you the representation of your time-domain signal in the frequency domain.
Looking at your signal, it has two main components: something oscillating at around 500 Hz (period of 0.002 s) and an offset (which corresponds to 0 Hz). Looking at the result of the FFT, we can see mainly two peaks: one at 0 Hz and another that could be at 500 Hz (it is difficult to be sure without zooming in on the plot).
The only relation between the intensities is the one defined by Parseval's theorem, but having a signal oscillating around 64 dB doesn't mean its FFT should have values close to 64 dB. I suggest you take a look here.
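As a quick numerical check of Parseval's theorem, here is a small sketch with a made-up signal that loosely resembles yours (an offset of 64 plus a 500 Hz oscillation; the 0.5 amplitude is an assumption chosen only for illustration). With NumPy's unnormalized forward FFT, the sum of squares in time equals the sum of squared magnitudes in frequency divided by N.

```python
import numpy as np

fs = 120000.0
N = 1024
t = np.arange(N) / fs

# Hypothetical signal: offset of 64 plus a small 500 Hz oscillation.
x = 64.0 + 0.5 * np.sin(2 * np.pi * 500.0 * t)

X = np.fft.fft(x)

# Parseval: sum |x[n]|^2 == (1/N) * sum |X[k]|^2 for the unnormalized FFT.
energy_time = np.sum(np.abs(x) ** 2)
energy_freq = np.sum(np.abs(X) ** 2) / N
print(energy_time, energy_freq)

# The 0 Hz bin holds N times the mean of the signal, i.e. the offset.
print(np.abs(X[0]) / N, x.mean())
```

So the individual FFT values are tied to the signal through this energy balance and through the 1/N scaling of each bin, not by being numerically close to the time-domain level.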
Related
I have data from an accelerometer in m/s2 (Y-axis) over a time period in seconds (X-axis). I would like to convert this data in real time so that I get the acceleration as a function of frequency in Hz. I know that, for example, there is an FFT function in numpy, but I have no idea how to use it. I would appreciate it if somebody could provide example code to convert the raw data (Y: m/s2, X: s) to the desired data (Y: m/s2, X: Hz). It does not necessarily have to be exactly this function. Thanks!
First, let's create a time-domain signal.
For simplicity, I will create a signal made of two sine waves, at 12 Hz and 24 Hz; you can assume the values are in m/s^2:
import numpy as np
import matplotlib.pyplot as plt
# This would be the actual sample rate of your signal
# since you didn't provide that, I just picked one
# big enough to make our graphs look pretty
sample_rate = 22050
# To produce a 1-second wave
length = 1
# The x-axis of your time-domain signal
# (endpoint=False keeps the sample spacing at exactly 1/sample_rate)
t = np.linspace(0, length, sample_rate * length, endpoint=False)
# A signal with 2 frequency components
# - 12Hz and 24Hz
y = np.sin(12 * (2 * np.pi) * t) + 0.5*np.sin(24 * (2 * np.pi) * t)
# Plot time domain signal
plt.plot(t, y)
plt.xlabel("Time (s)")
plt.show()
This will output:
Now, we continue on with the script by taking the Fourier transform of our original time-domain signal and then creating the magnitude spectrum (since that gives us a better way to visualize how each component is contributing than the phase spectrum):
# This returns the Fourier transform coefficients as complex numbers
transformed_y = np.fft.fft(y)
# Take the absolute value of the complex numbers for magnitude spectrum
freqs_magnitude = np.abs(transformed_y)
# Create frequency x-axis that spans from 0 up to (but not including) sample_rate
freq_axis = np.linspace(0, sample_rate, len(freqs_magnitude), endpoint=False)
# Plot frequency domain
plt.plot(freq_axis, freqs_magnitude)
plt.xlabel("Frequency (Hz)")
plt.xlim(0, 100)
plt.show()
So now we can visualize the frequency-domain:
Notice how the magnitude of the 24Hz component is about half that of the 12Hz component. That is because I purposely multiplied the 24Hz component by 0.5 in the time-domain signal, so the 24Hz component 'contributes' less to the overall signal, hence we get this halved spike for that component.
Note, also, that the y-axis of our output signal is not really in m/s^2 per Hz as you wanted. But you could compute the actual m/s^2 values by taking the integral over your desired frequency band.
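If what you want is a y-axis directly in m/s^2, one common approach (a sketch of an alternative to integrating over a band, not the only convention) is to divide the magnitude by the number of samples and double the positive-frequency bins. The variable names below are illustrative, and the sketch assumes each tone falls exactly on an FFT bin, which holds here because the signal lasts exactly 1 second.

```python
import numpy as np

sample_rate = 22050
n_samples = sample_rate * 1            # 1 second of data
t = np.arange(n_samples) / sample_rate

# Same two components as above: amplitude 1.0 at 12 Hz, 0.5 at 24 Hz
y = np.sin(12 * (2 * np.pi) * t) + 0.5 * np.sin(24 * (2 * np.pi) * t)

# rfft keeps only the non-negative frequencies of a real signal
spectrum = np.fft.rfft(y)
freq_axis = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)

# Single-sided amplitude in the same unit as y (m/s^2 here).
# The factor 2 does not apply to the 0 Hz bin.
amplitude = 2.0 / n_samples * np.abs(spectrum)

for f_hz in (12, 24):
    k = int(np.argmin(np.abs(freq_axis - f_hz)))
    print(f"{freq_axis[k]:.1f} Hz -> {amplitude[k]:.3f} m/s^2")
```

For tones that do not land on a bin, or for windowed data, the peak height will read lower than the true amplitude, so treat this scaling as an approximation in those cases.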
I'll leave the jupyter notebook I created available here, feel free to use it and open issues if you have any problems with it.
I'm trying to do some audio cleaning, which I have never done before using Python or otherwise, and I came across the idea that I could use FFT to find the frequencies that make up my audio and eliminate the frequencies that don't belong. I did it on normal audio, but I couldn't understand the results, so I tried it on a simple sine wave.
I made it
frequency = 1000
num_samples = 48000
# The sampling rate of the analog-to-digital converter
sampling_rate = 48000.0
sine_wave = [np.sin(2 * np.pi * frequency * x1 / sampling_rate) for x1 in range(num_samples)]
sine_wave = np.array(sine_wave)
Then I played and plotted it, and it looked and sounded like a normal sine wave.
fig, ax = plt.subplots(figsize=(20,3))
ax.plot(sine_wave[:500])
IPython.display.Audio(data=sine_wave, rate=int(sampling_rate)) # play back at the rate the wave was generated with
But when I did fft and looked at the frequencies on a graph it didn't make sense
def do_fft(data_samples):
    data_fft = np.fft.fft(data_samples)
    freq = (np.abs(data_fft[:len(data_fft)]))
    plt.subplots(figsize=(20,10))
    plt.plot(freq)
    print("The frequency is {} Hz".format(np.argmax(freq)))
    return freq
sine_freq = do_fft(sine_wave)
sine_freq[47000]
For one, I don't really understand what my frequency array is supposed to mean. I know that a high number at a certain index K means that K Hz appears a lot in the sound. This would make sense since I got a value of like 23,999.99999 at 1000 Hz, which is what my wave frequency is. What doesn't make sense is that I got 24,000 for 47,000 Hz. That doesn't make any sense to me. Did I do something wrong? Is fft not working properly?
The FFT of strictly real (all imaginary components == zero) data is always conjugate mirror symmetric. That's just the way the math of an FFT works. So your 47kHz peak (the same as -1kHz) is just the mirror image of 1kHz at a 48k sample rate. The Nyquist folding frequency or mirroring hinge is at half the sample rate (and/or zero if you consider the upper bin frequencies negative).
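A small sketch of that mirror symmetry, rebuilding the sine wave from the question (variable names are illustrative). It checks the conjugate symmetry numerically, shows that bin 47000 corresponds to -1000 Hz, and uses np.fft.rfft, which returns only the non-mirrored half for real input.

```python
import numpy as np

sampling_rate = 48000
num_samples = 48000
frequency = 1000

n = np.arange(num_samples)
sine_wave = np.sin(2 * np.pi * frequency * n / sampling_rate)

spectrum = np.fft.fft(sine_wave)
freqs = np.fft.fftfreq(num_samples, d=1.0 / sampling_rate)

# Conjugate mirror symmetry for real input: X[k] == conj(X[N - k])
mirror_ok = np.allclose(spectrum[1:], np.conj(spectrum[1:][::-1]))
print("conjugate symmetric:", mirror_ok)

# Bin 47000 is the negative-frequency twin of bin 1000
print(freqs[1000], freqs[47000])
print(np.abs(spectrum[1000]), np.abs(spectrum[47000]))

# For real signals, rfft drops the redundant mirrored half
half = np.fft.rfft(sine_wave)
half_freqs = np.fft.rfftfreq(num_samples, d=1.0 / sampling_rate)
peak_hz = half_freqs[np.argmax(np.abs(half))]
print("peak at", peak_hz, "Hz")
```

Here the bin index equals the frequency in Hz only because the signal is exactly one second long; in general, map bins to frequencies with fftfreq or rfftfreq.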
I prefer to explicitly define time and calculate the frequencies. The FT should be plotted against those frequencies. The argmax alone computes the place in a vector, not a frequency.
A sampling rate (Sa/s) is not a signal frequency. The Nyquist theorem states, among other things, that the maximum frequency you can represent (in Hz) is half the sampling rate (in Sa/s).
Since we are using the complex FT (pos. and neg. frequencies shown), the -999.979 Hz is actually +1000 Hz.
import numpy as np
import matplotlib.pyplot as p
sampling_rate = 48000
t= np.linspace(0,1,sampling_rate+1) # time vector
dt=t[1]-t[0]
print(f'first/last time {t[0]}, {t[-1]}')
print(f'time interval : {dt}')
f = 1000
sig = np.sin(2 * np.pi * f*t)
fig = p.figure(figsize=(15,10))
p.subplot(211)
p.plot(sig[:500])
p.subplot(212)
ft = np.fft.fftshift(np.fft.fft(sig))
freq = np.fft.fftshift(np.fft.fftfreq(len(t), dt))
mag = np.abs(ft) # magnitude; the raw FT values are complex
p.plot(freq, mag)
print(f'argmax of FT is not a frequency, but a position in a vector : {np.argmax(mag)}')
f0 = freq[np.argmax(mag)]
print(f'the frequency is {f0:.3f} Hz')
p.show()
I work with vibration, and I am trying to get the following information from a FFT amplitude:
Peak to Peak
Peak
RMS
I am performing an FFT on a simple sine wave, using a Hann (Hanning) window.
Note that the "full amplitude" from the sine wave function is 5, and running the code below the FFT gives me 2.5 amplitude result. So, in this case, I am getting the peak from FFT. What about peak to peak and RMS?
P.S. I am not interested in the RMS over a frequency band (i.e. Parseval's theorem). I am interested in the RMS of each peak, as is usually shown in vibration software.
import numpy as np
import matplotlib.pyplot as plt
f_s = 100.0 # Hz sampling frequency
f = 1.0 # Hz
time = np.arange(0.0, 10.0, 1/f_s)
x = 5 * np.sin(2*np.pi*f*time)
N = len(time)
T = 1/f_s
# apply hann window and take the FFT
win = np.hanning(len(x))
FFT = np.fft.fft(win * x)
n = len(FFT)
yf = np.linspace(0.0,1.0/(2.0*T),N//2)
plt.figure(1)
plt.plot(yf,2.0/N * np.abs(FFT[0:N//2]))
plt.grid()
plt.figure(2)
plt.plot(time,x)
plt.xlabel('time')
plt.ylabel('Amplitude')
plt.grid()
plt.show()
You are getting a peak of 2.5 in the frequency-domain because that's the average amplitude of the windowed signal, and you are not compensating for the window weights. After normalizing the frequency-domain results to account for the window using the following:
plt.plot(yf,2.0/win.sum() * np.abs(FFT[0:N//2]))
you should get an amplitude of 5, just like in the time-domain. Note that this works provided that the input signal frequency is an exact multiple of f_s/N (which in your case is 0.1Hz), and provided that the underlying assumption that the input signal is either a pure tone or comprised of tones which are sufficiently separated in frequency is valid.
The peak-to-peak value would simply be twice the amplitude, so 10 in your example.
For the RMS value, you are probably interested in the RMS value of the corresponding time-domain sinusoidal tone component (under the assumption that the input signal is indeed composed of sinusoidal components whose frequencies are sufficiently separated). The RMS of a time-domain sinusoid of amplitude A is A/sqrt(2), so you simply need to divide your amplitude values by sqrt(2) to get the equivalent RMS value, so 5/sqrt(2) ~ 3.54 in your example.
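Putting the three quantities together, a minimal sketch based on your example could look like this. The array names are mine, and it relies on the same assumptions as above: a tone that falls exactly on a bin and is well separated from any other tone.

```python
import numpy as np

f_s = 100.0                 # Hz, sampling frequency
f = 1.0                     # Hz, tone frequency (an exact multiple of f_s/N)
N = 1000                    # 10 seconds of data
time = np.arange(N) / f_s
x = 5 * np.sin(2 * np.pi * f * time)

# Hann window, then FFT of the real signal (non-negative frequencies only)
win = np.hanning(N)
spectrum = np.fft.rfft(win * x)
freqs = np.fft.rfftfreq(N, d=1.0 / f_s)

# Compensate for the window weights instead of dividing by N
peak = 2.0 / win.sum() * np.abs(spectrum)   # amplitude (0-to-peak)
peak_to_peak = 2.0 * peak                   # twice the amplitude
rms = peak / np.sqrt(2.0)                   # RMS of a sinusoid of that amplitude

k = int(np.argmax(peak))
print(f"{freqs[k]:.1f} Hz: peak={peak[k]:.3f}, "
      f"peak-to-peak={peak_to_peak[k]:.3f}, rms={rms[k]:.3f}")
```

Only the values at the peak bin are meaningful here; the neighbouring bins carry the spread introduced by the window and should not be read as separate tones.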
I am trying to find the power spectral density of a signal measured at uneven times. The data looks something like this:
0 1.55
755 1.58
2412256 2.42
2413137 0.32
2497761 1.19
...
where the first column is the time since the first measurement (in seconds) and the second column is the value of the measurement.
Currently, using the periodogram function in Matlab, I have been able to estimate the power spectral density by using:
nfft = length(data(:,2));
pxx = periodogram(data(:,2),[],nfft);
Now at the moment, to plot this I have been using
len = length(pxx);
num = 1:1:len;
plot(num,pxx)
Which clearly does not place the correct x-axis on the power spectral density (and yields something like the plot below), which needs to be in frequency space. I am confused about how to go about this given the uneven sampling of the data.
What is the correct way to convert to (and then plot in) frequency space when estimating the power spectral density for data that has been unevenly sampled? I am also interested in tackling this from a python/numpy/scipy perspective but have so far only looked at the Matlab function.
I am not aware of any functions that calculate a PSD from irregularly sampled data, so you need to convert the data to a uniform sample rate first. The first step is to use interp1 to resample at regular time intervals.
avg_fs = 1/mean(diff(data(:, 1)));
min_time = min(data(:, 1));
max_time = max(data(:, 1));
num_pts = floor((max_time - min_time) * avg_fs);
new_time = (1:num_pts)' / avg_fs;
new_time = new_time - new_time(1) + min_time;
new_x = interp1(data(:, 1), data(:, 2), new_time);
I always use pwelch for calculating PSDs. Here is how I would go about it:
nfft = 512; % play with this to change your frequency resolution
noverlap = round(nfft * 0.75); % 75% overlap
window = hanning(nfft);
[Pxx,F] = pwelch(new_x, window, noverlap, nfft, avg_fs);
plot(F, Pxx)
xlabel('Frequency (Hz)')
grid on
You will definitely want to experiment with nfft: larger values will give you more frequency resolution (smaller spacing between frequencies), but the PSD will be noisier. One trick to get fine resolution and low noise is to make the window smaller than nfft.
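Since you also asked about Python, here is a rough NumPy/SciPy sketch of the same two steps, using np.interp in place of interp1 and scipy.signal.welch in place of pwelch. The input data is synthetic (random sample times and a 0.05 Hz tone that I made up for the example), so substitute your own time and value columns.

```python
import numpy as np
from scipy import signal

# Hypothetical unevenly sampled data: sorted random times (s) and a 0.05 Hz tone
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 2000.0, size=4000))
x = np.sin(2 * np.pi * 0.05 * t)

# Step 1: resample onto a uniform grid at the average sample rate
avg_fs = 1.0 / np.mean(np.diff(t))
num_pts = int(np.floor((t[-1] - t[0]) * avg_fs))
new_t = t[0] + np.arange(num_pts) / avg_fs
new_x = np.interp(new_t, t, x)          # linear interpolation

# Step 2: Welch PSD with a Hann window and 75% overlap
nfft = 512
freqs, pxx = signal.welch(
    new_x,
    fs=avg_fs,
    window="hann",
    nperseg=nfft,
    noverlap=int(round(nfft * 0.75)),
    nfft=nfft,
)

peak_hz = freqs[np.argmax(pxx)]
print(f"average fs = {avg_fs:.3f} Hz, PSD peak near {peak_hz:.4f} Hz")
```

Keep in mind that linear interpolation across long gaps (such as the jump from 755 s to 2412256 s in your data) invents samples, so it may be worth analysing the densely sampled stretches separately.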
After taking the Discrete Fourier Transform of some samples with scipy.fftpack.fft() and plotting the magnitude of these I notice that it doesn't equal the amplitude of the original signal. Is there a relationship between the two?
Is there a way to compute the amplitude of the original signal from the Fourier coefficients without reversing the transform?
Here's an example of a sine wave with amplitude 7.0 and FFT amplitude 3.5:
from numpy import sin, linspace, pi, arange
from pylab import plot, show, title, xlabel, ylabel, subplot
from scipy.fftpack import fft
def plotSpectrum(y,Fs):
    """
    Plots a Single-Sided Amplitude Spectrum of y(t)
    """
    n = len(y) # length of the signal
    k = arange(n)
    T = n/Fs
    frq = k/T # two sides frequency range
    frq = frq[:n//2] # one side frequency range
    Y = fft(y)/n # fft computing and normalization
    Y = Y[:n//2] # keep the positive-frequency half
    plot(frq,abs(Y),'r') # plotting the spectrum
    xlabel('Freq (Hz)')
    ylabel('|Y(freq)|')
Fs = 150.0; # sampling rate
Ts = 1.0/Fs; # sampling interval
t = arange(0,1,Ts) # time vector
ff = 5; # frequency of the signal
y = 7.0 * sin(2*pi*ff*t)
subplot(2,1,1)
plot(t,y)
xlabel('Time')
ylabel('Amplitude')
subplot(2,1,2)
plotSpectrum(y,Fs)
show()
Yes, Parseval's Theorem tells us that the total power in the frequency domain is equal to the total power in the time domain.
What you may be seeing though is the result of a scaling factor in the forward FFT. The size of this scaling factor is a matter of convention, but most commonly it's a factor of N, where N is the number of data points. However it can also be equal to 1 or sqrt(N). Check your FFT documentation for this.
Also note that if you only take the power from half of the frequency domain bins (commonly done when the time domain signal is purely real and you have complex conjugate symmetry in the frequency domain) then there will be a factor of 2 to take care of.
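To make the two factors concrete, here is a small sketch using the signal from the question (amplitude 7.0 at 5 Hz, 150 samples). It uses the norm argument of np.fft.fft to switch between the three conventions; the "backward" and "forward" names need a reasonably recent NumPy (1.20 or later, as far as I know), and the dictionary name is just for illustration.

```python
import numpy as np

Fs = 150.0                      # sampling rate from the question
n = 150                         # one second of samples
t = np.arange(n) / Fs
y = 7.0 * np.sin(2 * np.pi * 5 * t)

# Peak magnitude under each forward-scaling convention:
#   "backward": no scaling, "ortho": 1/sqrt(n), "forward": 1/n
peaks = {}
for norm in ("backward", "ortho", "forward"):
    peaks[norm] = np.abs(np.fft.fft(y, norm=norm)).max()
    print(norm, peaks[norm])

# With 1/n scaling each of the two mirrored bins holds half the amplitude,
# so doubling the positive-frequency bin recovers the original 7.0.
amplitude = 2.0 * peaks["forward"]
print("recovered amplitude:", amplitude)
```

So the 3.5 in the question is the 1/n-scaled value of one of the two mirrored bins, and the missing factor of 2 comes from keeping only the positive half of the spectrum.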