Why is the amplitude I compute far, far away from the original after a fast Fourier transform (FFT)?
I have a signal with 1024 points and a sampling interval of 1/120000 s (the T in the code below). I apply the fast Fourier transform in Python with scipy.fftpack. I normalize the calculated magnitude by the number of bins and multiply by 2, as I plot only the positive frequencies.
My initial signal amplitude is around 64 dB, yet I get very low amplitude values, less than 1.
Please see my code:
from scipy.fftpack import fft, fftfreq
import numpy as np
import matplotlib.pyplot as plt

Signal = well.ReadWellData(SignalNDB)
y, x = Signal.GetData(numpy=np)

N = y.size                                # number of sample points, 1024
T = 1/120000                              # sampling interval (sec)
x = np.linspace(0.0, N*T, N)
yf = abs(fft(y))                          # FFT magnitude
xf = np.linspace(0.0, 1.0/(2.0*T), N//2)  # frequency bins (positive half)
freqs = fftfreq(N, T)

ax1 = plt.subplot(211)
ax1.plot(x, y)
plt.grid()

ax2 = plt.subplot(212)
yf2 = 2/N * np.abs(yf[0:N//2])            # normalize magnitude by number of bins and multiply by 2
ax2.semilogy(xf, yf2)                     # frequency vs amplitude, positive frequencies only
plt.grid()

ax1.set_title("check")
#ax2.set_xlim([0, 4000])
plt.show()
Please see my plot:
EDIT:
Finally, my signal amplitude after the FFT is exactly what I expected. What I did:
First I performed the FFT on the signal in mV. Then I converted the result to dB using the formula 20*log10(mV) + 60, where 60 dB corresponds to the 1 mV reference provided by the tool manufacturer. The dB values are therefore shown on a linear scale in the bottom plot rather than on a log scale.
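A minimal sketch of that conversion, reusing yf2 and xf from the code above (the +60 dB offset is the 1 mV reference quoted by the tool manufacturer, so treat it as specific to this tool):

yf2_db = 20 * np.log10(yf2) + 60      # mV -> dB; 60 dB corresponds to the 1 mV reference
plt.plot(xf, yf2_db)                  # linear y-axis, since the values are already in dB
plt.show()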
Please see the resulting plot below.
Looks good to me. The FFT, or the Fourier transform in general, gives you the representation of your time-domain signal in the frequency domain.
Taking a look at your signal, you have two main components: something oscillating at around 500 Hz (a period of 0.002 s) and an offset (which corresponds to freq = 0 Hz). Looking at the result of the FFT, we can see two main peaks: one at 0 Hz and the other probably at 500 Hz (difficult to be sure without zooming in on the signal).
The only relation between the intensities is given by Parseval's theorem; a signal oscillating around 64 dB does not mean its FFT should have values close to 64 dB. I suggest you take a look here.
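As a quick numerical illustration of Parseval's theorem (a minimal sketch, not part of the original answer, using an arbitrary 500 Hz test tone): the signal's energy computed in the time domain equals the energy of its FFT coefficients divided by N, but neither quantity says anything about the 64 dB level of the original signal.

import numpy as np
from scipy.fftpack import fft

N = 1024
t = np.arange(N) / 120000.0                  # same sampling interval as in the question
y = 2.0 * np.sin(2 * np.pi * 500 * t) + 1.0  # 500 Hz tone plus an offset

Y = fft(y)
energy_time = np.sum(np.abs(y) ** 2)
energy_freq = np.sum(np.abs(Y) ** 2) / N
print(energy_time, energy_freq)              # the two values agree (Parseval's theorem)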
I work with vibration, and I am trying to get the following information from an FFT amplitude spectrum:
Peak to Peak
Peak
RMS
I am performing an FFT on a simple sine wave, applying a Hann (Hanning) window.
Note that the "full amplitude" of the sine wave is 5, and running the code below the FFT gives me an amplitude of 2.5. So, in this case, I am getting the peak from the FFT. What about peak-to-peak and RMS?
P.S. I am not interested in the RMS over a frequency band (i.e. Parseval's theorem). I am interested in the RMS of each peak, as usually shown in vibration software.
import numpy as np
import matplotlib.pyplot as plt
f_s = 100.0 # Hz sampling frequency
f = 1.0 # Hz
time = np.arange(0.0, 10.0, 1/f_s)
x = 5 * np.sin(2*np.pi*f*time)
N = len(time)
T = 1/f_s
# apply hann window and take the FFT
win = np.hanning(len(x))
FFT = np.fft.fft(win * x)
n = len(FFT)
yf = np.linspace(0.0,1.0/(2.0*T),N//2)
plt.figure(1)
plt.plot(yf,2.0/N * np.abs(FFT[0:N//2]))
plt.grid()
plt.figure(2)
plt.plot(time,x)
plt.xlabel('time')
plt.ylabel('Amplitude')
plt.grid()
plt.show()
You are getting a peak of 2.5 in the frequency domain because the Hann window scales the tone by its average value (a coherent gain of 0.5), and you are not compensating for the window weights. After normalizing the frequency-domain result to account for the window, using the following:
plt.plot(yf,2.0/win.sum() * np.abs(FFT[0:N//2]))
you should get an amplitude of 5, just like in the time domain. Note that this works provided that the input signal frequency is an exact multiple of f_s/N (which in your case is 0.1 Hz), and provided that the underlying assumption holds that the input signal is either a pure tone or composed of tones sufficiently separated in frequency.
The peak-to-peak value would simply be twice the amplitude, so 10 in your example.
For the RMS value, you are probably interested in the RMS of the corresponding time-domain sinusoidal component (again under the assumption that the input signal is composed of sinusoidal components whose frequencies are sufficiently separated). The RMS of a time-domain sinusoid of amplitude A is A/sqrt(2), so you simply need to divide your amplitude values by sqrt(2) to get the corresponding RMS values: 5/sqrt(2) ≈ 3.54 in your example.
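Putting the pieces together, a small sketch of how the three numbers could be read off the windowed FFT for the signal in the question (taking the maximum of the spectrum as the peak is just for illustration; it assumes a single dominant tone):

import numpy as np

f_s = 100.0
time = np.arange(0.0, 10.0, 1 / f_s)
x = 5 * np.sin(2 * np.pi * 1.0 * time)

win = np.hanning(len(x))
N = len(x)
spectrum = 2.0 / win.sum() * np.abs(np.fft.fft(win * x)[:N // 2])  # window-compensated amplitudes

amplitude = spectrum.max()          # ~5.0  (peak)
peak_to_peak = 2 * amplitude        # ~10.0
rms = amplitude / np.sqrt(2)        # ~3.54
print(amplitude, peak_to_peak, rms)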
I want to make a plot of power spectral density versus frequency for a signal using the numpy.fft.fft function. I want to do this so that I can preserve the complex information in the transform and know what I'm doing, as opposed to relying on higher-level functions (like scipy's periodogram). I'm following Mathworks' nice page about doing PSD analysis using Matlab's fft function: https://www.mathworks.com/help/matlab/ref/fft.html
In this example, I expect the PSD to peak at the frequency I used to construct the signal, which was 100 in this case. I generate the signal using 1000 time points and a frequency of 100 inverse time units. I thought the FFT magnitude could be plotted against [0, nt/2] and the peaks would show up where most of the signal's energy is. When I did this, things went wrong: I expected my PSD to peak at 100, and it did not.
How can I make a spectral density plot of frequency vs energy contained in that frequency using np.fft.fft?
Edit
To clarify: in my real problem, I only know that my characteristic frequency is much larger than my sample frequency.
import matplotlib.pyplot as plt
import numpy as np
t = np.arange(1000)
sp = np.fft.fft(np.sin(100 * t * np.pi))
trange = np.linspace(0, t[-1] / 2, t.size)
plt.plot(trange, np.abs(sp) / t.size)
plt.show()
This is a sketch I made of the expected output:
What is your sample frequency? The sequence you are generating can represent an infinite number of continuous-time signals, depending on the sample frequency.
The sample frequency needs to be at least twice the maximum signal frequency, as stated by the sampling theorem, so using fs = 250 Hz and a 10-second sine it becomes:
import matplotlib.pyplot as plt
import numpy as np
fs = 250
t = np.arange(0, 10, 1/fs)
sp = np.fft.fft(np.sin(2*np.pi * 100 * t))
trange = np.linspace(0, fs, len(t))
plt.plot(trange, np.abs(sp))
plt.show()
If you run this you will see a peak at 100Hz as expected.
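A slight variation on the same idea (not in the original answer) is to build the frequency axis directly from the sample spacing with np.fft.rfftfreq and keep only the one-sided spectrum:

import matplotlib.pyplot as plt
import numpy as np

fs = 250                                         # Hz, at least twice the 100 Hz signal frequency
t = np.arange(0, 10, 1 / fs)
sp = np.fft.rfft(np.sin(2 * np.pi * 100 * t))    # one-sided spectrum
freqs = np.fft.rfftfreq(len(t), d=1 / fs)        # matching frequency axis in Hz

plt.plot(freqs, np.abs(sp) / len(t))
plt.xlabel('Frequency [Hz]')
plt.show()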
I've got a little problem managing FFT data. I have looked at many examples of how to do an FFT, but I couldn't get what I want from any of them. I have a random wave file with a 44 kHz sample rate and I want to get the magnitude of N harmonics every X ms; let's say 100 ms should be enough. I tried this code:
import scipy.io.wavfile as wavfile
import numpy as np
import pylab as pl
rate, data = wavfile.read("sound.wav")
t = np.arange(len(data[:,0]))*1.0/rate
p = 20*np.log10(np.abs(np.fft.rfft(data[:2048, 0])))
f = np.linspace(0, rate/2.0, len(p))
pl.plot(f, p)
pl.xlabel("Frequency(Hz)")
pl.ylabel("Power(dB)")
pl.show()
This was the last example I used; I found it somewhere on Stack Overflow. The problem is that this gets the magnitude I want, and the frequency, but no time at all. An FFT analysis over time is 3D as far as I know, and this is the "merged" result of all harmonics. I get this:
X-axis = Frequency, Y-axis = Magnitude, Z-axis = Time (invisible)
From my understanding of the code, t is time, and it looks right, but it isn't actually needed in the code (we may need it later, though). p is an array of powers (or magnitudes), but it seems like some average of all the magnitudes of each frequency in f, which is the array of frequencies. I don't want an averaged/merged value; I want the magnitude of N harmonics every X milliseconds.
Long story short, we can get: one magnitude per frequency, for the whole file.
We want: the magnitudes of N frequencies, including the time at which each magnitude is present.
The result should look like this array: [time, frequency, amplitude]
So in the end, if we want 3 harmonics, it would look like:
[0,100,2.85489] #100Hz harmonic has 2.85489 amplitude on 0ms
[0,200,1.15695] #200Hz ...
[0,300,3.12215]
[100,100,1.22248] #100Hz harmonic has 1.22248 amplitude on 100ms
[100,200,1.58758]
[100,300,2.57578]
[200,100,5.16574]
[200,200,3.15267]
[200,300,0.89987]
Visualization is not needed; the result should just be arrays (or hashes/dictionaries) as listed above.
Further to Paul R's answer, scipy.signal.spectrogram is a spectrogram function in scipy's signal processing module.
The example at the above link is as follows:
from scipy import signal
import matplotlib.pyplot as plt
import numpy as np
# Generate a test signal, a 2 Vrms sine wave whose frequency linearly
# changes with time from 1kHz to 2kHz, corrupted by 0.001 V**2/Hz of
# white noise sampled at 10 kHz.
fs = 10e3
N = 1e5
amp = 2 * np.sqrt(2)
noise_power = 0.001 * fs / 2
time = np.arange(N) / fs
freq = np.linspace(1e3, 2e3, N)
x = amp * np.sin(2*np.pi*freq*time)
x += np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
# Compute and plot the spectrogram.
f, t, Sxx = signal.spectrogram(x, fs)
plt.pcolormesh(t, f, Sxx)
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
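Since the question asks for plain [time, frequency, amplitude] rows rather than a plot, the same outputs can be flattened into that layout. A minimal sketch, reusing x, fs and the imports from the example above (mode='magnitude' makes the values amplitudes rather than power densities; times are converted to ms):

f, t, Sxx = signal.spectrogram(x, fs, mode='magnitude')
rows = []
for i, time_s in enumerate(t):
    for j, freq_hz in enumerate(f):
        rows.append([time_s * 1000, freq_hz, Sxx[j, i]])   # [ms, Hz, amplitude]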
It looks like you're trying to implement a spectrogram, which is a sequence of power spectrum estimates, typically implemented with a succession of (usually overlapping) FFTs. Since you only have one FFT (one spectrum), you have no time dimension yet. Put your FFT code in a loop and process one block of samples (e.g. 1024) per iteration, with a 50% overlap between successive blocks. The sequence of generated spectra will then be a 3D array of time vs frequency vs magnitude.
I'm not a Python person, but I can give you some pseudocode which should be enough to get you coding:
N = length of data input
N_FFT = no of samples per block (== FFT size, e.g. 1024)
i = 0                              ;; i = index of spectrum within 3D output array
for block_start = 0 to N - N_FFT
    block_end = block_start + N_FFT
    get samples from block_start .. block_end
    apply window function to block (e.g. Hamming)
    apply FFT to windowed block
    calculate magnitude spectrum (20 * log10(sqrt(re*re + im*im)))
    store spectrum in output array at index i
    block_start += N_FFT / 2       ;; NB: 50% overlap
    i++
end
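For completeness, a rough Python translation of the pseudocode above (a sketch only; the block size, window choice, overlap and dB floor are all parameters you may want to tune):

import numpy as np

def block_spectra(data, n_fft=1024):
    """Magnitude spectra (in dB) from 50%-overlapping, Hamming-windowed blocks."""
    window = np.hamming(n_fft)
    hop = n_fft // 2                                       # 50% overlap
    spectra = []
    for block_start in range(0, len(data) - n_fft + 1, hop):
        block = data[block_start:block_start + n_fft] * window
        magnitude_db = 20 * np.log10(np.abs(np.fft.rfft(block)) + 1e-12)  # small floor avoids log(0)
        spectra.append(magnitude_db)
    return np.array(spectra)                               # shape: (num_blocks, n_fft//2 + 1)

Each row of the returned array is one spectrum; the time of row i is roughly i * (n_fft / 2) / sample_rate seconds.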
Edit: So it seems this returns values, but they don't fit the audio file at all. Even though they could be used as magnitudes on a spectrogram, they won't work, for example, in those classic audio visualizers you can see in many music players. I also tried matplotlib's pylab for the spectrogram, but the result is the same.
import os
import wave
import pylab
import math
from numpy import amax
from numpy import amin

def get_wav_info(wav_file, mi, mx):
    wav = wave.open(wav_file, 'r')
    frames = wav.readframes(-1)
    sound_info = pylab.fromstring(frames, 'Int16')
    frame_rate = wav.getframerate()
    wav.close()
    spectrum, freqs, t, im = pylab.specgram(sound_info, NFFT=1024, Fs=frame_rate)
    n = 0
    while n < 20:
        for index, power in enumerate(spectrum[n]):
            print("%s,%s,%s" % (n, int(round(t[index] * 1000)), math.ceil(power * 100) / 100))
        n += 1

get_wav_info("wave.wav", 1, 20)
Any tips on how to obtain dB values that are usable for visualization?
Basically, we apparently have everything we need in the code above; the question is just how to make it return sensible values. Ignore mi and mx, as those are only for rescaling the values into the mi..mx interval, which would be for the visualization itself. If I understand correctly, spectrum in this code returns an array of arrays containing the amplitude of each frequency from the freqs array at the times given by the t array. But how do these values work: are they really amplitudes if they come out as these weird numbers, and if so, how do I convert them to dB, for example?
tl;dr: I need output like the visualizers music players have, but it doesn't have to run in real time; I just want the data, but the values don't fit the wav file.
Edit 2: I noticed one more issue. For a 90-second wav, the t array contains times up to 175.x seconds, which seems very odd considering frame_rate matches the wav file. So now we have two problems: spectrum doesn't seem to return correct values (maybe it will fit once the times are correct), and t seems to span exactly double the duration of the wav.
Fixed: Case completely solved.
import os
import pylab
import math
from numpy import amax
from numpy import amin
from scipy.io import wavfile

wav_file = "wave.wav"   # same file as before
frame_rate, snd = wavfile.read(wav_file)
sound_info = snd[:, 0]  # use only the first channel
spectrum, freqs, t, im = pylab.specgram(sound_info, NFFT=1024, Fs=frame_rate,
                                        noverlap=5, mode='magnitude')
specgram needed a little adjustment, and I loaded only one channel with the scipy.io library instead of the wave library (reading the interleaved stereo frames as one stream is what made t span double the duration). Also, without mode set to 'magnitude', specgram returns power rather than magnitude (a 10·log10 quantity in dB rather than 20·log10), which is why it didn't return the values I expected.
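If dB values are still wanted for the visualizer, the magnitude values from specgram could be converted afterwards. A minimal sketch (the 32768 full-scale reference for 16-bit audio is an assumption; pick whatever reference your visualizer expects):

import numpy as np

def to_db(magnitude, ref=32768.0):
    # convert linear magnitudes to dB relative to the chosen reference (dBFS for 16-bit audio)
    return 20 * np.log10(np.maximum(magnitude, 1e-12) / ref)

# usage with the specgram output above:
# spectrum_db = to_db(spectrum)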
I am trying to find the power spectral density of a signal measured at uneven times. The data looks something like this:
0 1.55
755 1.58
2412256 2.42
2413137 0.32
2497761 1.19
...
where the first column is the time since the first measurement (in seconds) and the second column is the value of the measurement.
Currently, using the periodogram function in Matlab, I have been able to estimate the power spectral density by using:
nfft = length(data(:,2));
pxx = periodogram(data(:,2),[],nfft);
At the moment, to plot this I have been using:
len = length(pxx);
num = 1:1:len;
plot(num,pxx)
This clearly does not put the correct x-axis on the power spectral density (it yields something like the plot below); the axis needs to be in frequency space. I am confused about how to go about this given the uneven sampling of the data.
What is the correct way to convert to (and then plot in) frequency space when estimating the power spectral density for data that has been unevenly sampled? I am also interested in tackling this from a python/numpy/scipy perspective but have so far only looked at the Matlab function.
I am not aware of any functions that calculate a PSD from irregularly sampled data, so you need to convert the data to a uniform sample rate first. The first step is therefore to use interp1 to resample at regular time intervals.
avg_fs = 1/mean(diff(data(:, 1)));                    % average sampling rate implied by the time stamps
min_time = min(data(:, 1));
max_time = max(data(:, 1));
num_pts = floor((max_time - min_time) * avg_fs);      % number of uniformly spaced samples
new_time = (1:num_pts)' / avg_fs;
new_time = new_time - new_time(1) + min_time;         % uniform time vector starting at the first sample
new_x = interp1(data(:, 1), data(:, 2), new_time);    % interpolate onto the uniform grid
I always use pwelch for calculating PSDs; here is how I would go about it:
nfft = 512; % play with this to change your frequency resolution
noverlap = round(nfft * 0.75); % 75% overlap
window = hanning(nfft);
[Pxx,F] = pwelch(new_x, window, noverlap, nfft, avg_fs);
plot(F, Pxx)
xlabel('Frequency (Hz)')
grid on
You will definitely want to experiment with nfft: larger values give you finer frequency resolution (smaller spacing between frequencies), but the PSD will be noisier. One trick to get fine resolution and low noise is to make the window shorter than nfft.
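Since the question also mentions a python/numpy/scipy angle, the same resample-then-Welch approach could look roughly like this (a sketch under the same assumptions; the file name is hypothetical and data is the two-column [time, value] array shown in the question):

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import welch

data = np.loadtxt('measurements.txt')                  # hypothetical file with the two columns above

avg_fs = 1.0 / np.mean(np.diff(data[:, 0]))            # average sampling rate from the time stamps
new_time = np.arange(data[0, 0], data[-1, 0], 1.0 / avg_fs)
new_x = np.interp(new_time, data[:, 0], data[:, 1])    # resample onto a uniform grid

nfft = 512                                             # play with this, as above
f, Pxx = welch(new_x, fs=avg_fs, window='hann',
               nperseg=nfft, noverlap=round(nfft * 0.75))

plt.semilogy(f, Pxx)
plt.xlabel('Frequency (Hz)')
plt.grid(True)
plt.show()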