I want to make a plot of power spectral density versus frequency for a signal using the numpy.fft.fft function. I want to do this so that I can preserve the complex information in the transform and know what I'm doing, as apposed to relying on higher-level functions provided by numpy (like the periodogram function). I'm following Mathwork's nice page about doing PSD analysis using Matlab's fft function: https://www.mathworks.com/help/matlab/ref/fft.html
In this example, I expect the PSD to peak at the frequency I used to construct the signal, which was 100 in this case. I generate the signal using 1000 time points a frequency of 100 inverse time units. I thought that the fft magnitude could be plotted against [0, nt/2] and the peaks would show up where there is the most energy in the frequency. When I did this, things went wrong. I expected my PSD to peak at 100.
How can I make a spectral density plot of frequency vs energy contained in that frequency using np.fft.fft?
Edit
to clarify, in my real problem, I only know that my characteristic frequency is much larger than my sample frequency
import matplotlib.pyplot as plt
import numpy as np
t = np.arange(1000)
sp = np.fft.fft(np.sin(100 * t * np.pi))
trange = np.linspace(0, t[-1] / 2, t.size)
plt.plot(trange, np.abs(sp) / t.size)
plt.show()
This is a sketch I made of the expected output:
What is your sample frequency? This sequence you are generating can represent a infinite number of continuous time signals according to the sample frequency.
The sample frequency needs to be at least twice the maximum signal frequency, as stated by the Sampling Theorem, so, using fs = 250Hz and using a sine of 10 seconds it becomes:
import matplotlib.pyplot as plt
import numpy as np
fs = 250
t = np.arange(0, 10, 1/fs)
sp = np.fft.fft(np.sin(2*np.pi * 100 * t))
trange = np.linspace(0, fs, len(t))
plt.plot(trange, np.abs(sp))
plt.show()
If you run this you will see a peak at 100Hz as expected.
Related
I have data from the accelerometer in m/s2 (Y-axis) for a time period in seconds (X-axis). I would like to convert this data real-time so that I get the value of an acceleration related to the frequency in Hz. I know that, for example, there is an FFT function in numpy, but I have no idea at all how to use it. I would appreciate, if somebody could provide an example code to convert the raw data (Y: m/s2, X: s) to the desired data (Y: m/s2, X: Hz). It should not be necessarily exactly this function. Thanks!
First, let's create a time-domain signal.
For simplicity, I will create a sine wave with frequency components 12Hz and 24Hz and you can assume the unit of the values are m/s^2:
import numpy as np
import matplotlib.pyplot as plt
# This would be the actual sample rate of your signal
# since you didn't provide that, I just picked one
# big enough to make our graphs look pretty
sample_rate = 22050
# To produce a 1-second wave
length = 1
# The x-axis of your time-domain signal
t = np.linspace(0, length, sample_rate * length)
# A signal with 2 frequency components
# - 12Hz and 24Hz
y = np.sin(12 * (2 * np.pi) * t) + 0.5*np.sin(24 * (2 * np.pi) * t)
# Plot time domain signal
plt.plot(t, y)
plt.xlabel("Time (s)")
plt.show()
This will output:
Now, we continue on with the script by taking the Fourier transform of our original time-domain signal and then creating the magnitude spectrum (since that gives us a better way to visualize how each component is contributing than the phase spectrum):
# This returns the fourier transform coeficients as complex numbers
transformed_y = np.fft.fft(y)
# Take the absolute value of the complex numbers for magnitude spectrum
freqs_magnitude = np.abs(transformed_y)
# Create frequency x-axis that will span up to sample_rate
freq_axis = np.linspace(0, sample_rate, len(freqs_magnitude))
# Plot frequency domain
plt.plot(freq_axis, freqs_magnitude)
plt.xlabel("Frequency (Hz)")
plt.xlim(0, 100)
plt.show()
So now we can visualize the frequency-domain:
Notice how the magnitude of the 24Hz component is about half of the 12Hz component. That is because I purposely timed the 24Hz component by 0.5 on the time-domain signal, so the 24Hz component 'contributes' less to the overall signal, hence we get this halved spike for that component.
Note, also, that the y-axis of our output signal is not really in m/s^2 per Hz as you wanted. But you could compute the actual m/s^2 values by taking the integral over your desired frequency band.
I'll leave the jupyter notebook I created available here, feel free to use it and open issues if you have any problems with it.
I have a vibration data in time domain and want to convert it to frequency domain with fft. However the plot of the FFT only shows a big spike at zero and nothing else.
This is my vibration data: https://pastebin.com/7RK57kJW
My code:
import numpy as np
import matplotlib.pyplot as plt
t = np.arange(3000)
a1_fft= np.fft.fft(a1, axis=0)
freq = np.fft.fftfreq(t.shape[-1])
plt.plot(freq, a1_fft)
My FFT Plot:
What am I doing wrong here? I am pretty sure my data is uniform, which provoces in other cases a similar problem with fft.
The bins of the FFT correspond to the frequencies at 0, df, 2df, 3df, ..., F-2df, F-df, where df is determined by the number of bins and F is 1 cycle per bin.
Notice the zero frequency at the beginning. This is called the DC offset. It's the mean of your data. In the data that you show, the mean is ~1.32, while the amplitude of the sine wave is around 0.04. It's not surprising that you can't see a peak that's 33x smaller than the DC term.
There are some common ways to visualize the data that help you get around this. One common methods is to keep the DC offset but use a log scale, at least for the y-axis:
plt.semilogy(freq, a1_fft)
OR
plt.loglog(freq, a1_fft)
Another thing you can do is zoom in on the bottom 1/33rd or so of the plot. You can do this manually, or by adjusting the span of the displayed Y-axis:
p = np.abs(a1_fft[1:]).max() * [-1.1, 1.1]
plt.ylim(p)
If you are plotting the absolute values already, use
p = np.abs(a1_fft[1:]).max() * [-0.1, 1.1]
Another method is to remove the DC offset. A more elegant way of doing this than what #J. Schmidt suggests is to simply not display the DC term:
plt.plot(freq[1:], a1_fft[1:])
Or for the positive frequencies only:
n = freq.size
plt.plot(freq[1:n//2], a1_fft[1:n//2])
The cutoff at n // 2 is only approximate. The correct cutoff depends on whether the FFT has an even or odd number of elements. For even numbers, the middle bin actual has energy from both sides of the spectrum and often gets special treatment.
The peak at 0 is the DC-gain, which is very high since you didn't normalize your data. Also, the Fourier transform is a complex number, you should plot the absolute value and phase separately. In this code I also plotted only the positive frequencies:
import numpy as np
import matplotlib.pyplot as plt
#Import data
a1 = np.loadtxt('a1.txt')
plt.plot(a1)
#Normalize a1
a1 -= np.mean(a1)
#Your code
t = np.arange(3000)
a1_fft= np.fft.fft(a1, axis=0)
freq = np.fft.fftfreq(t.shape[-1])
#Only plot positive frequencies
plt.figure()
plt.plot(freq[freq>=0], np.abs(a1_fft)[freq>=0])
I've got a little problem managing FFT data. I was looking for many examples of how to do FFT, but I couldn't get what I want from any of them. I have a random wave file with 44kHz sample rate and I want to get magnitude of N harmonics each X ms, let's say 100ms should be enough. I tried this code:
import scipy.io.wavfile as wavfile
import numpy as np
import pylab as pl
rate, data = wavfile.read("sound.wav")
t = np.arange(len(data[:,0]))*1.0/rate
p = 20*np.log10(np.abs(np.fft.rfft(data[:2048, 0])))
f = np.linspace(0, rate/2.0, len(p))
pl.plot(f, p)
pl.xlabel("Frequency(Hz)")
pl.ylabel("Power(dB)")
pl.show()
This was last example I used, I found it somewhere on stackoverflow. The problem is, this gets magnitude which I want, gets frequency, but no time at all. FFT analysis is 3D as far as I know and this is "merged" result of all harmonics. I get this:
X-axis = Frequency, Y-axis = Magnitude, Z-axis = Time (invisible)
From my understanding of the code, t is time - and it seems like that, but is not needed in the code - We'll maybe need it though. p is array of powers (or magnitude), but it seems like some average of all magnitudes of each frequency f, which is array of frequencies. I don't want average/merged value, I want magnitude for N harmonics each X milliseconds.
Long story short, we can get: 1 magnitude of all frequencies.
We want: All magnitudes of N freqeuencies including time when certain magnitude is present.
Result should look like this array: [time,frequency,amplitude]
So in the end if we want 3 harmonics, it would look like:
[0,100,2.85489] #100Hz harmonic has 2.85489 amplitude on 0ms
[0,200,1.15695] #200Hz ...
[0,300,3.12215]
[100,100,1.22248] #100Hz harmonic has 1.22248 amplitude on 100ms
[100,200,1.58758]
[100,300,2.57578]
[200,100,5.16574]
[200,200,3.15267]
[200,300,0.89987]
Visualization is not needed, result should be just arrays (or hashes/dictionaries) as listed above.
Further to #Paul R's answer, scipy.signal.spectrogram is a spectrogram function in scipy's signal processing module.
The example at the above link is as follows:
from scipy import signal
import matplotlib.pyplot as plt
# Generate a test signal, a 2 Vrms sine wave whose frequency linearly
# changes with time from 1kHz to 2kHz, corrupted by 0.001 V**2/Hz of
# white noise sampled at 10 kHz.
fs = 10e3
N = 1e5
amp = 2 * np.sqrt(2)
noise_power = 0.001 * fs / 2
time = np.arange(N) / fs
freq = np.linspace(1e3, 2e3, N)
x = amp * np.sin(2*np.pi*freq*time)
x += np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
#Compute and plot the spectrogram.
f, t, Sxx = signal.spectrogram(x, fs)
plt.pcolormesh(t, f, Sxx)
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
It looks like you're trying to implement a spectrogram, which is a sequence of power spectrum estimates, typically implemented with a succession of (usually overlapping) FFTs. Since you only have one FFT (spectrum) then you have no time dimension yet. Put your FFT code in a loop, and process one block of samples (e.g. 1024) per iteration, with a 50% overlap between successive blocks. The sequence of generated spectra will then be a 3D array of time v frequency v magnitude.
I'm not a Python person, but I can give you some pseudo code which should be enough to get you coding:
N = length of data input
N_FFT = no of samples per block (== FFT size, e.g. 1024)
i = 0 ;; i = index of spectrum within 3D output array
for block_start = 0 to N - block_start
block_end = block_start + N_FFT
get samples from block_start .. block_end
apply window function to block (e.g. Hamming)
apply FFT to windowed block
calculate magnitude spectrum (20 * log10( re*re + im*im ))
store spectrum in output array at index i
block_start += N_FFT / 2 ;; NB: 50% overlap
i++
end
Edit: Oh, so it seems this returns values, but they don't fit to the audio file at all. Even though they can be used as magnitude on spectrogram, they won't work for example in those classic audio visualizers which you can see in many music players. I also tried matplotlib's pylab for the spectrogram, but the result is same.
import os
import wave
import pylab
import math
from numpy import amax
from numpy import amin
def get_wav_info(wav_file,mi,mx):
wav = wave.open(wav_file, 'r')
frames = wav.readframes(-1)
sound_info = pylab.fromstring(frames, 'Int16')
frame_rate = wav.getframerate()
wav.close()
spectrum, freqs, t, im = pylab.specgram(sound_info, NFFT=1024, Fs=frame_rate)
n = 0
while n < 20:
for index,power in enumerate(spectrum[n]):
print("%s,%s,%s" % (n,int(round(t[index]*1000)),math.ceil(power*100)/100))
n += 1
get_wav_info("wave.wav",1,20)
Any tips how to obtain dB that's usable in visualization?
Basically, we apparently have all we need from the code above, just how to make it return normal values? Ignore mi and mx as these are just adjusting values in array to fit into mi..mx interval - that would be for visualization usage. If I am correct, spectrum in this code returns array of arrays which contains amplitudes for each frequency from freqs array, which are present on time according to t array, but how does the value work - is it really amplitude if it returns these weird values and if it is, how to convert it to dBs for example.
tl;dr I need output for visualizer like music players have, but it shouldn't work realtime, I want just the data, but values don't fit the wav file.
Edit2: I noticed there's one more issue. For 90 seconds wav, t array contains times till 175.x, which seems very weird considering the frame_rate is correct with the wav file. So now we have 2 problems: spectrum doesn't seem to return correct values (maybe it will fit if we get correct time) and t seems to return exactly double time of the wav.
Fixed: Case completely solved.
import os
import pylab
import math
from numpy import amax
from numpy import amin
from scipy.io import wavfile
frame_rate, snd = wavfile.read(wav_file)
sound_info = snd[:,0]
spectrum, freqs, t, im = pylab.specgram(sound_info,NFFT=1024,Fs=frame_rate,noverlap=5,mode='magnitude')
Specgram needed a little adjustment and I loaded only one channel with scipy.io library (instead of wave library). Also without mode set to magnitude, it returns 10log10 instead of 20log10, which is reason why it didn't return correct values.
I have a set of data. It is obviously have some periodic nature. I want to find out what frequency it has by using the fourier transformation and plot it out.
Here is a shot of mine, but it seems not so good.
This is the corresponding code, I don't konw why it fails:
import numpy
from pylab import *
from scipy.fftpack import fft,fftfreq
import matplotlib.pyplot as plt
dataset = numpy.genfromtxt(fname='data.txt',skip_header=1)
t = dataset[:,0]
signal = dataset[:,1]
npts=len(t)
FFT = abs(fft(signal))
freqs = fftfreq(npts, t[1]-t[0])
subplot(211)
plot(t[:npts], signal[:npts])
subplot(212)
plot(freqs,20*log10(FFT),',')
xlim(-10,10)
show()
My question is:Since the original data is very periodic looking, and I expect to see that in the frequency domain the peak is very sharp; how can I make the peak nicer looking?
It's a problem of data analysis.
FFT works with complex number so the spectrum is symmetric on real data input : restrict on xlim(0,max(freqs)) .
The sampling period is not good : increasing period while keeping the same total number of input points will lead to a best quality spectrum on this exemple.
EDIT.
with :
dataset = numpy.genfromtxt(fname='data.txt',skip_header=1)[::30];
t,signal = dataset.T
(...)
plot(freqs,FFT)
xlim(0,1)
ylim(0,30)
the spectrum is
For best quality spectrum , just reacquire the signal for a long long time (for beautiful peaks), with sampling frequency of 1 Hz, which will give you a [0, 0.5 Hz] frequency scale (See Nyquist criterium).
I am performing FFTs on random binary data. I am confused by what the y-axis scaling factor is. My random data has a repetition rate of 400Hz, or a interval between measurements of 0.0025 seconds. The number of data points is 12489.
The code below works, and gives a mean amplitude of around 50.
My questions:
What does y.size exactly do in this context?
What is the expected amplitude of an FFT performed on 12489 random binary points? (I understand that this question is specifically for here, but if it's understood I'd appreciate the help).
The working code: (If you wish to copy and paste it into Python to look at)
from numpy import *
import pylab as P
import numpy as N
import scipy as S
import array
import scipy.fftpack
from random import *
#Produce random binary data
x = N.linspace(0,12489,12489)
randBinList = lambda n: [randint(0,1) for b in range(1,n+1)]
y = randBinList(12489)
y = asarray(y)
#Perform an FFT
FFT = abs(S.fft(y))
freqs = S.fftpack.fftfreq(y.size,0.0025)
#What does y.size do???
x_range = freqs[(freqs>0)]
y_range = FFT[(freqs>0)]
P.plot(x_range,y_range,'.r')
P.show()
fftfreq generates the frequencies of each bin of the result of the FFT, which is computed from the number of samples you pass in and the sampling rate (doc).