How to obtain small bins after FFT in Python?

I'm using scipy.fft.rfft() to calculate the power spectral density of a signal. The sampling rate is 1000 Hz and the signal contains 2000 points, so the frequency bin width is (1000/2)/(2000/2) = 0.5 Hz. But I need to analyze the signal in the [0, 0.1] Hz band.
I saw several answers recommending the chirp-Z transform, but I couldn't find a Python implementation of it.
So how can I do this small-bin analysis in Python? Or can I just filter the signal down to [0, 0.1] Hz with something like a Butterworth filter?
Thanks a lot!

Even if you use another transform, that will not create more data.
If you sample at 1 kHz for 2 s, your frequency resolution is 0.5 Hz. You can interpolate the spectrum with the chirp-Z transform (or just with sinc(), which is the shape of your spectrum between the samples of your frequency comb), but the data you already have determines what lies between those samples (between 0 Hz and 0.5 Hz).
If you want a real resolution of 0.1 Hz, you need 10 s of data.
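For the interpolation itself, note that recent SciPy versions (1.8+) ship a chirp-Z-based helper, scipy.signal.zoom_fft, so no extra toolbox is needed. A minimal sketch, where the test tone and number of evaluation points are assumptions:

    import numpy as np
    from scipy.signal import zoom_fft  # chirp-Z based, SciPy 1.8+

    fs = 1000.0                        # sampling rate, Hz
    n = 2000                           # 2 s of data -> native 0.5 Hz bins
    t = np.arange(n) / fs
    x = np.sin(2 * np.pi * 0.3 * t)    # hypothetical 0.3 Hz test tone

    # Evaluate the spectrum on 200 points between 0 and 0.5 Hz. This only
    # interpolates the same 2 s of data; it is not extra resolution.
    f1, f2, m = 0.0, 0.5, 200
    X = zoom_fft(x, [f1, f2], m=m, fs=fs)
    freqs = f1 + np.arange(m) * (f2 - f1) / m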

You can't get smaller frequency bins that separate out close spectral peaks unless you use more data (a longer capture).
You can't just use a narrower filter, because the transient response of such a filter would be longer than your data.
You can get smaller frequency bins that are just a smooth interpolation between nearby frequency bins, for instance to plot the spectrum on wider paper or at a higher dpi, by zero-padding the data and using a longer FFT. But that won't create more detail.
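A minimal sketch of that zero-padding interpolation, with the signal and padding factor chosen as assumptions:

    import numpy as np

    fs = 1000.0                      # sampling rate, Hz
    n = 2000                         # 2 s of data -> native 0.5 Hz bins
    t = np.arange(n) / fs
    x = np.sin(2 * np.pi * 0.3 * t)  # hypothetical test tone near DC

    # Zero-pad to 10x the length: bins are now 0.05 Hz apart, but they are
    # only a smooth interpolation of the same underlying spectrum.
    n_pad = 10 * n
    spectrum = np.fft.rfft(x, n=n_pad)
    freqs = np.fft.rfftfreq(n_pad, d=1 / fs)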

Related

What is the conceptual purpose of librosa.amplitude_to_db?

I'm using the librosa library to get and filter spectrograms from audio data.
I mostly understand the math behind generating a spectrogram:
1. Get the signal
2. Window the signal
3. Compute the Fourier transform of each window
4. Build a matrix whose columns are the transforms
5. Plot a heat map of that matrix
So that's really easy with librosa:
    spec = np.abs(librosa.stft(signal, n_fft=len(window), window=window))
Yay! I've got my matrix of FFTs. Now I see this function librosa.amplitude_to_db and I think this is where my ignorance of signal processing starts to show. Here is a snippet I found on Medium:
    import numpy as np
    import librosa

    spec = np.abs(librosa.stft(y, hop_length=512))
    spec = librosa.amplitude_to_db(spec, ref=np.max)
Why does the author use this amplitude_to_db function? Why not just plot the output of the STFT directly?
The range of perceivable sound pressure is very wide, from around 20 μPa (micropascals) to 20 Pa, a ratio of one million. Furthermore, human perception of sound levels is not linear but is better approximated by a logarithm.
Converting to decibels (dB) makes the scale logarithmic, which limits the numerical range to something like 0-120 dB. When this is plotted, the intensity of the colors corresponds more closely to what we hear than a linear scale would.
Note that the reference (0 dB) point can be chosen freely via the ref argument. With ref=np.max, as in the snippet above, the maximum value of the input is mapped to 0 dB and all other values become negative. The function also limits the dynamic range with its top_db parameter, 80 dB by default, so anything below -80 dB relative to the reference is clipped to -80 dB.
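As a rough illustration of what the conversion does (a sketch of the math on synthetic data, not librosa's exact implementation, which also applies an amin floor):

    import numpy as np

    # Stand-in for an STFT magnitude matrix.
    amp = np.abs(np.random.default_rng(0).standard_normal((1025, 100)))

    # Roughly what librosa.amplitude_to_db(amp, ref=np.max) computes:
    db = 20.0 * np.log10(np.maximum(amp, 1e-5) / amp.max())  # peak -> 0 dB
    db = np.maximum(db, db.max() - 80.0)   # clip the range to top_db = 80 dB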

Simulating noise with specific time character

I am trying to generate synthetic data for a time-domain signal. Let's say my signal is a square wave and I have some random noise on top of it. I'll model the noise as Gaussian. If I generate the data as a vector of length N and then add random noise sampled from a normal distribution with mean 0 and standard deviation 1, I have a rough simulation of the situation I care about. However, this adds noise whose characteristic timescale is set by the sampling rate. I don't want this, because in reality the noise has a much longer timescale associated with it. What is an efficient way to generate noise with a specific bandwidth?
I've tried generating the noise at each sampled point and then using the FFT to cut out frequencies above a certain value. However, this severely attenuates the signal.
My idea was basically:
    N = 10000;                       % number of samples
    noise = normrnd(0, 1, [N, 1]);   % white Gaussian noise, one value per sample
    f = fft(noise);
    f(1000:end) = 0;                 % crude low-pass: zero bins above the cutoff
    noise = real(ifft(f));           % back to the time domain

This kind of works but severely attenuates the signal.
It's pretty common to just generate white noise and filter it. Often an IIR filter is used, since it's cheap and the noise phases are random anyway. The filter does attenuate the noise, but it costs nothing to amplify the result.
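A minimal sketch of that approach in Python, where the sampling rate, cutoff, and filter order are assumptions:

    import numpy as np
    from scipy.signal import butter, lfilter

    fs = 1000.0                           # sampling rate, Hz
    cutoff = 10.0                         # desired noise bandwidth, Hz
    rng = np.random.default_rng(0)
    white = rng.standard_normal(10_000)

    b, a = butter(4, cutoff / (fs / 2))   # 4th-order low-pass IIR
    colored = lfilter(b, a, white)

    # The filter removes power, so rescale to the variance you want.
    colored *= white.std() / colored.std()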
You can also generate the noise directly with an IFFT. In the example you give, every coefficient in the output of fft(noise) is a Gaussian-distributed random variable, so instead of getting those coefficients from an FFT and zeroing out the ones you don't want, you can just set the ones you want and IFFT to get the resulting signal. Remember that the coefficients are complex, but the real and imaginary parts are independently Gaussian-distributed.
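A sketch of that frequency-domain synthesis, with the length, sampling rate, and band edge chosen as assumptions:

    import numpy as np

    fs = 1000.0
    n = 10_000
    freqs = np.fft.rfftfreq(n, d=1 / fs)
    rng = np.random.default_rng(1)

    # Draw complex Gaussian coefficients only inside the band you want.
    spectrum = np.zeros(len(freqs), dtype=complex)
    band = (freqs > 0) & (freqs <= 10.0)
    m = band.sum()
    spectrum[band] = rng.standard_normal(m) + 1j * rng.standard_normal(m)

    noise = np.fft.irfft(spectrum, n=n)   # real-valued band-limited noise
    noise /= noise.std()                  # normalize to unit variance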

How to get the correct peaks and troughs from a 1-D array

We are trying to find peaks and troughs in a 1-D array.
We are using numpy.r_ and it finds every peak and trough in the array, but we want only the peaks and troughs that correspond to the relaxation and contraction points of diaphragmatic motion.
Is there any function that rejects the wrong min and max points?
See a bad example below:
You have high-frequency, small-amplitude oscillations that are undesirable for peak-finding purposes. Filter them out before searching for peaks. A simple choice is the 1-D Gaussian filter from scipy.ndimage. On the scale of your chart, it seems that

    from scipy import ndimage
    smooth_signal = ndimage.gaussian_filter1d(signal, 5)

should be about right (the smoothing width sigma should be large enough to suppress the unwanted oscillation but small enough not to distort the actual peaks). Then apply your peak-finding algorithm to smooth_signal.
The signal processing module has more sophisticated filters, but those take some time to learn to use.
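If you'd rather not hand-roll the peak search, here is a sketch combining that smoothing with scipy.signal.find_peaks; the synthetic trace and the sigma/distance values are assumptions to tune against real diaphragm data:

    import numpy as np
    from scipy import ndimage
    from scipy.signal import find_peaks

    # Hypothetical noisy trace standing in for the diaphragm motion.
    t = np.linspace(0, 10, 1000)
    rng = np.random.default_rng(0)
    trace = np.sin(2 * np.pi * 0.5 * t) + 0.2 * rng.standard_normal(t.size)

    smooth = ndimage.gaussian_filter1d(trace, 5)

    # distance enforces a minimum spacing so ripples aren't counted twice.
    peaks, _ = find_peaks(smooth, distance=100)     # relaxation points
    troughs, _ = find_peaks(-smooth, distance=100)  # contraction points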

Measure how much activity is occurring within certain frequency values in Python

I'm trying to find a way to obtain frequency values (in Hz) of an audio file, and measure how often these frequency values occur in proportion to the rest of the frequency values in that file.
For example, in an audio file, I'd like to see what proportion of the audio activity occurs within the 300 - 500 Hz range.
This would be simple if I could somehow get a list or an array filled with all frequency values of a given audio file, but I don't know how to do that.
Thank you.
You may be looking for a Fourier Transform or Fast Fourier Transform.
From Wikipedia: [figure showing a time-domain signal on the left and its frequency-domain representation on the right]
On the left would be your normal signal, and on the right is your frequency-domain signal. Of course, you can then cut out the 300-500 Hz range, take the integral over that area, and divide by the total area to get the proportion...
Not really my specialty, but consider a scipy solution?
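Following that idea, a sketch that estimates the proportion of spectral power in the 300-500 Hz band with scipy.signal.welch; the file name and nperseg value are assumptions:

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import welch

    fs, x = wavfile.read("audio.wav")     # placeholder file name
    if x.ndim > 1:
        x = x.mean(axis=1)                # mix stereo down to mono

    freqs, psd = welch(x, fs=fs, nperseg=4096)

    # Integrate the PSD over the band and over everything, then compare.
    band = (freqs >= 300) & (freqs <= 500)
    proportion = np.trapz(psd[band], freqs[band]) / np.trapz(psd, freqs)
    print(f"{proportion:.1%} of the power lies in 300-500 Hz")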

Signal processing-remove noise for a series of spectra

I am a chemist and measured a series of spectra of one compound under increasing temperature (-200 °C to 0 °C). The shape of the spectra is very similar at different temperatures; the only difference is the intensity: at higher temperature the intensity is lower.
My problem is that at high temperature, e.g. 0 °C, the real signal's intensity is quite close to the background noise's amplitude, which makes the spectra at high temperature very noisy. I tried some simple smoothing methods but the results were not good.
The noise is much less affected by the temperature change than the real signal (which means we can assume the background noise doesn't change much). Thus, I wonder whether there is any method that can remove the background noise using the series of spectra I have, since they share a common background.
Any information (e.g. names of methods, tools in Python or R, references) would be helpful. Thanks for your help!
