Simulating a Lorentzian function with FFT - Python

I need to take the Fourier transform of a Lorentzian function and look at it on a natural-log scale.
I know the Fourier transform of a Lorentzian should be a decaying exponential, exp(-pi|k|) in my normalization, which seems right.
So on an ln scale it should be a straight line with no oscillation at all.
However, my result oscillates, and I'm completely lost.
Here is my code:
import numpy as np
from scipy import fft
import matplotlib.pyplot as plt

a = 1
N = 500
x = np.linspace(-5, 5, N)
lorentz = (a / np.pi) * (1 / (a**2 + x**2))
fourier = fft.fft(lorentz)

fig, ax1 = plt.subplots(nrows=1, ncols=1)
# base=np.e gives natural-log axes (older Matplotlib spelled this basex/basey)
ax1.loglog(np.abs(fourier[0:N // 2]), base=np.e)
ax1.grid(True)
plt.show()
How could I solve the problem?
Following the comments, I tried:
x = np.linspace(-20, 20, N)
This seems to postpone the oscillation, but it is still there.
After adding a Hamming window: the Hamming window also only postpones it.
I then extended the range to
x = np.linspace(-60, 60, N)
and this looks correct (apparently it depends on a, the wider range, and the point spacing), but I'm curious about what is actually going on.

Pascal's remark helps to explain it. At first glance, I felt the oscillating distortion you show was related to the analysis window boundaries. When your signal is not zero on either side of the window, the FFT analysis sees a step, which produces "butterfly" artifacts that have nothing to do with your input signal. A Hamming window (a raised cosine) can solve that, but if the Hamming window has to do too much smoothing at the edges, you are analysing a signal you don't have!
Nice to see that the tip to enlarge the analysis window worked in this case: by taking a larger range around zero, you get the expected result for a Lorentzian function. My science expertise is too limited to say why this particular spectrum is the correct result.
An attempt to explain why the 2x-Nyquist requirement is relevant: you are using signal-analysis tools in the real domain (not complex input). For FFT analysis of real samples, the analysis window should accommodate two periods of the lowest frequency you are interested in. You are investigating an impulse response, so its "period" shape is the only one you are interested in. By taking a larger interval around 0 into account, you have put your impulse response in the middle, where a Hamming window is close to 1. When the analysis window is wide enough and a Hamming window is applied, the FFT sees proper input (zero on either side!) and yields proper, smooth output, as if you were analysing a very low-frequency periodic signal.
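A minimal sketch of what that looks like in code (my own illustration, not the original poster's script): widen the window well beyond the Lorentzian's width and centre the pulse with ifftshift before transforming; the magnitude spectrum then falls off as an essentially straight line on a log scale (about exp(-2*pi*a*|k|) with this normalization) until it hits the numerical floor. The values of N and L here are illustrative.
import numpy as np
from scipy import fft
import matplotlib.pyplot as plt

a = 1.0
N = 4096                      # illustrative values, not the original N = 500
L = 60.0                      # half-width of the analysis window
x = np.linspace(-L, L, N, endpoint=False)
dx = x[1] - x[0]

lorentz = (a / np.pi) / (a**2 + x**2)

# Put the peak at sample 0 so the FFT sees an even, essentially periodic
# function with (near-)zero values at the window edges.
spectrum = fft.fft(fft.ifftshift(lorentz)) * dx   # dx approximates the continuous transform
k = fft.fftfreq(N, d=dx)

pos = k > 0
plt.semilogy(k[pos], np.abs(spectrum[pos]))       # roughly a straight line
plt.xlabel("k")
plt.ylabel("|F(k)|")
plt.show()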
My experience with FFT tools is in the field of speech research. A speech sample has a pitch, the lowest frequency or "f0" of the speaker; for males this is typically around 100 Hz. With a sampling frequency of 20 kHz, a single pitch period requires 200 samples, so two pitch periods require 400 samples. I prefer setting the FFT order instead of the analysis window: an order-9 FFT uses 512 samples in your window and yields 256 frequencies; order 10 yields 512 result frequencies and requires 1024 samples, and so on. The Hamming window I use in my spectrum tool always spans the full window, never a truncated one.
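In code form, that bookkeeping is just (values taken from the example above):
fs = 20000                       # sampling frequency, Hz
f0 = 100                         # typical male pitch, Hz
samples_per_period = fs // f0    # 200 samples per pitch period
order = 9
window_samples = 2 ** order      # 512 samples, enough for two pitch periods
result_frequencies = window_samples // 2   # 256 analysis frequencies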

Related

What is the conceptual purpose of librosa.amplitude_to_db?

I'm using the librosa library to get and filter spectrograms from audio data.
I mostly understand the math behind generating a spectrogram:
Get the signal
Window the signal
Compute the Fourier transform of each window
Create a matrix whose columns are the transforms
Plot a heat map of this matrix
So that's really easy with librosa:
spec = np.abs(librosa.stft(signal, n_fft=len(window), window=window))
Yay! I've got my matrix of FFTs. Now I see this function librosa.amplitude_to_db and I think this is where my ignorance of signal processing starts to show. Here is a snippet I found on Medium:
spec = np.abs(librosa.stft(y, hop_length=512))
spec = librosa.amplitude_to_db(spec, ref=np.max)
Why does the author use this amplitude_to_db function? Why not just plot the output of the STFT directly?
The range of perceivable sound pressure is very wide, from around 20 μPa (micro Pascal) to 20 Pa, a ratio of 1 million. Furthermore the human perception of sound levels is not linear, but better approximated by a logarithm.
By converting to decibels (dB) the scale becomes logarithmic. This limits the numerical range to something like 0-120 dB instead. The intensity of the colors in the plot then corresponds more closely to what we hear than if a linear scale were used.
Note that the reference (0 dB) point in decibels can be chosen freely. With ref=np.max, as in the snippet above, the maximum value of the input is mapped to 0 dB and all other values become negative. The function also applies a threshold on the dynamic range, by default 80 dB, so anything lower than -80 dB is clipped to -80 dB.
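Roughly, the conversion with ref=np.max amounts to the following sketch (the function name and the amin floor are just for illustration; librosa's real implementation handles the small-value floor in a similar but not identical way):
import numpy as np

def amplitude_to_db_sketch(spec, top_db=80.0, amin=1e-10):
    # amplitude ratio -> decibels, relative to the loudest bin
    magnitude = np.abs(spec)
    ref = magnitude.max()
    # small floor avoids taking the log of zero
    db = 20.0 * np.log10(np.maximum(magnitude, amin) / np.maximum(ref, amin))
    # clip everything more than top_db below the maximum (which sits at 0 dB)
    return np.maximum(db, db.max() - top_db)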

Simulating noise with specific time character

I am trying to generate synthetic data for a time-domain signal. Let's say my signal is a square wave and I have on top of it some random noise. I'll model the noise as Gaussian. If I generate the data as a vector of length N and then add to it random noise sampled from a normal distribution of mean 0 and width 1, I have a rough simulation of the situation I care about. However, this adds noise with characteristic timescale set by the sampling rate. I do not want this, as in reality the noise has a much longer timescale associated with it. What is an efficient way to generate noise with a specific bandwidth?
I've tried generating the noise at each sampled point and then using the FFT to cut out frequencies above a certain value. However, this severely attenuates the signal.
My idea was basically:
noise = normrnd(0, 1, [N, 1]);   % white Gaussian noise, N samples
f = fft(noise);
f(1000:end) = 0;                 % crude low-pass: zero out the high-frequency bins
noise = ifft(f);
This kind of works, but it severely attenuates the signal.
It's pretty common just to generate white noise and filter it. Often an IIR is used since it's cheap and the noise phases are random anyway. It does attenuate the signal, but it costs nothing to amplify it.
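A minimal sketch of that approach with SciPy (the 1 kHz sampling rate and 50 Hz bandwidth are made-up numbers for illustration):
import numpy as np
from scipy import signal

fs = 1000.0                       # sampling rate, Hz (assumed)
bandwidth = 50.0                  # desired noise bandwidth, Hz (assumed)
n = 10000

white = np.random.normal(0.0, 1.0, n)

# cheap low-order IIR low-pass; the noise phases are random anyway
b, a = signal.butter(4, bandwidth / (fs / 2))
slow_noise = signal.lfilter(b, a, white)

# undo the filter's attenuation by rescaling to the original standard deviation
slow_noise *= white.std() / slow_noise.std()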
You can also generate noise directly with an IFFT. In the example you give, every coefficient in the output of fft(noise) is a Gaussian-distributed random variable, so instead of getting those coefficients with an FFT and zeroing out the ones you don't want, you can just set the ones you want and IFFT to get the resulting signal. Remember that the coefficients are complex, but the real and imaginary parts are independently Gaussian-distributed.
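A sketch of the direct IFFT construction, with the same made-up numbers (only bins below 50 Hz get non-zero, complex Gaussian coefficients):
import numpy as np

fs = 1000.0
n = 10000
bandwidth = 50.0

freqs = np.fft.rfftfreq(n, d=1.0 / fs)
coeffs = np.zeros(len(freqs), dtype=complex)

# independent Gaussian real and imaginary parts in the band we want
band = (freqs > 0) & (freqs <= bandwidth)
coeffs[band] = np.random.normal(size=band.sum()) + 1j * np.random.normal(size=band.sum())

slow_noise = np.fft.irfft(coeffs, n=n)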

How to obtain smaller bins after FFT in Python?

I'm using scipy.fft.rfft() to calculate the power spectral density of a signal. The sampling rate is 1000 Hz and the signal contains 2000 points, so the frequency bin width is (1000/2)/(2000/2) = 0.5 Hz. But I need to analyze the signal in [0-0.1] Hz.
I saw several answers recommending the chirp-Z transform, but I couldn't find a Python implementation of it.
So how can I do this small-bin analysis in Python? Or can I just filter the signal to [0-0.1] Hz using, say, a Butterworth filter?
Thanks a lot!
Even if you use another transform, that will not create more data.
If you have a 1 kHz sampling rate and 2 s of samples, then your resolution is 0.5 Hz. You can interpolate with the chirp-Z transform (or just with sinc(), which is the shape of the spectrum between the samples of your frequency comb), but the data you already have at your existing points is what determines everything between 0 Hz and 0.5 Hz.
If you want a real resolution of 0.1 Hz, you need 10 s of data.
You can't get smaller frequency bins to separate out close spectral peaks unless you use more (a longer amount of) data.
You can't just use a narrower filter because the transient response of such a filter will be longer than your data.
You can get smaller frequency bins that are just a smooth interpolation between nearby frequency bins, for instance to plot the spectrum on wider paper or at a higher dpi graphic resolution, by zero-padding the data and using a longer FFT. But that won't create more detail.
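As an illustration of that last point, zero-padding only needs the n argument of the FFT routine; the 0.1 Hz grid below is interpolated display detail, not new information (the signal here is a random stand-in):
import numpy as np
from scipy import fft

fs = 1000.0
x = np.random.randn(2000)                  # stand-in for the 2 s signal

spec = fft.rfft(x)                         # native bins every 0.5 Hz
freqs = fft.rfftfreq(len(x), d=1.0 / fs)

n_pad = 10000                              # pad to 10 s worth of samples
spec_fine = fft.rfft(x, n=n_pad)           # interpolated bins every 0.1 Hz
freqs_fine = fft.rfftfreq(n_pad, d=1.0 / fs)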

How to get the correct peaks and troughs from a 1d-array

We are trying to find peaks and troughs in a 1d-array.
We are using numpy.r_() and it finds every peak and trough in the array, but we only want the peaks and troughs that correspond to the relaxation and contraction points of diaphragmatic motion.
Is there a function that rejects the wrong min and max points?
See a bad example below:
You have high-frequency, small-amplitude oscillations that are undesirable for peak-finding purposes. Filter them out prior to searching for peaks. A simple filter to use is the 1-dimensional Gaussian filter from scipy.ndimage. On the scale of your chart, it seems that
smooth_signal = ndimage.gaussian_filter1d(signal, 5)
should be about right (the window size should be large enough to suppress the unwanted oscillation but small enough not to distort the actual peaks). Then apply your peak-finding algorithm to smooth_signal.
The signal processing module has more sophisticated filters, but those take some time to learn to use.
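A sketch of the two steps together on a synthetic trace (scipy.signal.find_peaks stands in for whatever peak finder you are using, and the smoothing width of 5 samples is the guess from above):
import numpy as np
from scipy import ndimage, signal

# synthetic stand-in for the diaphragm trace: slow motion plus fast ripple
t = np.linspace(0, 10, 2000)
trace = np.sin(2 * np.pi * 0.3 * t) + 0.1 * np.sin(2 * np.pi * 15 * t)

# suppress the fast ripple before looking for extrema
smooth_signal = ndimage.gaussian_filter1d(trace, 5)

peaks, _ = signal.find_peaks(smooth_signal)      # local maxima of the smoothed trace
troughs, _ = signal.find_peaks(-smooth_signal)   # local minima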

Followup: extracting phase information from FFT - proper use of frequency shifts and windows

This is a follow-up question to one I asked earlier, based on the chat after the answer given by #hotpaw2. I have a signal which will be a single cosine wave with a phase offset. My task is to extract (with very high accuracy) the amplitude and phase of this single frequency component.
On paper, assuming a properly normalized Fourier transform T, the amplitude and phase should be directly readable off the transform's value at the signal frequency.
Unsurprisingly, there is a bit more to the DFT than simply taking the transform and picking off the relevant components. In particular, the discussion suggested to me that I was not entirely clear on what the phase offset is being measured with respect to, and that there are significant edge effects that can destroy the accuracy of the result if the data is not properly windowed.
I have been googling around, but most of the discussion is fairly technical and light on examples, so I was hoping that someone could shed some light on things. In particular, I came across one example suggesting that instead of doing a simple transform, I should be shifting it first:
import numpy as np
import pylab as pl

f = 30.0
w = 2.0 * np.pi * f
phase = np.pi / 2
num_t = int(10 * f)                     # number of samples (linspace needs an int)
t, dt = np.linspace(0, 1, num_t, endpoint=False, retstep=True)

signal = np.cos(w * t + phase)          # + np.random.normal(0, 0.25, len(t))

# shift / transform / shift back, as suggested in the example I found
amp = np.fft.fftshift(np.fft.rfft(np.fft.ifftshift(signal)))
freqs = np.fft.fftshift(np.fft.rfftfreq(t.shape[-1], dt))

index = np.where(np.isclose(freqs, f))  # float-safe lookup of the 30 Hz bin
print(index[0][0])
print(np.angle(amp)[index[0][0]])
print(np.abs(amp)[index[0][0]] * (2.0 / len(t)))

pl.subplot(211)
pl.semilogy(freqs, np.abs(amp))
pl.subplot(212)
pl.plot(freqs, np.angle(amp))
pl.show()
So, first set of questions: can someone explain the point of using fftshift, and what exactly it is doing to the data? Why does using an inverse shift, transforming, and then shifting require a frequency set which is only shifted once, with no inverse operation? Is this approach the correct one (ignoring for the moment the issue of a window)?
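My current understanding goes only as far as the array level: fftshift reorders an array so the zero-frequency element ends up in the middle, and ifftshift is its exact inverse, e.g.:
import numpy as np

x = np.arange(8)
print(np.fft.fftshift(x))                     # [4 5 6 7 0 1 2 3]
print(np.fft.ifftshift(np.fft.fftshift(x)))   # back to [0 1 2 3 4 5 6 7]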
Second set of questions: if I window my data, presumably this will affect the amplitude and likely the phase(?) of the result. Is there an analytical way to correct for the amplitude change for a given window shape? I can find a few tables that list correction factors, but I haven't really seen a good explanation yet.
In the linked question, it was stated that phase should be measured near the center of the hump of the window function. But because the window function is a time-domain function and I want the phase for a specific frequency, I don't quite understand what that means.
Any light that could be shed on the matter (perhaps in the form of references, since clearly I need to do some more reading) would be most appreciated.
