Simulating noise with specific time character - python

I am trying to generate synthetic data for a time-domain signal. Let's say my signal is a square wave and I have on top of it some random noise. I'll model the noise as Gaussian. If I generate the data as a vector of length N and then add to it random noise sampled from a normal distribution of mean 0 and width 1, I have a rough simulation of the situation I care about. However, this adds noise with characteristic timescale set by the sampling rate. I do not want this, as in reality the noise has a much longer timescale associated with it. What is an efficient way to generate noise with a specific bandwidth?
I've tried generating the noise at each sampled point and then using the FFT to cut out frequencies above a certain value. However, this severely attenuates the signal.
My idea was basically:
noise = normrnd(0,1);
f = fft(noise);
f(1000:end) = 0;
noise = ifft(f);
This kind of works but severely attenuates the signal.

It's pretty common just to generate white noise and filter it. Often an IIR is used since it's cheap and the noise phases are random anyway. It does attenuate the signal, but it costs nothing to amplify it.
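As a rough sketch of the filter-and-rescale approach in Python: the sampling rate, cutoff, and filter order below are placeholder assumptions, and a Butterworth low-pass is just one possible IIR choice.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
n_samples = 100_000
fs = 1000.0      # assumed sampling rate in Hz
cutoff = 10.0    # assumed noise bandwidth in Hz

# White Gaussian noise: mean 0, standard deviation 1.
white = rng.normal(0.0, 1.0, n_samples)

# 4th-order Butterworth low-pass (an IIR filter) in second-order sections.
sos = signal.butter(4, cutoff, btype="low", fs=fs, output="sos")
colored = signal.sosfilt(sos, white)

# Filtering removes most of the power, so scale back to the original level.
colored *= white.std() / colored.std()
```

The rescaled `colored` array can then be added to the square wave in place of the white noise.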
You can also generate the noise directly with an IFFT. In the example you give, every coefficient in the output of fft(noise) is a Gaussian-distributed random variable, so instead of getting those coefficients with an FFT and zeroing out the ones you don't want, you can just set the ones you want and take the IFFT to get the resulting signal. Remember that the coefficients are complex, with independently Gaussian-distributed real and imaginary parts, and that for a real-valued output the negative-frequency coefficients must be the complex conjugates of the positive-frequency ones.
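A minimal sketch of that idea with NumPy, assuming a real-valued output and made-up values for the length, sampling rate, and cutoff. Using irfft means only the non-negative frequencies have to be drawn; the conjugate-symmetric half is implied.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 4096
fs = 1000.0      # assumed sampling rate in Hz
cutoff = 10.0    # assumed noise bandwidth in Hz

# Frequencies of the one-sided (real-input) spectrum.
freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)

# Draw complex Gaussian coefficients only inside the wanted band (DC left at zero).
keep = (freqs > 0) & (freqs <= cutoff)
coeffs = np.zeros(freqs.size, dtype=complex)
coeffs[keep] = rng.normal(size=keep.sum()) + 1j * rng.normal(size=keep.sum())

# Inverse real FFT gives a real time series; normalise to unit standard deviation.
noise = np.fft.irfft(coeffs, n=n_samples)
noise /= noise.std()
```

Because the amplitude is set after the transform, nothing is lost to attenuation.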

Related

Simulation Lorentzian function with FFT

I need to take the Fourier transform of a Lorentzian function and plot it on a natural-log (ln) scale.
I know that the Lorentzian function after an FFT is exp(-pi|k|), which seems right.
On an ln scale the result is supposed to be linear, with no oscillation at all.
However, there is oscillation, and I am completely lost.
Here is my code:
import numpy as np
from scipy import fft
import matplotlib.pyplot as plt
a =1
N = 500
x =np.linspace(-5,5,N)
lorentz = (a/np.pi) * (1/(a**2 + x**2))
fourier = (fft.fft(lorentz))
fig, (ax1) = plt.subplots(nrows=1, ncols=1)
ax1.loglog(abs(fourier[0:int(N/2)]),basey=np.e)
ax1.grid(True)
plt.show()
How could I solve the problem?
Following a comment, I tried a wider range:
x =np.linspace(-20,20,N)
This seems to postpone the oscillation, but it is still there.
Adding a Hamming window also postpones it.
I then extended the range to
x =np.linspace(-60,60,N)
This looks correct (it seems to depend on a, the width of the range, and the point spacing). But I'm curious about what happened.
Pascal's remark helps to explain. At first glance, I felt the oscillating distortion you show was related to the analysis window boundaries. When your signal is not zero at either edge of the window, the FFT analysis will find a step, which results in "butterfly" problems that have nothing to do with your input signal. A Hamming window (a raised cosine) can solve that, but if the Hamming window has to do too much smoothing at the edges, you end up analysing a signal you don't have!
Nice to see that the tip to enlarge the analysis window worked in this case: once you took a larger range around zero, you got the expected result for a Lorentzian function. My science expertise is too limited to actually understand why this particular spectrum is the correct result.
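To put numbers on the edge effect, here is a small sketch that evaluates the same Lorentzian at the window edge for the three ranges tried in the question and compares it with the peak value at x = 0. The helper name lorentz and the dictionary edge_ratio are only illustrative.

```python
import numpy as np

a = 1.0
N = 500

def lorentz(x, a=1.0):
    """Lorentzian with the same normalisation as in the question."""
    return (a / np.pi) / (a**2 + x**2)

# Ratio of the value at the window edge to the peak value at x = 0.
edge_ratio = {}
for half_width in (5, 20, 60):
    x = np.linspace(-half_width, half_width, N)
    edge_ratio[half_width] = lorentz(x[0], a) / lorentz(0.0, a)

for half_width, ratio in edge_ratio.items():
    print(f"range +/-{half_width}: edge/peak = {ratio:.5f}")
```

The value left at the edge drops from about 4% of the peak for the ±5 range to about 0.03% for ±60, so the FFT sees far less of an abrupt cut-off at the boundaries.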
An attempt to explain why the 2xNyquist requirement is relevant: you are using signal analysis tools in the real domain (not complex input). For FFT analysis of real samples, the analysis window should accommodate two periods of the lowest frequency you are interested in. You are investigating an impulse response, so its "period" shape is the only one you are interested in. By taking a larger interval around 0 into account, you have put your impulse response in the middle, where the Hamming window is close to 1. When your analysis window is wide enough and a Hamming window is applied, the FFT sees proper input (zero on either side!) and yields proper, smooth output, as if you were analysing a very low-frequency periodic signal.
My experience with FFT tools is in the field of speech research. A speech sample has a pitch: the lowest frequency, or "f0", of the speaker. For male speakers, the pitch is typically around 100 Hz. With a sampling frequency of 20 kHz, a single pitch period requires 200 samples, and two pitch periods require 400 samples. I prefer setting the FFT order instead of the analysis window length. An order 9 FFT uses 512 samples in the window and yields 256 frequencies. Order 10 gives 512 result frequencies and requires 1024 samples, and so on. The Hamming window I use in my spectrum tool is always the full window, not a truncated one.
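The arithmetic above can be written out as a short sketch. The variable names are just for illustration, and the rule used is "the smallest power-of-two window that holds two pitch periods".

```python
fs = 20_000   # sampling frequency in Hz
f0 = 100      # pitch of the speaker in Hz

samples_per_period = fs // f0            # samples in one pitch period
needed = 2 * samples_per_period          # two pitch periods

# Smallest FFT order whose window (2**order samples) holds `needed` samples.
order = (needed - 1).bit_length()
window_len = 2**order
n_freqs = window_len // 2                # result frequencies for real input

print(samples_per_period, needed, order, window_len, n_freqs)
```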

What is the conceptual purpose of librosa.amplitude_to_db?

I'm using the librosa library to get and filter spectrograms from audio data.
I mostly understand the math behind generating a spectrogram:
Get signal
window signal
for each window compute Fourier transform
Create matrix whose columns are the transforms
Plot heat map of this matrix
So that's really easy with librosa:
spec = np.abs(librosa.stft(signal, n_fft=len(window), window=window))
Yay! I've got my matrix of FFTs. Now I see this function librosa.amplitude_to_db and I think this is where my ignorance of signal processing starts to show. Here is a snippet I found on Medium:
spec = np.abs(librosa.stft(y, hop_length=512))
spec = librosa.amplitude_to_db(spec, ref=np.max)
Why does the author use this amplitude_to_db function? Why not just plot the output of the STFT directly?
The range of perceivable sound pressure is very wide, from around 20 μPa (micro Pascal) to 20 Pa, a ratio of 1 million. Furthermore the human perception of sound levels is not linear, but better approximated by a logarithm.
By converting to decibels (dB) the scale becomes logarithmic. This limits the numerical range, to something like 0-120 dB instead. The intensity of colors when this is plotted corresponds more closely to what we hear than if one used a linear scale.
Note that the reference (0 dB) point in decibels can be chosen freely. The default for librosa.amplitude_to_db is ref=1.0; passing ref=np.max, as in the snippet, means the max value of the input will be mapped to 0 dB. All other values will then be negative. The function also applies a threshold on the range of sounds (top_db), by default 80 dB. So anything more than 80 dB below the peak will be clipped to -80 dB.
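The conversion itself is short. The following NumPy sketch mimics what the call with ref=np.max does. It is an illustration, not librosa's actual code, and the function name and the amin floor are assumptions made here.

```python
import numpy as np

def amplitude_to_db_sketch(S, top_db=80.0, amin=1e-5):
    """Illustrative amplitude -> dB conversion with the maximum as 0 dB."""
    S = np.abs(S)
    ref = S.max()
    # 20*log10 of the amplitude ratio; amin avoids taking log10(0).
    db = 20.0 * np.log10(np.maximum(S, amin)) - 20.0 * np.log10(max(ref, amin))
    # Clip everything more than top_db below the peak.
    return np.maximum(db, db.max() - top_db)

spec = np.array([[1.0, 0.1], [1e-3, 1e-6]])
spec_db = amplitude_to_db_sketch(spec)
print(spec_db)
```

Amplitudes spanning six orders of magnitude end up between -80 and 0, which is a range a colour map can display usefully.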

Is there a way to invert a spectrogram back to signal

In my algorithm I created a spectrogram and manipulated the data:
import numpy as np
import scipy.io as spio
import scipy.signal as signal
# mat_file and i are defined elsewhere
data = spio.loadmat(mat_file, squeeze_me=True)['records'][:, i]
data = data - np.mean(data)
data = data / np.max(np.abs(data))
freq, time, Sxx = signal.spectrogram(data, fs=250000, window=signal.get_window("hamming", 480), nperseg=None, noverlap=360, nfft=480)
# ...
# manipulation on Sxx
# ...
Is there any way to convert freq, time, and Sxx back to a signal?
No, this is not possible. To calculate a spectrogram, you divide your input time-domain signal into (often half-overlapping) chunks of data, each of which is multiplied by an appropriate window function, after which you do an FFT, which gives you a complex vector that indicates the amplitude and phase for every frequency bin. Each column of the spectrogram is finally formed by taking the absolute square of one FFT (and normally you throw away the negative frequencies, since a PSD is symmetric for a real input signal). By taking the absolute square, you lose all phase information. This makes it impossible to accurately reconstruct the original time-domain signal.
Since your ear doesn't care about phase information (your brain senses something similar to a spectrogram), it might however be possible to reconstruct a signal that sounds approximately the same. This could basically be done by doing all the described steps in reverse, while choosing a random phase for each FFT.
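A hedged sketch of both cases, using scipy.signal.stft and istft on a synthetic test tone rather than the data from the question. The tone, the Hann window, and the segment length are all assumptions chosen for illustration. With the complex STFT the inverse is accurate; with the magnitude only, a random phase gives a different waveform.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
fs = 8000.0
t = np.arange(2048) / fs
x = np.sin(2 * np.pi * 440.0 * t)   # synthetic test tone

# Complex STFT: magnitude and phase are both available.
_, _, Zxx = signal.stft(x, fs=fs, window="hann", nperseg=256)

# With the phase kept, the inverse STFT recovers the signal.
_, x_exact = signal.istft(Zxx, fs=fs, window="hann", nperseg=256)

# With only the magnitude (all a spectrogram keeps), substitute a random phase.
random_phase = np.exp(2j * np.pi * rng.random(Zxx.shape))
_, x_random = signal.istft(np.abs(Zxx) * random_phase, fs=fs,
                           window="hann", nperseg=256)

print(np.max(np.abs(x_exact[:x.size] - x)))    # tiny reconstruction error
print(np.max(np.abs(x_random[:x.size] - x)))   # large: waveform differs
```

Note that signal.spectrogram returns a power quantity, so the magnitude would be its square root up to the scaling that was applied.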
One thing to watch for in code like this: if you create a variable named signal, it 'shadows' the scipy.signal module, which you import under the same name.

How to obtain small bins after FFT in python?

I'm using scipy.signal.fft.rfft() to calculate the power spectral density of a signal. The sampling rate is 1000 Hz and the signal contains 2000 points, so the frequency bin width is (1000/2)/(2000/2) = 0.5 Hz. But I need to analyze the signal in the range [0, 0.1] Hz.
I saw several answers recommending the chirp-Z transform, but I couldn't find a Python implementation of it.
So how can I do this small-bin analysis in Python? Or can I just filter the signal to [0, 0.1] Hz using something like a Butterworth filter?
Thanks a lot!
Even if you use another transform, that will not make more data.
If you sample at 1 kHz and have 2 s of samples, then your resolution is 0.5 Hz. You can interpolate between bins with a chirp-Z transform (or just use sinc(), which is the shape of your data between the samples of your comb), but the values you already have at the existing bins determine what you get in the lobes between them (for example, between 0 Hz and 0.5 Hz).
If you want a real precision of 0.1Hz, you need 10s of data.
You can't get smaller frequency bins to separate out close spectral peaks unless you use more (a longer amount of) data.
You can't just use a narrower filter because the transient response of such a filter will be longer than your data.
You can get smaller frequency bins that are just a smooth interpolation between nearby frequency bins, for instance to plot the spectrum on wider paper or at a higher dpi graphic resolution, by zero-padding the data and using a longer FFT. But that won't create more detail.
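A small NumPy sketch of the zero-padding point, using random data as a stand-in for the real signal. The padding factor of 10 is an arbitrary choice.

```python
import numpy as np

fs = 1000.0   # sampling rate in Hz
n = 2000      # number of samples (2 s of data)
rng = np.random.default_rng(0)
x = rng.normal(size=n)   # stand-in for the measured signal

# Plain FFT: bin spacing is fs / n.
X = np.fft.rfft(x)
bin_width = fs / n

# Zero-padded FFT: rfft pads x with zeros up to the requested length.
pad_factor = 10
X_padded = np.fft.rfft(x, n=pad_factor * n)
padded_bin_width = fs / (pad_factor * n)

# Every 10th padded bin is exactly an original bin; the bins in between
# are interpolated from the same 2 s of data.
print(bin_width, padded_bin_width, np.allclose(X_padded[::pad_factor], X))
```

The finer grid comes entirely from the original 2000 samples, which is why it adds no real resolution.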

Signal processing-remove noise for a series of spectra

I am a chemist and have measured a series of spectra of one compound at increasing temperature (from -200 degrees to 0 degrees). The shape of the spectra is very similar at different temperatures. The only difference is the intensity: at higher temperature the intensity is lower.
My problem is that at high temperature, e.g. 0 degrees, the real signal's intensity is quite close to the amplitude of the background noise, which makes the spectra at high temperature very noisy. I tried some simple smoothing methods, but the results were not good.
The noise is much less affected by the temperature change than the real signal (which means we can assume the background noise doesn't change much). So I wonder whether there is any method that can remove the noise (background) using the series of spectra I have, since they share the "common" background noise.
Any information (e.g. name of method, tools in python or R, reference) will be helpful. Thanks for your help!
