I'm trying to create a vibrato by oscillating between 430Hz and 450Hz, storing the 16-bit samples in the list wav. However, the range of the audible frequency oscillation seems to widen across the clip. Does anyone know why?
edit: rewrote code to be more clear/concise
# vibrato.py
import math

maxamp = 2**15 - 1  # max signed short
wav = []
(t, dt) = (0, 1 / 44100)
while t < 6.0:
    f = 440 + 10 * math.sin(2 * math.pi * 6 * t)  # 6Hz vibrato around 440Hz
    samp = maxamp * math.sin(2 * math.pi * f * t)
    wav.append(samp)
    t += dt
--
Update: because the answer below uses numpy, here is the fix applied to my plain Python 3 code:
# vibrato.py
import math

maxamp = 2**15 - 1  # max signed short
wav = []
(t, dt) = (0, 1 / 44100)
phase = 0
while t < 6.0:
    f = 440 + 10 * math.sin(2 * math.pi * 6 * t)
    phase += 2 * math.pi * f * dt  # advance the accumulated phase by f cycles per dt
    samp = maxamp * math.sin(phase)
    wav.append(samp)
    t += dt
The issue has to do with an implied phase change that goes along with changing the frequency. In short, when you calculate each sample relative to a point on an absolute timeline, the phase of the oscillation is different for each frequency at each time (except at the starting point, where they are all the same). Jumping between frequencies is therefore also jumping between phases. For the case of moving between two distinct frequencies, this can be corrected post hoc by adjusting the overall signal phase based on the frequency change. I've explained this in another answer, so I won't repeat it here; instead I'll show the initial plot that highlights the problem, and then how to fix it. The main thing to add here is the importance of a good diagnostic plot, and the right plot for this case is a spectrogram.
Here's an example:
import math
import numpy as np
import matplotlib.pyplot as plt

dt = 1./44100
time = np.arange(0., 6., dt)
frequency = 440. - 10*np.sin(2*math.pi*time*1.)  # a 1Hz oscillation
waveform = np.sin(2*math.pi*time*frequency)
Pxx, freqs, bins, im = plt.specgram(waveform, NFFT=4*1024, Fs=44100,
                                    noverlap=90, cmap=plt.cm.gist_heat)
plt.show()
Note that the span of the frequency oscillation is increasing (as you initially heard). Applying the correction described above gives:
dt = 1./44100
time = np.arange(0., 6., dt)
frequency = 440. - 10*np.sin(2*math.pi*time*1.) # a 1Hz oscillation
phase_correction = np.add.accumulate(time*np.concatenate((np.zeros(1), 2*np.pi*(frequency[:-1]-frequency[1:]))))
waveform = np.sin(2*math.pi*time*frequency + phase_correction)
Which is much closer to what was intended, I hope.
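The same diagnostic plot can be reused to check the correction; a sketch with the same specgram parameters as above:

# Spectrogram of the corrected waveform: the band of the frequency
# oscillation should now hold steady around 440 ± 10Hz instead of widening
Pxx, freqs, bins, im = plt.specgram(waveform, NFFT=4*1024, Fs=44100,
                                    noverlap=90, cmap=plt.cm.gist_heat)
plt.show()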
Another way to conceptualize this, which might make more sense in the context of looping through each time step (as the OP does), and which is closer to the physical model, is to keep track of the phase at each step: each new sample is determined by advancing the phase of the previous step by the current frequency. I don't have the patience to let this run in pure Python, but in numpy the solution looks like this, and gives a similar result:
dt = 1./44100
time = np.arange(0., 6., dt)
f = 440. - 10*np.sin(2*math.pi*time*1.) # a 1Hz oscillation
delta_phase = 2 * math.pi * f * dt
phase = np.cumsum(delta_phase) # add up the phase differences along timeline (same as np.add.accumulate)
wav = np.sin(phase)
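To listen to the result, the unit-amplitude samples still have to be scaled to 16-bit integers and written out; a minimal sketch, assuming scipy is available and a hypothetical output file vibrato.wav:

from scipy.io import wavfile

maxamp = 2**15 - 1  # max signed short, as in the question
wavfile.write("vibrato.wav", 44100, (maxamp * wav).astype(np.int16))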
Related
The Situation
I am currently writing a program that will later be used to analyze a signal that is somewhat of an asymmetric Gaussian. I am interested in how many frequencies I need to reproduce the signal reasonably exactly, and especially in the amplitudes of those frequencies.
Before I input the real data, I'm testing the program with a default (asymmetric) Gaussian, as can be seen in the code below.
My Problem
To ensure that I get the amplitudes right, I am rebuilding the original signal from the whole frequency spectrum, but there are some difficulties. I can reproduce the signal fairly well by multiplying amp by 0.16, a factor I found by looking at the ratio rebuild/original. Of course, this is really unsatisfying and can't be the correct solution.
To be precise, the difference is not dependent on the time length and seems to be a Gaussian too, following the form of the original and increasing in asymmetry according to the skewnorm function itself. The amplitude of the difference function correlates linearly with height.
My Question
I am writing this post because I am out of ideas for getting the amplitude right. Maybe someone has had the same or a similar problem and can share their solution, or give a hint.
Further information
Before focusing on an (asymmetric) Gaussian, I analyzed periodic signals and rectangular pulses, which sadly were very unstable to variations in the time length of the input signal. In that context I experimented with window functions, which seemed to speed up the process and increase the stability; the reason being that I had to integrate over the peaks. Working with the Gaussian, I was told to take each peak received via the bare fft and ditch the integration approach, hence my uncertainty about the amplitudes described above. Maybe someone has an opinion on this approach and, if necessary, can suggest an improvement.
Code
from numpy.fft import fft, fftfreq
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import skewnorm

np.random.seed(1234)

def data():
    height = 1
    data = height * skewnorm.pdf(t, a=0, loc=t[int(N/2)])
    # noise_power = 1E-6
    # noise = np.random.normal(scale=np.sqrt(noise_power), size=t.shape)
    # data += noise
    return data

def fft_own(data):
    freq = fftfreq(N, dt)
    data_fft = fft(data) * np.pi
    amp = 2/N * np.abs(data_fft)  # * factor (depending on t1)
    # amp = 2/T * np.abs(data_fft)**2
    phase = np.angle(data_fft)
    peaks, = np.where(amp >= 0)  # use whole spectrum for rebuild
    return freq, amp, phase, peaks

def rebuild(fft_own):
    freq, amp, phase, peaks = fft_own
    df = freq[1] - freq[0]
    data_rebuild = 0
    for i in peaks:
        amplitude = amp[i] * df
        # amplitude = amp[i] * 0.1
        # amplitude = np.sqrt(amp[i] * df)
        data_rebuild += amplitude * np.exp(0+1j * (2*np.pi * freq[i] * t
                                                   + phase[i]))

    f, ax = plt.subplots(1, 1)
    # mask = (t >= 0) & (t <= t1-1)
    ax.plot(t, data_init, label="initial signal")
    ax.plot(t, np.real(data_rebuild), label="rebuild")
    # ax.plot(t[mask], (data_init - np.real(data_rebuild))[mask], label="diff")
    ax.set_xlim(0, t1-1)
    ax.legend()

t0 = 0
t1 = 10  # diff(t0, t1) ∝ df
# T = t1 - t0
N = 4096
t = np.linspace(t0, t1, int(N))
dt = (t1 - t0) / N

data_init = data()
fft_init = fft_own(data_init)
rebuild_init = rebuild(fft_init)
You should get a perfect reconstruction if you divide amp by N, and remove all your other factors.
Currently you do:
data_fft = fft(data) * np.pi # Multiply by pi
amp = 2/N * np.abs(data_fft) # Multiply by 2/N
amplitude = amp[i] * df # Multiply by df = 1/(dt*N) = 1/10
This means that you currently multiply by a total of pi * 2 / 10, or 0.628, that you shouldn't (only the 1/N factor in there is correct).
Correct code:
def fft_own(data):
    freq = fftfreq(N, dt)
    data_fft = fft(data)
    amp = np.abs(data_fft) / N
    phase = np.angle(data_fft)
    peaks, = np.where(amp >= 0)  # use whole spectrum for rebuild
    return freq, amp, phase, peaks

def rebuild(fft_own):
    freq, amp, phase, peaks = fft_own
    data_rebuild = 0
    for i in peaks:
        data_rebuild += amp[i] * np.exp(0+1j * (2*np.pi * freq[i] * t
                                                + phase[i]))
Your program can be significantly simplified by using ifft. Simply set to 0 those frequencies in data_fft that you don't want to include in the reconstruction, and apply ifft to it:
from numpy.fft import ifft

data_fft = fft(data)
data_fft[np.abs(data_fft) < threshold] = 0
rebuild = ifft(data_fft).real
Note that the Fourier transform of a Gaussian is a Gaussian, so you won't be picking out individual peaks, you are picking a compact range of frequencies that will always include 0. This is an ideal low-pass filter.
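Put together with the question's setup, the whole rebuild collapses to a few lines; a sketch, where the threshold value is an arbitrary placeholder to tune:

data_fft = fft(data_init)
threshold = 1e-6 * np.abs(data_fft).max()  # assumed cutoff, tune as needed
data_fft[np.abs(data_fft) < threshold] = 0
data_rebuild = ifft(data_fft).real  # amplitudes come out right with no manual factors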
I am trying to generate a white noise field with a specified power spectral density in V**2/Hz; however, I am having issues when I try to validate my attempt.
In Python, here is a quick example of what I am doing - I am not sure what I am doing wrong!
from scipy.signal import spectrogram
import numpy as np
n = 10000
fs = 1024
# Desired power spectral density = 60dB re 1 V**2 / Hz
amp = 10.0**( 60.0 / 10.0 )
# Frequency resolution = 1 / duration
f_res = fs / n
# Normalize amplitude to a per Hz bin
amp *= f_res
# Construct uniformly random phases
phases = np.random.rand(n) * 2 * np.pi
spectrum = amp * (np.cos(phases) + 1j * np.sin(phases))
# Set the 0-Hz offset to 0
spectrum[0] = 0 + 0j
# Create timeseries using ifft:
ts = np.fft.ifft(spectrum)
# Use scipy spectrogram to validate approach
f, t, a = spectrogram(ts, fs, window='hann', scaling='density', mode='psd', return_onesided=False)
print(10.0*np.log10(np.mean(a)))
Running this, I get an answer of ~30 dB - not what I expected! I expected 60 dB. If I change the desired PSD to 70 dB, I get 50 dB, and if I change the desired to 90 dB, I get approximately 90 dB.
I feel like I am missing a scaling factor somewhere, or I am interpreting something incorrectly. Can anyone help me out here?
I am working on finding the frequencies from a given dataset and I am struggling to understand how np.fft.fft() works. I thought I had a working script but ran into a weird issue that I cannot understand.
I have a dataset that is roughly sinusoidal and I wanted to understand what frequencies the signal is composed of. Once I took the FFT, I got this plot:
However, when I take the same dataset, slice it in half, and plot the same thing, I get this:
I do not understand why the peak frequency drops from 144kHz to 128kHz, even though it is technically the same dataset, just half as long.
I can confirm a few things:
- The step size between data points is 0.001.
- I have tried interpolation, with little luck.
- If I slice out the second half of the dataset instead, I get a different frequency as well.
If my dataset is indeed composed of both 128 and 144kHz, then why doesn't the 128 peak show up in the first plot?
What is even more confusing is that I am running a script with pure sine waves without issues:
import numpy as np
import matplotlib.pyplot as plt

T = 0.001
fs = 1 / T

def find_nearest_ind(data, value):
    return (np.abs(data - value)).argmin()

x = np.arange(0, 30, T)
ff = 0.2
y = np.sin(2 * ff * np.pi * x)

x = x[:len(x) // 2]
y = y[:len(y) // 2]

n = len(y)  # length of the signal
k = np.arange(n)
T = n / fs
frq = k / T * 1e6 / 1000  # two sides frequency range
frq = frq[:len(frq) // 2]  # one side frequency range

Y = np.fft.fft(y) / n  # dft and normalization
Y = Y[:n // 2]

frq = frq[:50]
Y = Y[:50]

fig, (ax1, ax2) = plt.subplots(2)
ax1.plot(x, y)
ax1.set_xlabel("Time (us)")
ax1.set_ylabel("Electric Field (V / mm)")
peak_ind = find_nearest_ind(abs(Y), np.max(abs(Y)))
ax2.plot(frq, abs(Y))
ax2.axvline(frq[peak_ind], color='black', linestyle='--',
            label=F"Frequency = {round(frq[peak_ind], 3)}kHz")
plt.legend()
plt.xlabel('Freq(kHz)')
ax1.title.set_text('dV/dX vs. Time')
ax2.title.set_text('Frequencies')
fig.tight_layout()
plt.show()
Here is a breakdown of your code, with some suggestions for improvement, and extra explanations. Working through it carefully will show you what is going on. The results you are getting are completely expected. I will propose a common solution at the end.
First set up your units correctly. I assume that you are dealing with seconds, not microseconds. You can adjust later as long as you stay consistent.
Establish the period and frequency of the sampling. This means that the Nyquist frequency for the FFT will be 500Hz:
T = 0.001 # 1ms sampling period
fs = 1 / T # 1kHz sampling frequency
Make a time domain of 30e3 points. The half domain will contain 15000 points. That implies a frequency resolution of 500Hz / 15k = 0.03333Hz.
x = np.arange(0, 30, T) # time domain
n = x.size # number of points: 30000
Before doing anything else, we can define our frequency domain right here. I prefer a more intuitive approach than the one you are using: that way you don't have to redefine T or introduce the auxiliary variable k. But as long as the results are the same, it does not really matter:
F = np.linspace(0, 1 - 1/n, n) / T # Notice F[1] = 0.03333, as predicted
Now define the signal. You picked ff = 0.2. Notice that 0.2Hz / 0.03333Hz = 6, so you would expect to see your peak at exactly bin index 6 (F[6] == 0.2). To better illustrate what is going on, let's take ff = 0.22 instead. This will bleed the spectrum into neighboring bins.
ff = 0.22
y = np.sin(2 * np.pi * ff * x)
Now take the FFT:
Y = np.fft.fft(y) / n
maxbin = np.abs(Y).argmax() # 7
maxF = F[maxbin] # 0.23333333: This is the nearest bin
Since your frequency bins are ~0.03Hz wide, the best resolution you can expect is ~0.015Hz. For your real data, which has much lower resolution, the error is much larger.
Now let's take a look at what happens when you halve the data size. Among other things, the frequency bins become wider. Now you have a maximum frequency of 500Hz spread over 7.5k bins instead of 15k: the resolution drops to 0.06666Hz per bin:
n2 = n // 2 # 15000
F2 = np.linspace(0, 1 - 1 / n2, n2) / T # F2[1] = 0.06666
Y2 = np.fft.fft(y[:n2]) / n2
Take a look what happens to the frequency estimate:
maxbin2 = np.abs(Y2).argmax() # 3
maxF2 = F2[maxbin2] # 0.2: This is the nearest bin
Hopefully, you can see how this applies to your original data. In the full FFT, you have a resolution of ~16.1kHz per bin, and ~32.2kHz per bin with the half data. So your original result is within ~±8kHz of the right peak, while the second one is within ~±16kHz. The true frequency is therefore between 136kHz and 144kHz. Another way to look at it is to compare the bins that you showed me:
full: 128.7 144.8 160.9
half: 96.6 128.7 160.9
When you take out exactly half of the data, you drop every other frequency bin. If your peak was originally closest to 144.8kHz, and you drop that bin, it will end up in either 128.7 or 160.9.
Note: Based on the bin numbers you show, I suspect that your computation of frq is a little off. Notice the 1 - 1/n in my linspace expression. You need that to get the right frequency axis: the last bin is (1 - 1/n) / T, not 1 / T, no matter how you compute it.
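Note also that numpy has a standard helper that gets both the bin spacing and the last bin right for you; a sketch equivalent to the linspace expression above:

frq = np.fft.fftfreq(n, T)  # bin k sits at k / (n*T); the second half is negative
frq = frq[:n // 2]          # keep the one-sided range, as in your code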
So how to get around this problem? The simplest solution is to do a parabolic fit on the three points around your peak. That is usually a sufficiently good estimator of the true frequency in the data when you are looking for essentially perfect sinusoids.
def peakF(F, Y):
    index = np.abs(Y).argmax()
    # Compute offset on normalized domain [-1, 0, 1], not F[index-1:index+2]
    y = np.abs(Y[index - 1:index + 2])
    # This is the offset from zero, which is the scaled offset from F[index]
    vertex = (y[0] - y[2]) / (0.5 * (y[0] + y[2]) - y[1])
    # F[1] is the bin resolution
    return F[index] + vertex * F[1]
In case you are wondering how I got the formula for the parabola: I solved the system with x = [-1, 0, 1] and y = Y[index - 1:index + 2]. The matrix equation is
[(-1)^2  -1  1]   [a]   [Y[index - 1]]
[  0^2    0  1] x [b] = [Y[index]    ]
[  1^2    1  1]   [c]   [Y[index + 1]]
Computing the offset using a normalized domain and scaling afterwards is almost always more numerically stable than using whatever huge numbers you have in F[index - 1:index + 2].
You can plug the results in the example into this function to see if it works:
>>> peakF(F, Y)
0.2261613409657391
>>> peakF(F2, Y2)
0.20401580936430794
As you can see, the parabolic fit gives an improvement, however slight. There is no replacement for just increasing frequency resolution through more samples though!
Last month, I posted this question about how to concatenate sine waves WHEN you are generating them, but now I've run into a different situation: I need to generate a sine that continues from the end of another sine I did not generate.
My solution was based on the second answer to my previous question: compute the Hilbert transform, calculate the angle with numpy.angle, normalize it by adding 90, and generate the next sine from there. It works, but only when my frequency value ends in 0 or 5; otherwise the waves don't match and I have no clue why.
from scipy.signal import hilbert
import numpy as np
from matplotlib import pyplot as plt
N = 1024
t = np.linspace(0, 1, N)
freq = 5.0
c = np.sin(2 * np.pi * freq * t + 0.0)
c2 = np.angle(hilbert(c), True) # in degrees
plt.subplot(2, 1, 1)
plt.grid()
plt.plot(c)
plt.subplot(2, 1, 2)
phase = c2[-1] + 90
c3 = np.sin(2.0 * np.pi * freq * t + np.deg2rad(phase))
plt.grid()
plt.plot(c3)
plt.show()
[output plots: Frequency: 5.0, Frequency: 5.8]
When the values at the beginning and the end of the time interval do not agree, boundary effects appear and distort the Hilbert transform. (Recall that the Fourier transform reacts poorly to discontinuities.) This can be seen by plotting the end of c2 with plt.plot(c2[-200:] + 90): notice the distortion toward the end, where the curve is supposed to rise with constant slope.
You'll get better results by stepping back one period from the edge of the time window:
phase = c2[-1 - int(N//freq)] + 90
I tried it with frequency 5.8: the beginning of the second curve now matches the end of the first.
It is not clear what your exact problem scope is. In the previous question, in a comment which spawned this followup question, you said:
If I don't have the generation equation ( say, I've got a chunk from mic ) what would be the approach?
Does this mean the data is not necessarily a sine wave? Is it noisy? Is it of varying magnitude? You mention DSP: are you doing the processing in real time, or can the analysis take as long as needed?
If it is a clean sine wave of known magnitude, it is relatively easy to extract the phase from the end of the signal, to allow a smooth continuation.
The phase is sin⁻¹(y/mag). Within one cycle, there are two angles whose sine equals y/mag: one where sin(angle) is increasing with increasing angle, and one where it is decreasing. By looking at the previous point, we can determine which one we need.
import math
import numpy as np

def ending_phase(c, mag):
    angle = math.asin(c[-1] / mag)
    if c[-2] > c[-1]:  # the sine was decreasing, so take the mirrored angle
        angle = np.pi - angle
    return angle
From the phase of the last point, and the phase of the second last point, we can extrapolate the phase for the next point.
def next_phase(c, mag):
    ph1 = ending_phase(c[:-1], mag)
    ph2 = ending_phase(c, mag)
    return 2 * ph2 - ph1
Passing the previous chunk to next_phase() computes the phase argument required to smoothly continue the chunk.
from matplotlib import pyplot as plt

N = 1024
t = np.linspace(0, 1, N)
mag = 1.2
freq = 5.2
phase = 2.2
c1 = mag * np.sin(2 * np.pi * freq * t + phase)

plt.subplot(2, 2, 1)
plt.grid()
plt.plot(c1)

freq = 3.8
phase = next_phase(c1, mag)
c2 = mag * np.sin(2 * np.pi * freq * t + phase)

plt.subplot(2, 2, 2)
plt.grid()
plt.plot(c2)

c3 = np.concatenate((c1, c2))

plt.subplot(2, 1, 2)
plt.grid()
plt.plot(c3)

plt.show()
I'm new to Python programming and I wanted to know if there is a way to create a high-pass filter for a periodic function like this one:
import numpy as np
from scipy.signal import lfilter, firwin, butter
from pylab import figure, plot, show
sample_rate = .0167
nsamples = 480
F_1Hz = 1.38e-4
A_1Hz = 1.0
F_15Hz = .0011
A_15Hz = .5
t = np.arange(nsamples) / sample_rate
signal = A_1Hz * np.sin(2*np.pi*F_1Hz*t) + A_15Hz*np.sin(2*np.pi*F_15Hz*t)
signal[::120] = 2
figure(1)
plot(t,signal,'b')
show()
I want to keep the higher frequency (.0011 Hz) as well as the spikes of 2 at certain spots. However, the amplitude of the .0011 Hz component needs to stay at .5 and the spikes need to stay at an amplitude of 2, so normalizing isn't an option. Moreover, if I made the function have spikes of 2 at non-periodic intervals (say, a spike only at signal[prime numbers]), could I still filter it correctly, with the correct amplitudes?
One possibility is to use a custom high-pass filter. A simple way to make a high-pass filter is to start with a low-pass filter:
def lp_win_sinc(tw, fc, n):
    m = int(np.ceil(2. / tw) * 2)
    samps = np.arange(m + 1)
    shift = samps - m // 2
    shift[m // 2] = 1              # avoid division by zero at the center tap
    h = np.sin(2 * np.pi * fc * shift) / shift
    h[m // 2] = 2 * np.pi * fc     # limit of sin(x)/x at the center
    h = h * np.blackman(m + 1)     # taper with a Blackman window
    h = h / h.sum()                # normalize to unity gain at DC
    s = np.zeros(n)
    s[:len(h)] = h
    return np.roll(s, -(m // 2))
Then construct a simple high-pass
def hp_win_sinc(tw, fc, n):
    hp = -lp_win_sinc(tw, fc, n)
    hp[0] = hp[0] + 1  # spectral inversion: delta minus low-pass
    return hp
(The ideas behind these are found in http://www.dspguide.com/pdfbook.htm, look at the chapter on windowed-sinc filters.)
Note: these are the impulse responses of the respective filters. To apply them to your data you can either convolve the impulse with your data, or you can fft your data and the impulse response and take the inverse fft of their product. In your case, e.g.
hp = hp_win_sinc(0.2, 0.001, len(signal))
f_hp = np.fft.rfft(hp)
f_d = np.fft.rfft(signal)
filt_sig = np.fft.irfft( f_hp * f_d)
plotting this quick result gives:
[plot: filtered data]
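Since the filter is applied as a pointwise product on the rfft, you can also inspect the exact gain it applies to each frequency bin; a quick sketch using the names above:

gain = np.abs(np.fft.rfft(hp))  # per-bin gain of the filter as applied
f_axis = np.fft.rfftfreq(len(signal), d=1/sample_rate)
figure(2)
plot(f_axis, gain, 'g')
show()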
Depending on your exact application, you might be able to simply adjust the gain to recover the 2.0 and 0.5 amplitudes. Hope this helps. Good luck!
The answer is quite likely no.
The reason behind this blunt answer is that your spikes (which have a value of 2) stand on top of the signal. If you filter anything away, your signal amplitude may change at the spikes.
If you could change this:
signal[::120] = 2
into
signal[::120] += 2
then such a filter can be constructed. What do you want to filter away? Anything below .0011 Hz?
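For completeness, a minimal sketch of such a high-pass using scipy's butter (already imported in the question) and filtfilt; the 5e-4 Hz cutoff is an assumed value sitting between the two tones, so tune it to taste:

from scipy.signal import butter, filtfilt

nyq = sample_rate / 2                           # Nyquist = 0.00835 Hz
b, a = butter(4, 5e-4 / nyq, btype='highpass')  # assumed cutoff: 5e-4 Hz
filtered = filtfilt(b, a, signal)               # zero-phase, keeps spike timing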