I am trying to detect the pitch of a B3 note played with a guitar. The audio can be found here.
This is the spectrogram:
As you can see, the fundamental is at about 250 Hz, which corresponds to a B3 note.
It also contains a good number of harmonics, which is why I chose to use the harmonic product spectrum (HPS) method from here. I am using this code for detecting the pitch:
# NB: mean, log, copy, arange, argmax, kaiser, rfft, decimate and parabolic are
# pulled in via the imports in the linked repo (numpy/scipy plus its parabolic helper).
def freq_from_hps(signal, fs):
    """Estimate frequency using harmonic product spectrum

    Low frequency noise piles up and overwhelms the desired peaks
    """
    N = len(signal)
    signal -= mean(signal)  # Remove DC offset

    # Compute Fourier transform of windowed signal
    windowed = signal * kaiser(N, 100)

    # Get spectrum
    X = log(abs(rfft(windowed)))

    # Downsample sum logs of spectra instead of multiplying
    hps = copy(X)
    for h in arange(2, 9):  # TODO: choose a smarter upper limit
        dec = decimate(X, h)
        hps[:len(dec)] += dec

    # Find the peak and interpolate to get a more accurate peak
    i_peak = argmax(hps[:len(dec)])
    i_interp = parabolic(hps, i_peak)[0]

    # Convert to equivalent frequency
    return fs * i_interp / N  # Hz
My sampling rate is 40000. However, instead of getting a result close to 250Hz (B3 note), I am getting 0.66Hz. How is this possible?
I also tried with an autocorrelation method from the same repo but I also get bad results like 10000Hz.
Thanks to an answer, I now understand that I have to apply a filter to remove the low frequencies from the signal. How do I do that? Are there multiple ways to do it, and which one is recommended?
STATUS UPDATE:
The high-pass filter proposed by the answer is working. If I apply the function in the answer to my audio signal, it correctly reports about 245 Hz. However, I would like to filter the whole signal, not only a part of it. A note could lie in the middle of the signal, or a signal could contain more than one note (I know a solution is onset detection, but I am curious to know why this isn't working). That is why I edited the code to return filtered_audio.
The problem is that if I do that, even though the noise has been correctly removed (see screenshot), I get 0.05 Hz as a result.
Based on the distances between the harmonics in the spectrogram, I would estimate the pitch to be about 150-200 Hz. So, why doesn't the pitch detection algorithm detect the pitch that we can see by eye in the spectrogram? I have a few guesses:
The note only lasts for a few seconds. At the beginning, there is a beautiful harmonic stack with 10 or more harmonics! These quickly fade away and are not even visible after 5 seconds. If you are trying to estimate the pitch of the entire signal, your estimate might be contaminated by the "pitch" of the sound from 5-12 seconds. Try computing the pitch only for the first 1-2 seconds.
There is too much low frequency noise. In the spectrogram, you can see a lot of power between 0 and 64 Hz. This is not part of the harmonics, so you could try removing it with a high-pass filter.
Here is some code that does the job:
import numpy as np
from scipy.io import wavfile
from scipy import signal
import matplotlib.pyplot as plt
from frequency_estimator import freq_from_hps
# downloaded from https://github.com/endolith/waveform-analyzer/
filename = 'Vocaroo_s1KZzNZLtg3c.wav'
# downloaded from http://vocaroo.com/i/s1KZzNZLtg3c
# Parameters
time_start = 0 # seconds
time_end = 1 # seconds
filter_stop_freq = 70 # Hz
filter_pass_freq = 100 # Hz
filter_order = 1001
# Load data
fs, audio = wavfile.read(filename)
audio = audio.astype(float)
# High-pass filter
nyquist_rate = fs / 2.
desired = (0, 0, 1, 1)
bands = (0, filter_stop_freq, filter_pass_freq, nyquist_rate)
filter_coefs = signal.firls(filter_order, bands, desired, nyq=nyquist_rate)
# Examine our high pass filter
w, h = signal.freqz(filter_coefs)
f = w / 2 / np.pi * fs # convert radians/sample to cycles/second
plt.plot(f, 20 * np.log10(abs(h)), 'b')
plt.ylabel('Amplitude [dB]', color='b')
plt.xlabel('Frequency [Hz]')
plt.xlim((0, 300))
# Apply high-pass filter
filtered_audio = signal.filtfilt(filter_coefs, [1], audio)
# Only analyze the audio between time_start and time_end
time_seconds = np.arange(filtered_audio.size, dtype=float) / fs
audio_to_analyze = filtered_audio[(time_seconds >= time_start) &
                                  (time_seconds <= time_end)]
fundamental_frequency = freq_from_hps(audio_to_analyze, fs)
print('Fundamental frequency is {} Hz'.format(fundamental_frequency))
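On the question of whether there are multiple ways to do the filtering: yes. The FIR design above (firls plus filtfilt) is one option; an IIR design such as a Butterworth high-pass applied with sosfiltfilt is another common one. A minimal sketch, reusing fs, audio, filter_pass_freq and nyquist_rate from the code above (the order of 4 is just an illustrative choice, not a tuned value):

# IIR alternative: 4th-order Butterworth high-pass, applied forwards and backwards
# (zero phase) with sosfiltfilt instead of the linear-phase FIR design above
sos = signal.butter(4, filter_pass_freq / nyquist_rate, btype='highpass', output='sos')
filtered_audio_iir = signal.sosfiltfilt(sos, audio)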
I am trying to visualize a frequency spectrum using the Burg algorithm. The data I am trying to visualize is the distance between heartbeats in milliseconds (e.g. [700, 650, 689, ..., 702]). The time distance is measured from the R peak of one heartbeat to the R peak of the next.
Now I would like to visualize the frequency band with Python's spectrum library (I'm a total noob). The minimum frequency I am trying to display is 0.0033 Hz, so the time differences in my dataset add up to 5 minutes.
My approach was to first take the reciprocal of each value, then multiply by 1000, and then multiply by 60. This should give me the BPM for each heartbeat.
This is what it looks like: [67.11409396 64.72491909 ... 64.58557589]
Afterwards I use spectrum's burg algorithm to create the PSD. The "data" list contains my BpM for each heartbeat.
AR, rho, ref = arburg(data.tolist(), 7)
PSD = arma2psd(AR, rho=rho, NFFT=1024)
PSD = PSD[len(PSD):len(PSD)//2:-1]
plot(linspace(0, 0.5, len(PSD)), 10*log10(abs(PSD)*2./(1.*pi)))
pylab.legend(['PSD estimate of x using Burg AR(7)'])
The graph that I get looks like this:
5 Minutes Spectrogram
This specific data already exists as a 3D-Spectrogram (Graph above is the equivalent to the last 5 Minutes of 3D-Spectrogram):
Long Time 3D-Spectrogram
My graph does not seem to match the 3D spectrogram; my frequencies are way off. What causes this, and how can I fix it?
Also, I would like the y-axis in my graph in absolute values rather than in dB. I tried:
plot(linspace(0, 0.5, len(PSD)), abs(PSD))
but that did not really work; it just drew a hyperbola.
Thank you for your help!
The spectrum package comes with a pburg class that can generate a frequencies array; this is shown below. If you want a direct comparison between a spectrogram and AR PSDs, I would use the time definition from the spectrogram to also compute the AR PSD per window.
Also, your spectrogram example image looks focused on very low frequencies, so you may want to increase nfft to improve the frequency resolution.
import matplotlib.pyplot as plt
from scipy.signal import spectrogram
import numpy as np
from spectrum import pburg
# Parameter settings
n_seconds = 10
fs = 1000 # sampling rate, in hz
freq = 10
nfft = 4096
nperseg = fs
order = 8
# Simulate 10 hz sine wave with white noise
x = np.sin(np.arange(0, n_seconds, 1/fs) * freq * 2 * np.pi)
x += np.random.rand(len(x)) / 10
# Compute spectrogram
freqs, times, powers = spectrogram(x, fs=fs, nfft=nfft)
# Get spectrogram time definition
times = (times * fs).astype(int)
window_times = np.array((times-times[0], times+times[0])).T
# Compute Burg's spectrum per window
powers_burg = np.array([pburg(x[t[0]:t[1]], order=order,
                              NFFT=nfft, sampling=fs).psd for t in window_times]).T
freqs_burg = np.array(pburg(x, order=order, NFFT=nfft, sampling=fs).frequencies())
# Plot
inds = np.where(freqs < 20)
inds_burg = np.where(freqs_burg < 20)
fig, axes = plt.subplots(ncols=2, figsize=(10, 5))
axes[0].pcolormesh(times/fs, freqs[inds], powers[inds], shading='gouraud')
axes[1].pcolormesh(times/fs, freqs_burg[inds_burg], powers_burg[inds_burg], shading='gouraud')
axes[0].set_title('Spectrogram')
axes[1].set_title('Burg\'s Spectrogram')
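To connect this back to the heartbeat data in the question: the same pburg call can be pointed at the BpM series, with sampling set to the effective sampling rate of that series so the frequency axis comes out in Hz rather than normalized frequency. A rough sketch, assuming data is the BpM array from the question and approximating the (actually uneven) beat spacing by the mean RR interval - that approximation is my assumption, not something from the spectrum docs:

import numpy as np
from spectrum import pburg

# data: one BpM value per heartbeat, as in the question
mean_rr = 60.0 / np.mean(data)      # mean beat-to-beat interval, in seconds
fs_beats = 1.0 / mean_rr            # effective samples (beats) per second
p = pburg(data, order=7, NFFT=1024, sampling=fs_beats)
freqs = np.array(p.frequencies())   # frequency axis in Hz (0 .. fs_beats/2)
psd = np.array(p.psd)               # PSD in linear units; use 10*log10(psd) for dB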
I am trying to detect sudden loud noises in audio recordings. One way I have found to do this is by creating a spectrogram of the audio and adding the values of each column. By graphing the sum of the values in each column, one can see a spike every time there is a sudden loud noise. The problem is that, in my use case, I need to play a beep tone (with a frequency of 2350 Hz), while the audio is being recorded. The spectrogram of the beep looks like this:
As you can see, at the beginning and the end of this beep (which is a simple tone with a frequency of 2350 Hz), there are other frequencies present, which I have been unsuccessful in removing. These unwanted frequencies cause a spike when summing up the columns of the spectrogram, at the beginning and at the end of the beep. I want to avoid this because I don't want my beep to be detected as a sudden loud noise. See the spectrogram below for reference:
Here is the graph of the sum of each column in the spectrogram:
Obviously, I want to avoid false positives in my algorithm, so I need some way of getting rid of the spikes caused by the beginning and end of the beep. One idea I have had so far is to add random noise with a low decibel value above and/or below the 2350 Hz line in the beep spectrogram above. Ideally, this would create a tone that sounds very similar to the original, but instead of creating a spike when I add up all the values in a column, it would create more of a plateau. Is this idea a feasible solution to my problem? If so, how would I go about creating a beep sound with random noise like I described using Python? Is there another, easier solution to my problem that I am overlooking?
Currently, I am using the following code to generate my beep sound:
import math
import wave
import struct
audio = []
sample_rate = 44100.0
def append_sinewave(
        freq=440.0,
        duration_milliseconds=500,
        volume=1.0):
    """
    The sine wave generated here is the standard beep. If you want something
    more aggressive you could try a square or saw tooth waveform. Though there
    are some rather complicated issues with making high quality square and
    sawtooth waves... which we won't address here :)
    """
    global audio  # using global variables isn't cool.

    num_samples = duration_milliseconds * (sample_rate / 1000.0)

    for x in range(int(num_samples)):
        audio.append(volume * math.sin(2 * math.pi * freq * (x / sample_rate)))

    return

def save_wav(file_name):
    # Open up a wav file
    wav_file = wave.open(file_name, "w")

    # wav params
    nchannels = 1
    sampwidth = 2

    # 44100 is the industry standard sample rate - CD quality. If you need to
    # save on file size you can adjust it downwards. The standard for low quality
    # is 8000 or 8kHz.
    nframes = len(audio)
    comptype = "NONE"
    compname = "not compressed"
    wav_file.setparams((nchannels, sampwidth, sample_rate, nframes, comptype, compname))

    # WAV files here are using short, 16 bit, signed integers for the
    # sample size. So we multiply the floating point data we have by 32767, the
    # maximum value for a short integer. NOTE: It is theoretically possible to
    # use the floating point -1.0 to 1.0 data directly in a WAV file but not
    # obvious how to do that using the wave module in python.
    for sample in audio:
        wav_file.writeframes(struct.pack('h', int(sample * 32767.0)))

    wav_file.close()

    return
append_sinewave(volume=1, freq=2350)
save_wav("output.wav")
Not really an answer - more of a question.
You're asking the speaker to go from stationary to a sine wave instantaneously - that is quite hard to do (though the frequencies aren't that high). If it does manage it, then the spectrum of the received signal should be the convolution of the spectrum of a top-hat (rectangular) window with that of the sine wave (sort of like what you are seeing, but without the data and knowing exactly how you computed the spectrogram, it's hard to tell).
In either case you could check this by smoothing the start and end of your tone. Something like this for your tone generation:
tr = 0.05  # rise time, in seconds
tf = duration_milliseconds / 1000.0  # finish time of tone, in seconds
for x in range(int(num_samples)):
    t = x / sample_rate  # Time of sample in seconds
    # Calculate a bump function
    bump_function = 1
    if 0 < t < tr:  # go smoothly from 0 to 1 at the start of the tone
        tp = 1 - t / tr
        bump_function = math.e * math.exp(1 / (tp**2 - 1))
    elif tf - tr < t < tf:  # go smoothly from 1 to 0 at the end of the tone
        tp = 1 + (t - tf) / tr
        bump_function = math.e * math.exp(1 / (tp**2 - 1))
    audio.append(volume * bump_function * math.sin(2 * math.pi * freq * t))
You might need to tune the rise time a bit. With this form of bump function you know that you have a full volume tone from tr after the start to tr before the end. Lots of other functions exist, but if this smooths the start/stop effects in your spectrogram then you at least know why they are there. And prevention is generally better than trying to remove the effect in post-processing.
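If it helps, here is a sketch of how that smoothing might be folded back into the original append_sinewave. It reuses the question's globals (audio, sample_rate, math, save_wav); the 0.05 s rise time is the same guess as above:

def append_sinewave_smoothed(freq=440.0, duration_milliseconds=500, volume=1.0, tr=0.05):
    """Same beep as append_sinewave, but with a smooth fade-in/fade-out (bump function)."""
    global audio
    num_samples = int(duration_milliseconds * (sample_rate / 1000.0))
    tf = duration_milliseconds / 1000.0  # end time of the tone, in seconds
    for x in range(num_samples):
        t = x / sample_rate
        bump_function = 1.0
        if 0 < t < tr:                    # ramp up smoothly at the start
            tp = 1 - t / tr
            bump_function = math.e * math.exp(1 / (tp**2 - 1))
        elif tf - tr < t < tf:            # ramp down smoothly at the end
            tp = 1 + (t - tf) / tr
            bump_function = math.e * math.exp(1 / (tp**2 - 1))
        audio.append(volume * bump_function * math.sin(2 * math.pi * freq * t))

append_sinewave_smoothed(volume=1, freq=2350)
save_wav("output_smoothed.wav")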
I'm trying to make a wavetable synthesizer in Python for the first time (based on an example I found here: https://blamsoft.com/tutorials/expanse-creating-wavetables/), but the resulting sound doesn't sound tonal at all; my output is just a low, grainy buzz. I'm pretty new to making wavetables in Python, and I was wondering if anybody might be able to tell me what I'm missing in order to write an A440 sine wavetable to the file "wavetable.wav" and have it actually produce a pure sine tone. Here's what I have at the moment:
import wave
import struct
import numpy as np
frame_count = 256
frame_size = 2048
sps = 44100
freq_hz = 440
file = "wavetable.wav" #write waveform to file
wav_file = wave.open(file, 'w')
wav_file.setparams((1, 2, sps, frame_count, 'NONE', 'not compressed'))
values = bytes(0)
for i in range(frame_count):
    for ii in range(frame_size):
        sample = np.sin((float(ii)/frame_size) * (i+128)/256 * 2 * np.pi * freq_hz/sps) * 65535
        if sample < 0:
            sample = 0
        sample -= 32768
        sample = int(sample)
        values += struct.pack('h', sample)
wav_file.writeframes(values)
wav_file.close()
print("Generated " + file)
The sine function I have inside the for loop is probably the part I understand the least because I just went by the example verbatim. I'm used to making sine functions like (y = Asin(2πfx)) but I'm not sure what the purpose is of multiplying by ((i+128)/256) and 65535 (16-bit amplitude resolution?). I'm also not sure what the purpose is of subtracting 32768 from each sample. Is anyone able to clarify what I'm missing and maybe point me in the right direction? Am I going about this the wrong way? Any help is appreciated!
If you just wanted to generate sound data ahead of time and then dump it all into a file, and you’re also comfortable using NumPy, I’d suggest using it with a library like SoundFile. Then there’s no need to delimit the data into frames.
Starting with a naïve approach (using numpy.sin, not trying to optimize anything yet), one ends up with something like this:
from math import floor, tau
import numpy as np
import soundfile as sf
file_path = 'sine.flac'
sample_rate = 48_000 # hertz
duration = 1.0 # seconds
frequency = 432.0 # hertz
amplitude = 0.8 # (not in decibels!)
start_phase = 0.0 # at what phase to start
sample_count = floor(sample_rate * duration)
# cyclical frequency in sample^-1
omega = frequency * tau / sample_rate
# all phases for which we want to sample our sine
phases = np.linspace(start_phase, start_phase + omega * sample_count,
                     sample_count, endpoint=False)
# our sine wave samples, generated all at once
audio = amplitude * np.sin(phases)
# now write to file
fmt, sub = 'FLAC', 'PCM_24'
assert sf.check_format(fmt, sub) # to make sure we ask the correct thing beforehand
sf.write(file_path, audio, sample_rate, format=fmt, subtype=sub)
This will be a mono sound; you can write stereo using 2D arrays (see NumPy's and SoundFile's docs).
But note that to make a wavetable specifically, you need to be sure it contains just a single period (or an integer number of periods) of the wave exactly, so the playback of the wavetable will be without clicks and have a correct frequency.
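For example, a single exact period of a sine, normalized to ±1, could serve as such a wavetable (the length of 2048 simply mirrors the frame size from the question):

import numpy as np
from math import tau

# one exact period of a sine wave; loops seamlessly and stays within [-1, 1]
wavetable = np.sin(np.linspace(0.0, tau, 2048, endpoint=False))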
You can play chunked sound in real time in Python too, using something like PyAudio. (I’ve not yet used that, so at least for a time this answer would lack code related to that.)
Finally, frankly, all above is unrelated to the generation of sound data from a wavetable: you just pick a wavetable from somewhere, that doesn’t do much for actual synthesis. Here is a simple starting algorithm for that. Assume you want to play back a chunk of sample_count samples and have a wavetable stored in wavetable, a single period which loops perfectly and is normalized. And assume your current wave phase is start_phase, frequency is frequency, sample rate is sample_rate, amplitude is amplitude. Then:
# indices for the wavetable values; this is just for `np.interp` to work
wavetable_period = float(len(wavetable))
wavetable_indices = np.linspace(0, wavetable_period,
                                len(wavetable), endpoint=False)
# frequency of the wavetable played at native resolution
wavetable_freq = sample_rate / wavetable_period
# start index into the wavetable
start_index = start_phase * wavetable_period / tau
# code above you run just once at initialization of this wavetable ↑
# code below is run for each audio chunk ↓
# samples of wavetable per output sample
shift = frequency / wavetable_freq
# fractional indices into the wavetable
indices = np.linspace(start_index, start_index + shift * sample_count,
                      sample_count, endpoint=False)
# linearly interpolated wavetable sampled at our frequency
audio = np.interp(indices, wavetable_indices, wavetable,
                  period=wavetable_period)
audio *= amplitude
# at last, update `start_index` for the next chunk
start_index += shift * sample_count
Then you output the audio. Though there are better ways to play back a wavetable, linear interpolation is at least a fine start. Frequency slides are also possible with this approach: just compute indices in another way, no longer spaced uniformly.
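Putting those pieces together, here is a compact, self-contained sketch that renders one second of a 440 Hz tone from the sine wavetable above and writes it out with SoundFile (the file name and the 48 kHz / 0.8 amplitude values are just illustrative choices):

from math import tau
import numpy as np
import soundfile as sf

sample_rate = 48_000          # hertz
frequency = 440.0             # hertz
amplitude = 0.8
start_phase = 0.0
sample_count = sample_rate    # one second of output

# single-period sine wavetable, as above
wavetable = np.sin(np.linspace(0.0, tau, 2048, endpoint=False))

# one-time setup for this wavetable
wavetable_period = float(len(wavetable))
wavetable_indices = np.linspace(0, wavetable_period, len(wavetable), endpoint=False)
wavetable_freq = sample_rate / wavetable_period
start_index = start_phase * wavetable_period / tau

# render one chunk by linear interpolation into the wavetable
shift = frequency / wavetable_freq
indices = np.linspace(start_index, start_index + shift * sample_count,
                      sample_count, endpoint=False)
audio = amplitude * np.interp(indices, wavetable_indices, wavetable,
                              period=wavetable_period)
start_index += shift * sample_count   # ready for the next chunk

sf.write('wavetable_a440.flac', audio, sample_rate, format='FLAC', subtype='PCM_24')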
I am using the Librosa library for pitch and onset detection. Specifically, I am using onset_detect and piptrack.
This is my code:
def detect_pitch(y, sr, onset_offset=5, fmin=75, fmax=1400):
    y = highpass_filter(y, sr)

    onset_frames = librosa.onset.onset_detect(y=y, sr=sr)
    pitches, magnitudes = librosa.piptrack(y=y, sr=sr, fmin=fmin, fmax=fmax)
    notes = []

    for i in range(0, len(onset_frames)):
        onset = onset_frames[i] + onset_offset
        index = magnitudes[:, onset].argmax()
        pitch = pitches[index, onset]
        if (pitch != 0):
            notes.append(librosa.hz_to_note(pitch))

    return notes

def highpass_filter(y, sr):
    filter_stop_freq = 70  # Hz
    filter_pass_freq = 100  # Hz
    filter_order = 1001

    # High-pass filter
    nyquist_rate = sr / 2.
    desired = (0, 0, 1, 1)
    bands = (0, filter_stop_freq, filter_pass_freq, nyquist_rate)
    filter_coefs = signal.firls(filter_order, bands, desired, nyq=nyquist_rate)

    # Apply high-pass filter
    filtered_audio = signal.filtfilt(filter_coefs, [1], y)
    return filtered_audio
When running this on guitar audio samples recorded in a studio, i.e. samples without noise (like this), I get very good results from both functions. The onset times are correct and the frequencies are almost always correct (with occasional octave errors).
However, a big problem arises when I try to record my own guitar sounds with my cheap microphone. I get audio files with noise, such as this. The onset_detect algorithm gets confused and thinks that the noise contains onsets, so I get very bad results: I get many onset times even if my audio file consists of a single note.
Here are two waveforms. The first is of a guitar sample of a B3 note recorded in a studio, whereas the second is my recording of an E2 note.
The result of the first is correctly B3 (the one onset time was detected).
The result of the second is an array of 7 elements, which means that 7 onset times were detected, instead of 1! One of those elements is the correct onset time, other elements are just random peaks in the noise part.
Another example is this audio file containing the notes B3, C4, D4, E4:
As you can see, the noise is clear and my high-pass filter has not helped (this is the waveform after applying the filter).
I assume this is a matter of noise, as the difference between those files lies there. If yes, what could I do to reduce it? I have tried using a high-pass filter but there is no change.
I have three observations to share.
First, after a bit of playing around, I've concluded that the onset detection algorithm appears to have been designed to automatically rescale its own operation in order to take into account local background noise at any given instant. This is likely so that it can detect onset times in pianissimo sections with the same likelihood as in fortissimo sections. It has the unfortunate result that the algorithm tends to trigger on the background noise coming from your cheap microphone - the onset detection algorithm honestly thinks it's simply listening to pianissimo music.
A second observation is that roughly the first ~2200 samples in your recorded example (roughly the first 0.1 seconds) are a bit wonky, in the sense that the noise truly is nearly zero during that short initial interval. Try zooming way into the waveform at the starting point and you'll see what I mean. Unfortunately, the start of the guitar playing follows so quickly after the noise onset (roughly around sample 3000) that the algorithm is unable to resolve the two independently--instead it simply merges the two into a single onset event that begins about 0.1 seconds too early. I therefore cut out roughly the first 2240 samples in order to "normalize" the file (I don't think this is cheating though; it's an edge effect that would likely disappear if you had simply recorded a second or so of initial silence prior to plucking the first string, as one would normally do).
My third observation is that frequency-based filtering only works if the noise and the music are actually in somewhat different frequency bands. That may be true in this case, however I don't think you've demonstrated it yet. Therefore, instead of frequency-based filtering, I elected to try a different approach: thresholding. I used the final 3 seconds of your recording, where there is no guitar playing, in order to estimate the typical background noise level throughout the recording, in units of RMS energy, and then I used that median value to set a minimum energy threshold which was calculated to lie safely above the median. Only onset events returned by the detector occurring at times when the RMS energy is above the threshold are accepted as "valid".
An example script is shown below:
import librosa
import numpy as np
import matplotlib.pyplot as plt
# I played around with this but ultimately kept the default value
hoplen=512
y, sr = librosa.core.load("./Vocaroo_s07Dx8dWGAR0.mp3")
# Note that the first ~2240 samples (0.1 seconds) are anomalously low noise,
# so cut out this section from processing
start = 2240
y = y[start:]
idx = np.arange(len(y))
# Calculate the onset frames in the usual way
onset_frames = librosa.onset.onset_detect(y=y, sr=sr, hop_length=hoplen)
onstm = librosa.frames_to_time(onset_frames, sr=sr, hop_length=hoplen)
# Calculate RMS energy per frame. I shortened the frame length from the
# default value in order to avoid ending up with too much smoothing
rmse = librosa.feature.rmse(y=y, frame_length=512, hop_length=hoplen)[0,]
envtm = librosa.frames_to_time(np.arange(len(rmse)), sr=sr, hop_length=hoplen)
# Use final 3 seconds of recording in order to estimate median noise level
# and typical variation
noiseidx = [envtm > envtm[-1] - 3.0]
noisemedian = np.percentile(rmse[noiseidx], 50)
sigma = np.percentile(rmse[noiseidx], 84.1) - noisemedian
# Set the minimum RMS energy threshold that is needed in order to declare
# an "onset" event to be equal to 5 sigma above the median
threshold = noisemedian + 5*sigma
threshidx = [rmse > threshold]
# Choose the corrected onset times as only those which meet the RMS energy
# minimum threshold requirement
correctedonstm = onstm[[tm in envtm[threshidx] for tm in onstm]]
# Print both in units of actual time (seconds) and sample ID number
print(correctedonstm+start/sr)
print(correctedonstm*sr+start)
fg = plt.figure(figsize=[12, 8])
# Print the waveform together with onset times superimposed in red
ax1 = fg.add_subplot(2,1,1)
ax1.plot(idx+start, y)
for ii in correctedonstm*sr+start:
    ax1.axvline(ii, color='r')
ax1.set_ylabel('Amplitude', fontsize=16)
# Print the RMSE together with onset times superimposed in red
ax2 = fg.add_subplot(2,1,2, sharex=ax1)
ax2.plot(envtm*sr+start, rmse)
for ii in correctedonstm*sr+start:
    ax2.axvline(ii, color='r')
# Plot threshold value superimposed as a black dotted line
ax2.axhline(threshold, linestyle=':', color='k')
ax2.set_ylabel("RMSE", fontsize=16)
ax2.set_xlabel("Sample Number", fontsize=16)
fg.show()
Printed output looks like:
In [1]: %run rosatest
[ 0.17124717 1.88952381 3.74712018 5.62793651]
[ 3776. 41664. 82624. 124096.]
and the plot that it produces is shown below:
Did you try normalizing the sound sample before processing?
Reading the onset_detect documentation, we can see that there are a lot of optional arguments; have you already tried using some of them?
Maybe one of these optional arguments can help you keep only the good onsets (or at least limit the size of the returned onset time array); see the sketch just after this list:
librosa.util.peak_pick (maybe the best)
backtrack
energy
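For example, a minimal sketch of passing peak-picking parameters through onset_detect (the values here are only illustrative starting points, not tuned ones):

# stricter peak picking: larger delta and wait suppress small noise peaks;
# backtrack=True moves each onset back to the preceding energy minimum
onset_frames = librosa.onset.onset_detect(
    y=y, sr=sr, backtrack=True,
    pre_max=20, post_max=20, pre_avg=100, post_avg=100,
    delta=0.2, wait=10)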
Please also see an update of your code below that uses a pre-computed onset envelope:
def detect_pitch(y, sr, onset_offset=5, fmin=75, fmax=1400):
    y = highpass_filter(y, sr)

    o_env = librosa.onset.onset_strength(y, sr=sr)
    times = librosa.frames_to_time(np.arange(len(o_env)), sr=sr)
    # pass the pre-computed envelope via onset_envelope rather than y
    onset_frames = librosa.onset.onset_detect(onset_envelope=o_env, sr=sr)

    pitches, magnitudes = librosa.piptrack(y=y, sr=sr, fmin=fmin, fmax=fmax)
    notes = []

    for i in range(0, len(onset_frames)):
        onset = onset_frames[i] + onset_offset
        index = magnitudes[:, onset].argmax()
        pitch = pitches[index, onset]
        if (pitch != 0):
            notes.append(librosa.hz_to_note(pitch))

    return notes

def highpass_filter(y, sr):
    filter_stop_freq = 70  # Hz
    filter_pass_freq = 100  # Hz
    filter_order = 1001

    # High-pass filter
    nyquist_rate = sr / 2.
    desired = (0, 0, 1, 1)
    bands = (0, filter_stop_freq, filter_pass_freq, nyquist_rate)
    filter_coefs = signal.firls(filter_order, bands, desired, nyq=nyquist_rate)

    # Apply high-pass filter
    filtered_audio = signal.filtfilt(filter_coefs, [1], y)
    return filtered_audio
Does it work better?
So, after my last two questions, I come to my actual problem. Maybe somebody can find the error in my theoretical procedure, or I did something wrong in programming.
I am implementing a bandpass filter in Python using scipy.signal (using the firwin function). My original signal consists of two frequencies (w_1=600Hz, w_2=800Hz). There might be many more frequencies, which is why I need a bandpass filter.
In this case I want to keep the frequency band around 600 Hz, so I took 600 +/- 20Hz as cutoff frequencies. When I implement the filter and reproduce the signal in the time domain using lfilter, the right frequency is extracted.
To get rid of the phase shift, I plotted the frequency response using scipy.signal.freqz, with the taps returned by firwin as the numerator and 1 as the denominator.
As described in the freqz documentation, I also plotted the phase (called "angle" in the docs) and read the phase shift at 600 Hz for the filtered signal off the frequency response plot.
So the phase delay t_p is:
t_p = -theta(w) / w
Unfortunately, when I add this phase delay to the time data of my filtered signal, it does not have the same phase as the original 600 Hz signal.
I added the code below. It is strange: before I stripped the code down to a minimal example, the filtered signal started at the correct amplitude - now it is even worse.
################################################################################
#
# Filtering test
#
################################################################################
#
from math import *
import numpy as np
from scipy import signal
from scipy.signal import firwin, lfilter, lti
from scipy.signal import freqz
import matplotlib.pyplot as plt
import matplotlib.colors as colors
################################################################################
# Nb of frequencies in the original signal
nfrq = 2
F = [60,80]
################################################################################
# Sampling:
nitper = 16
nper = 50.
fmin = np.min(F)
fmax = np.max(F)
T0 = 1./fmin
dt = 1./fmax/nitper
#sampling frequency
fs = 1./dt
nyq_rate= fs/2
nitpermin = nitper*fmax/fmin
Nit = int(nper*nitpermin+1)
tps = np.linspace(0.,nper*T0,Nit)
dtf = fs/Nit
################################################################################
# Build analytic signal
# s = completeSignal(F,Nit,tps)
scomplete = np.zeros((Nit))
omg1 = 2.*pi*F[0]
omg2 = 2.*pi*F[1]
scomplete=scomplete+np.sin(omg1*tps)+np.sin(omg2*tps)
#ssingle = singleSignals(nfrq,F,Nit,tps)
ssingle=np.zeros((nfrq,Nit))
ssingle[0,:]=ssingle[0,:]+np.sin(omg1*tps)
ssingle[1,:]=ssingle[1,:]+np.sin(omg2*tps)
################################################################################
## Construction of the desired bandpass filter
lowcut = (60-2) # desired cutoff frequencies
highcut = (60+2)
ntaps = 451 # the higher and closer the signal frequencies, the more taps for the filter are required
taps_hamming = firwin(ntaps,[lowcut/nyq_rate, highcut/nyq_rate], pass_zero=False)
# Use lfilter to get the filtered signal
filtered_signal = lfilter(taps_hamming, 1, scomplete)
# The phase delay of the filtered signal
delay = ((ntaps-1)/2)/fs
plt.figure(1, figsize=(12, 9))
# Plot the signals
plt.plot(tps, scomplete,label="Original signal with %s freq" % nfrq)
plt.plot(tps-delay, filtered_signal,label="Filtered signal %s freq " % F[0])
plt.plot(tps, ssingle[0,:],label="original signal %s Hz" % F[0])
plt.grid(True)
plt.legend()
plt.xlim(0,1)
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
# Plot the frequency responses of the filter.
plt.figure(2, figsize=(12, 9))
plt.clf()
# First plot the desired ideal response as a green(ish) rectangle.
rect = plt.Rectangle((lowcut, 0), highcut - lowcut, 5.0,facecolor="#60ff60", alpha=0.2,label="ideal filter")
plt.gca().add_patch(rect)
# actual filter
w, h = freqz(taps_hamming, 1, worN=1000)
plt.plot((fs * 0.5 / np.pi) * w, abs(h), label="designed rectangular window filter")
plt.xlim(0,2*F[1])
plt.ylim(0, 1)
plt.grid(True)
plt.legend()
plt.xlabel('Frequency (Hz)')
plt.ylabel('Gain')
plt.title('Frequency response of FIR filter, %d taps' % ntaps)
plt.show()
The delay of your FIR filter is simply 0.5*(n - 1)/fs, where n is the number of filter coefficients (i.e. "taps") and fs is the sample rate. Your implementation of this delay is fine.
The problem is that your array of time values tps is not correct. Take a look
at 1.0/(tps[1] - tps[0]); you'll see that it does not equal fs.
Change this:
tps = np.linspace(0.,nper*T0,Nit)
to, for example, this:
T = Nit / fs
tps = np.linspace(0., T, Nit, endpoint=False)
and your plots of the original and filtered 60 Hz signals will line up beautifully.
For another example, see http://wiki.scipy.org/Cookbook/FIRFilter.
In the script there, the delay is calculated on line 86. Below this, the delay is used to plot the original signal aligned with the filtered signal.
Note: The cookbook example uses scipy.signal.lfilter to apply the filter. A more efficient approach is to use numpy.convolve.
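For reference, a minimal sketch of the convolution route, reusing taps_hamming, scomplete, ntaps and fs from the question's code (mode='full' truncated to the input length gives the same output, and therefore the same delay correction, as lfilter):

# equivalent to lfilter(taps_hamming, 1, scomplete), just computed via convolution
filtered_signal = np.convolve(scomplete, taps_hamming, mode='full')[:len(scomplete)]
# same group delay as lfilter, so shift the time axis by delay = 0.5*(ntaps-1)/fs when plotting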
Seems like you may have had this answered already, but I believe that this is what the filtfilt function is used for. Basically, it does both a forward sweep and a backward sweep through your data, thus reversing the phase shift introduced by the initial filtering. Might be worth looking into.
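A minimal sketch, reusing taps_hamming, scomplete, tps, F and ntaps from the question's code (and the corrected tps from the other answer, so the time axis is right). Note that padlen is set explicitly here, which is my addition: the default padding of 3*ntaps would exceed this short signal, so check it against your own signal length.

# forward-backward filtering cancels the phase shift, so no delay correction is needed
zero_phase = signal.filtfilt(taps_hamming, [1.0], scomplete, padlen=ntaps)
plt.plot(tps, zero_phase, label="zero-phase filtered %s Hz" % F[0])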