Measure how much activity is occurring within certain frequency values in Python - python

I'm trying to find a way to obtain frequency values (in Hz) of an audio file, and measure how often these frequency values occur in proportion to the rest of the frequency values in that file.
For example, in an audio file, I'd like to see what proportion of the audio activity occurs within the 300 - 500 Hz range.
This would be simple if I could somehow get a list or an array filled with all frequency values of a given audio file, but I don't know how to do that.
Thank you.

You may be looking for a Fourier Transform or Fast Fourier Transform.
From Wikipedia:
In that illustration, the left panel is your ordinary time-domain signal and the right panel is the same signal in the frequency domain. To get the proportion, integrate the spectrum over the 300-500 Hz range and divide by the integral over the whole spectrum.
Not really my specialty, but consider a scipy solution?
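A minimal sketch of the idea above, using NumPy's real FFT. The function name band_power_fraction and the synthetic test signal are assumptions made for illustration, and using squared magnitude (power) as the quantity to integrate is one reasonable choice, not the only one. For a real recording, scipy.io.wavfile.read returns the sample rate and the sample array that would take the place of the synthetic signal.

```python
import numpy as np

def band_power_fraction(samples, sample_rate, f_lo, f_hi):
    """Fraction of the total spectral power that lies in [f_lo, f_hi] Hz."""
    samples = np.asarray(samples, dtype=float)
    power = np.abs(np.fft.rfft(samples)) ** 2                   # power per frequency bin
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)  # bin centres in Hz
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    total = power.sum()
    return power[in_band].sum() / total if total > 0 else 0.0

# Synthetic one-second signal: a 400 Hz tone plus a 1000 Hz tone of equal amplitude.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 400 * t) + np.sin(2 * np.pi * 1000 * t)

# Half of the power sits at 400 Hz, so the 300-500 Hz share is close to 0.5.
fraction = band_power_fraction(signal, sample_rate, 300, 500)
print(f"Share of power in 300-500 Hz: {fraction:.3f}")
```

For stereo files, the sample array has one column per channel, so pick or average a channel before calling the function.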

Related

Filtering specific frequency audio from an audio file using python

I have to separate audio in a particular frequency range (say 3700-8000 Hz) from an existing or real-time audio stream and then store the extracted audio data in a new audio file as output.
I can think of two basic ways to do this:
1. Use a digital filter to simulate a bandpass filter with the desired frequencies. This might work in real time; however, it is conceptually more difficult to do.
2. Use a Fourier transform (FFT) to convert the time signal into a frequency spectrum, blank out the undesired frequencies, and perform a reverse FFT to get back to the time domain. Depending on the window size of your FFT, you need to work out which exact bins you want to keep.
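The FFT approach can be sketched as follows for a whole clip held in memory. The function name fft_bandpass and the synthetic two-tone input are assumptions for illustration. Zeroing bins abruptly like this is the simplest form of the technique and can cause ringing on real audio, so treat it as a starting point rather than a finished filter.

```python
import numpy as np

def fft_bandpass(samples, sample_rate, f_lo, f_hi):
    """Keep only the [f_lo, f_hi] Hz content by zeroing all other FFT bins."""
    samples = np.asarray(samples, dtype=float)
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0   # blank out unwanted frequencies
    return np.fft.irfft(spectrum, n=len(samples))     # back to the time domain

# Synthetic one-second clip: a 1000 Hz tone (to remove) plus a 5000 Hz tone (to keep).
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
mixed = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 5000 * t)

filtered = fft_bandpass(mixed, sample_rate, 3700, 8000)
```

The result can then be written out with scipy.io.wavfile.write after converting it to a suitable sample type. For the digital-filter approach, scipy.signal.butter with btype="bandpass" is one way to design the filter.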

How to find time of high frequency noise in an audio wav / raw file using Python?

I have an audio clip, and I want to detect when a certain (high pitch) noise occurs. I don't know anything about FFT, how do I return the audio frame at which the noise occurs (I was thinking frequency trigger)?
Since you already have "scipy" there: You can compute a spectrogram to get frequencies over time for your sample (which you can load with wavfile.read).
You can then plot the spectrogram and figure out what the frequency you're looking for is.
Then either loop over the spectrogram data itself to find the times where there is a suitably strong signal at that frequency, or filter your original signal first (a bandpass filter would be best) and then find the overall intense bits.
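The first option (scanning the spectrogram) might look like the sketch below. The function name band_active_times, the threshold_ratio parameter, and the synthetic clip are all assumptions for illustration; the frequency band and the threshold would need tuning against the plotted spectrogram of the real recording.

```python
import numpy as np
from scipy import signal

def band_active_times(samples, sample_rate, f_lo, f_hi, threshold_ratio=0.5):
    """Return the spectrogram time stamps (s) where the [f_lo, f_hi] Hz band is strong."""
    freqs, times, sxx = signal.spectrogram(samples, fs=sample_rate, nperseg=1024)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    band_power = sxx[band].sum(axis=0)                 # power in the band per time slice
    # Hypothetical rule: "strong" means above a fraction of the loudest slice.
    return times[band_power > threshold_ratio * band_power.max()]

# Synthetic two-second clip: quiet 200 Hz hum with a 6000 Hz burst from 1.0 s to 1.5 s.
sample_rate = 16000
t = np.arange(2 * sample_rate) / sample_rate
clip = 0.1 * np.sin(2 * np.pi * 200 * t)
burst = (t >= 1.0) & (t < 1.5)
clip[burst] += np.sin(2 * np.pi * 6000 * t[burst])

active = band_active_times(clip, sample_rate, 5500, 6500)
print(f"High-pitched noise detected from about {active.min():.2f} s to {active.max():.2f} s")
```

Multiplying a returned time by the sample rate gives the approximate sample (frame) index at which the noise occurs.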

How do I eliminate "uninteresting" parts of waveform from a non-uniform waveform using MATLAB functions?

I have signal(s) (of a person climbing stairs.) of the following nature. This is a signal worth 38K + samples over a period of 6 minutes of stair ascent. The parts where there is some low frequency noise are the times when the person would take a turnabout to get to the next flight of stairs (and hence does not count as stair ascent.)
Figure 1
This is why I need to get rid of it for my deep learning model, which only accepts the stair ascent data. Essentially, I only need the high frequency regions where the person is climbing stairs. I could eliminate it manually, but that would take a lot of time since there are 58 such signals.
My approach for a solution to this problem was modulating this signal with a square wave which is 0 for low frequency regions and 1 for high frequency regions and then to multiply the signals together. But the problem is how to create such a square wave signal which detects the high and low frequency regions on its own?
I tried enveloping the signal (using MATLAB's envelope rms function) and I got the following result:
Figure 2
As you can see, the envelope rms signal follows the function quite well. But I am stuck as to how to create a modulating square wave function from it (essentially, what I am asking for is a variable pulse-width modulating waveform).
PS: I have considered using a high-pass filter, but this won't work because there are some low frequency signals in the high frequency stair-climbing region which I cannot afford to remove. I have also thought of using some form of rising/falling edge detection (on the envelope rms function) but have found no practical way of implementing it. Please advise.
Thank you for your help in advance,
Shreya
Thanks to David for his thresholding suggestion, which I applied to my dataset. I now have these results, though I am again stuck trying to get rid of the redundant peaks between zeros (see image below). What do I do next?
Figure 3
I think I have been able to solve my problem of isolating the "interesting" part of the waveform from the entire original waveform using the following procedure (for the reader's future reference):
A non-uniform waveform such as Figure 1 can have the "envelope(rms)" MATLAB function applied to obtain the orange function such as the one in Figure 2. Subsequently, I filtered this envelope rms waveform using MATLAB's own "idfilt" function. This enabled me to get rid of the unwanted spikes (between zeroes) that were occurring between the "interesting" parts of the waveform. Then, using thresholding, I converted this waveform to be equal to 1 at the "interesting" parts and 0 at the "uninteresting" parts, giving me a pulse-width modulated square waveform that follows ONLY the "interesting" parts of the original waveform (in Figure 1). I then multiplied my square waveform with the original function and was able to filter out the "uninteresting" parts as demonstrated in Figure 4.
Figure 4
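For readers working in Python rather than MATLAB, the same envelope, smooth, threshold, and multiply pipeline can be sketched roughly as below. This is not the MATLAB code used above: the moving RMS stands in for envelope(rms), a plain moving average stands in for idfilt, and the function names, window length, threshold, and synthetic signal are all assumptions for illustration.

```python
import numpy as np

def moving_rms(x, window):
    """Root-mean-square envelope over a sliding window of `window` samples."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    return np.sqrt(np.convolve(x ** 2, kernel, mode="same"))

def gate_active_regions(x, window, threshold):
    """Return a 0/1 mask of the active regions and the gated signal."""
    envelope = moving_rms(x, window)
    # Moving-average smoothing to suppress short spikes in the envelope.
    smoothed = np.convolve(envelope, np.ones(window) / window, mode="same")
    mask = (smoothed > threshold).astype(float)   # 1 where "interesting", else 0
    return mask, mask * np.asarray(x, dtype=float)

# Synthetic signal: low-level noise with one strong oscillating stretch in the middle.
rng = np.random.default_rng(0)
x = 0.05 * rng.standard_normal(3000)
x[1000:2000] += np.sin(2 * np.pi * 0.05 * np.arange(1000))

mask, gated = gate_active_regions(x, window=101, threshold=0.3)
```

Instead of multiplying by the mask, indexing with x[mask == 1] would drop the quiet samples entirely, which may suit a model that should never see them.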
Thank You all for your help! This thread is now resolved!

How to obtain small bins after FFT in python?

I'm using scipy.signal.fft.rfft() to calculate the power spectral density of a signal. The sampling rate is 1000 Hz and the signal contains 2000 points, so the frequency bin width is (1000/2)/(2000/2) = 0.5 Hz. But I need to analyze the signal in the range 0-0.1 Hz.
I saw several answers recommending chirp-Z transform, but I didn't find any toolbox of it written in Python.
So how can I do this small-bin analysis in Python? Or can I just filter this signal to 0-0.1 Hz using something like a Butterworth filter?
Thanks a lot!
Even if you use another transform, that will not make more data.
With a sampling rate of 1 kHz and 2 s of samples, your frequency resolution is 0.5 Hz. You can interpolate between bins with a chirp-Z transform (or with a sinc, which is the shape of the spectrum between the samples of your frequency comb), but the interpolated values between 0 Hz and 0.5 Hz are entirely determined by the bins you already have.
If you want a real resolution of 0.1 Hz, you need 10 s of data.
You can't get smaller frequency bins to separate out close spectral peaks unless you use more (a longer amount of) data.
You can't just use a narrower filter because the transient response of such a filter will be longer than your data.
You can get smaller frequency bins that are just a smooth interpolation between nearby frequency bins, for instance to plot the spectrum on wider paper or at a higher dpi graphic resolution, by zero-padding the data and using a longer FFT. But that won't create more detail.

How to get the correct peaks and troughs from a 1D array

We are trying to find peaks and troughs in a 1D array.
We are using numpy.r_, and it finds every peak and trough in the array, but we want only the peaks and troughs that correspond to the relaxation and contraction points of diaphragmatic motion.
Is there any function that rejects the wrong min and max points?
See a bad example below:
You have high-frequency, small-amplitude oscillations that are undesirable for peak finding purposes. Filter them out prior to searching for peaks. A simple filter to use is 1-dimensional Gaussian from scipy.ndimage. On the scale of your chart, it seems that
smooth_signal = ndimage.gaussian_filter1d(signal, 5)
should be about right (the second argument is the standard deviation of the Gaussian kernel, in samples; it should be large enough to suppress the unwanted oscillation but small enough not to distort the actual peaks). Then apply your peak-finding algorithm to smooth_signal.
The scipy.signal module has more sophisticated filters, but those take some time to learn to use.
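A minimal end-to-end sketch of smoothing before peak finding. The synthetic trace (a slow wave with a small fast ripple on top) is an assumption standing in for the diaphragm data, and scipy.signal.find_peaks is used here as one possible peak finder in place of the numpy.r_ approach from the question.

```python
import numpy as np
from scipy import ndimage, signal

# Synthetic trace: slow 0.5 Hz motion plus a small 15 Hz ripple, 10 s long.
t = np.linspace(0, 10, 1000)
raw = np.sin(2 * np.pi * 0.5 * t) + 0.1 * np.sin(2 * np.pi * 15 * t)

# Without smoothing, the ripple produces many spurious local maxima.
raw_peaks, _ = signal.find_peaks(raw)

# Gaussian smoothing with a standard deviation of 5 samples removes the ripple.
smooth = ndimage.gaussian_filter1d(raw, 5)
peaks, _ = signal.find_peaks(smooth)       # local maxima of the smoothed trace
troughs, _ = signal.find_peaks(-smooth)    # local minima, found as maxima of the negative

print(f"{len(raw_peaks)} raw peaks, {len(peaks)} peaks and {len(troughs)} troughs after smoothing")
```

If smoothing alone leaves a few unwanted extrema, find_peaks also accepts prominence and distance arguments that can reject small or closely spaced peaks.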
