Spectrogram by FFT using Python

I am trying to understand a piece of code which, in my opinion, applies a filter first and then computes the FFT. I don't understand how it does that. Can anyone please explain it to me?
Here is the code:
import numpy as np
import wave
from numpy.fft import fft  # the original may use scipy's fft; numpy's is equivalent here

# Parameters to create the spectrogram
N = 160000  # number of frames in the .wav file
K = 512     # FFT/window length
step = 4    # K/step is the hop between successive windows
wind = 0.5 * (1 - np.cos(np.arange(K) * 2 * np.pi / (K - 1)))  # Hann window
ffts = []

def wav_to_floats(file):
    s = wave.open(file, 'r')
    str_sig = s.readframes(s.getnframes())
    y = np.frombuffer(str_sig, np.short)  # np.fromstring is deprecated
    s.close()
    return y
for file_index in range(len(label)):
    test_flag = label.iloc[file_index]['fold']   # 0 - training data, 1 - test data
    fname = label.iloc[file_index]['filename']
    # ------- from here I don't understand, mainly -------
    spectogram = []
    s = wav_to_floats(essential_folder + 'src_wavs/' + fname + '.wav')
    for j in range(int((step * N / K) - step)):
        vec = s[j * K // step : (j + step) * K // step] * wind  # hop of K/step samples, window of K samples
        spectogram.append(abs(fft(vec, K)[:K // 2]))            # magnitude of the positive-frequency bins
    ffts.append(np.array(spectogram))

First of all, it converts the file from WAV to numeric samples (s = wav_to_floats(essential_folder+'src_wavs/'+fname+'.wav')), because to calculate an FFT you need numbers rather than raw bytes. After that, it multiplies each segment of the signal element-wise by the window (this is windowing, not a convolution):
for j in range(int((step * N / K) - step)):
    vec = s[j * K // step : (j + step) * K // step] * wind
Then it takes the modulus of the FFT (the FFT gives you complex numbers, which carry both modulus and phase information) and appends this vector to ffts.
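Putting the pieces together, here is a minimal runnable sketch of the same idea (Hann window, hop of K/step samples, keeping the magnitude of the first K/2 bins); the random signal is a stand-in for the decoded .wav samples:
import numpy as np
from numpy.fft import fft

K, step = 512, 4
hop = K // step  # the loop advances K/step = 128 samples per window
wind = 0.5 * (1 - np.cos(2 * np.pi * np.arange(K) / (K - 1)))  # Hann window

s = np.random.randn(160000)  # stands in for the decoded .wav samples
spectrogram = []
for j in range(len(s) // hop - step):
    vec = s[j * hop : j * hop + K] * wind          # window one K-sample segment
    spectrogram.append(np.abs(fft(vec))[:K // 2])  # magnitude of the positive bins
spectrogram = np.array(spectrogram)                # shape: (num_frames, K/2)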

Related

Vector Normalization in Python

I'm trying to port this MatLab function to Python:
fs = 128;
x = (0:1:999)/fs;
y_orig = sin(2*pi*15*x);
y_noised = y_orig + 0.5*randn(1,length(x));
[yseg] = mapstd(y_noised);
I wrote this code (which works, so there are no problems with missing variables or anything else):
from math import sqrt

Norm_Y = 0
Y_Normalized = []
for i in range(0, len(YSeg), 1):
    Norm_Y = Norm_Y + pow(YSeg[i], 2)
Norm_Y = sqrt(Norm_Y)
for i in range(0, len(YSeg), 1):
    Y_Normalized.append(YSeg[i] / Norm_Y)
    print("%3d %f" % (i, Y_Normalized[i]))
YSeg is Y_Noised (I assigned it in another section of the code).
Now I don't expect the values to be the same between the MatLab code and mine, because YSeg / Y_Noised are generated from random values, so it's fine that they differ, but they are far TOO different.
These are the first 10 values in Matlab:
0.145728655284548
1.41918657039301
1.72322238170491
0.684826842884694
0.125379108969931
-0.188899711186140
-1.03820858801652
-0.402591786430960
-0.844782236884026
0.626897216311757
While these are the first 10 numbers in my python code:
0.052015
0.051132
0.041209
0.034144
0.034450
0.003812
0.048629
0.016854
0.024484
0.021435
It's like mine are 100 times lower. So I feel like I've missed a step during normalization. Can you help?
You can normalize a vector quite easily in Python with numpy:
import numpy as np

def normalize_vector(input_vector):
    return input_vector / np.sqrt(np.sum(input_vector**2))

random_vec = np.random.rand(10)
vec_norm = normalize_vector(random_vec)
print(vec_norm)
You can call the provided function with your input vector (YSeg) and check the output. I would expect output similar to MatLab's.
This is an implementation in numpy (using the same fs and signal length as the MatLab snippet):
import numpy as np

fs = 128
x = np.arange(1000) / fs
y_orig = np.sin(2 * np.pi * 15 * x)
y_noised = y_orig + 0.5 * np.random.randn(len(x))
yseg = (y_noised - y_noised.mean()) / y_noised.std()  # mapstd: zero mean, unit std
However, why do you consider the values to be "too different"? After all, the values of y_orig are in the range [-1, 1] and you are randomly distorting them by about 0.4 on average.
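The discrepancy comes from comparing two different normalizations: MatLab's mapstd standardizes to zero mean and unit standard deviation, while the loop in the question divides by the Euclidean (L2) norm of the whole vector, which grows like sqrt(N). A small sketch contrasting the two (illustrative, not from either post):
import numpy as np

y = np.random.randn(1000)

z_scored = (y - y.mean()) / y.std()   # what MatLab's mapstd does
unit_l2 = y / np.sqrt(np.sum(y**2))   # what the question's loop does

# The L2 norm of ~1000 unit-variance samples is about sqrt(1000), roughly 32,
# which is why the question's values come out dozens of times smaller.
print(z_scored.std())                 # ~1.0
print(np.linalg.norm(unit_l2))        # exactly 1.0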

Gaussian fit for Python with two different functions

I used a Gaussian fit with 3 Gaussians to fit my data, but sometimes my curve contains only two Gaussians; in that case the fit cannot find the remaining parameters and raises an error. Is there a method that allows the curve_fit function to work whether there are two or three Gaussians?
For my main function, I have this code:
FitGWPS = mainCurveFitGWPS(global_ws, period, All_Max_GWPS, DoupleDip)
and my code for the fit is:
import numpy as np
from scipy.optimize import curve_fit

# Functions -----------------------------------------
# Single Gaussian
def _1gaus(X, C, X_mean, sigma):
    return C * np.exp(-(X - X_mean)**2 / (2 * sigma**2))

# Sum of three Gaussians
def _3gaus(x, amp1, cen1, sigma1, amp2, cen2, sigma2, amp3, cen3, sigma3):
    return amp1 * np.exp(-(x - cen1)**2 / (2 * sigma1**2)) + \
           amp2 * np.exp(-(x - cen2)**2 / (2 * sigma2**2)) + \
           amp3 * np.exp(-(x - cen3)**2 / (2 * sigma3**2))

def ParamFit(Gws, P, Max, popt_Firstgauss):
    # Build initial amplitude/width/centre guesses from the detected maxima
    width = 0
    Amp, cen, wid = [], [], []
    for j in range(len(Max)):  # note: the original had len(Max-1), which equals len(Max) for an array
        Amp.append(0.8 * Gws[Max[j]])  # amplitude
        cen.append(P[Max[j]])          # frequency
        if j == 0:
            wid.append(0.3 + width * 2.)  # width
        else:
            wid.append(0.3 + popt_Firstgauss[2] * 2.)
    return Amp, wid, cen

def mainCurveFitGWPS(global_ws_in, period_in, All_Max_GWPS, DoupleDip):
    # First fit: a single Gaussian seeded from the moments of the data
    mean = sum(period_in * global_ws_in) / sum(global_ws_in)
    sigma = np.sqrt(sum(global_ws_in * (period_in - mean)**2) / sum(global_ws_in))
    Cst = 1 / (2 * np.pi * sigma)
    width = 0
    # Do the single-Gaussian fit first, so popt_gauss exists before it is used below
    popt_gauss, pcov_gauss = curve_fit(_1gaus, period_in, global_ws_in,
                                       p0=[Cst, mean, sigma])
    FitGauss = _1gaus(period_in, *popt_gauss)
    # Initial guesses for the three-Gaussian fit, one triple per detected maximum
    Amp, cen, wid = [], [], []
    for j in range(len(All_Max_GWPS)):
        Amp.append(0.8 * global_ws_in[All_Max_GWPS[j]])  # amplitude
        cen.append(period_in[All_Max_GWPS[j]])           # frequency
        if j == 0:
            wid.append(0.3 + width * 2.)
        else:
            wid.append(0.3 + popt_gauss[2] * 2.)
    # do the fit!
    popt_3gauss, pcov_3gauss = curve_fit(
        _3gaus, period_in, global_ws_in,
        p0=[Amp[0], cen[0], wid[0], Amp[1], cen[1], wid[1], Amp[2], cen[2], wid[2]],
        maxfev=5000)
    Fit3Gauss = _3gaus(period_in, *popt_3gauss)
    return Fit3Gauss
(example figures from the original post are omitted here)
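One way to let the same fit handle either two or three Gaussians, sketched here as a suggestion (the helper multi_gauss is hypothetical, not part of the original code): define the model as a sum over however many (amp, cen, wid) triples are passed, and build p0 with one triple per detected maximum:
import numpy as np
from scipy.optimize import curve_fit

def multi_gauss(x, *params):
    # params is a flat sequence of (amp, cen, wid) triples, so the same model
    # fits 2 Gaussians (6 values) or 3 Gaussians (9 values)
    y = np.zeros_like(x, dtype=float)
    for i in range(0, len(params), 3):
        amp, cen, wid = params[i:i + 3]
        y += amp * np.exp(-(x - cen)**2 / (2 * wid**2))
    return y

# p0 = [a1, c1, w1, a2, c2, w2, ...] built from the detected maxima, e.g.:
# popt, pcov = curve_fit(multi_gauss, period_in, global_ws_in, p0=p0, maxfev=5000)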

How to display a 2d interpolation function in python as a matrix?

I have looked around a lot, but it's hard to find an answer. Basically, when one interpolates v -> w one would normally use one of the many interpolation functions. But I want to get the corresponding matrix A such that Av = w.
In my case w is a 200x200 matrix, and v is a random subset of w with half as many points. I don't really care for fancy math; it could be as simple as weighting the known points by inverse squared distance. I already tried implementing it all with some for loops, but it only really works for small sizes. Maybe it helps explain my question.
from random import sample
import numpy as np

def testScatter(xbig, ybig):
    NumberOfPoints = int(xbig * ybig / 2)  # half as many points as in the full sample
    # choose random coordinates
    Index = sample(range(xbig * ybig), NumberOfPoints)
    IndexYScatter = np.remainder(Index, xbig)
    IndexXScatter = np.array((Index - IndexYScatter) / xbig, dtype=int)
    InterpolationMatrix = np.zeros((xbig * ybig, NumberOfPoints), dtype=np.float32)
    WeightingSum = np.zeros(xbig * ybig)
    coordsSamplePoints = []
    for i in range(NumberOfPoints):  # first set all the given points (no need to interpolate)
        coordsSamplePoints.append(IndexYScatter[i] + xbig * IndexXScatter[i])
        InterpolationMatrix[coordsSamplePoints[i], i] = 1
        WeightingSum[coordsSamplePoints[i]] = 1
    for x in range(xbig * ybig):  # now comes the interpolation
        if x not in coordsSamplePoints:
            YIndexInterpol = x % xbig                      # x coord in the interpolated matrix
            XIndexInterpol = (x - YIndexInterpol) // xbig  # y coord in the interpolated matrix
            for y in range(NumberOfPoints):
                XIndexScatter = IndexXScatter[y]
                YIndexScatter = IndexYScatter[y]
                distanceSquared = (np.float32(YIndexInterpol) - np.float32(YIndexScatter))**2 \
                                + (np.float32(XIndexInterpol) - np.float32(XIndexScatter))**2
                InterpolationMatrix[x, y] = 1 / distanceSquared
                WeightingSum[x] += InterpolationMatrix[x, y]
    return InterpolationMatrix / WeightingSum[:, None], IndexXScatter, IndexYScatter
You need to spend some time with the Numpy documentation, starting at the top of this page and working your way down. Studying answers here on SO to questions asking how to vectorize an operation with Numpy arrays would also help you. If you find that you are iterating over indices and performing calculations with Numpy arrays, there is probably a better way.
First cut...
The first for loop can be replaced with:
coordsSamplePoints = IndexYScatter + (xbig * IndexXScatter)
InterpolationMatrix[coordsSamplePoints,np.arange(coordsSamplePoints.shape[0])] = 1
WeightingSum[coordsSamplePoints] = 1
This mainly makes use of elementwise arithmetic and index arrays; the complete Indexing Tutorial is worth reading.
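A minimal illustration of the integer-array indexing used here (values are arbitrary): passing two index arrays selects one element per (row, column) pair, so a whole scatter of assignments happens in one statement:
import numpy as np

M = np.zeros((4, 3))
rows = np.array([2, 0, 3])
cols = np.arange(rows.shape[0])  # [0, 1, 2]
M[rows, cols] = 1                # sets M[2, 0], M[0, 1], M[3, 2] in one statement
print(M)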
You can test the replacement by enhancing the function to execute the for loop alongside the Numpy way and comparing the results.
...
IM = InterpolationMatrix.copy()
WS = WeightingSum.copy()
for i in range(NumberOfPoints):  # first set all the given points (no need to interpolate)
    coordsSamplePoints.append(IndexYScatter[i] + xbig * IndexXScatter[i])
    InterpolationMatrix[coordsSamplePoints[i], i] = 1
    WeightingSum[coordsSamplePoints[i]] = 1
cSS = IndexYScatter + (xbig * IndexXScatter)
IM[cSS, np.arange(cSS.shape[0])] = 1
WS[cSS] = 1
# TEST validity
print((cSS == coordsSamplePoints).all(),
      (IM == InterpolationMatrix).all(),
      (WS == WeightingSum).all())
...
The outer loop:
...
for x in range(xbig * ybig):  # now comes the interpolation
    if x not in coordsSamplePoints:
        YIndexInterpol = x % xbig                      # x coord in interpolated matrix
        XIndexInterpol = (x - YIndexInterpol) // xbig  # y coord in interp. matrix
...
Can be replaced with:
...
space = np.arange(xbig * ybig)
mask = ~(space == cSS[:,None]).any(0)
iP = space[mask] # points to interpolate
yIndices = iP % xbig
xIndices = (iP - yIndices) // xbig
...
Complete solution:
import random
import numpy as np

def testScatter(xbig, ybig):
    NumberOfPoints = int(xbig * ybig / 2)  # half as many points as in the full sample
    # choose random coordinates
    Index = random.sample(range(xbig * ybig), NumberOfPoints)
    IndexYScatter = np.remainder(Index, xbig)
    IndexXScatter = np.array((Index - IndexYScatter) / xbig, dtype=int)
    InterpolationMatrix = np.zeros((xbig * ybig, NumberOfPoints), dtype=np.float32)
    WeightingSum = np.zeros(xbig * ybig)
    coordsSamplePoints = IndexYScatter + (xbig * IndexXScatter)
    InterpolationMatrix[coordsSamplePoints, np.arange(coordsSamplePoints.shape[0])] = 1
    WeightingSum[coordsSamplePoints] = 1
    IM = InterpolationMatrix
    cSS = coordsSamplePoints
    WS = WeightingSum
    space = np.arange(xbig * ybig)
    mask = ~(space == cSS[:, None]).any(0)
    iP = space[mask]  # points to interpolate
    yIndices = iP % xbig
    xIndices = (iP - yIndices) // xbig
    dSquared = ((yIndices[:, None] - IndexYScatter) ** 2) + ((xIndices[:, None] - IndexXScatter) ** 2)
    IM[iP, :] = 1 / dSquared
    WS[iP] = IM[iP, :].sum(1)
    return IM / WS[:, None], IndexXScatter, IndexYScatter
I'm getting about a 200x improvement with this over your original for (100, 100) as the arguments. There are probably some other minor improvements, but they won't affect execution time significantly.
Broadcasting is another Numpy skill that is a must.
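For instance, the dSquared line above relies on broadcasting: a column of shape (n, 1) combined with a row of shape (m,) expands to an (n, m) array of all pairwise results. A tiny sketch with made-up values:
import numpy as np

a = np.array([0, 3, 7])  # shape (3,)
b = np.array([1, 5])     # shape (2,)
# a[:, None] has shape (3, 1); subtracting b broadcasts to shape (3, 2),
# producing every pairwise squared difference without a loop
pairwise_sq = (a[:, None] - b) ** 2
print(pairwise_sq)       # [[ 1 25] [ 4  4] [36  4]]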

First DFT line of the stft by scipy.signal

I have manually implemented the STFT.
Comparison to scipy.signal.stft revealed the same results as my implementation, except for an additional DFT section at the beginning (t=0).
Can anyone describe that first DFT segment, which I have apparently missed?
(figure omitted: my STFT of the signal)
(figure omitted: scipy.signal.stft of the signal)
the code:
import logging
import numpy as np
from numpy.fft import fft

logger = logging.getLogger(__name__)

def my_stft(samples, fs, wind_len_time=0.5, overlap_factor=0.5,
            zero_padding_factor=4):
    wind_len = int(fs * wind_len_time)
    overlap = wind_len * overlap_factor
    section_promotion = wind_len - overlap
    transform_len = wind_len * zero_padding_factor
    stft = []
    for index in np.arange(0, samples.size, section_promotion).astype(int):
        section = samples[index:index + wind_len]
        section_fft = np.abs(fft(section, n=transform_len))
        if not np.mod(section_fft.size, 2).astype(bool):
            section_fft = section_fft[:section_fft.size // 2]
        else:
            logger.debug('odd length fft')
        stft.append(section_fft)
    time = np.arange(0, samples.size, section_promotion) / float(fs)
    freq = np.arange(section_fft.size) / float(section_fft.size) * fs / 2.0
    Time, Freq = np.meshgrid(time, freq)  # meshgrid returns the time grid first
    stft = np.array(stft).transpose()
    scaling = 2 / fs  # onesided
    stft = stft * scaling
    return Time, Freq, stft
Setting the parameter boundary=None in scipy.signal.stft removes the first spectrum at t=0: by default (boundary='zeros') the signal is extended at the edges, which produces an extra segment centred at t=0.
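For example (a minimal sketch with a simple sine test signal; the parameter values are illustrative):
import numpy as np
from scipy import signal

fs = 1000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 100 * t)

# boundary='zeros' (the default) pads the signal edges, which yields the extra
# segment centred at t=0; boundary=None with padded=False starts directly on
# the first full window, matching the manual implementation above
f, tt, Zxx = signal.stft(x, fs=fs, nperseg=256, boundary=None, padded=False)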

computing spectrograms of wav files & recorded sound (normalizing for volume)

I want to compare recorded audio with audio read from disk in a consistent way, but I'm running into problems with normalizing for volume (otherwise the amplitudes of the spectrograms are different).
I have also never worked with signals, FFTs, or the WAV format before, so this is new, uncharted territory for me. I retrieve channels as lists of signed 16-bit ints sampled at 44100 Hz from both
on disk .wav files
recorded music playing from my laptop
and then I proceed through each with a window (of size 2^k) with a certain amount of overlap. For each window I do the following:
# calculate window variables
window_step_size = int(self.window_size * (1.0 - self.window_overlap_ratio)) + 1
last_frame = nframes - window_step_size  # nframes is the total number of frames from the audio source
num_windows, i = 0, 0  # calculate number of windows
while i <= last_frame:
    num_windows += 1
    i += window_step_size

# allocate memory and initialize counter
wi = 0  # index
nfft = 2 ** self.nextpowof2(self.window_size)  # size of FFT as 2^k
fft2D = np.zeros((nfft // 2 + 1, num_windows), dtype='c16')  # 2d array for storing results

# for each window
count = 0
times = np.zeros((1, num_windows))  # num_windows was calculated above
while wi <= last_frame:
    # channel_samples is simply a list of signed ints
    window_samples = channel_samples[wi : (wi + self.window_size)]
    window_samples = np.hamming(len(window_samples)) * window_samples

    # calculate and reformat [[[[ THIS IS WHERE I'M UNSURE ]]]]
    fft = 2 * np.fft.rfft(window_samples, n=nfft) / nfft
    fft[0] = 0  # apparently these are completely real and should not be used
    fft[nfft // 2] = 0
    fft = np.sqrt(np.square(fft) / np.mean(fft))  # use RMS of data
    fft2D[:, count] = 10 * np.log10(np.absolute(fft))

    # sec / frame * frames = secs; get midpoint
    times[0, count] = self.dt * wi

    wi += window_step_size
    count += 1

# remove NaNs, infs
whereAreNaNs = np.isnan(fft2D)
fft2D[whereAreNaNs] = 0
whereAreInfs = np.isinf(fft2D)
fft2D[whereAreInfs] = 0

# find the spectrogram peaks
fft2D = fft2D.astype(np.float32)

# the get_2D_peaks() method discretizes the fft2D periodogram array and then
# finds peaks and filters out those peaks below the threshold supplied
#
# the `amp_xxxx` variables are used for discretizing amplitude and the
# times array above is used to discretize the time into buckets
local_maxima = self.get_2D_peaks(fft2D, self.amp_threshold, self.amp_max, self.amp_min, self.amp_step_size, times, self.dt)
In particular, the crazy stuff (to me at least) happens on the line with my comment [[[[ THIS IS WHERE I'M UNSURE ]]]].
Can anyone point me in the right direction or help me to generate this audio spectrogram while normalizing for volume correctly?
A quick look tells me that you forgot to use a window; one is necessary to calculate your spectrogram. You need to apply a window (Hamming, Hann) to your window_samples:
np.hamming(len(window_samples)) * window_samples
Then you can calculate the rfft.
Edit:
# calculate magnitude from the FFT
fftData = np.fft.fft(windowed)
# get the magnitude (linear scale) of the first half of the values
Mag = np.abs(fftData[:Chunk // 2])
# if you want a log scale: R = 20 * np.log10(Mag)
plt.plot(Mag)

# calculate RMS from the FFT
RMS = np.sqrt(np.sum(np.abs(np.fft.fft(data))**2 / len(data)) / (len(data) / 2))
RMStoDb = 20 * np.log10(RMS)
PS: If you want to calculate the RMS from the FFT you can't use a window (Hann, Hamming), so this line makes no sense:
fft = np.sqrt(np.square(fft) / np.mean(fft))  # use RMS of data
One simple normalization of the data can be done for each window:
window_samples = channel_samples[wi : (wi + self.window_size)]
# framMax = np.max(window_samples)
framMean = np.mean(window_samples)
Normalized = window_samples / framMean
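A minimal sketch of per-window volume normalization along these lines, assuming the samples are already a numeric numpy array (names and constants are illustrative):
import numpy as np

def db_spectrogram(samples, window_size, step):
    # dB-scaled magnitude spectrogram with each frame normalized to unit RMS
    win = np.hamming(window_size)
    frames = []
    for start in range(0, len(samples) - window_size + 1, step):
        frame = samples[start:start + window_size].astype(float)
        rms = np.sqrt(np.mean(frame ** 2))
        if rms > 0:
            frame = frame / rms  # normalize for volume before windowing
        spectrum = np.abs(np.fft.rfft(frame * win))
        frames.append(20 * np.log10(spectrum + 1e-12))  # avoid log(0)
    return np.array(frames).T  # shape: (window_size//2 + 1, num_frames)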
