converting stft to chroma and plotting the result - python

I am trying to convert the STFT of a WAV file into a chromagram.
Here's my code:
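import numpy as np
def stft(x, fs, framesize, hopsize):
    # framesize and hopsize are given in seconds; convert them to samples
    frame = int(framesize*fs)
    hop = int(hopsize*fs)
    w = np.hamming(frame)
    # One windowed FFT per frame, stacked into a 2D complex array
    X = np.array([np.fft.fft(w*x[i:i+frame])
                  for i in range(0, len(x)-frame, hop)])
    return X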
Here's the code for the chromagram:
def chromagram(x, fs, framesize, hopsize):
    X = stft(x, fs, framesize, hopsize)
    chroma = np.fmod(np.round(np.log2(X / 440) * 12), 12)
    return chroma
When I calculate the FFT I get an array of complex values, so I have to cast the result to float before calculating the chroma. Am I doing anything wrong here?
Also, how do I plot the result?

I don't think that is the right way to do it. In X you have the complex-valued STFT; you can get its magnitude with np.abs(X). Did you want to apply this formula? It converts frequencies to musical notes, but X contains no frequencies. You can get the frequency of each FFT bin with np.fft.fftfreq(frame, 1.0/fs), where frame is the frame length in samples.
If you don't want to use the Bregman Audio-Visual Information Toolbox for chroma features and would rather implement them on your own, you could port the Matlab Chroma Toolbox. I think they use filterbanks instead of the FFT. Further down on this page you will find references where chroma features are explained in detail.
Anyway, once you have chroma features, you can plot them like any 2-dimensional array with imshow.
from matplotlib import pyplot as plt
import numpy as np
X = np.random.random((30, 30))
plt.imshow(X)
plt.show()
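To make the steps from this answer concrete (take the magnitude, look up each bin's frequency, then map frequencies to notes), here is a minimal sketch. It is not the Chroma Toolbox method: it simply sums FFT magnitudes per pitch class. The function names, the A4 = 440 Hz reference, the use of rfft instead of fft, and the synthetic 440 Hz test tone are all assumptions made for illustration.

```python
import numpy as np

def stft_magnitude(x, frame, hop):
    """Magnitude STFT; frame and hop are given in samples (assumption)."""
    w = np.hamming(frame)
    return np.array([np.abs(np.fft.rfft(w * x[i:i + frame]))
                     for i in range(0, len(x) - frame, hop)])

def chroma_from_magnitude(mag, fs, frame, a4=440.0):
    """Sum bin magnitudes per pitch class (class 0 = A, since a4 is the reference)."""
    freqs = np.fft.rfftfreq(frame, 1.0 / fs)
    valid = freqs > 0                      # log2(0) is undefined, so skip the DC bin
    # Semitone distance from A4, folded into one octave (np.mod keeps it non-negative)
    pitch_class = np.mod(np.round(12 * np.log2(freqs[valid] / a4)), 12).astype(int)
    chroma = np.zeros((12, mag.shape[0]))
    for k in range(12):
        chroma[k] = mag[:, valid][:, pitch_class == k].sum(axis=1)
    return chroma

# Hypothetical test signal: one second of a 440 Hz sine at fs = 8000 Hz
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440.0 * t)
mag = stft_magnitude(x, frame=1024, hop=256)
chroma = chroma_from_magnitude(mag, fs, frame=1024)
```

The resulting chroma array has one row per pitch class and one column per frame, so it can be passed to plt.imshow(chroma, aspect='auto', origin='lower') like any other 2-dimensional array.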

What is the best way/method to digitize the data of a 3D surface into a grid of pixels with smaller resolution in Python?

I want to digitize (= average out over cells) photon count data into pixels given by a grid that tells how they are aligned. The photon count data is stored in a 2D array. I want to split that data into cells, each of which would correspond to a pixel. The idea is basically the same as changing an HD image to a smaller resolution. I'd like to achieve this in Python.
The digitizing function I've written:
import numpy as np
def digitize(function_data, grid_shape):
    """
    function_data = 2D array of function values of some 3D shape,
    eg.: exp(-(x^2 + y^2)) -> want to digitize this
    grid_shape: an array of length 2 which contains the dimensions of the smaller resolution
    """
    l = len(function_data)
    pixel_len_x = int(l/grid_shape[0])
    pixel_len_y = int(l/grid_shape[1])
    digitized_data = np.empty((grid_shape[0], grid_shape[1]))
    for i in range(grid_shape[0]):  # row-index of pixel in smaller-resolution grid
        for j in range(grid_shape[1]):  # column-index of pixel in smaller-resolution grid
            hd_pixel = []
            for k in range(pixel_len_y):
                hd_pixel.append(z_data[k][j:j*pixel_len_x])
            hd_pixel = np.ravel(hd_pixel)  # turns 2D array into 1D to be able to compute average
            pixel_avg = np.average(hd_pixel)
            digitized_data[i][j] = pixel_avg
    return digitized_data
In theory, this function should do what I want to achieve, but when tested it doesn't yield the expected results. Either a completed version of my function or any other method that achieves my goal would be extremely helpful.
You could also use an interpolation function, if you can use SciPy. Here we use one of the gridded-data interpolating functions, RectBivariateSpline, to upsample your function, but you can find numerous examples on this and other sites.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import RectBivariateSpline as rbs
# Sampling coordinates
x = np.linspace(-2,2,20)
y = np.linspace(-2,2,30)
# Your function
f = np.exp(-(x[:,None]**2 + y**2))
# Interpolator
interp = rbs(x, y, f)
# Higher resolution coordinates
x_hd = np.linspace(x.min(), x.max(), x.size * 5)
y_hd = np.linspace(y.min(), y.max(), y.size * 5)
# New higher res function
f_hd = interp(x_hd, y_hd, grid = True)
# Some plots
fig, ax = plt.subplots(ncols = 2)
ax[0].imshow(f)
ax[1].imshow(f_hd)
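For the downsampling direction the question asks about (averaging photon counts over cells), a reshape-based block average may be enough. The following is only a sketch: the name block_average is made up for illustration, and it assumes that any rows or columns that do not fill a whole cell can simply be trimmed off.

```python
import numpy as np

def block_average(data, grid_shape):
    """Average a 2D array over equally sized cells (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    rows, cols = grid_shape
    cell_h = data.shape[0] // rows          # source rows per output pixel
    cell_w = data.shape[1] // cols          # source columns per output pixel
    # Assumption: leftover rows/columns that do not fill a cell are dropped
    trimmed = data[:rows * cell_h, :cols * cell_w]
    # Split each axis into (cell index, position inside the cell), then average
    return trimmed.reshape(rows, cell_h, cols, cell_w).mean(axis=(1, 3))

hd = np.arange(16).reshape(4, 4)
low_res = block_average(hd, (2, 2))
# low_res is [[2.5, 4.5], [10.5, 12.5]]
```

The reshape avoids the three nested Python loops entirely, which also sidesteps the index arithmetic for each cell.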

NP.FFT on python list

Could you please advise me on the following:
I gather data from an Arduino ADC and store the data in a list on a Raspberry Pi 4 with Python 3.
The list is called 'dataList' and contains 1024 10-bit samples. This all works fine: I can reproduce the sampled signal on the Raspberry Pi.
I would like to compute the power spectrum of the acquired signal using NumPy's FFT.
I tried the following:
[see below]
This should illustrate what I'm trying to do; however this produces incoherent output. The sampled signal has a frequency of about 300 Hz. I would be very grateful for any hints in the right direction!
def show_FFT(window):
    fft = np.fft.fft(dataList, 1024, -1, None)
    for X_value in range(0, 512, 1):
        Y_value = fft[X_value]
        gfxdraw.pixel(window, X_value, int(abs(Y_value)), black)
As you mentioned in your question, your data set has X values starting from 0. Keep in mind that numpy.fft.fft is a discrete Fourier transform (DFT), which computes the transform of equally spaced samples, and I must mention that the data set should cover a symmetric range from -x to x. You can simply try it with a Gaussian function, change the parameters as you wish, and see what the results are.
Since you didn't give any data set here, I will refer you to a general case with the code below:
import numpy as np
from scipy import interpolate
import matplotlib.pyplot as plt
# create data from dataframes
x = np.random.rand(50)  # unequally spaced measurement
x.sort()
y = np.exp(-x*x)  # measured signal
Based on the answer here, you can resample your data onto equally spaced points with:
f = interpolate.interp1d(x, y)
num = 500
xx = np.linspace(x[0], x[-1], num)
yy = f(xx)
plt.close('all')
plt.plot(x,y,'bo')
plt.plot(xx,yy, 'g.-')
plt.show()
Then you can make your x data symmetric very simply with:
x=xx
y=yy
xsample = x-((x.max()-x.min())/2)
xsample=xsample-(xsample.max()+xsample.min())/2
x=xsample
Then, if you try the FFT, you will get the correct results:
ysample = yy
ysample_fft = np.fft.fftshift(np.abs(np.fft.fft(ysample/ysample.max()))) / np.sqrt(len(ysample))
plt.plot(xsample,ysample_fft/ysample_fft.max(),'b--')
plt.show()
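For uniformly sampled ADC data like the 1024-sample list in the question, a power spectrum can be sketched as follows. The sampling rate is not stated in the question, so fs = 10000 Hz is a hypothetical value, and data_list is a simulated stand-in for dataList (a 300 Hz sine on a 10-bit mid-scale offset). Removing the mean and applying a Hann window are choices made for this sketch, not something the question prescribes.

```python
import numpy as np

def power_spectrum(samples, fs):
    """One-sided power spectrum of real, equally spaced samples."""
    x = np.asarray(samples, dtype=float)
    x = x - x.mean()                  # remove the ADC's DC offset
    x = x * np.hanning(len(x))        # taper the ends to reduce leakage
    spectrum = np.fft.rfft(x)         # real input: only non-negative frequencies
    power = np.abs(spectrum) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)   # bin index -> Hz
    return freqs, power

fs = 10000.0                          # hypothetical sampling rate in Hz
t = np.arange(1024) / fs
# Simulated 10-bit ADC readings: 300 Hz sine around mid-scale
data_list = [int(round(512 + 400 * np.sin(2 * np.pi * 300.0 * v))) for v in t]

freqs, power = power_spectrum(data_list, fs)
peak_hz = freqs[np.argmax(power)]     # frequency of the strongest bin
```

The x coordinate to draw is then the bin index and the y coordinate a scaled version of power (or its logarithm); freqs tells you which frequency in Hz each bin corresponds to.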

Creating a similar spectrogram with the continuous wavelet transform compared to the discrete wavelet transform

Using PyWavelets and Matplotlib's specgram on a signal gives more detailed plots with pywt.dwt than with pywt.cwt. How can I get a pywt.cwt spectrogram in a similar way?
With dwt:
import numpy as np
import pywt
import pywt.data
import matplotlib.pyplot as plot
from scipy import signal
from scipy.io import wavfile
bA, bD = pywt.dwt(datamean, 'db2')
powerSpectrum, freqenciesFound, time, imageAxis = plot.specgram(bA, NFFT = 387, Fs=100)
plot.xlabel('Time')
plot.ylabel('Frequency')
plot.show()
with this spectrogram plot:
https://imgur.com/a/bYb8bBS
With cwt:
widths = np.arange(1,5)
coef, freqs = pywt.cwt(datamean, widths,'morl')
powerSpectrum, freqenciesFound, time, imageAxis = plot.specgram(coef, NFFT = 129, Fs=100)
plot.xlabel('Time')
plot.ylabel('Frequency')
plot.show()
with this spectrogram plot:
https://imgur.com/a/GIINzJp
and for better results:
sig = datamean
widths = np.arange(1, 31)
cwtmatr = signal.cwt(sig, signal.ricker, widths)
plot.imshow(cwtmatr, extent=[-1, 1, 1, 5], cmap='PRGn', aspect='auto',
            vmax=abs(cwtmatr).max(), vmin=-abs(cwtmatr).max())
plot.show()
with this spectrogram plot:
https://imgur.com/a/TnXqgGR
How can I get, for cwt (spectrogram plots 2 and 3), a spectrogram plot and style similar to the first one?
It seems like the 1st spectrogram plot compared to the 3rd has much more details.
This would be better as a comment, but since I lack the Karma to do that:
You don't want to make a spectrogram with wavelets, but a scalogram instead. What it looks like you're doing above is projecting your data in a scale subspace (that correlates to frequency), then taking those scales and finding the frequency content of them which is not what you probably want.
The detail and approximation coefficients are what you would want to use directly. Unfortunately, PyWavelets doesn't have a simple plotting function to do this for you, AFAIK. Matlab does, and their help page may be illuminating if I fail.
import numpy as np
import pywt
import matplotlib.pyplot as plt
from scipy.interpolate import griddata
def scalogram(data):
    wave = 'db4'
    coeff = pywt.wavedec(data, wave)
    levels = len(coeff)
    lengths = [len(co) for co in coeff]
    col = int(np.max(lengths))
    im = np.ones([levels, col])
    for level in range(levels):
        y = coeff[level]
        if lengths[level] < col:
            # Stretch shorter coefficient rows to the full width (nearest neighbour)
            x = col / (lengths[level] + 1) * np.arange(1, len(y) + 1)
            xi = np.linspace(0, col, col)
            yi = griddata(points=x, values=y, xi=xi, method='nearest')
        else:
            yi = y
        im[level, :] = yi
    im[im == 0] = np.nan
    # Number of time-frequency tiles (all coefficients except the approximation)
    tiles = sum(lengths) - lengths[0]
    return im, tiles
Wxx, tiles = scalogram(data)
IM = plt.imshow(np.log10(abs(Wxx)), aspect='auto')
plt.show()
There are better ways of doing that, but it works. This produces a rectangular matrix in Wxx, similar to a spectrogram, and tiles is simply a count of the number of time-frequency tilings, to compare with the number used in an STFT.
I've attached a picture of what these tilings look like
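For the continuous transform, the scalogram idea amounts to plotting the magnitude of the pywt.cwt coefficients directly instead of passing them to specgram. A minimal sketch follows, assuming a synthetic 5 Hz sine in place of datamean, the sampling rate of 100 Hz taken from the Fs=100 in the question, and a wider range of scales than np.arange(1,5). The helper name cwt_scalogram is made up for illustration.

```python
import numpy as np
import pywt

def cwt_scalogram(sig, fs, scales, wavelet='morl'):
    """Return |CWT coefficients| and the frequency (Hz) belonging to each scale."""
    coef, freqs = pywt.cwt(sig, scales, wavelet, sampling_period=1.0 / fs)
    return np.abs(coef), freqs

fs = 100.0                                # matches Fs=100 in the question
t = np.arange(0, 10, 1.0 / fs)
sig = np.sin(2 * np.pi * 5.0 * t)         # synthetic stand-in for datamean
scales = np.arange(1, 64)                 # more scales give more rows in the image
power, freqs = cwt_scalogram(sig, fs, scales)
```

The power array has one row per scale and one column per sample, so it can be shown with plt.pcolormesh(t, freqs, power, shading='auto') to get time on the x axis and frequency on the y axis, much like the specgram layout.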

Why does scipy.signal.correlate2d fail to work in this example?

I am trying to cross-correlate two images, and thus locate the template image on the first image, by finding the maximum correlation value.
I drew an image with some random shapes (first image), and cut out one of these shapes (template). Now, when I use scipy's correlate2d and locate the points in the correlation with the maximum value, several points appear. From my knowledge, shouldn't there be only one point where the overlap is at its maximum?
The idea behind this exercise is to take some part of an image, and then correlate that to some previous images from a database. Then I should be able to locate this part on the older images based on the maximum value of correlation.
My code looks something like this:
import numpy as np
from matplotlib import pyplot as plt
from PIL import Image
import scipy.signal as sp
img = Image.open('test.png').convert('L')
img = np.asarray(img)
temp = Image.open('test_temp.png').convert('L')
temp = np.asarray(temp)
corr = sp.correlate2d(img, temp, boundary='symm', mode='full')
plt.imshow(corr, cmap='hot')
plt.colorbar()
coordin = np.where(corr == np.max(corr))  # Finds all coordinates where there is a maximum correlation
listOfCoordinates = list(zip(coordin[1], coordin[0]))
for i in range(len(listOfCoordinates)):  # Plotting all those coordinates
    plt.plot(listOfCoordinates[i][0], listOfCoordinates[i][1], 'c*', markersize=5)
This yields the figure:
Cyan stars are points with max correlation value (255).
I expect there to be only one point in "corr" to have the max value of correlation, but several appear. I have tried to use different modes of correlating, but to no avail.
This is the test image I use when correlating.
This is the template, cut from the original image.
Can anyone give some insight to what I might be doing wrong here?
You are probably overflowing the numpy type uint8.
Try using:
img = np.asarray(img,dtype=np.float32)
temp = np.asarray(temp,dtype=np.float32)
Untested.
Applying
img = img - img.mean()
temp = temp - temp.mean()
before computing the 2D cross-correlation corr should give you the expected result.
Cleaning up the code, for a full example:
from imageio import imread
from matplotlib import pyplot as plt
import scipy.signal as sp
import numpy as np
img = imread('https://i.stack.imgur.com/JL2LW.png', pilmode='L')
temp = imread('https://i.stack.imgur.com/UIUzJ.png', pilmode='L')
corr = sp.correlate2d(img - img.mean(),
temp - temp.mean(),
boundary='symm',
mode='full')
# coordinates where there is a maximum correlation
max_coords = np.where(corr == np.max(corr))
plt.plot(max_coords[1], max_coords[0],'c*', markersize=5)
plt.imshow(corr, cmap='hot')
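The effect of the float conversion and mean subtraction can be checked without the image files. The sketch below uses a synthetic random image as a stand-in for the test picture (an assumption), cuts a template from a known position, and uses mode='valid' rather than the 'full'/'symm' settings above so that the peak index equals the template's top-left corner directly.

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
# Synthetic stand-in for the test image, converted to float to avoid uint8 overflow
img = rng.integers(0, 256, size=(40, 40)).astype(np.float32)

top, left = 12, 20                        # known template position (hypothetical)
temp = img[top:top + 8, left:left + 8].copy()

# Subtract the means so bright regions do not dominate the correlation
corr = correlate2d(img - img.mean(), temp - temp.mean(), mode='valid')

# With mode='valid', index (i, j) is the template's top-left corner in img
peak = np.unravel_index(np.argmax(corr), corr.shape)
n_maxima = int(np.sum(corr == corr.max()))
```

Here peak comes back as the position the template was cut from, and n_maxima counts how many points share the maximum value.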

How to produce the following images (gabor patches)

I am trying to create four gabor patches, very similar to those below.
I don't need them to be identical to the pictures below, but similar.
Despite a bit of tinkering, I have been unable to reproduce these images...
I believe they were created in MATLAB originally. I don't have access to the original MATLAB code.
I have the following code in python (2.7.10):
import numpy as np
from scipy.misc import toimage # One can also use matplotlib*
data = gabor_fn(sigma = ???, theta = 0, Lambda = ???, psi = ???, gamma = ???)
toimage(data).show()
*graphing a numpy array with matplotlib
gabor_fn, from here, is defined below:
def gabor_fn(sigma, theta, Lambda, psi, gamma):
    sigma_x = sigma
    sigma_y = float(sigma)/gamma
    # Bounding box
    nstds = 3
    xmax = max(abs(nstds*sigma_x*np.cos(theta)), abs(nstds*sigma_y*np.sin(theta)))
    xmax = np.ceil(max(1, xmax))
    ymax = max(abs(nstds*sigma_x*np.sin(theta)), abs(nstds*sigma_y*np.cos(theta)))
    ymax = np.ceil(max(1, ymax))
    xmin = -xmax
    ymin = -ymax
    (y, x) = np.meshgrid(np.arange(ymin, ymax+1), np.arange(xmin, xmax+1))
    # Rotation
    x_theta = x*np.cos(theta) + y*np.sin(theta)
    y_theta = -x*np.sin(theta) + y*np.cos(theta)
    gb = np.exp(-.5*(x_theta**2/sigma_x**2 + y_theta**2/sigma_y**2))*np.cos(2*np.pi/Lambda*x_theta + psi)
    return gb
As you may be able to tell, the only difference (I believe) between the images is contrast. So gabor_fn would likely need to be altered to allow for this (unless I misunderstand one of the params)... I'm just not sure how.
UPDATE:
from math import pi
from matplotlib import pyplot as plt
data = gabor_fn(sigma=5.,theta=pi/2.,Lambda=12.5,psi=90,gamma=1.)
unit = 1  # From left to right, unit was set to 1, 3, 7 and 9.
bound = 0.0009/unit
fig = plt.imshow(
    data,
    cmap='gray',
    interpolation='none',
    vmin=-bound,
    vmax=bound,
)
plt.axis('off')
The problem you are having is a visualization problem (although I think you are choosing parameters that are too large).
By default, matplotlib and SciPy's toimage use bilinear (or trilinear) interpolation, depending on your matplotlib configuration script. That's why your image looks so smooth: your pixel values are being interpolated, so you are not displaying the raw kernel you have just calculated.
Try using matplotlib with no interpolation:
from matplotlib import pyplot as plt
plt.imshow(data, 'gray', interpolation='none')
plt.show()
For the following parameters:
data = gabor_fn(sigma=5.,theta=pi/2.,Lambda=25.,psi=90,gamma=1.)
You get this output:
If you reduce Lambda to 15, you get something like this:
Additionally, the sigma you choose changes the strength of the smoothing. Adding the parameters vmin=-1 and vmax=1 to imshow (similar to what @kazemakase suggested) will give you the desired contrast.
Check this guide for sensible values for (and ways to use) Gabor kernels:
http://scikit-image.org/docs/dev/auto_examples/plot_gabor.html
It seems like toimage scales the input data so that the min/max values are mapped to black/white.
I do not know what amplitudes to reasonably expect from gabor patches, but you should try something like this:
toimage(data, cmin=-1, cmax=1).show()
This tells toimage what range your data is in. You can try to play around with cmin and cmax, but make sure they are symmetric (i.e. cmin=-x, cmax=x) so that a value of 0 maps to grey.
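Since the patches are meant to differ only in contrast, one option is to scale the kernel by a contrast factor and always display it with the same symmetric range. The sketch below is a simplified stand-in for gabor_fn, not a replacement for it: it uses a fixed square grid and an isotropic envelope, and the function name, the parameter defaults, and the four contrast values are all assumptions.

```python
import numpy as np

def gabor_patch(size=101, sigma=15.0, theta=0.0, lam=20.0, psi=0.0, contrast=1.0):
    """Gabor patch with values in [-contrast, contrast] (hypothetical helper)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_theta = x * np.cos(theta) + y * np.sin(theta)            # rotate the grating axis
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))  # isotropic Gaussian window
    carrier = np.cos(2.0 * np.pi * x_theta / lam + psi)         # sinusoidal grating
    return contrast * envelope * carrier

# Four patches that differ only in contrast (values chosen for illustration)
contrasts = (1.0, 0.5, 0.25, 0.1)
patches = [gabor_patch(contrast=c) for c in contrasts]
```

Displaying each patch with plt.imshow(p, cmap='gray', vmin=-1, vmax=1, interpolation='none') keeps 0 mapped to mid-grey for all of them, so lower contrast shows up as a fainter grating rather than being rescaled to full black and white.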
