scipy signal find_peaks_cwt not finding the peaks accurately?

I've got a 1-D signal in which I'm trying to find the peaks. I'm looking to find them perfectly.
I'm currently doing:
import numpy as np
import scipy.signal as signal
peaks = signal.find_peaks_cwt(data, np.arange(100,200))
The following is a graph with red spots which show the location of the peaks as found by find_peaks_cwt().
As you can see, the calculated peaks aren't accurate enough. The ones that are really important are the three on the right hand side.
My question: How do I make this more accurate?
UPDATE: Data is here: http://pastebin.com/KSBTRUmW
For some background, what I'm trying to do is locate the space in-between the fingers in an image. What is plotted is the x-coordinate of the contour around the hand. Cyan spots = peaks. If there is a more reliable/robust approach to this, please leave a comment.

Solved, solution:
Filter data first:
window = signal.windows.general_gaussian(51, p=0.5, sig=20)
filtered = signal.fftconvolve(window, data)
filtered = (np.average(data) / np.average(filtered)) * filtered
filtered = np.roll(filtered, -25)
Then use argrelextrema as per rapelpy's answer.
Result:
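The filter-then-search recipe above can be sketched end to end on synthetic data. This is a minimal sketch, not the poster's exact code: the noisy cosine is a made-up stand-in for the real contour data, the window is normalised to unit sum instead of rescaling by averages, mode="same" replaces the manual np.roll alignment, and order=50 is an assumed neighbourhood size that would need tuning for other signals.

```python
import numpy as np
from scipy import signal

# Synthetic stand-in for the contour data: three broad peaks plus noise.
rng = np.random.default_rng(0)
n = 1200
x = np.linspace(0.0, 6.0 * np.pi, n)
data = 5.0 - np.cos(x) + 0.1 * rng.standard_normal(n)

# Same window shape as in the answer, normalised so the smoothing
# keeps the overall level of the signal.
window = signal.windows.general_gaussian(51, p=0.5, sig=20)
window /= window.sum()

# mode="same" returns an output aligned with the input.
filtered = signal.fftconvolve(data, window, mode="same")

# A sample counts as a maximum only if it exceeds its 50 neighbours
# on each side (order=50 is an assumption for this synthetic signal).
maxima = signal.argrelextrema(filtered, np.greater, order=50)[0]
print(maxima)
```

With order left at its default of 1, every small wiggle that survives the smoothing would be reported, so the smoothing width and order are best tuned together.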

There is a much easier solution using this function:
https://gist.github.com/endolith/250860
which is an adaptation of http://billauer.co.il/peakdet.html
I've just tried with the data you provided and I got the result below. No need for pre-filtering...
Enjoy :-)
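The idea behind peakdet, as described on the linked page, is to accept a maximum only once the signal has dropped by at least delta below it, and likewise for minima. The following is a minimal sketch of that idea, not the code from the gist; the function name peakdet_sketch and the sample values are made up for illustration.

```python
import math

def peakdet_sketch(values, delta):
    """Return (maxima, minima) as lists of (index, value) pairs."""
    maxima, minima = [], []
    cur_max, cur_min = -math.inf, math.inf
    max_pos = min_pos = 0
    looking_for_max = True
    for i, v in enumerate(values):
        # Track the running extremes since the last accepted peak/valley.
        if v > cur_max:
            cur_max, max_pos = v, i
        if v < cur_min:
            cur_min, min_pos = v, i
        if looking_for_max:
            # Accept the running maximum once we have fallen delta below it.
            if v < cur_max - delta:
                maxima.append((max_pos, cur_max))
                cur_min, min_pos = v, i
                looking_for_max = False
        elif v > cur_min + delta:
            # Accept the running minimum once we have risen delta above it.
            minima.append((min_pos, cur_min))
            cur_max, max_pos = v, i
            looking_for_max = True
    return maxima, minima

values = [0, 1, 0.9, 1.2, 0, 0.1, 0, 2, 0]
maxima, minima = peakdet_sketch(values, delta=0.5)
print(maxima)  # [(3, 1.2), (7, 2)]
print(minima)  # [(4, 0)]
```

The small dip from 1 to 0.9 is ignored because it is shallower than delta, which is why this kind of detector can cope with noise without pre-filtering.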

Edited after getting the raw data.
argrelmax and argrelextrema are out of the race.
The curve is very noisy, so you have to play with a small peak width (as pv. mentioned) and the noise threshold.
The best result I found does not look very good.
import numpy as np
import scipy.signal as signal
peakidx = signal.find_peaks_cwt(y_array, np.arange(10,15), noise_perc=0.1)
print(peakidx)
[10, 100, 132, 187, 287, 351, 523, 597, 800, 1157, 1451, 1673, 1742, 1836]

Based on @cjm2671's answer, here is a working example for finding relative maxima and minima in a noisy signal:
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter1d
from scipy import signal
data = np.array([5.14,5.22,5.16,4.82,4.46,4.36,4.4,4.35,4.13,3.83,3.59,3.51,3.46,3.27,3.08,3.03,2.95,2.96,2.98,3.02,3.09,3.14,3.06,2.84,2.68,2.72,2.92,3.23,3.44,3.5,3.28,3.34,3.73,3.97,4.26,4.48,4.5,5.06,6.02,6.68,7.09,7.58,8.6,9.85,10.7,11.3,11.3,11.6,12.3,12.6,12.8,12.8,12.5,12.4,12.2,12.2,12.3,11.9,11.2,10.6,10.3,10.3,10.,9.53,8.97,8.55,8.49,8.41,8.09,7.71,7.34,7.26,7.42,7.47,7.37,7.17,7.05,7.02,7.09,7.23,7.18,7.16,7.47,7.92,8.55,8.68,8.31,8.52,9.11,9.59,9.83,9.73,10.2,11.1,11.6,11.7,11.7,12.,12.6,13.1,13.3,13.2,13.,12.6,12.3,12.2,12.3,12.,11.6,11.1,10.9,10.9,10.7,10.3,9.83,9.64,9.63,9.37,8.88,8.39,8.14,8.12,7.92,7.48,7.06,6.87,6.87,6.63,6.17,5.71,5.45,5.45,5.34,5.05,4.78,4.57,4.47,4.37,4.16,3.95,3.88,3.83,3.69,3.64,3.57,3.5,3.51,3.33,3.14,3.09,3.06,3.12,3.11,2.94,2.83,2.76,2.74,2.77,2.75,2.73,2.72,2.59,2.47,2.53,2.54,2.63,2.76,2.78,2.75,2.69,2.54,2.42,2.58,2.79,2.83,2.78,2.71,2.77,2.88,2.97,2.97,2.9,2.92,3.16,3.29,3.28,3.49,3.97,4.32,4.49,4.82,5.08,5.48,6.03,6.52,6.72,7.16,8.18,9.52,10.9,12.1,12.6,12.9,13.3,13.3,13.6,13.9,13.9,13.6,13.3,13.2,13.2,12.8,12.,11.4,11.,10.9,10.4,9.54,8.83,8.57,8.61,8.24,7.54,6.82,6.46,6.43,6.26,5.78,5.29,5.,5.08,5.14,5.,4.84,4.56,4.38,4.52,4.84,5.33,5.52,5.56,5.82,6.54,7.27,7.74,7.64,8.14,8.96,9.7,10.2,10.2,10.5,11.3,12.,12.4,12.5,12.3,12.,11.8,11.8,11.9,11.6,11.,10.3,10.,9.98,9.6,8.87,8.16,7.76,7.74,7.54,7.03,6.54,6.25,6.26,6.09,5.66,5.31,5.08,5.19,5.4,5.38,5.38,5.22,4.95,4.9,5.02,5.28,5.44,5.93,6.77,7.63,8.48,8.89,8.97,9.49,10.3,10.8,11.,11.1,11.,11.,10.9,11.1,11.1,11.,10.7,10.5,10.4,10.3,10.4,10.3,10.2,10.1,10.2,10.4,10.4,10.5,10.7,10.8,11.,11.2,11.2,11.2,11.3,11.4,11.4,11.3,11.2,11.2,11.,10.7,10.4,10.3,10.3,10.2,9.9,9.62,9.47,9.46,9.35,9.12,8.82,8.48,8.41,8.61,8.83,8.77,8.48,8.26,8.39,8.84,9.2,9.31,9.18,9.11,9.49,9.99,10.3,10.5,10.4,10.2,10.,9.91,10.,9.88,9.47,9.,8.78,8.84,8.8,8.55,8.17,8.02,8.03,7.78,7.3,6.8,6.54,6.53,6.35,5.94,5.54,5.33,5.32,5.14,4.76,4.43,4.28,4.3,4.26,4.11,4.,3.89,3.81,3.68,3.48,3.35,3.36,3.47,3.57,3.55,3.43,
3.29,3.19,3.2,3.17,3.21,3.33,3.37,3.33,3.37,3.38,3.26,3.34,3.62,3.86,3.92,3.83,3.69,4.2,4.78,5.03,5.13,5.07,5.4,6.,6.42,6.5,6.45,6.48,6.55,6.66,6.79,7.06,7.33,7.53,7.9,8.17,8.29,8.6,9.05,9.35,9.51,9.69,9.88,10.2,10.6,10.8,10.6,10.7,10.9,11.2,11.3,11.3,11.4,11.5,11.6,11.8,11.7,11.3,11.1,10.9,11.,11.2,11.1,10.6,10.3,10.1,10.2,10.,9.6,9.03,8.73,8.73,8.7,8.53,8.26,8.06,8.03,8.03,7.97,7.94,7.77,7.64,7.85,8.29,8.65,8.68,8.61,9.08,9.66,9.86,9.9,9.71,10.,10.9,11.4,11.6,11.8,11.8,11.9,11.9,12.,12.,11.7,11.3,10.9,10.8,10.7,10.4,9.79,9.18,8.89,8.87,8.55,7.92,7.29,6.99,6.98,6.73,6.18,5.65,5.35,5.35,5.22,4.89,4.53,4.28,4.2,4.05,3.83,3.67,3.61,3.61,3.48,3.27,3.05,2.9,2.93,2.99,2.99,2.98,2.94,2.88,2.89,2.92,2.86,2.97,3.,3.02,3.03,3.11,3.07,3.46,3.96,4.09,4.25,4.3,4.67,5.7,6.33,6.68,6.9,7.09,7.66,8.25,8.75,8.87,8.97,9.78,10.9,11.6,11.8,11.8,11.9,12.3,12.6,12.8,12.9,12.7,12.4,12.1,12.,12.,11.9,11.5,11.1,10.9,10.9,10.7,10.5,10.1,9.91,9.84,9.63,9.28,9.,8.86,8.95,8.87,8.61,8.29,7.99,7.95,7.96,7.92,7.87,7.77,7.78,7.9,7.73,7.51,7.43,7.6,8.07,8.62,9.06,9.24,9.13,9.14,9.46,9.76,9.8,9.78,9.73,9.82,10.2,10.6,10.8,10.8,10.9,11.,10.9,11.,11.,10.9,10.9,11.,10.9,10.8,10.5,10.2,10.2,10.2,9.94,9.51,9.08,8.88,8.88,8.62,8.13,7.64,7.37,7.37,7.23,6.91,6.6,6.41,6.42,6.29,5.94,5.57,5.43,5.46,5.4,5.17,4.95,4.84,4.87,4.9,4.69,4.4,4.24,4.26,4.35,4.34,4.19,3.96,3.97,4.42,5.03,5.34,5.15,4.73,4.86,5.35,5.88,6.35,6.52,6.81,7.26,7.62,7.66,8.01,8.91,10.,10.9,11.3,11.1,10.9,10.9,10.8,10.9,11.,10.7,10.2,9.68,9.43,9.42,9.17,8.66,8.13,7.83,7.81,7.62,7.21,6.77,6.48,6.44,6.31,6.06,5.72,5.47,5.45,5.42,5.31,5.23,5.22,5.3,5.32,5.16,4.96,4.82,4.73,4.9,4.95,4.91,4.92,5.41,6.04,6.34,6.8,7.08,7.26,7.95,8.57,8.78,8.95,9.06,9.14,9.2,9.33,9.53,9.65,9.69,9.53,9.18,9.02,9.,8.82,8.42,8.05,7.85,7.84,7.79,7.58,7.28,7.09,7.07,6.94,6.68,6.35,6.09,6.2,6.27,6.24,6.16,5.91,5.86,6.02,6.19,6.45,6.92,7.35,7.82,8.4,8.87,9.,9.09,9.61,9.99,10.4,10.8,10.7,10.7,11.1,11.4,11.5,11.5,11.3,11.3,11.4,11.7,11.8,11.5,11.,10.5,10.4,10.3,9.94,9.23,8.52,
8.16,8.15,7.86,7.23,6.59,6.26,6.25,6.04,5.55,5.06,4.81,4.78,4.62,4.28,3.98,3.84,3.92,3.93,3.68,3.46,3.31,3.16,3.11,3.18,3.19,3.14,3.28,3.3,3.16,3.19,3.04,3.07,3.59,3.83,3.82,3.95,4.06,4.71,5.39,5.89,6.06,6.08,6.45,6.97,7.57,8.1,8.25,8.55,8.92,9.09,9.2,9.32,9.36,9.45,9.65,9.73,9.7,9.82,9.94,9.92,9.97,9.93,9.78,9.63,9.48,9.49,9.48,9.2,8.81,8.34,8.,8.06,7.98,7.63,7.47,7.37,7.24,7.2,7.05,6.93,6.83,6.59,6.44,6.42,6.33,6.18,6.37,6.29,6.1,6.34,6.57,6.54,6.77,7.21,7.58,7.86,8.11,8.57,9.07,9.45,9.67,9.68,9.87,10.2,10.4,10.4,10.4,10.4,10.4,10.5,10.6,10.7,10.4,9.98,9.58,9.45,9.51,9.44,9.09,8.68,8.46,8.36,8.17,7.88,7.55,7.34,7.3,7.17,6.97,6.88,6.69,6.69,6.77,6.77,6.81,6.67,6.5,6.57,6.99,7.4,7.59,7.8,8.45,9.47,10.4,10.8,10.9,10.9,11.,11.4,11.8,12.,11.9,11.4,10.9,10.8,10.8,10.5,9.76,8.99,8.59,8.58,8.43,8.05,7.61,7.26,7.16,6.99,6.58,6.15,5.98,5.93,5.71,5.48,5.22,5.06,5.08,4.95,4.78,4.62,4.45,4.48,4.65,4.66,4.69])
dataFiltered = gaussian_filter1d(data, sigma=5)
tMax = signal.argrelmax(dataFiltered)[0]
tMin = signal.argrelmin(dataFiltered)[0]
plt.plot(data, label = 'raw')
plt.plot(dataFiltered, label = 'filtered')
plt.plot(tMax, dataFiltered[tMax], 'o', mfc= 'none', label = 'max')
plt.plot(tMin, dataFiltered[tMin], 'o', mfc= 'none', label = 'min')
plt.legend()
plt.savefig('fig.png', dpi = 300)
The Gaussian filter already implements the convolution with Gaussian windows. We just have to give it the standard deviation of the window as a parameter.
In this case, this approach works much better than using signal.find_peaks_cwt.

Related

Finding the maximum values of a set of local maxima using matplotlib and numpy

I want to ask a question about finding the maxima of a set of peaks using matplotlib and numpy.
I have been given data containing peaks and asked to calculate the maxima of the set of peaks.
Below is a picture of the peaks.
I discovered the find_peaks method and attempted to solve the problem using this.
I wrote the following block of code in Jupyter:
%pylab inline
from scipy.signal import find_peaks
testdata = loadtxt("testdata.dat", usecols=(0,1))
testdata_x = testdata[100:200,0]
testdata_y = testdata[100:200,1]
plot(testdata_x, testdata_y)
show()
peaks = find_peaks(testdata_y)
peaks
However, I get the following output for peaks:
(array([ 7, 12, 36, 40, 65, 69, 93, 97]), {})
I cannot understand why I get an output as above and am struggling to find a solution.
I attempted also to pass the following:
peaks = find_peaks(testdata_y, testdata_x)
but this was to no avail.
How can I sort out the matter?
I have attached the data file here as a download link if necessary (hosted on filehosting.org)

Like the comments say, the values returned by find_peaks are the indices (or locations) of the peaks.
To find the values of these peaks, use the peak indices to get the values out of testdata_y. Then you can get the max.
%pylab inline
from scipy.signal import find_peaks
testdata = loadtxt("testdata.dat", usecols=(0,1))
testdata_x = testdata[100:200,0]
testdata_y = testdata[100:200,1]
plot(testdata_x, testdata_y)
show()
peaks = find_peaks(testdata_y)
peak_values = testdata_y[peaks[0]]
max_peak = max(peak_values)
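The relationship between the returned indices, the x values and the peak heights can be sketched on a tiny hand-made array (the numbers below are made up, not taken from testdata.dat). It also shows why passing testdata_x as the second argument did not help: the second positional parameter of find_peaks is height, a minimum-height filter, not an x axis.

```python
import numpy as np
from scipy.signal import find_peaks

# Hypothetical data: three peaks of different heights.
x = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0])
y = np.array([0.0, 2.0, 0.0, 5.0, 0.0, 3.0, 0.0])

# find_peaks returns (indices, properties); unpack both.
idx, props = find_peaks(y)
peak_x = x[idx]           # x positions of the peaks
peak_y = y[idx]           # heights of the peaks

# Index of the tallest peak, then its coordinates.
best = idx[np.argmax(peak_y)]
print(x[best], y[best])   # 13.0 5.0

# The second parameter is a height threshold, not the x values.
tall_idx, tall_props = find_peaks(y, height=2.5)
print(tall_idx, tall_props["peak_heights"])
```

When a condition such as height is given, the properties dictionary is no longer empty and carries the matching measurements for each reported peak.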

Creating a similar spectrogram with the continuous wavelet transform compared to the discrete wavelet transform

Using PyWavelets and matplotlib's specgram on a signal gives more detailed plots with pywt.dwt than with pywt.cwt. How can I get a similar specgram with pywt.cwt?
With dwt:
import pywt
import pywt.data
import matplotlib.pyplot as plot
from scipy import signal
from scipy.io import wavfile
bA, bD = pywt.dwt(datamean, 'db2')
powerSpectrum, freqenciesFound, time, imageAxis = plot.specgram(bA, NFFT = 387, Fs=100)
plot.xlabel('Time')
plot.ylabel('Frequency')
plot.show()
with this spectrogram plot:
https://imgur.com/a/bYb8bBS
With cwt:
widths = np.arange(1,5)
coef, freqs = pywt.cwt(datamean, widths,'morl')
powerSpectrum, freqenciesFound, time, imageAxis = plot.specgram(coef, NFFT = 129, Fs=100)
plot.xlabel('Time')
plot.ylabel('Frequency')
plot.show()
with this spectrogram plot:
https://imgur.com/a/GIINzJp
and for better results:
sig = datamean
widths = np.arange(1, 31)
cwtmatr = signal.cwt(sig, signal.ricker, widths)
plot.imshow(cwtmatr, extent=[-1, 1, 1, 5], cmap='PRGn', aspect='auto',
            vmax=abs(cwtmatr).max(), vmin=-abs(cwtmatr).max())
plot.show()
with this spectrogram plot:
https://imgur.com/a/TnXqgGR
How can I get, for cwt (spectrogram plots 2 and 3), a spectrogram plot and style similar to the first one?
It seems like the 1st spectrogram plot compared to the 3rd has much more details.

This would be better as a comment, but since I lack the Karma to do that:
You don't want to make a spectrogram with wavelets, but a scalogram instead. What it looks like you're doing above is projecting your data in a scale subspace (that correlates to frequency), then taking those scales and finding the frequency content of them which is not what you probably want.
The detail and approximation coefficients are what you would want to use directly. Unfortunately, PyWavelets doesn't have a simple plotting function to do this for you, AFAIK. Matlab does, and their help page may be illuminating if I fail.
import numpy as np
import pywt
import matplotlib.pyplot as plt
from scipy.interpolate import griddata

def scalogram(data):
    wave='db4'
    coeff=pywt.wavedec(data,wave)
    levels=len(coeff)
    lengths=[len(co) for co in coeff]
    col=np.max(lengths)
    # one image row per decomposition level, as wide as the longest level
    im=np.ones([levels,col])
    col=col.astype(float)
    for level in range(levels):
        y=coeff[level]
        if lengths[level]<col:
            # stretch the shorter coefficient vector across the full width
            x=col/(lengths[level]+1)*np.arange(1,len(y)+1)
            xi=np.linspace(0,int(col),int(col))
            yi=griddata(points=x,values=y,xi=xi,method='nearest')
        else:
            yi=y
        im[level,:]=yi
    im[im==0]=np.nan
    tiles=sum(lengths)-lengths[0]
    return im,tiles
Wxx,tiles=scalogram(data)  # data is your 1-D signal
IM=plt.imshow(np.log10(abs(Wxx)),aspect='auto')
plt.show()
There are better ways of doing that, but it works. This produces a rectangular matrix similar to a spectrogram in "Wxx", and tiles is simply a counter of the number of time-frequency tilings to compare to the number used in an STFT.
I've attached a picture of what these tilings look like
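For the continuous transform the question asks about, a scalogram is simply the magnitude of the CWT coefficients shown over time (columns) and scale (rows), with no further Fourier step. Below is a minimal NumPy-only sketch of that idea. The Ricker wavelet is written out by hand so the example does not depend on a particular SciPy or PyWavelets release, and the two-tone test signal, the helper names and the scale range are all assumptions made for illustration.

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican hat) wavelet with width a, sampled at `points` samples."""
    t = np.arange(points) - (points - 1) / 2.0
    amp = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return amp * (1.0 - (t / a) ** 2) * np.exp(-0.5 * (t / a) ** 2)

def cwt_scalogram(sig, widths):
    """Magnitude of the CWT: one row per scale, one column per sample."""
    out = np.empty((len(widths), len(sig)))
    for row, a in enumerate(widths):
        n = min(10 * int(a), len(sig))          # wavelet support
        out[row] = np.convolve(sig, ricker(n, a), mode="same")
    return np.abs(out)

# Hypothetical test signal: fast oscillation first, slow oscillation after.
t = np.arange(1000)
sig = np.where(t < 500, np.sin(2 * np.pi * t / 20), np.sin(2 * np.pi * t / 80))

widths = np.arange(1, 31)
scal = cwt_scalogram(sig, widths)

# Scale with the largest average magnitude in each half of the signal.
fast_scale = widths[scal[:, 100:350].mean(axis=1).argmax()]
slow_scale = widths[scal[:, 650:900].mean(axis=1).argmax()]
print(fast_scale, slow_scale)
```

The array scal can be displayed directly with an image plot (for example imshow with aspect='auto'); the energy moves from small scales to large scales where the oscillation slows down, which is the information a spectrogram of the coefficients would scramble.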

How to efficiently perform a top-hat (disk-like) smoothing on a healpix map?

I've got a high-resolution healpix map (nside = 4096) that I want to smooth in disks of a given radius, let's say 10 arcmin.
Being very new to healpy and having read the documentation, I found that one (not so good) way to do this was to perform a "cone search", that is, to find around each pixel the ones inside the disk, average them and give this new value to the pixel at the center. However, this is very time-consuming.
import numpy as np
import healpy as hp
kappa = hp.read_map("zs_1.0334.fits") #Reading my file
NSIDE = 4096
t = 0.00290888 #10 arcmin
new_array = []
n = len(kappa)
for i in range(n):
    a = hp.query_disc(NSIDE,hp.pix2vec(NSIDE,i),t)
    new_array.append(np.mean(kappa[a]))
I think the healpy.sphtfunc.smoothing function could be of some help as it states that you can enter any custom beam window function but I don't understand how this works at all...
Thanks a lot for your help !

As suggested, I can easily make use of the healpy.sphtfunc.smoothing function by specifying a custom (circular) beam window.
To compute the beam window, which was my problem, healpy.sphtfunc.beam2bl is very useful and simple in the case of a top-hat.
The appropriate l_max would be roughly 2*Nside, but it can be smaller depending on the specific map. One could, for example, compute the angular power spectrum (the Cls) and check whether it is damped at l smaller than l_max, which could save some more time.
Thanks a lot to everyone who helped in the comments section!
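For a top-hat the window function can also be written in closed form, which makes it easy to see what is being computed. The sketch below assumes the usual Legendre-transform definition b_l = 2π ∫ B(θ) P_l(cos θ) sin θ dθ, which for a unit disk of radius R gives b_l = 2π [P_{l-1}(cos R) - P_{l+1}(cos R)] / (2l + 1). Dividing by b_0 so that the window equals 1 at l = 0 is my own assumption, made so that the smoothing acts as a disk average; whether healpy's beam2bl normalises the same way should be checked against its documentation. The function name is hypothetical.

```python
import numpy as np

def top_hat_window(radius, lmax):
    """Normalised harmonic window of a top-hat disk of angular radius `radius` (rad)."""
    mu = np.cos(radius)
    # Legendre polynomials P_0 .. P_{lmax+1} at mu via the three-term recurrence.
    p = np.empty(lmax + 2)
    p[0], p[1] = 1.0, mu
    for l in range(1, lmax + 1):
        p[l + 1] = ((2 * l + 1) * mu * p[l] - l * p[l - 1]) / (l + 1)
    ell = np.arange(1, lmax + 1)
    bl = np.empty(lmax + 1)
    bl[0] = 2.0 * np.pi * (1.0 - mu)                      # solid angle of the disk
    bl[1:] = 2.0 * np.pi * (p[:-2] - p[2:]) / (2 * ell + 1)
    return bl / bl[0]

radius = np.radians(10.0 / 60.0)      # 10 arcmin in radians
window = top_hat_window(radius, lmax=2 * 4096)
print(radius, window[:3])
```

An array like window could then be supplied as the beam_window argument of the smoothing function, in place of a window obtained numerically.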

Since I spent a certain amount of time figuring out how the smoothing function works, here is a bit of code that performs a top-hat smoothing.
Cheers,
import healpy as hp
import numpy as np
import matplotlib.pyplot as plt
def top_hat(b, radius):
    return np.where(abs(b)<=radius, 1, 0)
nside = 128
npix = hp.nside2npix(nside)
#create an empty map
tst_map = np.zeros(npix)
#put a source in the middle of the map with value = 100
pix = hp.ang2pix(nside, np.pi/2, 0)
tst_map[pix] = 100
#Compute the window function in the harmonic spherical space which will smooth the map.
b = np.linspace(0,np.pi,10000)
bw = top_hat(b, np.radians(45)) #top_hat function of radius 45°
beam = hp.sphtfunc.beam2bl(bw, b, nside*3)
#Smooth map
tst_map_smoothed = hp.smoothing(tst_map, beam_window=beam)
hp.mollview(tst_map_smoothed)
plt.show()

Quantile-Quantile Plot using python statsmodels api

I am trying to see whether a normal distribution with specific parameters fits a data set. However, it seems qqplot does not work as expected. The following small example shows this:
import numpy as np
import statsmodels.api as sm
import pylab
test = np.random.normal(20,5, 1000)
sm.qqplot(test, loc = 20, scale = 5 , line='45')
pylab.show()
As one can see, I expect the points to lie around the line with slope = 1, but it gives the following figure:
Can anyone explain to me why this happens?

You can use line = '45' and it will work well if you have z-normalized data, meaning your distribution has mean = 0 and sd = 1. In other cases you have several options. line = 's' draws a standardized line (the expected order statistics are scaled by the standard deviation of the given sample and have the mean added to them). line = 'q' draws a line fit through the quartiles, which in my opinion is the most meaningful one because it lets you see clearly how your data deviate from the normal distribution. You can also use line = 'r' to see the fit to a regression line. By default line is set to None.
Simply use code like this:
import numpy as np
import statsmodels.api as sm
import pylab
test = np.random.normal(20, 5, 1000)
sm.qqplot(test, line='q')
pylab.show()
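What a Q-Q plot compares can be sketched with NumPy and SciPy alone, which makes the role of the 45-degree line easier to see. This is an illustration of the concept under my own assumptions (the plotting positions (i - 0.5)/n and the straight-line fit are one common choice), not a description of how statsmodels computes its plot.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = np.sort(rng.normal(20, 5, 1000))      # sample quantiles
n = sample.size
probs = (np.arange(1, n + 1) - 0.5) / n        # plotting positions

# Theoretical quantiles of N(0, 1) and of N(20, 5).
q_std = stats.norm.ppf(probs)
q_fit = stats.norm.ppf(probs, loc=20, scale=5)

# Straight-line fits: sample quantiles against theoretical quantiles.
slope_std, icpt_std = np.polyfit(q_std, sample, 1)
slope_fit, icpt_fit = np.polyfit(q_fit, sample, 1)

print(slope_std, icpt_std)   # close to the sample sd and mean
print(slope_fit, icpt_fit)   # close to 1 and 0
```

Against standard normal quantiles the points follow a line whose slope and intercept are roughly the sample's standard deviation and mean, so they only sit on the 45-degree line when both axes are on the same scale.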

Please add the fit argument:
sm.qqplot(aaa, line = "45", fit = True)

I noticed that when I omitted the line='45' parameter from your code the following plot results.
We can see what has happened: in the Q-Q plot that statsmodels makes, the theoretical quantiles are not rescaled back to the dimensions of the original pseudosample, which is why the blue line is confined to the left edge of your plot.
I don't know how to make statsmodels do what you want; however, there is another way — see https://stackoverflow.com/a/47189575/131187.

You can try setting the fit parameter to True

What's the correct usage of matplotlib.mlab.normpdf()?

I intend for part of a program I'm writing to automatically generate Gaussian distributions of various statistics over multiple raw text sources, however I'm having some issues generating the graphs as per the guide at:
python pylab plot normal distribution
The general gist of the plot code is as follows.
import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as pyplot
meanAverage = 222.89219487179491 # typical value calculated beforehand
standardDeviation = 3.8857889432054091 # typical value calculated beforehand
x = np.linspace(-3,3,100)
pyplot.plot(x,mlab.normpdf(x,meanAverage,standardDeviation))
pyplot.show()
All it does is produce a rather flat looking and useless y = 0 line!
Can anyone see what the problem is here?
Cheers.

If you read the documentation of matplotlib.mlab.normpdf, this function is deprecated and you should use scipy.stats.norm.pdf instead.
Deprecated since version 2.2: scipy.stats.norm.pdf
And because your distribution mean is about 222, you should use a range that covers it, e.g. np.linspace(200, 245, 100).
So your code will look like:
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as pyplot
meanAverage = 222.89219487179491 # typical value calculated beforehand
standardDeviation = 3.8857889432054091 # typical value calculated beforehand
x = np.linspace(200, 245, 100)
pyplot.plot(x, norm.pdf(x, meanAverage, standardDeviation))
pyplot.show()

It looks like you made a few small but significant errors. You either are choosing your x vector wrong or you swapped your stddev and mean. Since your mean is at 222, you probably want your x vector in this area, maybe something like 150 to 300. This way you get all the good stuff, right now you are looking at -3 to 3 which is at the tail of the distribution. Hope that helps.
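The effect of the x range can be checked numerically without plotting. This small sketch uses scipy.stats.norm.pdf and the mean and standard deviation from the question; the choice of four standard deviations on each side is just one reasonable assumption for a plotting range.

```python
import numpy as np
from scipy.stats import norm

mean = 222.89219487179491
sd = 3.8857889432054091

# Range from the question: more than 56 standard deviations below the mean.
x_far = np.linspace(-3, 3, 100)
far_max = norm.pdf(x_far, mean, sd).max()
print(far_max)                     # 0.0, the density underflows here

# Range centred on the mean: mean +/- 4 standard deviations.
x_near = np.linspace(mean - 4 * sd, mean + 4 * sd, 201)
near = norm.pdf(x_near, mean, sd)
print(x_near[near.argmax()], near.max())
```

Deriving the range from the mean and standard deviation keeps the plot useful when those values are recomputed for other text sources.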

I see that, for the *args which are sending meanAverage, standardDeviation, the correct thing to be sent is:
mu : a numdims array of means of a
sigma : a numdims array of standard deviation of a
Does this help?
