I want to create a sine wave that starts from the frequency f1 and ends at the frequency f2.
Here is the code I used:
import matplotlib.pyplot as plt
import numpy as np
from scipy import signal
def freq_interp(dur,f1,f2,fs=44100):
    num_samples = fs*dur
    t = np.linspace(0,dur,num_samples)
    a = np.linspace(0,1,num_samples)
    f = (1-a)*f1+a*f2 # interpolate
    samples = np.cos(2*np.pi*f*t)
    return samples,f
When I try to generate a WAV file or just plot the STFT of the signal, I get an unexpected result. For example I used the code below:
def plot_stft(sig,fs=44100):
    f, t, Zxx = signal.stft(sig,fs=fs,nperseg=2000)
    plt.pcolormesh(t, f, np.abs(Zxx), vmin=0, vmax=0.1)
    plt.ylim(0,2000)
    plt.title('STFT Magnitude')
    plt.ylabel('Frequency [Hz]')
    plt.xlabel('Time [sec]')
    plt.show()
s,f = freq_interp(dur=2,f1=1,f2=1000)
plt.plot(f)
plt.show()
plot_stft(s)
s,f = freq_interp(dur=2,f1=1000,f2=1)
plt.plot(f)
plt.show()
plot_stft(s)
I get these plots:
The problem is more evident in the second row, where the frequency bounces back at t=1s. In the first row you can also see that the frequency goes up to 2000Hz, which is wrong. Any idea why this happens and how I can fix it?
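A sine wave is sin(p(t)), where p(t) is the phase function. The instantaneous frequency is f(t) = (1/(2*pi)) * d p(t) / dt, so to calculate p(t) you first calculate f(t) and then integrate it. Your code uses p(t) = 2*pi*f(t)*t instead, and differentiating that gives an instantaneous frequency of f(t) + t*f'(t). For a linear ramp this is f1 + 2*(f2-f1)*t/dur, which reaches about 2000Hz at the end of the rising sweep and passes through zero near t=1s in the falling sweep, so the frequency appears to bounce back. The simplest method of integration is by using cumsum().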
def freq_interp(dur,f1,f2,fs=44100):
    num_samples = int(fs*dur)
    t = np.linspace(0,dur,num_samples)
    f = np.linspace(f1, f2, num_samples)
    phase = 2 * np.pi * np.cumsum(f) / fs
    samples = np.cos(phase)
    return t, samples
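For a linear sweep the integral can also be written in closed form, which avoids the numerical sum. The following is only a sketch of that idea, not part of the answer above: the function name linear_chirp and the check on the phase increments are my own additions.

```python
import numpy as np

def linear_chirp(dur, f1, f2, fs=44100):
    """Linear sweep from f1 to f2 using the analytically integrated phase."""
    n = int(fs * dur)
    t = np.arange(n) / fs
    # phase(t) = 2*pi * integral of (f1 + (f2 - f1) * tau / dur) d tau
    phase = 2 * np.pi * (f1 * t + (f2 - f1) * t**2 / (2 * dur))
    return t, np.cos(phase), phase

fs = 44100
t, samples, phase = linear_chirp(dur=2, f1=1000, f2=1, fs=fs)

# Instantaneous frequency recovered from the phase increments, in Hz.
inst_freq = np.diff(phase) * fs / (2 * np.pi)
print(inst_freq[0], inst_freq[-1])
```

The recovered frequency falls steadily from about 1000 Hz to about 1 Hz and never turns around. SciPy also ships scipy.signal.chirp for generating sweeps like this.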
I have data points in a .txt file (delimiter = white space). The first column is the x axis and the second is the y axis. I want to fit a 2D Gaussian to these data points using Python. The truth is, I don't understand the theory behind Gaussian fitting (either one- or two-dimensional). I have read similar posts here on Stack Overflow and put some code together, but it's not fitting well. Could someone please help? Thanks.
Below is what I have in the .txt file:
3.369016418457e+02 3.761813938618e-01
3.369006652832e+02 4.078308343887e-01
3.368996887207e+02 4.220226705074e-01
3.368987121582e+02 4.200653433800e-01
3.368977355957e+02 4.454285204411e-01
3.368967590332e+02 4.156131148338e-01
3.368957824707e+02 3.989491164684e-01
3.368948059082e+02 4.512043893337e-01
3.368938293457e+02 4.565380811691e-01
3.368928527832e+02 4.095999598503e-01
3.368918762207e+02 4.196371734142e-01
3.368908996582e+02 4.002234041691e-01
3.368899230957e+02 4.133881926537e-01
3.368889465332e+02 4.394644796848e-01
3.368879699707e+02 4.504477381706e-01
3.368869934082e+02 3.946847021580e-01
3.368860168457e+02 4.214486181736e-01
3.368850402832e+02 3.753573596478e-01
3.368840637207e+02 3.673824667931e-01
3.368830871582e+02 4.088735878468e-01
3.368821105957e+02 4.351278841496e-01
3.368811340332e+02 4.393630325794e-01
3.368801574707e+02 4.210205972195e-01
3.368791809082e+02 4.322172403336e-01
3.368782043457e+02 4.652716219425e-01
3.368772277832e+02 5.251595377922e-01
3.368762512207e+02 5.873318314552e-01
3.368752746582e+02 6.823697686195e-01
3.368742980957e+02 8.375824093819e-01
3.368733215332e+02 9.335057735443e-01
3.368723449707e+02 1.083636641502e+00
3.368713684082e+02 1.170072913170e+00
3.368703918457e+02 1.224770784378e+00
3.368694152832e+02 1.158735513687e+00
3.368684387207e+02 1.131350398064e+00
3.368674621582e+02 1.073648810387e+00
3.368664855957e+02 9.659162163734e-01
3.368655090332e+02 8.495713472366e-01
3.368645324707e+02 7.637447714806e-01
3.368635559082e+02 6.956064105034e-01
3.368625793457e+02 6.713893413544e-01
3.368616027832e+02 5.285132527351e-01
3.368606262207e+02 4.968771338463e-01
3.368596496582e+02 5.077748298645e-01
3.368586730957e+02 4.686309695244e-01
3.368576965332e+02 4.693206846714e-01
3.368567199707e+02 4.462305009365e-01
3.368557434082e+02 3.872672021389e-01
3.368547668457e+02 4.243377447128e-01
3.368537902832e+02 3.918920457363e-01
3.368528137207e+02 3.848327994347e-01
3.368518371582e+02 4.093343317509e-01
3.368508605957e+02 4.321203231812e-01
Below is the code I have tried:
%pylab inline
import matplotlib.pyplot as plt
import numpy as np
import astropy
import scipy.optimize as opt
import pylab as plb
from scipy.optimize import curve_fit
from scipy import asarray as ar,exp
x,y=np.loadtxt('taper2reduced.txt', unpack= True, delimiter=' ')
mean = sum(x * y) / sum(y)
sigma = np.sqrt(sum(y * (x - mean)**2) / sum(y))
def Gauss(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2))
popt,pcov = curve_fit(Gauss, x, y, p0=[max(y), mean, sigma])
plt.plot(x, y, 'b+:', label='data')
plt.plot(x, Gauss(x, *popt), 'r-', label='fit')
plt.legend()
plt.title('Fig. 1 - Fit for Frequency')
plt.xlabel('Frequency (GHz)')
plt.ylabel('Flux Density (mJy)')
plt.show()
Your problem is that your function doesn't reflect your data set well. Your function describes a peak that rises from a baseline of 0 up to max_y, while in reality your data sit on a baseline of about min_y and rise to max_y. Add an offset parameter to your function like this:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
#function declaration with additional offset parameter
def Gauss(x, a, x0, sigma, offset):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2)) + offset
#loading x, y dataset
x, y = np.loadtxt('test.txt', unpack = True, delimiter=' ')
#calculate parameter of fit function with scipy's curve fitting algorithm
popt, pcov = curve_fit(Gauss, x, y, p0=[np.max(y), np.median(x), np.std(x), np.min(y)])
#plot original data
plt.plot(x, y, 'b+:', label='data')
#create different x value array for smooth fit function curve
x_fit = np.linspace(np.min(x), np.max(x), 1000)
#plot fit function
plt.plot(x_fit, Gauss(x_fit, *popt), 'r-', label='fit')
#beautify graph
plt.legend()
plt.title('Fig. 1 - Fit for Frequency')
plt.xlabel('Frequency (GHz)')
plt.ylabel('Flux Density (mJy)')
plt.show()
Output:
You might have noticed that I changed two more things.
I de-cluttered the imports. It is not a good idea to load a lot of different unused functions and modules into your name space.
And I changed the start parameter estimation. We don't have to be correct here, an approximation will usually do. Something that doesn't need much code and is fast.
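To try the offset model without the original data file, here is a minimal sketch on synthetic data. The "true" parameters, the noise level, the random seed and the starting guesses are all made up for illustration, and the function is named gauss_offset only to keep it apart from the code above. It also shows one common way to get rough one-sigma parameter uncertainties from the covariance matrix that curve_fit returns.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss_offset(x, a, x0, sigma, offset):
    # Gaussian peak of height a sitting on a constant baseline
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2)) + offset

# Synthetic data on an x range similar to the question (values are invented).
rng = np.random.default_rng(0)
x = np.linspace(336.85, 336.90, 54)
true_params = (0.8, 336.87, 0.004, 0.42)
y = gauss_offset(x, *true_params) + rng.normal(0, 0.02, x.size)

# Rough starting values: peak height above baseline, peak position,
# a fraction of the x spread, and the baseline itself.
p0 = [np.max(y) - np.min(y), x[np.argmax(y)], np.std(x) / 4, np.min(y)]
popt, pcov = curve_fit(gauss_offset, x, y, p0=p0)

# Approximate one-sigma uncertainties from the covariance diagonal.
perr = np.sqrt(np.diag(pcov))
print(popt)
print(perr)
```

The starting values only need to be in the right neighbourhood; the optimizer does the rest.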
Note that you say you want to fit a 2D Gaussian distribution, but your data and your code
describe a fit to a 1D distribution.
I have very few data points and I want to create a line that best fits them when plotted on a semilogy scale. I have tried curve_fit and cubic interpolation from scipy, but neither of them looks reasonable to me compared to the trend in the data.
Could you please check whether there is a better way to create a straight-line fit for the data? Extrapolation might work, but I did not find good documentation on extrapolation in Python.
Your help is much appreciated.
import sys
import os
import numpy
import matplotlib.pyplot as plt
from pylab import *
from scipy.optimize import curve_fit
import scipy.optimize as optimization
from scipy.interpolate import interp1d
from scipy import interpolate
Mass500 = numpy.array([ 13.938 , 13.816, 13.661, 13.683, 13.621, 13.547, 13.477, 13.492, 13.237,
13.232, 13.07, 13.048, 12.945, 12.861, 12.827, 12.577, 12.518])
y500 = numpy.array([ 7.65103978e-06, 4.79865790e-06, 2.08218909e-05, 4.98385924e-06,
5.63462673e-06, 2.90785458e-06, 2.21166794e-05, 1.34501705e-06,
6.26021870e-07, 6.62368879e-07, 6.46735547e-07, 3.68589447e-07,
3.86209019e-07, 5.61293275e-07, 2.41428755e-07, 9.62491134e-08,
2.36892162e-07])
plt.semilogy(Mass500, y500, 'o')
# interpolation
f2 = interp1d(Mass500, y500, kind='cubic')
plt.semilogy(Mass500, f2(Mass500), '--')
# curve-fit
def line(x, a, b):
    return 10**(a*x+b)
#Initial guess.
x0 = numpy.array([1.e-6, 1.e-6])
print(optimization.curve_fit(line, Mass500, y500, x0))
popt, pcov = curve_fit(line, Mass500, y500)
print(popt)
plt.semilogy(Mass500, line(Mass500, popt[0], popt[1]), 'r-')
plt.legend(['data', 'cubic', 'curve-fit'], loc='best')
show()
There are many regression functions available in numpy and scipy.
scipy.stats.linregress is one of the simpler functions, and it returns common linear regression parameters.
Here are two options for fitting semi-log data:
Plot Transformed Data
Rescale Axes and Transform Input/Output Function Values
Given
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
%matplotlib inline
# Data
mass500 = np.array([
13.938 , 13.816, 13.661, 13.683,
13.621, 13.547, 13.477, 13.492,
13.237, 13.232, 13.07, 13.048,
12.945, 12.861, 12.827, 12.577,
12.518
])
y500 = np.array([
7.65103978e-06, 4.79865790e-06, 2.08218909e-05, 4.98385924e-06,
5.63462673e-06, 2.90785458e-06, 2.21166794e-05, 1.34501705e-06,
6.26021870e-07, 6.62368879e-07, 6.46735547e-07, 3.68589447e-07,
3.86209019e-07, 5.61293275e-07, 2.41428755e-07, 9.62491134e-08,
2.36892162e-07
])
Code
Option 1: Plot Transformed Data
# Regression Function
from scipy import stats
def regress(x, y):
    """Return a tuple of predicted y values and parameters for linear regression."""
    p = stats.linregress(x, y)
    b1, b0, r, p_val, stderr = p
    y_pred = np.polyval([b1, b0], x)
    return y_pred, p
# Plotting
x, y = mass500, np.log(y500) # transformed data
y_pred, _ = regress(x, y)
plt.plot(x, y, "mo", label="Data")
plt.plot(x, y_pred, "k--", label="Pred.")
plt.xlabel("Mass500")
plt.ylabel("log y500") # label axis
plt.legend()
Output
A simple approach is to plot transformed data and label the appropriate log axes.
Option 2: Rescale Axes and Transform Input/Output Function Values
Code
x, y = mass500, y500 # data, non-transformed
y_pred, _ = regress(x, np.log(y)) # transformed input
plt.plot(x, y, "o", label="Data")
plt.plot(x, np.exp(y_pred), "k--", label="Pred.") # transformed output
plt.xlabel("Mass500")
plt.ylabel("y500")
plt.semilogy()
plt.legend()
Output
A second option is to alter the axes to semi-log scales (via plt.semilogy()). Here the non-transformed data naturally appears linear. Also notice the labels represent the data as-is.
To make an accurate regression, all that remains is to transform the data passed into the regression function (via np.log(x) or np.log10(x)) so that it returns the proper regression parameters. This transformation is reversed when plotting the predicted values by using the complementary operation, i.e. np.exp(x) or 10**x.
If you want a line that will look good on log-y scale, then fit a line to the logarithms of the y-values.
def line(x, a, b):
    return a*x+b
popt, pcov = curve_fit(line, Mass500, np.log10(y500))
plt.semilogy(Mass500, 10**line(Mass500, popt[0], popt[1]), 'r-')
This is it; I only left out the cubic interpolation part which didn't seem relevant.
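The same straight-line fit in log space can be sketched with np.polyfit, which is an alternative to the curve_fit and linregress calls used in the answers here, not what they used. The data are the values from the question. The residual check is my own addition: the residuals of a least-squares line with an intercept should sum to roughly zero.

```python
import numpy as np

mass500 = np.array([
    13.938, 13.816, 13.661, 13.683, 13.621, 13.547, 13.477, 13.492,
    13.237, 13.232, 13.07, 13.048, 12.945, 12.861, 12.827, 12.577,
    12.518,
])
y500 = np.array([
    7.65103978e-06, 4.79865790e-06, 2.08218909e-05, 4.98385924e-06,
    5.63462673e-06, 2.90785458e-06, 2.21166794e-05, 1.34501705e-06,
    6.26021870e-07, 6.62368879e-07, 6.46735547e-07, 3.68589447e-07,
    3.86209019e-07, 5.61293275e-07, 2.41428755e-07, 9.62491134e-08,
    2.36892162e-07,
])

# Fit a degree-1 polynomial to log10(y); polyfit returns the highest power first.
slope, intercept = np.polyfit(mass500, np.log10(y500), 1)

# Transform the prediction back so it can be drawn with plt.semilogy.
y_fit = 10 ** (slope * mass500 + intercept)

# Residuals in log space.
residuals = np.log10(y500) - (slope * mass500 + intercept)
print(slope, intercept)
```

Plotting mass500 against y_fit with plt.semilogy gives the straight line on the semi-log axes.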
After a lot of searching without finding an answer, I chose to post my question here.
How do I fit an exponential function of the form y=(1/A)e^(-x/A) to the data shown and plot this function? I am still getting used to fitting in Python. Help will be more than appreciated!
Thank you in advance.
Looks like I figured it out.
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import scipy as sp
import scipy.optimize
def exponential_fit(x, a, c):
    """
    Exponential fit used for the MuonLab life time measurements.
    :param x:
    :param a:
    :param c:
    :return:
    """
    return (1/a)*np.exp(-x/a)+c
def logarithmic_fit_plot(x, y): # WIP
    font = {'family': 'normal',
            'weight': 'bold',
            'size': 20}
    matplotlib.rc('font', **font)
    xdata = x
    ydata = y
    plt.rc('text', usetex=True)
    plt.plot(xdata, ydata, '.', label='sample')
    popt, pcov = sp.optimize.curve_fit(exponential_fit, xdata, ydata)
    plt.plot(xdata, exponential_fit(xdata, *popt), 'r-',
             label=r"$\frac{1}{\tau_0}e^{\frac{-x}{\tau_0}}, \tau_0=%5.3f, c=%5.3f$" % tuple(popt))
    plt.legend()
    plt.show()
Sadly it doesn't fit the data very well, but that's just a math problem, I guess.
This code produces a decent fit.
first = True
lifetimes = []
counts = []
with open('Werkverkeer.txt') as w:
    next(w)
    for line in w:
        _, life, count = line.rstrip().split()
        life, count = float(life), int(count)
        if count==0:
            continue
        lifetimes.append(life-0.005)
        counts.append(count)
probs = [_/sum(counts) for _ in counts]
print (probs)
from scipy.optimize import leastsq
from scipy.stats import expon
from numpy import exp
def residual(params, X, data):
    model = [expon.cdf(x+0.005, scale=params[0])-expon.cdf(x-0.005, scale=params[0]) for x in X]
    return [d-m for (d,m) in zip(data, model)]
r = leastsq(residual, [140], args=(lifetimes, probs))
estimate = r[0][0]
print (estimate)
fitted = [expon.cdf(x+0.005, scale=estimate)-expon.cdf(x-0.005, scale=estimate) for x in lifetimes]
print(fitted)
from matplotlib import pyplot as plt
plt.plot(lifetimes, probs, 'r.')
plt.plot(lifetimes, fitted, 'b-')
plt.show()
Things to note:
Rather than fitting to counts I've fitted to normalised counts, which are estimates of probabilities, because the counts are really a way of getting at an estimate of the probability density function for the lifetimes.
Because I'm using binned counts, I need to fit the areas under the density function between the boundaries of the bins, for a given value of the parameter. Hence the line that starts with model = in residual.
As usual, the final line in residual returns the difference between the observed probabilities (based on counts) and the provisionally calculated probabilities.
leastsq returns a value of 0.0497646352872 for the parameter.
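The bin-area idea in these notes can be tried without the original data file. The following is a sketch under invented assumptions: the lifetimes are synthetic draws from an exponential distribution with a made-up scale of 2.0, and the bin edges, sample size and starting guess are arbitrary. The model for each bin is the difference of the exponential CDF at its two edges, as in the answer above.

```python
import numpy as np
from scipy.optimize import leastsq
from scipy.stats import expon

# Synthetic lifetimes (the scale of 2.0 is an invented "true" value).
rng = np.random.default_rng(1)
true_scale = 2.0
samples = rng.exponential(true_scale, size=20000)

# Bin the lifetimes and turn counts into estimated bin probabilities.
edges = np.linspace(0.0, 10.0, 51)
counts, _ = np.histogram(samples, bins=edges)
probs = counts / samples.size

def residual(params, edges, probs):
    # Probability mass in each bin = CDF(right edge) - CDF(left edge)
    cdf = expon.cdf(edges, scale=params[0])
    return probs - np.diff(cdf)

result = leastsq(residual, [1.0], args=(edges, probs))
estimate = result[0][0]
print(estimate)
```

Dividing by the total number of samples, rather than by the number that fell inside the histogram range, keeps the observed probabilities comparable to the CDF differences.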
I'm aware that there are threads pertaining to this, but I'm confused about how to fit a curve to my data.
My data is imported and plotted as such.
import matplotlib.pyplot as plt
%matplotlib inline
import pylab as plb
import numpy as np
import scipy as sp
import csv
FreqTime1 = []
DecayCount1 = []
with open('Half_Life.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        FreqTime1.append(row[0])
        DecayCount1.append(row[3])
FreqTime1 = np.array(FreqTime1)
DecayCount1 = np.array(DecayCount1)
fig1 = plt.figure(figsize=(15,6))
ax1 = fig1.add_subplot(111)
ax1.plot(FreqTime1,DecayCount1, ".", label = 'Run 1')
ax1.set_xlabel('Time (sec)')
ax1.set_ylabel('Count')
plt.legend()
The problem is that I'm having difficulty setting up a general exponential decay fit, and I'm not sure how to compute the parameter values from the data set.
If possible, I would also like the equation of the fitted decay curve to be displayed on the graph, but that should be easy to add once a fit has been produced.
Edit -------------------------------------------------------------
So when using the fitting function that Stanley R mentioned
def model_func(x, a, k, b):
    return a * np.exp(-k*x) + b
x = FreqTime1
y = DecayCount1
p0 = (1.,1.e-5,1.)
opt, pcov = curve_fit(model_func, x, y, p0)
a, k, b = opt
I'm returned with this error message
TypeError: ufunc 'multiply' did not contain a loop with signature matching types dtype('S32') dtype('S32') dtype('S32')
Any idea on how to resolve this?
You have to use curve_fit from scipy.optimize: http://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.optimize.curve_fit.html
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
import numpy as np
# define type of function to search
def model_func(x, a, k, b):
    return a * np.exp(-k*x) + b
# sample data
x = np.array([399.75, 989.25, 1578.75, 2168.25, 2757.75, 3347.25, 3936.75, 4526.25, 5115.75, 5705.25])
y = np.array([109,62,39,13,10,4,2,0,1,2])
# curve fit
p0 = (1.,1.e-5,1.) # starting coefficients for the search
opt, pcov = curve_fit(model_func, x, y, p0)
a, k, b = opt
# test result
x2 = np.linspace(250, 6000, 250)
y2 = model_func(x2, a, k, b)
fig, ax = plt.subplots()
ax.plot(x2, y2, color='r', label='Fit. func: $f(x) = %.3f e^{-%.3f x} %+.3f$' % (a,k,b))
ax.plot(x, y, 'bo', label='data with noise')
ax.legend(loc='best')
plt.show()
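If it is unclear where starting values should come from, they can be estimated roughly from the data itself. The sketch below uses the sample data from this answer. The heuristics for a0, k0 and b0 are my own assumptions, and the half-life line assumes that the quantity of interest is ln(2)/k, which is only suggested by the file name Half_Life.csv.

```python
import numpy as np
from scipy.optimize import curve_fit

def model_func(x, a, k, b):
    return a * np.exp(-k * x) + b

x = np.array([399.75, 989.25, 1578.75, 2168.25, 2757.75,
              3347.25, 3936.75, 4526.25, 5115.75, 5705.25])
y = np.array([109, 62, 39, 13, 10, 4, 2, 0, 1, 2], dtype=float)

# Heuristic starting values (assumptions, not part of the original answer):
b0 = y.min()                              # baseline is about the smallest count
a0 = y.max() - b0                         # amplitude above the baseline
half_idx = np.argmax(y < b0 + a0 / 2)     # first sample below half the amplitude
k0 = np.log(2) / (x[half_idx] - x[0])     # decay rate implied by that half time

popt, pcov = curve_fit(model_func, x, y, p0=(a0, k0, b0))
a, k, b = popt

# Half-life implied by the fitted decay rate, in the units of x.
half_life = np.log(2) / k
print(a, k, b, half_life)
```

Starting values derived from the data are usually close enough that curve_fit converges without manual tuning.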
"I'm returned with this error message
TypeError: ufunc 'multiply' did not contain a loop with signature matching types dtype('S32') dtype('S32') dtype('S32')
Any idea on how to resolve this?"
Your code that reads the CSV file to create FreqTime1 and DecayCount1 is creating arrays of strings, because csv.reader returns every field as a string. You can fix that by following the suggestion that @StanleyR made in a comment. A better idea is to replace this code:
FreqTime1 = []
DecayCount1 = []
with open('Half_Life.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        FreqTime1.append(row[0])
        DecayCount1.append(row[3])
FreqTime1 = np.array(FreqTime1)
DecayCount1 = np.array(DecayCount1)
with:
FreqTime1, DecayCount1 = np.loadtxt('Half_Life.csv', delimiter=',', usecols=(0, 3), unpack=True)
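The difference between the two approaches can be sketched with an in-memory file. The CSV content below is invented for illustration and only mimics the layout implied by the question (time in column 0, count in column 3).

```python
import io
import numpy as np

# Hypothetical CSV content; the real Half_Life.csv is not shown in the question.
csv_text = "0.5,0,0,120\n1.0,0,0,95\n1.5,0,0,70\n"

# np.loadtxt parses the selected columns straight into float arrays.
time, count = np.loadtxt(io.StringIO(csv_text), delimiter=',',
                         usecols=(0, 3), unpack=True)
print(time.dtype, count.dtype)

# Values collected as strings produce a string array, which cannot be
# used in arithmetic until it is converted explicitly.
as_strings = np.array(["0.5", "1.0"])
as_floats = as_strings.astype(float)
print(as_strings.dtype.kind, as_floats.dtype)
```

If the arrays have already been built from strings, calling astype(float) on them is enough to make curve_fit accept them.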