Please tell me how to determine the unknown parameters of a calculated curve with scipy optimization, given an experimental curve as input. I need to determine the unknown parameters a, b, c (in the code below) of the calculated curve so that the standard-deviation functional is minimal.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from math import pi
def func(a,b,c):
    return -a/(2*np.tan(c*(pi/2)))+np.sqrt(b+(a**2)/(4*np.tan((c)*(pi/2))))
file = 'experimental_curve.txt'  # placeholder: path to the experimental curve (.txt file)
pd_file = pd.read_csv(file, sep=r"\s+", header=None, names=['frequence', 'y'],
                      skiprows=1)
xdata=pd_file['frequence']
ydata=pd_file['y']
popt, pcov = curve_fit(func, xdata, ydata, p0=[0.6,1], maxfev=500000000)
print('popt',popt)
I do not think your functional form is suitable for fitting the data you have. After some experimentation may I suggest a different one:
def func2(x, b, d):
    return 0.2 / (1 + b * x + d * np.log(1 + x))
file='chi_strich_strich_H0.txt'
pd_file=pd.read_csv(file, sep=r"\s+",header=None,names=['frequence', 'y'],
                    skiprows=1)
xdata=pd_file['frequence']
ydata=pd_file['y']
popt, pcov = curve_fit(func2, xdata, ydata, p0=[0,0], maxfev=500000000)
print('popt',popt)
yfit = func2(xdata,popt[0], popt[1])
plt.plot(xdata, ydata, '.', label = 'data')
plt.plot(xdata, yfit, '-', label = 'fit')
plt.legend(loc = 'best')
plt.show()
popt: [ 1.84672386e-05 -7.69652828e-02]
The fit is shown in this plot: [plot not included]
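As a rough sketch of what a "minimal standard deviation functional" means here: curve_fit expects a model of the form f(x, *params), with the independent variable as the first argument, and minimizes the sum of squared residuals. The helper rms_residual and the synthetic x grid below are assumptions for illustration, not part of the original post, and the curve is generated from the reported popt rather than read from the data file.

```python
import numpy as np

def func2(x, b, d):
    # Model from the answer: independent variable first, then the parameters
    return 0.2 / (1 + b * x + d * np.log(1 + x))

def rms_residual(model, x, y, params):
    """Root-mean-square deviation between the data and the model (hypothetical helper)."""
    residuals = y - model(x, *params)
    return np.sqrt(np.mean(residuals ** 2))

# Synthetic stand-in for the experimental curve (assumption)
x = np.linspace(1.0, 1000.0, 50)
popt = np.array([1.84672386e-05, -7.69652828e-02])  # values reported in the answer
y = func2(x, *popt)

# The functional is zero at the generating parameters and grows when they move away
rms_at_popt = rms_residual(func2, x, y, popt)
rms_perturbed = rms_residual(func2, x, y, popt * 1.1)
print(rms_at_popt, rms_perturbed)
```

With real data, the same helper can be called with the popt returned by curve_fit to compare candidate functional forms.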
I'm trying to plot a histogram in Python by importing data from an Excel file.
The histogram also needs to be fitted with an exponential function.
How can I do the plotting and the fitting?
For plotting, just use plt.hist with your data:
import random
import matplotlib.pyplot as plt
# data for test
data = [random.randint(1,20) for i in range(20)]
n, x, _ = plt.hist(data)
bin_centers = 0.5*(x[1:]+x[:-1])
plt.plot(bin_centers,n);
For fitting, you can extract the bin centers and fit them with curve_fit:
import numpy as np
from scipy.optimize import curve_fit
# some exponential function
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
popt, pcov = curve_fit(func, bin_centers, n, bounds=(0, [3., 1., 0.5]))
# bounds are variable, so you can change them as you wish
plt.plot(bin_centers, n, label='data')
plt.plot(bin_centers, func(bin_centers, *popt), label='fit')
plt.legend()
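A self-contained variant of the steps above, offered as a sketch: it assumes the data is already in a NumPy array (reading the Excel file, for instance with pandas.read_excel, is left out), uses synthetic exponentially distributed samples in place of real data, and drops the constant offset c for simplicity. np.histogram returns the same kind of counts and bin edges that plt.hist does.

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_decay(x, a, b):
    # Exponential model without offset (a simplification of func above)
    return a * np.exp(-b * x)

# Synthetic stand-in for the measured data: exponential with rate 0.5
rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=5000)

# density=True normalizes the histogram so that it approximates the pdf
counts, edges = np.histogram(data, bins=30, range=(0.0, 10.0), density=True)
bin_centers = 0.5 * (edges[1:] + edges[:-1])

# Fit the model to the bin heights, starting from a rough initial guess
popt, pcov = curve_fit(exp_decay, bin_centers, counts, p0=(1.0, 1.0))
a_fit, b_fit = popt
print('a = %.3f, b = %.3f' % (a_fit, b_fit))
```

The fitted b estimates the decay rate of the underlying distribution; the bin count and range are free choices and may need adjusting for real data.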
I am trying to fit an exponential CDF to my data to see if it is a good fit/develop an equation from the fit, but am not sure how since I think scipy.stats fits the PDF, not the CDF. If I have the data below:
eta = [1,0.5,0.3,0.25,0.2];
q = [1e-9,9.9981e-10,9.9504e-10,9.7905e-10,9.492e-10];
How do I fit an exponential CDF to the data? Or how do I find the distribution that fits the data best?
You can define a general exp function, and use curve_fit from scipy.optimize:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
def exp_func(x, a, b, c):
    return a * np.exp(-b * x) + c
eta = np.array([1,0.5,0.3,0.25,0.2])
cdf = np.array([1e-9,9.9981e-10,9.9504e-10,9.7905e-10,9.492e-10])
popt, pcov = curve_fit(exp_func, eta, cdf)
plt.plot(eta, cdf)
plt.plot(eta, exp_func(eta, *popt), 'r-', label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))
plt.legend()
plt.show()
And you'll get an exp function which is very similar to your values:
From the fitted parameters, you can see the function is y=np.exp(-19.213 * x).
Update:
If you want to make sure this is really a CDF function, you'll need to calculate the pdf (by taking the derivative):
x = np.linspace(0, 1, 1000)
cdf_fit = exp_func(x, *popt)
cdf_diff = np.r_[cdf_fit[0], np.diff(cdf_fit)]
You can do a sanity check:
plt.plot(x, np.cumsum(cdf_diff))
And then use scipy to fit the pdf to an exponential distribution:
from scipy.stats import expon
params = expon.fit(cdf_diff)
pdf_fit = expon.pdf(x, *params)
I must warn you that something doesn't add up: pdf_fit doesn't align with cdf_diff. Maybe your CDF isn't a real distribution function? The last value of a CDF should be 1.
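One way to approach the original question more directly, sketched under assumptions: if raw samples are available, an exponential CDF F(x) = 1 - exp(-lam * x) can be fitted to the empirical CDF with curve_fit. Note that scipy.stats.expon.fit expects the raw samples themselves rather than pdf or CDF values, which may be one reason the check above does not align. The samples below are synthetic and the name lam is illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import expon

def exp_cdf(x, lam):
    # CDF of an exponential distribution with rate lam; tends to 1 for large x
    return 1.0 - np.exp(-lam * x)

# Synthetic samples with true rate 2.0 (assumption: stands in for real data)
rng = np.random.default_rng(1)
samples = rng.exponential(scale=1.0 / 2.0, size=2000)

# Empirical CDF: fraction of samples less than or equal to each sorted value
xs = np.sort(samples)
ecdf = np.arange(1, xs.size + 1) / xs.size

# Least-squares fit of the CDF model to the empirical CDF
(lam_cdf,), _ = curve_fit(exp_cdf, xs, ecdf, p0=[1.0])

# Maximum-likelihood fit on the raw samples; floc=0 pins the location at zero
loc, scale = expon.fit(samples, floc=0)
lam_mle = 1.0 / scale
print(lam_cdf, lam_mle)
```

If the two rate estimates disagree strongly on real data, that is a hint that the exponential model may not describe the data well.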
For the following Python script, when I add bounds to the curve_fit call, the resulting fit is completely different and visibly wrong, even though the fitted parameter lies within the bounds both before and after the bounds are added. Why would this happen?
Here's a link to the data: https://drive.google.com/file/d/0Bwb0PrDn9o3KZ0lOa1FVZldjV0k/view?usp=sharing
import numpy as np
import matplotlib.pyplot as plt
from numpy import loadtxt, sqrt
from scipy.optimize import curve_fit #for least squares curve fit
from scipy import special #for erfc function
plt.rcParams.update({'font.family': "Times New Roman"})
plt.rcParams.update({'font.size': 12})
filename = 'Cr3.csv'
C_b = 17 #base concentration
t_hours = 451
t = t_hours * 3600 #451 hours = 1623600 seconds
data = loadtxt(filename, delimiter=',')
xdata = data[:, 0] #positions
ydata = data[:, 1] #concentration
corr = data[0, 2] #the correction value is manually measured in imagej
xdata = xdata - corr
def func(x, D):
    return C_b/2 * special.erfc(x/(2 * sqrt(D * t))/1e6) #correction for um to m
fig = plt.figure()
plt.plot(xdata, ydata, 'b-', label='data')
popt, pcov = curve_fit(func, xdata, ydata, p0=1e-16)#, bounds=(0,1))
perr = np.sqrt(np.diag(pcov))
plt.plot(xdata, func(xdata, *popt), 'r-',
         label='fit: D = %.2e' % tuple(popt))#, z = %5.3f
plt.xlabel('x (μm)')
plt.ylabel('Cr (wt%)')
plt.legend()
plt.show()
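A possible explanation, offered as a hedged sketch rather than a confirmed diagnosis: according to the curve_fit documentation, the default method is 'lm' for unconstrained problems and 'trf' when bounds are provided, so adding bounds changes the optimizer even if the solution lies inside them. With a parameter of order 1e-16 and bounds of (0, 1), the bounded solver may handle the scale poorly. One common workaround is to fit a rescaled parameter of order one. The synthetic profile, D_SCALE, and the chosen bounds below are assumptions.

```python
import numpy as np
from scipy import special
from scipy.optimize import curve_fit

C_b = 17.0            # base concentration, as in the question
t = 451 * 3600        # diffusion time in seconds
D_SCALE = 1e-16       # assumed order of magnitude of D

def func_scaled(x_um, d_scaled):
    # d_scaled is D in units of D_SCALE, so the optimizer sees a value near 1
    D = d_scaled * D_SCALE
    return C_b / 2 * special.erfc(x_um * 1e-6 / (2 * np.sqrt(D * t)))

# Synthetic profile standing in for Cr3.csv (assumption), generated with D = 3e-16
rng = np.random.default_rng(2)
xdata = np.linspace(-100.0, 100.0, 201)   # positions in micrometres
D_true = 3e-16
ydata = func_scaled(xdata, D_true / D_SCALE) + rng.normal(0.0, 0.1, xdata.size)

# Bounds are expressed in the same rescaled units as the parameter
popt, pcov = curve_fit(func_scaled, xdata, ydata, p0=[1.0], bounds=(1e-3, 1e3))
D_fit = popt[0] * D_SCALE
print('D = %.2e' % D_fit)
```

Keeping the fitted parameter near unity is a design choice that avoids relying on the solver's internal scaling; whether it resolves the issue for the actual data set is untested here.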
I would like to find and plot a function f that represents a curve fitted to a set of points, x and y, that I already know.
After some research I started experimenting with scipy.optimize and curve_fit, but the reference guide says that curve_fit fits a given function to the data and assumes ydata = f(xdata, *params) + eps.
So my question is this: what do I have to change in my code to use curve_fit, or any other library, to find the function of the curve from my set points? (Note: I want to know the function as well, so I can integrate it later for my project and plot it.) I know that it's going to be a decaying exponential function, but I don't know the exact parameters. This is what I tried in my program:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
xdata = np.array([0.2, 0.5, 0.8, 1])
ydata = np.array([6, 1, 0.5, 0.2])
plt.plot(xdata, ydata, 'b-', label='data')
popt, pcov = curve_fit(func, xdata, ydata)
plt.plot(xdata, func(xdata, *popt), 'r-', label='fit')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
I am currently developing this project on a Raspberry Pi, if that changes anything. I would like to use the least squares method since it is precise, but any other method that works well is welcome.
Again, this is based on the scipy reference guide. Also, I get the following graph, which is not even a curve (image: graph and curve based on set points).
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
#c is a constant so taking the derivative makes it go to zero
def deriv(x, a, b, c):
    return -a * b * np.exp(-b * x)
#Integrating gives you another c coefficient (offset) let's call it c1 and set it equal to zero by default
def integ(x, a, b, c, c1 = 0):
    return -a/b * np.exp(-b * x) + c*x + c1
#There are only 4 (x,y) points here
xdata = np.array([0.2, 0.5, 0.8, 1])
ydata = np.array([6, 1, 0.5, 0.2])
#curve_fit already uses "non-linear least squares to fit a function, f, to data"
popt, pcov = curve_fit(func, xdata, ydata)
a,b,c = popt #these are the optimal parameters for fitting your 4 data points
#Now get more x values to plot the curve along so it looks like a curve
step = 0.01
fit_xs = np.arange(min(xdata),max(xdata),step)
#Plot the results
plt.plot(xdata, ydata, 'bx', label='data')
plt.plot(fit_xs, func(fit_xs,a,b,c), 'r-', label='fit')
plt.plot(fit_xs, deriv(fit_xs,a,b,c), 'g-', label='deriv')
plt.plot(fit_xs, integ(fit_xs,a,b,c), 'm-', label='integ')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
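Since the question mentions integrating the fitted function later, here is a small sketch of a definite integral computed from the antiderivative used in the answer and cross-checked against numerical integration with scipy.integrate.quad. The parameter values and integration limits are illustrative placeholders, not the result of the fit above.

```python
import numpy as np
from scipy.integrate import quad

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

def integ(x, a, b, c, c1=0):
    # Antiderivative of func; c1 is the integration constant
    return -a / b * np.exp(-b * x) + c * x + c1

# Illustrative parameters (assumption); in practice use a, b, c = popt
a, b, c = 20.0, 6.0, 0.1
x0, x1 = 0.2, 1.0

# Definite integral from the antiderivative (c1 cancels out)
area_exact = integ(x1, a, b, c) - integ(x0, a, b, c)

# Numerical cross-check; quad returns (value, estimated absolute error)
area_numeric, abs_err = quad(func, x0, x1, args=(a, b, c))
print(area_exact, area_numeric)
```

The closed form is cheaper and exact for this model, while quad also works for fitted functions that have no simple antiderivative.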
I am trying to fit Gaussian data to a specific three-term Gaussian (in which the amplitude of one term is equal to twice the standard deviation of the next term). Here is my attempt:
import numpy as np
#from scipy.optimize import curve_fit
import scipy.optimize as optimize
import matplotlib.pyplot as plt
#r=np.linspace(0.0e-15,4e-15, 100)
data = np.loadtxt('V_lambda_n.dat')
r = data[:, 0]
V = data[:, 1]
def func(x, ps1, ps2, ps3, ps4):
    return ps1*np.exp(-(x/ps2)**2) + ps2*np.exp(-(x/ps3)**2) + ps3*np.exp(-(x/ps4)**2)
popt, pcov = optimize.curve_fit(func, r, V, maxfev=10000)
#params = optimize.curve_fit(func, ps1, ps2, ps3, ps4)
#[ps1, ps2, ps2, ps4] = params[0]
p1=plt.plot(r, V, 'bo', label='data')
p2=plt.plot(r, func(r, *popt), 'r-', label='fit')
plt.xticks(np.linspace(0, 4, 9, endpoint=True))
plt.yticks(np.linspace(-50, 150, 9, endpoint=True))
plt.show()
Here is the result:
How may I fix this code to improve the fit? Thanks
With help from the scipy-user forum, I tried the following initial guess:
p0=[V.max(), std_dev, V.max(), 2]
The fit got a lot better. The new fit is as shown
I hope the fit could get better than this.
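The std_dev used in the initial guess is not defined in the thread. As a hypothetical sketch of how such starting values could be estimated from the data, the snippet below takes the amplitude from the maximum and the width from the point where a single-term curve amp * exp(-(x/width)**2) falls to amp/e (which happens at x = width), then refines both with curve_fit. The synthetic single-Gaussian data is an assumption and is simpler than the three-term model above.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, amp, width):
    # Single term with the same shape as each term in the three-term model
    return amp * np.exp(-(x / width) ** 2)

# Synthetic noise-free data (assumption): amplitude 100, width 1.5
r = np.linspace(0.0, 4.0, 401)
V = gauss(r, 100.0, 1.5)

# Amplitude guess: the largest value in the data
amp_guess = V.max()

# Width guess: first r where V drops below amp/e, since gauss(width) == amp/e
below = np.nonzero(V < amp_guess / np.e)[0]
width_guess = r[below[0]]

# Refine the rough guesses with a least-squares fit
popt, pcov = curve_fit(gauss, r, V, p0=[amp_guess, width_guess])
print('guess:', amp_guess, width_guess, 'fit:', popt)
```

For the three-term model, estimates like these could seed p0 in place of arbitrary constants, though sums of Gaussians centred at the same point are often poorly conditioned, so a good fit is not guaranteed.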