How do I get curve_fit to fit the data? - python

I can't figure out why my curve_fit result is not following the data.
import numpy as np
import matplotlib.pyplot as plt
import scipy.io as sio
import math
from scipy.stats import binom, poisson, norm
from scipy.optimize import curve_fit
AD1 = sio.loadmat('ATLAS_DATA1' ,squeeze_me=True,mat_dtype = True)
locals().update({k :AD1[k]for k in ['n','e']})
xData = e
yData = n
yErr = 0
plt.xlabel("e (GeV)")
plt.ylabel("Number of events (n)")
plt.errorbar(xData,yData,yErr,marker = '.',linestyle = '')
plt.show()
def func(e, a, b):
    return a * np.exp(-b * e)
xDat = e
yDat = func(xDat, 2, 1)
popt, pcov = curve_fit(func, xDat, yDat)
plt.plot(xDat, func(xDat, *popt))
plt.show()
Below is my data, with n at the top and e at the bottom.
[Image: data for n and e]
[Image: graph of the data that I want to fit]

import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as optimize
from scipy.optimize import curve_fit
Write your xData and yData as NumPy arrays, as follows (I used a sample of your data):
xData =np.array([383,358,326,366,335,331,308,299,303,325,306,299,270,282,253,265,248,256,220,208,252,215,220,237,204,213,224,212])
yData = np.array([101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,124,128])
plt.xlabel("e (GeV)")
plt.ylabel("Number of events (n)")
plt.scatter(xData,yData)
plt.show()
Here's the original data: [image not included]
It's better to use plt.scatter than plt.errorbar.
I found this equation works better for your curve:
def func(x, a, c, d):
    return a*np.exp(-c*x)+d
The same goes for xDat (write it as an np.array):
xDat = xData
yDat = yData  # fit the measured values, not values generated by func
plt.scatter(xDat, yDat)
popt, pcov = curve_fit(func, xData, yData, p0=(10, 1e-6, 100))  # the initial-guess keyword is p0
xFit = np.sort(xDat)  # sorted x so the fitted curve is drawn as a single line
plt.plot(xFit, func(xFit, *popt))
plt.show()
Tip: Don't use lowercase e as a variable name, because e usually denotes Euler's number, e ≈ 2.718.
UPDATE:
If you want to use your original function, here's the code:
def func(e, a, b):
    return a * np.exp(-b * e)
popt, pcov = curve_fit(func, xData, yData, p0=[10, -0.00001])
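Why the starting values matter here: when no p0 is given, curve_fit starts every parameter at 1, and with x values in the hundreds np.exp(-1 * x) is practically zero, so the optimizer has almost nothing to work with. The following is a minimal sketch, assuming the orientation from the question (x is e, y is n) and reusing the sample values above. The log-linear np.polyfit step for choosing p0 and the names energy and events are illustrations added here, not part of the original answer.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b):
    # Plain exponential decay, as in the question.
    return a * np.exp(-b * x)

# Sample values copied from the answer above, oriented as in the question:
# x = e (energy), y = n (number of events).
energy = np.array([101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112,
                   113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124,
                   125, 126, 124, 128], dtype=float)
events = np.array([383, 358, 326, 366, 335, 331, 308, 299, 303, 325, 306, 299,
                   270, 282, 253, 265, 248, 256, 220, 208, 252, 215, 220, 237,
                   204, 213, 224, 212], dtype=float)

# ln(y) = ln(a) - b*x is a straight line, so a linear fit of ln(y) against x
# gives a reasonable starting guess for a and b.
slope, intercept = np.polyfit(energy, np.log(events), 1)
p0 = [np.exp(intercept), -slope]

popt, pcov = curve_fit(func, energy, events, p0=p0)
perr = np.sqrt(np.diag(pcov))  # one-sigma parameter uncertainties
print("a = %.3g, b = %.3g" % tuple(popt))
```

The fitted curve can then be drawn with plt.plot(energy, func(energy, *popt)) on top of the scatter plot of the data.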

Related

Overfitting with curve.fit

Can anyone help me with a fitting issue using curve_fit? I would like to fit my data to a second-order equation, but the result I obtain looks like a linear equation.
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    f = a*np.power(x, 2) + b*x + c
    return f
xdata_prime=[3.0328562996216282, 3.101784841139168, 3.1707134502066894, 3.2396419917242292, 3.308570533241769, 3.3774990747593088, 3.3774990747593088, 3.4337789932367149, 3.4900589392912855, 3.5463388577686916, 3.6026187762460977, 3.6588987223006684]
ydata_prime=[6.344300000000002, 6.723900000000002, 7.080399999999999, 7.399800000000001, 7.649099999999999, 7.753100000000002, 7.753100000000002, 7.658600000000002, 7.442100000000002, 7.180100000000001, 6.902700000000001, 6.6211]
plt.plot(xdata_prime, ydata_prime, 'b-', label='data')
popt, pcov = curve_fit(func, xdata_prime, ydata_prime)
popt
plt.plot(xdata_prime, func(xdata_prime, *popt), 'r-',label='fit')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
Your arrays need to be NumPy arrays because your function performs vectorized operations (namely a*np.power(x, 2)). With that change, your code will work:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    f = a*np.power(x, 2) + b*x + c
    return f
xdata_prime=np.array([3.0328562996216282, 3.101784841139168, 3.1707134502066894, 3.2396419917242292, 3.308570533241769, 3.3774990747593088, 3.3774990747593088, 3.4337789932367149, 3.4900589392912855, 3.5463388577686916, 3.6026187762460977, 3.6588987223006684])
ydata_prime=np.array([6.344300000000002, 6.723900000000002, 7.080399999999999, 7.399800000000001, 7.649099999999999, 7.753100000000002, 7.753100000000002, 7.658600000000002, 7.442100000000002, 7.180100000000001, 6.902700000000001, 6.6211])
plt.plot(xdata_prime, ydata_prime, 'b-', label='data')
popt, pcov = curve_fit(func, xdata_prime, ydata_prime)
plt.plot(xdata_prime, func(xdata_prime, *popt), 'r-',label='fit')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
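Since the model is a plain second-order polynomial, np.polyfit offers a cross-check that needs no starting values. This is a minimal sketch using the data above; the short names x and y are illustrative, and both approaches should describe essentially the same least-squares parabola.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    return a * np.power(x, 2) + b * x + c

x = np.array([3.0328562996216282, 3.101784841139168, 3.1707134502066894,
              3.2396419917242292, 3.308570533241769, 3.3774990747593088,
              3.3774990747593088, 3.4337789932367149, 3.4900589392912855,
              3.5463388577686916, 3.6026187762460977, 3.6588987223006684])
y = np.array([6.344300000000002, 6.723900000000002, 7.080399999999999,
              7.399800000000001, 7.649099999999999, 7.753100000000002,
              7.753100000000002, 7.658600000000002, 7.442100000000002,
              7.180100000000001, 6.902700000000001, 6.6211])

# Nonlinear least squares with the user-defined model.
popt, pcov = curve_fit(func, x, y)

# Direct polynomial fit; coefficients come highest power first: [a, b, c].
coeffs = np.polyfit(x, y, 2)

print("curve_fit:", popt)
print("polyfit:  ", coeffs)
```

If the two sets of coefficients agree, the fit itself is fine and any remaining mismatch with the data is a limitation of the quadratic model.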

How do I solve the error in plot variable dimension mismatch?

I am trying to fit data generated using formula-1 with formula-2. The former has 3 parameters, whereas the latter has 5 fitting parameters. But now I get an error when plotting the fitted curve, due to a shape mismatch.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c, d, e):
    return (((a/e) * (2*x)**b) + (d * (2*x)**c))
y = []
x = []
A = 6.7
B = 2.0
C = 0.115
for N in np.logspace(1, 9., 100, base = 10.):
    x.append(int(N))
    y.append(np.exp((A-np.log(int(N)))/B)+C)
plt.loglog(x, y, 'b:*', label='data')
popt, pcov = curve_fit(func, x, y)
print(popt)
plt.loglog(x, func(x, *popt))
I would like to see the fitted curve, but there's a dimension error in the last line, plt.loglog(x, func(x, *popt)).
One way to do this is to create a list y_model to which you append the y value corresponding to each x.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c, d, e):
    return (((a/e) * (2*x)**b) + (d * (2*x)**c))
y = []
x = []
A = 6.7
B = 2.0
C = 0.115
for N in np.logspace(1, 9., 100, base = 10.):
    x.append(int(N))
    y.append(np.exp((A-np.log(int(N)))/B)+C)
popt, pcov = curve_fit(func, x, y)
y_model = []
for e in x:
    y_model.append(func(e, *popt))
plt.loglog(x, y, 'b:*', label='data')
plt.loglog(x, y_model)
Result: [image not included]
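The mismatch comes from x being a plain Python list: 2*x repeats the list (200 elements) instead of doubling each value. Here is an alternative sketch, assuming it is acceptable to hold x and y in NumPy arrays. The parameter values passed to func below are placeholders for illustration, not fit results.

```python
import numpy as np

def func(x, a, b, c, d, e):
    return (a / e) * (2 * x) ** b + d * (2 * x) ** c

A, B, C = 6.7, 2.0, 0.115

# Same grid as in the question (truncated to whole numbers), built as an array.
x = np.logspace(1, 9, 100).astype(int).astype(float)
y = np.exp((A - np.log(x)) / B) + C

x_list = x.tolist()
print(len(2 * x_list))  # list repetition: 200 elements
print((2 * x).shape)    # elementwise doubling: (100,)

# With array input, func returns one value per x, so after
# popt, pcov = curve_fit(func, x, y) the call plt.loglog(x, func(x, *popt))
# receives matching shapes.
y_model = func(x, 1.0, -0.5, 0.0, 0.1, 1.0)  # placeholder parameters
print(y_model.shape)    # (100,)
```

This avoids the explicit loop while producing the same kind of y_model as the list-based version.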

When bounds are added, the result of scipy.optimize.curve_fit varies when it shouldn't

For the following python script, when I add bounds to the curve_fit function, the resulting curve fit is completely different and visibly wrong, even though the parameter that is adjusted for the fit is within the bounds both before and after the bounds are added to the code. Why would this happen?
Here's a link to the data: https://drive.google.com/file/d/0Bwb0PrDn9o3KZ0lOa1FVZldjV0k/view?usp=sharing
import numpy as np
import matplotlib.pyplot as plt
from numpy import loadtxt, sqrt
from scipy.optimize import curve_fit #for least squares curve fit
from scipy import special #for erfc function
plt.rcParams.update({'font.family': "Times New Roman"})
plt.rcParams.update({'font.size': 12})
filename = 'Cr3.csv'
C_b = 17 #base concentration
t_hours = 451
t = t_hours * 3600 #451 hours = 1623600 seconds
data = loadtxt(filename, delimiter=',')
xdata = data[:, 0] #positions
ydata = data[:, 1] #concentration
corr = data[0, 2] #the correction value is manually measured in imagej
xdata = xdata - corr
def func(x, D):
    return C_b/2 * special.erfc(x/(2 * sqrt(D * t))/1e6) #correction for um to m
fig = plt.figure()
plt.plot(xdata, ydata, 'b-', label='data')
popt, pcov = curve_fit(func, xdata, ydata, p0=1e-16)#, bounds=(0,1))
perr = np.sqrt(np.diag(pcov))
plt.plot(xdata, func(xdata, *popt), 'r-',
         label='fit: D = %.2e' % tuple(popt))#, z = %5.3f
plt.xlabel('x (μm)')
plt.ylabel('Cr (wt%)')
plt.legend()
plt.show()
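One possible explanation, offered as an assumption rather than a diagnosis of this data set: according to the SciPy documentation, curve_fit uses the 'lm' method for unconstrained problems and switches to 'trf' when bounds are given, so adding bounds changes the solver even when the optimum lies inside them. A parameter of order 1e-16 is also very small compared with bounds of (0, 1). The sketch below fits D in units of 1e-16 on synthetic, noise-free data; the x grid, the true value, the bounds, and the names func_scaled and SCALE are made up for illustration.

```python
import numpy as np
from scipy import special
from scipy.optimize import curve_fit

C_b = 17        # base concentration, as in the question
t = 451 * 3600  # diffusion time in seconds
SCALE = 1e-16   # assumed order of magnitude of D

def func_scaled(x, D_scaled):
    # Same model as in the question, with D = D_scaled * SCALE.
    D = D_scaled * SCALE
    return C_b / 2 * special.erfc(x / (2 * np.sqrt(D * t)) / 1e6)

# Synthetic stand-in for the CSV data: positions in um, true D = 1e-16.
xdata = np.linspace(-50, 50, 101)
ydata = func_scaled(xdata, 1.0)

# The fitted parameter is now of order 1, and so are the bounds.
popt, pcov = curve_fit(func_scaled, xdata, ydata, p0=2.0, bounds=(0.01, 100))
D_fit = popt[0] * SCALE
print("D = %.2e" % D_fit)
```

Whether this rescaling resolves the behaviour seen with Cr3.csv would need to be checked against the actual data.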

curve_fit() using python

def model(A, x, mu, sigma):
    return A*exp(-((x-mu)**2)/(2*sigma**2))
from scipy.optimize import curve_fit
mu=np.mean(d_spacing_2)
sigma=np.std(d_spacing_2)
f=intensity_2
x=d_spacing_2
popt, pcov = curve_fit(model, A, x, mu, sigma)
TypeError: model() missing 2 required positional arguments: 'mu' and 'sigma'
You are using curve_fit totally wrong. Here is a working example from the help of curve_fit, with some additional plotting:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
xdata = np.linspace(0, 4, 50)
y = func(xdata, 2.5, 1.3, 0.5)
ydata = y + 0.2 * np.random.normal(size=len(xdata))
popt, pcov = curve_fit(func, xdata, ydata,p0=[2,1,1])
plt.ion()
plt.plot(xdata,ydata,'o')
xplot = np.linspace(0,4,100)
plt.plot(xplot,func(xplot,*popt))
The first input argument of curve_fit is the function, the second is the x values of the data, and the third is the y values. The function itself must take x as its first argument, followed by the parameters to fit. You should normally also use the optional input argument p0, which is an initial guess for the solution.
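Applied to the Gaussian model from the question, the call could look like the sketch below. The arrays x and f are synthetic stand-ins for d_spacing_2 and intensity_2, which are not shown in the question, and the way p0 is estimated from the peak is an assumption made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, A, mu, sigma):
    # x comes first; the parameters to fit follow.
    return A * np.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Synthetic stand-ins for d_spacing_2 (x) and intensity_2 (f).
rng = np.random.default_rng(0)
x = np.linspace(1.0, 3.0, 200)
f = model(x, 5.0, 2.0, 0.2) + 0.05 * rng.normal(size=x.size)

# Rough initial guess: peak height, peak position, a quarter of the x range.
p0 = [f.max(), x[np.argmax(f)], (x.max() - x.min()) / 4]

# Pass the function, then x, then y; the parameters are returned in popt.
popt, pcov = curve_fit(model, x, f, p0=p0)
A_fit, mu_fit, sigma_fit = popt
print(A_fit, mu_fit, abs(sigma_fit))
```

Because sigma only enters the model squared, its sign is not determined by the fit, hence the abs() when reporting it.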

sigmoidal curve with semilogx SciPy/Python

I have been working on fitting a negatively sloped sigmoidal trendline to a set of data. I have only been working with Python for a week, so sorry for the sloppy code. I have two sets of code which plot the data; however, I cannot get the sigmoid curve to show up as well.
from numpy import *
from matplotlib.pyplot import *
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def sigmoid(x, x0, k):
    y = 1 / (1 + np.exp(-(-k*(x-x0))))
    return y
x = [0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001]
y = [0.649097038, 0.682633434, 0.705470344, 0.749350609, 0.989377822, 0.972679201]
coefficients = np.polyfit(x, y, 2)
polynomial = poly1d(coefficients)
xs = arange(0.000001, 0, 0.1)
ys = polynomial(xs)
curve_fit(sigmoid, x, y)
semilogx()
np.polyfit(x, y, 3, rcond=None, full=False, w=None, cov=False)
plot(x, y, 'o')
plot(xs, ys)
ylabel('Cell Viability')
xlabel('Concentration mM')
show()
.
import numpy as np
import pylab
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
def sigmoid(x, x0, k):
    y = 1 / (1 + np.exp(-(-k*(x-x0))))
    return y
xdata = np.array([0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001])
ydata = np.array([0.649097038, 0.682633434, 0.705470344, 0.749350609, 0.989377822, 0.972679201])
popt, pcov = curve_fit(sigmoid, xdata, ydata)
print(popt)
x = np.linspace(-10, 1, 50)
y = sigmoid(x, *popt)
semilogx()
pylab.plot(xdata, ydata, 'o', label='data')
pylab.plot(x,y, label='fit')
pylab.ylim(0, 1.05)
pylab.legend(loc='best')
pylab.show()
There are a number of issues with your two code pieces - some of which Ajean has hinted at. Let's carefully review what there is and what problems that causes.
1st Code Block
Discard the first two lines and use only:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
Now, instead of poly1d, you'll have to call np.poly1d; instead of semilogx() it's plt.semilogx(); plot, xlabel, ylabel and show become plt.plot, etc.
Next, your use of arange returns an empty array. Instead, try this:
np.arange(0.000001, 0.1, 0.000001)
From curve_fit you should actually store the return values, as your second code does:
popt, pcov = curve_fit(sigmoid, x, y)
Next, use sigmoid to generate new y-values:
ysig = sigmoid(x,*popt)
If now you include an additional plot statement at the bottom, e.g.:
plt.plot(x,ysig,'g')
the output will be something like this: [image not included]
2nd Code Block
It is sufficient to import matplotlib.pyplot as plt. Now, replace the pylab. occurrences with plt.
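However, the one thing that does not really work is the linspace command: np.linspace(-10, 1, 50) produces mostly negative x values, which cannot be shown on a logarithmic axis. If you try
x = np.arange(0.000001, 0.1, 0.000001)
instead, you'll get this output: [image not included]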
However, both approaches indicate that your fit does not really suit the data. But that may be a different question.
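The grid problems described above can be checked directly. In this sketch the np.logspace grid is a suggestion added here, not part of the answer; it spaces the points evenly on a logarithmic axis across the measured concentrations.

```python
import numpy as np

# stop (0) lies below start (1e-6) with a positive step, so nothing is generated.
empty = np.arange(0.000001, 0, 0.1)
print(empty.size)  # 0

# Arguments in the order start, stop, step give the intended positive grid.
fine = np.arange(0.000001, 0.1, 0.000001)
print(fine.min() > 0)  # True

# linspace(-10, 1, 50) is mostly negative; a log axis cannot show those points.
lin = np.linspace(-10, 1, 50)
n_hidden = int(np.sum(lin <= 0))
print(n_hidden, "of", lin.size, "points are not positive")

# Log-spaced alternative covering 1e-6 to 1e-1.
xs = np.logspace(-6, -1, 50)
print(xs[0], xs[-1])
```

With xs in place of the linspace grid, sigmoid(xs, *popt) can be plotted on the semilogx axes together with the data points.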
This is what I have for code block 1.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def sigmoid(x, x0, k):
    y = 1 / (1 + np.exp(-(-k*(x-x0))))
    return y
x = [0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001]
y = [0.649097038, 0.682633434, 0.705470344, 0.749350609, 0.989377822, 0.972679201]
coefficients = np.polyfit(x, y, 3)
polynomial = np.poly1d(coefficients)
popt, pcov = curve_fit(sigmoid, x, y)
ysig = sigmoid(x, *popt)
plt.semilogx()
np.arange(0.000001, 0.1, 0.000001)
np.polyfit(x, y, 3, rcond=None, full=False, w=None, cov=False)
plt.plot(x, y, 'o')
plt.plot(x, ysig, 'g')
plt.ylabel('Cell Viability')
plt.xlabel('Concentration mM')
plt.show()
