Python: how to fit a function to points?

I am trying to fit a curve to some points.
### Analysis: cost function
md = 215 / 0.89
wl = [0, 0.5, 1, 1.5, 2, 3, 4, 5, 6]
d = [0, 0.49, 0.71, 0.84, 0.95, 0.98, 1.0, 1.0, 1.0]
dr = []
for i in d: dr.append(i*md)
f, ax = plt.subplots(figsize=(9.5, 6.5))
ax = setFont(ax, 'Arial', 14)
ax.plot(wl, dr, lw=2)
grid()
This is a typical logistic function. This is what I am doing:
from scipy.optimize import curve_fit
def func(t, alpha, a):
    return 241.573 / 1+ (a * np.exp(alpha * t))
# coefficients and curve fit for curve
popt, pcov = curve_fit(func, wl, dr)
alpha, a = popt
v_fit = func(wl, alpha, a)
But I get the error
TypeError: can't multiply sequence by non-int of type 'numpy.float64'

The error occurs because wl is a plain Python list, not a numpy array, so it cannot be multiplied by a float. Convert the data to numpy arrays:
import numpy as np
from scipy.optimize import curve_fit
md = 215 / 0.89
wl = np.array([0, 0.5, 1, 1.5, 2, 3, 4, 5, 6])
d = np.array([0, 0.49, 0.71, 0.84, 0.95, 0.98, 1.0, 1.0, 1.0])
dr = np.array([i*md for i in d])
def func(t, alpha, a):
    # parenthesize the denominator: "241.573 / 1 + (...)" evaluates as (241.573 / 1) + (...)
    return 241.573 / (1 + a * np.exp(alpha * t))
# coefficients and curve fit for curve
popt, pcov = curve_fit(func, wl, dr)
alpha, a = popt
v_fit = func(wl, alpha, a)
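Separately from the list-versus-array issue, note that Python parses 241.573 / 1 + (...) as (241.573 / 1) + (...), which is not a logistic curve. Here is a minimal sketch, assuming the intended model is L / (1 + a * exp(alpha * t)) with L = 215 / 0.89. The name logistic and the starting guess p0 are hand-picked assumptions, not part of the original code:

```python
import numpy as np
from scipy.optimize import curve_fit

md = 215 / 0.89  # upper plateau of the curve, taken from the question
wl = np.array([0, 0.5, 1, 1.5, 2, 3, 4, 5, 6], dtype=float)
d = np.array([0, 0.49, 0.71, 0.84, 0.95, 0.98, 1.0, 1.0, 1.0])
dr = d * md  # element-wise scaling works because d is an array

def logistic(t, alpha, a):
    # assumed intended form: plateau / (1 + a * exp(alpha * t))
    return md / (1.0 + a * np.exp(alpha * t))

# hypothetical starting guess: a decaying exponential term (alpha < 0)
popt, pcov = curve_fit(logistic, wl, dr, p0=[-1.0, 1.0])
alpha_fit, a_fit = popt
v_fit = logistic(wl, alpha_fit, a_fit)
print("alpha =", alpha_fit, "a =", a_fit)
```

Passing p0 explicitly matters here because the default starting value of 1.0 for alpha describes a growing exponential, while the data rises towards a plateau.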

Based on sample code found here:
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
# assume func is defined to calculate your y-values and has the signature func(x, a, b, c)
xdata = np.linspace(0, 4, 50)
np.random.seed(1729)
# evaluate func on the whole array and add one noise sample per point
ydata = func(xdata, 2.5, 1.3, 0.5) + 0.2 * np.random.normal(size=xdata.size)
plt.plot(xdata, ydata, 'b-', label='data')

Related

Least squares function with 5 unknown parameters

I am having trouble estimating 5 unknown parameters a, b, c, d, e, each of which definitely lies in a known interval. It simply looks like this:
import numpy as np
from scipy.optimize import curve_fit
diap_a = np.arange(0.01, 1, 0.2)
diap_b = np.arange(0.01, 30, 5)
diap_c = np.arange(0.01, 2, 0.5)
diap_d = np.arange(0.01, 2, 0.5)
diap_e = np.arange(0.01, 0.3, 0.03)
X = np.arange(0.01, 1, 0.01)
def func(a, b, c, d, e):
    return a + b + c + d + e  # for example
Y = func(a, b, c, d, e)
I have data (expected values) such that
Y1 = [60, 59, 58, 57, 56, 55, 50, 30, 10]
X1 = [0.048, 0.049, 0.05, 0.05, 0.06, 0.089, 0.1, 0.12, 0.134]
I was trying to implement it this way:
popt, pcov = curve_fit(func, a, b, c, d, e, Y1, X1)
to find optimal a, b, c, d, e that will help to fit the curve
plt.plot(Y, X)
plt.show()
But it doesn't work.
The result is:
OptimizeWarning: Covariance of the parameters could not be estimated
Sorry for my bad formulation of the problem.
According to the curve_fit() documentation, the first three arguments must be func, X1, and Y1, in that order, and func itself must take the independent variable as its first argument, followed by the parameters to fit. As currently coded, func() always returns a single value that does not depend on X1, so it cannot fit the data. Here is an example graphing fitter that uses your data with a three-parameter model and scipy's default initial parameter estimates of all 1.0; these are not always optimal. If a given function fits the data badly, the initial parameter estimates may be the cause, and scipy has a genetic algorithm module (scipy.optimize.differential_evolution) to help find better estimates if needed.
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
xData = numpy.array([0.048, 0.049, 0.05, 0.05, 0.06, 0.089, 0.1, 0.12, 0.134])
yData = numpy.array([60, 59, 58, 57, 56, 55, 50, 30, 10])
def func(x, a, b, c): # simple quadratic example
    return (a * numpy.square(x)) + b * x + c
# these are the same as the scipy defaults
initialParameters = numpy.array([1.0, 1.0, 1.0])
# curve fit the test data
fittedParameters, pcov = curve_fit(func, xData, yData, initialParameters)
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print('Parameters:', fittedParameters)
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    # first the raw data as a scatter plot
    axes.plot(xData, yData, 'D')
    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)
    # now the model as a line plot
    axes.plot(xModel, yModel)
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    plt.show()
    plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
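The idea of searching for initial parameter estimates can be sketched with scipy.optimize.differential_evolution. This is only an illustration: the search bounds below are hypothetical values chosen for this data set, and the helper name sum_of_squared_error is made up for the example.

```python
import numpy as np
from scipy.optimize import curve_fit, differential_evolution

xData = np.array([0.048, 0.049, 0.05, 0.05, 0.06, 0.089, 0.1, 0.12, 0.134])
yData = np.array([60, 59, 58, 57, 56, 55, 50, 30, 10], dtype=float)

def func(x, a, b, c):  # same quadratic model as above
    return a * np.square(x) + b * x + c

def sum_of_squared_error(params):
    # objective minimized by the global search
    return np.sum((yData - func(xData, *params)) ** 2)

# hypothetical search ranges for a, b, c (assumptions; adjust for other data)
bounds = [(-1.0e5, 1.0e5), (-1.0e4, 1.0e4), (-1.0e3, 1.0e3)]
result = differential_evolution(sum_of_squared_error, bounds)
initialParameters = result.x

# refine the global-search estimate with the local least-squares fit
fittedParameters, pcov = curve_fit(func, xData, yData, initialParameters)
rmse = np.sqrt(np.mean((func(xData, *fittedParameters) - yData) ** 2))
print("Initial estimates:", initialParameters)
print("Fitted parameters:", fittedParameters)
print("RMSE:", rmse)
```

For a model that is linear in its parameters, like this quadratic, the global search is not really needed; it becomes more useful for nonlinear models where the default starting values of 1.0 lead to a poor local fit.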

Printing Curve Fit Function

I have been struggling to find a way to get the determined parameters for the curve fit function below to print. The graph properly matches my data, but I can't figure out how to get the equation it produced. Any help would be appreciated!
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
x_data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
y_data = [.99, 1, .98, .93, .85, .77, .67, .56, .46, .36, .27, .19, .12, .07, .03, .01, 0, .01, .05, .09, .16, .24, .33, .44, .55, .65, .76, .85, .93, .98, 1]
x_val = np.array(x_data)
y_val = np.array(y_data)
def fitFunc(x, a, b, c, d):
    return a * np.sin((2* np.pi / b) * x - c) + d
print(a, b, c, d)
plt.plot(x_val, y_val, marker='.', markersize=0, linewidth='0.5', color='green')
popt, pcov = curve_fit(fitFunc, x_val, y_val)
plt.plot(x_val, fitFunc(x_val, *popt), color='orange', linestyle='--')
Here is a graphing example that uses your data; note the equation. The fitted parameter values are the first value returned by curve_fit, so printing them gives the coefficients of the equation. This example uses initial parameter estimates that were manually estimated from a scatterplot of the data; the curve_fit default estimates are all 1.0, and those do not work well in this case.
import numpy as np
import scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
xData = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0])
yData = np.array([.99, 1.0, 0.98, 0.93, 0.85, 0.77, 0.67, 0.56, 0.46, 0.36, 0.27, 0.19, 0.12, 0.07, 0.03, 0.01, 0, 0.01, 0.05, 0.09, 0.16, 0.24, 0.33, 0.44, 0.55, 0.65, 0.76, 0.85, 0.93, 0.98, 1.0])
def fitFunc(x, amplitude, center, width, offset):
    return amplitude * np.sin(np.pi * (x - center) / width) + offset
# these are the curve_fit default parameter estimates, and
# do not work well for this data and equation - manually estimate below
#initialParameters = np.array([1.0, 1.0, 1.0, 1.0])
# eyeball the scatterplot for some better, simple, initial parameter estimates
initialParameters = np.array([0.5, 1.0, 16.0, 0.5])
# curve fit the test data using initial parameters
fittedParameters, pcov = curve_fit(fitFunc, xData, yData, initialParameters)
print(fittedParameters)
modelPredictions = fitFunc(xData, *fittedParameters)
absError = modelPredictions - yData
SE = np.square(absError) # squared errors
MSE = np.mean(SE) # mean squared errors
RMSE = np.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (np.var(absError) / np.var(yData))
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    # first the raw data as a scatter plot
    axes.plot(xData, yData, 'D')
    # create data for the fitted equation plot
    xModel = np.linspace(min(xData), max(xData))
    yModel = fitFunc(xModel, *fittedParameters)
    # now the model as a line plot
    axes.plot(xModel, yModel)
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    plt.show()
    plt.close('all') # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
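To print the equation itself, the values in popt can be substituted into the model's formula. Here is a minimal sketch using the function from the question. The starting guesses in p0 are assumptions read off the data (amplitude and offset near 0.5, period near 30), and the format string is just one possible layout:

```python
import numpy as np
from scipy.optimize import curve_fit

x_val = np.arange(1, 32, dtype=float)
y_val = np.array([.99, 1, .98, .93, .85, .77, .67, .56, .46, .36, .27, .19,
                  .12, .07, .03, .01, 0, .01, .05, .09, .16, .24, .33, .44,
                  .55, .65, .76, .85, .93, .98, 1])

def fitFunc(x, a, b, c, d):
    return a * np.sin((2 * np.pi / b) * x - c) + d

# assumed starting guesses: amplitude, period, phase, offset
p0 = [0.5, 30.0, -1.0, 0.5]
popt, pcov = curve_fit(fitFunc, x_val, y_val, p0=p0)
a, b, c, d = popt  # fitted values, in the order of fitFunc's parameters

equation = "y = {:.4f} * sin((2*pi / {:.4f}) * x - ({:.4f})) + {:.4f}".format(a, b, c, d)
print(equation)

# quick goodness-of-fit check
residuals = y_val - fitFunc(x_val, *popt)
r_squared = 1.0 - np.var(residuals) / np.var(y_val)
print("R-squared:", r_squared)
```

The order of the values in popt always follows the order of the parameters after x in the model function's signature.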

Plotting one sigma error bars on a curve fit line in scipy

I plotted a linear least-squares fit using scipy.optimize.curve_fit(). My data has some error associated with it, and I added those errors while plotting the fit curve.
Next, I want to plot two dashed lines representing the one-sigma error band on the curve fit and shade the region between those two lines. This is what I have tried so far:
import sys
import os
import numpy
import matplotlib.pyplot as plt
from pylab import *
import scipy.optimize as optimization
from scipy.optimize import curve_fit
xdata = numpy.array([-5.6, -5.6, -6.1, -5.0, -3.2, -6.4, -5.2, -4.5, -2.22, -3.30, -6.15])
ydata = numpy.array([-18.40, -17.63, -17.67, -16.80, -14.19, -18.21, -17.10, -17.90, -15.30, -18.90, -18.62])
# Initial guess.
x0 = numpy.array([1.0, 1.0])
#data error
sigma = numpy.array([0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.22, 0.45, 0.35])
sigma1 = numpy.array([0.000001, 0.000001, 0.000001, 0.0000001, 0.0000001, 0.13, 0.22, 0.30, 0.00000001, 1.0, 0.05])
#def func(x, a, b, c):
# return a + b*x + c*x*x
def line(x, a, b):
    return a * x + b
#print(optimization.curve_fit(line, xdata, ydata, x0, sigma))
popt, pcov = curve_fit(line, xdata, ydata, sigma=sigma)
print(popt)
print("a =", popt[0], "+/-", pcov[0,0]**0.5)
print("b =", popt[1], "+/-", pcov[1,1]**0.5)
#1 sigma error ######################################################################################
sigma2 = numpy.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]) #make change
popt1, pcov1 = curve_fit(line, xdata, ydata, sigma = sigma2) #make change
print(popt1)
print("a1 =", popt1[0], "+/-", pcov1[0,0]**0.5)
print("b1 =", popt1[1], "+/-", pcov1[1,1]**0.5)
#####################################################################################################
plt.errorbar(xdata, ydata, yerr=sigma, xerr= sigma1, fmt="none")
plt.ylim(-11.5, -19.5)
plt.xlim(-2, -7)
xfine = numpy.linspace(-2.0, -7.0, 100) # define values to plot the function for
plt.plot(xfine, line(xfine, popt[0], popt[1]), 'r-')
plt.plot(xfine, line(xfine, popt1[0], popt1[1]), '--') #make change
plt.show()
However, I think the dashed line I plotted takes one sigma error from my provided xdata and ydata numpy array, not from the curve fit. Do I have to know the coordinates that satisfy my fit curve and then make a second array to make the one sigma error fit curve?
It seems you are plotting two completely different lines.
Instead, you need to plot three lines: the first one is your fit without any corrections, and the other two use the same parameters a and b with their sigmas added or subtracted. The sigmas are the square roots of the diagonal elements of the covariance matrix pcov returned by curve_fit. So you'll have something like:
y = line(xfine, popt[0], popt[1])
y1 = line(xfine, popt[0] + pcov[0,0]**0.5, popt[1] - pcov[1,1]**0.5)
y2 = line(xfine, popt[0] - pcov[0,0]**0.5, popt[1] + pcov[1,1]**0.5)
plt.plot(xfine, y, 'r-')
plt.plot(xfine, y1, 'g--')
plt.plot(xfine, y2, 'g--')
plt.fill_between(xfine, y1, y2, facecolor="gray", alpha=0.15)
fill_between shades the area between the error bar lines.
You can apply the same technique for your other line if you want.
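Shifting a and b independently ignores the correlation between the two parameters, which pcov also contains in its off-diagonal elements. As an alternative sketch (standard linear error propagation, not the method shown above), the one-sigma band of the fitted line can be computed from the full covariance matrix as var(y) = x^2 * C_aa + 2 * x * C_ab + C_bb. The names y_var and y_err are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit

xdata = np.array([-5.6, -5.6, -6.1, -5.0, -3.2, -6.4, -5.2, -4.5, -2.22, -3.30, -6.15])
ydata = np.array([-18.40, -17.63, -17.67, -16.80, -14.19, -18.21, -17.10, -17.90,
                  -15.30, -18.90, -18.62])
sigma = np.array([0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.22, 0.45, 0.35])

def line(x, a, b):
    return a * x + b

popt, pcov = curve_fit(line, xdata, ydata, sigma=sigma)

xfine = np.linspace(-7.0, -2.0, 100)
y = line(xfine, *popt)

# propagate the parameter covariance to the fitted line:
# var(y) = x^2 * C_aa + 2 * x * C_ab + C_bb
y_var = xfine**2 * pcov[0, 0] + 2.0 * xfine * pcov[0, 1] + pcov[1, 1]
y_err = np.sqrt(y_var)

lower = y - y_err  # lower edge of the one-sigma band
upper = y + y_err  # upper edge of the one-sigma band
print("largest half-width of the band:", y_err.max())
```

The arrays lower and upper can then be passed to plt.plot and plt.fill_between in place of y1 and y2.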

Curve fitting an exponential function using SciPy

I have the following "score" function, which is meant to give a score between 0 and 1 for a certain measurement:
def func(x, a, b):
    return 1.0/(1.0+np.exp(-b*(x-a)))
I would like to fit it to the following x and y data:
x = np.array([4000, 2500, 2000, 1000, 500])
y = np.array([ 0.1, 0.3, 0.5, 0.7, 0.9])
But curve_fit does not seem to work:
popt, pcov = curve_fit(func, x, y)
When I try to fit the data with a linear function, curve_fit gives a good fit (the green line), but with the exponential function above it just returns a=1 and b=1, which is not a good fit. A good fit should be a=1800 and b=-0.001667, which gives the red line (data in blue).
The reason is likely that no initial guess is specified, so curve_fit starts from its default of 1.0 for every parameter. If you pass reasonable starting values through p0, curve_fit is more likely to converge. Below is an example with some reasonable starting values:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b):
    return 1.0/(1.0+np.exp(-b*(x-a)))
x = np.array([4000., 2500., 2000., 1000., 500.])
y = np.array([ 0.1, 0.3, 0.5, 0.7, 0.9])
popt, pcov = curve_fit(func, x, y, p0=[2000., 0.005])
plt.plot(x, y, 'x')
xx = np.linspace(0, 4000, 100)
yy = func(xx, *popt)
plt.plot(xx, yy, lw=5)
plt.show()
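One way to obtain such starting values without guessing is to linearize the model. Since y = 1/(1+exp(-b*(x-a))) implies log(y/(1-y)) = b*x - a*b, a straight-line fit to the logit of y gives rough estimates of b (the slope) and a (minus the intercept divided by the slope). This is a sketch under the assumption that all y values lie strictly between 0 and 1. The names a0 and b0 are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b):
    return 1.0 / (1.0 + np.exp(-b * (x - a)))

x = np.array([4000., 2500., 2000., 1000., 500.])
y = np.array([0.1, 0.3, 0.5, 0.7, 0.9])

# the logit transform turns the model into a straight line: logit(y) = b*x - a*b
logit = np.log(y / (1.0 - y))
slope, intercept = np.polyfit(x, logit, 1)
b0 = slope
a0 = -intercept / slope

# use the linearized estimates as the starting point of the nonlinear fit
popt, pcov = curve_fit(func, x, y, p0=[a0, b0])
print("starting values:", a0, b0)
print("fitted values:  ", popt)
```

The nonlinear fit is still worth running afterwards, because the logit transform changes how the residuals are weighted.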

Least Squares Fit on Cubic Bezier Curve

I would like to fit a cubic Bezier curve to a set of 500 random points.
Here's the code I have for the bezier curve:
import numpy as np
from scipy.special import comb  # comb is no longer available from scipy.misc in current SciPy
def bernstein_poly(i, n, t):
    """
    The Bernstein polynomial of n, i as a function of t
    """
    return comb(n, i) * ( t**(n-i) ) * (1 - t)**i
def bezier_curve(points, nTimes=1000):
    nPoints = len(points)
    x = np.array([p[0] for p in points])
    y = np.array([p[1] for p in points])
    t = np.linspace(0.0, 1.0, nTimes)
    polynomial_array = np.array([ bernstein_poly(i, nPoints-1, t) for i in range(0, nPoints) ])
    xvals = np.dot(x, polynomial_array)
    yvals = np.dot(y, polynomial_array)
    return xvals, yvals
if __name__ == "__main__":
    from matplotlib import pyplot as plt
    nPoints = 4
    points = np.random.rand(nPoints,2)*200
    xpoints = [p[0] for p in points]
    ypoints = [p[1] for p in points]
    xvals, yvals = bezier_curve(points, nTimes=1000)
    plt.plot(xvals, yvals)
    plt.plot(xpoints, ypoints, "ro")
    for nr in range(len(points)):
        plt.text(points[nr][0], points[nr][1], nr)
    plt.show()
I'm aware that Numpy and Scipy have least squares methods: numpy.linalg.lstsq and scipy.optimize.least_squares
But I'm not sure how I can use them to fit the curve to the 500 points. Can someone offer some assistance?
Thank you
Use the function curve_fit in scipy, https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
import numpy as np
from scipy.optimize import curve_fit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
xdata = np.linspace(0, 4, 50)
y = func(xdata, 2.5, 1.3, 0.5)
ydata = y + 0.2 * np.random.normal(size=len(xdata))
popt, pcov = curve_fit(func, xdata, ydata)
#Constrain the optimization to the region of 0 < a < 3, 0 < b < 2 and 0 < c < 1:
popt, pcov = curve_fit(func, xdata, ydata, bounds=(0, [3., 2., 1.]))
The scipy documentation itself has a most excellent tutorial on using splines here:
https://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html
with lots of code, examples and cool graphs comparing different types of splines.
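Neither snippet above fits a Bezier curve directly. Because a cubic Bezier curve is linear in its four control points once each data point has a parameter value t, the fit can be posed as a linear least-squares problem and solved with numpy.linalg.lstsq. The sketch below assumes the points are ordered along the curve and assigns t by normalized cumulative chord length. That parameterization, the function names, and the standard Bernstein ordering (the curve runs from the first control point to the last) are choices made for this example:

```python
import numpy as np
from math import comb

def bernstein_matrix(t, degree=3):
    # rows: samples, columns: Bernstein basis functions B_{i,degree}(t)
    t = np.asarray(t, dtype=float)
    return np.stack([comb(degree, i) * t**i * (1.0 - t)**(degree - i)
                     for i in range(degree + 1)], axis=1)

def chord_length_parameters(points):
    # t in [0, 1], proportional to the cumulative distance along the point sequence
    steps = np.linalg.norm(np.diff(points, axis=0), axis=1)
    cumulative = np.concatenate(([0.0], np.cumsum(steps)))
    return cumulative / cumulative[-1]

def fit_cubic_bezier(points, t=None):
    points = np.asarray(points, dtype=float)
    if t is None:
        t = chord_length_parameters(points)
    basis = bernstein_matrix(t)
    # solve basis @ control_points ~= points in the least-squares sense
    control_points, *_ = np.linalg.lstsq(basis, points, rcond=None)
    return control_points

# demo on synthetic data: 500 slightly noisy samples of a known curve
rng = np.random.default_rng(0)
true_ctrl = np.array([[0.0, 0.0], [50.0, 150.0], [150.0, 150.0], [200.0, 0.0]])
t_true = np.linspace(0.0, 1.0, 500)
samples = bernstein_matrix(t_true) @ true_ctrl + rng.normal(scale=0.1, size=(500, 2))
print(fit_cubic_bezier(samples))
```

If the 500 points are truly unordered, they would first need to be sorted along the curve or given parameter values by some other means; chord-length parameterization alone does not handle that case.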
