OK, I have a function which uses a range of parameters to calculate the effect on two separate variables over time. These variables have already been curve-matched to some existing data to minimize the variation (shown below).
I want to be able to check the previous working and match new data. I have been trying to use the scipy.optimize.curve_fit function by stacking the x and y data resulting from my function (as suggested here: fit multiple parametric curves with scipy).
It may not be the right method, or I may just be misunderstanding, but my code keeps raising TypeError: Improper input: N=3 must not exceed M=2
My simplified prototype code was initially taken from here: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    result = ([], [])
    for i in x:
        # set up 2 example curves
        result[0].append(a * np.exp(-b * i) + c)
        result[1].append(a * np.exp(-b * i) + c**2)
    return result  # as a tuple containing 2 lists
#Define the data to be fit with some noise:
xdata = list(np.arange(0, 10, 1))
y = func(xdata, 2.5, 5, 0.5)[0]
y2 = func(xdata, 1, 1, 2)[1]
#Add some noise
y_noise = 0.1 * np.random.normal(size=len(xdata))
y2_noise = 0.1 * np.random.normal(size=len(xdata))
ydata=[]
ydata2=[]
for i in range(len(y)):  # clunky
    ydata.append(y[i] + y_noise[i])
    ydata2.append(y2[i] + y2_noise[i])
plt.scatter(xdata, ydata, label='data')
plt.scatter(xdata, ydata2, label='data2')
#plt.plot(xdata, y, 'k-', label='data (original function)')
#plt.plot(xdata, y2, 'k-', label='data2 (original function)')
#stack the data
xdat = xdata+xdata
ydat = ydata+ydata2
popt, pcov = curve_fit(func, xdat, ydat)
plt.plot(xdata, func(xdata, *popt), 'r-',
label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
Any help much appreciated!
Here is a graphing example that fits two different equations with a single shared parameter. If this looks like what you need, it can easily be adapted to your specific problem; see the adaptation sketch after the example.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
y1 = np.array([ 16.00, 18.42, 20.84, 23.26])
y2 = np.array([-20.00, -25.50, -31.00, -36.50, -42.00])
comboY = np.append(y1, y2)
x1 = np.array([5.0, 6.1, 7.2, 8.3])
x2 = np.array([15.0, 16.1, 17.2, 18.3, 19.4])
comboX = np.append(x1, x2)
if len(y1) != len(x1):
    raise Exception('Unequal x1 and y1 data length')
if len(y2) != len(x2):
    raise Exception('Unequal x2 and y2 data length')
def function1(data, a, b, c):  # not all parameters are used here, c is shared
    return a * data + c

def function2(data, a, b, c):  # not all parameters are used here, c is shared
    return b * data + c

def combinedFunction(comboData, a, b, c):
    # single data reference passed in, extract separate data
    extract1 = comboData[:len(x1)]  # first data
    extract2 = comboData[len(x1):]  # second data
    result1 = function1(extract1, a, b, c)
    result2 = function2(extract2, a, b, c)
    return np.append(result1, result2)
# some initial parameter values
initialParameters = np.array([1.0, 1.0, 1.0])
# curve fit the combined data to the combined function
fittedParameters, pcov = curve_fit(combinedFunction, comboX, comboY, initialParameters)
# values for display of fitted function
a, b, c = fittedParameters
y_fit_1 = function1(x1, a, b, c) # first data set, first equation
y_fit_2 = function2(x2, a, b, c) # second data set, second equation
plt.plot(comboX, comboY, 'D') # plot the raw data
plt.plot(x1, y_fit_1) # plot the equation using the fitted parameters
plt.plot(x2, y_fit_2) # plot the equation using the fitted parameters
plt.show()
print('a, b, c:', fittedParameters)
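For instance, a minimal sketch of how this combined-data pattern might be adapted to the two exponential curves from the question. This is only an illustration: here both synthetic curves are generated from one shared parameter set a, b, c so that a single shared fit makes sense, and the p0 values are rough guesses.

import numpy as np
from scipy.optimize import curve_fit

def curve1(x, a, b, c):
    return a * np.exp(-b * x) + c

def curve2(x, a, b, c):
    return a * np.exp(-b * x) + c**2

def combined(x_stacked, a, b, c):
    # first half of the stacked x belongs to curve 1, second half to curve 2
    n = len(x_stacked) // 2
    return np.append(curve1(x_stacked[:n], a, b, c),
                     curve2(x_stacked[n:], a, b, c))

xdata = np.arange(0, 10, 1.0)
rng = np.random.default_rng(0)
ydata1 = curve1(xdata, 2.5, 5, 0.5) + 0.1 * rng.normal(size=len(xdata))
ydata2 = curve2(xdata, 2.5, 5, 0.5) + 0.1 * rng.normal(size=len(xdata))

x_stacked = np.append(xdata, xdata)
y_stacked = np.append(ydata1, ydata2)
popt, pcov = curve_fit(combined, x_stacked, y_stacked, p0=(2.0, 4.0, 1.0))
print('a, b, c:', popt)

The key point is that the model passed to curve_fit must return a single flat 1-D array whose length matches the stacked ydata, rather than a tuple of lists.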
I have an original curve, and I am developing a model curve to match it closely. Everything runs fine, but the curves do not match. How can I control the curvature of my model curve? The code below is based on an answer here.
My code:
def curve_line(point1, point2):
    a = (point2[1] - point1[1]) / (np.cosh(point2[0]) - np.cosh(point1[0]))
    b = point1[1] - a * np.sinh(point1[0])
    x = np.linspace(point1[0], point2[0], 100).tolist()
    y = (a * np.cosh(x) + b).tolist()
    return x, y
###### A sample of my code is given below
point1 = [10,100]
point2 = [20,50]
x,y = curve_line(point1, point2)
plt.plot(point1[0], point1[1], 'o')
plt.plot(point2[0], point2[1], 'o')
plt.plot(x,y) ## len(x)
My present output:
I tried the following function as well:
y = (50*np.exp(-x/10) +2.5)
The output is:
Instead of just guessing the right parameters of your model function, you can fit a model curve to your data using curve_fit.
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
x = np.array([ 1.92, 14.35, 21.50, 25.27, 27.34, 30.32, 32.31, 34.09, 34.21])
y = np.array([8.30, 8.26, 8.13, 7.49, 6.66, 4.59, 2.66, 0.60, 0.06])
def fun(x, a, b, c):
    return a * np.cosh(b * x) + c
coef,_ = curve_fit(fun, x, y)
plt.plot(x, y, label='Original curve')
plt.plot(x, fun(x, *coef), label='Model: %5.3f cosh(%4.2f x) + %4.2f' % tuple(coef))
plt.legend()
plt.show()
If it is important that the start and end points are closely fitted, you can pass uncertainties to curve_fit, adjusting them to lower values towards the ends, e.g. by
s = np.ones(len(x))
s[1:-1] = s[1:-1] * 3
coef,_ = curve_fit(fun, x, y, sigma=s)
Your other approach, a * np.exp(b * x) + c, will also work and gives -0.006 exp(0.21 x) + 8.49.
In some cases you'll have to provide an educated guess for the initial values of the coefficients to curve_fit (it uses 1 as default).
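Putting both hints together, a minimal sketch reusing fun, x and y from above; the p0 values here are just a rough assumed starting guess:

s = np.ones(len(x))
s[1:-1] = s[1:-1] * 3                      # larger sigma = less weight for interior points
coef, _ = curve_fit(fun, x, y, p0=(1.0, 0.1, 0.0), sigma=s)
plt.plot(x, y, 'o', label='data')
plt.plot(x, fun(x, *coef), label='weighted fit')
plt.legend()
plt.show()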
I have been trying to do curve fitting in Python, giving x and y values, but unfortunately I am not getting the curve the way I wanted. I also tried an exponential fit of the same x, y values in MATLAB, and there I get exactly the curve I expect.
The problem is that the coefficients returned by the Python code are not the same as those returned by MATLAB, so a different curve is generated.
Please help me find the correct coefficient values.
I have attached the code below.
#CODE
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import math
n = 13.75
x = [0.375, 2.125/4.85714, 2.125/5.6667, 2.125/11.33, 2.125/34, 0]
y = [0, n/6.111, n/3.0555, n/2.24489, n/2.03708, n/1.96428]
#x = np.linspace(0,4,50) # Example data
def func(x, a, b, c, d):
    return a * np.exp(b * x) + c * np.exp(d * x)
#y = func(x, 2.5, 1.3, 0.5, 0.5) # Example exponential data
# Here you give the initial parameters for a, b, c, d, which Python then iterates over
# to find the best fit
popt, pcov = curve_fit(func, x, y, p0=(0.17273307092464, 0.050942379680265, 0, 0.050942379680265), method='trf')
print(popt)  # This contains your four best-fit parameters
p5 = popt[0] # This is your a
p6 = popt[1] # This is your b
p7 = popt[2] # This is your c
p8 = popt[3] # This is your d
yy = np.linspace(0, n/1.96428, 50)
xx = p5 * np.exp(p6 * yy) + p7 * np.exp(p8 * yy)
plt.plot(yy,xx)
plt.scatter(y,x, c='b',label='The data points')
plt.show()
I need to count the number of particles under the fitted Gaussian curve. The area under the fitted curve can be found by integrating the function between (mean - 3*sigma) and (mean + 3*sigma). Would you please help me solve this? Thanks for your kind consideration.
import pylab as py
import numpy as np
from scipy import optimize
from scipy import stats
import matplotlib.pyplot as plt
import pandas as pd
BackPFT='T067.csv'
df_180 = pd.read_csv(BackPFT, error_bad_lines=False, header=1)
x_180=df_180.iloc[:,3]
y_180=df_180.iloc[:,4]
#want to plot the distribution of s calculated by the following equation
s=np.sqrt((((16*x_180**2*38.22**2)/((4*38.22**2-y_180**2)**2))+1))-1
#Shape of this distribution is Gaussian
#I need to fit this distribution by following parameter
mean=0.433
sigma=0.014
draw=s
#Definition of bin number
bi=np.linspace(0.01,8, 1000)
data = py.hist(draw.dropna(), bins = bi)
#Definition of Gaussian function
def f(x, a, b, c):
    return a * py.exp(-(x - mean)**2.0 / (2 * sigma**2))
x = [0.5 * (data[1][i] + data[1][i+1]) for i in range(len(data[1]) - 1)]
y = data[0]
#Fitting the peak of the distribution
popt, pcov = optimize.curve_fit(f, x, y)
chi2, p = stats.chisquare(popt)
x_fit = py.linspace(x[0], x[-1], 80000)
y_fit = f(x_fit, *popt)
plt.plot(x_fit, y_fit, lw=3, color="r", ls="--")
plt.xlim(0,2)
plt.tick_params(axis='both', which='major', labelsize=20)
plt.show()
The problem is how to integrate the defined function (f) and count the number of particles under the curve. I have attached the file T067.csv. Thanks in advance for your kind consideration.
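For the integration step itself, a minimal sketch using scipy.integrate.quad might look like the following, assuming f, popt, mean, sigma and the bin edges bi from the code above; dividing the area by the bin width converts it into a particle count. The updated code below uses the same approach for the T061.csv data.

from scipy.integrate import quad

binwidth = bi[1] - bi[0]                      # width of the histogram bins
fitted_gauss = lambda t: f(t, *popt)          # fitted Gaussian as a function of x only
area, err = quad(fitted_gauss, mean - 3*sigma, mean + 3*sigma)
num_particles = area / binwidth               # each count contributes binwidth to the area
print('Area under curve =', area)
print('Number of particles =', num_particles)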
BackPFT='T061.csv'
df_180 = pd.read_csv(BackPFT, skip_blank_lines=True ,skiprows=1,header=None,skipfooter=None,engine='python')
x_180=df_180.iloc[:,3]
y_180=df_180.iloc[:,4]
b=42.4
E=109.8
LET=24.19
REL=127.32
mean = 0.339
m1 = 0.259
sigma = 0.012
s1 = 0.015
s=np.sqrt((((16*x_180**2*b**2)/((4*b**2-y_180**2)**2))+1))-1
draw=s
bi=np.linspace(0,8, 2000)
binwidth=0.004
#I want to plot the dsitribution of s. This distribution has three gaussian peaks
data = py.hist(draw.dropna(), bins = bi,color='gray',)
#first Gaussian function for the first peak (peaks counted from the right)
def f(x, a, b, c):
    return a * py.exp(-(x - mean)**2.0 / (2 * sigma**2))
# fitting the function (Gaussian)
x = [0.5 * (data[1][i] + data[1][i+1]) for i in range(len(data[1]) - 1)]
y = data[0]
popt, pcov = optimize.curve_fit(f, x, y)
chi, p = stats.chisquare(popt)
x_fit = py.linspace(x[0], x[-1], 80000)
y_fit = f(x_fit, *popt)
plt.plot(x_fit, y_fit, lw=5, color="r", ls="--")
#integration of the first function f
from scipy.integrate import quad
gaussF = lambda x, a: f(x, a, sigma, mean)
bins = (6 * sigma) / binwidth
delta = ((mean + 3*sigma) - (mean - 3*sigma)) / bins
f1 = lambda x: f(x, popt[0], sigma, mean)
result = quad(f1, mean - 3*sigma, mean + 3*sigma)
area = result[0]       # this gives the area after integrating the Gaussian
numPar = area / delta  # this gives the number of particles under the integrated area
print("\n\tArea under curve = ", area, "\n\tNumber of particles = ", numPar)
The file T061.csv is here. Thanks to Dr. I Putu Susila for his kind cooperation and interest.
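As a cross-check on the numerical result, the area of a Gaussian a*exp(-(x-mean)**2 / (2*sigma**2)) between mean-3*sigma and mean+3*sigma has a closed form, a*sigma*sqrt(2*pi)*erf(3/sqrt(2)), which is about 99.73% of the total area, so the count can also be obtained without quad. A short sketch, assuming popt, sigma and binwidth from the code above:

import numpy as np
from scipy.special import erf

a_fit = popt[0]
area_analytic = a_fit * sigma * np.sqrt(2 * np.pi) * erf(3 / np.sqrt(2))
print('Analytic area =', area_analytic, ' particles =', area_analytic / binwidth)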
I have the following function I need to solve:
np.exp((1-Y)/Y) = np.exp(c) -b*x
I defined the function as:
def function(x, b, c):
    np.exp((1-Y)/Y) = np.exp(c) - b*x
    return y
def function_solve(y, b, c):
    x = (np.exp(c) - np.exp((1-Y)/Y)) / b
    return x
then I used:
x_data = [4, 6, 8, 10]
y_data = [0.86, 0.73, 0.53, 0.3]
popt, pcov = curve_fit(function, x_data, y_data,(28.14,-0.25))
answer = function_solve(0.5, popt[0], popt[1])
I tried running the code and the error was:
can't assign to function call
The function I'm trying to fit is y = 1/(c*exp(-b*x)), in its linear form. I have a bunch of y_data and x_data, and I want to get optimal values for c and b.
There are two problems that jump out at me:
np.exp((1-Y)/Y) = np.exp(c) - b*x is not valid Python code. On the left-hand side of an assignment you must have a name, whereas here you have a function call, np.exp(..), hence the "can't assign to function call" error.
If you intended to use ln() instead: ln() is not a function in the Python standard library. There is math.log() (or np.log() for arrays). Unless you defined ln() somewhere else, it will not work.
Some problems with your code have already been pointed out. Here is a solution:
First, you need to get the correct logarithmic expression of your original function:
y = 1 / (c * exp(-b * x))
y = exp(b * x) / c
ln(y) = b * x + ln(1/c)
ln(y) = b * x - ln(c)
If you want to use that in curve_fit, you need to define your function as follows:
def f_log(x, b, c_ln):
    return b * x - c_ln
I now show you the outcome for some randomly generated data (using b = 0.08 and c = 100.5) using the original function and then also the output for the data you provided:
[ 8.17260899e-02 1.17566291e+02]
As you can see the fitted values are close to the original ones and the fit describes the data very well.
For your data it looks as follows:
[-0.094 -1.263]
Here is the code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def f(x, b, c):
    return 1. / (c * np.exp(-b * x))

def f_log(x, b, c_ln):
    return b * x - c_ln
# some random data
b_org = 0.08
c_org = 100.5
x_data = np.linspace(0.01, 100., 50)
y_data = f(x_data, b_org, c_org) + np.random.normal(0, 0.5, len(x_data))
# fit the data
popt, pcov = curve_fit(f, x_data, y_data, p0=(0.1, 50))
print(popt)
# plot the data
xnew = np.linspace(0.01, 100., 5000)
plt.plot(x_data, y_data, 'bo')
plt.plot(xnew, f(xnew, *popt), 'r')
plt.show()
# your data
x_data = np.array([4, 6, 8, 10])
y_data = np.array([0.86, 0.73, 0.53, 0.3])
# fit the data
popt_log, pcov_log = curve_fit(f_log, x_data, y_data)
print(popt_log)
# plot the data
xnew = np.linspace(4, 10., 500)
plt.plot(x_data, y_data, 'bo')
plt.plot(xnew, f_log(xnew, *popt_log), 'r')
plt.show()
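One follow-up note: the second fit returns c_ln = ln(c) rather than c itself, so c can be recovered afterwards. A small sketch, assuming popt_log from the code above:

b_fit, c_ln_fit = popt_log
c_fit = np.exp(c_ln_fit)   # because f_log was written in terms of c_ln = ln(c)
print(b_fit, c_fit)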
Your problem is in defining function():
def function(x, b, c):
    np.exp((1-Y)/Y) = np.exp(c) - b*x
    return y
You're trying to assign
np.exp(c) - b*x
to the result of a function call, np.exp(...), rather than to a variable. Instead, solve the equation for a variable of the function so the result can be stored in a Python variable.
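A minimal sketch of what that could look like for the model y = 1/(c*exp(-b*x)) from the question; the p0 values here are only a rough assumed guess:

import numpy as np
from scipy.optimize import curve_fit

def function(x, b, c):
    # model solved for y, so the function simply returns it
    return 1.0 / (c * np.exp(-b * x))

x_data = np.array([4, 6, 8, 10])
y_data = np.array([0.86, 0.73, 0.53, 0.3])
popt, pcov = curve_fit(function, x_data, y_data, p0=(-0.2, 0.6))
print(popt)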
Python's curve_fit calculates the best-fit parameters for a function with a single independent variable, but is there a way, using curve_fit or something else, to fit for a function with multiple independent variables? For example:
def func(x, y, a, b, c):
    return log(a) + b*log(x) + c*log(y)
where x and y are the independent variable and we would like to fit for a, b, and c.
You can pass curve_fit a multi-dimensional array for the independent variables, but then your func must accept the same thing. For example, calling this array X and unpacking it to x, y for clarity:
import numpy as np
from scipy.optimize import curve_fit
def func(X, a, b, c):
    x, y = X
    return np.log(a) + b*np.log(x) + c*np.log(y)
# some artificially noisy data to fit
x = np.linspace(0.1,1.1,101)
y = np.linspace(1.,2., 101)
a, b, c = 10., 4., 6.
z = func((x,y), a, b, c) * 1 + np.random.random(101) / 100
# initial guesses for a,b,c:
p0 = 8., 2., 7.
print(curve_fit(func, (x,y), z, p0))
Gives the fit:
(array([ 9.99933937, 3.99710083, 6.00875164]), array([[ 1.75295644e-03, 9.34724308e-05, -2.90150983e-04],
[ 9.34724308e-05, 5.09079478e-06, -1.53939905e-05],
[ -2.90150983e-04, -1.53939905e-05, 4.84935731e-05]]))
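A short follow-up on using the result: popt holds the fitted a, b and c, and the fitted surface can be evaluated by passing the same (x, y) tuple back into func. A sketch, assuming the variables defined above:

popt, pcov = curve_fit(func, (x, y), z, p0)
a_fit, b_fit, c_fit = popt
z_fit = func((x, y), *popt)                  # model evaluated at the data points
print('fitted parameters:', a_fit, b_fit, c_fit)
print('max abs residual:', np.max(np.abs(z_fit - z)))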
optimizing a function with multiple input dimensions and a variable number of parameters
This example shows how to fit a polynomial with a two-dimensional input (R^2 -> R) with an increasing number of coefficients. The design is very flexible, so the callable f passed to curve_fit is defined once for any number of non-keyword arguments.
minimal reproducible example
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def poly2d(xy, *coefficients):
    x = xy[:, 0]
    y = xy[:, 1]
    proj = x + y
    res = 0
    for order, coef in enumerate(coefficients):
        res += coef * proj ** order
    return res
nx = 31
ny = 21
range_x = [-1.5, 1.5]
range_y = [-1, 1]
target_coefficients = (3, 0, -19, 7)
xs = np.linspace(*range_x, nx)
ys = np.linspace(*range_y, ny)
im_x, im_y = np.meshgrid(xs, ys)
xdata = np.c_[im_x.flatten(), im_y.flatten()]
im_target = poly2d(xdata, *target_coefficients).reshape(ny, nx)
fig, axs = plt.subplots(2, 3, figsize=(29.7, 21))
axs = axs.flatten()
ax = axs[0]
ax.set_title('Unknown polynomial P(x+y)\n[secret coefficients: ' + str(target_coefficients) + ']')
sm = ax.imshow(
    im_target,
    cmap=plt.get_cmap('coolwarm'),
    origin='lower'
)
fig.colorbar(sm, ax=ax)
for order in range(5):
    ydata = im_target.flatten()
    popt, pcov = curve_fit(poly2d, xdata=xdata, ydata=ydata, p0=[0]*(order+1))
    im_fit = poly2d(xdata, *popt).reshape(ny, nx)

    ax = axs[1+order]
    title = 'Fit O({:d}):'.format(order)
    for o, p in enumerate(popt):
        if o % 2 == 0:
            title += '\n'
        if o == 0:
            title += ' {:=-{w}.1f} (x+y)^{:d}'.format(p, o, w=int(np.log10(max(abs(p), 1))) + 5)
        else:
            title += ' {:=+{w}.1f} (x+y)^{:d}'.format(p, o, w=int(np.log10(max(abs(p), 1))) + 5)
    title += '\nrms: {:.1f}'.format(np.mean((im_fit - im_target)**2)**.5)
    ax.set_title(title)
    sm = ax.imshow(
        im_fit,
        cmap=plt.get_cmap('coolwarm'),
        origin='lower'
    )
    fig.colorbar(sm, ax=ax)
for ax in axs.flatten():
    ax.set_xlabel('x')
    ax.set_ylabel('y')
plt.show()
P.S. The concept of this answer is identical to my other answer here, but the code example is much clearer. Given time, I will delete the other answer.
Fitting to an unknown number of parameters
In this example, we try to reproduce some measured data measData.
In this example measData is generated by the function measuredData(inp, a=.2, b=-2, c=-.8, d=.1). In practice, we might have measured measData in some way, so we have no idea how it is described mathematically. Hence the fit.
We fit by a polynomial, which is described by the function polynomFit(inp, *args). As we want to try out different orders of polynomials, it is important to be flexible in the number of input parameters.
The independent variables (x and y in your case) are encoded in the 'columns'/second dimension of inp.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def measuredData(inp, a=.2, b=-2, c=-.8, d=.1):
    x = inp[:, 0]
    y = inp[:, 1]
    return a + b*x + c*x**2 + d*x**3 + y

def polynomFit(inp, *args):
    x = inp[:, 0]
    y = inp[:, 1]
    res = 0
    for order in range(len(args)):
        res += args[order] * x**order
    return res + y
inpData=np.linspace(0,10,20).reshape(-1,2)
inpDataStr=['({:.1f},{:.1f})'.format(a,b) for a,b in inpData]
measData=measuredData(inpData)
fig, ax = plt.subplots()
ax.plot(np.arange(inpData.shape[0]), measData, label='measured', marker='o', linestyle='none')
for order in range(5):
    popt, pcov = curve_fit(polynomFit, xdata=inpData, ydata=measData, p0=[0]*(order+1))
    fitData = polynomFit(inpData, *popt)
    ax.plot(np.arange(inpData.shape[0]), fitData, label='polyn. fit, order '+str(order), linestyle='--')
    ax.legend(loc='upper left', bbox_to_anchor=(1.05, 1))
    print(order, popt)
ax.set_xticklabels(inpDataStr, rotation=90)
Result:
Yes, we can pass multiple variables to curve_fit. I have written a piece of code:
import numpy as np
from scipy.optimize import curve_fit

x = np.random.randn(2, 100)
w = np.array([1.5, 0.5]).reshape(1, 2)
esp = np.random.randn(1, 100)
y = np.dot(w, x) + esp
y = y.reshape(100,)
In the above code I have generated x, a 2D data set of shape (2, 100), i.e. two variables with 100 data points each. The dependent variable y is built from the independent variables x with some added noise.
def model_func(x, w1, w2, b):
    w = np.array([w1, w2]).reshape(1, 2)
    b = np.array([b]).reshape(1, 1)
    y_p = np.dot(w, x) + b
    return y_p.reshape(100,)
We have defined a model function that establishes the relation between y and x.
Note: the output of the model function (the predicted y) must have shape (number of data points,).
popt, pcov = curve_fit(model_func,x,y)
popt is a 1D numpy array containing the fitted parameters; in our case there are 3 of them.
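A short usage sketch for inspecting the fit, assuming the x, y and model_func defined above:

w1_fit, w2_fit, b_fit = popt
y_pred = model_func(x, *popt)                      # predictions at the original points
print('fitted weights:', w1_fit, w2_fit, 'bias:', b_fit)
print('residual RMS:', np.sqrt(np.mean((y - y_pred)**2)))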
Yes, there is: simply give curve_fit a multi-dimensional array for xData.
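For example, a minimal sketch; the model power_law and its parameter values are made up purely for illustration:

import numpy as np
from scipy.optimize import curve_fit

def power_law(X, a, b, c):
    # X is a (2, N) array: first row is x, second row is y
    x, y = X
    return a * x**b * y**c

rng = np.random.default_rng(1)
x = rng.uniform(1.0, 2.0, 200)
y = rng.uniform(0.5, 1.5, 200)
X = np.vstack((x, y))
z = power_law(X, 2.0, 1.3, 0.7)

popt, pcov = curve_fit(power_law, X, z, p0=(1.0, 1.0, 1.0))
print(popt)   # should be close to (2.0, 1.3, 0.7)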