I'm trying to fit the following function to some data using SciPy's curve_fit function:
def sinugauss(x, A, B, C):
    exponente = A*(np.sin(x-B))**2
    return np.array([C/(np.exp(exponente))])
I have a data set of 33 points but I keep getting this error:
Traceback (most recent call last):
File "D:Es_periodico_o_no.py", line 35, in <module>
res, cov = curve_fit(sinugauss,datos['x'],datos['y'])
File "D:\lib\site-packages\scipy\optimize\minpack.py", line 789, in curve_fit
res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
File "D:\lib\site-packages\scipy\optimize\minpack.py", line 414, in leastsq
raise TypeError(f"Improper input: func input vector length N={n} must"
TypeError: Improper input: func input vector length N=3 must not exceed func output vector length M=1
This is the full code:
def sinugauss(x, Ventas, Inicio, Desv):
    exponente = Desv*(np.sin(x-Inicio))**2
    return np.array([Ventas/(np.exp(exponente))])

for index, row in real_df.iterrows():
    datos_y = np.array([row]).transpose()
    datos_x = np.array([range(len(datos_y))]).transpose()
    datos = pd.DataFrame(np.column_stack([datos_x,datos_y]),columns=['x','y'])
    res, cov = curve_fit(sinugauss,datos['x'],datos['y'])
    print(res)
    print(cov)
The error is raised on the very first iteration. Every row has 33 non-NaN points, although some of them may be zeros.
Thank you
In the function sinugauss, change the return statement to:
return C/np.exp(exponente)
When you write np.array([C/(np.exp(exponente))]), you are wrapping the expression C/np.exp(exponente), which might be an array with shape, say, (3,), in a 2-d array with shape (1, 3). That is not the shape that curve_fit expects from your function.
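The shape difference can be checked directly. What follows is a minimal sketch, assuming synthetic data: the x values, the "true" parameters and the starting guess p0 are made up for illustration and are not taken from the question.

```python
import numpy as np
from scipy.optimize import curve_fit

def sinugauss_wrapped(x, A, B, C):
    # Version from the question: the extra [] adds a leading axis of length 1
    exponente = A * np.sin(x - B) ** 2
    return np.array([C / np.exp(exponente)])

def sinugauss(x, A, B, C):
    # Fixed version: the result has the same shape as x
    exponente = A * np.sin(x - B) ** 2
    return C / np.exp(exponente)

x = np.arange(33, dtype=float)                    # 33 points, as in the question
print(sinugauss_wrapped(x, 1.0, 0.5, 2.0).shape)  # (1, 33)
print(sinugauss(x, 1.0, 0.5, 2.0).shape)          # (33,)

# Noise-free synthetic data generated from made-up parameters
y = sinugauss(x, 1.0, 0.5, 2.0)
res, cov = curve_fit(sinugauss, x, y, p0=[0.8, 0.4, 1.5])
print(res)
```

With the wrapped version, the first axis has length 1, which matches the M=1 reported in the error message.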
Related
I'm trying to fit a lorentzian to one of the peaks in my dataset.
We were given the fit for a gaussian, and aside from the actual fit equation, the code is very similar, so I'm not sure where I am going wrong. I don't see why there is an issue with the dimensions when I'm using curve_fit.
Here are the relevant pieces of my code for a better idea of what I'm talking about.
Reading the CSV file in and trimming it
import csv
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from matplotlib.ticker import StrMethodFormatter
#reading in the csv file
with open("Data-Oscilloscope.csv") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=",")
    time =[]
    voltage_raw = []
    for row in csv_reader:
        time.append(float(row[3]))
        voltage_raw.append(float(row[4]))
        print("voltage:", row[4])
#trimming the data
trim_lower_index = 980
trim_upper_index = 1170
time_trim = time[trim_lower_index:trim_upper_index]
voltage_trim = voltage_raw[trim_lower_index:trim_upper_index]
The Gaussian fit given
#fitting the gaussian function
def gauss_function(x, a, x0, sigma):
    return a*np.exp(-(x-x0)**2/(2*sigma**2))
popt, pcov = curve_fit(gauss_function, time_trim, voltage_trim, p0=[1,.4,0.1])
perr = np.sqrt(np.diag(pcov))
#plot of the gaussian fit
plt.figure(2)
plt.plot(time_trim, gauss_function(time_trim, *popt), label = "fit")
plt.plot(time_trim, voltage_trim, "-b")
plt.show()
My attempted Lorentzian fit
#x is just the x values, a is the amplitude, x0 is the central value, and f is the full width at half max
def lorentz_function(x, a, x0,f):
    w = f/2 #half width at half max
    return a*w/ [(x-x0)**2+w**2]
popt, pcov = curve_fit(lorentz_function, time_trim, voltage_trim, p0=[1,.4,0.1])
I get an error running this that states:
in leastsq raise TypeError('Improper input: N=%s must not exceed M=%s' % (n, m))
TypeError: Improper input: N=3 must not exceed M=1
I'm probably missing something very obvious but just can't see it.
Thanks in advance!
EDIT: I have taken a look at other, similar questions and went through their explanations, but can't see how those fit in with my code because the number of parameters and dimensions of my inputs should be fine, given that they worked for the gaussian fit.
You didn't show the full traceback/error so I can only guess where it's happening. It's probably looking at the result returned by lorentz_function, and finding the dimensions to be wrong. So while the error is produced by your function, the testing is in its caller (in this case a level or two down).
def optimize.curve_fit(
    f,
    xdata,
    ydata,
    p0=None,... # p0=[1,.4,0.1]
    ...
    res = leastsq(func, p0,
    ...
curve_fit passes the task to leastsq, which starts as:
def optimize.leastsq(
    func,
    x0,
    args=(), ...
    x0 = asarray(x0).flatten()
    n = len(x0)
    if not isinstance(args, tuple):
        args = (args,)
    shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
    m = shape[0]
    if n > m:
        raise TypeError('Improper input: N=%s must not exceed M=%s' % (n, m))
I'm guessing _check_func does
res = func(x0, *args) # call your func with initial values
and returns the shape of res. The error says there's a mismatch between what it expects based on the shape of x0 and the result of your func.
I'm guessing that with a 3 element p0, it's complaining that your function returned a 1 element result (due to the []).
lorentz_function is your function. It doesn't test the output shape, so it can't be the one raising this error.
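If the square brackets are indeed the cause, replacing them with parentheses should be enough, because brackets build a one-element list (which NumPy turns into an array with an extra leading axis) while parentheses only group the denominator. A minimal sketch of that fix, assuming made-up stand-in data in place of the oscilloscope CSV (the grid and the parameters 1.2, 0.42, 0.12 are hypothetical):

```python
import numpy as np
from scipy.optimize import curve_fit

def lorentz_function(x, a, x0, f):
    w = f / 2  # half width at half max
    # Parentheses group the denominator; [] would wrap it in a list
    return a * w / ((x - x0) ** 2 + w ** 2)

# Made-up stand-ins for time_trim and voltage_trim
time_trim = np.linspace(0.0, 0.8, 190)
voltage_trim = lorentz_function(time_trim, 1.2, 0.42, 0.12)

print(lorentz_function(time_trim, 1, .4, 0.1).shape)  # (190,)

popt, pcov = curve_fit(lorentz_function, time_trim, voltage_trim, p0=[1, .4, 0.1])
print(popt)
```

The function now returns one value per x value, so M equals the number of data points rather than 1.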
I had a similar problem that produced this error message.
In my case, the data arrays passed to optimize.leastsq were in matrix form; each one apparently has to be a one-dimensional array instead.
For example, the statement calling leastsq was
result = optimize.leastsq(fit_func, param0, args=(xdata, ydata, zdata))
xdata, ydata and zdata were each in [ 1 x num ] matrix form. For example, xdata was
[[200. .... 350.]]
It should be
[200. .... 350.]
So I had to add a conversion statement:
xdata = xdata[0, :]
The same goes for ydata and zdata.
I hope this helps.
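The conversion can be checked on a small array. This is only a sketch with made-up values; np.ravel is shown as an alternative that does not depend on the array having exactly one row.

```python
import numpy as np

xdata = np.array([[200.0, 250.0, 300.0, 350.0]])  # made-up values in 1 x num form
print(xdata.shape)             # (1, 4)

row = xdata[0, :]              # take the single row, as described above
flat = np.ravel(xdata)         # flattens an array of any shape to 1-D
print(row.shape, flat.shape)   # (4,) (4,)
```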
Following my previous two posts (post1, post 2), I have now reached the point where I use scipy to find a curve fit. However, the code I have produces an error.
A sample of the .csv file I'm working with is located in post1. I tried to copy and substitute examples from the Internet, but it doesn't seem to be working.
Here's what I have (the .py file)
import pandas as pd
import numpy as np
from scipy import optimize
df = pd.read_csv("~/Truncated raw data hcl.csv", usecols=['time' , '1mnaoh trial 1']).dropna()
data1 = df
array1 = np.asarray(data1)
x , y = np.split(array1,[-1],axis=1)
def func(x, a , b , c , d , e):
    return a + (b - a)/((1 + c*np.exp(-d*x))**(1/e))
popt, pcov = optimize.curve_fit(func, x , y , p0=[23.2, 30.1 , 1 , 1 , 1])
popt
From the limited research I've done, it might be a problem with the x and y arrays. The title states the error that is written. It is a minpack.error.
Edit: the error returned
ValueError: object too deep for desired array
Traceback (most recent call last):
File "~/test2.py", line 15, in <module>
popt, pcov = optimize.curve_fit(func, x , y , p0=[23.2, 30.1 , 1 , 1 , 1])
File "~/'virtualenvname'/lib/python3.7/site-packages/scipy/optimize/minpack.py", line 744, in curve_fit
res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
File "~/'virtualenvname'/lib/python3.7/site-packages/scipy/optimize/minpack.py", line 394, in leastsq
gtol, maxfev, epsfcn, factor, diag)
minpack.error: Result from function call is not a proper array of floats.
Thank you.
After the split, x and y each have shape (..., 1), which means every element is itself an array of length one. You want to flatten the arrays first, e.g. via x = x.flatten() (or np.ravel(x)).
But I think you don't need the split at all. You can just do the following
array1 = np.asarray(data1).T
x , y = array1
You want x and y to be the first and second columns of array1. So an easy way to achieve this is to transpose the array first. You could also access them via [:,0] and [:,1].
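The shapes produced by each approach can be compared on a tiny table. This is a sketch only: the DataFrame below is a made-up stand-in for the CSV, reusing the two column names from the question.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the CSV contents
df = pd.DataFrame({"time": [0.0, 1.0, 2.0, 3.0],
                   "1mnaoh trial 1": [23.2, 25.0, 28.1, 30.1]})
array1 = np.asarray(df)

# Splitting keeps a trailing axis of length 1
x_col, y_col = np.split(array1, [-1], axis=1)
print(x_col.shape, y_col.shape)   # (4, 1) (4, 1)

# Transposing and unpacking gives 1-D arrays
x, y = array1.T
print(x.shape, y.shape)           # (4,) (4,)

# Column indexing gives the same 1-D arrays
x2, y2 = array1[:, 0], array1[:, 1]
```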
import math
from scipy.optimize import fsolve
def sigma(s, Bpu):
    return s - math.sin(s) - math.pi * Bpu

def jac_sigma(s):
    return 1 - math.cos(s)

if __name__ == '__main__':
    Bpu = 0.5
    sig_r = fsolve(sigma, x0=[math.pi], args=(Bpu), fprime=jac_sigma)
Running the above script throws the following error,
Traceback (most recent call last):
File "C:\Users\RP12808\Desktop\_test_fsolve.py", line 12, in <module>
sig_r = fsolve(sigma, x0=[math.pi], args=(Bpu), fprime=jac_sigma)
File "C:\Users\RP12808\AppData\Local\Programs\Python\Python36\lib\site-packages\scipy\optimize\minpack.py", line 146, in fsolve
res = _root_hybr(func, x0, args, jac=fprime, **options)
File "C:\Users\RP12808\AppData\Local\Programs\Python\Python36\lib\site-packages\scipy\optimize\minpack.py", line 226, in _root_hybr
_check_func('fsolve', 'fprime', Dfun, x0, args, n, (n, n))
File "C:\Users\RP12808\AppData\Local\Programs\Python\Python36\lib\site-packages\scipy\optimize\minpack.py", line 26, in _check_func
res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
TypeError: jac_sigma() takes 1 positional argument but 2 were given
I am unsure how to pass the Jacobian to the fsolve function. How do I solve this?
Thanks in advance..RP
The function that computes the Jacobian matrix must take the same arguments as the function to be solved, and it must return an array:
import numpy as np

def jac_sigma(s, Bpu):
    return np.array([1 - math.cos(s)])
In general, the Jacobian matrix is a two-dimensional array, but when the variable is a scalar (as it is here) and the Jacobian "matrix" is 1x1, the code accepts a one- or two-dimensional value. (It might be nice if it also accepted a scalar in this case, but it doesn't.)
Actually, it is sufficient that the return value be "array-like"; e.g. a list is also acceptable:
def jac_sigma(s, Bpu):
    return [1 - math.cos(s)]
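Putting the pieces together, a complete script might look like the sketch below. It is not the original script: it assumes NumPy's sin and cos in place of the math functions (fsolve passes s as a length-1 array, which NumPy functions handle directly), and it writes args as an explicit one-element tuple.

```python
import numpy as np
from scipy.optimize import fsolve

def sigma(s, Bpu):
    return s - np.sin(s) - np.pi * Bpu

def jac_sigma(s, Bpu):
    # Same arguments as sigma, even though Bpu is unused here;
    # s is a length-1 array, so the result is an array, not a bare scalar
    return np.atleast_1d(1 - np.cos(s))

Bpu = 0.5
sig_r = fsolve(sigma, x0=[np.pi], args=(Bpu,), fprime=jac_sigma)
print(sig_r, sigma(sig_r, Bpu))
```

The second printed value is the residual at the returned root, which should be close to zero if the solver converged.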
This is my code
import os
import sys
import numpy as np
import scipy
from scipy.optimize import leastsq
def peval (inp_mat,p):
    m0,m1,m2,m3,m4,m5,m6,m7 = p
    out_mat = np.array(np.zeros(inp_mat.shape,dtype=np.float32))
    mid = inp_mat.shape[0]/2
    for xy in range(0,inp_mat.shape[0]):
        if (xy<(inp_mat.shape[0]/2)):
            out_mat[xy] = ( ( (inp_mat[xy+mid]*m0)+(inp_mat[xy]*m1)+ m2 ) /( (inp_mat[xy+mid]*m6)+(inp_mat[xy]*m7)+1 ) )
        else:
            out_mat[xy] = ( ( (inp_mat[xy]*m3)+(inp_mat[xy-mid]*m4)+ m5 ) /( (inp_mat[xy]*m6)+(inp_mat[xy-mid]*m7)+1 ) )
    return np.array(out_mat)

def residuals(p, out_mat, inp_mat):
    m0,m1,m2,m3,m4,m5,m6,m7 = p
    err=np.array(np.zeros(inp_mat.shape,dtype=np.float32))
    if (out_mat.shape == inp_mat.shape):
        for xy in range(0,inp_mat.shape[0]):
            err[xy] = err[xy]+ (out_mat[xy] -inp_mat[xy])
    return np.array(err)
f = open('/media/anilil/Data/Datasets/repo/txt_op/vid.txt','r')
x = np.loadtxt(f,dtype=np.int16,comments='#',delimiter='\t')
nof = x.shape[0]/72 # Find the number of frames
x1 = x.reshape(-1,60,40)
x1_1= x1[0,:,:].flatten()
x1_2= x1[1,:,:].flatten()
x= []
y= []
for xy in range(1,50,1):
    y.append(x1[xy,:,:].flatten())
    x.append(x1[xy-1,:,:].flatten())
x=np.array(x,dtype=np.float32)
y=np.array(y,dtype=np.float32)
length = x1_1.shape
# initial guess
p0 = np.array([1,1,1,1,1,1,1,1],dtype=np.float32)
abc=leastsq(residuals, p0,args=(y,x))
print ('Size of first matrix is '+str(x1_1.shape))
print ('Size of first matrix is '+str(x1_2.shape))
print ("Done with program")
I have tried adding np.array in most places, to no avail.
Could someone please help me?
Another question: should residuals() return a single value by adding up all the errors with np.sum(err,axis=1), or should I leave it the way it is?
When I return np.sum(err,axis=1) from residuals(), the initial guess does not change. It just remains the same.
That is, should the error be given for each item in the input-output mapping, or as one combined error overall?
Example data.
Output
ValueError: object too deep for desired array
Traceback (most recent call last):
File "/media/anilil/Data/charm/mv_clean/.idea/nose_reduction_mpeg.py", line 49, in <module>
abc=leastsq(residuals, p0,args=(y,x))
File "/usr/lib/python2.7/dist-packages/scipy/optimize/minpack.py", line 378, in leastsq
gtol, maxfev, epsfcn, factor, diag)
minpack.error: Result from function call is not a proper array of floats.
leastsq requires a 1D array to be returned from your residuals function.
Currently you calculate the residuals for the whole image and return that as a 2D array.
The simple fix would be to flatten the array of residuals (turning your 2D array into a 1D one).
So instead of returning
return np.array(err)
Do this instead
return err.flatten()
Note that err is already a numpy array so doesn't need to be cast before the return (I guess that slipped in when you were trying to debug it!)
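A small self-contained sketch of the flattened-residuals pattern follows. The two-parameter model (gain and offset) and the random data are hypothetical, chosen only to keep the example short; they are not the eight-parameter model from the question.

```python
import numpy as np
from scipy.optimize import leastsq

def residuals(p, out_mat, inp_mat):
    # Hypothetical model for illustration: out = gain * inp + offset
    gain, offset = p
    err = out_mat - (gain * inp_mat + offset)  # 2-D, same shape as the inputs
    return err.flatten()                       # 1-D, one entry per element

rng = np.random.default_rng(0)
x = rng.random((5, 12))    # made-up 2-D input
y = 2.0 * x + 1.0          # made-up target generated with gain=2, offset=1

p0 = np.array([1.0, 1.0])
p_fit, ier = leastsq(residuals, p0, args=(y, x))
print(p_fit)
```

leastsq squares and sums the returned vector itself, so the residuals are returned element by element rather than pre-summed.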
I seem to be getting an error when I use the root-finder in scipy. I was wondering if anyone could point out what I'm doing wrong.
The function I'm finding the root of is just an easy example, and not particularly important.
If I run this code with scipy 0.9.0:
import numpy as np
from scipy.optimize import fsolve
tmpFunc = lambda xIn: (xIn[0]-4)**2 + (xIn[1]-5)**2 + (xIn[2]-7)**3
x0 = [3,4,5]
xFinal = fsolve(tmpFunc, x0 )
print xFinal
I get the following error message:
Traceback (most recent call last):
File "tmpStack.py", line 7, in <module>
xFinal = fsolve(tmpFunc, x0 )
File "/usr/lib/python2.7/dist-packages/scipy/optimize/minpack.py", line 115, in fsolve
_check_func('fsolve', 'func', func, x0, args, n, (n,))
File "/usr/lib/python2.7/dist-packages/scipy/optimize/minpack.py", line 26, in _check_func
raise TypeError(msg)
TypeError: fsolve: there is a mismatch between the input and output shape of the 'func' argument '<lambda>'.
Well, it looks like I was using this routine incorrectly. fsolve requires the same number of equations as variables, whereas I gave it one equation with three variables. So if the input to the function is a 3-element array, the output should also be a 3-element array. This code works:
import numpy as np
from scipy.optimize import fsolve
tmpFunc = lambda xIn: np.array( [(xIn[0]-4)**2 + xIn[1], (xIn[1]-5)**2 - xIn[2],
                                 (xIn[2]-7)**3 + xIn[0] ] )
x0 = [3,4,5]
xFinal = fsolve(tmpFunc, x0 )
print xFinal
Which represents solving three equations simultaneously.
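The same idea in Python 3 syntax, as a sketch: the system below is a hypothetical one, picked because its root (1, 2, 3) is easy to verify by hand, and it is not the system from the post. The point is only that the function returns as many values as it receives.

```python
import numpy as np
from scipy.optimize import fsolve

def equations(v):
    # Hypothetical system with root (1, 2, 3), for illustration only
    return np.array([
        v[0] + v[1] - 3.0,
        v[1] * v[2] - 6.0,
        v[0] - 1.0,
    ])

x0 = [3, 4, 5]
print(np.shape(equations(x0)))   # (3,)

x_final = fsolve(equations, x0)
print(x_final)
```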