Fitting a Gauss function to measurement data - python

We are trying to fit a Gauss function to some data, but we always get a warning that the covariance (and hence the parameter errors) could not be estimated, and the fit is very bad: the parameters are all estimated as 1 and the errors as infinity.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from uncertainties import unumpy as unp
from uncertainties import ufloat
from uncertainties import umath as um
from scipy.constants import pi,c,e,h,sigma,k,N_A,zero_Celsius
x_H=np.loadtxt('a4_spek.csv',delimiter=',',usecols=0,skiprows=3)
P_H=np.loadtxt('a4_spek.csv',delimiter=',',usecols=1,skiprows=3)
x_H=unp.uarray(x_H,1)+ufloat(38,1) #in mm
x_L=unp.uarray(x_L,1)+ufloat(38,1) #in mm
P_H=unp.uarray(P_H,0.001)-ufloat(0.001,0.001) #in µW
P_L=unp.uarray(P_L,0.001)-ufloat(0.001,0.001) #in µW
def gaus(x,y0,x0,sig):
    return y0*np.exp(-(x-x0)**2/(2*sig**2))/np.sqrt(2*pi*sig**2)
sig=unp.std_devs(P_H)
y=unp.nominal_values(P_H)
x=unp.nominal_values(x_H)
kg, kger = curve_fit(gaus,x,y,sigma=sig,method='lm')
print(kg)
print(kger)
This is the relevant data:
a4_spek.csv
Thanks for your help.

curve_fit is sensitive to initial conditions. The default in your case would be p0 = [1.0, 1.0, 1.0], which is what is giving you the problem. Try the following:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from uncertainties import unumpy as unp
from uncertainties import ufloat
from uncertainties import umath as um
from scipy.constants import pi,c,e,h,sigma,k,N_A,zero_Celsius
x_H=np.loadtxt('a4_spek.csv',delimiter=',',usecols=0,skiprows=3)
P_H=np.loadtxt('a4_spek.csv',delimiter=',',usecols=1,skiprows=3)
x_H=unp.uarray(x_H,1)+ufloat(38,1) #in mm
#x_L=unp.uarray(x_L,1)+ufloat(38,1) #in mm
P_H=unp.uarray(P_H,0.001)-ufloat(0.001,0.001) #in µW
#P_L=unp.uarray(P_L,0.001)-ufloat(0.001,0.001) #in µW
def gaus(x,y0,x0,sig):
    return y0*np.exp(-(x-x0)**2/(2*sig**2))/np.sqrt(2*pi*sig**2)
sig=unp.std_devs(P_H)
y=unp.nominal_values(P_H)
x=unp.nominal_values(x_H)
kg, kger = curve_fit(gaus, x, y, p0= [100, 100, 100], sigma=sig, method='lm')
print(kg)
print(kger)
The initial values for the fit are now [100, 100, 100] which appears to be a better starting point for your data.
The output,
[ 1.48883451 84.19781151 3.66861888]
[[ 0.00923875 -0.00232398 0.01531638]
[-0.00232398 0.07796845 -0.01488248]
[ 0.01531638 -0.01488248 0.07563641]]
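A further improvement: rather than hard-coding the starting point, you can estimate p0 from the data itself, so the fit no longer depends on a lucky guess. A minimal sketch using method-of-moments estimates (my own addition, assuming the same x, y and sig arrays as above):
# Since gaus() is normalized, y0 is the area under the curve.
y0_guess = np.trapz(y, x)                                  # total area ~ amplitude y0
x0_guess = np.sum(x*y)/np.sum(y)                           # intensity-weighted centroid
sig_guess = np.sqrt(np.sum(y*(x-x0_guess)**2)/np.sum(y))   # weighted spread
kg, kger = curve_fit(gaus, x, y, p0=[y0_guess, x0_guess, sig_guess], sigma=sig, method='lm')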

Related

interpolate, differentiate and integrate a function -- some math fun

I have a problem. I have three lists: list_umf holds the x values, list_kf holds y values, and list_kfm holds a second set of y values. kfm is the integral of kf. The values are the output of my code.
To show that kfm is the integral of kf, I want to calculate the derivative of kfm, which should be the same as kf. But the recalculated kf (list_kf_re) is just 101.0 every time.
What's wrong with my code?
import numpy as np
from scipy import integrate, interpolate
from scipy.misc import derivative as deriv
import matplotlib.pyplot as plt
list_kfm = [15.348748494618041, 26.240336614039776, 37.76846357985518, 49.80068952374503, 62.25356792292074, 75.0692188764684, 88.20491343740369, 101.6276911997135,
115.31128207665246, 129.2342114999071, 143.37856687640036, 157.72915825067278, 172.27292637703843, 186.9985127198004, 201.89593919604192, 216.95636451973587]
list_kf = [168.08871431597626, 179.78615963605742, 188.728883379148, 196.0371678709251, 202.25334207341422, 207.68364358717665, 212.51893919883966, 216.88670040685466,
220.87653440371076, 224.55397301446894, 227.96847485999652, 231.15833919688876, 234.1538643061246, 236.97945558527186, 239.65507793294745, 242.19728380107006]
list_umf = [0.1, 0.15000000000000002, 0.20000000000000004, 0.25000000000000006, 0.30000000000000004, 0.3500000000000001, 0.40000000000000013, 0.45000000000000007,
0.5000000000000001, 0.5500000000000002, 0.6000000000000002, 0.6500000000000001, 0.7000000000000002, 0.7500000000000002, 0.8000000000000002, 0.8500000000000002]
f = interpolate.interp1d(
    list_umf, list_kfm, bounds_error=False, fill_value=(15, 217))
list_kf_re = [deriv(f, x) for x in list_umf]
plt.plot(list_umf, list_kfm, label='kfm')
plt.plot(list_umf, list_kf, label='kf')
plt.plot(list_umf, list_kf_re, label='kfre')
print(list_kf_re)
print(list_kf)
Use UnivariateSpline to create an interpolator whose derivative (and integral) methods you can call later (see this post).
Sample:
import numpy as np
#from scipy import integrate, interpolate
from scipy.interpolate import UnivariateSpline as US, InterpolatedUnivariateSpline as IUS
#from scipy.misc import derivative as deriv
import matplotlib.pyplot as plt
list_kfm = [15.348748494618041, 26.240336614039776, 37.76846357985518, 49.80068952374503, 62.25356792292074, 75.0692188764684, 88.20491343740369, 101.6276911997135,
115.31128207665246, 129.2342114999071, 143.37856687640036, 157.72915825067278, 172.27292637703843, 186.9985127198004, 201.89593919604192, 216.95636451973587]
list_kf = [168.08871431597626, 179.78615963605742, 188.728883379148, 196.0371678709251, 202.25334207341422, 207.68364358717665, 212.51893919883966, 216.88670040685466,
220.87653440371076, 224.55397301446894, 227.96847485999652, 231.15833919688876, 234.1538643061246, 236.97945558527186, 239.65507793294745, 242.19728380107006]
list_umf = [0.1, 0.15000000000000002, 0.20000000000000004, 0.25000000000000006, 0.30000000000000004, 0.3500000000000001, 0.40000000000000013, 0.45000000000000007,
0.5000000000000001, 0.5500000000000002, 0.6000000000000002, 0.6500000000000001, 0.7000000000000002, 0.7500000000000002, 0.8000000000000002, 0.8500000000000002]
f = US(list_umf, list_kfm)
list_kf_re = [f.derivative()(x) for x in list_umf]
plt.plot(list_umf, list_kfm, label='kfm')
plt.plot(list_umf, list_kf, label='kf')
plt.plot(list_umf, list_kf_re, label='kfre')
plt.plot(list_umf, list_kf, 'o', label='kfre_2')
print(list_kf_re)
print(list_kf)
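For reference, here is why the original approach returned 101.0 everywhere: scipy.misc.derivative uses a central difference with a default step of dx=1.0, but the data only span x = 0.1 to 0.85. Every evaluation at x ± 1.0 therefore fell outside the interpolation range and was clamped to the fill values (15, 217), giving (217 - 15)/(2*1.0) = 101.0 at every point. Passing a step smaller than the 0.05 grid spacing also fixes it for the interior points (a sketch; the endpoints would still touch the fill values):
list_kf_re = [deriv(f, x, dx=0.01) for x in list_umf[1:-1]]  # step well below the grid spacing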

logistic regression/puzzled about scipy.optimize function

I was trying to work through a logistic regression problem: the task is to find parameters that separate points into two groups.
After I completed the costFunction and the gradient function, I decided to use the fmin_tnc function:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import scipy.optimize as opt
def sigmoid(x):
    return 1/(1+np.exp(-x))

def costFunction(theta,X,y):
    # print(X.shape,theta.shape)
    A=(-y)*np.log(sigmoid(X @ theta))
    # print(A.shape)
    B=(1-y)*np.log(1-sigmoid(X.dot(theta)))
    # print(B.shape)
    return np.mean(A-B)
data=pd.read_csv("D:\Files\Coursera-ML-AndrewNg-Notes-master\code\ex2-logistic regression\ex2data1.txt",
                 names=['exam1','exam2','admitted'])
data.insert(0,'Ones',1)
cols=data.shape[1]
X=data.iloc[:,0:cols-1]
y=data.iloc[:,cols-1:cols]
X=np.array(X.values) #100*3
y=np.array(y.values) #100*1
theta=np.zeros(shape=[3,1])
#X:100*3 y:100*1 theta:3*1
def gradient(theta,X,y):
    return (X.T @ (sigmoid(X @ theta) - y))/len(X)
theta.reshape([3,1])
res=opt.fmin_tnc(func=costFunction,x0=theta,fprime=gradient,args=(X,y))
positive = data[data.admitted.isin(['1'])]
negetive = data[data.admitted.isin(['0'])]
plt.scatter(positive['exam1'],positive['exam2'],c='r',label='Admitted')
plt.scatter(negetive['exam1'],negetive['exam2'],c='yellow',marker='x',label='NotAdmitted')
x1=np.linspace(30,100)
x2=-1*(final_theta[0]+final_theta[1]*x1)/final_theta[2]
plt.plot(x1,x2,c='blue')
plt.savefig('./fig2.png')
plt.show()
the error
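One thing worth pointing out regardless of the error message: the plotting code uses final_theta, which is never defined. fmin_tnc returns a tuple (x, nfeval, rc), so the fitted parameters have to be extracted from the result first; a minimal sketch, assuming the optimization itself succeeds:
final_theta = res[0]  # fmin_tnc returns (solution, nfeval, return_code)
x2 = -1*(final_theta[0]+final_theta[1]*x1)/final_theta[2]  # decision boundary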

solve complicated ODE system using scipy.integrate solve_bvp unsuccessfully

When I run my code line by line, no errors occur. However, when I check the result AA (the BVP solution in my code), it shows that the ODE system was not solved successfully.
How can I run this code correctly?
import numpy as np
import scipy.integrate
from scipy.integrate import solve_bvp
import matplotlib.pyplot as plt
k=0.06
Theta=20
p=0.4 #porosity of the electrode
pi=3.14
L=4 #cm
R=8.314
F=96485
t2=0.78
C0=1
T=298.15
vs=1
aa=0.5
ac=0.5
a=23300
i0=2e-2
Dad=0.5
I=2
Da=900
B=(1/k*(p**(1.5)))+1/Theta
C=I/(2*pi*L*Theta)
D=2*R*T/F
E=Da*p**(1.5)
def battery(r,y):
    A=np.exp((aa*F*y[0])/(R*T))-np.exp((-ac*F*y[0])/(R*T))
    return np.vstack((y[1]*B-C/r+(D*y[3]/y[2])*(0.22+y[2]),
                      A/((1/a*i0)+A/Dad),
                      y[3],
                      ((1-E)/E)*y[3]+(0.22/(E*F))*(A/((1/a*i0)+A/Dad))))

def boundary(ya,yb):
    return [ya[1]-2/(2*3.14*4*10*4.2),yb[1],ya[2]-9,yb[3]]
n = 25
r = np.linspace(4.2, 6.9, n)
y = np.ones((4,r.size))
AA=solve_bvp(battery,boundary,r,y)
The results are below:
sol: <scipy.interpolate.interpolate.PPoly object at 0x11303f728>
status: 2
success: False
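status: 2 from solve_bvp means a singular Jacobian was encountered while solving the collocation system, so the solver broke down rather than converged. A frequent cause is a poor initial guess: starting from all ones leaves y[2], which appears in a denominator inside battery(), far from the value the boundary condition ya[2] - 9 = 0 asks for. A sketch of one thing to try (the guess values are my assumption, not from the original post):
# Shape the initial guess to respect the boundary conditions instead of all ones.
y = np.ones((4, r.size))
y[2] = 9.0     # ya[2]-9=0, so start y[2] near 9 and away from zero
y[1] = 0.0     # yb[1]=0 at the outer boundary
AA = solve_bvp(battery, boundary, r, y, max_nodes=10000, verbose=2)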

Is that the expected result on ndimage.gaussian_filter?

I'm trying to calculate a discrete derivative using gaussian_filter1d from scipy.ndimage, but the output shows some strange behavior at the boundaries. The code is below:
from scipy import ndimage
import numpy as np
import matplotlib.pyplot as plt
y = np.linspace(0.2*np.pi,0.7*np.pi,100)
U = np.sin(y)
sg = 1
Uy = ndimage.gaussian_filter1d(U, sigma=sg,order=1,mode='constant',cval=0)
Uy2 = ndimage.gaussian_filter1d(U, sigma=sg,order=1,mode='nearest')
Uy3 = ndimage.gaussian_filter1d(U, sigma=sg,order=1,mode='reflect')
Uy4 = ndimage.gaussian_filter1d(U, sigma=sg,order=1,mode='mirror')
Uy5 = ndimage.gaussian_filter1d(U, sigma=sg,order=1,mode='wrap')
fig,(a1,a2) = plt.subplots(1,2)
a1.plot(U , y,label='data')
a2.plot(Uy, y,label='constant')
a2.plot(Uy2,y,label='nearest')
a2.plot(Uy3,y,label='reflect')
a2.plot(Uy4,y,label='mirror')
a2.plot(Uy5,y,label='wrap')
a1.legend(loc='best')
a2.legend(loc='best')
What happened? Shouldn't constant mode give cval at the boundary? Is this the expected result?
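Yes, this is the expected result. The mode only controls how the input is padded before it is convolved with the derivative-of-Gaussian kernel; with mode='constant' the signal is extended with cval=0, so near the edges the filter sees an artificial step from sin(y) down to 0, and the first derivative spikes there. That is also why all the modes agree in the interior and only differ near the boundaries. Note too that the filter differentiates with respect to the sample index, not y, so its output is dU/dy multiplied by the grid spacing. A quick sanity check against a padding-free finite difference (np.gradient is my choice of reference, not part of the original post):
# Interior points are unaffected by padding: compare with a per-sample central difference.
Uy_fd = np.gradient(U)                             # default spacing of 1 sample, like the filter
print(np.abs(Uy2[10:-10] - Uy_fd[10:-10]).max())   # tiny in the interior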

'numpy.ndarray' object is not callable when using a CALLABLE function in minimization

I keep getting the 'numpy.ndarray' object is not callable error. I know this error occurs when an np.array is used where a function is expected. The problem is that in my code I am, as far as I can tell, indeed passing a function to the minimize function.
Could someone please let me know what is happening?
The code is here:
# -*- coding: utf-8 -*-
"""
Created on Thu Oct 15 06:27:54 2015
"""
# Midterm Macroeconometrics
import numpy as np
from numpy import log
import numpy.linalg as linalg
from scipy import *
from scipy.optimize import fminbound, broyden1, brentq, bisect, minimize
from scipy import interp
import pylab as pl
#from numdifftools import Gradient, Jacobian, Derivative
import matplotlib.pyplot as plt
import pandas as pd
from mpl_toolkits.mplot3d import axes3d
from matplotlib import cm
import scipy.io as sio
import os
"""
IMPORTING DATA FROM PANDAS
"""
#Importing data from text file- using Pandas.
os.chdir(r'/Users/camilahenao/Dropbox/UIUC Phd Econ/Year 3/Fall/macroeconometrics shin/Homework/ps3-MIDTERM')
os.path.abspath(os.path.curdir)
data=pd.read_csv(r'midterm2015.csv', header= None)
data.columns = ['GDP_I', 'GDP_E']
GDP_I=np.array(data.GDP_I)
GDP_E=np.array(data.GDP_E)
y= np.vstack((GDP_I,GDP_E))
def kalman2(a_old, p_old, Z, gamma, theta, y):
    mu, rho, h_I, h_E, h_G = theta[0], theta[1], np.log(theta[2]), np.log(theta[3]), np.log(theta[4])
    sigma_I= np.exp(h_I)
    sigma_E= np.exp(h_E)
    sigma_G= np.exp(h_G)
    H = np.array([[sigma_I,0],[0, sigma_E]])
    H=np.matrix(H)
    list_a = np.array([a_old])
    list_p = np.array([p_old])
    list_f = np.array([])
    list_v = np.array([])
    log_likelihood_Y= np.array([])
    list_log_like_sum = np.array([])
    for i in range(y[0].size):
        N=y.shape[0]
        Time=y[0].size
        inv= np.matrix(linalg.inv(Z*p_old*Z.T+H))
        cosa= Z.T*inv
        temp= p_old*cosa
        a_new= np.array(a_old +temp*(np.array([[y[0][i]],[y[1][i]]])-Z*a_old-gamma*w))[0]
        list_a=np.hstack((list_a,a_new))
        p_new= np.array(p_old - temp* Z*p_old)[0]
        list_p=np.hstack((list_p, p_new))
        #Transform the previous posterior into prior
        a_old=T*a_new
        a_old=a_old[0]
        p_old=T*p_new*T + R*Q*R #25
        #Moments for log-likelihood:
        f= np.linalg.det(Z*p_old*Z.T + H)
        list_f= np.hstack((list_f,f))
        #print list_f
        v= np.array([[y[0][i]],[y[1][i]]])-Z*a_old - gamma*w
        v_element= np.array((v.T *np.matrix(np.linalg.inv(Z*p_old*Z.T + H)) *v))[0]
        list_v=np.hstack((list_v,v_element))
        #print list_v
        #Log likelihood function for each period of time:
        log_like= (-N*(Time-1)/2*np.log(2*pi)-(1/2)*sum(log(list_f)) -(1/2)*sum(list_v))
        log_likelihood_Y=np.hstack((log_likelihood_Y, log_like))
        #Create the sum over all Time of the log-likelihood
        log_like_sum=np.sum(log_likelihood_Y)
        list_log_like_sum=np.hstack((list_log_like_sum, log_like_sum))
    return list_a, list_p, log_likelihood_Y, list_log_like_sum
#Define the "callable function"
def mle(a_old, p_old, Z, gamma, theta, y, bds):
    a, P, py, py_sum = kalman2(a_old, p_old, Z, gamma, theta, y)
    mle= -1*py_sum
    return mle
#Run the minimization algorithm
theta2=(.8, 3.0, 5.0, 5.0, 5.0)
a_old=0.0
p_old= sigmaG/(1-rho**2)
Z=np.array([[1.0],[1.0]])
gamma=np.array([[1.0],[1.0]])
bds = [[-10e100, 10e100], [-10e100, 10e100], [1e-6, 10e100], [1e-6, 10e100], [1e-6, 10e100]]
theta_guess = [3, 0.8, np.sqrt(5), np.sqrt(5), np.sqrt(5)]
result = minimize(mle(a_old, p_old, Z, gamma, theta, y, bds), theta_guess, bounds = bds)
As Warren Weckesser mentioned in a comment, you're passing the result of calling mle(a_old, p_old, Z, gamma, theta, y, bds), which is a floating point value, as the first argument to the minimize() function. According to the scipy documentation, the first argument to minimize() should be a callable function, so for starters you're going to need to change the call so that you pass the function itself and supply the remaining values through the args keyword, something like this:
result = minimize(mle, theta_guess,
                  args=(a_old, p_old, Z, gamma, y, bds), bounds=bds)
However, you're going to run into new problems, because minimize() calls the function it is given with the parameter vector as the first argument, while your mle() expects theta in fifth position, so you're also going to need to modify its definition accordingly.
Unfortunately I don't understand enough of what you're actually trying to accomplish to suggest how you should do that.
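For illustration only, a minimal sketch of such a modification, moving theta to the front and keeping the remaining arguments as they were (whether bds really needs to be passed through to the function is not clear from the post):
def mle(theta, a_old, p_old, Z, gamma, y, bds):
    # theta comes first so that minimize() can vary it; the fixed inputs arrive via args.
    a, P, py, py_sum = kalman2(a_old, p_old, Z, gamma, theta, y)
    return -1*py_sum

result = minimize(mle, theta_guess,
                  args=(a_old, p_old, Z, gamma, y, bds), bounds=bds)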
