Exponential fit on a histogram - python

I'm trying to fit an exponential curve on a histogram created from the variable y1_pt and then get the exponential's parameters. Problem is it gives me the following warnings:
OptimizeWarning: Covariance of the parameters could not be estimated
and pcov_exponential =
array([[inf, inf, inf],
[inf, inf, inf],
[inf, inf, inf]]))
and the result is more an exponential fit which looks to me slightly random.. (see plot)
Does anyone have a clue as to what's wrong?
import pandas as pd
import numpy
from pylab import *
import scipy.stats as ss
from scipy.optimize import curve_fit
df=pd.read_hdf('data.h5','dataset')
pty1 = df1['y1_pt']
bins1 = numpy.linspace(35, 1235, 100)
counts, bins = numpy.histogram(pty1, bins = bins1, range = [35, 1235], density = False)
binscenters = numpy.array([0.5 * (bins1[i] + bins1[i+1]) for i in range(len(bins1)-1)])
def exponential(x, a, k, b):
return a*np.exp(-x*k) + b
popt_exponential, pcov_exponential = curve_fit(exponential, xdata=binscenters, ydata=counts)
print(popt_exponential)
xspace = numpy.linspace(0, 6, 100000)
plt.bar(binscenters, counts, color='navy', label=r'Histogram entries')
plt.plot(xspace, exponential(xspace, *popt_exponential), color='darkorange', linewidth=2.5, label=r'Fitted function')
plt.show()

I think you are missing a minus sign in the exponential formula, hence the overflow. It should be a * np.exp( - x * k) + b
See the example at https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html

Related

How to estimate the parameters of an exponential truncated power law using Python?

There is an equation of exponential truncated power law in the article below:
Gonzalez, M. C., Hidalgo, C. A., & Barabasi, A. L. (2008). Understanding individual human mobility patterns. Nature, 453(7196), 779-782.
like this:
It is an exponential truncated power law. There are three parameters to be estimated: rg0, beta and K. Now we have got several users' radius of gyration(rg), and uploaded it onto Github: radius of gyrations.txt
The following codes can be used to read data and calculate P(rg):
import numpy as np
# read radius of gyration from file
rg = []
with open('/path-to-the-data/radius of gyrations.txt', 'r') as f:
for i in f:
rg.append(float(i.strip('\n')))
# calculate P(rg)
rg = sorted(rg, reverse=True)
rg = np.array(rg)
prg = np.arange(len(sorted_data)) / float(len(sorted_data)-1)
or you can directly get rg and prg data as the following:
rg = np.array([ 20.7863444 , 9.40547933, 8.70934714, 8.62690145,
7.16978087, 7.02575052, 6.45280959, 6.44755478,
5.16630287, 5.16092884, 5.15618737, 5.05610068,
4.87023561, 4.66753197, 4.41807645, 4.2635671 ,
3.54454372, 2.7087178 , 2.39016885, 1.9483156 ,
1.78393238, 1.75432688, 1.12789787, 1.02098332,
0.92653501, 0.32586582, 0.1514813 , 0.09722761,
0. , 0. ])
prg = np.array([ 0. , 0.03448276, 0.06896552, 0.10344828, 0.13793103,
0.17241379, 0.20689655, 0.24137931, 0.27586207, 0.31034483,
0.34482759, 0.37931034, 0.4137931 , 0.44827586, 0.48275862,
0.51724138, 0.55172414, 0.5862069 , 0.62068966, 0.65517241,
0.68965517, 0.72413793, 0.75862069, 0.79310345, 0.82758621,
0.86206897, 0.89655172, 0.93103448, 0.96551724, 1. ])
I can plot the P(r_g) and r_g using the following python script:
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(rg, prg, 'bs', alpha = 0.3)
# roughly estimated params:
# rg0=1.8, beta=0.15, K=5
plt.plot(rg, (rg+1.8)**-.15*np.exp(-rg/5))
plt.yscale('log')
plt.xscale('log')
plt.xlabel('$r_g$', fontsize = 20)
plt.ylabel('$P(r_g)$', fontsize = 20)
plt.show()
How can I use these data of rgs to estimate the three parameters above? I hope to solve it using python.
According to #Michael 's suggestion, we can solve the problem using scipy.optimize.curve_fit
def func(rg, rg0, beta, K):
return (rg + rg0) ** (-beta) * np.exp(-rg / K)
from scipy import optimize
popt, pcov = optimize.curve_fit(func, rg, prg, p0=[1.8, 0.15, 5])
print popt
print pcov
The results are given below:
[ 1.04303608e+03 3.02058550e-03 4.85784945e+00]
[[ 1.38243336e+18 -6.14278286e+11 -1.14784675e+11]
[ -6.14278286e+11 2.72951900e+05 5.10040746e+04]
[ -1.14784675e+11 5.10040746e+04 9.53072925e+03]]
Then we can inspect the results by plotting the fitted curve.
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(rg, prg, 'bs', alpha = 0.3)
plt.plot(rg, (rg+popt[0])**-(popt[1])*np.exp(-rg/popt[2]) )
plt.yscale('log')
plt.xscale('log')
plt.xlabel('$r_g$', fontsize = 20)
plt.ylabel('$P(r_g)$', fontsize = 20)
plt.show()

curve_fit doesn't work properly with 4 parameters

Running the following code,
x = np.array([50.849937, 53.849937, 56.849937, 59.849937, 62.849937, 65.849937, 68.849937, 71.849937, 74.849937, 77.849937, 80.849937, 83.849937, 86.849937, 89.849937, 92.849937])
y = np.array([410.67800, 402.63800, 402.63800, 386.55800, 330.27600, 217.71400, 72.98990, 16.70860, 8.66833, 40.82920, 241.83400, 386.55800, 394.59800, 394.59800, 402.63800])
def f(om, a, i , c):
return a - i*np.exp(- c* (om-74.)**2)
par, cov = curve_fit(f, x, y)
stdev = np.sqrt(np.diag(cov) )
produces this Graph,
With the following parameters and standard deviation:
par = [ 4.09652163e+02, 4.33961227e+02, 1.58719772e-02]
stdev = [ 1.46309578e+01, 2.44878171e+01, 2.40474753e-03]
However, when trying to fit this data to the following function:
def f(om, a, i , c, omo):
return a - i*np.exp(- c* (om-omo)**2)
It doesn't work, it produces a standard deviation of
stdev = [inf, inf, inf, inf, inf]
Is there any way to fix this?
It looks like it isn't converging (see this and this). Try adding an initial condition,
par, cov = curve_fit(f, x, y, p0=[1.,1.,1.,74.])
which results in the
par = [ 4.11892318e+02, 4.36953868e+02, 1.55741131e-02, 7.32560690e+01])
stdev = [ 1.17579445e+01, 1.94401006e+01, 1.86709423e-03, 2.62952690e-01]
You can calculate the initial condition from data:
%matplotlib inline
import pylab as pl
import numpy as np
from scipy.optimize import curve_fit
x = np.array([50.849937, 53.849937, 56.849937, 59.849937, 62.849937, 65.849937, 68.849937, 71.849937, 74.849937, 77.849937, 80.849937, 83.849937, 86.849937, 89.849937, 92.849937])
y = np.array([410.67800, 402.63800, 402.63800, 386.55800, 330.27600, 217.71400, 72.98990, 16.70860, 8.66833, 40.82920, 241.83400, 386.55800, 394.59800, 394.59800, 402.63800])
def f(om, a, i , c, omo):
return a - i*np.exp(- c* (om-omo)**2)
par, cov = curve_fit(f, x, y, p0=[y.max(), y.ptp(), 1, x[np.argmin(y)]])
stdev = np.sqrt(np.diag(cov) )
pl.plot(x, y, "o")
x2 = np.linspace(x.min(), x.max(), 100)
pl.plot(x2, f(x2, *par))

morse potential fit using python and curve fit from scipy

I am trying to fit a morse potential using a python and scipy.
The morse potential is defined as:
V = D*(exp(-2*m*(x-u)) - 2*exp(-m*(x-u)))
where D, m and u are the parameters I need to extract.
Unfortunately the fit is not satisfactory as you can see below (sorry I do not have 10 reputation so the image has to be clicked). Could anyone help me please? I must say I am not the best programmer with python.
Here is my code:
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
xdata2=np.array([1.0 ,1.1 ,1.2 ,1.3 ,1.4 ,1.5 ,1.6 ,1.7 ,1.8 ,1.9 ,2.0 ,2.1 ,2.2 ,2.3 ,2.4 ,2.5 ,2.6 ,2.7 ,2.8 ,2.9 ,3.0 ,3.1 ,3.2 ,3.3 ,3.4 ,3.5 ,3.6 ,3.7 ,3.8 ,3.9 ,4.0 ,4.1 ,4.2 ,4.3 ,4.4 ,4.5 ,4.6 ,4.7 ,4.8 ,4.9 ,5.0 ,5.1 ,5.2 ,5.3 ,5.4 ,5.5 ,5.6 ,5.7 ,5.8 ,5.9])
ydata2=[-1360.121815,-1368.532641,-1374.215047,-1378.090480,-1380.648178,-1382.223113,-1383.091562,-1383.479384,-1383.558087,-1383.445803,-1383.220380,-1382.931531,-1382.609269,-1382.273574,-1381.940879,-1381.621299,-1381.319042,-1381.036231,-1380.772039,-1380.527051,-1380.301961,-1380.096257,-1379.907700,-1379.734621,-1379.575837,-1379.430693,-1379.299282,-1379.181303,-1379.077272,-1378.985220,-1378.903626,-1378.831588,-1378.768880,-1378.715015,-1378.668910,-1378.629996,-1378.597943,-1378.572742,-1378.554547,-1378.543296,-1378.539843,-1378.543593,-1378.554519,-1378.572747,-1378.597945,-1378.630024,-1378.668911,-1378.715015,-1378.768915,-1378.831593]
t=np.linspace(0.1,7)
def morse(q, m, u, x ):
return (q * (np.exp(-2*m*(x-u))-2*np.exp(-m*(x-u))))
popt, pcov = curve_fit(morse, xdata2, ydata2, maxfev=40000000)
yfit = morse(t,popt[0], popt[1], popt[2])
print popt
plt.plot(xdata2, ydata2,"ro")
plt.plot(t, yfit)
plt.show()
Old fit before gboffi's comment
I am guessing the exact depth of the morse potential does not interest you overly much. So I added an additional parameter to shift the morse potential up and down (v), includes #gboffis comment. Furthermore, the first argument of your function must be the arguments, not the parameters you want to fit (see http://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.optimize.curve_fit.html)
In addition, such fits are dependent on your starting position. The following should give you what you want.
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
xdata2=np.array([1.0 ,1.1 ,1.2 ,1.3 ,1.4 ,1.5 ,1.6 ,1.7 ,1.8 ,1.9 ,2.0 ,2.1 ,2.2 ,2.3 ,2.4 ,2.5 ,2.6 ,2.7 ,2.8 ,2.9 ,3.0 ,3.1 ,3.2 ,3.3 ,3.4 ,3.5 ,3.6 ,3.7 ,3.8 ,3.9 ,4.0 ,4.1 ,4.2 ,4.3 ,4.4 ,4.5 ,4.6 ,4.7 ,4.8 ,4.9 ,5.0 ,5.1 ,5.2 ,5.3 ,5.4 ,5.5 ,5.6 ,5.7 ,5.8 ,5.9])
ydata2=[-1360.121815,-1368.532641,-1374.215047,-1378.090480,-1380.648178,-1382.223113,-1383.091562,-1383.479384,-1383.558087,-1383.445803,-1383.220380,-1382.931531,-1382.609269,-1382.273574,-1381.940879,-1381.621299,-1381.319042,-1381.036231,-1380.772039,-1380.527051,-1380.301961,-1380.096257,-1379.907700,-1379.734621,-1379.575837,-1379.430693,-1379.299282,-1379.181303,-1379.077272,-1378.985220,-1378.903626,-1378.831588,-1378.768880,-1378.715015,-1378.668910,-1378.629996,-1378.597943,-1378.572742,-1378.554547,-1378.543296,-1378.539843,-1378.543593,-1378.554519,-1378.572747,-1378.597945,-1378.630024,-1378.668911,-1378.715015,-1378.768915,-1378.831593]
t=np.linspace(0.1,7)
tstart = [1.e+3, 1, 3, 0]
def morse(x, q, m, u , v):
return (q * (np.exp(-2*m*(x-u))-2*np.exp(-m*(x-u))) + v)
popt, pcov = curve_fit(morse, xdata2, ydata2, p0 = tstart, maxfev=40000000)
print popt # [ 5.10155662 1.43329962 1.7991549 -1378.53461345]
yfit = morse(t,popt[0], popt[1], popt[2], popt[3])
#print popt
#
#
#
plt.plot(xdata2, ydata2,"ro")
plt.plot(t, yfit)
plt.show()

curve fitting with a known function numpy

I have a x and y one-dimension numpy array and I would like to reproduce y with a known function to obtain "beta". Here is the code I am using:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
y = array([ 0.04022493, 0.04287536, 0.03983657, 0.0393201 , 0.03810298,
0.0363814 , 0.0331144 , 0.03074823, 0.02795767, 0.02413816,
0.02180802, 0.01861309, 0.01632699, 0.01368056, 0.01124232,
0.01005323, 0.00867196, 0.00940864, 0.00961282, 0.00892419,
0.01048963, 0.01199101, 0.01533408, 0.01855704, 0.02163586,
0.02630014, 0.02971127, 0.03511223, 0.03941218, 0.04280329,
0.04689105, 0.04960554, 0.05232003, 0.05487037, 0.05843364,
0.05120701])
x= array([ 0., 0.08975979, 0.17951958, 0.26927937, 0.35903916,
0.44879895, 0.53855874, 0.62831853, 0.71807832, 0.80783811,
0.8975979 , 0.98735769, 1.07711748, 1.16687727, 1.25663706,
1.34639685, 1.43615664, 1.52591643, 1.61567622, 1.70543601,
1.7951958 , 1.88495559, 1.97471538, 2.06447517, 2.15423496,
2.24399475, 2.33375454, 2.42351433, 2.51327412, 2.60303391,
2.6927937 , 2.78255349, 2.87231328, 2.96207307, 3.05183286,
3.14159265])
def func(x,beta):
return 1.0/(4.0*np.pi)*(1+beta*(3.0/2*np.cos(x)**2-1.0/2))
guesses = [20]
popt,pcov = curve_fit(func,x,y,p0=guesses)
y_fit = 1/(4.0*np.pi)*(1+popt[0]*(3.0/2*np.cos(x)**2-1.0/2))
plt.figure(1)
plt.plot(x,y,'ro',x,y_fit,'k-')
plt.show()
The code works but the fitting is completely off (see picture). Any idea why?
It looks like the formula to use contains an additional parameter, i.e. p
def func(x,beta,p):
return p/(4.0*np.pi)*(1+beta*(3.0/2*np.cos(x)**2-1.0/2))
guesses = [20,5]
popt,pcov = curve_fit(func,x,y,p0=guesses)
y_fit = func(angle_plot,*popt)
plt.figure(2)
plt.plot(x,y,'ro',x,y_fit,'k-')
plt.show()
print popt # [ 1.23341604 0.27362069]
In the popt which one is beta and which one is p?
This is perhaps not what you want but, if you are just trying to get a good fit to the data, you could use np.polyfit:
fit = np.polyfit(x,y,4)
fit_fn = np.poly1d(fit)
plt.scatter(x,y,label='data',color='r')
plt.plot(x,fit_fn(x),color='b',label='fit')
plt.legend(loc='upper left')
Note that fit gives the coefficient values of, in this case, a 4th order polynomial:
>>> fit
array([-0.00877534, 0.05561778, -0.09494909, 0.02634183, 0.03936857])
This is going to be as good as you can get (assuming you get the equation right as #mdurant suggested), an additional intercept term is required to further improve the fit:
def func(x,beta, icpt):
return 1.0/(4.0*np.pi)*(1+beta*(3.0/2*np.cos(x)**2-1.0/2))+icpt
guesses = [20, 0]
popt,pcov = curve_fit(func,x,y,p0=guesses)
y_fit = func(x, *popt)
plt.figure(1)
plt.plot(x,y,'ro', x,y_fit,'k-')
print popt #[ 0.33748816 -0.05780343]

Finding the intersection of a curve from polyfit

This seems simple but I can't quite figure it out. I have a curve calculated from x,y data. Then I have a line. I want to find the x, y values for where the two intersect.
Here is what I've got so far. It's super confusing and doesn't give the correct result. I can look at the graph and find the intersection x value and calculate the correct y value. I'd like to remove this human step.
import numpy as np
import matplotlib.pyplot as plt
from pylab import *
from scipy import linalg
import sys
import scipy.interpolate as interpolate
import scipy.optimize as optimize
w = np.array([0.0, 11.11111111111111, 22.22222222222222, 33.333333333333336, 44.44444444444444, 55.55555555555556, 66.66666666666667, 77.77777777777777, 88.88888888888889, 100.0])
v = np.array([0.0, 8.333333333333332, 16.666666666666664, 25.0, 36.11111111111111, 47.22222222222222, 58.333333333333336, 72.22222222222221, 86.11111111111111, 100.0])
z = np.polyfit(w, v, 2)
print (z)
p=np.poly1d(z)
g = np.polyval(z,w)
print (g)
N=100
a=arange(N)
b=(w,v)
b=np.array(b)
c=(w,g)
c=np.array(c)
print(c)
d=-a+99
e=(a,d)
print (e)
p1=interpolate.PiecewisePolynomial(w,v[:,np.newaxis])
p2=interpolate.PiecewisePolynomial(w,d[:,np.newaxis])
def pdiff(x):
return p1(x)-p2(x)
xs=np.r_[w,w]
xs.sort()
x_min=xs.min()
x_max=xs.max()
x_mid=xs[:-1]+np.diff(xs)/2
roots=set()
for val in x_mid:
root,infodict,ier,mesg = optimize.fsolve(pdiff,val,full_output=True)
# ier==1 indicates a root has been found
if ier==1 and x_min<root<x_max:
roots.add(root[0])
roots=list(roots)
print(np.column_stack((roots,p1(roots),p2(roots))))
plt.plot(w,v, 'r', a, -a+99, 'b-')
plt.show()
q=input("what is the intersection value? ")
print (p(q))
Any ideas to get this to work?
Thanks
I don't think I fully understand what you are trying to do in your code, but what you described in English can be done with
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
w = np.array([0.0, 11.11111111111111, 22.22222222222222, 33.333333333333336,
44.44444444444444, 55.55555555555556, 66.66666666666667,
77.77777777777777, 88.88888888888889, 100.0])
v = np.array([0.0, 8.333333333333332, 16.666666666666664, 25.0,
36.11111111111111, 47.22222222222222, 58.333333333333336,
72.22222222222221, 86.11111111111111, 100.0])
poly_coeff = np.polynomial.polynomial.polyfit(w, v, 2)
poly = np.polynomial.polynomial.Polynomial(poly_coeff)
roots = np.polynomial.polynomial.polyroots(poly_coeff - [99, -1, 0])
x = np.linspace(np.min(roots) - 50, np.max(roots) + 50, num=1000)
plt.plot(x, poly(x), 'r-')
plt.plot(x, 99 - x, 'b-')
for root in roots:
plt.plot(root, 99 - root, 'ro')

Categories