How to guess the exact solution for Mathieu's equation from numerical eigenvalues - Python

I am trying to predict the exact solution of Mathieu's equation y'' + (lambda - 2q cos(2x))y = 0. I have been able to compute five eigenvalues numerically, and for each eigenvalue I want to find a guessed exact solution. I would be grateful if someone could help. Thank you. Below is the code for the fourth eigenvalue.
from scipy.integrate import solve_bvp
import numpy as np
import matplotlib.pyplot as plt
# Definition of Mathieu's equation
q = 5.0
def func(x, u, p):
    lambd = p[0]
    # y'' + (lambda - 2*q*cos(2x))*y = 0
    ODE = [u[1], -(lambd - 2.0*q*np.cos(2.0*x))*u[0]]
    return np.array(ODE)

# Definition of boundary conditions (BC)
def bc(ua, ub, p):
    return np.array([ua[0]-1., ua[1], ub[1]])

# A guess solution of Mathieu's equation
def guess(x):
    return np.cos(4*x-6)
Nx = 100
x = np.linspace(0, np.pi, Nx)
u = np.zeros((2,x.size))
u[0] = -x
res = solve_bvp(func, bc, x, u, p=[16], tol=1e-7)
sol = guess(x)
print(res.p[0])
x_plot = np.linspace(0, np.pi, Nx)
u_plot = res.sol(x_plot)[0]
plt.plot(x_plot, u_plot, 'r-', label='u')
plt.plot(x, sol, color = 'black', label='Guess')
plt.legend()
plt.xlabel("x")
plt.ylabel("y")
plt.title("Mathieu's Equation for Guess$= \cos(3x) \quad \lambda_4 = %g$" % res.p )
plt.grid()
plt.show()
[Plot of the fourth eigenpair]

We want to compute the first five eigenpairs (pairs of eigenvalues and eigenfunctions) of Mathieu's equation y'' + (λ − 2q cos(2x))y = 0 on the interval [0, π], with boundary conditions
y'(0) = 0 and y'(π) = 0, for q = 5.
The solution is normalized so that y(0) = 1. Although all the initial values are then known at x = 0, the problem still requires finding a value of the parameter λ for which the boundary condition y'(π) = 0 is satisfied.
A natural guess for the exact solution of Mathieu's equation is therefore cos(k*x) with k ∈ ℕ.
from scipy.integrate import solve_bvp
import numpy as np
import matplotlib.pyplot as plt
q = 5.0
# Definition of Mathieu's Equation
def func(x, u, p):
    lambd = p[0]
    # y'' + (lambda - 2*q*cos(2x))*y = 0 can be rewritten as u2' = -(lambda - 2*q*cos(2x))*u1
    ODE = [u[1], -(lambd - 2.0*q*np.cos(2.0*x))*u[0]]
    return np.array(ODE)

# Definition of boundary conditions (BC)
def bc(ua, ub, p):
    return np.array([ua[0]-1., ua[1], ub[1]])

# A guess solution of Mathieu's equation
def guess(x):
    return np.cos(5*x)  # for k = 5
Nx = 100
x = np.linspace(0, np.pi, Nx)
u = np.zeros((2,x.size))
u[0] = -x # initial guess
res = solve_bvp(func, bc, x, u, p=[20], tol=1e-9)
sol = guess(x)
print(res.p[0])
x_plot = np.linspace(0, np.pi, Nx)
u_plot = res.sol(x_plot)[0]
plt.plot(x_plot, u_plot, 'r-', label='u')
plt.plot(x, sol, linestyle='--', color='k', label='Guess')
plt.legend(loc='best')
plt.xlabel("x")
plt.ylabel("y")
plt.title("Mathieu's Equation $\lambda_5 = %g$" % res.p)
plt.grid()
plt.savefig('Eigenpair_5v1.png')
plt.show()
[Plot: Solution of Mathieu's equation]

Related

Finite Difference Solution to Heat Equation

I am practicing a finite-difference implementation and I cannot figure out why my solution looks so strange. Code taken from: http://people.bu.edu/andasari/courses/numericalpython/Week9Lecture15/PythonFiles/FTCS_DirichletBCs.py.
Note: I'm using this lecture example for the heat equation not the reaction-diffusion equation!
I haven't learned the relevant mathematics so this could be why!
My code:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import math as mth
from mpl_toolkits.mplot3d import Axes3D
import pylab as plb
import scipy as sp
import scipy.sparse as sparse
import scipy.sparse.linalg
# First start with diffusion equation with initial condition u(x, 0) = 4x - 4x^2 and u(0, t) = u(L, t) = 0
# First discretise the domain [0, L] X [0, T]
# Then discretise the derivatives
# Generate algorithm:
# 1. Compute initial condition for all i
# 2. For all n:
# 2i. Compute u_i^{n + 1} for internal space points
# 2ii. Set boundary values for i = 0 and i = N_x
M = 40 # number of grid points for space interval
N = 70 # number of grid points for time interval
x0 = 0
xL = 1 # unit grid differences
dx = (xL - x0) / (M - 1) # space step
t0 = 0
tF = 0.2
dt = (tF - t0) / (N - 1)
D = 0.3 # thermal diffusivity
a = D * dt / dx**2
# Create grid
tspan = np.linspace(t0, tF, N)
xspan = np.linspace(x0, xL, M)
# Initial matrix solution
U = np.zeros((M, N))
# Initial condition
U[:, 0] = 4*xspan - 4*xspan**2
# Boundary conditions
U[0, :] = 0
U[-1, :] = 0
# Discretised derivative formula
for k in range(0, N-1):
    for i in range(1, M-1):
        U[i, k+1] = a * U[i-1, k] + (1 - 2 * a) * U[i, k] + a * U[i+1, k]
X, T = np.meshgrid(tspan, xspan)
fig = plt.figure()
ax = fig.gca(projection='3d')
surf = ax.plot_surface(X, T, U, cmap=cm.coolwarm,
linewidth=0, antialiased=False)
ax.set_xticks([0, 0.05, 0.1, 0.15, 0.2])
ax.set_xlabel('Space')
ax.set_ylabel('Time')
ax.set_zlabel('U')
plt.tight_layout()
plt.show()
edit: Changed the thermal diffusivity value to the correct one.
The main problem is the time-step length. If you look at the update formula, the explicit scheme becomes unstable for a = D*dt/dx**2 > 0.5. Translated to your parameters, this means roughly N > 190. I get a nice picture if I increase your N to such a value.
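A quick sanity check of that bound, as a sketch using the values from the question (D = 0.3, M = 40, tF = 0.2):
import numpy as np

D = 0.3                   # thermal diffusivity from the question
M = 40                    # number of space grid points
dx = 1.0 / (M - 1)        # space step on [0, 1]
tF = 0.2                  # final time

dt_max = 0.5 * dx**2 / D  # stability limit a = D*dt/dx**2 <= 0.5 for the explicit scheme
N_min = int(np.ceil(tF / dt_max)) + 1
print(N_min)              # about 184, so the N = 200 used below is on the safe side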
However, I think somewhere the time and space axes are swapped (if you then try to interpret the graph, i.e. the boundary conditions and the expected damping of the profile over time). I cannot figure out right now why.
Edit: Actually, you swap T and X when you do meshgrid. This should work:
N = 200
...
...
T, X = np.meshgrid(tspan, xspan)
...
surf = ax.plot_surface(T, X, U, cmap=cm.coolwarm,
linewidth=0, antialiased=False)
...
ax.set_xlabel('Time')
ax.set_ylabel('Space')

Solving a complex BVP problem with scipy solve_bvp

I have a complex BVP problem (almost a Schrödinger equation). My code runs, but the solution is completely wrong or just equal to zero. What am I doing wrong?
I have also obtained the right solution with Maple, where there was no problem. I also don't understand why plotting raises no issue, even though the solution has to be a complex-valued function.
import numpy as np
from scipy.integrate import odeint
from scipy.integrate import solve_bvp as bvp
import matplotlib.pyplot as plt
mp = 938.2720813
mn = 939.5654133
mu = (mn + mp)/4
hbar = 197.327  # hbar*c in MeV*fm; assumed value, not defined in the original snippet
h2m = hbar**2/(2*mu)
V0 = 20
Rv = 1.5
Q0 = 1.5
Rq = 4.5
EIm = 0.3
ERe = 1
V = lambda x : -V0*np.exp(-x/Rv)
Q = lambda x : -Q0*np.exp(-x/Rq)
def fun(x, y):
    return np.vstack((y[1], -(Q(x)/h2m) - ((ERe + 1j*EIm)*y[0]/h2m) + V(x)*y[0]/h2m - (2/y[0])*y[1]))

def bc(ya, yb):
    return np.array([ya[0], yb[0]])
x = np.linspace(0, 1000, 10000)
y_a = np.zeros((2, x.size), dtype=complex)
# print(x.size)
i = 0
while i < x.size - 1:
    i = i + 1
    y_a[0, i] = 1000*1j
    y_a[1, i] = 1j
from scipy.integrate import solve_bvp
res_a = solve_bvp(fun, bc, x, y_a)
x_plot = np.linspace(0, 1000, 10000)
y_plot_a = res_a.sol(x_plot)[0]
import matplotlib.pyplot as plt
plt.plot(x_plot, y_plot_a, label='y_a')
plt.legend()
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Update: I fixed a mistake in the equation.
The result is still wrong. There is also another error: division by zero. How can I avoid it? Choosing x = np.linspace(0.1, 1000, 10000), for example, does not help.

emcee walkers burn in but then remain the same

I'm having an issue using emcee. It's a simple enough 3-parameter fit, but occasionally (it has only occurred in two scenarios so far despite much use) my walkers burn in just fine but then do not move (see figure below). The acceptance fraction reported is 0.
Has anyone else encountered this issue before? I have tried varying my initial conditions, the number of walkers and iterations, etc. This piece of code has been running well on very similar data sets. It's not a challenging parameter space, and it seems unlikely that the walkers would be getting "stuck".
Any ideas? I'm stumped... my walkers are lazy it seems...
Sample code below (and sample data file). This code + data file fail when I run it.
import numpy as n
import math
import pylab as py
import matplotlib.pyplot as plt
import scipy
from scipy.optimize import curve_fit
from scipy import ndimage
import pyfits
from scipy import stats
import emcee
import corner
import scipy.optimize as op
import matplotlib.pyplot as pl
from matplotlib.ticker import MaxNLocator
def sersic(x, B, r_s, m):
    return B * n.exp(-1.0 * (1.9992*m - 0.3271) * ((x/r_s)**(1.0/m) - 1.))

def lnprior(theta):
    B, r_s, m, lnf = theta
    if 0.0 < B < 500.0 and 0.5 < m < 10. and r_s > 0. and -10.0 < lnf < 1.0:
        return 0.0
    return -n.inf

def lnlike(theta, x, y, yerr):  # "least squares"
    B, r_s, m, lnf = theta
    model = sersic(x, B, r_s, m)
    inv_sigma2 = 1.0/(yerr**2 + model**2*n.exp(2*lnf))
    return -0.5*(n.sum((y-model)**2*inv_sigma2 - n.log(inv_sigma2)))

def lnprob(theta, x, y, yerr):  # kills based on priors
    lp = lnprior(theta)
    if not n.isfinite(lp):
        return -n.inf
    return lp + lnlike(theta, x, y, yerr)
profile=open("testprofile.dat",'r') #read in the data file
profilelines=profile.readlines()
x=n.empty(len(profilelines))
y=n.empty(len(profilelines))
yerr=n.empty(len(profilelines))
for i, line in enumerate(profilelines):
    col = line.split()
    x[i] = col[0]
    y[i] = col[1]
    yerr[i] = col[2]
# Find the maximum likelihood value.
chi2 = lambda *args: -2 * lnlike(*args)
result = op.minimize(chi2, [50,2.0,0.5,0.5], args=(x, y, yerr))
B_ml, rs_ml,m_ml, lnf_ml = result["x"]
print("""Maximum likelihood result:
B = {0}
r_s = {1}
m = {2}
""".format(B_ml, rs_ml,m_ml))
# Set up the sampler.
ndim, nwalkers = 4, 4000
pos = [result["x"] + 1e-4*n.random.randn(ndim) for i in range(nwalkers)]
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, args=(x, y, yerr))
# Clear and run the production chain.
print("Running MCMC...")
Niter = 2000 #2000
sampler.run_mcmc(pos, Niter, rstate0=n.random.get_state())
print("Done.")
# Print out the mean acceptance fraction.
af = sampler.acceptance_fraction
print "Mean acceptance fraction:", n.mean(af)
# Plot sampler chain
pl.clf()
fig, axes = pl.subplots(3, 1, sharex=True, figsize=(8, 9))
axes[0].plot(sampler.chain[:, :, 0].T, color="k", alpha=0.4)
axes[0].yaxis.set_major_locator(MaxNLocator(5))
axes[0].set_ylabel("$B$")
axes[1].plot(sampler.chain[:, :, 1].T, color="k", alpha=0.4)
axes[1].yaxis.set_major_locator(MaxNLocator(5))
axes[1].set_ylabel("$r_s$")
axes[2].plot(n.exp(sampler.chain[:, :, 2]).T, color="k", alpha=0.4)
axes[2].yaxis.set_major_locator(MaxNLocator(5))
axes[2].set_xlabel("step number")
fig.tight_layout(h_pad=0.0)
fig.savefig("line-time_test.png")
# plot MCMC fit
burnin = 100
samples = sampler.chain[:, burnin:, :3].reshape((-1, ndim-1))
B_mcmc, r_s_mcmc, m_mcmc = map(lambda v: (v[0]),
zip(*n.percentile(samples, [50],
axis=0)))
print("""MCMC result:
B = {0}
r_s = {1}
m = {2}
""".format(B_mcmc, r_s_mcmc, m_mcmc))
pl.close()
# Make the triangle plot.
burnin = 50
samples = sampler.chain[:, burnin:, :3].reshape((-1, ndim-1))
fig = corner.corner(samples, labels=["$B$", "$r_s$", "$m$"])
fig.savefig("line-triangle_test.png")
Here's a better result. I made the random initial samples not so close to the maximum-likelihood value and ran the chain for a lot more steps with fewer walkers/chains. Notice that I'm plotting the m parameter itself and not its exponential, as you did.
The mean acceptance fraction is ~0.48, and it took about 1 min to run on my laptop. You can of course add more steps and get an even better result.
import numpy as n
import emcee
import corner
import scipy.optimize as op
import matplotlib.pyplot as pl
from matplotlib.ticker import MaxNLocator
def sersic(x, B, r_s, m):
    return B * n.exp(
        -1.0 * (1.9992 * m - 0.3271) * ((x / r_s)**(1.0 / m) - 1.))

def lnprior(theta):
    B, r_s, m, lnf = theta
    if 0.0 < B < 500.0 and 0.5 < m < 10. and r_s > 0. and -10.0 < lnf < 1.0:
        return 0.0
    return -n.inf

def lnlike(theta, x, y, yerr):  # "least squares"
    B, r_s, m, lnf = theta
    model = sersic(x, B, r_s, m)
    inv_sigma2 = 1.0 / (yerr**2 + model**2 * n.exp(2 * lnf))
    return -0.5 * (n.sum((y - model)**2 * inv_sigma2 - n.log(inv_sigma2)))

def lnprob(theta, x, y, yerr):  # kills based on priors
    lp = lnprior(theta)
    if not n.isfinite(lp):
        return -n.inf
    return lp + lnlike(theta, x, y, yerr)
profile = open("testprofile.dat", 'r') # read in the data file
profilelines = profile.readlines()
x = n.empty(len(profilelines))
y = n.empty(len(profilelines))
yerr = n.empty(len(profilelines))
for i, line in enumerate(profilelines):
    col = line.split()
    x[i] = col[0]
    y[i] = col[1]
    yerr[i] = col[2]
# Find the maximum likelihood value.
chi2 = lambda *args: -2 * lnlike(*args)
result = op.minimize(chi2, [50, 2.0, 0.5, 0.5], args=(x, y, yerr))
B_ml, rs_ml, m_ml, lnf_ml = result["x"]
print("""Maximum likelihood result:
B = {0}
r_s = {1}
m = {2}
lnf = {3}
""".format(B_ml, rs_ml, m_ml, lnf_ml))
# Set up the sampler.
ndim, nwalkers = 4, 10
pos = [result["x"] + 1e-1 * n.random.randn(ndim) for i in range(nwalkers)]
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, args=(x, y, yerr))
# Clear and run the production chain.
print("Running MCMC...")
Niter = 50000
sampler.run_mcmc(pos, Niter, rstate0=n.random.get_state())
print("Done.")
# Print out the mean acceptance fraction.
af = sampler.acceptance_fraction
print("Mean acceptance fraction:", n.mean(af))
# Plot sampler chain
pl.clf()
fig, axes = pl.subplots(3, 1, sharex=True, figsize=(8, 9))
axes[0].plot(sampler.chain[:, :, 0].T, color="k", alpha=0.4)
axes[0].yaxis.set_major_locator(MaxNLocator(5))
axes[0].set_ylabel("$B$")
axes[1].plot(sampler.chain[:, :, 1].T, color="k", alpha=0.4)
axes[1].yaxis.set_major_locator(MaxNLocator(5))
axes[1].set_ylabel("$r_s$")
# axes[2].plot(n.exp(sampler.chain[:, :, 2]).T, color="k", alpha=0.4)
axes[2].plot(sampler.chain[:, :, 2].T, color="k", alpha=0.4)
axes[2].yaxis.set_major_locator(MaxNLocator(5))
axes[2].set_ylabel("$m$")
axes[2].set_xlabel("step number")
fig.tight_layout(h_pad=0.0)
fig.savefig("line-time_test.png")
# plot MCMC fit
burnin = 10000
samples = sampler.chain[:, burnin:, :3].reshape((-1, ndim - 1))
B_mcmc, r_s_mcmc, m_mcmc = map(
lambda v: (v[0]), zip(*n.percentile(samples, [50], axis=0)))
print("""MCMC result:
B = {0}
r_s = {1}
m = {2}
""".format(B_mcmc, r_s_mcmc, m_mcmc))
pl.close()
# Make the triangle plot.
burnin = 50
samples = sampler.chain[:, burnin:, :3].reshape((-1, ndim - 1))
fig = corner.corner(samples, labels=["$B$", "$r_s$", "$m$"])
fig.savefig("line-triangle_test.png")

Fitting Data to a Square-root or Logarithmic Function [duplicate]

I have a set of data and I want to compare which line describes it best (polynomials of different orders, exponential or logarithmic).
I use Python and Numpy and for polynomial fitting there is a function polyfit(). But I found no such functions for exponential and logarithmic fitting.
Are there any? Or how to solve it otherwise?
For fitting y = A + B log x, just fit y against (log x).
>>> x = numpy.array([1, 7, 20, 50, 79])
>>> y = numpy.array([10, 19, 30, 35, 51])
>>> numpy.polyfit(numpy.log(x), y, 1)
array([ 8.46295607, 6.61867463])
# y ≈ 8.46 log(x) + 6.62
For fitting y = Ae^(Bx), taking the logarithm of both sides gives log y = log A + Bx. So fit (log y) against x.
Note that fitting (log y) as if it were linear will emphasize small values of y, causing large deviations for large y. This is because polyfit (linear regression) works by minimizing ∑ᵢ (ΔY)² = ∑ᵢ (Yᵢ − Ŷᵢ)². When Yᵢ = log yᵢ, the residues are ΔYᵢ = Δ(log yᵢ) ≈ Δyᵢ / |yᵢ|. So even if polyfit makes a very bad decision for large y, the "divide-by-|y|" factor will compensate for it, causing polyfit to favor small values.
This could be alleviated by giving each entry a "weight" proportional to y. polyfit supports weighted-least-squares via the w keyword argument.
>>> x = numpy.array([10, 19, 30, 35, 51])
>>> y = numpy.array([1, 7, 20, 50, 79])
>>> numpy.polyfit(x, numpy.log(y), 1)
array([ 0.10502711, -0.40116352])
# y ≈ exp(-0.401) * exp(0.105 * x) = 0.670 * exp(0.105 * x)
# (^ biased towards small values)
>>> numpy.polyfit(x, numpy.log(y), 1, w=numpy.sqrt(y))
array([ 0.06009446, 1.41648096])
# y ≈ exp(1.42) * exp(0.0601 * x) = 4.12 * exp(0.0601 * x)
# (^ not so biased)
Note that Excel, LibreOffice and most scientific calculators typically use the unweighted (biased) formula for the exponential regression / trend lines. If you want your results to be compatible with these platforms, do not include the weights even if it provides better results.
Now, if you can use scipy, you could use scipy.optimize.curve_fit to fit any model without transformations.
For y = A + B log x the result is the same as the transformation method:
>>> x = numpy.array([1, 7, 20, 50, 79])
>>> y = numpy.array([10, 19, 30, 35, 51])
>>> scipy.optimize.curve_fit(lambda t,a,b: a+b*numpy.log(t), x, y)
(array([ 6.61867467, 8.46295606]),
array([[ 28.15948002, -7.89609542],
[ -7.89609542, 2.9857172 ]]))
# y ≈ 6.62 + 8.46 log(x)
For y = Ae^(Bx), however, we can get a better fit since curve_fit computes Δy directly, without the log transform. But we need to provide an initial guess so curve_fit can reach the desired local minimum.
>>> x = numpy.array([10, 19, 30, 35, 51])
>>> y = numpy.array([1, 7, 20, 50, 79])
>>> scipy.optimize.curve_fit(lambda t,a,b: a*numpy.exp(b*t), x, y)
(array([ 5.60728326e-21, 9.99993501e-01]),
array([[ 4.14809412e-27, -1.45078961e-08],
[ -1.45078961e-08, 5.07411462e+10]]))
# oops, definitely wrong.
>>> scipy.optimize.curve_fit(lambda t,a,b: a*numpy.exp(b*t), x, y, p0=(4, 0.1))
(array([ 4.88003249, 0.05531256]),
array([[ 1.01261314e+01, -4.31940132e-02],
[ -4.31940132e-02, 1.91188656e-04]]))
# y ≈ 4.88 exp(0.0553 x). much better.
You can also fit a set of data to whatever function you like using curve_fit from scipy.optimize. For example, if you want to fit an exponential function (from the documentation):
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
x = np.linspace(0,4,50)
y = func(x, 2.5, 1.3, 0.5)
yn = y + 0.2*np.random.normal(size=len(x))
popt, pcov = curve_fit(func, x, yn)
And then if you want to plot, you could do:
plt.figure()
plt.plot(x, yn, 'ko', label="Original Noised Data")
plt.plot(x, func(x, *popt), 'r-', label="Fitted Curve")
plt.legend()
plt.show()
(Note: the * in front of popt when you plot will expand out the terms into the a, b, and c that func is expecting.)
I was having some trouble with this, so let me be very explicit so that noobs like me can understand.
Let's say that we have a data file or something like that:
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
import sympy as sym
"""
Generate some data, let's imagine that you already have this.
"""
x = np.linspace(0, 3, 50)
y = np.exp(x)
"""
Plot your data
"""
plt.plot(x, y, 'ro',label="Original Data")
"""
brutal force to avoid errors
"""
x = np.array(x, dtype=float) #transform your data in a numpy array of floats
y = np.array(y, dtype=float) #so the curve_fit can work
"""
create a function to fit with your data. a, b, c and d are the coefficients
that curve_fit will calculate for you.
In this part you need to guess and/or use mathematical knowledge to find
a function that resembles your data
"""
def func(x, a, b, c, d):
    return a*x**3 + b*x**2 + c*x + d
"""
make the curve_fit
"""
popt, pcov = curve_fit(func, x, y)
"""
The result is:
popt[0] = a , popt[1] = b, popt[2] = c and popt[3] = d of the function,
so f(x) = popt[0]*x**3 + popt[1]*x**2 + popt[2]*x + popt[3].
"""
print "a = %s , b = %s, c = %s, d = %s" % (popt[0], popt[1], popt[2], popt[3])
"""
Use sympy to generate the LaTeX syntax of the function
"""
xs = sym.Symbol(r'\lambda')
tex = sym.latex(func(xs, *popt)).replace('$', '')
plt.title(r'$f(\lambda)= %s$' % (tex), fontsize=16)
"""
Print the coefficients and plot the funcion.
"""
plt.plot(x, func(x, *popt), label="Fitted Curve") #same as line above \/
#plt.plot(x, popt[0]*x**3 + popt[1]*x**2 + popt[2]*x + popt[3], label="Fitted Curve")
plt.legend(loc='upper left')
plt.show()
the result is:
a = 0.849195983017 , b = -1.18101681765, c = 2.24061176543, d = 0.816643894816
Here's a linearization option on simple data that uses tools from scikit-learn.
Given
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import FunctionTransformer
np.random.seed(123)
# General Functions
def func_exp(x, a, b, c):
    """Return values from a general exponential function."""
    return a * np.exp(b * x) + c

def func_log(x, a, b, c):
    """Return values from a general log function."""
    return a * np.log(b * x) + c

# Helper
def generate_data(func, *args, jitter=0):
    """Return a tuple of arrays with random data along a general function."""
    xs = np.linspace(1, 5, 50)
    ys = func(xs, *args)
    noise = jitter * np.random.normal(size=len(xs)) + jitter
    xs = xs.reshape(-1, 1)  # xs[:, np.newaxis]
    ys = (ys + noise).reshape(-1, 1)
    return xs, ys
transformer = FunctionTransformer(np.log, validate=True)
Code
Fit exponential data
# Data
x_samp, y_samp = generate_data(func_exp, 2.5, 1.2, 0.7, jitter=3)
y_trans = transformer.fit_transform(y_samp) # 1
# Regression
regressor = LinearRegression()
results = regressor.fit(x_samp, y_trans) # 2
model = results.predict
y_fit = model(x_samp)
# Visualization
plt.scatter(x_samp, y_samp)
plt.plot(x_samp, np.exp(y_fit), "k--", label="Fit") # 3
plt.title("Exponential Fit")
Fit log data
# Data
x_samp, y_samp = generate_data(func_log, 2.5, 1.2, 0.7, jitter=0.15)
x_trans = transformer.fit_transform(x_samp) # 1
# Regression
regressor = LinearRegression()
results = regressor.fit(x_trans, y_samp) # 2
model = results.predict
y_fit = model(x_trans)
# Visualization
plt.scatter(x_samp, y_samp)
plt.plot(x_samp, y_fit, "k--", label="Fit") # 3
plt.title("Logarithmic Fit")
Details
General Steps
Apply a log operation to data values (x, y or both)
Regress the data to a linearized model
Plot by "reversing" any log operations (with np.exp()) and fit to original data
Assuming our data follows an exponential trend, a general equation+ may be y = A * exp(B*x) + C.
We can linearize the latter equation (into the form y = intercept + slope * x) by taking the log: ln(y) = ln(A) + B*x (assuming C = 0).
Given a linearized equation++ and the regression parameters, we could calculate:
A via the intercept (ln(A))
B via the slope (B)
Summary of Linearization Techniques
Relationship | Example | General Eqn. | Altered Var. | Linearized Eqn.
-------------|------------|----------------------|----------------|------------------------------------------
Linear | x | y = B * x + C | - | y = C + B * x
Logarithmic | log(x) | y = A * log(B*x) + C | log(x) | y = C + A * (log(B) + log(x))
Exponential | 2**x, e**x | y = A * exp(B*x) + C | log(y) | log(y-C) = log(A) + B * x
Power | x**2 | y = B * x**N + C | log(x), log(y) | log(y-C) = log(B) + N * log(x)
+Note: linearizing exponential functions works best when the noise is small and C=0. Use with caution.
++Note: while altering y data helps linearize exponential data, altering x data helps linearize log data.
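For instance, the Power row of the table above can be handled the same way with a log-log regression; a minimal sketch on made-up data, assuming C = 0:
import numpy as np

# Made-up power-law data: y = B * x**N with B = 3, N = 1.8 (C assumed to be 0)
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = 3.0 * x**1.8

# Linearize: log(y) = log(B) + N * log(x), then fit a straight line
N_fit, logB = np.polyfit(np.log(x), np.log(y), 1)
B_fit = np.exp(logB)
print(B_fit, N_fit)  # recovers roughly 3.0 and 1.8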
Well I guess you can always use:
np.log --> natural log
np.log10 --> base 10
np.log2 --> base 2
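For example, fitting y = A + B*log10(x) works the same way as the natural-log fit shown earlier; a minimal sketch reusing that answer's sample data:
import numpy as np

x = np.array([1, 7, 20, 50, 79])
y = np.array([10, 19, 30, 35, 51])

# Regress y against log10(x); polyfit returns [slope, intercept]
B, A = np.polyfit(np.log10(x), y, 1)
print(A, B)  # y ≈ A + B*log10(x)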
Slightly modifying IanVS's answer:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    # return a * np.exp(-b * x) + c
    return a * np.log(b * x) + c
x = np.linspace(1,5,50) # changed boundary conditions to avoid division by 0
y = func(x, 2.5, 1.3, 0.5)
yn = y + 0.2*np.random.normal(size=len(x))
popt, pcov = curve_fit(func, x, yn)
plt.figure()
plt.plot(x, yn, 'ko', label="Original Noised Data")
plt.plot(x, func(x, *popt), 'r-', label="Fitted Curve")
plt.legend()
plt.show()
This results in the following graph:
We demonstrate features of lmfit while solving both problems.
Given
import lmfit
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
np.random.seed(123)
# General Functions
def func_log(x, a, b, c):
    """Return values from a general log function."""
    return a * np.log(b * x) + c
# Data
x_samp = np.linspace(1, 5, 50)
_noise = np.random.normal(size=len(x_samp), scale=0.06)
y_samp = 2.5 * np.exp(1.2 * x_samp) + 0.7 + _noise
y_samp2 = 2.5 * np.log(1.2 * x_samp) + 0.7 + _noise
Code
Approach 1 - lmfit Model
Fit exponential data
regressor = lmfit.models.ExponentialModel() # 1
initial_guess = dict(amplitude=1, decay=-1) # 2
results = regressor.fit(y_samp, x=x_samp, **initial_guess)
y_fit = results.best_fit
plt.plot(x_samp, y_samp, "o", label="Data")
plt.plot(x_samp, y_fit, "k--", label="Fit")
plt.legend()
Approach 2 - Custom Model
Fit log data
regressor = lmfit.Model(func_log) # 1
initial_guess = dict(a=1, b=.1, c=.1) # 2
results = regressor.fit(y_samp2, x=x_samp, **initial_guess)
y_fit = results.best_fit
plt.plot(x_samp, y_samp2, "o", label="Data")
plt.plot(x_samp, y_fit, "k--", label="Fit")
plt.legend()
Details
Choose a regression class
Supply named, initial guesses that respect the function's domain
You can determine the inferred parameters from the regressor object. Example:
regressor.param_names
# ['decay', 'amplitude']
To make predictions, use the ModelResult.eval() method.
model = results.eval
y_pred = model(x=np.array([1.5]))
Note: the ExponentialModel() follows a decay function, which accepts two parameters, one of which is negative.
See also ExponentialGaussianModel(), which accepts more parameters.
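If you also want the A and B of y = A*exp(B*x), they can be recovered from the fitted parameters; a minimal sketch continuing the exponential example from Approach 1, assuming lmfit's usual amplitude*exp(-x/decay) parameterization:
# Hypothetical conversion from ExponentialModel's (amplitude, decay) to y = A*exp(B*x);
# assumes the model form amplitude * exp(-x / decay), and `results` from the exponential fit above
A = results.params["amplitude"].value
B = -1.0 / results.params["decay"].value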
Install the library via > pip install lmfit.
Wolfram has a closed-form solution for fitting an exponential. They also have similar solutions for fitting logarithmic and power laws.
I found this to work better than SciPy's curve_fit, especially when you don't have data "near zero". Here is an example:
import numpy as np
import matplotlib.pyplot as plt
# Fit the function y = A * exp(B * x) to the data
# returns (A, B)
# From: https://mathworld.wolfram.com/LeastSquaresFittingExponential.html
def fit_exp(xs, ys):
    S_x2_y = 0.0
    S_y_lny = 0.0
    S_x_y = 0.0
    S_x_y_lny = 0.0
    S_y = 0.0
    for (x, y) in zip(xs, ys):
        S_x2_y += x * x * y
        S_y_lny += y * np.log(y)
        S_x_y += x * y
        S_x_y_lny += x * y * np.log(y)
        S_y += y
    a = (S_x2_y * S_y_lny - S_x_y * S_x_y_lny) / (S_y * S_x2_y - S_x_y * S_x_y)
    b = (S_y * S_x_y_lny - S_x_y * S_y_lny) / (S_y * S_x2_y - S_x_y * S_x_y)
    return (np.exp(a), b)
xs = [33, 34, 35, 36, 37, 38, 39, 40, 41, 42]
ys = [3187, 3545, 4045, 4447, 4872, 5660, 5983, 6254, 6681, 7206]
(A, B) = fit_exp(xs, ys)
plt.figure()
plt.plot(xs, ys, 'o-', label='Raw Data')
plt.plot(xs, [A * np.exp(B *x) for x in xs], 'o-', label='Fit')
plt.title('Exponential Fit Test')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(loc='best')
plt.tight_layout()
plt.show()

How to make my pylab.poly1d(fit) pass through zero?

My code below produces a polynomial fit of the points in my graph, but I want this fit to always pass through zero. How do I do this?
import pylab as pl
import numpy as np
y=(abs((UX2-UY2)+(2*UXY)))
a=np.mean(y)
y=y-a
x=(abs((X2-Y2)+(2*XY)))
b=np.mean(x)
x=x-b
ax=pl.subplot(1,4,4) #plot XY
fit=pl.polyfit(x,y,1)
fit_fn = pl.poly1d(fit)
slope4 = fit[0]
print(slope4)
x_min=-2
x_max=5
n=10000
x_fit = pl.linspace(x_min, x_max, n)
y_fit = fit_fn(x_fit)
q=z=[-2,5]
scat=pl.plot(x,y, 'o', x_fit,y_fit, '-r', z, q, 'g' )
When you fit an n-degree polynomial p(x) = a0 + a1*x + a2*x**2 + ... + an*x**n to a set of data points (x0, y0), (x1, y1), ..., (xm, ym), a call to np.linalg.lstsq is made with a coefficient matrix that looks like:
[1 x0 x0**2 ... x0**n]
[1 x1 x1**2 ... x1**n]
...
[1 xm xm**2 ... xm**n]
If you remove the j-th column from that matrix, you are effectively setting that coefficient in the polynomial to 0. So to get rid of the a0 coefficient you could do the following:
import numpy as np
import matplotlib.pyplot as plt

def fit_poly_through_origin(x, y, n=1):
    # Vandermonde-style matrix without the constant column, so the constant term is forced to 0
    a = x[:, np.newaxis] ** np.arange(1, n+1)
    coeff = np.linalg.lstsq(a, y, rcond=None)[0]
    return np.concatenate(([0], coeff))
n = 1000
x = np.random.rand(n)
y = 1 + 3*x - 4*x**2 + np.random.rand(n)*0.25
c0 = np.polynomial.polynomial.polyfit(x, y, 2)
c1 = fit_poly_through_origin(x, y, 2)
p0 = np.polynomial.Polynomial(c0)
p1 = np.polynomial.Polynomial(c1)
plt.plot(x, y, 'kx')
xx = np.linspace(0, 1, 1000)
plt.plot(xx, p0(xx), 'r-', )
plt.plot(xx, p1(xx), 'b-', )
As was mentioned, you can't really do it explicitly with polyfit (but you can write your own function).
However, if you still want to use polyfit(), you can try this math hack: add a point at zero, and then use the w flag (weights) in polyfit() to give it a much higher weight while all other points get a low weight. This has the effect of forcing the polynomial to pass through zero, or very close to it.
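A minimal sketch of that weighting trick on made-up data (the weight value is arbitrary; the larger it is, the closer the fit passes to the origin):
import numpy as np

# Made-up data that does not naturally pass through the origin
x = np.linspace(0, 5, 50)
y = 2.0 * x + 0.5 + 0.1 * np.random.randn(50)

# Append the point (0, 0) and give it a much larger weight than the real data
x_aug = np.append(x, 0.0)
y_aug = np.append(y, 0.0)
w = np.append(np.ones_like(x), 1e6)

coeffs = np.polyfit(x_aug, y_aug, 1, w=w)
print(coeffs)  # the intercept (last coefficient) should be very close to 0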
