I'm trying to find the maximum likelihood estimates of mu and sigma of a normal distribution using the minimize function from scipy. However, the minimization returns the expected value of the mean, but the estimate of sigma is far from the real sigma.
I define a function llnorm that returns the negative log-likelihood of the normal distribution, create a random sample from a normal distribution with mean 150 and standard deviation 10, and then try to find the MLE using optimize.
import numpy as np
import math
import scipy.optimize as optimize
def llnorm(par, data):
n = len(data)
mu, sigma = par
ll = -np.sum(-n/2 * math.log(2*math.pi*(sigma**2)) - ((data-mu)**2)/(2 * (sigma**2)))
return ll
data = 10 * np.random.randn(100) + 150
result = optimize.minimize(llnorm, [150,10], args = (data))
Even though the mean of the data is close to 150 and the std is close to 10, the optimization returns a much smaller value for the estimated sigma (close to 0).
Your math is slightly off:
ll = n*math.log(2*math.pi*(sigma**2))/2 + np.sum(((data-mu)**2)/(2 * (sigma**2)))
or
ll = np.sum(math.log(2*math.pi*(sigma**2))/2 + ((data-mu)**2)/(2 * (sigma**2)))
First I cancelled the minus signs (not a problem), but above all: either you keep the constant term inside the sum and don't multiply it by n, or you take it out of the sum and multiply it by n, but not both at the same time.
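As a minimal sketch (keeping the rest of the original setup unchanged), plugging the second form into llnorm gives estimates close to the true parameters:
import numpy as np
import scipy.optimize as optimize

def llnorm(par, data):
    mu, sigma = par
    # constant term stays inside the sum, so it is not multiplied by n a second time
    return np.sum(np.log(2 * np.pi * sigma**2) / 2 + (data - mu)**2 / (2 * sigma**2))

data = 10 * np.random.randn(100) + 150
result = optimize.minimize(llnorm, [150, 10], args=(data,))
print(result.x)  # both entries should be close to 150 and 10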
np.random.randn draws samples from a standard Gaussian distribution with variance 1 (docs here). Since you aim to have a distribution with std of 10, you need to multiply by 10 * 10 instead
import numpy as np
import math
import scipy.optimize as optimize
def llnorm(par, data):
n = len(data)
mu, sigma = par
ll = -np.sum(-n/2 * math.log(2*math.pi*(sigma**2)) - ((data-mu)**2)/(2 * (sigma**2)))
return ll
data = 10 * 10 * np.random.randn(100) + 150
result = optimize.minimize(llnorm, [150,10], args = (data))
print(result)
This gives me:
fun: 36328.17002555693
hess_inv: array([[ 0.96235834, -0.32116447],
[-0.32116447, 0.10879383]])
jac: array([0., 0.])
message: 'Optimization terminated successfully.'
nfev: 44
nit: 8
njev: 11
status: 0
success: True
x: array([166.27014352, 9.15113937])
EDIT: it seems the output of ~9 is purely coincidental; something else needs to be investigated.
I'm trying to compute the following definite integral:
The integral I want to compute (the same one implemented in F_quad below) is
F(q) = (4 * pi * hbar / (Z_t * e * q)) * integral from 0 to infinity of r * rho_ch(r) * sin(q * r / hbar) dr
where rho_ch is
rho_ch(r) = rho_0 / (1 + exp((r - a) / b))
and
a = 3.66 * 10^(-15) m
b = 0.54 * 10^(-15) m
and rho_0 = 1.23 * 10^(-35) C/m^3.
When I compute it, I get something of the order of 10^(-61), but I believe it should be something closer to 1.
I take the upper limit of the integral to be 10^(-10) because the integral should have converged by then. My guess is that the problem has to do with overflow in the rho_ch function; I tried a fix for that, but it didn't work.
Does anyone have any idea of what I'm doing wrong?
import numpy as np
from scipy.constants import *
from scipy import integrate
Z_t = 20
a = 1.07*40**(1/3)
b = 0.54
rho_0 = 0.077*e
X_0 = np.array([rho_0, a, b])*10**(-15)
E = 250*10**(6)*e
p_sq = (E/c)**2-(m_e*c)**2
beta_sq = 1-(1/(E/(m_e*c**2))**2)
theta = np.pi/6
def q(theta):
return 2*np.sqrt(p_sq*np.sin(theta/2)**2)
# integrand: r * rho_ch(r) * sin(q*r/hbar), cut off where the exponential would overflow
integrand2 = lambda r, theta, X: r*X[0]/(1+np.exp((r-X[1])/X[2]))*np.sin(q(theta)*r/hbar) if (r-X[1])/X[2] <= 600 else 0
def F_quad(theta, X):
    return 4*np.pi*hbar/(Z_t*e*q(theta))*integrate.quad(integrand2, 0, 10**(-10), args=(theta, X))[0]
I have the mean and the SD of a log-normal distribution. However, in order to sample from a log-normal distribution in Python, I need to convert these values into the mean and SD of the underlying normal distribution.
from numpy.random import seed
from numpy.random import normal
import numpy as np
import pandas as pd
import seaborn as sns
mu = 25.2
sigma = 10.5
#pd.reset_option('display.float_format')
r = []
r = np.random.lognormal(mu, sigma, 1000)
for i in range(1000):
while r[i] > 64 or r[i] < 4:
y = np.random.lognormal(mu, sigma, 1)
r[i] = y[0]
df = pd.DataFrame(r, columns = ['Column_A'])
print(df)
sns.set_style("whitegrid", {'axes.grid' : False})
sns.set(rc={"figure.figsize": (8, 4)})
sns.distplot(df['Column_A'], bins = 70)
This is what I get
And this is what I want
However, I don't know how to convert these values.
If I understand your post correctly, you want to recover the underlying (mu, sigma^2) parametrization of the normal distribution that produced your log-normal observations?
TL;DR
Assuming your log-normal observations are stored in r:
mu = np.log(np.median(r))
var = 2*(np.log(np.mean(r)) - np.log(np.median(r)))
sd = np.sqrt(var)
Theoretical part
Start by reading ref for some statistics about the log-normal distribution. It appears it's quite hard to retrieve (mu, sigma^2) from the empirical mean and variance of a log-normal sample ...
Let X be a log-normal random variable and let Y = ln(X). Then Y follows a normal distribution with parameters (mu, sigma^2). Let M and S be the mean and variance of X. It turns out that:
M = exp(mu + sigma^2/2)
S = (exp(sigma^2) - 1) * exp(2*mu + sigma^2)
This does not lead to a simple expression for (mu, sigma^2).
However, according to ref, inverting the (M, S) system becomes easier if you replace the variance S by either the median Med or the mode Mode, since these have much simpler expressions in terms of (mu, sigma^2):
Med = exp(mu)
Mode = exp(mu - sigma^2)
The empirical median is easy to compute with NumPy, so let's use it in our computations. The inverted system then leads to the following estimators for (mu, sigma^2):
mu = log(Med)
sigma2 = 2*(log(M) - log(Med))
Pythonic part
Supposing your log-normal observations are stored in your r array:
mu = np.log(np.median(r))
var = 2*(np.log(np.mean(r)) - np.log(np.median(r)))
sd = np.sqrt(var)
And a quick check shows this is likely to be right:
# random log-normal sample with (mu, sigma)=(1, 2)
r = np.random.lognormal(1, 2, size=(1000000))
# estimators
mu = np.log(np.median(r))
var = 2*(np.log(np.mean(r)) - np.log(np.median(r)))
sd = np.sqrt(var)
$> mu = 1.001368782773
$> sigma = 2.0024723139
I want to find the values of board_trim and lm that will give me the lowest (closest to 0) value for Board_Moments.
For this I use scipy.optimize.minimize, but it does not converge. I really can't figure it out.
with parameters:
from math import cos, radians
from scipy.optimize import minimize

def d2r(degrees):
    # degrees-to-radians helper used in Board_Moments below
    return radians(degrees)

displacement = 70
b = 6.5
deadrise = 20
LCG = 10
Vs_ms = 23.15 #m/s
rho = 1025
mu = 1.19e-6
def Board_Moments(params):
board_trim, lm = params
displacement_N = displacement * 9.81 #kN
lp = Lp(Vs_ms, b, lm)
N = displacement_N * cos(d2r(board_trim)) #Drag Forces Perpendicular to the keel
#Taking moments about transom at height of CG
deltaM = (displacement_N * LCG) - (N * lp) #equilibrium condition
return deltaM
where lp:
def Lp(Vs_ms, b, lm):
cv = Cv(Vs_ms, b)
Lambda = Lambda_(lm, b)
Cp = 0.75 - (1 / (5.21 * (cv / Lambda)**2 + 2.39))
lp = Cp * lm
return lp
and
def Cv(Vs_ms, b):
cv = Vs_ms / (9.81 * b)**0.5
return cv
and
def Lambda_(lm, b):
lambda_ = lm / b
return lambda_
the optimization is done with:
board_trim = 2 #initial estimate
lm = 17.754 #initial estimate
x0 = [board_trim, lm]
Deltam = minimize(Board_Moments, x0, method = 'Nelder-Mead')
print(Deltam)
The output I get:
final_simplex: (array([[ 1.36119237e+01, 3.45635965e+23],
[-1.36046725e+01, 3.08439110e+23],
[ 2.07268577e+01, 2.59841956e+23]]), array([-7.64916992e+25,
-6.82618616e+25, -5.53373709e+25]))
fun: -7.649169916342451e+25
message: 'Maximum number of function evaluations has been exceeded.'
nfev: 401
nit: 220
status: 1
success: False
x: array([1.36119237e+01, 3.45635965e+23])
Any help would be much appreciated, thanks
You mention
that will give me the lowest (closest to 0) value for Board_Moments.
But minimize searches for the absolute minimum. If you print the intermediate values of deltaM (which you should have done to debug the problem), you'll find they just get smaller and smaller, below zero (-10, -100, -500, etc.; that kind of progression).
To get as close to zero as possible, the solution is simply: return the absolute value of deltaM from Board_Moments:
def Board_Moments(params):
# code as before ...
deltaM = (displacement_N * LCG) - (N * lp) #equilibrium condition
# This print function would have shown the problem immediately
#print(deltaM)
# Use absolute (the built-in `abs` or `np.abs`;
# doesn't really matter for a single value)
# to get close to zero
return np.abs(deltaM)
For this particular case and fix, the result I get is:
final_simplex: (array([[ 2.32386388, 15.3390523 ],
[ 2.32394414, 15.33905343],
[ 2.32390145, 15.33905283]]), array([5.33445927e-08, 7.27723091e-08, 1.09428584e-07]))
fun: 5.334459274308756e-08
message: 'Optimization terminated successfully.'
nfev: 107
nit: 59
status: 0
success: True
x: array([ 2.32386388, 15.3390523 ])
(and if you comment out the print function, you'll see it easily converge towards zero.)
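For reference, here is a minimal end-to-end sketch of the fix assembled from the snippets above; it assumes d2r is a plain degrees-to-radians helper, which the original post does not show:
import numpy as np
from math import cos, radians
from scipy.optimize import minimize

displacement, b, LCG, Vs_ms = 70, 6.5, 10, 23.15

def d2r(deg):
    # assumed degrees-to-radians helper
    return radians(deg)

def Lp(Vs_ms, b, lm):
    cv = Vs_ms / (9.81 * b)**0.5      # Cv
    Lambda = lm / b                   # Lambda_
    Cp = 0.75 - (1 / (5.21 * (cv / Lambda)**2 + 2.39))
    return Cp * lm

def Board_Moments(params):
    board_trim, lm = params
    displacement_N = displacement * 9.81
    lp = Lp(Vs_ms, b, lm)
    N = displacement_N * cos(d2r(board_trim))
    deltaM = (displacement_N * LCG) - (N * lp)
    return np.abs(deltaM)  # minimize the magnitude so the moment gets as close to 0 as possible

Deltam = minimize(Board_Moments, [2, 17.754], method='Nelder-Mead')
print(Deltam.x)  # roughly [2.32, 15.34], as in the result above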
I have a question about fitting a step function using scipy routines like curve_fit. I have trouble making it vectorized, for example:
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
xobs=np.linspace(0,10,100)
yl=np.random.rand(50); yr=np.random.rand(50)+100
yobs=np.concatenate((yl,yr),axis=0)
def model(x,rf,T1,T2):
#1: x=np.vectorize(x)
if x<rf:
ret= T1
else:
ret= T2
return ret
#2: model=np.vectorize(model)
popt, pcov = curve_fit(model, xobs, yobs, [40.,0.,100.])
It says
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
If I add #1 or #2 it runs but doesn't really fit the data:
OptimizeWarning: Covariance of the parameters could not be estimated
[ 40. 50.51182064 50.51182064] [[ inf inf inf]
[ inf inf inf]
[ inf inf inf]]
Does anybody know how to fix that? Thanks!
Here's what I did. I retained xobs and yobs:
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
xobs=np.linspace(0,10,100)
yl=np.random.rand(50); yr=np.random.rand(50)+100
yobs=np.concatenate((yl,yr),axis=0)
Now, the Heaviside function must be generated. To give you an overview, consider the half-maximum convention of the Heaviside function: H(x) = 0 for x < 0, 1/2 for x = 0, and 1 for x > 0.
In Python, this is equivalent to: def f(x): return 0.5 * (np.sign(x) + 1)
A sample plot would be:
xval = sorted(np.concatenate([np.linspace(-5,5,100),[0]])) # includes x = 0
yval = f(xval)
plt.plot(xval,yval,'ko-')
plt.ylim(-0.1,1.1)
plt.xlabel('x',size=18)
plt.ylabel('H(x)',size=20)
Now, plotting xobs and yobs gives:
plt.plot(xobs,yobs,'ko-')
plt.ylim(-10,110)
plt.xlabel('xobs',size=18)
plt.ylabel('yobs',size=20)
Notice that, comparing the two figures, the second plot is shifted by 5 units and the maximum increases from 1.0 to 100. I infer that the function for the second plot can be represented as 100 * H(x - 5),
or in Python: 0.5 * (np.sign(x-5) + 1) * 100 = 50 * (np.sign(x-5) + 1)
Combining the plots yields (where Fit represents the above fitting function)
The plot confirms that my guess is correct. Now, assuming that YOU DO NOT KNOW how this correct fitting function came about, a generalized fitting function is created: def f(x,a,b,c): return a * (np.sign(x-b) + c), where, theoretically, a = 50, b = 5, and c = 1.
Proceed to estimation:
popt,pcov=curve_fit(f,xobs,yobs,bounds=([49,4.75,0],[50,5,2])).
Now, bounds = ([lower bound of each parameter (a,b,c)],[upper bound of each parameter]). Technically, this means that 49 < a < 50, 4.75 < b < 5, and 0 < c < 2.
Here are MY results for popt and pcov:
pcov represents the estimated covariance of popt. The diagonals provide the variance of the parameter estimate [Source].
Results show that the parameter estimates in popt are near the theoretical values.
Basically, a generalized Heaviside function can be represented by: a * (np.sign(x-b) + c)
Here is the code that will generate parameter estimates and the corresponding covariances:
import numpy as np
from scipy.optimize import curve_fit
xobs = np.linspace(0,10,100)
yl = np.random.rand(50); yr=np.random.rand(50)+100
yobs = np.concatenate((yl,yr),axis=0)
def f(x,a,b,c): return a * (np.sign(x-b) + c) # Heaviside fitting function
popt, pcov = curve_fit(f,xobs,yobs,bounds=([49,4.75,0],[50,5,2]))
print('popt = %s' % popt)
print('pcov = \n %s' % pcov)
Finally, note that the estimates of popt and pcov vary.
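If you also want to reproduce the combined data-plus-fit plot described above, here is a minimal sketch; the plotting calls are my addition, while the fit itself is the same as in the code above:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

xobs = np.linspace(0, 10, 100)
yobs = np.concatenate((np.random.rand(50), np.random.rand(50) + 100))

def f(x, a, b, c):
    return a * (np.sign(x - b) + c)  # generalized Heaviside fitting function

popt, pcov = curve_fit(f, xobs, yobs, bounds=([49, 4.75, 0], [50, 5, 2]))

plt.plot(xobs, yobs, 'ko', label='data')
plt.plot(xobs, f(xobs, *popt), 'r-', label='Heaviside fit')
plt.legend()
plt.show()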
This question is pretty old, but in case it is useful to other people: the Heaviside function is not differentiable at the step, and this causes issues in the minimization; fitting a Heaviside function always fails in my case. In such cases, I fit a logistic function instead, as shown below.
import numpy as np
import scipy.special
import scipy.optimize as optim
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 101)
y = np.heaviside((x - 5), 0.)

def sigmoid(x, x0, b):
    return scipy.special.expit((x - x0) * b)  # smooth step: expit(z) = 1/(1+exp(-z))

args, cov = optim.curve_fit(sigmoid, x, y)
plt.scatter(x, y)
plt.plot(x, sigmoid(x, *args))
print(args)
>
[ 5.05006427 532.21427701]
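Along the same lines, here is a hedged sketch of the same idea applied to noisy step data like xobs/yobs from the question; the extra amplitude and offset parameters (a and c here) are my additions, since expit itself only ranges from 0 to 1:
import numpy as np
import scipy.special
import scipy.optimize as optim

xobs = np.linspace(0, 10, 100)
yobs = np.concatenate((np.random.rand(50), np.random.rand(50) + 100))

def scaled_sigmoid(x, x0, b, a, c):
    # a scales the step height, c shifts the baseline (both are assumptions of this sketch)
    return a * scipy.special.expit((x - x0) * b) + c

p0 = [5., 1., 100., 0.]  # rough initial guesses: step near x = 5, height about 100
popt, pcov = optim.curve_fit(scaled_sigmoid, xobs, yobs, p0=p0)
print(popt)  # x0 should come out near 5 and a near 100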
What do I have to use to figure out the inverse probability density function for a normal distribution? I'm using scipy to find the normal distribution's probability density function:
from scipy.stats import norm
norm.pdf(1000, loc=1040, scale=210)
0.0018655737107410499
How can I work backwards from the density 0.0018 to the value 1000 in the given normal distribution?
There can be no 1:1 mapping from probability density to quantile.
Because the log of the normal PDF is quadratic in x, there can be either two, one, or zero quantiles that have a particular probability density.
Update
It's actually not that hard to find the roots analytically. The PDF of a normal distribution is given by:
pd = exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))
With a bit of rearrangement we get:
(x - mu)**2 = -2 * sigma**2 * log( pd * sigma * sqrt(2 * pi))
If the discriminant on the RHS is < 0, there are no real roots. If it equals zero, there is a single root (where x = mu), and where it is > 0 there are two roots.
To put it all together into a function:
import numpy as np
def get_quantiles(pd, mu, sigma):
discrim = -2 * sigma**2 * np.log(pd * sigma * np.sqrt(2 * np.pi))
# no real roots
if discrim < 0:
return None
# one root, where x == mu
elif discrim == 0:
return mu
# two roots
else:
return mu - np.sqrt(discrim), mu + np.sqrt(discrim)
This gives the desired quantile(s), to within rounding error:
from scipy.stats import norm
pd = norm.pdf(1000, loc=1040, scale=210)
print(get_quantiles(pd, 1040, 210))
# (1000.0000000000001, 1079.9999999999998)
import scipy.stats as stats
import scipy.optimize as optimize
norm = stats.norm(loc=1040, scale=210)
y = norm.pdf(1000)
print(y)
# 0.00186557371074
print(optimize.fsolve(lambda x:norm.pdf(x)-y, norm.mean()-norm.std()))
# [ 1000.]
print(optimize.fsolve(lambda x:norm.pdf(x)-y, norm.mean()+norm.std()))
# [ 1080.]
There exist distributions whose density attains a given value an infinite number of times. (For example, the density that equals 1 on an infinite sequence of intervals of lengths 1/2, 1/4, 1/8, etc., and 0 elsewhere, attains the value 1 an infinite number of times. It is a valid density since 1/2 + 1/4 + 1/8 + ... = 1.)
So the use of fsolve above is not guaranteed to find all values of x where pdf(x) equals a certain value, but it may help you find some root.