Nelder-Mead, finding error on fitted parameters - Python

I've started using the minimizer in scipy.optimize, and for most parameters I've tried to fit, the default BFGS method has worked just fine. The method helpfully reports the inverse of the Hessian matrix, from which I can extract the errors on the fitted parameters from the diagonal of the matrix. However, for some new parameters I'm trying to fit, the values are quite small and I run into precision errors using BFGS.
Switching to Nelder-Mead does the job, but I don't know how to extract the uncertainties on the fitted parameters with this method.
How can I extract the uncertainties on the fitted parameters when using the Nelder-Mead method in scipy.optimize.minimize?
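For context, a minimal sketch of the BFGS workflow described above; the objective function and data here are invented purely for illustration:

import numpy as np
from scipy.optimize import minimize

# Toy objective: a sum-of-squares cost for a made-up exponential model.
def cost(p, x, y):
    return np.sum((p[0] * np.exp(-p[1] * x) - y) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(-3.0 * x) + rng.normal(scale=0.01, size=x.size)

res = minimize(cost, x0=[1.0, 1.0], args=(x, y), method="BFGS")
# res.hess_inv approximates the inverse Hessian at the minimum; its diagonal
# is what the question uses as a (rough, scale-dependent) error estimate.
errors = np.sqrt(np.diag(res.hess_inv))

With method="Nelder-Mead" the result object carries no hess_inv, which is exactly the gap the question asks about.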

Related

Can scipy.optimize.least_squares be converted to handle IRLS?

I typically fit GLMs using statsmodels. The output always mentions that the method fits using IRLS.
I then fit a regression problem using scipy.optimize.least_squares, but notice that the coefficients are somewhat different. Is this because scipy.optimize.least_squares performs ordinary least squares? If so, is there any workaround to implement IRLS with the scipy.optimize.least_squares function?
Note: I cannot use ordinary statsmodels GLM as my model requires random parameters.
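For reference, IRLS amounts to repeatedly solving a weighted linear least-squares problem whose weights are recomputed from the current fit. A minimal sketch of that loop, assuming a log link and Poisson-type variance purely for illustration (these are not statsmodels' defaults), using scipy.optimize.least_squares only for the inner weighted solve:

import numpy as np
from scipy.optimize import least_squares

def irls_log_link(X, y, maxiter=50, tol=1e-8):
    # Illustrative IRLS loop (log link, variance V(mu) = mu);
    # not the statsmodels implementation.
    beta = np.zeros(X.shape[1])
    for _ in range(maxiter):
        eta = X @ beta
        mu = np.exp(eta)
        w = mu                          # IRLS weights: (dmu/deta)^2 / V(mu)
        z = eta + (y - mu) / mu         # working response
        step = least_squares(lambda b: np.sqrt(w) * (X @ b - z), beta)
        if np.max(np.abs(step.x - beta)) < tol:
            return step.x
        beta = step.x
    return beta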

Chi-square value from lmfit

I have been trying to do a pixel-to-pixel fitting of a set of images, i.e. I have data at different wavelengths in different images and I am trying to fit a function for each pixel individually. I have done the fitting using lmfit and obtained the values of the unknown parameters for each pixel. Now, I want to obtain the chi-square value for each fit. I know that lmfit has an attribute called chisqr which can give me this, but what is confusing me is this line from the lmfit GitHub site:
"Note that the calculation of chi-square and reduced chi-square assume that the returned residual function is scaled properly to the uncertainties in the data. For these statistics to be meaningful, the person writing the function to be minimized must scale them properly."
I suspect that the values I am getting from the chisqr attribute are not exactly right and that some scaling needs to be done. Can somebody please explain how lmfit calculates the chi-square value and what scaling I am required to do?
This is a sample of my fitting function
def fcn2fit(params, freq, F, sigma):
    colden = params['colden'].value
    tk = params['tk'].value
    model = greybodyfit(np.array(freq), colden, tk)
    return (model - F) / sigma
colden and tk are the free parameters, freq is the independent variable, F is the dependent variable, and sigma is the error in F. Is returning (model - F)/sigma the right way of scaling the residuals so that the chisqr attribute gives the correct chi-square value?
The value reported for chi-square is the sum of the squares of the residual array for the fit. lmfit cannot tell whether that residual function is properly scaled by the standard error of the data; this scaling must be done in the objective function if you are using lmfit.minimize, or passed in as weights if you are using lmfit.Model.
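As a concrete illustration, a sketch of the weights route with lmfit.Model, reusing greybodyfit, freq, F and sigma from the question (the starting guesses are invented, and it is assumed that greybodyfit's first argument is named freq):

import numpy as np
from lmfit import Model

gmodel = Model(greybodyfit, independent_vars=['freq'])
params = gmodel.make_params(colden=1.0, tk=20.0)   # hypothetical starting values
# weights = 1/sigma makes the minimized residual (model - F)/sigma,
# so result.chisqr = sum(((model - F)/sigma)**2), the usual chi-square.
result = gmodel.fit(F, params, freq=np.array(freq), weights=1.0/sigma)
print(result.chisqr, result.redchi)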

How to provide custom gradients to HMC sampler in tensorflow-probability?

I am trying to use the built-in HMC sampler of tensorflow-probability to generate samples from the posterior. According to the documentation, one provides the (possibly unnormalized) log density of the posterior as the target_log_prob_fn callable, and tensorflow-probability automatically computes its gradient (with respect to the parameters to be inferred) to perform the Hamiltonian MCMC updates.
However, for my application the likelihood and the gradient of the resulting posterior are computed outside of tensorflow (they involve the solution of a partial differential equation, which I can compute efficiently with another python library). So I was wondering: is there a way to pass target_log_prob_fn the (unnormalized) log density of the posterior together with its gradient? In other words, can I ask the HMC sampler to use the gradients I provide to perform the MCMC update?
I found a related question over here, but it does not exactly answer my question.
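One pattern that is sometimes suggested for this situation, sketched here with hypothetical stand-ins (external_log_prob and external_grad are placeholders for the PDE-based code, not anything from the question), is to wrap the external computation in tf.custom_gradient so that TensorFlow uses the externally supplied gradient:

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

def external_log_prob(theta):          # placeholder for the PDE-based log density
    return np.float64(-0.5 * np.sum(theta ** 2))

def external_grad(theta):              # placeholder for its externally computed gradient
    return (-theta).astype(np.float64)

@tf.custom_gradient
def target_log_prob(theta):
    logp = tf.py_function(lambda t: external_log_prob(t.numpy()), [theta], tf.float64)
    logp.set_shape([])
    def grad(dy):
        g = tf.py_function(lambda t: external_grad(t.numpy()), [theta], tf.float64)
        g.set_shape(theta.shape)
        return dy * g
    return logp, grad

kernel = tfp.mcmc.HamiltonianMonteCarlo(
    target_log_prob_fn=target_log_prob, step_size=0.1, num_leapfrog_steps=3)
samples = tfp.mcmc.sample_chain(
    num_results=500, current_state=tf.zeros(2, tf.float64),
    kernel=kernel, trace_fn=None)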

Statsmodels - Negative Binomial doesn't converge while GLM does converge

I'm trying to do a Negative Binomial regression using Python's statsmodels package. The model estimates fine when using the GLM routine i.e.
model = smf.glm(formula="Sales_Focus_2016 ~ Sales_Focus_2015 + A_Calls + A_Ed", data=df, family=sm.families.NegativeBinomial()).fit()
model.summary()
However, the GLM routine doesn't estimate alpha, the dispersion term. I tried to use the Negative Binomial routine directly (which does estimate alpha) i.e.
nb = smf.negativebinomial(formula="Sales_Focus_2016 ~ Sales_Focus_2015 + A_Calls + A_Ed", data=df).fit()
nb.summary()
But this doesn't converge. Instead I get the message:
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: nan
Iterations: 0
Function evaluations: 1
Gradient evaluations: 1
My question is:
Do the two routines use different methods of estimation? Is there a way to make the smf.NegativeBinomial routine use the same estimation methods as the GLM routine?
discrete.NegativeBinomial uses either a Newton method (the default in statsmodels) or the scipy optimizers. The main problem is that the exponential mean function can easily lead to overflow, or to very large gradients and Hessians, while we are still far away from the optimum. The fit method makes some attempts to find good starting values, but this does not always work.
A few possibilities that I usually try:
Check that no regressor has large values, e.g. rescale to have a maximum below 10.
Use method='nm' (Nelder-Mead) as the initial optimizer and switch to newton or bfgs after some iterations or after convergence, as sketched below.
Try to come up with better starting values (see for example the point about GLM below).
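A sketch of the 'nm' then gradient-based idea, using the formula and data frame from the question (the maxiter and disp settings are arbitrary):

import statsmodels.formula.api as smf

mod = smf.negativebinomial(
    "Sales_Focus_2016 ~ Sales_Focus_2015 + A_Calls + A_Ed", data=df)
# Robust but slow first pass with Nelder-Mead, then refine with a
# gradient-based optimizer starting from its solution.
res_nm = mod.fit(method="nm", maxiter=2000, disp=False)
res = mod.fit(start_params=res_nm.params, method="bfgs", disp=False)
print(res.summary())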
GLM uses iteratively reweighted least squares (IRLS) by default, which is only standard for one-parameter families, i.e. it takes the dispersion parameter as given. So the same method cannot be used directly for the full MLE in discrete NegativeBinomial.
GLM negative binomial still specifies the full log-likelihood, so it is possible to do a grid search over the dispersion parameter, using GLM.fit() to estimate the mean parameters for each value of the dispersion parameter. This should be equivalent to the corresponding discrete NegativeBinomial version (nb2? I don't remember). It could also be used as start_params for the discrete version.
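A sketch of that grid search, again with the question's formula; the alpha grid here is arbitrary:

import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

alphas = np.linspace(0.05, 2.0, 40)          # arbitrary grid for the dispersion
fits = [smf.glm("Sales_Focus_2016 ~ Sales_Focus_2015 + A_Calls + A_Ed",
                data=df, family=sm.families.NegativeBinomial(alpha=a)).fit()
        for a in alphas]
best_alpha, best_fit = max(zip(alphas, fits), key=lambda af: af[1].llf)
# best_fit.params could then serve as start_params for the discrete model.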
In the statsmodels master version, there is now a connection that allows arbitrary scipy optimizers instead of the hardcoded ones. scipy recently gained trust-region Newton methods, and will get more in the future, which should work for more cases than the simple Newton method in statsmodels.
(However, most likely this does not currently work for discrete NegativeBinomial; I just found out about a possible problem: https://github.com/statsmodels/statsmodels/issues/3747)

How can I set, rather than fit, the coefficients of a spline interpolation using scipy?

I am trying to train a predictive model and want to use a spline-like interpolation to represent a function that forms part of the model. However, this is not a simple case of fitting some x, y data over a region to find the spline coefficients; rather, the function being approximated by the spline forms part of a non-linear generative model. To find the coefficients I need to use non-linear minimisation algorithms to optimise against a training data set. This means I need to be able to specify a set of coefficients directly rather than using the fitting methods in scipy.interpolate (such as scipy.interpolate.UnivariateSpline).
Is there a way to specify spline coefficients and then use the resulting object as a function within the model? If this isn't possible with scipy, is there another python library that supports this functionality?
If I understand you correctly, you basically want to specify your own spline coefficients as independent parameters and then evaluate them using the built-in spline functionality.
See http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.splev.html
splev takes knots and coefficients typically generated by splrep or splprep, but you should be able to bypass those routines and modify the coefficients yourself.
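A minimal sketch of that route: let splrep set up the knot vector once, then treat the coefficient array as free parameters and evaluate with splev (the data here are placeholders):

import numpy as np
from scipy.interpolate import splrep, splev

x = np.linspace(0.0, 10.0, 50)
t, c, k = splrep(x, np.sin(x), k=3)      # placeholder data, just to get knots t

def spline_model(x_new, coeffs):
    # Evaluate the spline defined by the fixed knots t and externally
    # supplied coefficients (e.g. coming from your own optimizer).
    return splev(x_new, (t, coeffs, k))

y_new = spline_model(np.linspace(0.0, 10.0, 200), 1.05 * c)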
