I know about this question: Find p-value (significance) in scikit-learn LinearRegression
I've never extended a class in Python and I'm not sure whether that is the right solution for me (I tried, but got a TypeError). I'm fitting an elastic net regression with scikit-learn. Since my regressors are in a sparse matrix, the statsmodels package is not an option. Thus, I'm looking for a reliable way to calculate p-values for each coefficient in my elastic net regression. Does scikit-learn offer such a solution nowadays?
I am trying to evaluate the logistic model with residual plot in Python.
I searched the internet but could not find any relevant information.
It seems that we can calculate the deviance residual from this answer.
from sklearn.metrics import log_loss

def deviance(X_test, y_true, model):
    # log_loss expects predicted probabilities, so use predict_proba;
    # twice the log loss gives the (mean) binomial deviance.
    return 2 * log_loss(y_true, model.predict_proba(X_test))
This returns a numeric value.
However, when fitting a GLM one can also inspect residual plots, not just a single number.
It seems that there are no Python packages for plotting logistic regression residuals, whether Pearson or deviance.
Moreover, I found an interesting tool, ResidualsPlot, but I'm not sure whether it can be used for logistic regression.
Any suggestion for plotting residuals plot?
In addition, I also found a resource here, which covers OLS rather than logit. It seems that the residual calculations differ somewhat between the two.
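For what it's worth, one way to get such residuals in Python is to fit the logistic model as a statsmodels GLM with a Binomial family, which exposes deviance and Pearson residuals on the results object. A minimal sketch (the data here is made up purely for illustration):
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm

# Illustrative data: X is (n_samples, n_features), y holds 0/1 labels.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Fit the logistic regression as a GLM with a Binomial family.
result = sm.GLM(y, sm.add_constant(X), family=sm.families.Binomial()).fit()

fitted = result.fittedvalues          # predicted probabilities
dev_resid = result.resid_deviance     # deviance residuals
pear_resid = result.resid_pearson     # Pearson residuals

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].scatter(fitted, dev_resid, s=10)
axes[0].set_title("Deviance residuals vs fitted")
axes[1].scatter(fitted, pear_resid, s=10)
axes[1].set_title("Pearson residuals vs fitted")
plt.show()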
I'm attempting to translate R code into Python and am running into trouble trying to replicate the R lm{stats} function, which has a 'weights' argument allowing weights to be used in the fitting process.
My ultimate goal is to simply run a weighted linear regression in Python using the statsmodels library.
Searching through the statsmodels issues I've located "caseweights in linear models" (#743) and "SUMM/ENH rare events, unbalanced sample, matching, weights" (#2701), which make me think this may not be possible with statsmodels.
Is it possible to add weights to GLM models in statsmodels, or alternatively, is there a better way to run a weighted linear regression in Python?
WLS has weights for the linear model, where weights are interpreted as inverse variance for the result statistics.
http://www.statsmodels.org/stable/generated/statsmodels.regression.linear_model.WLS.html
The unreleased version of statsmodels has frequency weights for GLM, but no variance weights.
see freq_weights in http://www.statsmodels.org/dev/generated/statsmodels.genmod.generalized_linear_model.GLM.html
(There are many open issues to expand the types of weights and adding weights to other models, but those are not available yet.)
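For example, a minimal sketch of both options; the data, weights, and case counts below are made up purely for illustration, and freq_weights requires a statsmodels version that already includes it:
import numpy as np
import statsmodels.api as sm

rng = np.random.RandomState(0)
X = sm.add_constant(rng.normal(size=(100, 2)))
y = np.dot(X, [1.0, 2.0, -1.0]) + rng.normal(size=100)

# WLS: weights are interpreted as inverse variances of the observations.
w = rng.uniform(0.5, 2.0, size=100)          # illustrative analytic weights
wls_res = sm.WLS(y, X, weights=w).fit()
print(wls_res.params)

# GLM with frequency (case) weights; needs a statsmodels version that has freq_weights.
counts = rng.randint(1, 4, size=100)         # illustrative integer case counts
glm_res = sm.GLM(y, X, family=sm.families.Gaussian(), freq_weights=counts).fit()
print(glm_res.params)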
In the Python sklearn library, both RandomizedLogisticRegression and RandomizedLasso are supported as feature selection methods.
However, both use an L1 (Lasso) penalty, and I am not sure why both of them are implemented. In fact, I thought that Lasso regression was just another term for L1-regularized logistic regression, but there seems to be some difference.
I think even a linear SVM with an L1 penalty (combined with resampling) would produce a similar result.
Are there significant difference among them?
From: http://scikit-learn.org/stable/modules/feature_selection.html#randomized-l1
RandomizedLasso implements this strategy for regression settings, using the Lasso, while RandomizedLogisticRegression uses the logistic regression and is suitable for classification tasks. To get a full path of stability scores you can use lasso_stability_path.
RandomizedLasso is used for regression in which the outcome is continuous. RandomizedLogisticRegression on the other hand is for classification in which the outcome is a class label.
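A rough sketch of how the two are typically used, assuming an older scikit-learn release where these classes are still available (they were deprecated later); the data and settings are illustrative only:
import numpy as np
from sklearn.linear_model import RandomizedLasso, RandomizedLogisticRegression

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 10))
y_cont = X[:, 0] + 0.1 * rng.normal(size=100)    # continuous target
y_class = (X[:, 1] > 0).astype(int)              # class labels

# Regression setting: stability selection on top of the Lasso.
rlasso = RandomizedLasso(random_state=0).fit(X, y_cont)
print(rlasso.scores_)        # selection frequency per feature

# Classification setting: stability selection on top of L1 logistic regression.
rlogit = RandomizedLogisticRegression(random_state=0).fit(X, y_class)
print(rlogit.scores_)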
I would like to fit a generalized linear model with negative binomial link function and L1 regularization (lasso) in python.
Matlab provides the nice function:
lassoglm(X, y, distr)
where distr can be poisson, binomial etc.
I had a look at both statsmodels and scikit-learn, but I did not find any ready-to-use function or example that could point me towards a solution.
In Matlab it seems they minimize this:
min (1/N * Deviance(β0,β) + λ * sum(abs(β)) )
where deviance depends on the link function.
Is there a way to implement this easily with scikit-learn or statsmodels, or should I go for cvxopt?
statsmodels has had a fit_regularized method for the discrete models, including NegativeBinomial, for some time:
http://statsmodels.sourceforge.net/devel/generated/statsmodels.discrete.discrete_model.NegativeBinomial.fit_regularized.html
which is missing its docstring (I just noticed). The docstring for Poisson contains the same information: http://statsmodels.sourceforge.net/devel/generated/statsmodels.discrete.discrete_model.Poisson.fit_regularized.html
and there should be some examples available in the documentation or unit tests.
It uses an interior-point algorithm with either scipy's slsqp or, optionally, cvxopt if installed. Compared to steepest descent or coordinate descent methods, it is only appropriate when the number of features/explanatory variables is not too large.
Coordinate descent with elastic net for GLM is in a work-in-progress pull request and will most likely be available in statsmodels 0.8.
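As a rough sketch, a regularized negative binomial fit looks something like the following; the data and the penalty weight alpha are purely illustrative, so check the Poisson docstring linked above for the exact arguments:
import numpy as np
import statsmodels.api as sm

rng = np.random.RandomState(0)
X = sm.add_constant(rng.normal(size=(500, 3)))
y = rng.poisson(lam=np.exp(np.dot(X, [0.5, 0.3, -0.2, 0.1])))   # illustrative counts

model = sm.NegativeBinomial(y, X)
# L1-penalized fit; alpha is the penalty weight (scalar or per-parameter array).
result = model.fit_regularized(method='l1', alpha=1.0)
print(result.params)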
I am trying to fit vector autoregressive (VAR) models using the generalized linear model fitting methods included in scikit-learn. The linear model has the form y = X w, but the system matrix X has a very peculiar structure: it is block-diagonal, and all blocks are identical. To optimize performance and memory consumption the model can be expressed as Y = BW, where B is a block from X, and Y and W are now matrices instead of vectors.
The classes LinearRegression, Ridge, RidgeCV, Lasso, and ElasticNet readily accept the latter model structure. However, fitting LassoCV or ElasticNetCV fails due to Y being two-dimensional.
I found https://github.com/scikit-learn/scikit-learn/issues/2402
From this discussion I assume that the behavior of LassoCV/ElasticNetCV is intended.
Is there a way to optimize the alpha/rho parameters other than manually implementing cross-validation?
Furthermore, Bayesian regression techniques in scikit-learn also expect y to be one-dimensional. Is there any way around this?
Note: I use scikit-learn 0.14 (stable)
How crucial is the performance and memory optimization gained by using this formulation of the regression? Given that your reformulation breaks scikit-learn, I wouldn't really call it an optimization... I would suggest:
Running the unoptimized version and waiting (if possible).
Pulling the following code, which supposedly solves your problem. It's referenced in the conversation you posted from the scikit-learn GitHub project. See here for instructions on building scikit-learn from a git checkout. You can then add the branched scikit-learn location to your Python path and run your regression using the modified library code. Be sure to post your experiences and any issues you encounter; I'm sure the scikit-learn developers would appreciate it.
To predict matrices instead of vectors, Lasso and ElasticNet have MultiTask* counterparts:
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.MultiTaskLasso.html
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.MultiTaskElasticNet.html
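For example, a minimal sketch with a two-dimensional target (shapes and alpha values are illustrative only):
import numpy as np
from sklearn.linear_model import MultiTaskElasticNet, MultiTaskLasso

rng = np.random.RandomState(0)
B = rng.normal(size=(200, 5))                    # one block of the block-diagonal system
W_true = rng.normal(size=(5, 3))
Y = np.dot(B, W_true) + 0.1 * rng.normal(size=(200, 3))   # 2-D target, one column per task

# Both estimators accept a two-dimensional Y directly.
mtl = MultiTaskLasso(alpha=0.1).fit(B, Y)
mten = MultiTaskElasticNet(alpha=0.1, l1_ratio=0.5).fit(B, Y)
print(mtl.coef_.shape, mten.coef_.shape)         # (n_tasks, n_features)
Newer scikit-learn releases also provide MultiTaskLassoCV and MultiTaskElasticNetCV, which select alpha by cross-validation directly on the two-dimensional target.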