I am using sklearn for training and testing my data. I want to use lasso and elastic net regressors with some kernel instead of using a linear model. Is there a way to do this?
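There is no built-in kernelized Lasso/ElasticNet in sklearn, but one common workaround is to apply an explicit kernel approximation such as sklearn.kernel_approximation.Nystroem and fit the linear sparse model on the transformed features. A sketch of that idea, with illustrative (untuned) parameters:

```python
# Sketch: approximate a "kernel lasso" by combining an explicit kernel
# feature map (Nystroem) with a linear sparse model in a Pipeline.
# The dataset and parameter values are illustrative, not tuned.
from sklearn.datasets import make_regression
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.pipeline import Pipeline

X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)

kernel_lasso = Pipeline([
    ("kernel", Nystroem(kernel="rbf", gamma=0.1, n_components=100, random_state=0)),
    ("lasso", Lasso(alpha=0.01)),
])
kernel_lasso.fit(X, y)
print(kernel_lasso.score(X, y))

# Same idea with an elastic net penalty on the kernel features:
kernel_enet = Pipeline([
    ("kernel", Nystroem(kernel="rbf", gamma=0.1, n_components=100, random_state=0)),
    ("enet", ElasticNet(alpha=0.01, l1_ratio=0.5)),
])
kernel_enet.fit(X, y)
```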
Let's say I have the following pipeline: GridSearchCV(MultiOutputRegressor(Regressor)).
I am training a model on multiple targets using the MultiOutputRegressor.
How does GridSearchCV operate when it comes to optimizing hyperparameters?
Does it find the optimal hyperparameters for each target individually, or the set that is best on average across targets?
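For what it's worth, which behavior you get depends on the nesting. With GridSearchCV on the outside, a single hyperparameter set is chosen for all targets jointly, because the default R² scoring is averaged over the outputs; nesting the other way around tunes each target separately. A sketch (Ridge and the alpha grid are placeholder choices):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputRegressor

X, y = make_regression(n_samples=200, n_features=10, n_targets=3, random_state=0)

# GridSearchCV outside: ONE alpha shared by all targets, chosen by the
# score averaged over the outputs.
shared = GridSearchCV(
    MultiOutputRegressor(Ridge()),
    {"estimator__alpha": [0.1, 1.0, 10.0]},
    cv=5,
)
shared.fit(X, y)
print(shared.best_params_)

# MultiOutputRegressor outside: a separate search (and alpha) per target.
per_target = MultiOutputRegressor(
    GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=5)
)
per_target.fit(X, y)
print([est.best_params_ for est in per_target.estimators_])
```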
I read about kernel regression:
https://en.wikipedia.org/wiki/Kernel_regression
Does sklearn contain this regression?
I saw sklearn.kernel_ridge.KernelRidge, but it doesn't seem to be the same.
Do I need to implement kernel regression myself, or does sklearn have its own kernel regression models (with different types of kernels)?
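For reference, the Nadaraya-Watson estimator from that Wikipedia article is not shipped in sklearn (KernelRidge solves a different, regularized problem), but it is only a few lines on top of sklearn's pairwise kernels. A rough sketch with a Gaussian kernel, where the bandwidth gamma and the toy data are illustrative:

```python
# Sketch of Nadaraya-Watson kernel regression: each prediction is a
# kernel-weighted average of the training targets.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def nadaraya_watson(X_train, y_train, X_test, gamma=1.0):
    """Gaussian-kernel weighted average of training targets."""
    W = rbf_kernel(X_test, X_train, gamma=gamma)  # shape (n_test, n_train)
    return (W @ y_train) / W.sum(axis=1)

rng = np.random.RandomState(0)
X = rng.uniform(0, 5, size=(100, 1))
y = np.sin(X).ravel() + 0.1 * rng.randn(100)
X_new = np.linspace(0, 5, 10).reshape(-1, 1)
print(nadaraya_watson(X, y, X_new, gamma=2.0))
```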
I have trained a binary classification model using an AWS built-in algorithm with SageMaker and want to evaluate the model using the AUC and a confusion matrix. However, I see that SageMaker's training and hyperparameter tuning jobs accept only the Accuracy metric.
Is there a way in SageMaker to add a custom metric for a built-in image classification algorithm?
As I understand it, AUC, confusion matrix, precision, recall, and F1 are good metrics for a binary classifier, so why are these missing from the AWS built-in image classification algorithm?
Is there a way I can batch transform my test data and get these metrics to evaluate the model, since Accuracy alone is not good for evaluation?
SageMaker built-in algorithms cannot accept custom metrics; they work only with the built-in metrics.
A confusion matrix is not a metric, it's a visualization. Also note that the image classifier is not a binary classifier; it's a general classifier that can handle a large number of labels. Regarding the other metrics, I can't speak on behalf of the AWS teams :)
Yes, using Batch Transform or real-time endpoints to create predictions for your own custom analytics is a good idea. For example, in this blog post an ephemeral endpoint is created to produce predictions and a confusion matrix for the built-in linear classifier: https://aws.amazon.com/blogs/machine-learning/build-multiclass-classifiers-with-amazon-sagemaker-linear-learner/
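Once the batch transform (or endpoint) has produced predicted probabilities for the test set, the metrics themselves can be computed offline with sklearn.metrics. A sketch, assuming the labels and positive-class probabilities have already been pulled into local CSV files (the file names here are hypothetical):

```python
# Sketch: evaluate batch-transform predictions offline with sklearn.
# Assumes the true labels and the predicted probabilities for the
# positive class were already collected; file names are hypothetical.
import numpy as np
from sklearn.metrics import (confusion_matrix, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_true = np.loadtxt("test_labels.csv", delimiter=",")        # 0/1 labels
y_score = np.loadtxt("transform_output.csv", delimiter=",")  # P(class=1)
y_pred = (y_score >= 0.5).astype(int)                        # threshold at 0.5

print("AUC:", roc_auc_score(y_true, y_score))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:", recall_score(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
```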
I'm trying a range of online classifiers in the scikit-learn library to train a model on huge data. I found there are many classifiers supporting partial_fit, allowing for incremental learning. I want to use ridge regression in this setting, but could not find it in the implementation. Is there an alternative model that can do this in sklearn?
sklearn.linear_model.SGDClassifier supports partial_fit; its loss options include 'hinge', 'log', 'modified_huber', 'squared_hinge', and 'perceptron'.
sklearn.linear_model.SGDRegressor also supports partial_fit; its default loss is 'squared_loss', and the possible values are 'squared_loss', 'huber', 'epsilon_insensitive', or 'squared_epsilon_insensitive'. With the default squared loss and the 'l2' penalty, it optimizes essentially the same objective as ridge regression, just trained incrementally.
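A minimal incremental-learning sketch along those lines (the chunk size, alpha, and dataset are illustrative):

```python
# Sketch: ridge-style incremental learning with SGDRegressor.
# The default squared loss plus penalty='l2' mirrors the ridge
# objective; partial_fit consumes the data one mini-batch at a time.
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

X, y = make_regression(n_samples=10_000, n_features=20, noise=0.5, random_state=0)

model = SGDRegressor(penalty="l2", alpha=1e-4, random_state=0)
for start in range(0, len(X), 1_000):  # stream the data in chunks of 1000
    stop = start + 1_000
    model.partial_fit(X[start:stop], y[start:stop])

print(model.coef_[:5])
```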
Is there any automated way to evaluate convergence of the SGDClassifier?
I'm trying to run an elastic net logit in Python and am using scikit-learn's SGDClassifier with log loss and an elastic net penalty. When I fit the model in Python, I get all zeros for my coefficients. When I run glmnet in R, I get significant non-zero coefficients.
After some twiddling I found that the scikit-learn coefficients approach the R coefficients after around 1000 iterations.
Is there a method I'm missing in scikit-learn that iterates until the change in coefficients is relatively small (or a maximum number of iterations has been performed), or do I need to do this myself via cross-validation?
This is a known limitation of the current implementation of scikit-learn's SGD classifier; there is currently no automated convergence check on that model. You can set verbose=1 to get some feedback when running, though.
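Until then, a manual loop of the kind the question describes is easy to write on top of partial_fit; a rough sketch, where the tolerance, the pass cap, and the choice of max absolute coefficient change as the stopping criterion are all arbitrary:

```python
# Sketch: run full passes of SGD until the largest coefficient change
# falls below a tolerance, or a maximum number of passes is reached.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
classes = np.unique(y)

# loss='log' gives logistic regression ('log_loss' in newer scikit-learn)
clf = SGDClassifier(loss="log", penalty="elasticnet", alpha=1e-4, random_state=0)
tol, max_passes = 1e-4, 1000
prev_coef = None
for n_pass in range(max_passes):
    clf.partial_fit(X, y, classes=classes)
    if prev_coef is not None and np.max(np.abs(clf.coef_ - prev_coef)) < tol:
        break
    prev_coef = clf.coef_.copy()
print("stopped after", n_pass + 1, "passes")
```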