For some days now I have been trying to implement gradient boosted trees with flwr.
I started from the example given by flwr.
It uses a LogisticRegression model for the classification of MNIST, which works in my setup.
This is where my problem arises:
Simply replacing the sklearn.linear_model.LogisticRegression model with sklearn.ensemble.GradientBoostingClassifier does not work, because I don't know which values I have to initialize that would be the equivalent of LogisticRegression.coef_. Does anyone have an idea?
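For context, this is roughly the parameter-handling pattern from the flwr sklearn example that I am trying to adapt (a simplified sketch from memory; the helper names may differ from the actual example):

import numpy as np
from sklearn.linear_model import LogisticRegression

# For LogisticRegression, the "model parameters" exchanged with the server are
# just coef_ and intercept_, which is exactly what I don't know how to replace
# for GradientBoostingClassifier.

def get_model_parameters(model: LogisticRegression):
    # Weights that get sent to the flwr server.
    return [model.coef_, model.intercept_]

def set_model_params(model: LogisticRegression, params):
    # Write the server-aggregated weights back into the local model.
    model.coef_ = params[0]
    model.intercept_ = params[1]
    return model

def set_initial_params(model: LogisticRegression, n_classes: int, n_features: int):
    # Initialize coef_/intercept_ so parameters can be read before fit() is called.
    model.classes_ = np.arange(n_classes)
    model.coef_ = np.zeros((n_classes, n_features))
    model.intercept_ = np.zeros((n_classes,))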
If I can get a working version, I will try to add it to flwr as a reference for anybody else.
Thanks in advance
I would like to implement a cost-sensitive loss function in PyTorch. My two-class training dataset is heavily imbalanced: 75% of the data are labelled '0' and only 25% are labelled '1'.
I am new to PyTorch but my supervisor is adamant that I use it (they have more experience with it).
I found some implementations in Keras, but I am not strong enough in coding to port them over to PyTorch.
I have read around for resources on creating a cost-sensitive loss function.
This paper uses something which I think might work (https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9417097), but I do not understand how the code is implemented despite having access to it here (https://github.com/emadeldeen24/AttnSleep/blob/f993511426900f9fca20594a738bf8bee1116381/utils/util.py).
This website describes the math in great detail, but I do not understand it: https://medium.com/rv-data/how-to-do-cost-sensitive-learning-61848bf4f5e7
Here is an implementation in Keras which I have trouble converting to PyTorch: https://towardsdatascience.com/fraud-detection-with-cost-sensitive-machine-learning-24b8760d35d9
I also found this implementation in PyTorch, but have trouble with understanding it: https://discuss.pytorch.org/t/dealing-with-imbalanced-datasets-in-pytorch/22596/21
Could you please help me to understand the last link's implementation of the cost-sensitive loss function?
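From what I can piece together, the approach in that thread seems to boil down to passing per-class weights to the loss. Here is a minimal sketch of my current understanding (the weight values are just my own guess from the 75/25 split, not taken from the thread):

import torch
import torch.nn as nn

# Weight each class by the inverse of its frequency (75% label 0, 25% label 1),
# so mistakes on the rare class '1' cost more than mistakes on the frequent class '0'.
class_weights = torch.tensor([1.0 / 0.75, 1.0 / 0.25])
class_weights = class_weights / class_weights.sum()  # optional normalisation

criterion = nn.CrossEntropyLoss(weight=class_weights)

# Dummy usage: logits for a batch of 4 samples and 2 classes.
logits = torch.randn(4, 2)
targets = torch.tensor([0, 0, 1, 0])
loss = criterion(logits, targets)
print(loss.item())

Is this roughly what the implementation in the last link is doing, or am I missing something?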
Thank you.
I am a newbie working on the LPRNet provided by the OpenVINO toolkit:
https://github.com/openvinotoolkit/training_extensions
I want to get the probability of the predicted result, but it seems that tf.nn.ctc_greedy_decoder only returns neg_sum_logits and I'm not sure how to convert that into a probability.
Does anyone know how I can get that?
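For reference, here is roughly what I am experimenting with. My (possibly wrong) understanding is that if the inputs to the decoder are log-probabilities, then exp(-neg_sum_logits) should approximate the probability of the greedy path (the shapes and sizes below are made up):

import tensorflow as tf

# Stand-in for the raw network output: [max_time, batch_size, num_classes].
max_time, batch_size, num_classes = 20, 1, 37
logits = tf.random.normal([max_time, batch_size, num_classes])
seq_len = tf.fill([batch_size], max_time)

# Normalise to log-probabilities first, so the decoder's score is a log-likelihood.
log_probs = tf.nn.log_softmax(logits, axis=-1)

decoded, neg_sum_logits = tf.nn.ctc_greedy_decoder(log_probs, seq_len)

# neg_sum_logits is the negative sum of the chosen log-probabilities per batch item,
# so exponentiating its negation should give (approximately) the path probability.
path_prob = tf.exp(-neg_sum_logits)
print(path_prob.numpy())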
Any suggestion will be appreciated!
Thanks.
You may try to use the validation app.
The Inference Engine Validation Application is a tool that allows you to infer deep learning models with standard input and output configurations and to collect simple validation metrics for topologies. It supports the top-1 and top-5 metrics for classification networks and the 11-point mAP metric for object detection networks.
You may refer here for some insight: https://docs.openvinotoolkit.org/2019_R1/_inference_engine_samples_validation_app_README.html
And this is a step-by-step explanation of how to use the validation app: https://www.youtube.com/watch?v=4WAQSx3LC-M
I need to get the parameters to use the model in another program.
I tried cat_model.coef_, cat_model.intercept_ and whatever else I could think of. Is it possible to get the params?
I solved this problem; what I was trying to do is called 'saving the model':
cat_model.save_model('cat_model.cbm')
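And in the other program the saved model can be loaded back like this (a quick sketch; I'm assuming a CatBoostClassifier here):

from catboost import CatBoostClassifier

# Load the model that was saved with save_model('cat_model.cbm').
cat_model = CatBoostClassifier()
cat_model.load_model('cat_model.cbm')

# The loaded model can be used for prediction right away, e.g.:
# preds = cat_model.predict(X_test)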
The attributes .coef_ and .intercept_ only exist in scikit-learn's linear regression and logistic regression models and give you the slopes and the intercept (once fitted). You can use .feature_importances_ instead.
For CatBoost, your model has something called feature importances; given that it's a gradient boosted tree model, what you get back is how heavily certain features are used in splitting up the trees.
cat_model.feature_importances_
will tell you that, though you should do more research into how the model works and what it gives you back, because interpreting these importances can be somewhat deceptive.
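For example, something along these lines pairs each importance with its feature name and sorts them (a rough sketch; I'm assuming the model was fitted on named columns so that feature_names_ is populated):

# Pair each feature name with its importance, most important first.
importances = sorted(
    zip(cat_model.feature_names_, cat_model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in importances:
    print(f"{name}: {score:.2f}")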
I have a tobit regression model in R working completely well, where I am also able to predict the actual output values for the test set using the Inverse Mills Ratio. However, the rest of the code for my project is in Python, so I wanted to explore the rpy2 API to migrate the code from R to Python. I have been able to get as far as model training using AER.tobit() from the R library AER. However, when it comes to predicting on test data, the code is not performing as expected. When I use robjects.r.predict(model, newdata) from rpy2, it just gives me the fitted values for the training data instead of predictions for the test data. If anybody knows a way around this, it would be a great help! Thanks in advance!
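A stripped-down sketch of what I am doing (column names and data are purely illustrative, not my real ones):

import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
from rpy2.robjects.packages import importr

pandas2ri.activate()  # let pandas DataFrames be converted to R data.frames automatically
AER = importr("AER")

# Illustrative data; my real frames have many more columns.
train_df = pd.DataFrame({"y": [0.0, 1.2, 0.0, 3.4], "x": [1.0, 2.0, 3.0, 4.0]})
test_df = pd.DataFrame({"x": [5.0, 6.0]})

model = AER.tobit(robjects.Formula("y ~ x"), data=train_df)

# This is the call that seems to ignore the test data: I would expect passing
# newdata= as a named argument (and as an R data.frame) to give test-set predictions.
preds = robjects.r["predict"](model, newdata=test_df)
print(list(preds))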
Let me know if you'd need more clarification on the problem.
I'm very new to machine learning! My problem concerns a model created with LightGBM. I'm not the creator of this model, but I want to see the trees that it generates. The model is saved in .joblib format, and I want to extract as much information about it as possible. In the LGBMClassifier documentation I can't find anything that solves my problem. Thanks to the code below, I only know the number of classes.
model = joblib.load("*.joblib.dat")
model.classes_
Output:
array([0, 1])
I want to know the number of rows, the split rules and, if possible, even a plot of the tree. Thank you all!
When you cannot find something in the LightGBM documentation, look for it in the XGBoost documentation. The LightGBM documentation is missing a lot of features that the framework actually has.
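That said, here is a rough sketch of how you might inspect the trees of your loaded model (assuming the .joblib file contains a fitted LGBMClassifier, so the underlying Booster is reachable via .booster_; plotting additionally needs graphviz installed):

import joblib
import lightgbm as lgb

model = joblib.load("*.joblib.dat")  # placeholder path from the question
booster = model.booster_             # the underlying LightGBM Booster

# Basic sizes: number of trees and number of input features.
print("number of trees:", booster.num_trees())
print("number of features:", booster.num_feature())

# Every split rule of every tree, one row per node, as a pandas DataFrame.
tree_df = booster.trees_to_dataframe()
print(tree_df.head())

# Plot a single tree; tree_index selects which one.
ax = lgb.plot_tree(booster, tree_index=0)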