I have created a custom error metric which prints as I run xgb.train in XGBoost, but it does not actually have any effect on the output. From what I can tell, it is simply printing the custom error metric for each round, not using it to drive training.
I think this because the prediction output is exactly the same as when I use the default error metric. I have also tried hard-coding the error output to a static 1, which should make the result essentially random if the metric were being used, but the result was exactly the same.
Do I need to create a custom objective function for the custom error metric to work?
Thanks!
My code:
# xgboost fitting with arbitrary parameters
library(xgboost)
library(data.table)

xgb_params_1 <- list(
  objective = "reg:linear",
  eta = 0.2,
  max_depth = 6,
  booster = "gbtree"
)

# custom evaluation metric
evalerror <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  score <- as.numeric((sum(preds[1:1000]) - sum(labels[1:1000])) / sum(labels[1:1000]))
  # value hard-coded to 1 to test whether the metric influences training at all
  return(list(metric = "custom_error", value = 1))
}
myWatch <- list(val = dvalid, train = dtrain)

# fit the model with the arbitrary parameters specified above
xgb_1 <- xgb.train(data = dtrain,
                   params = xgb_params_1,
                   nrounds = 150,
                   nthread = 6,
                   verbose = TRUE,
                   print_every_n = 50,
                   watchlist = myWatch,
                   early_stopping_rounds = 1000,
                   eval_metric = evalerror,
                   disable_default_eval_metric = 1)
# perform a prediction on the validation set
pred <- predict(xgb_1, dvalid)
results <- cbind(as.data.table(pred), as.data.table(data[year > trainEndDate, "total_installs"]))

# compute test RMSE
sqrt(mean((results$pred - results$total_installs)^2))
Printed error metrics:
A custom eval_metric is for evaluation purposes only. It is displayed at every round (when a watchlist is used) and it is useful for tuning the number of boosting rounds; you can also use it during cross-validation to tune your parameters so that your metric is maximised/minimised. I use it in particular to tune my learning rate so the model converges faster with fewer rounds. By itself it does not change what the booster optimises (except through early stopping), which is why your predictions are identical.
A custom objective function is a completely different beast and is not the same as an evaluation metric. It defines the loss that training minimises, so it is more like choosing the type of model (classification, regression, etc.), and it drives the convergence of the model. If that is what you want, here is an example of an xgboost regression objective.
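As an illustrative sketch (not necessarily the example referred to above), a custom squared-error objective in the Python API looks roughly like this; the R interface expects the same pair of gradient and hessian vectors:

import numpy as np
import xgboost as xgb

def squared_error_objective(preds, dtrain):
    # gradient and hessian of 0.5 * (pred - label)^2 w.r.t. the prediction
    labels = dtrain.get_label()
    grad = preds - labels
    hess = np.ones_like(preds)
    return grad, hess

# hypothetical DMatrix; replace with your own data
# dtrain = xgb.DMatrix(X_train, label=y_train)
# booster = xgb.train({"eta": 0.2, "max_depth": 6}, dtrain,
#                     num_boost_round=150, obj=squared_error_objective)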
Related
My model performs a multi-class (3) classification task.
I would like to change the way the model "fits". Instead of calculating a metric such as accuracy or logloss, I would like to run a simulation on the whole data set to see how the model performs after each fit, in real time.
Please note that simulation != loss/error. The simulation takes into consideration the time component of the data, i.e. the sequence in which events occur, whereas the loss function simply calculates the error based on the true values.
Currently I do the simulation after the whole "fitting" process has been done:
import lightgbm as lgb
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

all_ds = lgb.Dataset(X, label=y)
train_ds = lgb.Dataset(X_train, label=y_train)
test_ds = lgb.Dataset(X_test, label=y_test)

params = {
    'device_type': 'gpu',
    'objective': 'multiclass',
    'metric': 'multi_logloss',
    'boosting_type': 'gbdt',
    'num_class': 3,
    'random_state': 123
}

# fit
model = lgb.train(
    params,
    train_ds,
    num_boost_round=20,
    valid_sets=[test_ds]
)
# make predictions on the whole data set (Booster.predict takes the raw features, not a Dataset)
y_pred = model.predict(X)
# simulate
simulation_result = simulate(X, y_pred) # float value
The current process is:
fit step 1 - error x
fit step 2 - error y
..
fit step 20 - error z
simulate - see how the model performs
I would like to change the process to:
fit step 1 - simulate - use result of simulation as an error
fit step 2 - simulate - use result of simulation as an error
..
fit step 20 - simulate - use result of simulation as an error
Is there a way to achieve this through a custom callback, a custom evaluation metric, or some other way?
I tried creating a custom eval metric, but unfortunately I cannot invoke predict() from within the function. Moreover, the preds parameter is not something I can use directly without some transformation: it contains a multidimensional array that I have no idea how to convert to actual predictions.
def customEvalMetric(preds, eval_data):
    # how to invoke the predict() method on the whole dataset here?
    # OR how to convert preds to one-hot encoded values?
    # simulation_result = simulate(all_ds, ..?..)
    return 'simulation_result', simulation_result, True
and using it as

model = lgb.train(
    params,
    train_ds,
    num_boost_round=20,
    valid_sets=[all_ds],
    feval=customEvalMetric,
)
P.S. Now that I think about it, I could in theory fit one round at a time in a loop and use init_model to load the existing model weights. Is this the only way?
I suppose this question is applicable to other tree-boosting libraries as well, since the APIs are similar (xgboost, for example).
The custom eval function should work. As per the docs, preds is:
The predicted values. Predicted values are returned before any transformation, e.g. they are raw margin instead of probability of positive class for binary task.
So if this is a classification problem, you might need to apply the softmax transformation to each row. For a regression problem, you should be able to use this output as-is.
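For a 3-class model, a sketch of what that could look like inside the eval function (customEvalMetric, simulate and X follow the names in the question; the reshape assumes the flat, class-major layout that older LightGBM versions pass to feval, while newer versions already deliver an (n_samples, n_classes) array):

import numpy as np

def customEvalMetric(preds, eval_data):
    num_class = 3
    # flat array grouped by class -> (n_samples, n_classes)
    if preds.ndim == 1:
        preds = preds.reshape(num_class, -1).T

    # raw margins -> class probabilities via softmax
    exp = np.exp(preds - preds.max(axis=1, keepdims=True))
    proba = exp / exp.sum(axis=1, keepdims=True)

    # hard class labels (or keep proba if the simulation needs probabilities)
    y_hat = proba.argmax(axis=1)

    simulation_result = simulate(X, y_hat)
    return 'simulation_result', simulation_result, True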
I am training and tuning a model in pycaret such as:
from pycaret.classification import *
clf1 = setup(data = train, target = 'target', feature_selection = True,
             test_data = test, remove_multicollinearity = True,
             multicollinearity_threshold = 0.4)
# create model
lr = create_model('lr')
# tune model
tuned_lr = tune_model(lr)
# optimize threshold
optimized_lr = optimize_threshold(tuned_lr)
I would like to get the parameters estimated for the features in the logistic regression, so I could understand the effect size of each feature on the target. The object optimized_lr has a method optimized_lr.get_params(), but that returns the hyperparameters of the model. I am not interested in my tuning decisions; I am interested in the actual model parameters, i.e. the coefficients estimated by the logistic regression.
How could I get them in pycaret? (I could easily get them using other packages such as statsmodels, but I want to know how to do it in pycaret.)
how about

for f, c in zip(optimized_lr.feature_names_in_, optimized_lr.coef_[0]):
    print(f, c)
To get the coefficients, use this code:

tuned_lr.feature_importances_      # this will give you the coefficients
get_config('X_train').columns      # this will give you the names of the columns

Now we can create a dataframe so that we can see clearly how each feature relates to the target:

Coeff = pd.DataFrame({"Feature": get_config('X_train').columns.tolist(),
                      "Coefficients": tuned_lr.feature_importances_})
print(Coeff)

This gives each coefficient alongside the name of its column. Hope it helps.
Is there a way to measure the accuracy of an ARMA-GARCH model in Python using a prediction interval (alpha = 0.05)? I fitted an ARMA-GARCH model on log returns and used some classical metrics such as RMSE and MSE (out-of-sample), AIC (in-sample), checks on the residuals, and so on. I would like to add a prediction interval as another measure of accuracy based on my ARMA-GARCH model predictions. I used the armagarch library (https://github.com/iankhr/armagarch).
I have already looked into how prediction intervals are used, but I am not sure how to apply them to ARMA-GARCH.
I found this formula searching online: Estimator ± 1.96 (for 95%) * Standard Error.
So far so good, but I have several standard errors in my model output, one for each parameter in the ARMA and GARCH parts. Which one do I have to use? Is there one standard error for the whole model itself?
I would be really happy if anyone could help.
ARMA-GARCH model output
So far I created an ARMA(2,2)-GARCH(1,1) model:
# final test of function
import pandas as pd
import armagarch as ag

# definitions framework
data = pd.DataFrame(data)
meanMdl = ag.ARMA(order = {'AR': 2, 'MA': 2})
volMdl = ag.garch(order = {'p': 1, 'q': 1})
distMdl = ag.normalDist()
model = ag.empModel(data, meanMdl, volMdl, distMdl)
model_fit = model.fit()
After the fit I defined the prediction length and received two arrays as output (mean and variance), which I reshaped to the correct length:

import numpy as np

# first array is the mean, second is the variance
pred = model.predict(nsteps=len(df_test))

# correct the shapes
df_pred_mean = pd.DataFrame(np.reshape(pred[0], (len(df_test), 1)))
df_pred_variance = pd.DataFrame(np.reshape(pred[1], (len(df_test), 1)))
So far so good. Now I would like to implement a prediction interval.
My understanding is that for each step one takes the ARMA prediction ± 1.96 (for 95%) * the GARCH prediction. I implemented it for the upper and lower bound; only the upper bound is shown below, the lower bound is the same formula with -1.96 at the end.
# upper bound
df_all["upper bound"] = df_all["pred_Mean"] + df_all["pred_Variance"] * 1.96
When I plot it against the actual log returns I trained the model on, it fails completely: the interval is simply wrong. Now I'm unsure whether my general approach is wrong or whether the problem is the model, i.e. the package.
prediction interval vs. actual log return
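For reference, a minimal sketch of how such a 95% interval could be assembled from the mean and variance forecasts above. It assumes the column names from the snippet plus a hypothetical actual column holding the realised log returns, and it uses the square root of the predicted variance (the conditional standard deviation) rather than the variance itself:

import numpy as np

z = 1.96  # 95% quantile of the standard normal distribution

# conditional standard deviation = sqrt of the GARCH variance forecast
df_all["pred_Std"] = np.sqrt(df_all["pred_Variance"])

df_all["upper bound"] = df_all["pred_Mean"] + z * df_all["pred_Std"]
df_all["lower bound"] = df_all["pred_Mean"] - z * df_all["pred_Std"]

# empirical coverage: share of realised returns that fall inside the interval
coverage = ((df_all["actual"] >= df_all["lower bound"]) &
            (df_all["actual"] <= df_all["upper bound"])).mean()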
I am using the XGBoost python API and am training my model using the code below.
import xgboost as xgb

num_round = 100
param = {'alpha': 0.28342488159238677, 'eta': 0.43347953113507033, 'max_depth': 2,
         'min_child_weight': 8.774621448662675, 'objective': 'binary:logistic'}

dtrain = xgb.DMatrix(X, label=y)
dval = xgb.DMatrix(val, label=val_y)

evallist = [(dval, 'eval'), (dtrain, 'train')]
eval_dict = {}
bst = xgb.train(param, dtrain, num_boost_round=num_round, evals=evallist,
                early_stopping_rounds=100, evals_result=eval_dict)
I'm concerned that when I print(bst.best_score) it returns the best training score (which also agrees with bst.best_iteration), but often this isn't the iteration with the best validation score. Shouldn't I be mostly interested in the validation score? I don't want the iteration with a crazy good (overfitted) training error; I want the one that performs best on the validation set.
I am using best_ntree_limit to make predictions (as described here), but does this mean I am using the model with the lowest training error to make my predictions? I am guessing I can change something so that the validation error determines the "best" iteration, but I'm confused because I never see anyone do it in examples.
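For what it's worth, xgboost's documented behaviour is that when evals contains more than one item, the last entry drives early stopping (and hence best_score/best_iteration). A sketch of reordering the watchlist from the question so the validation set is used:

# put the validation set last so early stopping and best_score track it
evallist = [(dtrain, 'train'), (dval, 'eval')]

eval_dict = {}
bst = xgb.train(param, dtrain, num_boost_round=num_round, evals=evallist,
                early_stopping_rounds=100, evals_result=eval_dict)

print(bst.best_iteration, bst.best_score)  # now reported on dval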
I'm working on a Kaggle competition (https://www.kaggle.com/c/house-prices-advanced-regression-techniques#evaluation) and it states that my model will be evaluated by:
Submissions are evaluated on Root-Mean-Squared-Error (RMSE) between the logarithm of the predicted value and the logarithm of the observed sales price. (Taking logs means that errors in predicting expensive houses and cheap houses will affect the result equally.)
I couldn't find this in the docs (it's basically RMSE(log(truth), log(prediction))), so I went about writing a custom scorer:
def custom_loss(truth, preds):
    truth_logs = np.log(truth)
    print(truth_logs)
    preds_logs = np.log(preds)
    numerator = np.sum(np.square(truth_logs - preds_logs))
    return np.sum(np.sqrt(numerator / len(truth)))
custom_scorer = make_scorer(custom_loss, greater_is_better=False)
Two questions:
1) Should my custom loss function return a numpy array of scores (one for each (truth, prediction) pair)? Or should it be the total loss over those (truth, prediction) pairs, returning a single number?
I looked into the docs, but they weren't super helpful regarding what my custom loss function should return.
2) When I run:
xgb_model = xgb.XGBRegressor()
params = {"max_depth": [3, 4], "learning_rate": [0.05],
          "n_estimators": [1000, 2000], "n_jobs": [8],
          "subsample": [0.8], "random_state": [42]}
grid_search_cv = GridSearchCV(xgb_model, params, scoring=custom_scorer, n_jobs=8,
                              cv=KFold(n_splits=10, shuffle=True, random_state=42),
                              verbose=2)
grid_search_cv.fit(X, y)
grid_search_cv.best_score_
I get back:
-0.12137097567803554
which is very surprising. Given that my loss function is taking RMSE(log(truth) - log(prediction)), I shouldn't be able to have a negative best_score_.
Any idea why it's negative?
Thanks!
1) You should return a single number as the loss, not an array. GridSearchCV will rank the parameter combinations according to the results of this scorer.
By the way, instead of defining a custom metric, you can use mean_squared_log_error, which does what you want.
2) Why does it return a negative value? Without your actual data and complete code we can't say.
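For example, the built-in metric could be plugged into the grid search roughly like this (a sketch; note that mean_squared_log_error uses log(1 + x) internally and is not square-rooted, so wrap it if you want the exact RMSLE):

import numpy as np
from sklearn.metrics import make_scorer, mean_squared_log_error

def rmsle(truth, preds):
    # square root of the built-in mean squared log error
    return np.sqrt(mean_squared_log_error(truth, preds))

# lower is better, so greater_is_better=False (reported scores will be negated)
custom_scorer = make_scorer(rmsle, greater_is_better=False)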
You should be careful with the notation.
There are 2 levels of optimization here:
The loss function optimized when the XGBRegressor is fitted to the data.
The scoring function that is optimized during the grid search.
I prefer calling the second one a scoring function rather than a loss function, since "loss function" usually refers to the term that is optimized during the model fitting process itself.
However, your custom function only specifies 2., leaving 1. untouched. In case you want to change the loss function of XGBRegressor, see here. Most regression models offer several criteria to choose from, such as mean squared error or mean absolute error.
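For instance, a built-in alternative objective can be selected directly when constructing the regressor (a sketch; "reg:pseudohubererror" is one of xgboost's built-in objectives, the default being "reg:squarederror"):

import xgboost as xgb

# robust pseudo-Huber loss instead of the default squared-error objective
xgb_model = xgb.XGBRegressor(objective="reg:pseudohubererror")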
Note that passing customized loss functions is not supported at the moment (see the reasons here and here).
The make_scorer function flips the sign of the score when greater_is_better is False, which is why best_score_ comes back negative.
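A quick sketch illustrating that sign flip (DummyRegressor is only used here to have something fitted):

import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import make_scorer

def rmse(truth, preds):
    return np.sqrt(np.mean((np.asarray(truth) - np.asarray(preds)) ** 2))

scorer = make_scorer(rmse, greater_is_better=False)

X = np.arange(10).reshape(-1, 1)
y = np.arange(10, dtype=float)
model = DummyRegressor(strategy="mean").fit(X, y)

print(rmse(y, model.predict(X)))  # positive RMSE
print(scorer(model, X, y))        # the same value, negated by make_scorer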