Comments (11)
Hi,
you are looking for the greater_is_better param:
greater_is_better : bool, default=False
    Effective only when searching hyperparameters.
    Whether the quantity to monitor is a score function,
    meaning higher is better, or a loss function, meaning lower is better.
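For example, when the monitored metric is a loss you would pass greater_is_better=False. A minimal sketch with a placeholder estimator and grid:

from lightgbm import LGBMClassifier
from shaphypetune import BoostRFE

model = BoostRFE(
    LGBMClassifier(),                  # placeholder estimator
    param_grid={'max_depth': [4, 8]},  # placeholder grid
    greater_is_better=False            # the monitored metric is a loss
)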
all the best
Hmm, I keep getting an error using Brier score loss (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.brier_score_loss.html).
I was able to get it working with the AUC metric fine.
Here is the error and the function:
ValueError: y_prob contains values less than 0.
# custom eval metric for the XGBoost API: (preds, DMatrix) -> (name, value)
def BRS(y_hat, dtrain):
    y_true = dtrain.get_label()
    return 'brs', brier_score_loss(y_true, y_hat)
I checked the data and there is a good mixture of both 1s and 0s and nothing else.
Your boosting model is simply predicting negative values.
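If that is the case, the raw margin scores can be mapped back into [0, 1] with a sigmoid before scoring. A minimal sketch, assuming y_hat holds raw margins from a binary:logistic objective:

import numpy as np
from sklearn.metrics import brier_score_loss

def BRS(y_hat, dtrain):
    y_true = dtrain.get_label()
    y_prob = 1.0 / (1.0 + np.exp(-y_hat))  # sigmoid: margins -> probabilities
    return 'brs', brier_score_loss(y_true, y_prob)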
When I checked it directly from the model object, all the probabilities were above 0. I also ran into issues using the balanced accuracy measure. Only AUC seems to work.
This is a dummy example which works fine... I hope you can find it helpful.
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss
from shaphypetune import BoostRFE
from lightgbm import LGBMClassifier

X, y = make_classification(n_samples=6000, n_features=20, n_classes=2,
                           n_informative=4, n_redundant=6, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.3, shuffle=False)

# custom LightGBM eval metric: returns (name, value, is_higher_better)
def BRIER(y_true, y_hat):
    return 'brier', brier_score_loss(y_true, y_hat, pos_label=1), False

param_grid = {
    'learning_rate': [0.2, 0.1],
    'num_leaves': [25, 35],
    'max_depth': [10, 12]
}

model = BoostRFE(
    LGBMClassifier(n_estimators=150, random_state=0, metric="custom"),
    param_grid=param_grid, min_features_to_select=1, step=1,
    greater_is_better=False  # the Brier score is a loss: lower is better
)
model.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)], early_stopping_rounds=6, verbose=1,
    eval_metric=BRIER
)
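After fitting, the selected features and the winning configuration can be read from the fitted attributes used later in this thread:

print(model.best_params_, model.best_score_, model.n_features_)
print(model.ranking_)  # RFE-style feature ranking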
All the best
So BoostRFE can be used fine with classification models? Most of the examples here show BoostRFE with regression models:
https://github.com/cerlymarco/shap-hypetune/blob/main/notebooks/XGBoost_usage.ipynb
All the estimators available in shap-hypetune can be used for classification and regression, with both XGBoost and LightGBM.
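For instance, the classification example above carries over to regression with only the estimator swapped. A minimal sketch, reusing the BoostRFE arguments from the example above:

from lightgbm import LGBMRegressor
from shaphypetune import BoostRFE

model = BoostRFE(
    LGBMRegressor(n_estimators=150, random_state=0),  # regression estimator instead of a classifier
    param_grid={'learning_rate': [0.2, 0.1]},
    min_features_to_select=1, step=1
)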
Ah, got you. Okay, I'm still getting errors on the Brier score, but I also got this error on balanced accuracy:
raise ValueError("Classification metrics can't handle a mix of {0} "
ValueError: Classification metrics can't handle a mix of binary and continuous target
In both the original DB and the created target dataframe, all values are 0 and 1.
The regular clf_xgb fits fine and can do both Brier and balanced accuracy without issue, but the code crashes on the BoostRFE model (and also Boruta) at the .fit step. Here is the code:
clf_xgb = XGBClassifier(n_estimators=2000,
                        random_state=0,
                        verbosity=3,
                        n_jobs=-1,
                        scale_pos_weight=1,
                        use_label_encoder=False,
                        objective='binary:logistic',
                        eval_set=[(cv_x, cv_y)])
clf_xgb.fit(train_x, train_y)
class_pred = clf_xgb.predict(train_x)

# sklearn metrics expect (y_true, y_pred) in that order
balanced_accuracy = balanced_accuracy_score(train_y, class_pred)
brier_score = brier_score_loss(train_y, class_pred)
print(brier_score)
print(balanced_accuracy)

model = BoostRFE(clf_xgb, param_grid=param_dist, min_features_to_select=1, step=1,
                 n_iter=8, sampling_seed=0)
model.fit(train_x, train_y, eval_set=[(cv_x, cv_y)], early_stopping_rounds=6,
          verbose=100, eval_metric=ACC)
print(model.estimator_, model.best_params_, model.best_score_, model.n_features_)
print(f"feature ranking {model.ranking_}")
model_ranking_list = list(model.ranking_)
print(model_ranking_list)
It seems you are not using eval_metric=ACC in the regular clf_xgb.
Pay attention! I think that you are passing probabilities (continuous values) to balanced_accuracy_score instead of predicted classes/targets.
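In other words, classification metrics need hard class labels. A minimal sketch of the fix, assuming the cv_x and cv_y variables from the code above:

import numpy as np
from sklearn.metrics import balanced_accuracy_score

y_prob = clf_xgb.predict_proba(cv_x)[:, 1]  # continuous probabilities in [0, 1]
y_pred = (y_prob > 0.5).astype(int)         # threshold into hard 0/1 labels
print(balanced_accuracy_score(cv_y, y_pred))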
I was using the balanced accuracy directly with the following, with no crashes when printing the score out:
balanced_accuracy = balanced_accuracy_score(train_y, class_pred)
Even when I modify clf_xgb to use the custom accuracy function like so, there are no errors:
clf_xgb = XGBClassifier(n_estimators=2000,
                        random_state=0,
                        verbosity=3,
                        n_jobs=-1,
                        scale_pos_weight=1,
                        use_label_encoder=False,
                        objective='binary:logistic',
                        eval_set=[(cv_x, cv_y)],
                        eval_metric=ACC)
and I'm able to print both the balanced accuracy score (0.984741888307878) and the Brier score (0.02292) to the console.
This is a dummy example which works fine... I hope you can find it helpful.
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from shaphypetune import BoostRFE
from xgboost import XGBClassifier

X, y = make_classification(n_samples=6000, n_features=20, n_classes=2,
                           n_informative=4, n_redundant=6, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.3, shuffle=False)

# custom XGBoost eval metric: (preds, DMatrix) -> (name, value);
# threshold the probabilities and return an error, so lower is better
def ACC(y_pred, dtrain):
    y_true = dtrain.get_label()
    y_pred = (y_pred > 0.5).astype(int)
    err = 1 - balanced_accuracy_score(y_true, y_pred)
    return 'bal_acc', err

param_grid = {
    'learning_rate': [0.2, 0.1],
    'num_leaves': [25, 35],
    'max_depth': [10, 12]
}

model = BoostRFE(
    XGBClassifier(n_estimators=150, random_state=0),
    param_grid=param_grid, min_features_to_select=1, step=1,
    greater_is_better=False  # the metric is an error: lower is better
)
model.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)], early_stopping_rounds=6, verbose=1,
    eval_metric=ACC
)
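To then reduce the data to the selected columns, the ranking can be used as a mask. A sketch assuming ranking_ mirrors sklearn's RFE convention, where rank 1 marks a kept feature:

import numpy as np

support = np.asarray(model.ranking_) == 1  # assumption: rank 1 = selected feature
X_train_sel = X_train[:, support]
print(X_train_sel.shape)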
Sincerely, this is the best I can do... all the best. Bye.