The mcboost from mlr-org

If mask is a binary vector, e.g. c(0, 1, 0, 1, 0, 0), this line returns mean(c(labels[1], labels[1])).
Instead we want mean(c(labels[2], labels[4])), right?

Minimal example:

library(mcboost)

data = data.frame(X1 = rnorm(n = 10L), X2 = rnorm(n = 10L))
masks = list(
    rep(c(1, 0), 5)
)
sf = SubgroupFitter$new(masks)
resid = c(1, rep(0, 9))
sm = SubgroupModel$new(masks)
# Observed: the mean is taken over labels[1] repeated (only 1s, no 0s included)
# instead of over the masked rows 1, 3, 5, 7, 9.
mn = sm$fit(data = data, labels = resid)
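The behaviour described above can be reproduced in base R without mcboost. The following is only a sketch of the indexing pitfall, not the package's code: a numeric 0/1 vector used as an index drops the zeros and treats every 1 as "position 1", whereas a logical vector selects the masked positions. The values in labels are made up for illustration.

```r
labels = c(10, 20, 30, 40, 50, 60)
mask = c(0, 1, 0, 1, 0, 0)

# Numeric indexing: zeros are dropped, each 1 selects the first element
numeric_pick = labels[mask]

# Logical indexing: selects the positions where mask is 1
logical_pick = labels[as.logical(mask)]

mean(numeric_pick)  # mean of labels[1] twice
mean(logical_pick)  # mean of labels[2] and labels[4]
```

Converting the mask with as.logical() (or comparing mask == 1) before indexing is one way to get the intended subgroup mean.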

Document Hebert-Johnson et al. and Kim et al. behaviour; select Kim et al. as the default, but make this clear!

Vignette: MCBoost paper examples

- [ ] Ask Michael for data and predictions

DESCRIPTION file should have a cph (copyright holder)

Do proper gradient boosting for brier score optimization

The approach proposed in the papers optimizes the Brier score by working directly with predicted probabilities (which can be updated additively or multiplicatively).
A proper gradient boosting setup where scores are optimized could be a worthwhile addition.
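As a rough illustration of what such a setup could look like, here is a minimal base R sketch that boosts scores on the logit scale using the gradient of the Brier score. Everything in it is an assumption made for illustration: the simulated data, the linear weak learner fitted with lm(), the learning rate eta and the number of iterations. It is not how mcboost updates predictions.

```r
set.seed(1)
n = 200
x = rnorm(n)
y = rbinom(n, 1, plogis(2 * x))

# Brier score of scores f given on the logit scale
brier = function(f, y) mean((plogis(f) - y)^2)

f = rep(0, n)   # initial scores, i.e. p = 0.5 everywhere
eta = 0.5       # learning rate (arbitrary choice)
loss_before = brier(f, y)

for (i in 1:50) {
  p = plogis(f)
  # Negative gradient of (p - y)^2 with respect to f, using dp/df = p * (1 - p)
  neg_grad = -2 * (p - y) * p * (1 - p)
  # Weak learner: regress the negative gradient on the feature
  weak = lm(neg_grad ~ x)
  f = f + eta * as.numeric(fitted(weak))
}

loss_after = brier(f, y)
```

The difference to the papers' updates is that the step is taken on an unconstrained score and mapped to a probability afterwards, so no clipping to [0, 1] is needed.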

Implement regression

Write 2-3 more lines about the auditor effect

Adapt code to list of datasets

In the original paper, the validation data is a list of batches (instead of a single validation set).
Adapt code to allow for it, although this is probably not used in practice.

Update ORCIDs in DESCRIPTION

Resolve duplicates.

Treat GraphLearners

Currently not supported, will fix this.

multiplicative with small noise to get away from 0.

Release mcboost 0.3.0

First release:

usethis::use_cran_comments()
Proof read Title: and Description:
Check that all exported functions have @returns and @examples
Checked!
Check that Authors@R: includes a copyright holder (role 'cph')
mlr3 does not have this!
Check licensing of included files
No external code
Review https://github.com/DavisVaughan/extrachecks
Done

Prepare for release:

urlchecker::url_check()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran()

Submit to CRAN:

usethis::use_version('patch')
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version()
Update install instructions in README

Improve Documentation

What are the outputs for ResidualFitter, ...

Create Extension to Survival Task

In addition to classification tasks, mcboost should also be able to handle survival tasks.

The main differences are:

compute residuals based on the derivative of the Integrated Brier Score (with right-censoring)
deal with individual distributions of survival probabilities instead of individual probabilities (this also introduces a new variable, times)
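To make the first point concrete, here is a hedged base R sketch of the pointwise derivative of the Brier score at a single time t, using the usual inverse-probability-of-censoring weighting. The function name brier_gradient and its arguments are hypothetical, and the censoring survival values G_time and G_t are assumed to be supplied by the caller (e.g. from a Kaplan-Meier fit of the censoring distribution). The integrated version would average this over a grid of times.

```r
# S:      predicted survival probabilities S_i(t) at time t
# time:   observed times; status: 1 = event, 0 = censored
# G_time: censoring survival function evaluated at each observed time
# G_t:    censoring survival function evaluated at t
brier_gradient = function(S, time, status, t, G_time, G_t) {
  died = as.numeric(time <= t & status == 1)  # event before t, target is 0
  alive = as.numeric(time > t)                # still at risk at t, target is 1
  # Observations censored before t get weight 0 in both terms
  2 * S * died / G_time - 2 * (1 - S) * alive / G_t
}

# Three subjects evaluated at t = 2, with no censoring weights (G = 1)
grad = brier_gradient(
  S = c(0.8, 0.3, 0.5),
  time = c(5, 1, 1),
  status = c(1, 1, 0),
  t = 2,
  G_time = c(1, 1, 1),
  G_t = 1
)
```

The negative of this gradient would play the role of the residuals that the auditor is fitted on, now one value per subject and time point.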

Save "auditor effect"

Improve README

Where should pipeopmcboost live?

In this package
In mlr3pipelines?

Health data vignette ranger <-> randomForest difference

Run multicalibration on pre-computed scores w/o access to initial predictor

I'm trying to multi-calibrate scores precomputed from a black-box model (assume we don't have access to the model itself), but I'm getting nonsensical results.

I'm wondering if this should work in theory (and there's some other bug in my code) or if there's a more fundamental reason this doesn't work.

Here's an example to illustrate what I'm trying to do:

library(mcboost)
library(data.table)

# simulate some random data
n = 100
scores = runif(n)
labels = rbinom(n, 1, scores)
is_test = as.logical(rbinom(n, 1, 0.1))
segmentation_features = data.table(
    cbind(
        rbinom(n, 1, 0.1),
        rbinom(n, 1, 0.5)
    )
)

init_predictor = function(data) {
    # Hack to make it return pre-computed scores for train/test since we don't have access to the model
    if(nrow(data) > 50) {
        scores[!is_test]
    } else {
        scores[is_test]
    }
}

mc = MCBoost$new(
    auditor_fitter="TreeAuditorFitter", 
    init_predictor=init_predictor
)

mc$multicalibrate(
    segmentation_features[!is_test],
    labels[!is_test]
)
mc

prs = mc$predict_probs(segmentation_features[is_test])
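The nrow(data) > 50 check in the hack above only works if init_predictor is always called with exactly the full training or test set. A hedged alternative, shown here in base R only, is to carry the pre-computed score as a column so the lookup stays aligned with whatever rows are passed in. The column name score and the feature names seg1 and seg2 are hypothetical. Whether the auditor should also see this column as a feature is a separate decision that this sketch does not settle.

```r
set.seed(1)
n = 100
scores = runif(n)
features = data.frame(
  seg1 = rbinom(n, 1, 0.1),
  seg2 = rbinom(n, 1, 0.5),
  score = scores            # pre-computed black-box score, one per row
)
is_test = seq_len(n) %% 10 == 0

# Returns the stored score for exactly the rows it receives
init_predictor = function(data, ...) {
  data[["score"]]
}

train_scores = init_predictor(features[!is_test, ])
sub_scores = init_predictor(features[c(3, 7, 11), ])
```

Because the scores travel with the rows, the function returns the right values for any subset, not only for the two sizes anticipated by the hack.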

Multiplier for additive
