Comments (9)
Option 1 of median time-to-event is, I think, the most common option and sounds the most straightforward in terms of definition. It'd be great to see that feature live in aorsf given that I think it'd be attractive for users both of aorsf directly and via a framework. Re time: no particular rush. We are currently actively working on survival analysis in tidymodels and want to release a whole lot of new features across the framework in Q1 but we can integrate survival time prediction via aorsf in censored at any time.
from aorsf.
Thank you! I'd be happy to implement this. The biggest obstacle on my end is deciding how to do it. There are a few ways that could work:
1. Compute median time-to-event in each predicted leaf and then aggregate (similar to the `bag_tree-rpart.R` file in `censored`).
2. Compute probability of censoring weights (PCW), then fit a regression forest with those weights (similar to how you compute C-stat/Brier score using inverse PCW, building on ideas in this paper).
3. Compute predicted mortality with `aorsf` and then use one of the existing survival time prediction methods to convert the predicted mortality to predicted time to event.
My thoughts on these:
- I'd estimate that option 1. would take the most time to develop, followed by option 2, and then option 3.
- I think 1. would have to be implemented in `aorsf`, 2. could be implemented in either `aorsf` or `censored`, and so could 3.
- I have no idea which method would actually work best! That's not ideal because I'm tempted to develop all three and then compare them, but I realize you may not want to wait that long =/
@hfrick, do you have thoughts or preferences on how I should proceed? My initial impression is that I like option 1 because it would be the most efficient computationally. However, it would also take me a little while to get it working and then run it through proper tests to make sure it's right.
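For readers following along, the idea behind the first option (a median per leaf, then aggregation) can be sketched roughly as below. This is only an illustration in base R, not how `aorsf` implements it: the helper `median_time()`, the toy `leaf_curves`, the fallback used when a curve never reaches 0.5, and aggregating by the mean are all assumptions made for the example.

```r
# Median time-to-event from a step-function survival curve:
# the first time at which the survival probability is 0.5 or lower.
median_time <- function(times, surv) {
  idx <- which(surv <= 0.5)
  # Assumption: if the curve never reaches 0.5, fall back to the last time.
  if (length(idx) == 0) return(max(times))
  times[min(idx)]
}

# Hypothetical leaf survival curves for ONE observation,
# one curve per tree in the forest.
leaf_curves <- list(
  list(times = c(100, 200, 300, 400), surv = c(0.90, 0.60, 0.40, 0.20)),
  list(times = c(150, 250, 350),      surv = c(0.80, 0.50, 0.30)),
  list(times = c(120, 240, 360),      surv = c(0.95, 0.80, 0.70))
)

# Median within each leaf, then aggregate across trees (mean is an assumption).
leaf_medians <- vapply(
  leaf_curves,
  function(leaf) median_time(leaf$times, leaf$surv),
  numeric(1)
)
predicted_time <- mean(leaf_medians)
```

The third toy curve never drops to 0.5, which is the awkward case for a median-based prediction; the fallback used here is only one of several possible choices.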
from aorsf.
Thank you! I appreciate your thoughts on this very much. I will move ahead using median time-to-event and keep you updated.
from aorsf.
Thanks so much for your willingness to implement this! 🙏
from aorsf.
Hello @hfrick! I'm happy to share an update. With `aorsf` version 0.1.3 and higher, models can predict survival time (reprex below). I have done some preliminary assessment of the predicted survival times and they seem to be a little less effective at discriminating high versus low risk cases than the mortality (`pred_type = 'mort'`) option. This makes sense to me. I think mortality predictions do a better job of quantifying observed events.
Do you think it would be feasible for me to propose making predicted mortality the default for `aorsf` in `yardstick::concordance_survival()`, instead of predicted time? If so, I'd be happy to work on a PR implementing that change. If not, I'm happy to at least resolve the compatibility issue noted in tidymodels/yardstick#475.
``` r
library(aorsf)

fit_time <- orsf(pbc_orsf, time + status ~ . - id,
                 oobag_pred_type = 'time')

predict(fit_time, new_data = pbc_orsf[1:3, ], pred_type = 'time')
#>          [,1]
#> [1,]  360.580
#> [2,] 2555.766
#> [3,] 1195.855

fit_time$eval_oobag$stat_values
#>           [,1]
#> [1,] 0.8360331

fit_mort <- orsf_update(fit_time, oobag_pred_type = 'mort')

fit_mort$eval_oobag$stat_values
#>           [,1]
#> [1,] 0.8435335
```

Created on 2024-01-22 with reprex v2.1.0
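As a side note on what "discriminating high versus low risk cases" means here, a bare-bones version of Harrell's concordance can be sketched in base R as below. This is not the code used by `aorsf` or `yardstick`; the function `harrell_c()` and the toy data are hypothetical, and ties in observed time are ignored for simplicity. The main point is orientation: a predicted time must be negated (longer time means lower risk) before it is compared in the same way as predicted mortality.

```r
# Harrell's C: among comparable pairs (i has an observed event before time j),
# the share of pairs where i was given the higher predicted risk.
harrell_c <- function(time, status, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (status[i] == 1 && time[i] < time[j]) {
        comparable <- comparable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5  # tied predictions count as half
        }
      }
    }
  }
  concordant / comparable
}

# Hypothetical observed outcomes and predictions for four cases.
time      <- c(5, 10, 15, 20)
status    <- c(1, 0, 1, 1)
pred_time <- c(6, 18, 12, 25)        # higher = expected to survive longer
pred_mort <- c(3.0, 1.5, 2.0, 0.5)   # higher = higher risk

c_time <- harrell_c(time, status, risk = -pred_time)  # negate: time is not risk
c_mort <- harrell_c(time, status, risk = pred_mort)
c_flip <- harrell_c(time, status, risk = pred_time)   # wrong orientation
```

In this toy data both correctly oriented predictions rank every comparable pair correctly, while passing the predicted time without negating it reverses the ranking.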
from aorsf.
That's awesome, thank you! 🎉 I've opened tidymodels/censored#301 to enable that in censored. Given that there is such a high focus on consistency across tidymodels, I don't think we are likely to change what the default is for any one engine. At that abstraction level, the goal is typically to not have to remember details about an engine. Mortality predictions are also currently not part of tidymodels but that is something that might change in future. If that happens, that would be the opportunity to enable that for aorsf and others and possibly revisit defaults.
from aorsf.
I totally understand prioritizing consistency! This is a good incentive for me to investigate more thoughtful ways for `aorsf` to predict survival time. I will check out tidymodels/censored#301 and prepare a PR. If there is a deadline for that feature being in `censored`, just let me know and I'll be happy to coordinate.
Thanks for your help improving `aorsf`! It is great working with you.
from aorsf.
Hi @bcjaeger! Sorry for intruding in this issue :)
- Could we maybe have the survival time in `mlr3extralearners` as well (this would be a `response` prediction type, see `mlr3proba::.surv_return()`)? https://github.com/mlr-org/mlr3extralearners/blob/main/R/learner_aorsf_surv_aorsf.R#L178
- I was just reading a paper where the authors calculate survival time from a distribution S(t). In the end, a time-interval weighted approach might be applicable to `aorsf` and easy to implement, as you get the survival matrix S(t) (observations x times) and can easily implement equation (6) from that paper (I think it should have a denominator of (`t_max - t_min`) in there as well...). Of course, these types of calculations might not be ideal in cases where the distribution is improper, as was shown in the C-hacking paper, fig. 2.
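To make the idea of turning a survival matrix into a survival time concrete, here is a minimal base R sketch. It is not equation (6) from the paper mentioned above, which is not reproduced in this thread; it swaps in the standard restricted mean survival time, computed as the area under each row of S(t) with the trapezoidal rule. The function name `rmst()` and the toy inputs are hypothetical.

```r
# Restricted mean survival time for each row of a survival matrix
# (observations x times), via the trapezoidal rule over the time grid.
rmst <- function(surv, times) {
  stopifnot(ncol(surv) == length(times), !is.unsorted(times))
  k <- ncol(surv)
  widths <- diff(times)  # length of each time interval
  # Average survival probability at the two ends of each interval.
  heights <- (surv[, -1, drop = FALSE] + surv[, -k, drop = FALSE]) / 2
  drop(heights %*% widths)
}

# Toy survival matrix: 2 observations evaluated at 4 times.
times <- c(0, 100, 200, 300)
surv <- rbind(
  c(1, 0.80, 0.50, 0.20),
  c(1, 0.95, 0.90, 0.85)
)
pred <- rmst(surv, times)
```

The second row shows the caveat about improper distributions: its curve is still at 0.85 at the last time point, so the result is bounded by the time grid rather than reflecting the full expected survival time.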
from aorsf.
Very nice! I will try this out
from aorsf.