Comments (13)
Also @sebp it might be a good idea to have two default values for `alpha_min_ratio` like glmnet has: 1e-4 when `n_features < n_samples`, and 1e-2 when `n_features > n_samples`.
That's a good idea, would you be able to provide a pull request with this change?
from scikit-survival.
Can you clarify a bit the phrase "5 alphas deep"? Not exactly sure what this means. Thanks!
By "5 alphas deep", I mean that the coefficient path output (`model.coef_`) has shape (M × 5), where M is the number of parameters in the regression and 5 is the alpha depth. I would have expected an output of shape (M × n_alphas).
In reference to my issue, here is what I am seeing for those M parameters, for example:
- `coxnet.CoxnetSurvivalAnalysis(n_alphas=30, l1_ratio=1.0)` gives 15 params != 0
- `coxnet.CoxnetSurvivalAnalysis(n_alphas=20, l1_ratio=1.0)` gives 20 params != 0
- `coxnet.CoxnetSurvivalAnalysis(n_alphas=40, l1_ratio=1.0)` gives 10 params != 0

but in each instance, the `model.coef_` output is (M × 5).
Thank you for replying, and I apologize if there is some caveat I am missing.
I think you are mixing different concepts:
- There is the grid of alpha values, which is determined by your data, `n_alphas`, and `alpha_min_ratio`. The maximum alpha is chosen such that all variables have a coefficient of zero, as determined from your dataset. The next step is to determine the minimum alpha, which is `alpha_min_ratio * alpha_max`. Finally, `n_alphas` different values from `alpha_max` to `alpha_min` are chosen, equally spaced on the log scale. Therefore, when you modify `n_alphas`, `alpha_max` and `alpha_min` will remain the same, but the alphas in between will change.
- It can happen that optimization stops early if `max_iter` has been reached. The coefficients of the remaining `alpha` values will not be updated, and a convergence warning will be displayed.
- Usually, the number of non-zero coefficients increases as `alpha` decreases. This is not a strict requirement, though: in certain situations where features interact with each other, a coefficient can go back to zero.
Yes, I understand how the alpha grid works, and I agree with everything you said above. I think I may not be explaining it very well, and it may be one of those one-off issues with the data I am working with.
The fit is not relaying any messages or errors regarding convergence, and the way I have built elastic nets in the past is exactly the way you describe (calculate the min and max, then log-scale between them over 100 alphas).
I had assumed that if I specified `n_alphas=30`, I would get a matrix of M parameters by 30. Likewise, if I specified `n_alphas=40`, I should get an M × 40 matrix of coefficients, with nearly identical paths to the `n_alphas=30` model, but with those same paths converging or diverging over the next 10 alphas.
It is confusing me why I am getting more non-zero coefficients with `n_alphas=10` (nearly the full solution, actually) and significantly fewer with `n_alphas=40` (very sparse), with both coefficient outputs being M × 5. It's entirely possible it's the data I am working with as well.
I thought perhaps there was a special method or criterion you had in place that just was not displaying any messages, and maybe you might know off the top of your head.
One possible issue I could imagine when using `n_alphas=10` instead of `n_alphas=100` is that the gaps between adjacent alpha values are larger. Hence, moving from one alpha to the next will result in large updates. The algorithm does not perform step-size optimization, so it is possible for updates to overshoot and miss the actual minimum. I would recommend using a relatively dense list of alpha values.
If you want to double-check, you can try R's `glmnet` package, which implements the elastic net too.
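To see how much larger those gaps get, here is a small numeric sketch (assuming a log-spaced grid as described earlier; the `alpha_max` of 1.5 and `alpha_min_ratio` of 0.01 are illustrative values):

```python
import numpy as np

alpha_max, alpha_min_ratio = 1.5, 0.01

def step_ratio(n_alphas):
    """Multiplicative gap between adjacent alphas on a log-spaced grid.

    On a log grid, this ratio is constant along the whole path."""
    grid = np.logspace(np.log10(alpha_max),
                       np.log10(alpha_min_ratio * alpha_max), n_alphas)
    return grid[0] / grid[1]

# A coarse grid shrinks alpha by roughly 40% per step,
# a dense grid by only about 5%.
coarse = step_ratio(10)   # ~1.67
dense = step_ratio(100)   # ~1.05
```

The much larger per-step change on the coarse grid is what makes the warm-started coordinate updates between consecutive alphas less reliable.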
This was a good idea. I checked the same data using `glmnet` with `family='cox'`. It stopped at an alpha depth of 52, and the paths look similar when you specify `n_alphas=5`. When you try something higher, like `n_alphas=10`, the paths look similar, but the Python code doesn't return the rest of the coefficient matrix, and I can't figure out why. For Python, `n_alphas` is the parameter, and for R it's `nlambda`. I've seen alpha and lambda interchanged in the past; I'm not confusing these in this instance as it relates to your code, am I?
You are correct, glmnet's `nlambda` corresponds to `n_alphas`.
Are you saying that scikit-survival does not return the full path of 10 alphas, but glmnet does? Could you plot the individual estimates as dots in the plots, in addition to lines, please?
Yes, in this particular instance, it is not returning the full path of alphas no matter what `n_alphas` I specify. I know when I first opened the issue I was all over the place, but this is definitely the main point of my confusion. It makes me think something is inadvertently defaulting somehow within the code, as I did not change anything in the code itself.
One quick thought: in the `sksurv` model call, can you check `alpha_min_ratio` and perhaps decrease it? Then perhaps you may get the longer paths seen in `glmnet`?
Sorry for the delay in response. I tried your suggestion, but it did not work. It seems like a unique issue, and in the end, with the way `glmnet` is built, I can still get a sparse solution relative to the shorter paths. Additionally, the selected variables seem intuitive and appropriate.
I have a feeling this might be similar to the issue or confusion I've been having related to alphas that I commented on in #47.
I've found that Coxnet will silently not use all the alphas down the autogenerated sequence once the alpha values get too small, but it won't raise any warnings or errors during the fit.
For example, it might calculate an alpha max of 1.5 from the data, and with an `alpha_min_ratio` of 0.01 it will create the `alphas_` sequence of `n_alphas` alphas from 1.5 down to 0.015. When it does the fit, it doesn't typically use all the alphas down the sequence, and this seems to be normal behavior. It doesn't show any convergence warnings.
I only realized this when I was trying to do model selection/CV based on the gist example and got `Numerical error... consider increasing alpha` errors when it was fitting individual alphas from the autogenerated sequence from the initial fit on the data I did to generate the sequence.
@dex314 I would consider increasing `alpha_min_ratio` so that the sequence of alphas doesn't become too small, and maybe you will see it uses more of them and more alphas are shown in `coef_`.
Also @sebp it might be a good idea to have two default values for `alpha_min_ratio` like glmnet has: 1e-4 when `n_features < n_samples`, and 1e-2 when `n_features > n_samples`.
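One way to detect this kind of silent truncation after a fit could look like the following sketch. It assumes the fitted estimator exposes the alphas it actually used as `alphas_` and one coefficient column per fitted alpha in `coef_`; `FittedPath` is a stand-in object for illustration, not a scikit-survival class:

```python
import numpy as np

class FittedPath:
    """Stand-in for a fitted path-based estimator: stores the alphas
    actually used (alphas_) and one coefficient column per fitted
    alpha (coef_)."""
    def __init__(self, alphas_used, n_features):
        self.alphas_ = np.asarray(alphas_used)
        self.coef_ = np.zeros((n_features, len(alphas_used)))

def path_is_truncated(model, n_alphas_requested):
    # Fewer fitted alphas than requested means the path stopped early.
    return len(model.alphas_) < n_alphas_requested

# Example: 100 alphas requested, but only 52 were actually fit.
model = FittedPath(alphas_used=np.logspace(0.18, -1.8, 52), n_features=30)
truncated = path_is_truncated(model, n_alphas_requested=100)  # True
```

Checking the fitted path length against the requested `n_alphas` like this makes the truncation visible even when no warning is raised.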
The default value for `alpha_min_ratio` will depend on `n_features` and `n_samples` in a future release. I added a warning to notify users about this change (see commit dfd645e).
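The glmnet-style default being adopted here could be sketched as follows. `default_alpha_min_ratio` is a hypothetical helper illustrating the convention, not the actual scikit-survival implementation:

```python
def default_alpha_min_ratio(n_samples, n_features):
    """glmnet-style default: a smaller ratio (longer path toward small
    alphas) when there are more samples than features, and a larger
    ratio (shorter path) in the high-dimensional case."""
    return 1e-4 if n_samples > n_features else 1e-2
```

For example, `default_alpha_min_ratio(1000, 50)` would give `1e-4`, while `default_alpha_min_ratio(50, 1000)` would give `1e-2`, keeping the path from descending into very small alphas when the problem is high-dimensional.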