Comments (13)

sebp commented on June 15, 2024

> Also @sebp it might be a good idea to have two default values for alpha_min_ratio like glmnet has, 1e-4 when n_features < n_samples, and 1e-2 when n_features > n_samples.

That's a good idea. Would you be able to provide a pull request with this change?

from scikit-survival.

plpxsk commented on June 15, 2024

Can you clarify a bit the phrase "5 alphas deep"? Not exactly sure what this means. Thanks!

dex314 commented on June 15, 2024

By "5 alphas deep" I mean that the coefficient path output (model.coef_) has shape (M x 5), where M is the number of parameters in the regression and 5 is the alpha depth. I would have expected an output of shape (M x n_alphas).
Regarding my issue, for those M parameters I am seeing, for example, the following:

  • coxnet.CoxnetSurvivalAnalysis(n_alphas=30, l1_ratio=1.0) gives 15 params != 0
  • coxnet.CoxnetSurvivalAnalysis(n_alphas=20, l1_ratio=1.0) gives 20 params != 0
  • coxnet.CoxnetSurvivalAnalysis(n_alphas=40, l1_ratio=1.0) gives 10 params != 0

but in each instance, the model.coef_ output is (M x 5).
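To illustrate what I mean, here is a minimal sketch with a made-up coefficient matrix standing in for model.coef_ (the values are invented, not from my data):

```python
import numpy as np

# Made-up stand-in for model.coef_ with shape (M, n_alphas_fitted):
# rows are the M regression parameters, columns are alphas along the path.
coef = np.array([
    [0.0, 0.3, 0.5, 0.6, 0.7],
    [0.0, 0.0, 0.1, 0.2, 0.2],
    [0.0, 0.0, 0.0, 0.0, -0.1],
])

print(coef.shape)                       # the "alpha depth" is the second dimension
print(np.count_nonzero(coef, axis=0))   # non-zero parameters at each alpha
```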

Thank you for replying and I apologize if there is some caveat I am missing.

sebp commented on June 15, 2024

I think you are mixing different concepts:

  1. There is the grid of alpha values, which is determined by your data, n_alphas, and alpha_min_ratio. The maximum alpha is chosen such that all variables have a coefficient of zero, as determined from your dataset. The next step is to determine the minimum alpha, which is alpha_min_ratio * alpha_max. Finally, n_alphas different values from alpha_max to alpha_min are chosen, equally spaced on a log scale. Therefore, when you modify n_alphas, alpha_max and alpha_min remain the same, but the alphas in between change.
  2. Optimization can stop early if max_iter has been reached. The coefficients for the remaining alpha values will not be updated, and a convergence warning will be displayed.
  3. Usually, the number of non-zero coefficients increases as alpha decreases. This is not a strict requirement, though: in certain situations, if features interact with each other, a coefficient can go back to zero.
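The grid construction in point 1 can be sketched as follows (a minimal illustration, not the actual scikit-survival implementation; alpha_max is assumed to be given, since in practice it is derived from the data):

```python
import numpy as np

def make_alpha_grid(alpha_max, n_alphas=100, alpha_min_ratio=0.01):
    """Sketch of the scheme described above: n_alphas values from
    alpha_max down to alpha_min_ratio * alpha_max, equally spaced
    on a log scale. Hypothetical helper, not the library's own code."""
    alpha_min = alpha_min_ratio * alpha_max
    return np.logspace(np.log10(alpha_max), np.log10(alpha_min), num=n_alphas)

grid = make_alpha_grid(1.5, n_alphas=5, alpha_min_ratio=0.01)
# The endpoints stay fixed at 1.5 and 0.015 regardless of n_alphas;
# only the spacing of the values in between changes.
```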

dex314 commented on June 15, 2024

Yes, I understand how the alpha grid works, and I agree with everything you said above. I may not be explaining it very well, and it may be one of those one-off issues with the data I am working with.
The fit is not reporting any messages or errors regarding convergence, and the way I have built elastic nets in the past is exactly as you describe (calculate the min and max, then log-scale between them over 100 alphas).
I had assumed that if I specified n_alphas=30, I would get a matrix of M parameters by 30. Likewise, if I specified n_alphas=40, I should get an M x 40 matrix of coefficients with nearly identical paths to the n_alphas=30 model, those same paths converging or diverging over the next 10 alphas.
It is confusing me why I get more non-zero coefficients with n_alphas=10 (nearly the full solution, actually) and significantly fewer with n_alphas=40 (very sparse), with both coefficient outputs being M x 5. It's entirely possible it's the data I am working with as well.
I thought perhaps there was a special method or criterion in place that just was not displaying any messages, and that you might know off the top of your head.

sebp commented on June 15, 2024

One possible issue I could imagine when using n_alphas=10 instead of n_alphas=100 is that the gaps between adjacent alpha values are larger. Hence, moving from one alpha to the next will result in large updates. The algorithm does not perform step-size optimization, so it is possible for updates to overshoot and miss the actual minimum. I would recommend using a relatively dense list of alpha values.

If you want to double check, you can try R's glmnet package, which implements elastic net too.

dex314 commented on June 15, 2024

This was a good idea. I checked the same data using glmnet with family='cox'. It stopped at an alpha depth of 52, and the paths look similar when you specify n_alphas=5. When you try something higher, like n_alphas=10, the paths look similar, but the Python code doesn't return the rest of the coefficient matrix and I can't figure out why. For Python the parameter is n_alphas, and for R it is nlambda. I've seen alpha and lambda interchanged in the past; I'm not confusing these in this instance as it relates to your code, am I?

Here are the Python paths:
[image: Python coefficient paths]

Here is glmnet in R:
[image: glmnet coefficient paths]

sebp commented on June 15, 2024

You are correct, glmnet's nlambda corresponds to n_alphas.

Are you saying that scikit-survival does not return the full path of 10 alphas, but glmnet does? Could you plot the individual estimates as dots in the plots, in addition to lines, please?

dex314 commented on June 15, 2024

Yes, in this particular instance it is not returning the full path of alphas, no matter what n_alphas I specify. I know when I first opened the issue I was all over the place, but this is definitely the main point of my confusion. It makes me think something is inadvertently defaulting somewhere within the code, as I did not change anything in the code itself.

GLMNET
[image: glmnet coefficient paths with point markers]

SKSURV
[image: scikit-survival coefficient paths with point markers]

plpxsk commented on June 15, 2024

One quick thought: in the sksurv model call, can you check alpha_min_ratio and perhaps decrease it? Then you may get the longer paths seen in glmnet.

dex314 commented on June 15, 2024

Sorry for the delay in responding. I tried your suggestion, but it did not work. It seems like a unique issue, and in the end, given the way glmnet is built, I can still get a sparse solution relative to the shorter paths. Additionally, the selected variables seem intuitive and appropriate.

hermidalc commented on June 15, 2024

I have a feeling this might be similar to the issue or confusion I’ve been having related to alphas that I commented on in #47.

I've found that Coxnet will silently not use all the alphas down the autogenerated sequence once the alpha values get too small, but it won't raise any warnings or errors during the fit.

For example, it might calculate an alpha max of 1.5 from the data, and with alpha_min_ratio set to 0.01 it will create the alphas_ sequence of n_alphas values from 1.5 down to 0.015. When it does the fit, it typically doesn't use all the alphas down the sequence, and this seems to be normal behavior. It doesn't show any convergence warnings.

I only realized this when I was trying to do model selection/CV based on the gist example and got `Numerical error... consider increasing alpha` errors when fitting individual alphas taken from the sequence autogenerated by the initial fit I did on the data.

@dex314 I would consider increasing alpha_min_ratio so that the alphas in the sequence don't become too small; maybe you will then see that it uses more of them and more alphas are shown in coef_.

Also @sebp it might be a good idea to have two default values for alpha_min_ratio like glmnet has, 1e-4 when n_features < n_samples, and 1e-2 when n_features > n_samples.
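The two defaults could look something like this (a sketch of the proposal; the function name is hypothetical, not part of scikit-survival):

```python
def default_alpha_min_ratio(n_samples, n_features):
    """glmnet-style default described above (hypothetical helper):
    a small ratio (longer path) when there are more samples than
    features, and a larger ratio (shorter path) otherwise."""
    return 1e-4 if n_features < n_samples else 1e-2
```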

sebp commented on June 15, 2024

The default value for alpha_min_ratio will depend on n_features and n_samples in a future release. I added a warning to notify users about this change (see commit dfd645e).
