yandexdataschool / roc_comparison

The fast version of DeLong's method for computing the covariance of unadjusted AUC.

License: MIT License

Python 100.00%

roc_comparison's Introduction

A Python implementation of an algorithm for computing the statistical significance of the difference between two sets of predictions, as measured by ROC AUC. It can also compute the variance of a single ROC AUC estimate.

Reference: X. Sun and W. Xu, "Fast Implementation of DeLong’s Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves," in IEEE Signal Processing Letters, vol. 21, no. 11, pp. 1389-1393, Nov. 2014, doi: 10.1109/LSP.2014.2337313.
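To make the quantity being estimated concrete, here is a minimal sketch of DeLong's variance for a single AUC, written with the direct O(m*n) structural-component formulation. It is not the fast algorithm from Sun and Xu that this repository implements, and the names `psi` and `naive_delong_variance` are illustrative, not part of the repository.

```python
from statistics import mean, variance


def psi(x, y):
    """Mann-Whitney kernel: 1 if the positive outscores the negative, 0.5 on ties."""
    if x > y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0


def naive_delong_variance(ground_truth, scores):
    """Return (AUC, variance of AUC) from DeLong's structural components.

    Direct O(m*n) version for illustration only; needs at least two
    positives and two negatives for the sample variances to exist.
    """
    positives = [s for s, y in zip(scores, ground_truth) if y == 1]
    negatives = [s for s, y in zip(scores, ground_truth) if y == 0]
    m, n = len(positives), len(negatives)
    # Structural components: average kernel value per positive and per negative
    v10 = [mean(psi(x, y) for y in negatives) for x in positives]
    v01 = [mean(psi(x, y) for x in positives) for y in negatives]
    auc = mean(v10)
    # Sample variances (n - 1 denominator) scaled by the class sizes
    auc_variance = variance(v10) / m + variance(v01) / n
    return auc, auc_variance


labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
auc, auc_variance = naive_delong_variance(labels, scores)
print(auc, auc_variance)
```

For real data sets the fast implementation in the repository is the better choice; the sketch only shows which numbers it is computing.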

roc_comparison's Issues

Compare mean AUC to a constant predictor

Hello,
I was reading the paper "A rapid volume of interest-based approach of radiomics analysis of breast MRI for tumor decoding and phenotyping of breast cancer" by Demircioglu et al. 2020, where the authors use the DeLong test to check whether the AUC of a ROC curve differs from that of a constant predictor, using the code in this repo.

In this paper the authors compute the average AUC over the test sets of 25 repeats of 5-fold cross-validation.

I have a similar situation. I have an average AUC (over 100 different, but not independent test sets) and would like to see if this average is significantly different to that of a constant predictor.

How can I use this code to perform this analysis? I can't figure this out from the example script.

Thank you!
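For a single test set, one possible approach is a normal-approximation z-test of the AUC against 0.5, using the AUC and variance that delong_roc_variance() returns. The sketch below assumes exactly that setting; it does not address averaging over many dependent test sets, because the variance of such an average is not simply the mean of the per-fold variances. The function name `auc_vs_chance_pvalue` and the input numbers are hypothetical.

```python
import math


def auc_vs_chance_pvalue(auc, auc_variance, null_auc=0.5):
    """Two-sided normal-approximation p-value for H0: AUC == null_auc."""
    z = (auc - null_auc) / math.sqrt(auc_variance)
    # P(|Z| >= |z|) for a standard normal Z
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers standing in for the output of delong_roc_variance()
z, p = auc_vs_chance_pvalue(auc=0.75, auc_variance=0.0025)
print(f"z = {z:.2f}, p = {p:.2e}")
```

A constant predictor ties every positive with every negative, so its AUC is 0.5, which is why 0.5 is used as the null value here.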

Incorrect argument definitions?

Thanks for implementing this!

In both delong_roc_variance() and delong_roc_test(), predictions and predictions_one/predictions_two are defined as "probability of being class 1". Are these arguments necessarily arrays of probability values? I think ROC curves can be built off of any kind of numerical 'score', not only probabilities.

Your example code (https://github.com/yandexdataschool/roc_comparison/blob/master/example.py) passes normal random variables rather than probabilities (the variable is even called scores). I also found that scaling the inputs has no effect on the p-value returned from delong_roc_test, and no errors are raised for input values outside [0, 1].

So would it make sense to revise the parameter descriptions to call them 'scores' rather than 'probabilities'? It might even make sense to rename the variable from predictions to scores.
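The observation about scaling follows from the fact that AUC depends only on how the scores rank positives against negatives. A minimal sketch, using a hypothetical pairwise `auc` helper rather than anything from the repository, shows that strictly increasing transforms leave the value unchanged:

```python
import math


def auc(ground_truth, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, ground_truth) if y == 1]
    negatives = [s for s, y in zip(scores, ground_truth) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


labels = [0, 1, 0, 1, 1, 0]
raw_scores = [-1.2, 0.3, 0.5, 2.1, -0.4, -3.0]            # arbitrary real-valued scores
scaled = [10 * s + 7 for s in raw_scores]                 # affine rescaling
squashed = [1 / (1 + math.exp(-s)) for s in raw_scores]   # logistic map into (0, 1)

# All three orderings are identical, so the AUC is identical too
assert auc(labels, raw_scores) == auc(labels, scaled) == auc(labels, squashed)
```

Since the DeLong statistics are built from the same pairwise comparisons, any real-valued score should work as input, which supports calling the arguments 'scores'.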

np.newaxis expired since np1.15

np.newaxis appears to have stopped working for me since NumPy 1.15. Could the author update this as well? Thanks.

You get an error like:
TypeError: '(None, array([ 72301, 56214, 26730, ..., 313412, 313400, 915478]))' is an invalid key

Try the following code:

def delong_roc_variance(ground_truth, predictions, sample_weight=None):
    """
    Computes ROC AUC variance for a single set of predictions
    Args:
       ground_truth: np.array of 0 and 1
       predictions: np.array of floats of the probability of being class 1
    """
    order, label_1_count, ordered_sample_weight = compute_ground_truth_statistics(
        ground_truth, sample_weight)
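    # Original line: predictions_sorted_transposed = predictions[np.newaxis, order]
    # np.asarray guards against non-ndarray inputs (assumption: the TypeError above
    # comes from tuple-indexing a pandas object, not from np.newaxis itself).
    predictions_sorted_transposed = np.expand_dims(np.asarray(predictions)[order], axis=0)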
    aucs, delongcov = fastDeLong(predictions_sorted_transposed, label_1_count, ordered_sample_weight)
    assert len(aucs) == 1, "There is a bug in the code, please forward this to the developers"
    return aucs[0], delongcov

Incorrect output

Thanks a lot for your code! It helped me a lot.
While using this ROC comparison, I ran into a problem:

I calculated y_prob_xgboost and y_prob_simple from previous models. Both are probabilities between 0 and 1. y_test holds the true labels.

In the code below, I used the function you kindly provided in the compare_auc_delong_xu module:

delongcov = compare_auc_delong_xu.delong_roc_test(y_test, y_prob_xgboost, y_prob_simple)
print(delongcov)

However, I get a result of
[[-15.40781058]]

This did not make sense to me after reading some of the functions. I believe the purpose of "calc_pvalue" is to calculate the p-value, which means the result should be positive rather than negative. By the way, I did not get any error messages while running it, which means my problem may not be the one covered by the assertion in your code.

Could you please tell me what might cause this? Thanks a lot!
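One likely explanation, assuming (as the docstrings in compare_auc_delong_xu.py appear to indicate) that delong_roc_test returns the base-10 logarithm of the p-value rather than the p-value itself: a negative number is then expected, and the p-value is recovered by exponentiation. A minimal sketch under that assumption:

```python
log10_p = -15.40781058        # value printed in the report above
p_value = 10 ** log10_p       # undo the log10 to get the p-value
print(f"p = {p_value:.3g}")
```

Under that reading the reported value corresponds to a p-value on the order of 1e-16, that is, a highly significant difference between the two AUCs. Returning the logarithm avoids underflow when p-values are extremely small.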

pypi

Hi authors,
Thanks for this great software.
Is it also available on PyPI? It would be great to have an easy install option for it.
Best wishes,

License

Dear yandexdataschool,

Thanks for sharing these scripts with us. They are very easy and useful!

Could you add a license for the files you have in this repository?

Kind regards,
Victor

test_variance_weight fails

10/11 unit tests pass, but this one fails:

Failure
Traceback (most recent call last):
  File "C:\Users\ohadl\AppData\Local\Continuum\anaconda3\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "C:\Users\ohadl\AppData\Local\Continuum\anaconda3\lib\unittest\case.py", line 615, in run
    testMethod()
  File "C:\Users\ohadl\repos\PythonUtils\test\roc_comparison\test.py", line 133, in test_variance_weight
    numpy.testing.assert_allclose(variances.mean(), aucs.var(), rtol=0.1)
  File "C:\Users\ohadl\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\testing\_private\utils.py", line 1452, in assert_allclose
    verbose=verbose, header=header, equal_nan=equal_nan)
  File "C:\Users\ohadl\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\testing\_private\utils.py", line 789, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=0.1, atol=0

(mismatch 100.0%)
 x: array(0.013752)
 y: array(0.017002)

Not sure what the implications are.
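As a reading aid for the failure message: numpy.testing.assert_allclose passes when |actual - desired| <= atol + rtol * |desired|. The sketch below only redoes that arithmetic with the two numbers from the traceback; it says nothing about why the estimates differ.

```python
actual = 0.013752     # variances.mean() from the failure message
desired = 0.017002    # aucs.var() from the failure message
rtol, atol = 0.1, 0.0

abs_diff = abs(actual - desired)
allowed = atol + rtol * abs(desired)   # tolerance used by assert_allclose
rel_diff = abs_diff / abs(desired)

print(f"|x - y| = {abs_diff:.6f}, allowed = {allowed:.6f}, relative = {rel_diff:.1%}")
```

The mean DeLong variance is roughly 19% below the empirical variance of the AUCs in this run, about twice the 10% tolerance the test allows.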

Incorrect number of arguments in fastDeLong

When calling delong_roc_variance(), it gives fastDeLong() 3 arguments while by definition fastDeLong() only accepts two.

This shouldn't be an issue since sample_weight = None, but the issue persists after even manually deleting the input ordered_sample_weight on line #137, e.g.:

>> auc, auc_cov = delong_roc_variance(Y_test, Y_pred)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-191-031a16e7f474> in <module>

----> 5 auc, auc_cov = delong_roc_variance(Y_test, Y_pred)
      6 auc_std = np.sqrt(auc_cov)
      7 lower_upper_q = np.abs(np.array([0,1])-(1-alpha)/2)

~\path\roc_comparison-master\compare_auc_delong_xu.py in delong_roc_variance(ground_truth, predictions)
    135         ground_truth, sample_weight)
    136     predictions_sorted_transposed = predictions[np.newaxis, order]
--> 137     aucs, delongcov = fastDeLong(predictions_sorted_transposed, label_1_count)
    138     assert len(aucs) == 1, "There is a bug in the code, please forward this to the developers"
    139     return aucs[0], delongcov

TypeError: fastDeLong() takes 2 positional arguments but 3 were given
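The traceback shows source text with a two-argument call on line 137, yet the interpreter reports that three arguments were given. One possible explanation (an assumption, not confirmed in this thread) is that the notebook kernel is still running a definition imported before the file was edited. A quick way to check what the loaded function accepts is sketched below; `fastDeLong` here is a hypothetical stand-in with the two-parameter signature from the error message, and `accepts_n_positional` is an illustrative helper.

```python
import inspect


def fastDeLong(predictions_sorted_transposed, label_1_count):
    """Hypothetical stand-in with the two-parameter signature from the error."""
    return None


def accepts_n_positional(func, n):
    """Return True if func can be called with n positional arguments."""
    try:
        inspect.signature(func).bind(*([None] * n))
    except TypeError:
        return False
    return True


print(accepts_n_positional(fastDeLong, 2))  # two arguments bind
print(accepts_n_positional(fastDeLong, 3))  # three do not
```

If the loaded functions do not match the file on disk, restarting the kernel or calling importlib.reload on the module should bring them back in sync.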
