roc_comparison's Introduction

Contributors: kazeevn, keithreid-sfw

roc_comparison's Issues

Compare mean AUC to a constant predictor

Hello,
I was reading the paper "A rapid volume of interest-based approach of radiomics analysis of breast MRI for tumor decoding and phenotyping of breast cancer" by Demircioglu et al. 2020, where the authors use the DeLong test, with the code in this repo, to check whether the AUC of their ROC curves differs from that of a constant predictor.

In this paper the authors compute the average AUC over the test sets of 25 repeats of 5-fold cross-validation.

I have a similar situation. I have an average AUC (over 100 different, but not independent, test sets) and would like to see whether this average is significantly different from that of a constant predictor.

How can I use this code to perform this analysis? I can't figure this out from the example script.

Thank you!
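
For a single test set, one way to frame the question is a z-test of the AUC against 0.5, the AUC of a constant predictor, using the DeLong variance estimate. The sketch below is a self-contained NumPy illustration and not this repo's implementation; the names delong_auc_and_variance and p_value_vs_chance are hypothetical. It does not address averaging over 100 dependent test sets, where the per-fold estimates are correlated and a simple z-test on the mean may not be valid.

```python
import math
import numpy as np

def delong_auc_and_variance(y_true, scores):
    """AUC and its DeLong variance for one test set (O(m*n) sketch)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # psi[i, j] = 1 if positive i outranks negative j, 0.5 on ties, else 0
    diff = pos[:, None] - neg[None, :]
    psi = (diff > 0) + 0.5 * (diff == 0)
    v10 = psi.mean(axis=1)  # structural components of the positives
    v01 = psi.mean(axis=0)  # structural components of the negatives
    auc = float(psi.mean())
    var = float(v10.var(ddof=1) / len(pos) + v01.var(ddof=1) / len(neg))
    return auc, var

def p_value_vs_chance(y_true, scores, null_auc=0.5):
    """Two-sided p-value for H0: AUC == null_auc (normal approximation).

    Assumes the variance is non-zero, i.e. the classes are not
    perfectly separated.
    """
    auc, var = delong_auc_and_variance(y_true, scores)
    z = (auc - null_auc) / math.sqrt(var)
    # 2 * (1 - Phi(|z|)) expressed through the complementary error function
    return auc, math.erfc(abs(z) / math.sqrt(2.0))

# Tiny illustrative data set (made up for this example)
y_demo = [0, 0, 1, 1]
s_demo = [0.1, 0.4, 0.35, 0.8]
auc_demo, var_demo = delong_auc_and_variance(y_demo, s_demo)
_, p_demo = p_value_vs_chance(y_demo, s_demo)
```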

Incorrect argument definitions?

Thanks for implementing this!

In both delong_roc_variance() and delong_roc_test(), predictions and predictions_one/predictions_two are defined as "probability of being class 1". Are these arguments necessarily arrays of probability values? I think ROC curves can be built off of any kind of numerical 'score', not only probabilities.

Your example code (https://github.com/yandexdataschool/roc_comparison/blob/master/example.py) passes normal random variables rather than probabilities (the variable is even called scores). I also found that scaling the inputs has no effect on the p-value returned from delong_roc_test, and no errors are raised for input values outside [0, 1].

So would it make sense to revise the parameter descriptions to call them 'scores' rather than 'probabilities'? It might even make sense to rename the variable from predictions to scores.
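
The observation about scaling follows from the AUC depending only on how the scores rank positives against negatives. The snippet below is a small sketch that illustrates this with a hypothetical helper, auc_from_scores, on made-up data; it does not call this repo's functions.

```python
import numpy as np

def auc_from_scores(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

rng = np.random.default_rng(0)
y = np.array([0] * 50 + [1] * 50)
raw = rng.normal(size=100) + y  # unbounded scores, not probabilities

auc_raw = auc_from_scores(y, raw)
auc_scaled = auc_from_scores(y, 10 * raw + 3)                 # affine rescaling
auc_sigmoid = auc_from_scores(y, 1.0 / (1.0 + np.exp(-raw)))  # squashed into (0, 1)
```

All three values agree because each transform is strictly increasing and therefore preserves the ordering of the scores.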

np.newaxis expired since np1.15

Since NumPy 1.15, np.newaxis seems to have expired. Could the author update this as well? Thanks.

You get an error like this:
TypeError: '(None, array([ 72301, 56214, 26730, ..., 313412, 313400, 915478]))' is an invalid key

Try the following code:

def delong_roc_variance(ground_truth, predictions, sample_weight=None):
    """
    Computes ROC AUC variance for a single set of predictions
    Args:
       ground_truth: np.array of 0 and 1
       predictions: np.array of floats of the probability of being class 1
    """
    order, label_1_count, ordered_sample_weight = compute_ground_truth_statistics(
        ground_truth, sample_weight)
    # predictions_sorted_transposed = predictions[np.newaxis, order]
    # the np.newaxis indexing above raised "invalid key" here; expand_dims avoids it
    predictions_sorted_transposed = np.expand_dims(predictions[order], axis=0)
    aucs, delongcov = fastDeLong(predictions_sorted_transposed, label_1_count, ordered_sample_weight)
    assert len(aucs) == 1, "There is a bug in the code, please forward this to the developers"
    return aucs[0], delongcov
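
For what it is worth, np.newaxis is still available in current NumPy, where it is simply an alias for None. The "is an invalid key" message resembles what pandas raises when a Series is indexed with a tuple, so one possible cause (an assumption, not confirmed in this thread) is that predictions was passed as a pandas Series. The sketch below, on made-up arrays, shows that both spellings give the same result on a NumPy array, and that converting the input with np.asarray first keeps the original indexing usable.

```python
import numpy as np

# Made-up stand-ins for the variables inside delong_roc_variance
predictions = [0.2, 0.9, 0.4, 0.7]   # could also be a list or a Series
order = np.array([1, 3, 0, 2])

# Converting up front guarantees NumPy indexing semantics
predictions_arr = np.asarray(predictions, dtype=float)

via_newaxis = predictions_arr[np.newaxis, order]
via_expand_dims = np.expand_dims(predictions_arr[order], axis=0)

newaxis_is_none = np.newaxis is None
```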

Incorrect output

Thanks a lot for your code! It helped me a lot.
While using this ROC comparison, I ran into a problem:

I calculated y_prob_xgboost and y_prob_simple from previous models. Both are probabilities between 0 and 1. y_test holds the true labels.

In the code below, I used the function you kindly provided in the compare_auc_delong_xu module:

delongcov = compare_auc_delong_xu.delong_roc_test(y_test, y_prob_xgboost, y_prob_simple)
print(delongcov)

However, I get a result of
[[-15.40781058]]

This does not make sense to me after reading some of the functions. I believe calc_pvalue is meant to calculate the p-value, which should be positive rather than negative. I also did not get any error message while running it, so my problem may not be the one covered by the assertion in your code.

Could you please tell me what might cause this? Thanks a lot!
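
One reading that would explain a negative number is that the function returns the base-10 logarithm of the p-value rather than the p-value itself. This is an assumption that is not confirmed in this thread and should be checked against the docstrings in compare_auc_delong_xu. Under that assumption, the conversion is a one-liner:

```python
import numpy as np

# Value reported above, in the same 1x1 array shape it was printed in
log10_p = np.array([[-15.40781058]])

# Assumption: the result is log10(p-value), so undo the logarithm
p_value = 10.0 ** log10_p.item()
```

With this interpretation, a strongly negative result corresponds to a very small p-value, not to an invalid one.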

pypi

Hi authors,
Thanks for this great software.
Is it also available on PyPI? It would be great to have an easy install option for it.
Best wishes,

License

Dear yandexdataschool,

Thanks for sharing these scripts with us. They are very easy and useful!

Could you add a license for the files you have in this repository?

Kind regards,
Victor

test_variance_weight fails

10/11 unit tests pass, but this one fails:

Failure
Traceback (most recent call last):
  File "C:\Users\ohadl\AppData\Local\Continuum\anaconda3\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "C:\Users\ohadl\AppData\Local\Continuum\anaconda3\lib\unittest\case.py", line 615, in run
    testMethod()
  File "C:\Users\ohadl\repos\PythonUtils\test\roc_comparison\test.py", line 133, in test_variance_weight
    numpy.testing.assert_allclose(variances.mean(), aucs.var(), rtol=0.1)
  File "C:\Users\ohadl\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\testing\_private\utils.py", line 1452, in assert_allclose
    verbose=verbose, header=header, equal_nan=equal_nan)
  File "C:\Users\ohadl\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\testing\_private\utils.py", line 789, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=0.1, atol=0

(mismatch 100.0%)
 x: array(0.013752)
 y: array(0.017002)

Not sure what the implications are.

Incorrect number of arguments in fastDeLong

When calling delong_roc_variance(), it gives fastDeLong() 3 arguments while by definition fastDeLong() only accepts two.

This shouldn't be an issue since sample_weight = None, but the issue persists even after manually deleting the ordered_sample_weight argument on line #137, e.g.:

>> auc, auc_cov = delong_roc_variance(Y_test, Y_pred)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-191-031a16e7f474> in <module>

----> 5 auc, auc_cov = delong_roc_variance(Y_test, Y_pred)
      6 auc_std = np.sqrt(auc_cov)
      7 lower_upper_q = np.abs(np.array([0,1])-(1-alpha)/2)

~\path\roc_comparison-master\compare_auc_delong_xu.py in delong_roc_variance(ground_truth, predictions)
    135         ground_truth, sample_weight)
    136     predictions_sorted_transposed = predictions[np.newaxis, order]
--> 137     aucs, delongcov = fastDeLong(predictions_sorted_transposed, label_1_count)
    138     assert len(aucs) == 1, "There is a bug in the code, please forward this to the developers"
    139     return aucs[0], delongcov

TypeError: fastDeLong() takes 2 positional arguments but 3 were given

