I've finished writing a basic <a href="https://github.com/gforge/torchnet/blob/meter-t

The AUC-meter evaluates differently from classical statistics

facebookarchive commented on August 13, 2024

The AUC-meter evaluates differently from classical statistics

from torchnet.

Comments (6)

lvdmaaten commented on August 13, 2024

The method of computing the AUC described in the article you linked to is a bit different from what is implemented here. Note how the article linearly interpolates between the actual observations. This is not entirely accurate: the linear interpolation is only an approximation of the curve at the unobserved points. We use a more conservative, piecewise-constant approximation: in our case, the plot would look like a step function that never lies above the linearly interpolated version (and equals it only at the observed points).

As a result, the AUC we measure will never be higher than what the article's method computes, and will usually be lower. It is still possible that there is a bug in AUCMeter, but you need a different test case to check.
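
To make the difference between the two approximations concrete, here is a minimal sketch in Python (not the Lua used in this thread, and not torchnet's actual AUCMeter code). The function names `roc_points`, `auc_trapezoid` and `auc_step` are hypothetical; the sketch only assumes that the "conservative" variant holds the lower true-positive rate until the next observed point.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr) points, one per distinct score threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Samples with the same score cross the threshold together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auc_trapezoid(points):
    """Area under the linearly interpolated curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def auc_step(points):
    """Area under a step curve that keeps the lower TPR between points."""
    return sum((x1 - x0) * y0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# One positive and one negative share the score 0.5 (a tie).
tied = roc_points([0.9, 0.5, 0.5, 0.1], [1, 1, 0, 0])
# No positive shares a score with a negative.
untied = roc_points([0.8, 0.35, 0.4, 0.1], [1, 1, 0, 0])

print(auc_step(tied), auc_trapezoid(tied))      # 0.75 0.875
print(auc_step(untied), auc_trapezoid(untied))  # 0.75 0.75
```

In this sketch the two estimates only diverge when a positive and a negative share a score, because that is where the interpolated curve has a diagonal segment that the step curve undercuts.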

gforge commented on August 13, 2024

Thanks for the explanation. I think I grasp the idea, but since I'm not entirely comfortable with the calculation I switched approach: I now simply expect random guesses to give an AUC of 0.5 and perfect guesses to give 1. Unfortunately I still seem to be missing something, as I can't get the "perfect guess" case to work. My current test code is:

function test.AUCMeter()
   local mtr = tnt.AUCMeter()

   local test_size = 10^3
   mtr:add(torch.rand(test_size), torch.zeros(test_size))
   mtr:add(torch.rand(test_size), torch.Tensor(test_size):fill(1))
   local err = mtr:value()
   tester:eq(err, 0.5, "Random guesses should provide a AUC close to 0.5", 10^-1)

   mtr:add(torch.Tensor(test_size):fill(0), torch.zeros(test_size))
   mtr:add(torch.Tensor(test_size):fill(0.1), torch.zeros(test_size))
   mtr:add(torch.Tensor(test_size):fill(0.2), torch.zeros(test_size))
   mtr:add(torch.Tensor(test_size):fill(0.3), torch.zeros(test_size))
   mtr:add(torch.Tensor(test_size):fill(0.4), torch.zeros(test_size))
   mtr:add(torch.Tensor(test_size):fill(1), torch.Tensor(test_size):fill(1))
   err = mtr:value()
   tester:eq(err, 1, "Only correct guesses should provide a AUC close to 1", 10^-1)

   -- Simulate a random situation where all the guesses are correct
   mtr:reset()
   local output = torch.abs(torch.rand(test_size)-.5)*2/3
   mtr:add(output, torch.zeros(test_size))
   output = torch.min(
      torch.cat(torch.rand(test_size) + .75,
                torch.Tensor(test_size):fill(1),
                2),
      2)
   mtr:add(output:fill(1), torch.Tensor(test_size):fill(1))
   err = mtr:value()
   tester:eq(err, 1, "Simulated random correct guesses should provide a AUC close to 1", 10^-1)
end

I've tried several versions of this, and the estimate stays around 0.75. I guess it's related to the step approximation, since it evaluates to 3/4, but in my mind the random attempt should smooth out the steps.

lvdmaaten commented on August 13, 2024

The first unit test is a bit flaky because it contains randomness (maybe use an example for which you know the correct answer instead?). Also, note that you're missing a mtr:reset() between tests 1 and 2. (And are you sure you want the output:fill(1) in the last mtr:add(...)?)

This bug is now fixed. Thanks for spotting this!
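
To illustrate why the missing reset matters, here is a small Python sketch (again not Lua, and not torchnet's implementation). `SimpleAUCMeter` is a hypothetical stand-in that accumulates samples like a meter would and computes a plain pairwise AUC; evenly spaced scores replace the random ones so the result is deterministic.

```python
class SimpleAUCMeter:
    """Hypothetical accumulating meter; AUC is computed pairwise."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.scores, self.labels = [], []

    def add(self, scores, labels):
        self.scores.extend(scores)
        self.labels.extend(labels)

    def value(self):
        pos = [s for s, y in zip(self.scores, self.labels) if y == 1]
        neg = [s for s, y in zip(self.scores, self.labels) if y == 0]
        # A pair counts 1 if the positive scores higher, 0.5 on a tie.
        correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                      for p in pos for n in neg)
        return correct / (len(pos) * len(neg))


n = 100
grid = [i / n for i in range(n)]  # uninformative scores for both classes

meter = SimpleAUCMeter()
meter.add(grid, [0] * n)
meter.add(grid, [1] * n)
uninformative = meter.value()  # exactly 0.5

# Without reset(), the perfectly separated samples are pooled with the
# uninformative ones, so the result stays well below 1.
meter.add([0.0] * n, [0] * n)
meter.add([1.0] * n, [1] * n)
pooled = meter.value()

# With reset(), only the perfectly separated samples are evaluated.
meter.reset()
meter.add([0.0] * n, [0] * n)
meter.add([1.0] * n, [1] * n)
separated = meter.value()  # 1.0

print(uninformative, pooled, separated)
```

Under these assumptions the pooled value lands between 0.5 and 1, which would fail a "close to 1" assertion with a 0.1 tolerance even if the meter itself were correct.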

gforge commented on August 13, 2024

Thanks. The output:fill(1) was a leftover from debugging. If you dislike the randomness, you can add a torch.manualSeed(123) to make sure it never randomly fails, although the chances should be small considering the sample size. I would love to use an example with a known correct answer, but since I'm an M.D. without formal ML training I don't have any material that I can use as a correct test. I've tried to Google it, but the AUCs that I found were classical calculations that didn't apply here.
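
One possible source of known answers is a case small enough to count by hand. The Python sketch below (not Lua; `pairwise_auc` is a hypothetical helper, not a torchnet function) treats the AUC as the fraction of (positive, negative) pairs that are ranked correctly. None of the examples has a positive and a negative with the same score, so a step-based and an interpolated estimate should agree on them.

```python
def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


# Pairs (positive, negative):
#   (0.35, 0.1) correct, (0.35, 0.4) wrong, (0.8, 0.1) correct, (0.8, 0.4) correct
known = pairwise_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])     # 3 of 4 pairs

# Every positive outranks every negative.
perfect = pairwise_auc([0.0, 0.2, 0.9, 1.0], [0, 0, 1, 1])

# Every negative outranks every positive.
inverted = pairwise_auc([0.9, 1.0, 0.0, 0.2], [0, 0, 1, 1])

print(known, perfect, inverted)  # 0.75 1.0 0.0
```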

lvdmaaten commented on August 13, 2024

Okay, yeah let's fix the random seed for the test then. I don't think it is a good idea to have the unit tests fail with some non-zero probability, since we plan to increasingly rely on Travis to determine whether or not pull requests are okay.

Thanks for contributing these tests!
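
The effect of fixing the seed can be sketched in Python as follows (an illustration only: the thread's tests use torch.manualSeed in Lua, while this sketch uses Python's standard `random` module, and `pairwise_auc` and `random_guess_auc` are hypothetical helpers). The sample size is smaller than the 10^3 used in the thread to keep the pairwise computation quick.

```python
import random


def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


def random_guess_auc(seed, n=500):
    """AUC of uniformly random scores for n negatives and n positives."""
    rng = random.Random(seed)  # private generator; global state is untouched
    scores = [rng.random() for _ in range(2 * n)]
    labels = [0] * n + [1] * n
    return pairwise_auc(scores, labels)


first = random_guess_auc(123)
second = random_guess_auc(123)

# Same seed, same scores, same AUC: the test outcome no longer varies by run.
print(first == second)
print(abs(first - 0.5) < 0.1)
```

With a fixed seed, the assertion either always passes or always fails, so a continuous-integration run cannot fail by chance.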

gforge commented on August 13, 2024

Done. Thank you for the excellent package and your patience with my questions.
