
bdl-benchmarks's People

Contributors

filangelos, nband


bdl-benchmarks's Issues

Plots showing results in Colab notebook display metrics on different scales.

I could be wrong, but this plot, for example,

[screenshot of plot]

is clearly suspect, since the naive method gives an AUC of around 0 rather than 50. I believe the issue is that the results dictionary reports metrics on a [0, 1] scale, while the baseline models report them on a [0, 100] scale. I can't actually find where in the code the baseline results are scaled by 100, or else I would happily make a pull request. In general, I think it would be easier to keep everything on a [0, 1] scale.
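
For anyone working around this locally in the meantime, here is a minimal sketch of rescaling both sources onto [0, 1] before plotting; the dictionary layout below is an illustrative assumption, not the repo's actual data structure:

```python
# Hedged sketch: put both metric sources on the same [0, 1] scale before
# plotting. The dict-of-lists layout is an assumption for illustration.
def rescale(metrics, scale):
  return {name: [v / scale for v in values] for name, values in metrics.items()}

results = {"auc": [0.52, 0.61, 0.70]}    # already on [0, 1]
baselines = {"auc": [50.0, 55.0, 63.0]}  # reported on [0, 100]
results = rescale(results, 1.0)
baselines = rescale(baselines, 100.0)
```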

[question] What does "random" refer to in the diagrams?

I was wondering what the random baseline actually refers to in the tables/diagrams.
Also, any idea why the ensembles perform worse than MC Dropout? I think the Deep Ensembles paper showed them performing better than MC Dropout (please correct me if I'm wrong).

Thanks!

Replicating the results of the leaderboard

I've been trying to replicate the results of your leaderboard, but I found a number of things confusing (based on the "medium" data in the linked Colab):

  1. The leaderboard is based on the "realworld" level, but the Colab is based on the "medium" level. Do you have ready-made medium results?
  2. Using a VGG-16 model (the one found in mc_dropout/model) and training it, I found the results below.

For deterministic:
[accuracy plot, with the deterministic run in pink]

And for mc_dropout:
[accuracy plot]
With numbers (first is mc_dropout, second is deterministic):
[table of metrics]

In your paper, mc_dropout outperformed the deterministic approach by quite a bit. I didn't expect the deterministic approach to perform so badly; these results seem somewhat more sensible, but not to this extent. Can you find the reason for this discrepancy?

  3. The AUC results behave weirdly. For mc_dropout:

[AUC plot]

Here is a Colab to replicate the above.
I also recommend updating your linked Colab with the proper required packages, as in its current form it does not run.

Interpretation of the predictive uncertainty for decision making

Hello,
In classification, is there any way to interpret the obtained predictive uncertainty? After computing the predictive uncertainty, is there a way to calculate a threshold or cutoff value (as in the diagram you show here: https://camo.githubusercontent.com/e78af0e93f0ea7cc80e38f7b9273486bbf6f37f6/687474703a2f2f7777772e63732e6f782e61632e756b2f70656f706c652f616e67656c6f732e66696c6f732f6173736574732f62646c2d62656e63686d61726b732f646961676e6f7369732e706e67) such that if the predictive variance is above that value we can say the model is uncertain, and below it the model is certain about its prediction?
Uncertain if (predictive variance >= threshold) || Certain if (predictive variance < threshold)
How can this threshold be computed?
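
One possible convention (my assumption, not something the benchmark prescribes) is to derive the cutoff from a referral rate, i.e. flag a fixed fraction of the most uncertain cases, rather than pick an absolute value:

```python
import numpy as np

# Hedged sketch: compute the cutoff from a referral fraction. `uncertainty`
# holds one predictive variance per case; the 50% referral rate below is
# only an example, not a benchmark-mandated value.
def referral_threshold(uncertainty, fraction_referred=0.5):
  """Cutoff such that `fraction_referred` of cases fall at or above it."""
  return np.quantile(uncertainty, 1.0 - fraction_referred)

uncertainty = np.array([0.01, 0.20, 0.05, 0.40])
tau = referral_threshold(uncertainty, fraction_referred=0.5)
is_uncertain = uncertainty >= tau  # refer these cases to an expert
```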
Thanks!

TF 2.0 full release breaks image preprocessing

Hi OATML,

(My apologies in advance as preview mode was indicating some formatting issues that I'm not able to fix.)

TF 2.0 breaks image preprocessing for bdl-benchmarks. The problem appears to be that TF 2.0 changed how it works with Python's local symbol table relative to TF 2.0 Beta.

This means that transforms.Compose is no longer able to properly compose a transformation for TF's dataset.map function. The information that the compose function requires is no longer present in the local table:

TF 2.0 Beta provides:

output of locals(): {
'nargs': 1,
'f': <bdlb.diabetic_retinopathy_diagnosis.benchmark.DiabeticRetinopathyDiagnosisBecnhmark._preprocessors..Parse object at 0x7f6a6314bcf8>,
'inspect': <module 'inspect' from '/usr/local/lib/python3.6/inspect.py'>,
'x': {'image': <tf.Tensor 'args_0:0' shape=(None, 256, 256, 3) dtype=uint8>,
'label': <tf.Tensor 'args_1:0' shape=(None,) dtype=int64>,
'name': <tf.Tensor 'args_2:0' shape=(None,) dtype=string>},
'self': <bdlb.core.transforms.Compose object at 0x7f6a5c1c7240>}

While TF 2.0 provides only:

output of locals(): {
'caller_fn_scope': <tensorflow.python.autograph.core.function_wrappers.FunctionScope object at 0x7efaf43c0080>,
'kwargs': None, 'args': (),
'options': <tensorflow.python.autograph.core.converter.ConversionOptions object at 0x7efaf43c02b0>,
'f': }


A simple proposed fix to get things working again is to stop relying on locals() to discern information about the class being passed to the composition, and instead define explicit class function signatures, so that we rely solely on how Python can introspect itself (i.e. use only the inspect module). Note, of course, that this solution is not robust to potential changes in the preprocessing functions that may be composed.

First, we create a unique function signature for the CastX() corner case in bdl-benchmarks/bdlb/diabetic_retinopathy_diagnosis/benchmark.py:

```python
def __call__(self, x, y_nochange):
  # Cast only the image; pass the label through unchanged. The parameter is
  # deliberately named `y_nochange` so that the composer below does not
  # route it through the standard `y` branch.
  return tf.cast(x, self.dtype), y_nochange
```

Then we use explicit signatures instead of locals() to compose, in bdl-benchmarks/bdlb/core/transforms.py:

```python
def __call__(self, x):
  import inspect

  y = None               # fixes the original NameError: y was never initialised
  returned_pair = False  # did the most recent transform return (x, y)?

  for f in self.trans:
    returned_pair = False
    nargs = len(inspect.signature(f).parameters)

    if nargs == 2 and "y" in inspect.signature(f).parameters:
      # Transform with an explicit (x, y) signature.
      x, y = f(x, y)
      returned_pair = True
    elif nargs == 1:
      # Plain transform mapping x to x.
      x = f(x)
    else:
      # Two arguments but no parameter named `y` (e.g. CastX): unpack x.
      x, y = f(x[0], x[1])
      returned_pair = True

  # If the last function in the composition returned two values, return
  # both; otherwise return only x.
  if returned_pair:
    return x, y
  return x
```
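
To sanity-check the approach end to end, here is a hedged, self-contained sketch; MiniCompose, parse, and the toy dataset below are illustrative stand-ins for the repo's actual classes, not its API:

```python
import inspect

import tensorflow as tf


class MiniCompose:
  """Stripped-down stand-in for bdlb.core.transforms.Compose."""

  def __init__(self, trans):
    self.trans = trans

  def __call__(self, x):
    y = None
    returned_pair = False
    for f in self.trans:
      returned_pair = False
      nargs = len(inspect.signature(f).parameters)
      if nargs == 2 and "y" in inspect.signature(f).parameters:
        x, y = f(x, y)
        returned_pair = True
      elif nargs == 1:
        x = f(x)
      else:
        # Two arguments but no parameter named `y` (e.g. CastX): unpack x.
        x, y = f(x[0], x[1])
        returned_pair = True
    return (x, y) if returned_pair else x


def parse(x):
  # Illustrative one-argument transform: split the feature dict into
  # (image, label).
  return x["image"], x["label"]


class CastX:

  def __init__(self, dtype):
    self.dtype = dtype

  def __call__(self, x, y_nochange):
    return tf.cast(x, self.dtype), y_nochange


preprocess = MiniCompose([parse, CastX(tf.float32)])
dataset = tf.data.Dataset.from_tensor_slices({
    "image": tf.zeros([4, 8, 8, 3], tf.uint8),
    "label": tf.zeros([4], tf.int64),
}).map(preprocess)
```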

Error while importing data on Colab

Hello Sir,

I want to run your baseline code for the diabetic-retinopathy-diagnosis benchmark. The Google Colab does not support the keyword "URL", and when I changed "URL" to "Homepage" to access the data, I got the following error while running the diabetic_retinopathy_diagnosis.ipynb file:

TypeError: ['https://www.kaggle.com/c/diabetic-retinopathy-detection/data'] has type list, but expected one of: bytes, unicode. Here is the attached image of the error:
[screenshot of the error]

Can you please help me resolve this particular error?
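
For what it's worth, a guess at the cause (an assumption, not verified): the newer tensorflow_datasets homepage field expects a single string, whereas the older urls field took a list, so passing the old list value straight through would produce exactly this TypeError:

```python
# Hedged sketch of the suspected type mismatch: `homepage` expects a plain
# string, but the old `urls` value is a list of strings.
urls = ["https://www.kaggle.com/c/diabetic-retinopathy-detection/data"]
homepage = urls[0]  # unwrap to a single string before passing it on
```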

PyTorch DataLoaders

Hi,

I'm looking forward to contributing a couple of our benchmarks to this repo, but I'm not seeing PyTorch dataloaders (or really any PyTorch support). Are there still plans to add PyTorch data loader support for the segmentation tasks?

Additionally, is there a timeline for making public the other benchmarks in the pre-alpha branch?

Thanks,
Wesley

Problem with tensorflow datasets

Hi,

After downloading the diabetic retinopathy diagnosis data and extracting the files using the download_and_prepare utility in benchmark.py, as suggested in the instructions, the "prepare" part of this procedure gives me errors. To avoid downloading the data again, I'm running:

python3 -u -c "from bdlb.diabetic_retinopathy_diagnosis.benchmark import DiabeticRetinopathyDiagnosisBecnhmark; DiabeticRetinopathyDiagnosisBecnhmark._prepare()"

The error happens on line 401 of benchmark.py (dtask.download_and_prepare()) and traces back to python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py, line 970:

for key, record in utils.tqdm(generator, unit=" examples", total=split_info.num_examples, leave=False):

and the final error message is ValueError: too many values to unpack (expected 2). It seems that the object from the generator is of the form {'name': '58_right', 'image': <_io.BytesIO object at 0x7f7b9a2a28f0>, 'label': 0}, with three fields, while the for-loop is trying to split it into two ("key" and "record"). Do you have any idea whether something has already gone wrong in the extraction part, or whether the issue might be in the version of tensorflow_datasets, for example? Somebody mentioned using tensorflow_datasets==1.2.0 in another issue, but that didn't seem to solve this.
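
If it helps with debugging, my reading of the error (an assumption, not verified against the repo) is that newer tensorflow_datasets expects the example generator to yield (key, example) pairs, so a generator yielding bare example dicts fails exactly like this. A minimal adapter sketch:

```python
# Hedged sketch: adapt a generator of bare example dicts into the
# (key, example) pairs that newer tensorflow_datasets unpacks in its
# `for key, record in ...` loop. Using `name` as the key is an assumption
# based on the printed example.
def wrap_with_keys(generator):
  for example in generator:
    yield example["name"], example

examples = [{"name": "58_right", "image": b"...", "label": 0}]
for key, record in wrap_with_keys(examples):
  print(key, record["label"])  # -> 58_right 0
```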
