tf_bayesian

Transformers Can Do Bayesian Inference Prior-Data Fitted Networks

Overview

Bayesian methods are widely used in science and industry and can provide both optimal solutions, uncertainty estimates, and interpretability, but can be computationally intractable or unsolvable.
Bayesian methods can be used for machine learning, but see above. If they could be made to work they might beat current methods and sidestep the No Free Lunch problem.
One of the most useful parts of Bayesian modeling, the prior, often gets wasted because it is too difficult to calculate them.
Deep learning has been used to solve bayesian problems, but success has been limited.
In possibly groundbreaking work, the authors show how a kind of self-supervised learning (using artificial generated data) can be used build priors and caclculate posterior probabilities using a Transformer network.
In sample work, this approach outperformed XGBoost and CatBoost, and provided meaningful uncertainty, while training more than 5,000 times faster (from 20 hours to 13 seconds)!
More exploration needs to be done to see if the promised holds more generally.

Predicting Outcomes from Tabular Datasets

From their paper, Muller, Hollmann, Pineda, Grabocka, and Hutter describe how they pretrain on a universe of Bayesian Neural Network priors:

Discussion Topic 1

IF the findings in the paper generalize broadly, what impact could this have on future Machine Learning approaches?

Discussion Topic 2

The approach described fits models 100-5,000 times faster than existing methods. What are some other possible advantages to the approach? What are some possible drawbacks? (HINT: Which is faster to score (or inference)--an XGBoost model or a large(ish) Transformer model)?

Discussion Topic 3

The models provide excellent uncertainty calibration--they provide not just point estimates, but ranges of uncertainty that are close to true probabilities. How might that be helpful?

Critical Analysis

In their comparison to machine learning methods (XGBoost and CatBoost), the authors note that they only analyzed datasets from MLBenchmark that were not missing data and that had fewer than 100 predictors, and further simplified the datasets to be balanced. They did not indicate whether their approach was limited to only datasets that met these criteria, or whether this was for expediency. If this is a limitation of the approach, this should have been highlighted.

Resource links

Original Article: https://arxiv.org/abs/2112.10510 Code and trained PFNs are released at https://github.com/automl/TransformersCanDoBayesianInference Spaces for tabular models https://huggingface.co/spaces/samuelinferences/transformers-can-do-bayesian-inference

Code demonstration

The code has not yet been made availalbe.

Video Recording

Link to video recording.

jessespencersmith / tf_bayesian Goto Github PK

tf_bayesian's Introduction

tf_bayesian

Overview

Predicting Outcomes from Tabular Datasets

Discussion Topic 1

Discussion Topic 2

Discussion Topic 3

Critical Analysis

Resource links

Code demonstration

Video Recording

tf_bayesian's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent