Information-Theoretic Probing with Minimum Description Length

This is the official repo for the paper Information-Theoretic Probing with Minimum Description Length.

Read the official blog post for the details!

What is this about?

To measure how well pretrained representations (BERT, ELMo) encode some linguistic property, it is common to use the accuracy of a probe, i.e. a classifier trained to predict the property from the representations. However, such probes often fail to adequately reflect differences between representations, and their results can change depending on the setting.

We look at this task from an information-theoretic perspective.

Formally, as an alternative to the standard probes,

  • we propose information-theoretic probing which measures minimum description length (MDL) of labels given representations;

  • we show that MDL characterizes both probe quality and the amount of effort needed to achieve it;

  • we explain how to easily measure MDL on top of standard probe-training pipelines;

  • we show that results of MDL probes are more informative and stable than those of standard probes.
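One way to measure description length on top of an ordinary probe-training loop is the online (prequential) code: the first block of labels is transmitted with a uniform code, and every later block is encoded with a probe trained on all preceding blocks, so the total codelength is `t_1 * log2(K)` plus the summed cross-entropy (in bits) of each block. The sketch below illustrates that idea under stated assumptions. The names `online_codelength` and `train_probe` and the default split fractions are hypothetical and chosen for illustration; they are not this repo's API or necessarily its exact settings.

```python
import math
from typing import Callable, Sequence

import numpy as np

# Hypothetical interface: a trainer takes (features, labels) and returns a
# function mapping features to class probabilities of shape (n, num_classes).
ProbeTrainer = Callable[[np.ndarray, np.ndarray], Callable[[np.ndarray], np.ndarray]]

# Illustrative split points, expressed as fractions of the dataset.
DEFAULT_FRACTIONS = (0.001, 0.002, 0.004, 0.008, 0.016, 0.032,
                     0.0625, 0.125, 0.25, 0.5, 1.0)


def online_codelength(
    features: np.ndarray,
    labels: np.ndarray,
    num_classes: int,
    train_probe: ProbeTrainer,
    fractions: Sequence[float] = DEFAULT_FRACTIONS,
) -> float:
    """Return the online codelength of `labels` given `features`, in bits."""
    n = len(labels)
    # Block boundaries t_1 < t_2 < ... < t_S = n.
    ends = sorted({max(1, int(round(f * n))) for f in fractions} | {n})
    ends = [e for e in ends if e <= n]

    # The first block is sent with a uniform code over the K classes.
    total_bits = ends[0] * math.log2(num_classes)

    for start, end in zip(ends[:-1], ends[1:]):
        # Train on everything seen so far, then pay for the next block.
        predict_proba = train_probe(features[:start], labels[:start])
        probs = predict_proba(features[start:end])
        picked = probs[np.arange(end - start), labels[start:end]]
        total_bits += float(-np.log2(np.clip(picked, 1e-12, 1.0)).sum())

    return total_bits


def compression(codelength_bits: float, n: int, num_classes: int) -> float:
    """Ratio of the uniform codelength n * log2(K) to the measured codelength."""
    return n * math.log2(num_classes) / codelength_bits
```

Any classifier that exposes class probabilities can be wrapped as `train_probe`. A probe that always predicts the uniform distribution yields exactly `n * log2(K)` bits, i.e. a compression of 1; a probe that learns quickly from little data yields a shorter code, which is how the codelength reflects both final quality and the effort needed to reach it.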

Interested? More details in the blog post or the paper.

This repo provides code to reproduce our experiments.

Description Length and Control Tasks

The control tasks paper argued that standard probe accuracy can be similar when probing for genuine linguistic labels and probing for random synthetic tasks (control tasks). To see reasonable differences in accuracy, the authors had to constrain the probe model size.

In our experiments, we show that MDL results are more informative and stable, and do not require a manual search for settings.
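To make the notion of a control task concrete, here is a minimal sketch of how such labels could be built: every word type receives one random label, and all occurrences of that word share it, so the labels can be memorized but carry no linguistic structure. Sampling from the pool of real labels (which follows their empirical distribution) is an assumption made for this illustration, and `make_control_labels` is a hypothetical helper, not a function from this repo.

```python
import random
from typing import Dict, List


def make_control_labels(
    sentences: List[List[str]],
    real_labels: List[List[str]],
    seed: int = 0,
) -> List[List[str]]:
    """Assign each word type a fixed random label drawn from the real label pool."""
    rng = random.Random(seed)
    # Drawing uniformly from this pool follows the empirical label distribution.
    label_pool = [label for sent in real_labels for label in sent]

    word_to_label: Dict[str, str] = {}
    control: List[List[str]] = []
    for sent in sentences:
        row = []
        for word in sent:
            # First occurrence of a word type fixes its control label.
            if word not in word_to_label:
                word_to_label[word] = rng.choice(label_pool)
            row.append(word_to_label[word])
        control.append(row)
    return control
```

Training the same probe once on the real labels and once on the output of `make_control_labels` gives the pair of tasks that the comparison above refers to.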

GO TO THE EXPERIMENTS

Description Length and Random Models

Several papers argued that probe accuracy does not sufficiently favour pretrained representations over randomly initialized ones. In this part, we conduct experiments with pretrained and randomly initialized ELMo on seven tasks from the edge probing paper: PoS tagging, syntactic constituent labeling, dependency labeling, named entity recognition, semantic role labeling, coreference resolution, and relation classification.

Here we also show that MDL results are more informative.

GO TO THE EXPERIMENTS
