Code Monkey home page Code Monkey logo

bandits's Introduction

Bandits

Mark Reid [email protected] -- Created: 2010-04-05

This is a JavaScript/jQuery/Flot hack to demonstrate a few key algorithms in the literature on multi-armed bandit problems. It is also my first attempt at using JavaScript for something more than a widget so be careful taking anything in here as an example of good JS programming style.

Introduction

A multi-armed bandit problem is described by the following game wherein an agent tries to maximise the total reward it receives:

  • Input: k bandits B[0], ..., B[k-1]
  • For time steps t = 0, 1, ..., T
    1. An agent A chooses an "arm" (index) i
    2. The agent receives a reward B[i].reward() (a value in [0,1])

This very simple game illustrates the exploration/exploitation dilemma: if an agent finds an high-rewarding arm early should it stick with it and possibly forego a different arm with a high reward?

Algorithms

There are several well-analysed algorithms for playing such a game. I've implemented two important ones:

  • UCB1, as described in [1]
  • Exp3, as described in [2]

as well as the Random agent (which simply chooses an arm at random), and a couple of the Exp3 variants (Exp3.1 and Exp3.S), also described in [2].

Using the demo

The whole demo is just a HTML file with a bunch of JavaScript implementing the UI, plotting, bandits and agents. Just checkout the source and open index.html in your browser (I've only tested with Safari).

The user is first presented with a game description UI. Bandits and agents can be chosen using the drop-down selections. New bandits and agents can be added by clicking the "+ create" link. If a bandit or agent requires parameters these can be added in the text areas which appear next to the drop-down menu.

Once the bandits and agents are selected, the number of steps for the game can be entered in the text field on the right. Then hit "Go!" to run the game.

The results are plotted below the UI. Rewards for bandits are drawn in light grey, agents are in colour and are listed in the plot's legend. (The "Averages" check box determines whether running totals or averages are displayed).

Acknowledgements

I've used the flot plugin to jQuery for plotting multi-armed bandit games.

References

[1] P. Auer, N. Cesa-Bianchi, and P. Fischer. "Finite-time Analysis of the Multiarmed Bandit Problem", Machine Learning, 47, 235-236, 2002.

[2] P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire. "The nonstochastic multiarmed bandit problem", SIAM Journal on Computing, 32(1):48โ€“77, 2002.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.