MiniWoB++

The MiniWoB++ (Mini World of Bits++) library contains a collection of over 100 web interaction environments, along with JavaScript and Python interfaces for programmatically interacting with them. The Python interface follows the Gymnasium API and uses Selenium WebDriver to perform actions on the web browser.

MiniWoB++ is an extension of the OpenAI MiniWoB benchmark, and was introduced in the paper Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration.

The documentation website is at miniwob.farama.org. MiniWoB++ is under active development to bring it up to the Farama Standards for mature projects, after which it will be maintained long term. See the Project Roadmap for more details. If you'd like to help out, join our Discord server: https://discord.gg/PfR7a79FpQ.

Installation

MiniWoB++ supports Python 3.8+ on Linux and macOS.

Installing the MiniWoB++ Library

To install the MiniWoB++ library, use pip install miniwob.

Installing Chrome/Chromium and ChromeDriver

We strongly recommend using Chrome or Chromium as the web browser, as other browsers may render the environments differently.

The MiniWoB++ Python interface uses Selenium, which interacts with the browser via the WebDriver API. Follow one of the documented installation methods to install ChromeDriver. The simplest method is to download the ChromeDriver version that matches your browser, unzip it, and then add the directory containing the chromedriver executable to the PATH environment variable:

export PATH=$PATH:/path/to/chromedriver

For Chromium, the driver may also be available in a software package; for example, in Debian/Ubuntu:

sudo apt install chromium-driver
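
Before creating an environment, it can help to confirm that the driver is actually visible on PATH. The following is a minimal sketch using only the Python standard library; the helper name find_chromedriver is hypothetical and is not part of MiniWoB++ or Selenium.

```python
import shutil


def find_chromedriver(candidates=("chromedriver",)):
    """Return the path of the first candidate executable found on PATH, or None.

    `candidates` is a hypothetical parameter for this sketch; pass extra
    names if your platform installs the driver under a different name.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


driver_path = find_chromedriver()
if driver_path is None:
    print("chromedriver was not found on PATH")
else:
    print(f"chromedriver found at {driver_path}")
```

If the helper reports that the driver is missing, revisit the PATH export or the package installation above.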

Example Usage

The following code performs a deterministic action on the click-test-2 environment.

import time
import gymnasium
import miniwob
from miniwob.action import ActionTypes

gymnasium.register_envs(miniwob)

env = gymnasium.make('miniwob/click-test-2-v1', render_mode='human')

# Wrap the code in try-finally to ensure proper cleanup.
try:
  # Start a new episode.
  obs, info = env.reset()
  assert obs["utterance"] == "Click button ONE."
  assert obs["fields"] == (("target", "ONE"),)
  time.sleep(2)       # Only here to let you look at the environment.
  
  # Find the HTML element with text "ONE".
  for element in obs["dom_elements"]:
    if element["text"] == "ONE":
      break

  # Click on the element.
  action = env.unwrapped.create_action(ActionTypes.CLICK_ELEMENT, ref=element["ref"])
  obs, reward, terminated, truncated, info = env.step(action)

  # Check if the action was correct.
  print(reward)      # Should be around 0.8 since 2 seconds have passed.
  assert terminated is True
  time.sleep(2)

finally:
  env.close()
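
Note that the for loop in the example leaves element bound to the last DOM element if no element has the text "ONE". A slightly more defensive lookup can be sketched as below. The helper find_element_by_text is hypothetical (it is not part of the MiniWoB++ API), and the sample data only mimics the "text" and "ref" keys used in the example rather than a real observation.

```python
def find_element_by_text(dom_elements, text):
    """Return the first element whose "text" field equals `text`, or None."""
    for element in dom_elements:
        if element["text"] == text:
            return element
    return None


# Stand-in for obs["dom_elements"]; real observations carry more fields.
sample_dom_elements = [
    {"ref": 1, "text": ""},
    {"ref": 2, "text": "ONE"},
    {"ref": 3, "text": "TWO"},
]

target = find_element_by_text(sample_dom_elements, "ONE")
if target is None:
    raise ValueError("No element with the requested text was found.")
print(target["ref"])
```

In the example above, the returned element's "ref" value would then be passed to create_action in place of element["ref"].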

See the documentation for more information.
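
The example expects a reward of about 0.8 after a 2-second pause. One way to read that figure is as a reward that decays linearly with elapsed time. The sketch below assumes a raw reward of 1.0 for a correct action and a 10-second episode limit; both values, and the function name decayed_reward, are assumptions made for illustration, so consult the documentation for the actual rule.

```python
def decayed_reward(raw_reward, elapsed_seconds, time_limit_seconds=10.0):
    """Scale `raw_reward` linearly toward zero as the episode time runs out.

    Assumption for this sketch: linear decay over `time_limit_seconds`.
    """
    remaining_fraction = max(0.0, 1.0 - elapsed_seconds / time_limit_seconds)
    return raw_reward * remaining_fraction


# A correct click after waiting 2 seconds, under the assumptions above.
print(round(decayed_reward(1.0, 2.0), 3))
```

Under these assumptions, waiting longer before acting lowers the reward, and a correct action after the time limit would earn nothing.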

Environments

The list of environments included in the MiniWoB++ library can be found in the documentation. All environments share the same observation space, while the action space can be configured during environment construction.

Citation

To cite this project please use:

@inproceedings{liu2018reinforcement,
 author = {Evan Zheran Liu and Kelvin Guu and Panupong Pasupat and Tianlin Shi and Percy Liang},
 title = {Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration},
 booktitle = {International Conference on Learning Representations ({ICLR})},
 url = {https://arxiv.org/abs/1802.08802},
 year = {2018},
}
