
stride's Introduction

STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making

This repository contains the implementation of STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making.

STRIDE equips an LLM with a set of operational tools designed to handle the low-level calculations of the decision-making problem of interest. It then instructs the LLM to generate a structured Thought sequence that uses these tools to emulate algorithmic behaviors and solve the problem optimally. For example, the figure below illustrates how STRIDE emulates the Value Iteration algorithm to compute the optimal policy of an MDP.

[Figure: STRIDE emulating the Value Iteration algorithm to compute the optimal policy of an MDP]
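For readers unfamiliar with the algorithm being emulated, here is a minimal sketch of Value Iteration in plain Python. It is not the repository's code: it assumes a finite-horizon tabular MDP, and the function name value_iteration, the data layout (P[s][a][s2], R[s][a]), and the toy two-state MDP are all made up for illustration.

```python
from typing import List, Tuple


def value_iteration(
    P: List[List[List[float]]],  # P[s][a][s2]: transition probability
    R: List[List[float]],        # R[s][a]: expected immediate reward
    horizon: int,
) -> Tuple[List[List[float]], List[List[int]]]:
    """Backward value iteration for a finite-horizon tabular MDP.

    Returns V[h][s] (optimal value from step h) and pi[h][s] (greedy action).
    """
    n_states, n_actions = len(R), len(R[0])
    V = [[0.0] * n_states for _ in range(horizon + 1)]  # V[horizon] is all zeros
    pi = [[0] * n_states for _ in range(horizon)]
    for h in range(horizon - 1, -1, -1):
        for s in range(n_states):
            # Q(s, a) = R(s, a) + sum over s2 of P(s2 | s, a) * V_{h+1}(s2)
            q = [
                R[s][a] + sum(P[s][a][s2] * V[h + 1][s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=lambda a: q[a])
            V[h][s], pi[h][s] = q[best], best
    return V, pi


# Toy MDP (assumed): action 0 stays, action 1 switches state;
# only (state 1, action 0) pays a reward of 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
R = [[0.0, 0.0], [1.0, 0.0]]
V, pi = value_iteration(P, R, horizon=3)
print(V[0], pi[0])
```

In STRIDE, the arithmetic inside the inner loop is what an operational tool would carry out, while the LLM's Thought sequence plays the role of the outer control flow.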

Environments

We have implemented the following environments to evaluate STRIDE.

  • Tabular MDP
    • Tabular MDP with known model: STRIDE emulates the Value Iteration algorithm to compute the optimal policy.
    • Tabular MDP with unknown model: STRIDE emulates the Upper Confidence Bound Value Iteration algorithm to strategically explore the unknown environment.
  • Dynamic Mechanism Design
    • STRIDE emulates a dynamic programming algorithm to compute the policy and pricing of the VCG mechanism.
  • Bargaining Games
    • Alternating offer bargaining with complete information: STRIDE emulates a backward induction algorithm to compute the Subgame Perfect Equilibrium.
    • Bargaining with one-sided uncertainty: STRIDE emulates an algorithm combining bisection search and backward induction to compute the Sequential Equilibrium.
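As an illustration of the backward induction mentioned for the complete-information bargaining game, the sketch below solves a textbook finite-horizon alternating-offer game over a unit surplus. The setup is an assumption and may differ from the repository's environment: player 1 proposes in odd rounds, player 2 in even rounds, each player has its own discount factor, and both receive nothing if the deadline passes. The function name spe_offers is hypothetical.

```python
from typing import List, Tuple


def spe_offers(delta_1: float, delta_2: float, rounds: int) -> List[Tuple[float, float]]:
    """Return the equilibrium split (share_1, share_2) proposed in each round.

    Shares are measured in each round's own (undiscounted) units.
    """
    deltas = {1: delta_1, 2: delta_2}
    offers: List[Tuple[float, float]] = [(0.0, 0.0)] * rounds
    next_proposer_share = 0.0  # after the deadline nobody gets anything
    for t in range(rounds, 0, -1):
        proposer = 1 if t % 2 == 1 else 2
        responder = 3 - proposer
        # The responder proposes next round, so it accepts any offer worth
        # at least its discounted share as next round's proposer.
        responder_share = deltas[responder] * next_proposer_share
        proposer_share = 1.0 - responder_share
        if proposer == 1:
            offers[t - 1] = (proposer_share, responder_share)
        else:
            offers[t - 1] = (responder_share, proposer_share)
        next_proposer_share = proposer_share
    return offers


offers = spe_offers(delta_1=0.5, delta_2=0.5, rounds=3)
print(offers[0])  # the split proposed (and accepted) in round 1
```

Working backward from the last round, where the proposer can keep the whole surplus, each earlier proposer offers the responder exactly its discounted continuation value, so agreement is reached in the first round.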

Instructions

How to run

Execute run.py to run STRIDE on different environments. For example,

python run.py --env tabular_mdp --mdp_known True --agent_engine gpt-4o

The STRIDE agent, defined in agents\StriDe.py, then plays the tabular_mdp environment defined in envs\tabular_mdp.py. Because --mdp_known is set to True, the agent reads the demonstration file envs\tabular_mdp\prompts\tabular_mdp_vi_exmps.txt and emulates the Value Iteration algorithm to compute the optimal policy of the given MDP. The tools used to emulate both Value Iteration and Upper Confidence Bound Value Iteration are defined in envs\tabular_mdp\tools.py.
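How run.py parses its flags is not shown in this README. The sketch below is a hypothetical stand-in that accepts the three flags from the command above; the helper str2bool, the function build_parser, and the default values are assumptions. It uses an explicit converter for --mdp_known because argparse's type=bool would turn the string "False" into True.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert 'True'/'False'-style strings to a real boolean."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the defaults are assumptions, not the repository's.
    parser = argparse.ArgumentParser(description="Stand-in for run.py's CLI")
    parser.add_argument("--env", default="tabular_mdp")
    parser.add_argument("--mdp_known", type=str2bool, default=True)
    parser.add_argument("--agent_engine", default="gpt-4o")
    return parser


args = build_parser().parse_args(
    ["--env", "tabular_mdp", "--mdp_known", "True", "--agent_engine", "gpt-4o"]
)
print(args.env, args.mdp_known, args.agent_engine)
```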

Generate new demonstration

To generate a new demonstration for a specific environment, execute generate_examples.py in the corresponding folder, which produces a new txt file under \prompts.

Prepare STRIDE to emulate new algorithms

Follow these steps:

  • create a new folder under \envs and define the environment in env.py;
  • define the operational tools in tools.py, where each tool is a subclass of Pydantic's BaseModel;
  • implement the reference algorithm (the algorithm that we want STRIDE to emulate) in program_agent.py using the defined tools, and augment it with comments that explain the algorithm logic and tool calls;
  • run the implemented algorithm on the defined environment using generate_examples.py, which saves the demonstration as a txt file.

Once STRIDE is given the generated demonstration file and the operational tools, it can emulate the reference algorithm in its reasoning process.
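The procedure above can be sketched end to end in a single file. Everything here is hypothetical and only meant to show the shape of the pieces: the tool UpdateQValue, its execute method, the "Thought"/"Operation"/"Result" text format, the toy MDP, and the output file name are assumptions rather than the repository's actual interfaces.

```python
import tempfile
from pathlib import Path
from typing import List

from pydantic import BaseModel, Field


class UpdateQValue(BaseModel):
    """Hypothetical tool: one Bellman backup for a single (state, action) pair."""

    state: int = Field(..., description="index of the state")
    action: int = Field(..., description="index of the action")

    def execute(self, R, P, V_next) -> float:
        s, a = self.state, self.action
        return R[s][a] + sum(p * v for p, v in zip(P[s][a], V_next))


def reference_algorithm(R, P, horizon: int) -> List[str]:
    """Value iteration written against the tool, logging each step as text."""
    n_states, n_actions = len(R), len(R[0])
    V_next = [0.0] * n_states
    lines: List[str] = []
    for h in range(horizon - 1, -1, -1):
        # Explain the algorithm logic before this step's tool calls.
        lines.append(f"Thought: back up values from step {h + 1} to step {h}.")
        V = []
        for s in range(n_states):
            q_values = []
            for a in range(n_actions):
                q = UpdateQValue(state=s, action=a).execute(R, P, V_next)
                lines.append(f"Operation: UpdateQValue(state={s}, action={a}) -> {q}")
                q_values.append(q)
            V.append(max(q_values))
        V_next = V
    lines.append(f"Result: optimal values at step 0 are {V_next}.")
    return lines


# Toy two-state MDP (an assumption for this sketch).
R = [[0.0, 0.0], [1.0, 0.0]]
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
demo_lines = reference_algorithm(R, P, horizon=2)

# Save the demonstration as a txt file; a temporary directory stands in
# for the environment's prompts folder.
demo_path = Path(tempfile.mkdtemp()) / "demo_exmps.txt"
demo_path.write_text("\n".join(demo_lines))
```

Keeping the arithmetic inside the tool and the explanation in the logged text mirrors the division of labor described above: the tool computes, and the demonstration shows the LLM when and why to call it.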

Citation

@article{li2024stride,
      title={STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making},
      author={Chuanhao Li and Runhan Yang and Tiankai Li and Milad Bafarassat and Kourosh Sharifi and Dirk Bergemann and Zhuoran Yang},
      journal={arXiv preprint arXiv:2405.16376},
      year={2024}
}

stride's Issues

Missing requirements

Hi, great paper.

Given the demo setup and framework in your repository, you should add a requirements.txt, as is standard practice, so that users can quickly try out your work. It is not critical, of course, but it should be there.
