agent-flan's Introduction

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

✨ Introduction

[🤗 HuggingFace] [🧰 OpenXLab] [📃 Paper] [🌐 Project Page]

Open-source large language models (LLMs) have achieved great success in various NLP tasks, but they remain far inferior to API-based models when acting as agents. How to integrate agent ability into general LLMs has become a crucial and urgent problem. This paper first delivers three key observations: (1) the current agent training corpus entangles format following with agent reasoning, which shifts it significantly from the distribution of the pre-training data; (2) LLMs learn the capabilities required by agent tasks at different speeds; and (3) current approaches improve agent abilities at the cost of introducing hallucinations. Based on these findings, we propose Agent-FLAN to effectively Fine-tune LANguage models for Agents. Through careful decomposition and redesign of the training corpus, Agent-FLAN enables Llama2-7B to outperform prior best works by 3.5% across various agent evaluation datasets. With comprehensively constructed negative samples, Agent-FLAN greatly alleviates hallucination issues, as measured on our established evaluation benchmark. It also consistently improves the agent capability of LLMs as model size scales, while slightly enhancing their general capability.

🚀 What's New

  • [2024.3.21] Paper available on arXiv. 🔥🔥🔥
  • [2024.3.20] Released the dataset and model checkpoint for Agent-FLAN. 🎉🎉🎉

♟️ Agent-FLAN

The Agent-FLAN series is fine-tuned on AgentInstruct and ToolBench by applying the data generation pipeline proposed in the Agent-FLAN paper. The resulting models show strong abilities on various agent tasks and tool use.

Comparison of recent agent tuning approaches on Held-In, Held-Out tasks. Performances are normalized with GPT-4 results for better visualization. * denotes our re-implementation for a fair comparison.

🤗 HuggingFace Model & Dataset

Agent-FLAN is produced by mixed training on the AgentInstruct, ToolBench, and ShareGPT datasets, starting from the Llama2-chat series.

The models follow the conversation format of Llama-2-chat, with the template protocol as:

dict(role='user', begin='<|Human|>െ', end='\n '),
dict(role='system', begin='<|Human|>െ', end='\n '),
dict(role='assistant', begin='<|Assistant|>െ', end='ി\n '),
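
As a rough illustration of how a template like this is typically applied, the sketch below concatenates `begin + content + end` for each message. Only the three template entries come from the README; the `render_prompt` helper, its `add_generation_prefix` flag, and the choice to end the prompt with the assistant `begin` marker are assumptions made for this example, not the project's confirmed inference code.

```python
# Template entries copied from the README above; everything else is illustrative.
TEMPLATE = {
    "user": {"begin": "<|Human|>െ", "end": "\n "},
    "system": {"begin": "<|Human|>െ", "end": "\n "},
    "assistant": {"begin": "<|Assistant|>െ", "end": "ി\n "},
}


def render_prompt(messages, add_generation_prefix=True):
    """Hypothetical helper: join begin + content + end for every message."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in TEMPLATE:
            raise ValueError(f"unknown role: {role!r}")
        spec = TEMPLATE[role]
        parts.append(spec["begin"] + message["content"] + spec["end"])
    if add_generation_prefix:
        # Assumption: the model continues right after the assistant marker.
        parts.append(TEMPLATE["assistant"]["begin"])
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are a helpful agent."},
        {"role": "user", "content": "What is 2 + 2?"},
    ]
    print(repr(render_prompt(demo)))
```

Note that the system and user roles share the same `begin` marker in this template, so a system message is rendered exactly like a user turn.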

The 7B model is available on Huggingface & OpenXLab model hub.

Model         | Huggingface Repo | OpenXLab Repo
Agent-FLAN-7B | Model Link       | Model Link

The Agent-FLAN dataset is also available on Huggingface dataset hub.

Dataset    | Huggingface Repo
Agent-FLAN | Dataset Link

💫 Detailed Results

Main results of Agent-FLAN. Agent-FLAN outperforms previous agent-tuning approaches by a large margin on both held-in and held-out tasks. * denotes our re-implementation with the same amount of training data for a fair comparison. Since FireAct does not train on the AgentInstruct dataset, we omit its performance on the HELD-IN set. Bold: the best in API-based and open-sourced models.

❤️ Acknowledgements

Agent-FLAN is built with Lagent and T-Eval. Thanks for their awesome work!

🖊️ Citation

If you find this project useful in your research, please consider citing:

@article{chen2024agent,
  title={Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models},
  author={Chen, Zehui and Liu, Kuikun and Wang, Qiuchen and Zhang, Wenwei and Liu, Jiangning and Lin, Dahua and Chen, Kai and Zhao, Feng},
  journal={arXiv preprint arXiv:2403.12881},
  year={2024}
}

💳 License

This project is released under the Apache 2.0 license.

agent-flan's People

Contributors

harold-lkk, zehuichen123


agent-flan's Issues

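Some questions about training

  1. The paper says: "For the Agent-FLAN experiments, we follow the practice in AgentTuning and train on a 1:1 mixture of ShareGPT and the agent corpus."
    I noticed that you released the dataset. Could you share the data volumes and mixing ratios, for both the FLAN version and your reproduced AgentTuning version? ShareGPT should contain only a little over 90,000 samples, so how did you mix (or oversample) these sources to reach equal sizes?
  2. I would also like to understand the hyperparameters, since AgentTuning uses some rather unusual ones. Do you simply use the DeepSpeed default hyperparameters here (for example, 10% warmup, and is the maximum token length 2048 or 4096)?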
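
For readers unfamiliar with the 1:1 mixing this question refers to, here is a minimal sketch of one possible way to balance two corpora. It assumes the smaller corpus is oversampled with replacement up to the size of the larger one; the thread does not say how the authors reach the ratio, and the `mix_one_to_one` helper and its parameters are hypothetical.

```python
import random


def mix_one_to_one(general, agent, seed=0):
    """Return a shuffled list with equal sample counts from both corpora.

    Assumption: the smaller corpus is oversampled (with replacement) up to
    the size of the larger one. The authors' actual procedure is not stated.
    """
    if not general or not agent:
        raise ValueError("both corpora must be non-empty")
    rng = random.Random(seed)
    target = max(len(general), len(agent))

    def resize(corpus):
        # Keep every original sample, then draw random extras if needed.
        extra = [rng.choice(corpus) for _ in range(target - len(corpus))]
        return list(corpus) + extra

    mixed = [("general", sample) for sample in resize(general)]
    mixed += [("agent", sample) for sample in resize(agent)]
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    batch = mix_one_to_one(["g1", "g2", "g3", "g4"], ["a1", "a2"], seed=1)
    print(len(batch), batch[:3])
```

Downsampling the larger corpus instead would also give a 1:1 ratio while discarding data; which option matches the released checkpoints is exactly what the question asks.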

Share advancements on contribution to the agent research community:)

Congratulations on your new interesting work release: "Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models." It's always exciting to see new advancements that contribute to the agent research community!

We would also like to share our previously released work, https://github.com/SalesforceAIResearch/xLAM, published in February, given the similarity in paper title and theme:

  • AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
  • xLAM: A Large Action Model tailored for AI Agents

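Question about training with negative samples

The paper mentions that negative samples were constructed to address the hallucination problem. How is the loss for these negative samples designed during training?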

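Model inference code

Hello, I would like to request the inference code for Agent-FLAN.
P.S. The model does not seem to define the special tokens used in the template, such as ി\n, so it is unclear how they are handled during inference.
Also, the template does not seem to define a Function role.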

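Will the training code be open-sourced?

Will the training code be released? Will training based on the InternLM 20B model be supported, and will the result be made available for use in Lagent?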
