💬🚀 LLM as a Chatbot Service

The purpose of this repository is to let people use many open-source, instruction-following fine-tuned LLMs as a chatbot service. Because different models behave differently and require differently formatted prompts, I made a very simple library, Ping Pong, for model-agnostic conversation and context management. I also made GradioChat, a UI that looks similar to HuggingChat but is built entirely in Gradio. Those two projects are fully integrated to power this project.

Context management

Different models might have different strategies for managing context, so if you want to know the exact strategy applied to each model, take a look at the chats directory. However, here are the basic ideas I came up with initially. I have found that long prompts eventually slow down the generation process a lot, so I thought the prompts should be kept as short and as concise as possible. In the previous version, I accumulated all the past conversations, and that didn't go well.

  • In every turn of the conversation, the past N conversations are kept. Think of N as a hyperparameter. As an experiment, only the past 2-3 conversations are currently kept for all models.
  • (TBD) In every turn of the conversation, the conversation is summarized or information is extracted from it. The summarized information is then given in every subsequent turn of the conversation.
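
The first idea above (keeping only the past N conversations) can be sketched roughly as follows. This is a minimal illustration and not the code used in this repository or in Ping Pong: the `RecentTurns` class, its method names, and the `User:` / `Assistant:` prompt layout are all assumptions made for the example. Real models need their own prompt formats.

```python
from collections import deque
from typing import Deque, Tuple


class RecentTurns:
    """Hypothetical helper that keeps only the most recent N exchanges."""

    def __init__(self, max_turns: int = 2) -> None:
        if max_turns < 1:
            raise ValueError("max_turns must be at least 1")
        # A deque with maxlen drops the oldest exchange automatically.
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, user_message: str, bot_message: str) -> None:
        """Record one finished exchange (user message and model reply)."""
        self._turns.append((user_message, bot_message))

    def build_prompt(self, new_user_message: str) -> str:
        """Build a prompt from the kept exchanges plus the new message.

        The "User:" / "Assistant:" layout is an assumption for this sketch.
        """
        lines = []
        for user_message, bot_message in self._turns:
            lines.append(f"User: {user_message}")
            lines.append(f"Assistant: {bot_message}")
        lines.append(f"User: {new_user_message}")
        lines.append("Assistant:")
        return "\n".join(lines)


if __name__ == "__main__":
    history = RecentTurns(max_turns=2)
    history.add("Hi", "Hello!")
    history.add("What is Gradio?", "A Python UI library.")
    history.add("Is it free?", "Yes, it is open source.")
    # Only the last two exchanges appear in the prompt.
    print(history.build_prompt("Thanks!"))
```

Treating `max_turns` as the hyperparameter N keeps the prompt length bounded no matter how long the chat runs, which is the motivation given above for not accumulating the full history.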

Currently supported models

Check out the list of models.

Instructions

  1. Prerequisites

Note that the code only works with Python >= 3.9 and gradio >= 3.32.0.

$ conda create -n llm-serve python=3.9
$ conda activate llm-serve
  2. Install dependencies. flash-attn and triton are included to support MPT models. If you don't want to use MPT, comment them out; otherwise you will face two module-not-found errors and will have to install the packaging and torch packages when they come up.
$ cd LLM-As-Chatbot
$ pip install -r requirements.txt
  3. Run the Gradio application.
$ python app.py
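
The version requirements stated in the prerequisites (Python >= 3.9 and gradio >= 3.32.0) can be checked before launching the app. The script below is a small sketch that is not part of this repository. The helper names `parse_version`, `installed_version`, and `check_requirements` are made up for the example, and the version parsing is deliberately simple.

```python
import sys
from importlib import metadata
from typing import List, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '3.32.0' into (3, 32, 0).

    Parsing stops at the first part with no leading digits, and the
    result is padded with zeros to at least three numbers.
    """
    parts: List[int] = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_requirements() -> List[str]:
    """Return a list of human-readable problems (empty when all is fine)."""
    problems: List[str] = []
    if sys.version_info < (3, 9):
        problems.append("Python >= 3.9 is required")
    gradio_version = installed_version("gradio")
    if gradio_version is None:
        problems.append("gradio is not installed")
    elif parse_version(gradio_version) < (3, 32, 0):
        problems.append(f"gradio >= 3.32.0 is required, found {gradio_version}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Running it inside the activated llm-serve environment prints nothing when both requirements are met.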

How to plugin your own model

Follow these steps to bring your own model into this project.

  1. Add your model spec to model_cards.json. If you don't have a thumbnail image, just leave it as an empty string ("").
  2. Add the button for your model in app.py. Don't forget to give it a name in the gr.Button and gr.Markdown. For placeholders, the names are omitted. Assign the gr.Button to a variable with a name of your choice.
  3. Add the button variable to the button list in app.py.
  4. Determine the model type in global_vars.py. If you think your model is similar to one of the existing ones, just add a filtering rule (if-else) and give it the same name.
  5. (Optional) If your model is a totally new one, you need to give it a new model_type in global_vars.py and make changes accordingly in utils.py and chats/central.py.
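
The "filtering rule" mentioned in step 4 could look roughly like the sketch below. It does not reproduce the actual contents of global_vars.py. The function name `resolve_model_type`, the substring rules, and the model names are hypothetical and only illustrate how a new model can reuse the type name of a similar existing one.

```python
def resolve_model_type(model_name: str) -> str:
    """Map a model name to a model type using simple if-else rules.

    All names and rules here are hypothetical examples.
    """
    name = model_name.lower()
    if "alpaca" in name:
        return "alpaca"
    elif "my-new-model" in name:
        # A new model that behaves like an existing family
        # reuses that family's type name.
        return "alpaca"
    elif "mpt" in name:
        return "mpt"
    else:
        # A totally new kind of model needs its own type (step 5).
        raise ValueError(f"unknown model type for {model_name!r}")


if __name__ == "__main__":
    print(resolve_model_type("some-org/my-new-model-7b"))
```

With a rule like this, the rest of the code only needs to branch on the returned type and not on every individual model name.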

Todos

  • Gradio components to control the configurations of the generation
  • Flan based Alpaca models
  • Multiple conversation management
  • Implement a server-only option with FastAPI
  • ChatGPT plugin-like features

Acknowledgements

  • I am thankful to Jarvislabs.ai, who generously provided free GPU resources to experiment with Alpaca-LoRA deployment and share it with the community to try out.
  • I am thankful to Common Computer, who generously provided an A100 (40G) x 8 DGX workstation for fine-tuning the models.
