tobiasodion / ragbot



ragbot's Introduction

Overview

A CLI chatbot that uses the RAG (retrieval-augmented generation) architecture to adapt an LLM to a specific context. Users can ask questions and get responses either directly from a hosted LLM (OpenAI, MistralAI, etc.) or from the content of a website supplied as context through RAG. This lets users retrieve information from a website that may not have been in the LLM's training dataset, or get current information from a website that was updated after the LLM was trained.

RAGBOT DFD

Use cases

Without RAG - Response from the LLM alone

Asking the OpenAI GPT LLM about the Nigeria Immigration Service (Question: Who is the current comptroller general of Nigeria Immigration Service as at 2023?) by running chatbot simple gives an outdated result, as shown below:

result without RAG

With RAG - Response from website provided as context

Using the CLI website chatbot, which extends the OpenAI GPT LLM with RAG, gives a more useful result. Run chatbot rag https://immigration.gov.ng/current-and-past-leaders-of-the-nis/ and ask the same question: Who is the current comptroller general of Nigeria Immigration Service as at 2023? This gives a more current result, as shown below:

result with RAG

How it works

The chatbot crawls the given URL to the provided depth (default = 0) and indexes the content of the website, which it then uses as context when responding to users' questions. The more information the context contains (that is, the more pages the chatbot crawls), the more likely the model is to find what it needs to answer a question. However, the quality and accuracy of the responses depend on the quality and accuracy of the information on the website that serves as context under the RAG architecture.
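The index-then-retrieve step described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function names (embed, buildIndex, retrieve, buildPrompt) are hypothetical, the sample texts are invented, and a simple word-count vector stands in for a real embedding model so that the sketch runs without an API key.

```javascript
// Toy stand-in for an embedding model: counts word occurrences.
// Assumption: a real setup would call an embedding API instead.
function embed(text) {
  const vector = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vector.set(word, (vector.get(word) ?? 0) + 1);
  }
  return vector;
}

// Cosine similarity between two sparse word-count vectors.
function cosine(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((s, c) => s + c * c, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical in-memory index: one entry per crawled text chunk.
function buildIndex(chunks) {
  return chunks.map((text) => ({ text, vector: embed(text) }));
}

// Return the k chunks most similar to the question.
function retrieve(index, question, k = 1) {
  const q = embed(question);
  return index
    .map((entry) => ({ text: entry.text, score: cosine(q, entry.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Build the prompt that would be sent to the LLM together with the question.
function buildPrompt(contextChunks, question) {
  const context = contextChunks.map((c) => c.text).join("\n");
  return `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
}

// Invented sample content standing in for crawled pages.
const index = buildIndex([
  "The library opens at nine in the morning on weekdays.",
  "Parking is free for visitors on weekends.",
]);
const question = "When does the library open?";
const top = retrieve(index, question, 1);
console.log(buildPrompt(top, question));
```

Because only the best-matching chunks reach the prompt, the model can answer only from what was crawled and indexed, which is why the quality of the website content matters so much.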

To run locally

Requirements

  • OpenAI - API key

Steps

  • Link the CLI app to the OS by running npm link
  • Install dependencies by running npm install
  • Create a copy of the .env.example file by running cp .env.example .env.local in a terminal
  • Update the env variables in the .env.local file created in the previous step with the required keys
  • Run the simple version by running npm run chatbot
  • See the available commands for the app by running chatbot --help
    • Start the chatbot by running chatbot simple
    • Start the chatbot with a website context by running chatbot rag <url>
    • When running the chatbot with a website context, you can configure the crawl depth with the --depth option: chatbot rag <url> --depth <number>. The default crawl depth is 0
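To make the effect of the --depth option concrete, here is a small sketch of a depth-limited crawl. It is an illustration under assumptions, not the project's crawler: the crawl function and the site link graph are hypothetical, an in-memory object stands in for real HTTP requests, and depth 0 is assumed to mean "only the given page".

```javascript
// Hypothetical link graph standing in for a real website.
// Assumption: a real crawler would fetch each page and extract its links.
const site = {
  "https://example.com/": ["https://example.com/a", "https://example.com/b"],
  "https://example.com/a": ["https://example.com/a/1"],
  "https://example.com/b": ["https://example.com/"],
  "https://example.com/a/1": [],
};

// Breadth-first crawl that stops following links beyond maxDepth.
// With maxDepth 0, only the start URL is visited.
function crawl(startUrl, maxDepth = 0, getLinks = (url) => site[url] ?? []) {
  const visited = new Set([startUrl]);
  let frontier = [startUrl];
  for (let depth = 0; depth < maxDepth; depth++) {
    const next = [];
    for (const url of frontier) {
      for (const link of getLinks(url)) {
        // Skip pages already seen so that link cycles do not loop forever.
        if (!visited.has(link)) {
          visited.add(link);
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return [...visited];
}

console.log(crawl("https://example.com/"));    // depth 0: start page only
console.log(crawl("https://example.com/", 1)); // start page plus the pages it links to
```

Each extra level of depth adds every not-yet-seen page linked from the previous level, so the number of pages to index can grow quickly.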

NB:

  • Be careful when raising the crawl depth above the default, because the crawled website contents are indexed in memory: the higher the crawl depth, the more memory is needed to index the website data.
  • I recommend a maximum crawl depth of 2 to be on the safe side, but you still need to take the memory of your machine into consideration.
  • This project is intended for learning purposes, where the crawl depth (and by extension the data volume) is expected to be small. If you wish to use it in a production environment where the data volume is expected to be high, use a third-party vector database service (e.g. Qdrant, Pinecone, MongoDB).
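As a rough back-of-the-envelope aid for the memory warning above, the sketch below estimates how the raw vector data might grow with crawl depth. Every number in it is an assumption chosen for illustration: 20 new links per page, 10 text chunks per page, and 1536-dimensional embeddings stored as 8-byte numbers. The function names are hypothetical, and real memory use will be higher because the chunk text and object overhead are not counted.

```javascript
// Rough count of pages reached if every page links to `linksPerPage`
// new pages (assumption; real sites vary widely).
function estimatePages(depth, linksPerPage) {
  let pages = 0;
  for (let d = 0; d <= depth; d++) pages += linksPerPage ** d;
  return pages;
}

// Size of the raw vectors only, ignoring stored text and per-object
// overhead. Assumes 8 bytes per number (a JavaScript double).
function estimateVectorBytes(pages, chunksPerPage, dimensions) {
  return pages * chunksPerPage * dimensions * 8;
}

for (const depth of [0, 1, 2, 3]) {
  const pages = estimatePages(depth, 20);
  const mib = estimateVectorBytes(pages, 10, 1536) / (1024 * 1024);
  console.log(`depth ${depth}: ~${pages} pages, ~${mib.toFixed(1)} MiB of vectors`);
}
```

Under these assumptions the page count grows geometrically with depth, which is the reason for keeping the depth small when indexing in memory.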


This project grew out of a RAG tutorial I facilitated. A recording of the tutorial is available.
