Insurance-PDF-Search

Add environment variables

Note: Create a .env file within the backend directory.

MONGODB_URI=
ARTIFACT_STORE=data/guidlines/artifacts
IMAGES_FOLDER=data/guidlines/images
USE_OPENAI=TRUE
TITLE="SuperDuperDB / Insurance Guidlines: AI Search & RAG Chat (OpenAI)"
OPENAI_API_KEY=
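
Before starting the server, it can help to confirm that the .env file defines every variable listed above. The following is a minimal sketch and not part of the repository: `parse_env` and `missing_keys` are hypothetical helpers, the parser only handles simple KEY=VALUE lines with optional surrounding quotes, and the path `backend/.env` assumes the script is run from the repository root.

```python
from pathlib import Path

# Variable names taken from the .env listing above.
REQUIRED_KEYS = [
    "MONGODB_URI",
    "ARTIFACT_STORE",
    "IMAGES_FOLDER",
    "USE_OPENAI",
    "TITLE",
    "OPENAI_API_KEY",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, in listing order."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path("backend/.env")  # assumed location (see lead-in)
    if env_path.exists():
        print("Missing or empty:", missing_keys(parse_env(env_path.read_text())))
    else:
        print(f"{env_path} not found")
```

An empty list means every variable has a value; anything else names the entries that still need to be filled in.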

Generate a venv for the Backend

Change directory:

cd backend

Create a Python virtual environment:

python3 -m venv virtualBack

Activate the virtual environment:

source virtualBack/bin/activate

Install Python dependencies:

pip install -r requirements.txt

Start the server:

python3 -m uvicorn main:app --reload

Data Preparation

In MongoDB Atlas, create a database called "demo_rag_insurance" and a collection called "claims_final", then import the dataset "demo_rag_insurance.claims.json" into the collection. Create two Vector Search indexes: one on "claimDescriptionEmbedding" called "vector_index_claim_description" and one on "photoEmbedding" called "default":

{
  "fields": [
    {
      "type": "vector",
      "path": "claimDescriptionEmbedding",
      "numDimensions": 350,
      "similarity": "cosine"
    }
  ]
}

{
  "fields": [
    {
      "type": "vector",
      "path": "photoEmbedding",
      "numDimensions": 1000,
      "similarity": "cosine"
    }
  ]
}
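
The two index definitions declare 350 and 1000 dimensions, so the embeddings stored in each imported document are expected to have exactly those lengths. A minimal sketch for checking a document before relying on the indexes follows; `dimension_problems` is a hypothetical helper, and the document is assumed to be a plain dict as returned by a MongoDB driver.

```python
# Field names and sizes copied from the index definitions above.
EXPECTED_DIMENSIONS = {
    "claimDescriptionEmbedding": 350,
    "photoEmbedding": 1000,
}


def dimension_problems(doc, expected=EXPECTED_DIMENSIONS):
    """Return a list of human-readable problems; empty if the doc looks fine."""
    problems = []
    for field, dims in expected.items():
        vector = doc.get(field)
        if vector is None:
            problems.append(f"{field}: missing")
        elif len(vector) != dims:
            problems.append(f"{field}: expected {dims} values, found {len(vector)}")
    return problems


if __name__ == "__main__":
    # Synthetic document for illustration only; not real claim data.
    example = {
        "claimDescriptionEmbedding": [0.0] * 350,
        "photoEmbedding": [0.0] * 1000,
    }
    print(dimension_problems(example))
```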

Install the Python dependencies:

pip install -r requirements.txt

Parse the PDFs and add the model to the database

At this point you can either analyze and index the PDFs yourself by running the script below, or create a new database called "insurance_pdf_search" and import all the collections contained in the folder "insurance_pdf_search_db". If you index the PDFs yourself, import "customer.json" (contained in the "insurance_pdf_search_db" folder) into your database once the script is done.

python3 sddb.py --init

Add the vector index on the collection "_output.elements.chunk" and the field "_outputs.elements.text-embedding-ada-002.0":

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "_outputs.elements.text-embedding-ada-002.0",
      "similarity": "cosine",
      "type": "vector"
    },
    {
      "path": "_outputs.elements.chunk.0.source_elements.metadata.filename",
      "type": "filter"
    }
  ]
}
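
The "filter" field in this index makes it possible to restrict a vector search to a single PDF. The sketch below builds an Atlas `$vectorSearch` aggregation pipeline that uses both fields from the definition above. It is an illustration only: the application may issue its queries differently (for example through SuperDuperDB), `build_pdf_search_pipeline` is a hypothetical helper, and the index name is a parameter because the text above does not specify one.

```python
# Paths copied from the index definition above.
EMBEDDING_PATH = "_outputs.elements.text-embedding-ada-002.0"
FILENAME_PATH = "_outputs.elements.chunk.0.source_elements.metadata.filename"
NUM_DIMENSIONS = 1536


def build_pdf_search_pipeline(query_vector, index_name, filename=None,
                              limit=5, num_candidates=100):
    """Build a $vectorSearch pipeline, optionally filtered to one PDF file."""
    if len(query_vector) != NUM_DIMENSIONS:
        raise ValueError(
            f"query vector must have {NUM_DIMENSIONS} values, got {len(query_vector)}"
        )
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least limit")

    stage = {
        "index": index_name,  # assumed: whatever name the index was given in Atlas
        "path": EMBEDDING_PATH,
        "queryVector": list(query_vector),
        "numCandidates": num_candidates,
        "limit": limit,
    }
    if filename is not None:
        # Pre-filter on the field indexed with type "filter".
        stage["filter"] = {FILENAME_PATH: {"$eq": filename}}

    return [
        {"$vectorSearch": stage},
        # Expose the similarity score alongside each matching document.
        {"$addFields": {"score": {"$meta": "vectorSearchScore"}}},
    ]
```

With a driver such as pymongo, the returned list could be passed to `collection.aggregate(...)`; the query vector would need to come from the same embedding model that produced the stored vectors.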

Ask Your PDF: Sample Queries

python3 sddb.py --query "What is a Certificate of Insurance?"
python3 sddb.py --query "what strategy should an insurer first determine?"

Now launch the backend:

python3 -m uvicorn main:app --reload
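
To confirm the backend is accepting connections before starting the frontend, a quick TCP check is enough. This is a minimal sketch: `is_listening` is a hypothetical helper, and the host and port are uvicorn's defaults (127.0.0.1:8000), which apply only if the command above is run without `--host` or `--port`.

```python
import socket


def is_listening(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("backend reachable:", is_listening())
```

A successful connection only shows that something is listening on the port; it does not verify that the application started without errors.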

Move to the frontend folder and run the frontend:

npm install
npm start

Build the Application with Docker

To build the Docker images and start the services, run the following command:

make build

Stopping the Application

To stop all running services, use the command:

make stop

Cleaning Up

To remove all images and containers associated with the application, execute:

make clean

Contributors

lucanapoli, bereilhp, ainhoamugica, condiekimdb
