kalyanm45 / medical-chatbot-using-llama-2

The project uses natural language processing and information retrieval to create an interactive system for user queries on a collection of PDFs. It involves loading, segmenting, and embedding PDFs with a Hugging Face model, and uses Pinecone for efficient similarity searches.

License: MIT License

Jupyter Notebook 77.38% Python 8.54% CSS 7.23% HTML 6.86%
ai-healthcare ai-healthcare-chatbot llama2 medical-chatbot pinecone

medical-chatbot-using-llama-2's Introduction

About The Project

This project leverages natural language processing and information retrieval techniques to create an interactive system for answering user queries based on a collection of PDF documents. The process begins with loading and segmenting PDFs into smaller text chunks. These chunks are then embedded using a pre-trained Hugging Face model. The embeddings are indexed using Pinecone, a vector search engine, facilitating efficient similarity searches. User queries are processed using a retrieval question-answering (QA) system, which combines the Pinecone index, a language model loaded from a file, and a defined prompt template. The project aims to provide concise and accurate responses to user queries, fostering a seamless interaction between the user and the information stored in the PDF documents.
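The retrieval flow described above can be sketched end to end with the standard library only. This is a minimal illustration, not the project's code: the bag-of-words `embed` function stands in for the Hugging Face embedding model, `InMemoryIndex` stands in for the Pinecone index, and the chunk sizes, the prompt wording, and every function name are assumptions made for the example. The last step, passing the prompt to the Llama 2 model, is left out.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping character windows (sizes are hypothetical)."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text):
    """Toy bag-of-words vector; a stand-in for the Hugging Face embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Stand-in for the Pinecone index: stores (vector, chunk) pairs in a list."""

    def __init__(self):
        self.items = []

    def upsert(self, chunks):
        self.items.extend((embed(chunk), chunk) for chunk in chunks)

    def query(self, question, top_k=2):
        q = embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:top_k]]


# Hypothetical prompt wording; the project's actual template is not shown in this README.
PROMPT_TEMPLATE = (
    "Use the context below to answer the question. "
    "If the answer is not in the context, say you don't know.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(index, question, top_k=2):
    """Retrieve the most similar chunks and fill the prompt template with them."""
    context = "\n---\n".join(index.query(question, top_k=top_k))
    return PROMPT_TEMPLATE.format(context=context, question=question)


if __name__ == "__main__":
    index = InMemoryIndex()
    index.upsert(split_text("Insulin regulates blood glucose levels. " * 3))
    print(build_prompt(index, "What regulates blood glucose?", top_k=1))
```

In the real pipeline the resulting prompt would be sent to the language model; here it is only printed so the retrieval step can be inspected on its own.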

Built With

  • Python
  • LangChain
  • Flask
  • Meta Llama2
  • Pinecone

Getting Started

To get a local copy up and running, follow the steps below.

Installation Steps

Option 1: Installation from GitHub

Follow these steps to install and set up the project directly from the GitHub repository:

  1. Clone the Repository

    • Open your terminal or command prompt.
    • Navigate to the directory where you want to install the project.
    • Run the following command to clone the GitHub repository:
      git clone https://github.com/KalyanMurapaka45/Medical-Chatbot-using-Llama-2.git
      
  2. Create a Virtual Environment (Optional but recommended)

    • It's a good practice to create a virtual environment to manage project dependencies. Run the following command:
      conda create -p <Environment_Name> python==<python version> -y
      
  3. Activate the Virtual Environment (Optional)

    • Activate the environment created in the previous step:
      conda activate <Environment_Name>/
      
  4. Install Dependencies

    • Navigate to the project directory:
      cd [project_directory]
      
    • Run the following command to install project dependencies:
      pip install -r requirements.txt
      
  5. Run the Project

    • Start the Flask application:
      python app.py
      
  6. Access the Project

    • Open a web browser and go to the local address that Flask prints in the terminal when the app starts.

API Key Setup

Create a .env file in the root directory and add your Pinecone credentials as follows:

PINECONE_API_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
PINECONE_API_ENV = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Download the quantized model from the link below and place it in the model directory:

Model: llama-2-7b-chat.ggmlv3.q4_0.bin

URL: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/tree/main
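Before running app.py, it can help to confirm that both pieces of this setup are in place. The following is a minimal pre-flight sketch, not part of the project: the function names are hypothetical, the folder name `model` is an assumption based on the wording above, and the `.env` parser only handles simple `KEY = "value"` lines like the ones shown.

```python
from pathlib import Path

REQUIRED_KEYS = ("PINECONE_API_KEY", "PINECONE_API_ENV")
MODEL_FILE = "llama-2-7b-chat.ggmlv3.q4_0.bin"


def read_env_file(path):
    """Parse simple KEY = "value" lines; blank lines and # comments are skipped."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def check_setup(root="."):
    """Return a list of human-readable problems; an empty list means setup looks complete."""
    root = Path(root)
    problems = []
    env_path = root / ".env"
    if not env_path.is_file():
        problems.append("missing .env file")
    else:
        env = read_env_file(env_path)
        for key in REQUIRED_KEYS:
            if not env.get(key):
                problems.append(f"{key} is not set in .env")
    # Assumption: the model lives in a folder named "model" under the project root.
    if not (root / "model" / MODEL_FILE).is_file():
        problems.append(f"missing model/{MODEL_FILE}")
    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("Setup problem:", problem)
```

Run from the project root, the script prints one line per missing item and prints nothing when the `.env` keys and the model file are both found.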

Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Report bugs: If you encounter any bugs, please let us know. Open up an issue and let us know the problem.

Contribute code: If you are a developer and want to contribute, follow the instructions below to get started!

  1. Fork the Project
  2. Create your Feature Branch
  3. Commit your Changes
  4. Push to the Branch
  5. Open a Pull Request

Suggestions: If you don't want to code but have some awesome ideas, open up an issue explaining some updates or improvements you would like to see!

Don't forget to give the project a star! Thanks again!

License

This project is licensed under the GNU General Public License v3.0, an Open Source Initiative (OSI) approved license. See the LICENSE.txt file for details.

Contact Details

Hema Kalyan Murapaka - [email protected]

Acknowledgements

We'd like to extend our gratitude to all individuals and organizations who have played a role in the development and success of this project. Your support, whether through contributions, inspiration, or encouragement, has been invaluable. Thank you for being a part of our journey.

medical-chatbot-using-llama-2's Issues

project set up

I'm facing a few issues with setup and get the following error when running the app.py script:

-Chatbot-using-Llama-2 % python app.py
Traceback (most recent call last):
  File "/Users/eugvuong/Desktop/Medical-Chatbot/Medical-Chatbot-using-Llama-2/app.py", line 47, in <module>
    docsearch=pinecone.Pinecone.Index((index_name), embeddings)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/eugvuong/Desktop/Medical-Chatbot/virt_env/lib/python3.11/site-packages/pinecone/control/pinecone.py", line 622, in Index
    pt = kwargs.pop('pool_threads', None) or self.pool_threads
                                             ^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'pool_threads'

I also noticed that every time you run the script, it creates an index in Pinecone, which fails if the index already exists:

  File "/Users/eugvuong/Desktop/Medical-Chatbot/virt_env/lib/python3.11/site-packages/pinecone/core/client/rest.py", line 261, in request
    raise PineconeApiException(http_resp=r)
pinecone.core.client.exceptions.PineconeApiException: (409)
Reason: Conflict
HTTP response headers: HTTPHeaderDict({'content-type': 'text/plain; charset=utf-8', 'access-control-allow-origin': '', 'vary': 'origin,access-control-request-method,access-control-request-headers', 'access-control-expose-headers': '', 'X-Cloud-Trace-Context': '34f1bc846da265c72a6b8b96b070b47e', 'Date': 'Wed, 17 Apr 2024 09:35:43 GMT', 'Server': 'Google Frontend', 'Content-Length': '85', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'})
HTTP response body: {"error":{"code":"ALREADY_EXISTS","message":"Resource already exists"},"status":409}

Wouldn't this cause problems when re-running the project?
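Both errors are consistent with how the client is being called. In the first traceback, `Index` is an instance method invoked on the `Pinecone` class itself, so the index name string is bound to `self`, which is why a `'str' object` is reported as lacking `pool_threads`. The 409 is what the service returns when an index with the same name already exists. One possible way to handle both, sketched under assumptions: the helper name `get_or_create_index` is hypothetical, the `list_indexes().names()` call is taken from the 3.x-style Pinecone client and should be checked against the installed version, and the cloud and region values in the usage comment are placeholders.

```python
def get_or_create_index(client, name, dimension, spec, metric="cosine"):
    """Create the index only if it does not exist yet, then return a handle to it.

    `client` is expected to be a Pinecone client *instance*; the calls below
    follow the 3.x-style client and are an assumption about the installed version.
    """
    existing = client.list_indexes().names()
    if name not in existing:
        client.create_index(name=name, dimension=dimension, metric=metric, spec=spec)
    # Index is called on the instance, not on the Pinecone class.
    return client.Index(name)


# Usage sketch with the real client (not executed here; values are placeholders):
#   import os
#   from pinecone import Pinecone, ServerlessSpec
#   pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
#   index = get_or_create_index(
#       pc, index_name, dimension=embedding_dim,
#       spec=ServerlessSpec(cloud="aws", region="us-east-1"),
#   )
```

Because the failing line also passes `embeddings`, the original code may have been meant to build a LangChain vector store from an existing index rather than a raw Pinecone index handle; that is a guess from the traceback, not something the repository confirms.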
