Gabby Masini, Leora Baumgarten, Annika Sparrell, and Brynna Kilcline
May 2024
- Clone and cd into the directory.
- Generate your own Mistral API key or use the one we provide in our assignment submission email. Create a file called llm_secret.py in the repository's root directory with the format:
key = "MISTRAL-API-KEY"
- Build and run the application:
$ docker build -t flask_app .
$ docker run --name flask -dp 8080:8080 flask_app
- Navigate to http://127.0.0.1:8080 in your browser.
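As a minimal sketch of how the key file from the setup steps might be consumed, the snippet below imports `key` from `llm_secret` and fails with a readable message when the file is missing or still holds the placeholder. Only the module name `llm_secret` and the variable `key` come from the instructions above; the helper name `load_api_key` is hypothetical and is not taken from llm.py.

```python
import importlib

# Placeholder text shown in the setup instructions.
PLACEHOLDER = "MISTRAL-API-KEY"


def load_api_key(module_name="llm_secret"):
    """Return the `key` variable from llm_secret.py (hypothetical helper)."""
    try:
        secret = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise RuntimeError(
            f"{module_name}.py not found; create it in the repository root"
        ) from exc

    key = getattr(secret, "key", None)
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"{module_name}.py must set key to a real API key")
    return key
```

Failing early like this makes a missing or unedited key file show up at startup rather than as an authentication error on the first query.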
To evaluate the system's performance on the handwritten test set, run:
$ docker exec -i flask python evaluate.py --filepath test_data/test_questions.jsonl
(Adjust the filepath argument in order to evaluate performance on other test files.)
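To evaluate every test file in one go, the command above can be wrapped in a short script run from the host. This is only a sketch: the file names are the ones listed under test_data/ in this README, the container name `flask` comes from the run command above, and the function names `build_eval_command` and `evaluate_all` are hypothetical.

```python
import subprocess

# Test files listed under test_data/ in this README.
TEST_FILES = [
    "test_data/test_questions.jsonl",
    "test_data/author_test_qs.jsonl",
    "test_data/date_test_qs.jsonl",
]


def build_eval_command(filepath, container="flask"):
    """Build the documented docker exec command for one test file."""
    return ["docker", "exec", "-i", container,
            "python", "evaluate.py", "--filepath", filepath]


def evaluate_all(files=TEST_FILES, runner=subprocess.run):
    """Run the evaluation once per file; `runner` can be swapped for a dry run."""
    for path in files:
        # check=True stops at the first evaluation that exits with an error.
        runner(build_eval_command(path), check=True)
```

Calling `evaluate_all()` with the container running executes the three evaluations in sequence; passing `runner=print` style stand-ins is a simple way to inspect the commands without starting Docker.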
To run unit tests, run:
$ docker exec -i flask python -m unittest discover -p "*_tests.py"
The application uses SQLAlchemy for its database layer because it integrates smoothly with the rest of the tech stack. We also experimented with Elasticsearch, which you can try out separately by following these steps:
- Install the required Python packages.
- In the terminal, run:
$ docker network create elastic
$ docker build -t es -f es_dockerfile .
$ docker run --name es01 --net elastic -p 9200:9200 -it -m 0.5GB es
You may need to add winpty at the beginning of the run command.
- Copy and paste the password for the elastic user into a file called es_password.txt in the repository's root folder.
- In a separate terminal window, create the Elasticsearch index:
$ python elasticsearch_index.py
- You may now input your query:
$ python elastic_search.py --query <YOUR-QUERY>
You can also run the unit test to see that its output matches the SQLAlchemy output:
$ python elasticsearch_test.py
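For orientation, here is a rough sketch of how a script could read es_password.txt and connect to the node started above. It is an assumption, not the code in elastic_search.py: it presumes the `elasticsearch` 8.x Python client, an HTTPS endpoint on port 9200 with a self-signed certificate, and the hypothetical helper names `read_es_password` and `connect`.

```python
from pathlib import Path


def read_es_password(path="es_password.txt"):
    """Read the elastic user's password, stripping surrounding whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()


def connect(password, url="https://localhost:9200"):
    """Return a client for the local node (assumes the elasticsearch 8.x package)."""
    # Imported lazily so the password helper works without the package installed.
    from elasticsearch import Elasticsearch

    # verify_certs=False is an assumption suited only to a local, self-signed setup.
    return Elasticsearch(url, basic_auth=("elastic", password), verify_certs=False)
```

Stripping whitespace matters because a pasted password often ends with a newline, which would otherwise cause authentication to fail.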
- static/: Contains images and styles for the frontend
  - css/: Contains CSS files
    - styles.css: Project stylesheet
  - favicon.ico
  - github.png
- templates/: Contains HTML files
  - index.html: Start page of the site
  - results.html: Template for the query response page
- test_data/: Data files for the evaluation scripts
  - author_test_qs.jsonl
  - date_test_qs.jsonl
  - test_questions.jsonl
- .gitignore: Gitignore (usual extraneous files plus API key files)
- alchemy_database.py: Code for filling and querying the SQLAlchemy database
- alchemy_tests.py: Unit tests for the SQLAlchemy database
- books_db.db: SQLAlchemy database
- create_database.py: Creates the SQLAlchemy database; does not need to be rerun once the database exists in the project
- dockerfile: The Dockerfile to containerize the project
- elastic_search.py: Code to query the database via Elasticsearch
- elasticsearch_index.py: Creates the Elasticsearch index; does not need to be rerun once the index exists
- elasticsearch_test.py: Unit tests for the Elasticsearch functionality
- es_dockerfile: Specialized Dockerfile required to run the Elasticsearch scripts
- es_password.txt: Must be created locally by the user; contains the Elasticsearch password generated with the above instructions
- evaluate.py: Runs evaluation scripts on the retrieval performance as well as the quality of the answers output by the LLM
- evaluation_tests.py: Unit tests for the evaluation scripts
- generate_test_qs.py: Creates the automated test data found in test_data/
- llm.py: Code to query the Mistral API to obtain LLM responses
- llm_secret.py: Must be created locally by the user; contains a Mistral API key stored in the variable key
- llm_tests.py: Unit tests for the LLM prompting code
- main.py: Flask frontend code
- README.md: You are here :)
- requirements.txt: Project dependencies
- utils.py: Contains short utility functions that are used by multiple other files