Genome Sequencing Location Prediction

Aim

To develop and deploy a sophisticated and user-friendly web application capable of predicting the origin or spread location accurately based on genomic sequences. By combining robust AI/ML models, efficient backend systems, and an intuitive frontend interface, the goal is to provide a powerful tool for researchers, healthcare professionals, and enthusiasts to analyze and understand genomic data related to the COVID-19 pandemic.

Getting Started

Prerequisites

Make sure you have the following software installed on your local machine:

Docker
Node.js (only if you want to run the application locally without Docker)
Python (only if you want to run the application locally without Docker)

Running the Application with Docker

These instructions will help you run the application using Docker. If you prefer to run it locally, please skip to the "Local Development" section. The Docker files for the frontend and backend are already included in the repository and can be found in the Frontend/ and Backend/ directories respectively. These files are used by Docker Compose to build the images and run the containers.

Clone this directory and move to its root using cd AI-Model-Training-Deployment-Genome-Sequencing_Girlgenius/
Download the classifier model from here (The model is too large to be uploaded to GitHub) and copy it to the Backend/ directory. This step is important as the model is required for the backend service to run.
Build the Docker images for the frontend and backend by running docker-compose build
This will create the Docker images for the frontend and backend services with the names ai-model-training-deployment-genome-sequencing_girlgenius-frontend and ai-model-training-deployment-genome-sequencing_girlgenius-backend respectively.
After the images are successfully built, you can run the application using docker-compose up
Docker Compose will start the container, and the application frontend will be accessible at the following URL: http://localhost:3000
The backend service will be running at http://localhost:8000
To stop the application and shut down the containers, press Ctrl + C in the terminal where Docker Compose is running.

Note - Running Docker with sudo can be necessary if your user doesn't have the necessary permissions to interact with Docker.

Local Development

If you want to run the application locally without Docker for development purposes:

Clone this directory and move to its root using cd AI-Model-Training-Deployment-Genome-Sequencing_Girlgenius/

Frontend

Run cd Frontend/
Run npm install to install all the dependencies to run the application.
run npm start to run the app in the development mode.
Open http://localhost:3000 to view it in the browser.

Note - If you want to build the app for production, run npm run build and the build will be stored in the build/ directory. This will correctly bundle React in production mode and optimize the build for the best performance.

Backend

Run cd Backend/
Download the classifier model from here (The model is too large to be uploaded to GitHub) and copy it to the Backend/ directory. This step is important as the model is required for the backend service to run.
Run pip install -r requirements.txt to install all the dependencies to run the application.
Run python -m uvicorn main:app --host 0.0.0.0 --port 8000 to run the backend service.
The backend service will be running at http://localhost:8000

Additional Information

Frontend

The frontend is built using React and Bootstrap.
The console will display useful information such as the request and response data made to the backend service, and any errors that occur.

UI Features

Login Page: Implemented secure user authentication methods for enhanced security.
Main Page: Allows users to upload/input genome sequences, explore detailed sequence information, and visualize predictive analytics with statistics, and real-time interactive maps.
Data Visualization: Offers comprehensive dashboards to gain detailed insights into prediction methodologies and dataset characteristics.
Compare Sequences: Provided tools for comparative analysis between two sequences which includes alignment scores.

Backend

The backend service is built using FastAPI.
The API has two endpoints:
- /send_seq: This endpoint accepts a POST request with a JSON body containing the sequence file of which the location is to be predicted. The response is a JSON body containing the top predicted locations for the sequence in descending order of probability in the format:
```
{
  "location1": probability1,
  "location2": probability2,
  ...
}
```
- /align_seq: This endpoint accepts a POST request with a JSON body containing two sequence files. The response is a JSON body containing the alignment score of the two sequences in the format:
```
{
  "score": alignment_score
}
```

Testing and Research

The Testing and Research/ directory contains the Jupyter Notebooks used for testing and research purposes.
The Testing and Research/Covid India Dataset/ directory contains some files that can be used to test the application.

Note - The dataset used was obtained from RCoV19 and GISAID EpiCoV. The latter was used with permission (DOI - 10.55876/gis8.231206fs).

srujana-16 / gene-sequence-location-prediction Goto Github PK

gene-sequence-location-prediction's Introduction

Genome Sequencing Location Prediction

Aim

Getting Started

Prerequisites

Running the Application with Docker

Local Development

Frontend

Backend

Additional Information

Frontend

UI Features

Backend

Testing and Research

Authors

gene-sequence-location-prediction's People

Contributors

Stargazers

Watchers

Recommend Projects

Recommend Topics

Recommend Org