biomedicalqa's Introduction

BioMedical Question Answering

This project shows one way to deploy a biomedical question-answering AI.

Training process

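Most of the ideas in this project follow the approach of the author of https://github.com/EmilyNLP/BioMedical-Question-Answering

The overall pipeline is essentially the same as in that repository, and most of the data is the same as well: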

Pretrain on PubMed text
Pretrain the backbone with the MLM (masked language modeling) objective
Replace the MLM head with a QA head and fine-tune on the SQuAD and BioASQ data
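
The MLM pretraining step can be illustrated with a small, framework-free sketch. It assumes the common BERT/RoBERTa-style recipe, which the source does not spell out: about 15% of the tokens are selected, 80% of those are replaced by the mask token, 10% by a random token, and 10% are left unchanged. Unselected positions get the label -100 so that cross-entropy skips them. The names `mask_tokens`, `MASK_ID`, and `IGNORE_INDEX` are hypothetical, and the sketch does not exclude special tokens from masking.

```python
import random

VOCAB_SIZE = 50265   # matches "Number of tokens" in the MLM settings
MASK_ID = 50264      # assumption: id of the mask token in a RoBERTa-style vocabulary
IGNORE_INDEX = -100  # assumption: label value that the cross-entropy loss skips


def mask_tokens(input_ids, mlm_probability=0.15, rng=None):
    """Return (masked_ids, labels) for one sequence of token ids."""
    rng = rng or random.Random()
    masked = list(input_ids)
    labels = [IGNORE_INDEX] * len(input_ids)
    for i, token in enumerate(input_ids):
        if rng.random() >= mlm_probability:
            continue  # position not selected: no loss is computed here
        labels[i] = token  # the model must predict the original token
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_ID  # 80%: replace with the mask token
        elif roll < 0.9:
            masked[i] = rng.randrange(VOCAB_SIZE)  # 10%: random token
        # remaining 10%: keep the original token in the input
    return masked, labels
```

Only the positions whose label is not -100 contribute to the loss, which is one way to read "CE with mask tokens only".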

PubMed

The main procedure for obtaining the PubMed corpus is the following:

Go to https://ftp.ncbi.nlm.nih.gov/pubmed/baseline/
Download all the .gz and .md5 files
Preprocess them into a .txt file
The resulting .txt file is about 740 MB, so we use only part of the articles
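
The download-and-preprocess steps above could look roughly like the following sketch, which uses only the standard library. It assumes that each .gz file is a gzipped PubMed XML file with `PubmedArticle`, `ArticleTitle`, and `AbstractText` elements, and that each .md5 file contains the 32-hex-digit checksum somewhere in its text. The function names and the `max_articles` cap (standing in for "only part of the articles") are hypothetical, not taken from the project.

```python
import gzip
import hashlib
import re
import xml.etree.ElementTree as ET


def md5_matches(gz_path, md5_path):
    """Compare a file's MD5 digest with the checksum found in its .md5 file."""
    with open(md5_path, encoding="utf-8") as f:
        match = re.search(r"\b[0-9a-fA-F]{32}\b", f.read())
    if match is None:
        return False
    digest = hashlib.md5()
    with open(gz_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == match.group(0).lower()


def _text(elem):
    """Flatten an element's text (including inline markup) to one line."""
    return " ".join("".join(elem.itertext()).split())


def iter_articles(gz_path):
    """Yield 'title abstract' for every article that has an abstract."""
    with gzip.open(gz_path, "rb") as f:
        for _, elem in ET.iterparse(f, events=("end",)):
            if elem.tag != "PubmedArticle":
                continue
            title = elem.find(".//ArticleTitle")
            parts = [_text(p) for p in elem.iter("AbstractText")]
            abstract = " ".join(p for p in parts if p)
            if abstract:
                head = _text(title) if title is not None else ""
                yield f"{head} {abstract}".strip()
            elem.clear()  # release the parsed article to keep memory flat


def write_corpus(gz_paths, out_path, max_articles=None):
    """Write one article per line; stop early once max_articles is reached."""
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in gz_paths:
            for line in iter_articles(path):
                if max_articles is not None and written >= max_articles:
                    return written
                out.write(line + "\n")
                written += 1
    return written
```

Streaming with `iterparse` avoids loading a whole baseline file into memory, which matters when the combined text is several hundred megabytes.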

Another way to access the medical articles is through an FTP connection.

Pretrained MLM

The detailed parameters for MLM pretraining are listed below.

Number of tokens: 50265
Epochs: 1
Num examples: 79974
Batch size: 4
Learning rate: 5e-5
Optimizer: AdamW
Block size: 512
Loss function: CE with mask tokens only
Scheduler: Linear Warmup Scheduler
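
To make the schedule concrete, here is a minimal sketch of a linear warmup learning-rate schedule using the settings above. The source does not give the number of warmup steps or say what happens after warmup, so `WARMUP_STEPS = 1000` and the linear decay to zero are assumptions, as is the absence of gradient accumulation. `linear_warmup_lr` is a hypothetical helper, not the project's code.

```python
import math

NUM_EXAMPLES = 79974
BATCH_SIZE = 4
EPOCHS = 1
BASE_LR = 5e-5
WARMUP_STEPS = 1000  # assumption: not stated in the settings above

# One optimizer step per batch (assumes no gradient accumulation).
TOTAL_STEPS = math.ceil(NUM_EXAMPLES / BATCH_SIZE) * EPOCHS


def linear_warmup_lr(step, base_lr=BASE_LR,
                     warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Assumed behaviour after warmup: decay linearly to 0 at the last step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With 79974 examples and a batch size of 4, one epoch corresponds to 19994 optimizer steps under these assumptions.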

Fine-tune on BioASQ

The detailed parameters for fine-tuning on BioASQ are listed below.

Number of tokens: 2 (start, end)
Epochs: 1
Num examples: 88707, 3025
Batch size: 4
Learning rate: 5e-5
Optimizer: AdamW
Block size: 512
Loss function: CE with mask tokens only
Scheduler: Linear Warmup Scheduler
Max sequence length: 384
Document stride: 128
Max query length: 64
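
The last three settings control how a long context is split into overlapping windows. The sketch below assumes the convention of the original BERT SQuAD script, in which the document stride is the distance between the starts of consecutive windows; some tokenizer APIs instead treat the stride as the overlap, so check which one applies. `num_special_tokens=4` is an assumption for a RoBERTa-style question/context pair, and `doc_spans` is a hypothetical helper.

```python
def doc_spans(n_doc_tokens, n_query_tokens, max_seq_length=384,
              doc_stride=128, max_query_length=64, num_special_tokens=4):
    """Return (start, length) windows over the context tokens."""
    n_query = min(n_query_tokens, max_query_length)  # the query is truncated
    max_doc = max_seq_length - n_query - num_special_tokens
    if max_doc <= 0 or doc_stride <= 0:
        raise ValueError("no room left for context tokens")
    spans = []
    start = 0
    while start < n_doc_tokens:
        length = min(n_doc_tokens - start, max_doc)
        spans.append((start, length))
        if start + length == n_doc_tokens:
            break  # this window reaches the end of the context
        start += min(length, doc_stride)
    return spans


# Example: a 1000-token context with a 10-token question.
for start, length in doc_spans(1000, 10):
    print(f"tokens {start}..{start + length - 1}")
```

Because consecutive windows overlap, an answer that straddles one window boundary usually still appears whole in a neighbouring window.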

Frontend framework

For the frontend, I use ReactJS + TailwindCSS. The page layout is modeled on the ChatGPT web page (https://chat.openai.com/chat).

Backend framework

For the backend, I use Django (Python) so that the AI model can be integrated directly, together with a RESTful API and version control for the backend.

Deployment

Finally, we deploy the application with Docker. Usage is as follows:

docker build -t aihub .
docker run -p 8000:8000 aihub
