
Classification of Radiological Text Reports using BERT

Data Preparation

Extraction of Free Text Reports

Single plain-text files were stored on a network drive. File names and paths were first extracted using R's list.files() function and stored in one very large table. As the workstation we used has more than 120 GB of RAM, keeping such large files in memory is no problem; on computers with less memory, some workarounds might be needed.

used script: text-extraction/extract-reports.R
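
For illustration, a minimal Python sketch of the same file-listing step (the network-drive path and output file name are assumptions; the repository's actual implementation is the R script above):

```python
# Sketch: recursively collect report file paths, analogous to R's list.files()
# (the network-drive path and output file name are assumptions).
import csv
from pathlib import Path

REPORT_ROOT = Path("/mnt/network-drive/reports")

with open("report_paths.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["path", "filename"])
    for path in REPORT_ROOT.rglob("*.txt"):
        writer.writerow([str(path), path.name])
```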

Clean Text Dump

About one million reports were not usable, since they only documented DICOM imports, meetings, consistency tests or the like. These were removed by full and partial string matching. In this way, a large part of the inappropriate diagnostic texts could be removed, reducing the number of text reports from 4,790,000 to 3,841,543.

used script: text-extraction/clean-report-texts.R
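
A hedged Python sketch of this filtering step (the match terms below are hypothetical placeholders, not the actual patterns used in the R script):

```python
# Sketch: drop reports that match known non-diagnostic patterns
# (the example patterns below are hypothetical placeholders).
import pandas as pd

EXACT_MATCHES = {"DICOM-Import", "Konsistenztest"}   # full-string matches
PARTIAL_MATCHES = ("Demonstration", "Besprechung")   # substring matches

reports = pd.read_csv("report_texts.csv")

mask = reports["text"].isin(EXACT_MATCHES)
for term in PARTIAL_MATCHES:
    mask |= reports["text"].str.contains(term, na=False)

clean = reports[~mask]
clean.to_csv("report_texts_clean.csv", index=False)
```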

Converting the Texts to Document Format

For the generation of a custom vocabulary and of training data for BERT, the source files need to be in a specific document format:

"The input is a plain text file, with one sentence per line. (It is important that these be actual sentences for the "next sentence prediction" task). Documents are delimited by empty lines."

As all text files were stored as CSV, they needed to be converted to document format. Each row contained one document, so pasting an empty line between documents was straightforward; having only one sentence per line, however, requires the documents to be split by sentence, which is more complicated.
To sentencize the reports, the German NLP module of spaCy was used. As this did not work perfectly and also split most radiology-specific abbreviations, another function was written to fix those wrong splits.

A notebook on the process and a Python script to run the code from the shell can be found in the folder pregeneration.
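
A minimal sketch of the sentencizing step, assuming the small German spaCy model; the abbreviation list is a hypothetical example, not the repository's actual list:

```python
# Sketch: split German reports into sentences with spaCy, then repair splits
# after radiology abbreviations (the abbreviation list is a placeholder).
import spacy

nlp = spacy.load("de_core_news_sm")

# Abbreviations after which spaCy tends to split wrongly (placeholders).
ABBREVIATIONS = ("Z.n.", "V.a.", "i.v.", "a.e.")

def sentencize(report: str) -> list[str]:
    sentences: list[str] = []
    for sent in nlp(report).sents:
        text = sent.text.strip()
        if sentences and sentences[-1].endswith(ABBREVIATIONS):
            # The previous "sentence" ended mid-abbreviation: merge the fragment back.
            sentences[-1] += " " + text
        elif text:
            sentences.append(text)
    return sentences

# Document format: one sentence per line, documents separated by an empty line.
def to_document_format(reports: list[str]) -> str:
    return "\n\n".join("\n".join(sentencize(r)) for r in reports)
```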

Create WordPiece vocabulary

Google Research does not provide scripts to create a new WordPiece vocabulary and instead refers to other open-source options. But, as they mention, these are not compatible with their tokenization.py library. We therefore used the modified library by kwonmha to build a custom vocabulary.

A notebook explaining the steps necessary to create a custom WordPiece vocabulary can be found in the folder pretraining.
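
Once built, the vocabulary can be sanity-checked against BERT's own tokenizer; a minimal sketch (the vocabulary file name is an assumption):

```python
# Sketch: verify the custom vocabulary works with BERT's tokenization.py
# (file name is an assumption; cased tokenization matches German usage).
import tokenization  # from the google-research/bert repository

tokenizer = tokenization.FullTokenizer(
    vocab_file="radiology_vocab.txt", do_lower_case=False
)
print(tokenizer.tokenize("Kein Anhalt für Pneumothorax."))
# A good domain vocabulary maps radiology terms to few WordPieces
# instead of many fragments.
```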

Create Pretraining Data

The create_pretraining_data.py script from Google was used. Due to memory limitations, the text dump had to be split into smaller parts. The notebook gives more details on the data-preparation procedure.
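
A hedged sketch of running the script once per shard, so no single run has to hold the full text dump in memory (shard paths and hyperparameter values are assumptions, not the repository's actual settings):

```python
# Sketch: invoke Google's create_pretraining_data.py once per shard
# (paths and hyperparameter values are assumptions).
import subprocess

for i in range(20):
    subprocess.run(
        [
            "python", "create_pretraining_data.py",
            f"--input_file=data/reports_shard_{i:03d}.txt",
            f"--output_file=data/reports_shard_{i:03d}.tfrecord",
            "--vocab_file=radiology_vocab.txt",
            "--do_lower_case=False",
            "--max_seq_length=128",
            "--max_predictions_per_seq=20",
            "--masked_lm_prob=0.15",
            "--dupe_factor=5",
        ],
        check=True,
    )
```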

Run Pretraining

Two models were pretrained using the BERT-Base configuration: one from scratch, one using a German BERT model as the initial checkpoint. The notebook explains pretraining in more detail.
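
A sketch of the two pretraining runs with Google's run_pretraining.py; checkpoint paths, step counts and hyperparameters are assumptions, and the only difference between the runs is the --init_checkpoint flag:

```python
# Sketch: pretrain from scratch vs. from a German BERT checkpoint using
# Google's run_pretraining.py (paths and hyperparameters are assumptions).
import subprocess

common = [
    "python", "run_pretraining.py",
    "--input_file=data/reports_shard_*.tfrecord",
    "--bert_config_file=bert_config.json",
    "--do_train=True",
    "--train_batch_size=32",
    "--max_seq_length=128",
    "--max_predictions_per_seq=20",
    "--learning_rate=1e-4",
    "--num_train_steps=1000000",
]

# Run 1: from scratch (no initial checkpoint).
subprocess.run(common + ["--output_dir=models/rad-bert-scratch"], check=True)

# Run 2: initialized from a German BERT checkpoint.
subprocess.run(common + ["--output_dir=models/rad-bert-german",
                         "--init_checkpoint=german_bert/bert_model.ckpt"],
               check=True)
```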

Finetuning of four different BERT models

The German BERT model from deepset.ai, the multilingual BERT model from Google, and our two pretrained BERT models were all fine-tuned on varying amounts of annotated text reports of chest radiographs. The steps of the fine-tuning process are explained in detail in the respective notebooks.
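
One way to set up the varying-training-size experiments, sketched under the assumption that the annotated reports live in a CSV with one label column per finding (an illustration, not the notebooks' actual code):

```python
# Sketch: draw nested training subsets of increasing size for the
# fine-tuning experiments (sizes and file names are assumptions).
import pandas as pd

annotated = pd.read_csv("annotated_chest_reports.csv")
shuffled = annotated.sample(frac=1.0, random_state=42)  # fixed shuffle order

for n in (250, 500, 1000, 2000):
    # Nested subsets: each larger set contains all smaller ones.
    shuffled.head(n).to_csv(f"train_{n}.csv", index=False)
```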

Results

Our BERT models achieve state-of-the-art performance compared with the existing literature.

| F1-scores | RAD-BERT | RAD-BERT (train size = 1000) | Olatunji et al. 2019 | Reeson et al. 2018 | Friedlin et al. 2006 | Wang et al. 2017 | MetaMap$ 2017 | Asatryan et al. 2011 | Elkin et al. 2008 |
|---|---|---|---|---|---|---|---|---|---|
| Globally abnormal | 0.96 | 0.96 | 0.83 | na | na | 0.93+ | 0.91+ | na | na |
| Cardiomegaly | na | na | 0.23 | na | 0.97 | 0.88 | 0.9 | na | na |
| Congestion | 0.9 | 0.86 | na | na | 0.98 | 0.83§ | 0.77§ | na | na |
| Effusion | 0.92 | 0.92 | na | na | 0.98 | 0.87 | 0.81 | na | na |
| Opacity/Consolidation | 0.92 | 0.88 | 0.63 | na | na | 0.91/0.80/0.77# | 0.95/0.39/0.71# | 0.24-0.57* | 0.82 |
| Pneumothorax | 0.89 | 0.79 | na | 0.92 | na | 0.86 | 0.46 | na | na |
| Venous catheter | 0.98 | 0.96 | na | 0.97 | na | na | na | na | na |
| Thoracic drain | 0.95 | 0.9 | na | 0.95 | na | na | na | na | na |
| Medical devices | 0.99 | 0.99 | 0.29 | na | na | na | na | na | na |
| Best F1-score | 0.99 | 0.99 | 0.83 | 0.95 | 0.98 | 0.93 | 0.95 | 0.57 | 0.82 |
| Worst F1-score | 0.58 | 0.4 | 0.23 | 0.92 | 0.97 | 0.52 | 0.39 | 0.24 | 0.82 |

+ detection of normal radiographs
# Consolidation/Opacity was not reported, but atelectasis, infiltration and pneumonia
$ as reported by Wang et al.
§ Congestion not reported, but edema
* performed only pneumonia detection, as a clinical diagnosis, and used varying thresholds of pneumonia prevalence
na: not available / not reported by the study

Citation

If you find the code helpful, please cite our published manuscript:

@article{10.1093/bioinformatics/btaa668,
    author = {Bressem, Keno K and Adams, Lisa C and Gaudin, Robert A and Tröltzsch, Daniel and Hamm, Bernd and Makowski, Marcus R and Schüle, Chan-Yong and Vahldiek, Janis L and Niehues, Stefan M},
    title = "{Highly accurate classification of chest radiographic reports using a deep learning natural language model pretrained on 3.8 million text reports}",
    journal = {Bioinformatics},
    year = {2020},
    month = {07},
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btaa668},
    url = {https://doi.org/10.1093/bioinformatics/btaa668},
    note = {btaa668},
    eprint = {https://academic.oup.com/bioinformatics/article-pdf/doi/10.1093/bioinformatics/btaa668/33526133/btaa668.pdf},
}

