
radiation-therapy-machine-learning's Introduction

Telomere length and chromosomal instability for predicting individual radiosensitivity and risk via machine learning

This repository is a supplement to my preprint (https://www.biorxiv.org/content/10.1101/2020.03.27.009043v3) and upcoming paper.

One sentence summary

Here we explore the utility of telomeres and chromosome rearrangements for predicting cancer patients' risk of late effects from radiotherapy, and present the first implementation of individual telomere length data in a machine learning model (XGBoost), providing a general framework for predicting individual outcomes from radiotherapy.

Abstract

The ability to predict a cancer patient’s response to radiotherapy and risk of developing adverse late health effects would greatly improve personalized treatment regimens and individual outcomes. Telomeres represent a compelling biomarker of individual radiosensitivity and risk, as exposure can result in dysfunctional telomere pathologies that coincidentally overlap with many radiation-induced late effects, ranging from degenerative conditions like fibrosis and cardiovascular disease to proliferative pathologies like cancer. Here, telomere length was longitudinally assessed in a cohort of fifteen prostate cancer patients undergoing Intensity Modulated Radiation Therapy (IMRT) utilizing Telomere Fluorescence in situ Hybridization (Telo-FISH). To evaluate genome instability and enhance predictions for individual patient risk of secondary malignancy, chromosome aberrations were also assessed utilizing directional Genomic Hybridization (dGH) for high-resolution inversion detection. We present the first implementation of individual telomere length data in a machine learning model, XGBoost, trained on pre-radiotherapy (baseline) and in vitro exposed (4 Gy γ-rays) telomere length measures, to predict post-radiotherapy telomeric outcomes, which together with chromosomal instability provide insight into individual radiosensitivity and risk for radiation-induced late effects.

Data handling/workflow

The notebooks that document and execute the data analysis for this project are provided below in multiple formats. All code is written in Python. The Nbviewer and HTML links provide static renderings of the Jupyter notebooks. The JupyterLab Binder link provides an interactive, fully functional rendering of the entire repo or of individual notebooks. Each notebook is dedicated to a specific task in the workflow, named in the filename suffix.


01_radiation_therapy_patients_data_EXTRACTION_CLEANING.ipynb

  • Extraction and cleaning of telomere length and chromosome rearrangement data

02_radiation_therapy_patients_data_VISUALIZATION_STATISTICS.ipynb

  • Multiple types of highly customized data visualizations
  • Feature engineering short/long telomeres
  • ANOVAs, regressions, statistical modeling
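As a rough illustration of the short/long telomere feature engineering mentioned above, here is a minimal sketch. It assumes that "short" and "long" are defined by the lower and upper quartiles of a patient's baseline distribution, which this README does not state. The function names and the measurement values are hypothetical and are not taken from the notebook.

```python
from statistics import quantiles


def quartile_cutoffs(baseline_lengths):
    """Return the (Q1, Q3) cut-offs of the baseline telomere length values."""
    q1, _median, q3 = quantiles(baseline_lengths, n=4, method="inclusive")
    return q1, q3


def count_short_medium_long(lengths, q1, q3):
    """Count telomeres below Q1 (short), above Q3 (long), or in between."""
    counts = {"short": 0, "medium": 0, "long": 0}
    for value in lengths:
        if value < q1:
            counts["short"] += 1
        elif value > q3:
            counts["long"] += 1
        else:
            counts["medium"] += 1
    return counts


# Hypothetical individual telomere length measurements (arbitrary units).
baseline = [0.6, 0.8, 0.9, 1.0, 1.1, 1.2, 1.4, 1.6]
post_therapy = [0.4, 0.5, 0.7, 0.9, 1.0, 1.3, 1.7, 1.9]

# Cut-offs come from the baseline sample only (an assumption), so that
# later time points are scored against the same reference.
q1, q3 = quartile_cutoffs(baseline)
baseline_counts = count_short_medium_long(baseline, q1, q3)
post_counts = count_short_medium_long(post_therapy, q1, q3)
print(baseline_counts, post_counts)
```

Fixing the cut-offs at baseline means a shift in the counts at a later time point reflects a change in the distribution itself and not a moving threshold.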

03_radiation_therapy_patients_data_MACHINE_LEARNING.ipynb

  • Extensive data-cleaning class pipelines to prepare data for machine learning (ML) models
  • GridSearch / Bayesian optimization (don't run if your laptop is older like mine)
  • Training XGBoost models on pre-therapy data to predict post-therapy outcomes
  • Extensive evaluation of model metrics
  • Hierarchical clustering of patients using longitudinal data into high/low risk sub-groups

To launch all notebooks (may be slow):

Jupyter Lab
Binder

To launch individual notebooks:

Nbviewer Jupyter Lab HTML
01_radiation_therapy_patients_data_EXTRACTION_CLEANING.ipynb Binder HTML
02_radiation_therapy_patients_data_VISUALIZATION_STATISTICS.ipynb Binder HTML
03_radiation_therapy_patients_data_MACHINE_LEARNING.ipynb Binder HTML

 
