cancer's Introduction

OpenCog/SNet precision medicine for clinical trials proof-of-concept (PoC) project

Questions to answer:

  • what patterns of patient biomarkers characterize disease prognosis?
  • what patterns of patient biomarkers characterize treatment success?
  • what patterns of patient biomarkers characterize treatment adverse events?
  • what combinations of biomarkers and treatment parameters characterize the best patient outcomes?

Breast cancer tumor transcriptomes and clinical data sets

Each data set combines multiple studies with different gene sets and clinical variables. The studies can be analyzed as an ensemble/meta-analysis or merged into one large matrix. With standard statistical methods, meta-analysis is the more powerful option, because data are lost when variables from different studies are aligned and merged.
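The data loss from merging can be illustrated with a toy example. The sketch below uses made-up values, and the names study_a, study_b, GENE_X, and merge_on_shared_genes are hypothetical, not part of any package listed here. It assumes that merging keeps only the genes measured in every study, and counts how many measurements that alignment step discards.

```python
# Toy data (made up): gene label -> expression values, one value per sample.
study_a = {"ER": [2.1, 3.4], "HER2": [0.5, 0.7], "AURKA": [1.2, 1.9]}
study_b = {"ER": [2.8, 3.0, 1.1], "AURKA": [0.9, 1.4, 2.2], "GENE_X": [1.0, 0.4, 0.8]}


def merge_on_shared_genes(*studies):
    """Concatenate samples across studies, keeping only genes present in all of them."""
    shared = set.intersection(*(set(study) for study in studies))
    return {
        gene: [value for study in studies for value in study[gene]]
        for gene in sorted(shared)
    }


def count_values(study):
    """Total number of expression measurements in one study or merged matrix."""
    return sum(len(values) for values in study.values())


merged = merge_on_shared_genes(study_a, study_b)
total_before = count_values(study_a) + count_values(study_b)
total_after = count_values(merged)

print("genes kept after merging:", list(merged))
print("measurements discarded:", total_before - total_after)
```

A meta-analysis would instead analyze study_a and study_b separately, each with its full gene set, and combine the per-study results afterwards.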

Data sources

Clustering Intra and Inter DatasEts

CoINcIDE is an unsupervised meta-graph clustering algorithm used to subtype tumors from gene expression profiles across multiple patient study cohorts: paper, author's GitHub.
The author's GitHub includes useful but outdated R code for processing and merging microarray data sets from GEO. Updating the code for current use is ongoing; see the cancer branch of the fork of the author's GitHub repository in CoINcIDE.

curatedBreastData: 4,923 breast tumor microarray expression sets from 2,613 patients in 20 studies published as a Bioconductor R package [paper].

A Three-Gene Model to Robustly Identify Breast Cancer Molecular Subtypes

Haibe-Kains et al. (2012) develop a Gaussian mixture subtype classification model (SCM) using microarray expression levels of three key genes (ER, HER2, and AURKA) from breast cancer tumor samples. They compare it favorably to two other published SCMs and three published hierarchical-clustering-based single sample predictor (SSP) classifiers, including the commercially available PAM50 molecular subtyping system, which use dozens to hundreds of genes. An associated Bioconductor package, genefu, and the code to reproduce their findings are available.
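To give a feel for what a Gaussian mixture classifier does, here is a minimal sketch that fits a two-component mixture to the expression of a single gene with the EM algorithm and labels each sample as low or high. This is not the published three-gene SCM or the genefu implementation; the function names and the er_expression values are made up for illustration, and the sketch assumes one gene with two well-separated groups.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_component_gmm(values, n_iter=50):
    """Fit a 1-D, two-component Gaussian mixture with a fixed number of EM steps."""
    n = len(values)
    overall_mean = sum(values) / n
    # Floor the variance so that constant input does not cause a division by zero.
    overall_var = max(sum((v - overall_mean) ** 2 for v in values) / n, 1e-6)
    weights = [0.5, 0.5]
    means = [min(values), max(values)]  # start the components at the extremes
    variances = [overall_var, overall_var]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in values:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)
    return weights, means, variances


def classify(x, weights, means, variances):
    """Return 0 (low-expression component) or 1 (high-expression component)."""
    joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
    return joint.index(max(joint))


# Made-up expression values for one gene: four low and four high samples.
er_expression = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
weights, means, variances = fit_two_component_gmm(er_expression)
labels = [classify(x, weights, means, variances) for x in er_expression]
print("component means:", means)
print("labels:", labels)
```

For real analyses, the genefu package mentioned above is the place to start; the sketch only shows the mixture-model idea on one variable.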

MetaGxBreast: 39 breast cancer microarray expression datasets spanning 10,004 samples. Survival information is available for 6,847 patients, including overall survival (n = 4,425), metastasis free survival (n = 2,695), and relapse free survival (n = 1,858) [package][paper].

PDF copies of the papers are in the lit directory.

additional data and method sources

TCGA

1,073 samples are already included in the MetaGxBreast dataset. These samples have other -omics assay data available for data integration analyses (whole genome sequencing, DNA methylation, proteomics, etc.).
Link to multi-omics breast cancer sub-typing paper with analysis data available from TCGA. This is a good review of current thinking about breast cancer.

TCGA pan-cancer literature index
163 normal tissue samples from breast cancer patients: search table
1,145 blood samples from breast cancer patients: search table

other data and methods links

state of the art tumor classification: Dynamic Classification Using Case-Specific Training Cohorts Outperforms Static Gene Expression Signatures in Breast Cancer

cancer's People

Contributors

astroseger, cics-oleg, mjsduncan, noskill, yashenkoxciv
