PGxGDSC

PGxGDSC is an automated pipeline for pharmacogenomic analysis of cancer drug response. Given a cancer type and a therapeutic compound, PGxGDSC identifies genes with mutually exclusive alterations in sensitive and resistant cancer cell models. Mutation profiles, gene expression and copy number variations are taken from the COSMIC and CCLE repositories.

Step-by-step description

1. Download prerequisite data to the working directory

Download the data files below and the R scripts found in PGxGDSC/source/. Then set the R working directory to the folder that contains the downloaded data and R scripts.

A. Drug response data

  1. GDSC1-dataset
  2. GDSC2-dataset
  3. Compounds-annotation
  4. List of all annotated models

B. COSMIC Cell Lines Project data files

  • You will need to log in to download the files.
  • COSMIC does not host older releases of the resource. As a result, the release v94 data files used for our analysis are no longer available for download. You can download the data files from the latest COSMIC release, v95, and apply the same methodology. However, the results for the three use cases described in the article might differ from those provided in PGxGDSC/data/.

Go to the COSMIC Cell Lines Project and download the following files:

  1. Complete mutation data > Download Whole File > CosmicCLP_MutantExport.tsv.gz
  2. Copy Number Data > Download Whole File > CosmicCLP_CompleteCNA.tsv.gz
  3. Gene Expression > Download Whole File > CosmicCLP_CompleteGeneExpression.tsv.gz
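Before moving on, it can help to confirm that the three COSMIC files listed above actually landed in the working directory. The snippet below is a minimal sketch, not part of PGxGDSC: `find_missing_files` is a hypothetical helper, and it assumes the files keep the names shown in the download steps.

```r
# File names as listed in the COSMIC download steps above.
cosmic_files <- c(
  "CosmicCLP_MutantExport.tsv.gz",
  "CosmicCLP_CompleteCNA.tsv.gz",
  "CosmicCLP_CompleteGeneExpression.tsv.gz"
)

# Hypothetical helper (not part of PGxGDSC): returns the names of the
# files that are NOT present in `dir`.
find_missing_files <- function(files, dir = getwd()) {
  files[!file.exists(file.path(dir, files))]
}

# Report on the current working directory.
not_found <- find_missing_files(cosmic_files)
if (length(not_found) > 0) {
  message("Missing COSMIC files: ", paste(not_found, collapse = ", "))
} else {
  message("All COSMIC files found in ", getwd())
}
```

If files are reported missing, check that `setwd()` points at the folder holding the downloads rather than at the repository root.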

C. DepMap/CCLE data files

The CCLE data are available through DepMap and can be downloaded following the links below:

  1. CCLE_mutations
  2. CCLE_expression_full
  3. CCLE_gene_cn

2. Run PGxGDSC/source/uploads_edits.R

The script loads the data files into the R workspace. This might take a while, so be patient.
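Once the script has finished, a quick check that the data objects exist can save a failed run later. The sketch below assumes that uploads_edits.R creates the data objects named in the workspace cleanup command in step 3; that link is an assumption, and `find_missing_objects` is a hypothetical helper rather than part of PGxGDSC.

```r
# Object names taken from the rm(...) command in step 3. That
# uploads_edits.R is what creates them is an assumption.
expected_objects <- c(
  "GDSC1", "GDSC2", "compounds_annotation",
  "CCLE_exp", "CCLE_mutations", "CCLE_CN",
  "COSMIC_exp", "COSMIC_mutations", "COSMIC_CNVs"
)

# Hypothetical helper: returns the names that are not defined in `envir`
# (parent environments are not searched).
find_missing_objects <- function(object_names, envir = globalenv()) {
  found <- vapply(object_names, exists, logical(1),
                  envir = envir, inherits = FALSE)
  object_names[!found]
}

# Usage after running uploads_edits.R:
# find_missing_objects(expected_objects)
```

An empty result means every expected object is present in the workspace.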

3. Run PGxGDSC/source/main.R

Open the main.R script and configure the following parameters appropriately.

A. drug_name <- <drug.name>

<drug.name>: Insert the drug name as referenced in the Genomics of Drug Sensitivity in Cancer database, e.g. drug_name <- "Afatinib".

B. dataset <- <dataset>

<dataset>: Insert the Genomics of Drug Sensitivity in Cancer dataset to use for the analysis. Available options are dataset <- GDSC2 and dataset <- GDSC1.

C. tissue <- <cancer.tissue>

<cancer.tissue>: Insert the TCGA classification of the tissue to analyze, e.g. tissue <- "BRCA" (BRCA is the TCGA classification for Breast invasive carcinoma).

D. getScript(x, y)

Choose the type (x) and the source (y) of the data to be analyzed. Options for x are "MUTATIONS" (mutation data), "EXPRESSION" (gene expression data) and "CN" (copy number data). Options for y are "COSMIC" (COSMIC Cell Lines Project) and "DEPMAP" (DepMap/CCLE). For example: getScript("MUTATIONS", "COSMIC")
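Putting the four settings together, a configured run might look like the sketch below. The parameter values are the examples given above. `check_get_script_args` is a hypothetical guard, not part of PGxGDSC, that rejects misspelled options before the long-running analysis starts. Whether getScript itself validates its arguments is not stated here.

```r
# Example configuration using the values quoted above (run inside main.R
# after the data have been loaded):
# drug_name <- "Afatinib"
# dataset   <- GDSC2
# tissue    <- "BRCA"
# getScript("MUTATIONS", "COSMIC")

# Allowed options, as listed in the description of getScript(x, y).
valid_types   <- c("MUTATIONS", "EXPRESSION", "CN")
valid_sources <- c("COSMIC", "DEPMAP")

# Hypothetical guard: stops with an informative error if x or y is not
# exactly one of the documented options (matching is case-sensitive).
check_get_script_args <- function(x, y) {
  if (!is.character(x) || length(x) != 1 || !(x %in% valid_types)) {
    stop("x must be one of: ", paste(valid_types, collapse = ", "))
  }
  if (!is.character(y) || length(y) != 1 || !(y %in% valid_sources)) {
    stop("y must be one of: ", paste(valid_sources, collapse = ", "))
  }
  invisible(TRUE)
}

check_get_script_args("MUTATIONS", "COSMIC")
```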

If you want to run the analysis for another drug_name, dataset or tissue, first run the command

rm(list=setdiff(ls(), c("getScript", "GDSC1", "GDSC2", "compounds_annotation", "CCLE_exp", "CCLE_mutations", "CCLE_CN", "COSMIC_exp", "COSMIC_mutations", "COSMIC_CNVs")))

Then insert a new drug_name, dataset or tissue and run getScript(x, y).
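If you switch drugs or tissues often, the cleanup command can be wrapped in a small function. This is a sketch only: `reset_workspace` and `keep_objects` are hypothetical names, not part of PGxGDSC, and the list of objects to keep is copied from the command above.

```r
# Objects to keep between runs, copied from the rm(...) command above.
keep_objects <- c(
  "getScript", "GDSC1", "GDSC2", "compounds_annotation",
  "CCLE_exp", "CCLE_mutations", "CCLE_CN",
  "COSMIC_exp", "COSMIC_mutations", "COSMIC_CNVs"
)

# Hypothetical wrapper: removes everything from `envir` except the names in
# `keep` (and this helper plus its keep list, so it can be called again).
# Invisibly returns the names that were removed.
reset_workspace <- function(keep = keep_objects, envir = globalenv()) {
  protected <- c(keep, "reset_workspace", "keep_objects")
  to_remove <- setdiff(ls(envir = envir), protected)
  rm(list = to_remove, envir = envir)
  invisible(to_remove)
}

# Usage between two analyses:
# reset_workspace()
# drug_name <- "Afatinib"; dataset <- GDSC1; tissue <- "BRCA"
# getScript("EXPRESSION", "DEPMAP")
```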

An internet connection is required to download the data files and to run source("KEGG.R") and source("pathway_interactions.R").

Citation: The work has been submitted for publication and is currently under review.

Contributors: andigoni, konstantinoskardamiliotis
