santina / team_undecided Goto Github PK

2.0 2.0 0.0 216.76 MB

R 100.00%

team_undecided's People

Contributors

Stargazers

Watchers

team_undecided's Issues

Transfer Ownership

Hey @santina and @farnushfarhadi,

We would like ownership of the repo. You can transfer ownership to anyone.

Thanks,
Arjun

Poster of Team Undecided

SHA: e3d892f
@santina
@singha53
Poster

Notes for Monday Feb6th Meeting

Hey people,

I will also try to put down what I remember of Amrit's conversation with us.

Look at differential network analysis. This can be used to see the differences in networks between the two conditions (asthma and normal).
We can also try to group patients based on their methylation and RNA-seq profiles. We could then build networks from those patient groups and apply differential network analysis again, to see how the networks change among asthma endotypes (if any).
Try to download the data using the GEOquery package. It will give you all the data really nicely. I can't seem to do it from whatever computer I am on.
https://www.ncbi.nlm.nih.gov/pubmed/27473063
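
One way to make "differences in networks between the two conditions" concrete is to compare gene-gene correlations computed separately in each group. The following is only a minimal sketch: the choice of Pearson correlation, the mean-absolute-difference statistic, and all function and variable names are assumptions made for illustration, not the method of any particular package.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def differential_connectivity(expr_a, expr_b):
    """Mean absolute difference in gene-gene correlation between two groups.

    expr_a, expr_b: dict mapping gene name -> list of expression values,
    one value per sample in that group. Both dicts must hold the same
    genes, and there must be at least two genes.
    """
    genes = sorted(expr_a)
    diffs = [
        abs(pearson(expr_a[g], expr_a[h]) - pearson(expr_b[g], expr_b[h]))
        for g, h in combinations(genes, 2)
    ]
    return sum(diffs) / len(diffs)


# Toy data (made up): g1 and g2 rise together in controls but move
# in opposite directions in the asthma group.
controls = {"g1": [1.0, 2.0, 3.0, 4.0], "g2": [2.0, 4.0, 6.0, 8.0]}
asthma = {"g1": [1.0, 2.0, 3.0, 4.0], "g2": [8.0, 6.0, 4.0, 2.0]}
score = differential_connectivity(controls, asthma)
```

A larger score means the correlation structure differs more between the two groups. Whether a given score is surprising still needs a null distribution, for example from resampling.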

Differential expression analysis

conduct differential expression analysis using limma voom
identify genes that are differentially expressed between control and patient groups

WGCNA modules identification

conduct WGCNA on controls to cluster genes into different modules
identify the most interesting modules that we want to perform differential network analysis on
try to limit each module to fewer than 500 genes; probably okay?
output list of genes organized into WGCNA modules
in the subsequent step, we will identify the modules that have "higher connectivity" or "more connected genes" (between the two patient groups)
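
To illustrate what "connectivity" could mean here, the sketch below uses the unsigned soft-threshold adjacency from weighted co-expression analysis, a_ij = |cor(x_i, x_j)|^beta, and sums it per gene. It is a toy re-implementation rather than the WGCNA package itself; the value beta = 6 and all names are assumptions for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def connectivity(expr, beta=6):
    """Per-gene connectivity k_i = sum over j != i of |cor(i, j)| ** beta.

    expr: dict mapping gene name -> list of expression values (one per sample).
    beta: soft-threshold power (assumed value; normally chosen from the data).
    """
    genes = sorted(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += a  # the adjacency is symmetric, so credit both genes
            k[h] += a
    return k


# Toy data (made up): g1 and g2 are perfectly correlated, g3 is mostly noise.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, -1.0, 1.0, -1.0],
}
k = connectivity(expr)
```

Computing `k` separately for each patient group and comparing the values per module is one way to rank modules by how much their connectivity changes.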

Differential coexpression analysis

Project Proposal for Team_Undecided

67fc517

https://github.com/STAT540-UBC/team_Undecided/blob/master/project_proposal.md

@ppavlidis @singha53 @farnushfarhadi

I made a pretty flow chart for our poster - see attached. It also outlines what we need to do. I'll go ahead and create separate issues for each step in preparation for our little hackathon tomorrow.

STAT 540 - Analysis Pipeline.pdf

Clustering

log2-transform the data to reveal more of the variation
ensure we're using normalized data
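
As a small illustration of the transform, the sketch below applies log2 to already-normalized values. The pseudocount of 1 (added so that zeros stay finite) and the function name are assumptions, not something specified in this issue.

```python
from math import log2


def log2_transform(values, pseudocount=1.0):
    """Return log2(value + pseudocount) for each normalized expression value."""
    return [log2(v + pseudocount) for v in values]


# Toy normalized values (made up) spanning three orders of magnitude.
normalized = [0.0, 1.0, 3.0, 1023.0]
transformed = log2_transform(normalized)
```

After the transform, the largest value no longer dominates distance calculations the way it would on the raw scale, which is usually what a clustering step needs.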

Progress Report for Team_Undecided

9de029f

Progress Report

@ppavlidis @singha53 @farnushfarhadi @santina

Pathway identification given differentially expressed genes

pathway enrichment analysis, using WikiPathways?
identify interesting pathways that we want to conduct differential network analysis on
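
A common form of pathway enrichment is an over-representation test: given the differentially expressed genes, is a pathway hit more often than chance would predict? The sketch below computes a one-sided hypergeometric p-value. The function name and the toy numbers are hypothetical, and nothing here depends on a particular pathway database.

```python
from math import comb


def enrichment_p_value(n_universe, n_pathway, n_de, n_overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    n_universe: number of genes tested overall
    n_pathway:  number of those genes annotated to the pathway
    n_de:       number of differentially expressed genes
    n_overlap:  number of DE genes that fall in the pathway

    Returns P(X >= n_overlap) when n_de genes are drawn without
    replacement from the universe.
    """
    total = comb(n_universe, n_de)
    upper = min(n_pathway, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers (made up): 20 genes tested, 5 in the pathway,
# 4 differentially expressed, 3 of which are in the pathway.
p = enrichment_p_value(n_universe=20, n_pathway=5, n_de=4, n_overlap=3)
```

With many pathways tested, the resulting p-values would need a multiple-testing adjustment before picking the "interesting" ones.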

differential connectivity analysis using dna

Biological interpretation

now we have a list of modules / pathways that are significant from our differential network analysis statistics
what do they mean??

Initial proposal feedback

Name	Department/Program	Expertise/Interests	GitHub ID
Arjun Baghela	Bioinformatics	Immunology & Transcriptomics	@abaghela
Emma Graham	Bioinformatics	Machine Learning & Metabolomics	@emmagraham
Allison Tai	Bioinformatics	Machine Learning & DNA Structure	@faelicy
Eric Chu	Bioinformatics	Neuroscience & Transcriptomics	@echu113

Team name: Undecided

One paragraph on the basic idea of the project:

Asthma is characterized by chronic inflammation and affects over 400 million children and adults worldwide (1). The heterogeneity of the disease manifests as variation in clinical onset, responsiveness to treatment, and comorbidities (2). Upstream events in the lung epithelial cells of the lower airway have been postulated to initiate Type II inflammation, which is mediated by CD4+ T cells and leads to cytokine production and remodeling of the cellular environment in the lower airway. Recent studies using RNA-seq data have characterized the Type II immune response in CD4+ T cells; however, the upstream events in epithelial cells that initiate this response remain unknown (3,4). A recently published study, which obtained RNA-seq and methylation profiles for 76 asthma patients, investigated the genetic and epigenetic markers upregulated in lower airway epithelial cells during asthmatic responses (5). However, the conclusions of that study are limited by numerous confounding factors, such as medication usage, comorbidities, and experimental artefacts, which can obscure meaningful biological signals. Furthermore, the generation of interaction networks with WGCNA (6) in that study may have removed meaningful connections in an attempt to reduce noise, and the approach has difficulty incorporating heterogeneous data. We will analyze RNA-seq and methylation data from lung epithelial cells in subjects with and without asthma to determine master regulator genes that initiate the Type II inflammatory response in lung epithelial cells. To begin, RNA-seq data will be processed to remove the effect of confounding variables and used to construct a co-expression network. Similarly, we will construct a co-expression network with differentially methylated CpGs (DMCs). Both networks may give us insight into the genetic and epigenetic signatures that influence variation in asthma endotypes. If time permits, we may also try other analyses. These include determining whether methylation levels at DMCs are correlated with expression levels of nearby genes, and integrating DMC and RNA-seq data using a network-interaction-based approach.

References

1. Pawankar R. 2014. Allergic diseases and asthma: a global public health concern and a call to action. World Allergy Organ. J. 7: 12.
2. Wesolowska-Andersen A, Seibold MA. Airway molecular endotypes of asthma: dissecting the heterogeneity. Curr Opin Allergy Clin Immunol. 2015;15(2):163–168. doi:
3. Locksley RM. Asthma and allergic inflammation. Cell. 2010;140:777–783.
4. Seumois, Grégory, et al. "Transcriptional profiling of Th2 cells identifies pathogenic features associated with asthma." The Journal of Immunology 197.2 (2016): 655-664
5. Nicodemus-Johnson, Jessie et al. “DNA Methylation in Lung Cells Is Associated with Asthma Endotypes and Genetic Risk.” JCI Insight 1.20 (2016): e90151. PMC. Web. 26 Jan. 2017.
6. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. doi:10.1186/1471-2105-9-559.

Test for significance by bootstrapping

Use bootstrapping to generate a null distribution of the differential network analysis statistics
pool the samples from the two groups and resample them with replacement
run the same differential network analysis tests on each resample... repeat many, many times
the p-value is the proportion of resampled statistics that are as large as or larger than the value obtained in the real test
adjust the resulting list of p-values for multiple testing
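
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the statistic is passed in as a function (a simple mean difference stands in for the real differential network statistic), the "+1" in the p-value is an added convention that avoids reporting exactly zero, Benjamini-Hochberg is one possible choice of multiple-testing adjustment, and all names are hypothetical.

```python
import random


def resampling_p_value(group_a, group_b, statistic, n_resamples=1000, seed=0):
    """Estimate a p-value by resampling from the pooled samples with replacement."""
    rng = random.Random(seed)
    observed = statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_resamples):
        # Draw new groups of the original sizes, ignoring the group labels.
        new_a = [rng.choice(pooled) for _ in group_a]
        new_b = [rng.choice(pooled) for _ in group_b]
        if statistic(new_a, new_b) >= observed:
            hits += 1
    # Count the observed statistic itself so the estimate is never zero.
    return (hits + 1) / (n_resamples + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank of p_values[i] in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def mean_difference(a, b):
    """Stand-in statistic: absolute difference between the group means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


# Toy data (made up): two clearly separated groups of per-sample scores.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2]
group_b = [1.0, 1.2, 0.9, 1.1, 0.8]
p = resampling_p_value(group_a, group_b, mean_difference)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In the real analysis each "sample" would be a patient's expression profile and `statistic` would rebuild the two networks and compare them, so the number of resamples is limited mainly by compute time.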

Data preprocessing

Normalize counts by library size - can be done using voom?
Filter out lowly expressed genes (fewer than 3 counts per million?)
any other necessary preprocessing
produce PCA figures to illustrate why no further processing is necessary?
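
The library-size normalization and low-expression filter can be sketched as below. The 3 counts-per-million threshold comes from the note above; the rule that a gene must reach it in at least two samples, and all function names, are assumptions for illustration. This is not how voom is implemented, although voom does operate on log2 counts per million.

```python
def counts_per_million(counts):
    """Scale raw counts by each sample's library size.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [c * 1e6 / lib_sizes[j] for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def filter_low_expression(counts, min_cpm=3.0, min_samples=2):
    """Keep genes that reach min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts)
    keep = {
        gene
        for gene, row in cpm.items()
        if sum(v >= min_cpm for v in row) >= min_samples
    }
    return {gene: row for gene, row in counts.items() if gene in keep}


# Toy counts (made up): each sample has a library size of exactly 1,000,000.
raw = {
    "geneA": [500000, 600000, 550000],
    "geneB": [499999, 399998, 449999],
    "geneC": [1, 2, 1],
}
filtered = filter_low_expression(raw)
```

Here `geneC` never exceeds 2 counts per million and is dropped, while the raw counts of the remaining genes are returned unchanged for downstream normalization.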
