team_undecided's Issues
Transfer Ownership
Hey @santina and @farnushfarhadi ,
We would like ownership of the repo. You can transfer ownership to anyone.
Thanks,
Arjun
Poster of Team Undecided
Notes for Monday Feb 6th Meeting
Hey people,
I will also try to put down what I remember of Amrit's conversation with us.
- Look at differential network analysis. This can be used to see the differences in networks between the two conditions (asthma and normal).
- We can also try to group patients together based on their methylation and RNA-seq profiles. We could then build a network for each group of patients and apply differential network analysis again. This way we can see how the networks change among asthma endotypes (if any).
- Try to download the data using the GEOquery package. It will give you all the data really nicely. I can't seem to do it from whatever computer I am on.
- https://www.ncbi.nlm.nih.gov/pubmed/27473063
Differential expression analysis
- conduct differential expression analysis using limma voom
- identify genes that are differentially expressed between control and patient groups
WGCNA modules identification
- conduct WGCNA on controls to cluster genes into different modules
- identify the most interesting modules that we want to perform differential network analysis on
- try to limit each module to fewer than 500 genes; probably okay?
- output list of genes organized into WGCNA modules
- in the subsequent step, we will identify the modules that have "higher connectivity" or "more connected genes" (between the two patient groups)
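As a rough illustration of what "higher connectivity" could mean here, the sketch below computes a WGCNA-style unsigned connectivity (the sum of |correlation|^beta over all other genes) for each gene in two groups and takes the difference. It is plain Python rather than the WGCNA R package, and the function names, the toy data layout (gene name mapped to a list of per-sample values), and `beta=6` are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def connectivity(expr, beta=6):
    """expr: dict gene -> expression values across samples.

    Returns dict gene -> sum of |cor|^beta with every other gene
    (unsigned soft-threshold adjacency; beta is an illustrative choice).
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += a
            k[h] += a
    return k


def connectivity_difference(expr_case, expr_control, beta=6):
    """Per-gene connectivity in cases minus connectivity in controls."""
    k_case = connectivity(expr_case, beta)
    k_ctrl = connectivity(expr_control, beta)
    return {g: k_case[g] - k_ctrl[g] for g in k_case if g in k_ctrl}
```

Genes with a large positive or negative difference are candidates for being "more connected" in one group; whether that difference is meaningful still needs a significance test.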
Differential coexpression analysis
Project Proposal for Team_Undecided
Pipeline PDF
I made a pretty flow chart for our poster - see attached. It also outlines what we need to do. I'll go ahead and create separate issues for each step in preparation for our little hackathon tomorrow.
Clustering
- log2-transform the data to reveal more of the variation
- ensure we're using normalized data
Progress Report for Team_Undecided
Pathway identification given differentially expressed genes
- pathway enrichment analysis, using WikiPathways?
- identify interesting pathways that we want to conduct differential network analysis on
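One common way to score pathway enrichment is a one-sided hypergeometric (over-representation) test. The sketch below is a minimal pure-Python version under that assumption; the thread does not say which test or tool will be used, and the function names and gene-set inputs are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_universe, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from a universe of n_universe genes, n_pathway of which are in the pathway."""
    total = comb(n_universe, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def pathway_enrichment(de_genes, pathway_genes, universe):
    """Enrichment p-value of one pathway among differentially expressed genes.

    Genes outside the universe (all genes that were tested) are ignored.
    """
    universe = set(universe)
    de = set(de_genes) & universe
    pathway = set(pathway_genes) & universe
    return hypergeom_enrichment_pvalue(
        len(universe), len(pathway), len(de), len(de & pathway)
    )
```

Running this once per pathway yields a list of p-values, which would then need adjusting for multiple testing.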
Differential connectivity analysis using dna
Biological interpretation
- now we have a list of modules / pathways that are significant from our differential network analysis statistics
- what do they mean??
Initial proposal feedback
Name | Department/Program | Expertise/Interests | GitHub ID |
---|---|---|---|
Arjun Baghela | Bioinformatics | Immunology & Transcriptomics | @abaghela |
Emma Graham | Bioinformatics | Machine Learning & Metabolomics | @emmagraham |
Allison Tai | Bioinformatics | Machine Learning & DNA Structure | @faelicy |
Eric Chu | Bioinformatics | Neuroscience & Transcriptomics | @echu113 |
Team name: Undecided
One paragraph on the basic idea of the project:
Asthma is characterized by chronic inflammation, and affects over 400 million children and adults worldwide (1). The heterogeneity of the disease manifests as variation in clinical onset, responsiveness to treatment and comorbidities (2). Upstream events in the lung epithelial cells of the lower airway have been postulated to initiate Type II inflammation, which is mediated by CD4+ T cells, leading to cytokine production and remodeling of the cellular environment in the lower airway. Recent studies using RNA-seq data have characterized the Type II immune response in CD4+ T cells; however, the upstream events in epithelial cells that initiate this response remain unknown (3,4). A recently published study obtained RNA-seq and methylation profiles for 76 asthma patients and investigated the genetic and epigenetic markers upregulated in lower airway epithelial cells during asthmatic responses (5). However, the conclusions of the study are limited by numerous confounding factors such as medication usage, comorbidities and artefacts of experimentation, which can obscure the detection of meaningful biological signals. Furthermore, the WGCNA-based network construction (6) used in that study may have removed meaningful connections in an attempt to reduce noise, and WGCNA has difficulty incorporating heterogeneous data. We will analyze RNA-seq and methylation data from lung epithelial cells in subjects with and without asthma to determine master regulator genes that initiate the Type II inflammatory response in lung epithelial cells. To begin our analysis, RNA-seq data will be processed to remove the effect of confounding variables, and used to construct a co-expression network. Similarly, we will construct a co-expression network with differentially methylated CpGs (DMCs). Both these networks may give us insights into the genetic and epigenetic signatures that influence variation in asthma endotypes. If time permits, we may also try other analyses, including determining whether methylation levels at DMCs are correlated with expression levels of nearby genes, and integrating DMC and RNA-seq data using a network-interaction-based approach.
References
1. Pawankar R. 2014. Allergic diseases and asthma: a global public health concern and a call to action. World Allergy Organ. J. 7: 12.
2. Wesolowska-Andersen A, Seibold MA. Airway molecular endotypes of asthma: dissecting the heterogeneity. Curr Opin Allergy Clin Immunol. 2015;15(2):163–168. doi:
3. Locksley RM. Asthma and allergic inflammation. Cell. 2010;140:777–783.
4. Seumois, Grégory, et al. "Transcriptional profiling of Th2 cells identifies pathogenic features associated with asthma." The Journal of Immunology 197.2 (2016): 655-664.
5. Nicodemus-Johnson, Jessie et al. “DNA Methylation in Lung Cells Is Associated with Asthma Endotypes and Genetic Risk.” JCI Insight 1.20 (2016): e90151. PMC. Web. 26 Jan. 2017.
6. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. doi:10.1186/1471-2105-9-559.
Test for significance by bootstrapping
- Use bootstrapping to generate distribution of differential network analysis statistics
- pool the samples of the two groups and resample from the pool with replacement to form two mock groups
- run the same differential network analysis test on the mock groups, and repeat many times
- the p-value is the proportion of values in the resulting distribution that are at least as large as the statistic obtained in the real test
- adjust for multiple testing given a list of p-values
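The steps above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the team's implementation: `abs_mean_diff` is a hypothetical stand-in for the real differential network statistic, the `+1` in the p-value is a common convention that avoids reporting exactly zero, and Benjamini-Hochberg is assumed as the multiple-testing adjustment because the notes do not name one.

```python
import random


def abs_mean_diff(a, b):
    """Placeholder statistic: absolute difference of group means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


def pooled_bootstrap_pvalue(group_a, group_b, statistic=abs_mean_diff,
                            n_iter=2000, seed=0):
    """Resample both groups with replacement from the pooled samples and
    report how often the resampled statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_iter):
        mock_a = rng.choices(pooled, k=len(group_a))
        mock_b = rng.choices(pooled, k=len(group_b))
        if statistic(mock_a, mock_b) >= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In practice there would be one p-value per module or pathway, and the whole list would be passed to `benjamini_hochberg` at the end.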
Data preprocessing
- Normalize count by library size - can be done using voom?
- Filter out lowly expressed genes (less than 3 per million?)
- any other necessary preprocessing
- produce PCA figures to illustrate why no further processing is necessary?
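To make the first two steps concrete, here is a small pure-Python sketch of library-size normalization to counts per million (CPM), filtering of lowly expressed genes, and a log2 transform. It only illustrates the idea and is not what voom does internally; the function names, the `min_samples` rule, and the `prior` offset are assumptions, and the 3 CPM threshold is the tentative value from the list above.

```python
import math


def counts_per_million(counts):
    """counts: dict gene -> list of raw counts, one entry per sample."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [c / lib_sizes[j] * 1e6 for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def filter_low_expression(cpm, min_cpm=3.0, min_samples=2):
    """Keep genes reaching min_cpm in at least min_samples samples."""
    return {
        gene: row for gene, row in cpm.items()
        if sum(v >= min_cpm for v in row) >= min_samples
    }


def log2_cpm(cpm, prior=1.0):
    """log2-transform CPM values; prior avoids taking log2 of zero."""
    return {gene: [math.log2(v + prior) for v in row] for gene, row in cpm.items()}


raw = {
    "geneA": [500, 600, 550],
    "geneB": [0, 1, 0],
    "geneC": [499500, 599399, 549450],
}
kept = log2_cpm(filter_low_expression(counts_per_million(raw)))
print(sorted(kept))
```

In this toy input `geneB` never reaches 3 CPM, so it is dropped before the log transform.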