Cancer-transcriptomics Array Data Analysis This repository contains scripts for analyzing array data using various R packages and Bioconductor tools. The analysis includes preprocessing steps such as background correction, normalization, batch effect removal, and differential expression analysis.
To run the analysis, you need to install the following Bioconductor packages: frma, affy, cluster, impute, preprocessCore, WGCNA, GEOquery, simpleaffy, RColorBrewer, devtools, ggfortify, ggplot2, sva, limma, and hgu133plus2.db. You can install these packages using the following command:
biocLite(c("affy", "affy", "cluster", "impute", "preprocessCore", "GEOquery", "WGCNA", "GEOquery", "simpleaffy", "RColorBrewer", "devtools", "ggfortify", "ggplot2", "sva", "limma", "hgu133plus2.db"))
This step involves downloading the gene expression dataset from the Gene Expression Omnibus (GEO) database. GEO stores curated gene expression DataSets that can be used for analysis.
In this step, the dataset undergoes various preprocessing steps, including normalization, batch effect removal, and quality control. The Bioconductor packages and tools mentioned earlier are used for these processes.
After preprocessing, the dataset is ready for differential expression analysis. The analysis compares gene expression between different samples to identify genes that are differentially expressed. Bioconductor packages such as limma are used for this analysis.
This step involves generating graphical results to visualize the findings from the differential expression analysis. Packages like ggplot2 and RColorBrewer are utilized for creating plots and visualizations.
README: Under Construction Please note that this README is currently under construction. More detailed instructions and explanations will be added soon.