Comments (4)
08-07-2020
Specific details for imputation:
- Median imputation?
- k-NN; how to determine optimal k?
- Make sure that rows/columns are correct (assuming that rows are samples, columns are genes)
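The two imputation options above can be compared on a toy matrix. What follows is a minimal sketch in Python rather than the package's R code, assuming rows are samples, columns are genes, and missing values are encoded as `None`. The function names (`impute_median`, `knn_impute`, `choose_k`) are hypothetical. Picking k by masking known cells and measuring reconstruction error is one common heuristic, not a method the notes settle on.

```python
import math
from statistics import mean, median


def impute_median(matrix):
    """Fill each missing cell with the median of its gene (column)."""
    n_genes = len(matrix[0])
    if any(len(row) != n_genes for row in matrix):
        raise ValueError("every sample (row) must cover the same genes (columns)")
    imputed = [list(row) for row in matrix]
    for g in range(n_genes):
        observed = [row[g] for row in matrix if row[g] is not None]
        if not observed:
            continue  # gene missing in every sample: leave it as None
        fill = median(observed)
        for row in imputed:
            if row[g] is None:
                row[g] = fill
    return imputed


def knn_impute(matrix, k):
    """Fill each missing cell with the mean of that gene over the k nearest samples."""
    imputed = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for g, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for j, other in enumerate(matrix):
                if j == i or other[g] is None:
                    continue
                # Distance uses only genes observed in both samples.
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[g]))
            candidates.sort(key=lambda pair: pair[0])
            neighbours = [v for _, v in candidates[:k]]
            if neighbours:
                imputed[i][g] = mean(neighbours)
    return imputed


def choose_k(matrix, candidate_ks, held_out):
    """Return the k with the lowest RMSE on known cells that are masked first."""
    masked = [list(row) for row in matrix]
    for i, g in held_out:
        masked[i][g] = None
    best_k, best_rmse = None, math.inf
    for k in candidate_ks:
        filled = knn_impute(masked, k)
        errors = [(filled[i][g] - matrix[i][g]) ** 2
                  for i, g in held_out if filled[i][g] is not None]
        if not errors:
            continue
        rmse = math.sqrt(mean(errors))
        if rmse < best_rmse:
            best_k, best_rmse = k, rmse
    return best_k
```

The explicit row-length check in `impute_median` is a cheap guard for the rows/columns point: a transposed or ragged matrix fails early instead of silently imputing across samples.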
Immune deconvolution methods:
- Determine which gene expression inputs each method requires
- Correlate methods with each other and see if they suggest similar compositions
- Correlate results with existing results available from e.g. papers
- Leave out CIBERSORT due to registration wall (no R-package/source code available?)
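One way to check whether two deconvolution methods suggest similar compositions is a rank correlation of their per-sample estimates for the same cell type. The sketch below is in Python rather than the package's R code and implements Spearman correlation with average ranks for ties. The function names are hypothetical. Rank correlation is used here on the assumption that different methods may report scores on different scales, which makes raw values hard to compare directly.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of positions pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences (NaN if one is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("both methods must be evaluated on the same samples")
    return pearson(average_ranks(x), average_ranks(y))
```

Calling `spearman(method_a_scores, method_b_scores)` once per cell type gives a small table of agreement values. The same function could be reused to compare against estimates reported in papers, provided the sample identifiers are matched first.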
Gene names:
- Take the whole list of gene names (biomaRt) and save it as an R object inside the package (internal?)
- Have all GEX/CNA/... contain those gene names, even if they're mainly populated with NA values (-> conforming dimensions for genes)
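The idea of conforming gene dimensions can be sketched as follows, in Python rather than the package's R code. It assumes each assay is held as a mapping from gene name to per-sample values and that a missing value is `None` (the counterpart of `NA`). The function name `conform_genes` is hypothetical, and dropping genes that are absent from the reference list is an assumption of this sketch, not something the notes decide.

```python
def conform_genes(assay, reference_genes, n_samples):
    """Return the assay re-indexed to reference_genes, padding absent genes with None."""
    conformed = {}
    for gene in reference_genes:
        values = assay.get(gene)
        if values is None:
            # Gene not measured in this assay: keep the row, filled with missing values.
            conformed[gene] = [None] * n_samples
        elif len(values) != n_samples:
            raise ValueError(f"{gene}: expected {n_samples} samples, got {len(values)}")
        else:
            conformed[gene] = list(values)
    return conformed


# Toy example with made-up values; the reference list would come from biomaRt.
reference = ["AR", "TP53", "PTEN"]
gex = {"AR": [1.0, 2.0], "PTEN": [0.5, 0.7]}
gex_conformed = conform_genes(gex, reference, n_samples=2)
```

After this step every assay has the same genes in the same order, so matrices from different studies or data types can be stacked or compared without re-matching names.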
from curatedpcadata.
08-26-2020
Naming conventions
- Lower case in all function names
- Lower case also in function parameters
- Always use "_" instead of "." (due to the special use of "." in R)
- Data-structure variables follow the pattern "DataType_StudyName", e.g. "MAE_TCGA"
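The function and parameter naming rules above are mechanical enough to check automatically. This is a minimal sketch in Python rather than R. The helper names and the example identifiers are hypothetical, and it covers only the lower-case and underscore rules, not the "DataType_StudyName" pattern for data-structure variables.

```python
def normalize_function_name(name):
    """Apply the convention: lower case, with '_' in place of '.'."""
    return name.replace(".", "_").lower()


def follows_convention(name):
    """True if the name already satisfies the lower-case and underscore rules."""
    return name == normalize_function_name(name)


# Hypothetical identifiers, used only to illustrate the rules.
legacy_names = ["Generate.Gex.Geo", "generate_cna_geo", "rmNA"]
renamed = {old: normalize_function_name(old) for old in legacy_names}
```

Running `follows_convention` over a package's exported names would list the functions that still need renaming.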
09-11-2020
- Get rid of cBioPortal version of Taylor et al., focus on GEO-derived portion that uses our harmonization discipline
- Elaborate further on example analyses
- Elaborate further on uses of MultiAssayExperiment (use their examples maybe for inspiration, cheat sheet from https://bioconductor.riken.jp/packages/3.6/bioc/vignettes/MultiAssayExperiment/inst/doc/MultiAssayExperiment_cheatsheet.pdf etc)
- Make sure everybody's up to baseline of the master branch (and meet regarding conflicts if a better solution exists than is currently in master)
- Figure out which immunedeconv methods can be used with data that's not in TPM GEX-input format (e.g. xCell supporting ranked orders if native package is used?)
Most of these issues have now been addressed, or altered and completed (for example, CIBERSORT was included after all).
Related Issues (20)
- Creation of sufficiently elaborate metadata for each MAE
- Splitting apart OSF portion of TCGA
- Purity Estimates
- Abida et al focus on polyA / TCGA focus on TPM normalized GEX (OSF)
- Risk score benchmarking
- Gene ID aliases alter between datasets, especially older annotation ones
- Row names
- Benchmarking description
- Double-checking ranks of newly normalized data
- Include normal samples in Taylor et al.
- Use the generic identifiers PCA#### / PAN#### in Taylor et al.
- Sample IDs in Clinical data - Taylor et.al.
- Fusion status
- CNA and fusion sample size different in TCGA
- Wang et al identifiers mismatch in derived variables
- Weiner et al. & newly generated GEX for CIBERSORTx
- Cibersort results for Weiner et al. missing
- colData does not work for MAE Barwick
- Xenabrowser mapping not up to date
- cBioPortal hg19 to hg38 liftOver