Hello,
Thanks for the great package.
I only have pre-processed betas (from cord - see distribution below) so I must use meffil.estimate.cell.counts.from.betas.
There are a lot of negative values, especially for NK.
Of note, I did remove SNP-related probes, XY chromosome probes and low variability (<5%) probes and batch corrected for plate.
cellcountref = "andrews and bakulski cord blood"
cellcounts3<-meffil.estimate.cell.counts.from.betas(ARIEScordbetapoint0stICA,cell.type.reference=cellcountref,verbose=T)
head(cellcounts3)
Bcell CD4T CD8T Gran Mono NK nRBC
35 0.000000e+00 0.19415426 1.436839e-01 0.4361359 2.176463e-18 2.699185e-19 -6.938894e-18
42 -5.335846e-18 0.19536009 0.000000e+00 0.5927273 0.000000e+00 -8.517630e-18 2.775558e-17
45 -4.336809e-19 0.19231770 5.951174e-20 0.5538809 6.938894e-18 -9.660659e-18 0.000000e+00
50 -5.403754e-18 0.08274238 9.387805e-02 0.5950767 0.000000e+00 -9.282177e-18 0.000000e+00
73 -2.185562e-18 0.00000000 4.426202e-03 0.8130283 -1.199445e-18 0.000000e+00 0.000000e+00
118 1.168658e-01 0.03662122 6.565257e-02 0.5457070 1.602137e-18 -1.042258e-18 0.000000e+00
cellcountref2 ="cord blood gse68456"
cellcounts4<-meffil.estimate.cell.counts.from.betas(as.matrix(ARIEScordbetapoint0stICA),cell.type.reference=cellcountref2,verbose=T)
head(cellcounts4)
Bcell CD4T CD8T Gran Mono NK RBC
35 0.000000e+00 1.093173e-01 1.074017e-01 0.5587779 -3.088042e-19 3.544531e-03 0.03901592
42 6.938894e-18 8.018818e-02 6.655908e-02 0.6905427 0.000000e+00 0.000000e+00 0.00000000
45 0.000000e+00 6.050381e-02 9.786610e-02 0.6523306 0.000000e+00 0.000000e+00 0.00000000
50 0.000000e+00 6.646457e-02 5.677843e-02 0.6550116 2.477728e-02 0.000000e+00 0.00000000
73 -1.526199e-18 1.565778e-02 9.471646e-19 0.8668614 0.000000e+00 0.000000e+00 0.00000000
118 2.728014e-02 9.222220e-19 1.876269e-19 0.6439960 2.631825e-02 8.673617e-19 0.12577491
There are fewer negative values with the second reference but the distribution of estimated values between the two are very similar (see below)
Thanks for any thoughts,
J
Below is my session info:
sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252 LC_NUMERIC=C LC_TIME=English_Canada.1252
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] FlowSorted.CordBloodNorway.450k_1.4.0 minfi_1.24.0 bumphunter_1.20.0 locfit_1.5-9.1
[5] iterators_1.0.10 foreach_1.4.4 Biostrings_2.46.0 XVector_0.18.0
[9] SummarizedExperiment_1.8.1 DelayedArray_0.4.1 Biobase_2.38.0 GenomicRanges_1.30.3
[13] GenomeInfoDb_1.14.0 IRanges_2.12.0 S4Vectors_0.16.0 BiocGenerics_0.24.0
[17] meffil_1.0.0 statmod_1.4.30 quadprog_1.5-5 DNAcopy_1.52.0
[21] fastICA_1.2-1 lme4_1.1-19 Matrix_1.2-15 multcomp_1.4-8
[25] TH.data_1.0-9 survival_2.43-3 mvtnorm_1.0-8 matrixStats_0.54.0
[29] markdown_0.9 gridExtra_2.3 Cairo_1.5-9 knitr_1.21
[33] reshape2_1.4.3 plyr_1.8.4 ggplot2_3.1.0 sva_3.26.0
[37] BiocParallel_1.12.0 genefilter_1.60.0 mgcv_1.8-26 nlme_3.1-137
[41] limma_3.34.9 MASS_7.3-51.1 illuminaio_0.20.0
loaded via a namespace (and not attached):
[1] minqa_1.2.4 colorspace_1.4-0 siggenes_1.52.0 mclust_5.4.2 base64_2.0 rstudioapi_0.9.0 bit64_0.9-7
[8] AnnotationDbi_1.40.0 xml2_1.2.0 codetools_0.2-16 splines_3.4.3 nloptr_1.2.1 Rsamtools_1.30.0 annotate_1.56.2
[15] readr_1.3.1 compiler_3.4.3 httr_1.4.0 assertthat_0.2.0 lazyeval_0.2.1 prettyunits_1.0.2 tools_3.4.3
[22] bindrcpp_0.2.2 gtable_0.2.0 glue_1.3.0 GenomeInfoDbData_1.0.0 dplyr_0.7.8 doRNG_1.7.1 Rcpp_1.0.0
[29] multtest_2.34.0 preprocessCore_1.40.0 rtracklayer_1.38.3 xfun_0.4 stringr_1.3.1 rngtools_1.3.1 XML_3.98-1.16
[36] beanplot_1.2 zlibbioc_1.24.0 zoo_1.8-4 scales_1.0.0 hms_0.4.2 sandwich_2.5-0 GEOquery_2.46.15
[43] RColorBrewer_1.1-2 yaml_2.2.0 memoise_1.1.0 pkgmaker_0.27 biomaRt_2.34.2 reshape_0.8.8 stringi_1.2.4
[50] RSQLite_2.1.1 RMySQL_0.10.16 GenomicFeatures_1.30.3 bibtex_0.4.2 rlang_0.3.1 pkgconfig_2.0.2 bitops_1.0-6
[57] nor1mix_1.2-3 lattice_0.20-38 purrr_0.2.5 bindr_0.1.1 GenomicAlignments_1.14.2 bit_1.1-14 tidyselect_0.2.5
[64] magrittr_1.5 R6_2.3.0 DBI_1.0.0 pillar_1.3.1 withr_2.1.2 RCurl_1.95-4.11 tibble_2.0.1
[71] crayon_1.3.4 progress_1.2.0 grid_3.4.3 data.table_1.12.0 blob_1.1.1 digest_0.6.18 xtable_1.8-3
[78] tidyr_0.8.2 openssl_1.1 munsell_0.5.0 registry_0.5
Below is the histogram of my betas (to show it's bound by zero and 1.)
Histogram of estimated cell counts4
Histogram of estimated cell counts3