tanaylab / metacell
Metacell - Single-cell mRNA Analysis
Home Page: https://tanaylab.github.io/metacell
License: Other
Original report by Anonymous.
I have a dataset with more cells (more than 40K) than number of genes.
mcell_add_gene_stat encounters an error as follows. The bug is probably because there are more cells than genes and 'replace=T' needs to be specified in 'sample' function:
Error in do.ply(i) :
task 1 failed - "cannot take a sample larger than the population when 'replace = FALSE'"
Calls: mcell_add_gene_stat ... scm_downsamp -> -> llply -> ->
Execution halted
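The message itself comes from base R's sample(), which refuses to draw more items than exist unless replace = TRUE. A minimal sketch of that failure and of a guarded per-cell downsampling step follows. The helper downsample_counts is hypothetical and is not metacell's scm_downsamp; whether the real cause here is a cell with fewer UMIs than the downsampling depth is an assumption, not something the log confirms.

```r
# Base R reproduces the message when more items are requested than exist.
err <- tryCatch(sample(1:5, 10), error = function(e) conditionMessage(e))
print(err)

# Hypothetical helper (NOT metacell's scm_downsamp): downsample one cell's
# UMI counts to n UMIs without replacement, skipping cells that are too small.
downsample_counts <- function(counts, n) {
  if (sum(counts) < n) {
    return(NULL)  # fewer than n UMIs: sampling without replacement is impossible
  }
  # One label per UMI, carrying the index of the gene it belongs to.
  umi_labels <- rep(seq_along(counts), times = counts)
  picked <- umi_labels[sample.int(length(umi_labels), n)]
  tabulate(picked, nbins = length(counts))
}

set.seed(1)
cell <- c(5, 0, 3, 2)
print(downsample_counts(cell, 4))            # 4 UMIs spread over the genes
print(is.null(downsample_counts(cell, 11)))  # cell has only 10 UMIs
```

Skipping small cells keeps the sampling without replacement; switching to replace = TRUE would instead change the statistics of the downsampled counts.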
Cell Ranger (at least v3.1) has changed the naming convention of some output files. In particular, the gene file (gene_fn) was renamed from genes.tsv to features.tsv, which breaks the "guessing step" of functions such as mcell_import_scmat_10x.
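One way to work around the renamed file is to resolve the gene file before importing and pass it explicitly. The sketch below uses only base R; find_gene_file is a hypothetical helper, and the list of candidate names (including a gzipped variant) is an assumption about what the output directory may contain.

```r
# Hypothetical helper: return the first gene/feature file found in a
# 10x output directory, trying the old and the new naming conventions.
find_gene_file <- function(base_dir) {
  candidates <- c("genes.tsv", "features.tsv", "features.tsv.gz")
  paths <- file.path(base_dir, candidates)
  found <- paths[file.exists(paths)]
  if (length(found) == 0) {
    stop("no gene/feature file found in ", base_dir)
  }
  found[1]
}

# Demo on a temporary directory that only contains the new-style name.
demo_dir <- file.path(tempdir(), "tenx_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, "features.tsv")))
print(basename(find_gene_file(demo_dir)))
```

The returned path could then be supplied to the import function's gene file argument instead of relying on the guessing step.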
Original report by JPLau (Bitbucket: 5c2729a816ac1e4f7cbbda0b, ).
Hi,
An error was produced when I ran mcell_coclust_from_graph_resamp (about 130k cells): "Too many child processes". However, my workstation has 2 CPUs with 48 cores and 512 GB RAM, and I didn't notice any memory leaks. Adding "options(tgs_max.processes=16L)" does not change the behavior.
mcell_coclust_from_graph_resamp(
coc_id="fb_raw_filter_coc5000",
graph_id="fb_raw_filter_graph",
min_mc_size=20,
p_resamp=0.75, n_resamp=5000)
Got the following error:
filtered 165099049 left with 2807676 based on co-cluster imbalance
building metacell object, #mc 1729
add batch counts
compute footprints
33%...46%...59%...70%...84%...97%...100%
compute absolute ps
26%...34%...45%...58%...70%...82%...94%...100%
compute coverage ps
26%...35%...48%...59%...72%...85%...98%...100%
reordering metacells by hclust and most variable two markers
reorder on Stmn2 vs Ccnd2
Error in apply(u_gi[fr:to, ], 1, max) :
dim(X) must have a positive length
Calls: generate_mc ... mcell_plot_outlier_heatmap -> mc_compute_outlier_fc -> apply
In addition: Warning messages:
1: In cor(x, ctot) : the standard deviation is zero
2: In cor(x, ctot) : the standard deviation is zero
3: In cor(x, ctot) : the standard deviation is zero
4: In cor(x, ctot) : the standard deviation is zero
Execution halted
Hello, I was wondering: is there an easy way to obtain a list of genes expressed in a specific metacell of interest? Any help would be much appreciated!
Thanks
Alessandra
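A minimal sketch of one way to list genes enriched in a given metacell, assuming a genes x metacells matrix of relative enrichment values is available. In metacell this is commonly taken from the footprint slot of the metacell object (mc@mc_fp); treat that slot name as an assumption. The toy matrix and the helper top_genes_for_mc are hypothetical.

```r
# Toy stand-in for a genes x metacells enrichment ("footprint") matrix.
fp <- matrix(
  c(4,   1, 0.5, 2,
    1,   1, 8,   0.25,
    0.5, 2, 1,   1),
  nrow = 4,
  dimnames = list(paste0("g", 1:4), c("1", "2", "3"))
)

# Hypothetical helper: genes whose log2 enrichment in one metacell
# is at least min_lfc, sorted from strongest to weakest.
top_genes_for_mc <- function(fp, mc_id, min_lfc = 1) {
  lfc <- log2(fp[, as.character(mc_id)])
  sort(lfc[lfc >= min_lfc], decreasing = TRUE)
}

print(top_genes_for_mc(fp, 1))
```

The threshold min_lfc is arbitrary here; "expressed" could equally be defined on absolute expression rather than enrichment.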
Original report by Alexandru Ioan Voda (Bitbucket: 5c7dcdcfa0c37a763f398f75, ).
The package installed alright:
> install.packages('BiocManager')
Installing package into ‘/gfs/devel/avoda/R/x86_64-pc-linux-gnu-library/3.5’
(as ‘lib’ is unspecified)
trying URL 'https://www.stats.bris.ac.uk/R/src/contrib/BiocManager_1.30.4.tar.gz'
Content type 'application/x-gzip' length 242304 bytes (236 KB)
==================================================
downloaded 236 KB
* installing *source* package ‘BiocManager’ ...
** package ‘BiocManager’ successfully unpacked and MD5 sums checked
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (BiocManager)
The downloaded source packages are in
‘/tmp/RtmpYstCG4/downloaded_packages’
> BiocManager::install('metacell', site_repository = 'tanaylab.bitbucket.io/repo', update = FALSE)
Bioconductor version 3.8 (BiocManager 1.30.4), R 3.5.0 (2018-04-23)
Installing package(s) 'metacell'
trying URL 'tanaylab.bitbucket.io/repo/src/contrib/metacell_0.3.32.tar.gz'
Content type 'application/x-tar' length 108522 bytes (105 KB)
==================================================
downloaded 105 KB
* installing *source* package ‘metacell’ ...
** R
** inst
** byte-compile and prepare package for lazy loading
Warning: replacing previous import ‘dplyr::union’ by ‘igraph::union’ when loading ‘metacell’
Warning: replacing previous import ‘dplyr::as_data_frame’ by ‘igraph::as_data_frame’ when loading ‘metacell’
Warning: replacing previous import ‘dplyr::groups’ by ‘igraph::groups’ when loading ‘metacell’
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
Warning: replacing previous import ‘dplyr::union’ by ‘igraph::union’ when loading ‘metacell’
Warning: replacing previous import ‘dplyr::as_data_frame’ by ‘igraph::as_data_frame’ when loading ‘metacell’
Warning: replacing previous import ‘dplyr::groups’ by ‘igraph::groups’ when loading ‘metacell’
The downloaded source packages are in
‘/tmp/RtmpYstCG4/downloaded_packages’
* DONE (metacell)
> library(metacell)
Warning messages:
1: replacing previous import ‘dplyr::union’ by ‘igraph::union’ when loading ‘metacell’
2: replacing previous import ‘dplyr::as_data_frame’ by ‘igraph::as_data_frame’ when loading ‘metacell’
3: replacing previous import ‘dplyr::groups’ by ‘igraph::groups’ when loading ‘metacell’
But the vignette isn't accessible (replicated on two computers, 1 Mac, 1 Linux):
> vignette("metacell")
Warning message:
vignette ‘metacell’ not found
> vignette("metacell-package")
Warning message:
vignette ‘metacell-package’ not found
> vignette("MetaCell")
Warning message:
vignette ‘MetaCell’ not found
Original report by Anonymous.
Hi, I get an error when running mcell_mc_from_coclust_balanced on a custom dataset of about 100,000 cells.
#!R
mcell_mc_from_coclust_balanced(mc_id = "l5_all_mc",
coc_id = "l5_all_coc1000",
mat_id = "l5_all",
K = 20,
min_mc_size = 20,
alpha = 2)
filtered 73766018 left with 7690818 based on co-cluster imbalance
building metacell object, #mc 1034
add batch counts
Error: Must use a vector in `[`, not an object of class matrix.
Call `rlang::last_error()` to see a backtrace
#!R
> rlang::last_trace()
█
1. └─metacell::mcell_mc_from_coclust_balanced(...)
2. ├─metacell::scdb_add_mc(mc_id, tgMCCov(mc, outliers, mat))
3. └─metacell::tgMCCov(mc, outliers, mat)
4. └─methods::new(structure("tgMCCov", package = "metacell"), ...)
5. ├─methods::initialize(value, ...)
6. └─metacell::initialize(value, ...)
7. └─metacell:::.local(.Object, ...)
8. └─metacell::mc_update_stats(.Object, scmat)
9. └─metacell::mc_compute_n_bc(mc, scmat)
10. └─base::table(...)
11. └─base::factor(a, exclude = exclude)
12. └─base::order(y)
13. └─base::lapply(z, function(x) if (is.object(x)) as.vector(xtfrm(x)) else x)
14. └─base:::FUN(X[[i]], ...)
15. ├─base::as.vector(xtfrm(x))
16. ├─base::xtfrm(x)
17. └─base::xtfrm.default(x)
18. ├─base::as.vector(rank(x, ties.method = "min", na.last = "keep"))
19. └─base::rank(x, ties.method = "min", na.last = "keep")
20. ├─x[!nas]
21. └─tibble:::`[.tbl_df`(x, !nas)
22. └─tibble:::check_names_df(i, x)
#!R
> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)
Matrix products: default
BLAS/LAPACK: /usr/lib/libopenblasp-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] metacell_0.3.41
loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 pillar_1.4.0 compiler_3.6.0 plyr_1.8.4
[5] bitops_1.0-6 tools_3.6.0 tibble_2.1.1 gtable_0.3.0
[9] lattice_0.20-38 pkgconfig_2.0.2 rlang_0.3.4 graph_1.62.0
[13] Matrix_1.2-17 igraph_1.2.4.1 rstudioapi_0.10 Rgraphviz_2.28.0
[17] yaml_2.2.0 parallel_3.6.0 cluster_2.0.9 dplyr_0.8.1
[21] pdist_1.2 stats4_3.6.0 grid_3.6.0 tidyselect_0.2.5
[25] glue_1.3.1 R6_2.4.0 tgconfig_0.1.2 ggplot2_3.1.1
[29] purrr_0.3.2 magrittr_1.5 BiocGenerics_0.30.0 scales_1.0.0
[33] tgstat_2.3.4 assertthat_0.2.1 colorspace_1.4-1 entropy_1.2.1
[37] RCurl_1.95-4.12 lazyeval_0.2.2 tgutil_0.1.2 munsell_0.5.0
[41] crayon_1.3.4 dbscan_1.1-3 zoo_1.8-5
Thank you!
Francesco
Original report by Yaniv Lubling (Bitbucket: 557058:11761e0c-f009-41f9-a29c-6f6993e0539c, ).
m_coc = as.matrix(sparseMatrix(i = coc@coclust$node1, j = coc@coclust$node2, x=coc@coclust$cnt))
crashes. i and j expect a numeric, not a character vector as is the case for the 10x matrix input (barcodes)
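A minimal sketch of mapping character barcodes to integer indices before building the sparse matrix. The toy coclust table is hypothetical, and the Matrix call is guarded because only the index mapping is the point of the example.

```r
# Toy stand-in for coc@coclust with character node identifiers.
coclust <- data.frame(
  node1 = c("AAAC-1", "AAAG-1", "AAAC-1"),
  node2 = c("AAAG-1", "AAAT-1", "AAAT-1"),
  cnt   = c(3, 1, 2),
  stringsAsFactors = FALSE
)

# Map every barcode to a stable integer index.
cells <- sort(unique(c(as.character(coclust$node1), as.character(coclust$node2))))
i <- match(as.character(coclust$node1), cells)
j <- match(as.character(coclust$node2), cells)

# Build the sparse matrix with integer indices and keep barcodes as dimnames.
if (requireNamespace("Matrix", quietly = TRUE)) {
  m_coc <- Matrix::sparseMatrix(
    i = i, j = j, x = coclust$cnt,
    dims = c(length(cells), length(cells)),
    dimnames = list(cells, cells)
  )
  print(m_coc)
}
```

Fixing the dimensions to the full cell list also keeps the result the same shape as any other matrix indexed by the same cells.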
There is also an incompatibility between m_coc and m_samp, I guess due to small-cell filtering, but I am not 100% sure.
Best
David
Hi aviezerl,
I am unable to install MetaCell because of tgstat:
ERROR: Unix-only package
* removing 'C:/Users/danmor/Documents/R/win-library/4.0/tgstat'
Error: Failed to install 'metacell' from GitHub:
Failed to install 'tgstat' from GitHub:
(converted from warning) installation of package ‘C:/Users/danmor/AppData/Local/Temp/RtmpIvTXXx/file2714303929c2/tgstat_2.3.16.tar.gz’ had non-zero exit status
Does the "Unix-only package" imply MetaCell can only be installed on a Unix system?
Thank you in advance for your help!
Original report by Eva Xia (Bitbucket: 5bf4282fd3a70b37b7bc0353, ).
Hi,
I tried installing using BiocManager with the command provided but am getting the following error:
ERROR: dependency ‘tgstat’ is not available for package ‘metacell’
I have the latest Mac OS (10.14.1), Bioconductor version 3.8 (BiocManager 1.30.4), and R 3.5.1
Thank you very much for your help.
-Eva
Hi,
Is it possible to increase the number of Metacells to more than 200 (between 800 and 1000) ?
Thanks a lot,
Anissa
Original report by Yaniv Lubling (Bitbucket: 557058:11761e0c-f009-41f9-a29c-6f6993e0539c, ).
Line 229 should assign by colnames and not by indices (although they are probably/usually the same), so it should be:
thresh_K2[as.numeric(names(thresh_Kr))] = as.numeric(colnames(deg_wgt))[thresh_Kr]
Original report by Eric Kernfeld (Bitbucket: 557058:595ef195-9819-4a8e-ac1c-12054f2dcc5f, GitHub: ekernf01).
When I run this code on the attached files (~3k bladder cells from the Tabula Muris 10X data), I get the following error. Do you have any suggestions? Thanks!
#!R
> mcell_mc_from_coclust_balanced(
coc_id="temp_coc500",
mat_id= "temp",
mc_id= "temp_mc",
K=30, min_mc_size=30, alpha= 2)
filtered 0 left with 11026 based on co-cluster imbalance
building metacell object, #mc 1
add batch counts
compute footprints
compute absolute ps
compute coverage ps
reordering metacells by hclust and most variable two markers
Error in apply(mc@cov_gc[marks, ], 1, max) :
dim(X) must have a positive length
> traceback()
4: stop("dim(X) must have a positive length")
3: apply(mc@cov_gc[marks, ], 1, max)
2: mcell_mc_reorder_hc(mc_id)
1: mcell_mc_from_coclust_balanced(coc_id = "temp_coc500", mat_id = "temp",
mc_id = "temp_mc", K = 30, min_mc_size = 30, alpha = 2)
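The log reports "#mc 1", and with a single metacell a plausible explanation is that subsetting the one-column matrix drops it to a plain vector, which apply() rejects. The base R sketch below reproduces that behavior on a toy matrix; whether this is exactly what happens inside mcell_mc_reorder_hc is an assumption.

```r
# Toy stand-in for mc@cov_gc with a single metacell (one column).
cov_gc <- matrix(c(0.2, 0.9, 0.5), ncol = 1,
                 dimnames = list(c("g1", "g2", "g3"), "1"))
marks <- c("g1", "g3")

# Subsetting drops the matrix to a vector, so apply() fails.
err <- tryCatch(apply(cov_gc[marks, ], 1, max),
                error = function(e) conditionMessage(e))
print(err)

# drop = FALSE keeps the matrix shape and apply() works as intended.
ok <- apply(cov_gc[marks, , drop = FALSE], 1, max)
print(ok)
```

Independently of the subsetting issue, a result with one metacell from about 3k cells suggests the co-clustering parameters may need revisiting for such a small dataset.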
Original report by Anonymous.
Hi, an error was produced when I ran mcell_add_cgraph_from_mat_bknn following the pipeline in "Running metacell analysis: guided tutorial on 8K PBMCs".
#!R
> mcell_add_cgraph_from_mat_bknn(mat_id="test",
+ gset_id = "test_feats",
+ graph_id="test_graph",
+ K=100,
+ dsamp=T)
will downsample the matrix, N= 1877
will build balanced knn graph on 8276 cells and 921 genes, this can be a bit heavy for 20,000 cells
Error in tgs_cor_knn(x, k = knn * k_expand) :
argument "y" is missing, with no default
Could you please kindly advise me on this?
Thank you so much!
Best regards, Wenbin Ye
Original report by Jason Ji (Bitbucket: 5c2a37b0fbbe6428a7f336d7, ).
Hi,
Is there a way to get the results for all cells without removing outliers? The function 'mcell_add_cgraph_from_mat_bknn' seems to treat some cells as outliers and the final results(test_2dproj) for these cells cannot be obtained.
The error happened while running the guided tutorial on 8K PBMCs. It ran without problems until the call "mcell_add_cgraph_from_mat_bknn(mat_id="test", gset_id = "test_feats", graph_id="test_graph", K=100, dsamp=T)" generated the error.
Error message:
Error in intI(j, n = x@Dim[2], dn[[2]], give.dn = FALSE) :
invalid character indexing
Calls: mcell_add_cgraph_from_mat_bknn ... callGeneric -> eval -> eval -> [ -> [ -> subCsp_cols -> intI
Moreover: Warning message:
In mclapply(argsList, FUN, mc.preschedule = preschedule, mc.set.seed = set.seed, :
scheduled cores 1, 9 did not deliver results, all values of the jobs will be affected
Execution stopped
the program was executed in a 16GB RAM intel i7 machine
Original report by Anonymous.
Hi Author,
I really like the idea and figures of MetaCell and am trying to apply it to my PBMC dataset.
But the question is: how do I set the 'priority' parameter in pbmc_mc_colorize table? And if I want to use two cell markers to determine a cell type, how do I create it in this table? Does cell clustering heavily rely on 'priority' parameter?
Thanks and best,
Michelle
I am trying to cluster 4286 cells, but only 1172 of them get a cluster assignment at the end of the pipeline. None of them has fewer than 500 UMIs, which is the lower filtering cutoff I'm using. The matrix object attributes(mat$object)$mat still contains 4286 cells at the end, but attributes(mc$object)$mc has only 1172, so I'm confused about the stage at which the cells are excluded from the clustering annotation.
Also, how is mitochondrial expression treated in the metacell pipeline? I don't see any filtering of what could be dying cells.
Thank you
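To see which cells lost their assignment, the cell names of the matrix can be compared with the names of the assignment vector. The sketch below uses toy vectors; in a real session the two inputs would presumably be the column names of the UMI matrix and the names of the metacell assignment vector, which is an assumption about the object layout.

```r
# Toy stand-ins: all cells in the matrix, and the cells that got a metacell.
all_cells <- paste0("cell", 1:6)
mc_assign <- c(cell1 = 1, cell2 = 1, cell4 = 2)

# Cells present in the matrix but absent from the metacell assignment.
unassigned <- setdiff(all_cells, names(mc_assign))
print(unassigned)
print(length(unassigned) / length(all_cells))  # fraction of cells dropped
```

Checking the same difference after each pipeline step narrows down where the cells disappear.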
Original report by David Brocks (Bitbucket: 557058:06895b55-c358-4d77-94c7-e6384e0b676a, GitHub: dave15o1).
how can i read seurat object into metacell?
Original report by Sasikanth M (Bitbucket: 557058:8a01ef4f-c793-43aa-b940-38f63fa4c750, ).
Hi,
I got the following error when running mcell_add_gene_stat
Calculating gene statistics...
will downsamp
done downsamp
will gen mat_n
done gen mat_n
done computing basic gstat, will compute trends
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
Can you help figuring out what might have caused this error?
Thank you
SM
How to find out the quantity of a particular gene in each metacell?
Is there a Metacell function that answers that?
Original report by Anonymous.
Hi Aviezer Lifshitz:
Thank you very much for your reply. The Linux system and R software versions are as follows:
BiocManager::install('metacell', site_repository = 'tanaylab.bitbucket.io/repo', update = FA
Bioconductor version 3.7 (BiocManager 1.30.4), R 3.5.1 (2018-07-02)
Installing package(s) 'metacell'
also installing the dependency ‘tgstat’
downloaded 134 KB
downloaded 6.7 MB
The downloaded source packages are in
‘/tmp/RtmpOvoHsZ/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Warning messages:
1: In install.packages(pkgs = doing, lib = lib, repos = repos, ...) :
installation of package ‘tgstat’ had non-zero exit status
2: In install.packages(pkgs = doing, lib = lib, repos = repos, ...) :
installation of package ‘metacell’ had non-zero exit status
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.6 (Santiago)
Matrix products: default
BLAS: /data/home/heshuai/software/R-3.5.1/lib64/R/lib/libRblas.so
LAPACK: /data/home/heshuai/software/R-3.5.1/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] BiocManager_1.30.4 compiler_3.5.1 tools_3.5.1
we use the linux system as follows:
LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 6.6 (Santiago)
Release: 6.6
Codename: Santiago
Hi Tanay lab,
When running the mcell_mc_split_filt function, followed immediately by mcell_mc2d_force_knn, we get the following error:
Error in mc2d_comp_cell_coord(mc_id, graph_id, mgraph, mc_xy, symmetrize = symmetrize) :
MC-ERR: Length mismatch in number of projected MC and overal mc
Are we missing a step in the pipeline to be able to correct the mismatch?
Thanks for any help, and thanks for the package!
The line
fig_cmp_dir = paste(.scfigs_base, fig_cmp_dir,sep="")
should be
fig_cmp_dir = paste(.scfigs_base, fig_cmp_dir,sep="/")
instead (the sep option should contain "/").
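A small base R illustration of the difference, using stand-in variable names (scfigs_base here replaces the package-internal .scfigs_base). file.path() is shown as an alternative that inserts the separator itself.

```r
scfigs_base <- "figs"
fig_cmp_dir <- "cmp"

print(paste(scfigs_base, fig_cmp_dir, sep = ""))   # components run together
print(paste(scfigs_base, fig_cmp_dir, sep = "/"))  # proposed fix
print(file.path(scfigs_base, fig_cmp_dir))         # same result, separator implied
```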
I am running Metacell, and the gstat table for my set of genes contains an "is_on_count" column. For some genes, for example gene X, a value of 0 was assigned, but when I plot the enrichment bar graph, gene X has log2 fold change values in some metacells. How can this happen? If there was no expression of this gene, why would there be a log2 FC in some metacells?
Original report by Yue Wu (Bitbucket: 5caa79174353071e5657a01a, ).
Hi there,
First of all, congrats on the amazing Metacell package (I forgot to say so last time), and thank you for the kind assistance with the installation problem. Could you bear with me for one more rather dumb question?
I am trying to look into the data in the amazing paper of "Dysfunctional CD8+ T cells form a proliferative, dynamically regulated compartment within human melanoma". However, when I run the
rl(scdb_dir="scrna_db_Li2018", scfigs_dir="figs_Li2018")
It turned out
Error in readLines(con) : cannot open the connection
In addition: Warning message:
In readLines(con) :
Error in readLines(con) : cannot open the connection
6.
readLines(con)
5.
paste(string, collapse = "\n")
4.
yaml.load(readLines(con), error.label = error.label, ...)
3.
yaml::yaml.load_file(config_file, eval.expr = TRUE)
2.
tgconfig::override_params("config/mel_all.yaml", package = "metacell")
1.
rl(scdb_dir = "scrna_db_Li2018/", scfigs_dir = "figs_Li2018/")
I tried specifying the paths in full, but that did not resolve the error:
rl(scdb_dir="/Users//Melenoma/melenoma data/scrna_db_Li2018/", scfigs_dir="/Users//Melenoma/melenoma_data/figs_Li2018/")
Error in readLines(con) : cannot open the connection
In addition: Warning message:
In readLines(con)
For your information (it may not be relevant), I am running this code on a Mac with R 3.5.3.
Could you please kindly advise me on this?
Thank you so much!
Best regards,
Yue
hi,
how to change the parameter of "downsample_n" when running mcell_add_gene_stat?
Thanks
Louis
Original report by Anonymous.
Hi,
Thank you for providing this sc analysis package. I am interested in using this package on a dataset that has been previously filtered. Is it possible to import a previously filtered dataset from a SingleCellExperiment or Seurat object?
If this feature is not available, is it possible to combine multiple 10X datasets during import and maintain the identity from each corresponding batch?
Best,
Michael
Hello,
I am running Metacell with a public dataset and I encountered an error in the supervised filtering of features step.
I have been trying to apply the code provided in the vignettes and I came across an error running the mcell_mc_split_filt function. I looked at the source code to figure out why this error is produced, but couldn't understand exactly what may be causing it. The error goes as follows:
Error in .local(.Object, ...) :
matrix cell names and mc assignments+outliers differ in tgMCCov initialization
In addition: Warning message:
In mclapply(nms_mc, split_dbscan, mc.cores = mc_cores) :
all scheduled cores encountered errors in user code
The code I ran and the session info is provided below.
library("metacell")
#must set up a directory to save Robjects==========================================================
folder.path <- '~/path/to/folder/'
data.name <- 'gsexxx'
#========================================================================================
if(!dir.exists(folder.path)) dir.create(folder.path)
scdb_init(folder.path, force_reinit=T)
#load data directory(barcodes genes matrix) to scmat format============================================
mcell_import_scmat_10x(data.name, base_dir = '~/path/to/data/10xformat/')
mat = scdb_mat(data.name)
#=========================================================================================
print(dim(mat@mat))
if(!dir.exists(paste0(folder.path,'figs/'))) dir.create(paste0(folder.path,"figs/"))
scfigs_init(paste0(folder.path,"figs/"))
mcell_plot_umis_per_cell(data.name)
mat = scdb_mat(data.name)
nms = c(rownames(mat@mat), rownames(mat@ignore_gmat))
ig_genes = c(grep("^IGJ", nms, v=T),
grep("^IGH",nms,v=T),
grep("^IGK", nms, v=T),
grep("^IGL", nms, v=T))
bad_genes = unique(c(grep("^MT-", nms, v=T), grep("^MTMR", nms, v=T), grep("^MTND", nms, v=T),"NEAT1","TMSB4X", "TMSB10", ig_genes))
bad_genes
mcell_mat_ignore_genes(new_mat_id=data.name, mat_id=data.name, bad_genes, reverse=F)
mcell_mat_ignore_small_cells(data.name, data.name, 800)
mcell_add_gene_stat(gstat_id=data.name, mat_id=data.name, force=T)
mcell_gset_filter_varmean(gset_id=paste0(data.name,' _feats'), gstat_id=data.name, T_vm=0.08, force_new=T) #scaled variance 0.08
mcell_gset_filter_cov(gset_id = paste0(data.name,' _feats'), gstat_id=data.name, T_tot=100, T_top3=2) #genes expressed at least in 100 cells
mcell_plot_gstats(gstat_id=data.name, gset_id=paste0(data.name,' _feats'), fig_dir = paste0(folder.path, 'figs/'))
genes_anchors = c('MKI167','PCNA', 'TOP2A', 'TXN', 'HSP90AB1', 'FOS')
tab_fn = paste(paste0(folder.path,'figs/'), "lateral_gmods.txt", sep="/")
gset_nm = "lateral"
mcell_mat_rpt_cor_anchors(mat_id=data.name, gene_anchors = genes_anchors,
cor_thresh = 0.1, #this is the threshold, genes above this will be filtered (low because it's scRNA)
gene_anti = c(), tab_fn = tab_fn,
sz_cor_thresh = 0.2) # this threshold is size, some cell cycle genes are correlated with size as well
gcor_mat = read.table(tab_fn, header=T)
head(gcor_mat)
print(dim(gcor_mat))
foc_genes = apply(gcor_mat[, genes_anchors], 1, which.max)
gset = gset_new_gset(sets = foc_genes, desc = "Cell cycle and stress correlated genes")
scdb_add_gset(gset_nm, gset)
sub_mat_id = paste(data.name, gset_nm, sep="_")
mcell_mat_ignore_genes(new_mat_id = sub_mat_id, mat_id = data.name, ig_genes = names(foc_genes), reverse = T)
num.cluster <- as.numeric(20)
mcell_gset_split_by_dsmat(gset_id = gset_nm, mat_id = sub_mat_id,
K = num.cluster)
gset = scdb_gset(gset_nm)
print(gset)
mcell_plot_gset_cor_mats(gset_id = gset_nm, scmat_id = sub_mat_id)
#===============================================================================
clusters <- seq(1,num.cluster)
good.genes <- c(9,8,20,17,16,15,14) #show no correlation with cell size
technical.genes <- clusters[!clusters %in% good.genes]
#================================================================================
mcell_gset_remove_clusts(gset_id = gset_nm,
filt_clusts = technical.genes,
new_id = paste0(gset_nm, "_filtered"),
reverse=T)
lateral_gset_id = paste0(gset_nm, "_filtered")
lateral_gset = scdb_gset(lateral_gset_id)
print(lateral_gset)
marker_gset = scdb_gset(paste0(data.name,' _feats'))
marker_gset = gset_new_restrict_gset(gset = marker_gset,
filt_gset = lateral_gset, inverse = T,
desc = "cgraph gene markers w/o lateral genes")
scdb_add_gset(paste0(data.name,"_feats_filtered"), marker_gset)
mcell_add_cgraph_from_mat_bknn(mat_id=data.name,
gset_id = paste0(data.name,"_feats_filtered"),
graph_id=paste0(data.name,"_graph_filtered"),
K=100,
dsamp=T)
mcell_coclust_from_graph_resamp(
coc_id=paste0(data.name,"_coc500_filtered"),
graph_id=paste0(data.name,"_graph_filtered"),
min_mc_size=20,
p_resamp=0.75,
n_resamp=500)
mcell_mc_from_coclust_balanced(
coc_id=paste0(data.name, "_coc500_filtered"),
mat_id= data.name,
mc_id= paste0(data.name,"_filtered_mc"),
K=30, #number of neighbors we wish to minimally associate with each cell (determines the size of metacells)
min_mc_size=30,
alpha=2)
mcell_mc_split_filt(new_mc_id=paste0(data.name, "_filtered_mc_f"),
mc_id = paste0(data.name, "_filtered_mc"),
mat_id=data.name,
T_lfc=3, plot_mats=F)
sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] metacell_0.3.41
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 RColorBrewer_1.1-2 plyr_1.8.6 pillar_1.4.4 compiler_3.6.0 iterators_1.0.12
[7] bitops_1.0-6 tools_3.6.0 lifecycle_0.2.0 tibble_3.0.1 gtable_0.3.0 lattice_0.20-41
[13] pkgconfig_2.0.3 rlang_0.4.7 foreach_1.5.0 graph_1.64.0 Matrix_1.2-18 igraph_1.2.5
[19] rstudioapi_0.11 Rgraphviz_2.30.0 yaml_2.2.1 parallel_3.6.0 cluster_2.1.0 dplyr_0.8.5
[25] doMC_1.3.6 vctrs_0.3.1 pdist_1.2 stats4_3.6.0 grid_3.6.0 tidyselect_1.1.0
[31] glue_1.4.1 R6_2.4.1 tgconfig_0.1.2 pheatmap_1.0.12 ggplot2_3.3.2 purrr_0.3.4
[37] magrittr_1.5 codetools_0.2-16 BiocGenerics_0.32.0 scales_1.1.1 ellipsis_0.3.1 tgstat_2.3.5
[43] assertthat_0.2.1 colorspace_1.4-1 entropy_1.2.1 RCurl_1.98-1.2 tgutil_0.1.2 munsell_0.5.0
[49] crayon_1.3.4 dbscan_1.1-5 zoo_1.8-8
Original report by Anonymous.
When I try to run:
mcell_add_gene_stat(gstat_id="mat_tk01", mat_id="mat_tk01", force=T) on my matrix mat_tk01, I get this error after it tries to downsample the data:
Error in do.ply(i) : task 1 failed - "cannot take a sample larger than the population when 'replace = FALSE'"
The matrix has 9498 genes and 4514 cells. How can I resolve this? I can't get past this step: even if I make my own gene set for building the kNN graph, the issue then arises with the command mcell_add_cgraph_from_mat_bknn.
Hi,
In applying gset_import_table I received the error that object 'set' is not found.
I had a closer look at the function definition here and I think there is a typo here:
#' Import gene set for a text table
#
#' @param fn file name to read from
#' @param desc tetual description of the gset, if null the desc will be the file name
#' @export
gset_import_table = function(fn, desc=NULL)
{
sets = read.table(fn, h=T, sep="\t")
if(is.null(desc)) {
desc = fn
}
if("gene" %in% colnames(sets) & "set" %in% colnames(sets)) {
gs = sets$set
names(gs) = set$gene #### <<<< Here set$gene should be sets$gene
return(tgGeneSets(gs, desc))
} else {
stop("cannot initialize gset from ", fn, " file must be a tab delim table with a header including the fields gene and set\n")
}
}
As indicated in the code above, I think that in the line names(gs) = set$gene, set$gene should be sets$gene.
I added this slightly amended function definition to my code and the problem was resolved. Is this the correct fix for the problem?
Hi,
Thank you for this amazing tool!
I was reading the published paper in Genome Biology and noticed in the 'Methods' section->'Defining the metacell gene expression profile' part, the equation for calculating a regularized geometric mean for gene expression intensity with each metacell is:
When I came to the R package code for this part (I hope I located it correctly), the mc_compute_fp function in the mc.r script, it goes like this:
clust_geomean = t(tgs_matrix_tapply(us[f_g_cov,], mc@mc, function(y) {exp(mean(log(1+y)))-1}))
It looks like the minus 1 part is different between source code and the equation in the main text, in the source code, '-1' is done after exp(...), while in the main text method, '-1' is inside exp, i.e., exp(... -1 ).
I wonder if it's a typo in the published paper or there is something wrong with the code.
Bests!
Ming
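The two placements of the "-1" can be compared numerically. The sketch below only contrasts the expression quoted from the code with the variant the report attributes to the paper; it does not claim which one the authors intended. The function names are hypothetical.

```r
# Variant quoted from the package code: subtract 1 after exponentiating.
geomean_code <- function(y) exp(mean(log(1 + y))) - 1

# Variant as the report reads the paper: subtract 1 inside the exponent.
geomean_paper <- function(y) exp(mean(log(1 + y)) - 1)

y_zero <- c(0, 0, 0)
y_some <- c(0, 1, 3)   # (1 * 2 * 4)^(1/3) = 2, so the first variant gives 1

print(c(code = geomean_code(y_zero), paper = geomean_paper(y_zero)))
print(c(code = geomean_code(y_some), paper = geomean_paper(y_some)))
```

Only the first variant maps a gene with all-zero counts to zero, which is what undoing the log(1 + y) regularization would require.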
Calling mcell_mc_plot_by_factor in a fresh environment returns an error:
> mcell_mc_plot_by_factor(mc_id_f, meta_field = 'batch_set_id', mat_id)
Error in png(filename = sub("ps$", "png", fn), width = w, height = h, :
invalid 'width' argument
This seems to happen as a result of internally looking up the mcp_heatmap_width parameter, which is apparently absent by default.
> tgconfig::get_param("mcp_heatmap_width", "metacell")
NULL
A workaround is to declare mcp_heatmap_width before calling mcell_mc_plot_by_factor to make it run, e.g.:
> tgconfig::set_param("mcp_heatmap_width", 1000, 'metacell')
> mcell_mc_plot_by_factor(mc_id_f, meta_field = 'batch_set_id', mat_id)
A simple internal assertion in mcell_mc_plot_by_factor should solve this.
if (is.null(tgconfig::get_param("mcp_heatmap_width", "metacell"))) {
  tgconfig::set_param("mcp_heatmap_width", 1000, "metacell")
}
mcp_heatmap_width = tgconfig::get_param("mcp_heatmap_width", "metacell")
Hi,
I am running this function mcell_mc_from_coclust_balanced with your data.
I am getting this error:
Error in thresh_K[as.numeric(names(thresh_Kr))] <- thresh_Kr :
NAs are not allowed in subscripted assignments
In addition: Warning message:
In mcell_coclust_filt_by_k_deg(coc_id, K, alpha) :
NAs introduced by coercion
Could you help me, please?
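Both messages can be reproduced in base R when the names used as indices are not numeric, for example when they are cell barcodes. The sketch below uses toy data; that non-numeric names are the cause in this particular run is an assumption.

```r
# Toy vector whose names are partly non-numeric (e.g. a cell barcode).
thresh_Kr <- c("1" = 5, "2" = 7, "AAAC-1" = 9)

# Coercing the names to numbers yields NA for the barcode
# (this is the "NAs introduced by coercion" warning, silenced here).
idx <- suppressWarnings(as.numeric(names(thresh_Kr)))
print(idx)

# Assigning through an index vector that contains NA then fails.
thresh_K <- numeric(3)
err <- tryCatch({
  thresh_K[idx] <- thresh_Kr
  "no error"
}, error = function(e) conditionMessage(e))
print(err)
```

Indexing by name, or mapping names to positions with match(), avoids the numeric coercion altogether.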
Hello world,
I am running this package with 100,000 cells, but I am stuck at the function mcell_mat_rpt_cor_anchors.
I am getting this error:
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 102
Could you help me, please?
When I ran mcell_import_scmat_tsv(mat_nm, fn, meta_fn), I got the following error message:
"Error in .scm_init_md(.Object, cell_metadata) :
Metadata cells names are incompatible with matrix cells - aborting".
I checked the code of mcell_import_scmat_tsv and found that the problem comes from "md = fread(meta_fn, sep="\t")". When this line runs, the cell metadata is read into a data frame in which the row names of the original table become a regular column named "V1", so the data frame has one additional column. The metadata data frame should be processed the same way as the umis data frame. If possible, could you please fix this bug? Thank you very much.
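A base R sketch of the file layout that produces this situation: a table written with row names has a header that is one field shorter than the data rows. read.table() then uses the first column as row names, whereas a reader that keeps it as a regular column needs an explicit conversion step. The toy metadata is hypothetical, and that the reporter's file has this layout is an assumption.

```r
# Toy metadata with cell names as row names.
md <- data.frame(batch = c("a", "b"), n_umi = c(1200, 900),
                 row.names = c("cell1", "cell2"))

# write.table() emits a header with one field fewer than the data rows.
fn <- tempfile(fileext = ".tsv")
write.table(md, fn, sep = "\t", quote = FALSE)
print(readLines(fn))

# read.table() detects the short header and restores the row names.
md_in <- read.table(fn, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
print(rownames(md_in))
print(colnames(md_in))
```

With a reader that returns the names as a first column instead, moving that column into the row names before building the matrix object gives the same result.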
Original report by Eric Kernfeld (Bitbucket: 557058:595ef195-9819-4a8e-ac1c-12054f2dcc5f, GitHub: ekernf01).
Hi Tanay Lab,
Installing tgstat on OSX 10.11.6 gives me this error.
./tgstat.h:270:16: error: use of undeclared identifier 'CLOCK_REALTIME'
clock_gettime(CLOCK_REALTIME, &t2);
^
It looks like it is a Mac OS issue, not a problem with your package, but maybe the package can be made compatible with old OSX.
https://stackoverflow.com/questions/5167269/clock-gettime-alternative-in-mac-os-x
Original report by Yue Wu (Bitbucket: 5caa79174353071e5657a01a, ).
Hi there~
Hope all is well! I am creating this issue because I ran into the following problem when trying to install the Metacell package in R.
install.packages("BiocManager")
trying URL 'https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.5/BiocManager_1.30.4.tgz'
Content type 'application/x-gzip' length 287459 bytes (280 KB)
==================================================
downloaded 280 KB
The downloaded binary packages are in
/var/folders/0_/sfnkh6s95xsb48vt4w2ytsbr1b5tp_/T//RtmpuhOyVK/downloaded_packages
BiocManager::install('metacell', site_repository = 'tanaylab.bitbucket.io/repo', update = FALSE)
Bioconductor version 3.8 (BiocManager 1.30.4), R 3.5.3 (2019-03-11)
Installing package(s) 'metacell'
also installing the dependencies ‘data.table’, ‘tgutil’
Warning: unable to access index for repository tanaylab.bitbucket.io/repo/bin/macosx/el-capitan/contrib/3.5:
cannot open URL 'tanaylab.bitbucket.io/repo/bin/macosx/el-capitan/contrib/3.5/PACKAGES'
There is a binary version available but the source version is later:
binary source needs_compilation
data.table 1.12.0 1.12.2 TRUE
Do you want to install from sources the package which needs compilation? (Yes/no/cancel) Yes
installing the source packages ‘data.table’, ‘tgutil’, ‘metacell’
downloaded 3.2 MB
downloaded 37 KB
downloaded 5.1 MB
The downloaded source packages are in
‘/private/var/folders/0_/sfnkh6s95xsb48vt4w2ytsbr1b5tp_/T/RtmpuhOyVK/downloaded_packages’
library(metacell)
Error in library(metacell) : there is no package called ‘metacell’
Could you please help with this problem?
Thank you so much!
Original report by Hong Ma (Bitbucket: 5bd1df58eee9a94b12bd8e52, ).
Hi,
Could you please give me some suggestions about installation? Thank you very much!
I'm using R 3.5.1 on Windows 10. Installation keeps failing with errors like:
BiocManager::install('metacell', site_repository = 'tanaylab.bitbucket.io/repo', update = FALSE)
Bioconductor version 3.7 (BiocManager 1.30.3), R 3.5.1 (2018-07-02)
Installing package(s) 'metacell'
Warning: unable to access index for repository tanaylab.bitbucket.io/repo/src/contrib:
scheme not supported in URL 'tanaylab.bitbucket.io/repo/src/contrib/PACKAGES'
Warning: unable to access index for repository tanaylab.bitbucket.io/repo/bin/windows/contrib/3.5:
scheme not supported in URL 'tanaylab.bitbucket.io/repo/bin/windows/contrib/3.5/PACKAGES'
Warning message:
package ‘metacell’ is not available (for R version 3.5.1)
I also tried to install the .zip file from R, but it still fails:
install.packages("C:/Users/Yuhan/Downloads/tanaylab-metacell-9b0544b282e6.zip", repos = NULL, type = "win.binary")
Installing package into ‘C:/Users/Yuhan/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)
library(metacell)
Error in library(metacell) : there is no package called ‘metacell’
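The "scheme not supported in URL" warnings above are what R reports when a repository string has no protocol prefix such as `https://`. A minimal sketch of checking for that before installing follows; `ensure_scheme` is a hypothetical helper written for illustration, not part of metacell or BiocManager, and whether the repository is actually served at the resulting address is an assumption.

```r
# Hypothetical helper: prepend a scheme when a repository string lacks one.
ensure_scheme <- function(url, scheme = "https") {
  has_scheme <- grepl("^[A-Za-z][A-Za-z0-9+.-]*://", url)
  ifelse(has_scheme, url, paste0(scheme, "://", url))
}

repo <- ensure_scheme("tanaylab.bitbucket.io/repo")
print(repo)

# Not run here: the install call from the report, with the scheme added.
# Assumes the repository is reachable at this address.
# BiocManager::install("metacell", site_repository = repo, update = FALSE)
```

Separately, the downloaded .zip looks like a snapshot of the source tree rather than a compiled Windows binary, so `type = "win.binary"` is unlikely to install it even when R reports no error.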
Original report by Elad Chomsky (Bitbucket: 557058:d5a1345c-78db-4c3d-8734-e95917451718, GitHub: echomsky).
Calling scm_ignore_cells() restructures tgScMat@mat and tgScMat@cell, but does not ensure that tgScMat@cell_metadata matches the new order of cells.
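A minimal sketch of the kind of realignment this report asks for, using a plain matrix and data frame as toy stand-ins rather than the real tgScMat slots (all object names below are illustrative assumptions): after dropping cells, the metadata is reindexed by cell name so that its rows follow the matrix columns.

```r
# Toy stand-ins for the count matrix and per-cell metadata (not real tgScMat slots).
mat <- matrix(1:12, nrow = 3,
              dimnames = list(paste0("g", 1:3), paste0("c", 1:4)))
cell_metadata <- data.frame(batch = c("a", "a", "b", "b"),
                            row.names = paste0("c", 1:4))

# Drop the ignored cells from the matrix.
ignore <- c("c2")
keep <- setdiff(colnames(mat), ignore)
mat_kept <- mat[, keep, drop = FALSE]

# Reindex the metadata by cell name so its rows follow the matrix columns.
md_kept <- cell_metadata[colnames(mat_kept), , drop = FALSE]

# Guard against silent misalignment.
stopifnot(identical(rownames(md_kept), colnames(mat_kept)))
```

Indexing by name rather than by position keeps the two structures aligned even if the filtering step also reorders the cells.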
Original report by Anonymous.
Hi there,
I just updated to metacell 0.3.27 and I'm now getting an error in mcell_mc_plot_hierarchy: Error in plot.new() : figure margins too large.
It worked for me before the update.
See details below:
library("metacell")
Warning messages:
1: replacing previous import ‘dplyr::union’ by ‘igraph::union’ when loading ‘metacell’
2: replacing previous import ‘dplyr::as_data_frame’ by ‘igraph::as_data_frame’ when loading ‘metacell’
3: replacing previous import ‘dplyr::groups’ by ‘igraph::groups’ when loading ‘metacell’
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS
Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] metacell_0.3.27
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 pillar_1.3.0 compiler_3.5.1
[4] plyr_1.8.4 bindr_0.1.1 bitops_1.0-6
[7] tibble_1.4.2 gtable_0.2.0 lattice_0.20-38
[10] pkgconfig_2.0.2 rlang_0.3.0.1 graph_1.60.0
[13] Matrix_1.2-15 igraph_1.2.2 Rgraphviz_2.26.0
[16] yaml_2.2.0 parallel_3.5.1 bindrcpp_0.2.2
[19] cluster_2.0.7-1 dplyr_0.7.8 pdist_1.2
[22] stats4_3.5.1 grid_3.5.1 tidyselect_0.2.5
[25] glue_1.3.0 R6_2.3.0 tgconfig_0.0.21
[28] ggplot2_3.1.0 purrr_0.2.5 magrittr_1.5
[31] BiocGenerics_0.28.0 scales_1.0.0 tgstat_2.3.1
[34] assertthat_0.2.0 colorspace_1.3-2 entropy_1.2.1
[37] RCurl_1.95-4.11 lazyeval_0.2.1 tgutil_0.0.3
[40] munsell_0.5.0 crayon_1.3.4 dbscan_1.1-3
[43] zoo_1.8-4
*** caught segfault ***
address 0x558cbaa2fc00, cause 'memory not mapped'
Traceback:
1: tgs_graph_cover(edges, min_mc_size, cooling = tgs_clust_cool, burn_in = tgs_clust_burn)
2: mcell_mc_from_coclust_balanced(coc_id = "retina_coc500", mat_id = "retina", mc_id = "retina_mc", K = 30, min_mc_size = 30, alpha = 2)
3: eval(expr, envir, enclos)
4: eval(expr, envir, enclos)
5: withVisible(eval(expr, envir, enclos))
6: withCallingHandlers(withVisible(eval(expr, envir, enclos)), warning = wHandler, error = eHandler, message = mHandler)
7: doTryCatch(return(expr), name, parentenv, handler)
8: tryCatchOne(expr, names, parentenv, handlers[[1L]])
9: tryCatchList(expr, classes, parentenv, handlers)
10: tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call)[1L] prefix <- paste("Error in", dcall, ": ") LONG <- 75L sm <- strsplit(conditionMessage(e), "\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (is.na(w)) w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b") if (w > LONG) prefix <- paste0(prefix, "\n ") } else prefix <- "Error : " msg <- paste0(prefix, conditionMessage(e), "\n") .Internal(seterrmessage(msg[1L])) if (!silent && isTRUE(getOption("show.error.messages"))) { cat(msg, file = outFile) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e))})
11: try(f, silent = TRUE)
12: handle(ev <- withCallingHandlers(withVisible(eval(expr, envir, enclos)), warning = wHandler, error = eHandler, message = mHandler))
13: timing_fn(handle(ev <- withCallingHandlers(withVisible(eval(expr, envir, enclos)), warning = wHandler, error = eHandler, message = mHandler)))
14: evaluate_call(expr, parsed$src[[i]], envir = envir, enclos = enclos, debug = debug, last = i == length(out), use_try = stop_on_error != 2L, keep_warning = keep_warning, keep_message = keep_message, output_handler = output_handler, include_timing = include_timing)
15: evaluate(request$content$code, envir = .GlobalEnv, output_handler = oh, stop_on_error = 1L)
16: doTryCatch(return(expr), name, parentenv, handler)
17: tryCatchOne(expr, names, parentenv, handlers[[1L]])
18: tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
19: doTryCatch(return(expr), name, parentenv, handler)
20: tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), names[nh], parentenv, handlers[[nh]])
21: tryCatchList(expr, classes, parentenv, handlers)
22: tryCatch(evaluate(request$content$code, envir = .GlobalEnv, output_handler = oh, stop_on_error = 1L), interrupt = function(cond) { log_debug("Interrupt during execution") interrupted <<- TRUE}, error = .self$handle_error)
23: executor$execute(msg)
24: handle_shell()
25: kernel$run()
26: IRkernel::main()
An irrecoverable exception occurred. R is aborting now ...
Original report by Eric Kernfeld (Bitbucket: 557058:595ef195-9819-4a8e-ac1c-12054f2dcc5f, GitHub: ekernf01).
Hi Tanay lab,
Maybe you know about this already, but the vignette links give "file not found". Example:
https://tanaylab.bitbucket.io/metacell-r/articles/amphimedon.html
If I have a SingleCellExperiment object, I can convert it to a metacell object.
Now, how do I assign an id to this metacell object so I can call functions on it (e.g. mcell_plot_umis_per_cell)? Or is there a way to call functions on metacell objects directly, without referring to their ids?
Thanks!
Original report by Anonymous.
BiocManager::install('metacell')
Bioconductor version 3.8 (BiocManager 1.30.1), R 3.5.1 (2018-07-02)
Installing package(s) 'metacell'
Update old packages: 'BiocManager'
Update all/some/none? [a/s/n]: n
Warning message:
package metacell is not available (for R version 3.5.1)
Hi,
Thanks for developing this package and for providing the vignettes. They are really helpful.
I unfortunately ran into an error while running the 'supervised filtering of feature genes' vignette. When I want to generate the gene-gene correlation matrix I get the following error:
Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : invalid character indexing
I ran the other vignette prior to running this one, as described in the tutorial. I also noticed that when I replace the gene_anchors with other genes (like LAG3 or CD8B) it worked.
Could you help figuring out what causes this error?
Much appreciated,
Nordin
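One plausible cause of "invalid character indexing", offered as an assumption rather than a confirmed diagnosis, is indexing a sparse matrix by gene names that are not among its row names, for example because an anchor gene was filtered out earlier. A minimal sketch with a toy sparse matrix from the Matrix package (the gene list and matrix are made up for illustration) checks the names before subsetting:

```r
library(Matrix)

# Toy sparse UMI matrix; the genes and counts are illustrative only.
umis <- Matrix(c(0, 2, 0, 1, 0, 3), nrow = 3, sparse = TRUE,
               dimnames = list(c("LAG3", "CD8B", "GZMK"), c("c1", "c2")))

# Hypothetical anchor list in which one gene is absent from the matrix.
gene_anchors <- c("LAG3", "FOXP3")

# Report anchors that are missing instead of indexing with them.
missing_genes <- setdiff(gene_anchors, rownames(umis))
if (length(missing_genes) > 0) {
  message("Anchors not found in matrix: ", paste(missing_genes, collapse = ", "))
}

# Subset only on the anchors that exist.
present_genes <- intersect(gene_anchors, rownames(umis))
sub <- umis[present_genes, , drop = FALSE]
```

If the reported error goes away once the missing names are removed, the anchor list and the matrix gene names disagree, which would also be consistent with other anchors such as LAG3 or CD8B working.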
Original report by Anonymous.
Hi,
Congrats on the nice paper on bR.
Is there a manual for using Metacell, or is it not open to the public yet?
Thanks,
Qingsong