kieranrcampbell / clonealign
Bayesian inference of clone-specific gene expression estimates by integrating single-cell RNA-seq and single-cell DNA-seq data
License: Apache License 2.0
Hi @kieranrcampbell,
Congratulations on this amazing tool! I have one question about a parameter I cannot find how to set. The help page for "clonealign::clonealign" describes a "size_factors" parameter that can be set to "infer" or "fixed", which, if I understand correctly, is "s_n" in the main clonealign equation (correct?). However, the actual function definition has no such parameter. Has the option been removed, or is it now set to "infer" by default?
Hi,
I was wondering if you have any hints on why I get the error "Initial elbo is NA"?
I run into this error occasionally. It went away when I removed features with either NAs or zeros in all cells, but sometimes it still occurs, and I was wondering why. Could it also be because the tf$session did not run at all?
clonealign/R/inference-tflow.R
Line 369 in 289b9a2
Hi @kieranrcampbell,
I have a question, possibly related to issue 13, about model fits and the interpretation of outputs. In the output of plot_clonealign (version 2.0), I see that the difference in scaled log-normalized RNA expression between two clones does not proportionally reflect the difference in copy number (CN). For example, on one chromosome where the CN is 3 in both clones, so the CN difference is zero, I do not find a zero difference between the RNA z-scores of the same two clones. It also happens the other way round: where the CN differs, I do not necessarily see a roughly proportional difference in the corresponding RNA expression. What I can mostly see is that the sign of the difference is consistent between DNA and RNA.
Similar behaviour seems to be present in the plots of the main paper, so I guess it's alright and I just wanted to check. Is there a clonealign plotting function that adds RNA counts to the above plots?
The inference runs through one ELBO computation and throws an error (below) if rownames(L) and colnames(Y) don't match or are both NULL.
Constructing tensorflow graph
Removing 0 genes with low counts
Optimizing ELBO
running VB [=====================>] 100% | elbo -931590.1875 | chg elbo 0.0064%
ELBO converged or reached max iterations
Computing final ELBO
Error in L[tflow_res$retained_genes, ] : subscript out of bounds
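A check along these lines, run before inference, would surface the problem earlier than the "subscript out of bounds" error. This is only a minimal sketch in base R: it assumes Y is a cell-by-gene matrix and L a gene-by-clone matrix, and check_gene_names is a hypothetical helper, not a clonealign function.

```r
# Hypothetical helper (not part of clonealign): verify that the gene names
# of the expression matrix Y (cells x genes) and the copy number matrix L
# (genes x clones) are present and identical before running inference.
check_gene_names <- function(Y, L) {
  if (is.null(colnames(Y)) || is.null(rownames(L))) {
    stop("colnames(Y) and rownames(L) must both be set")
  }
  if (!identical(colnames(Y), rownames(L))) {
    stop("colnames(Y) and rownames(L) must match, in the same order")
  }
  invisible(TRUE)
}

# Toy data: 4 cells, 3 genes, 2 clones
genes <- c("g1", "g2", "g3")
Y <- matrix(rpois(12, 5), nrow = 4, dimnames = list(NULL, genes))
L <- matrix(c(2, 2, 3, 2, 1, 3), nrow = 3,
            dimnames = list(genes, c("A", "B")))

ok <- check_gene_names(Y, L)        # passes silently

# Dropping the row names reproduces the failing situation
L_unnamed <- L
rownames(L_unnamed) <- NULL
res <- try(check_gene_names(Y, L_unnamed), silent = TRUE)
inherits(res, "try-error")          # TRUE: names are missing
```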
Hi,
What do you think would be a proper way to quantify the consistency between DNA and RNA data in terms of clonal frequency?
Without SIDR-seq / G&T-seq technology available, it is still possible to approximate G&T by performing single-cell DNA-seq and RNA-seq on two groups of cells randomly selected from the same cell suspension. In theory, the scDNA-seq and scRNA-seq should reflect the same cell content, e.g., clones. Therefore, the clonal frequencies estimated from DNA-seq and inferred by CloneAlign should be similar. This is also what your paper reports:
1152 single-cells post-QC (methods) were assigned to clones A, B, and C with prevalence of 80.6%, 13.8%, and 5.6%, closely matching the expected proportions inferred from the single-cell DNA-seq (82.3%, 10.8%, and 6.9%).
But my question is how to quantify the consistency. Your paper more or less eyeballs the similarity. I was thinking of chisq.test, but I don't know if it makes sense.
m <- 1152
pa <- 80.6/100
pb <- 13.8/100
pc <- 5.6/100
stopifnot(sum(c(pa, pb, pc)) == 1)
ea <- 82.3/100
eb <- 10.8/100
ec <- 6.9/100
stopifnot(sum(c(ea, eb, ec)) == 1)
chisq.test(m * c(pa, pb, pc), p = c(ea, eb, ec), rescale.p=F)
# X-squared = 12.826, df = 2, p-value = 0.00164
The p-value suggests that the clonal frequencies from DNA and CloneAlign are significantly different, which goes against the presumption. I don't mean to challenge the result: in this specific case, I noticed the cells may not come from the same cell suspension, which violates the presumption.
We linked gene expression to clones in SA501 by generating single-cell RNA-seq from the SA501X2B xenograft passage using 10X genomics (methods) and assigned each cell to a clone (A, B or C) using clonealign.
In sum, my question is how to quantify the consistency of clonal frequencies instead of relying on a 'human-like' guess.
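One possible complement to the p-value is an effect size, since with 1152 cells even small differences in proportion become significant. The sketch below is an assumption about what might be useful, not something taken from the paper or the package: it reuses the numbers quoted above and computes Cohen's w and the total variation distance in base R.

```r
# Observed clone proportions from clonealign and expected proportions from
# scDNA-seq (numbers quoted above)
m <- 1152
observed <- c(A = 80.6, B = 13.8, C = 5.6) / 100
expected <- c(A = 82.3, B = 10.8, C = 6.9) / 100

# Chi-squared statistic, computed by hand so the effect size is explicit
counts <- m * observed
x2 <- sum((counts - m * expected)^2 / (m * expected))

# Cohen's w: chi-squared statistic scaled by the sample size
cohens_w <- sqrt(x2 / m)

# Total variation distance: half the summed absolute differences, i.e. the
# fraction of cells that would have to change clone for the two to agree
tvd <- 0.5 * sum(abs(observed - expected))

round(c(x2 = x2, cohens_w = cohens_w, tvd = tvd), 3)
```

With these numbers the total variation distance is 0.03, so about 3% of cells would need to be reassigned for the proportions to agree, and Cohen's w is around 0.1, which is conventionally read as a small effect. Unlike the p-value, neither measure grows with the number of cells.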
This is currently handled via setting lambda to 1. Should instead modify the likelihood so it's Pois(0.1) or similar
When colnames(L) are NULL, they get named with
colnames(Y) <- paste0("gene_", letters[seq_len(ncol(Y))])
Wouldn't that give gene_NA if there are more than 26 genes?
https://github.com/kieranrcampbell/clonealign/blob/master/R/clonealign.R#L257
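The behaviour is easy to reproduce in base R. The sketch below uses a made-up gene count of 30; the numeric naming scheme is just one possible alternative, not necessarily what the package should adopt.

```r
n_genes <- 30

# letters has only 26 elements, so indices 27 and above return NA
bad <- paste0("gene_", letters[seq_len(n_genes)])
tail(bad, 4)   # the last four names are all "gene_NA"

# Naming by position instead yields unique names for any number of genes
good <- paste0("gene_", seq_len(n_genes))
anyDuplicated(good) == 0   # TRUE: no duplicated names
```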
Hi Kieran,
Did you assess the performance of the new version of the package against the previous one? I have noticed some things about the initial_shrinks parameter; initial_shrinks seems to give more stability. Does this make sense? What does this parameter actually do? It's hard to work out from the code. It would be nice to have some idea of how this new model works. Maybe update the paper? Thanks!
Would you know how to fix this issue?
Constructing tensorflow graph
Error in py_get_attr_impl(x, name, silent) :
AttributeError: module 'tensorflow' has no attribute 'reset_default_graph'
Calls: run_clonealign ... py_get_attr_or_item -> py_get_attr -> py_get_attr_impl
Execution halted
Thanks
Hi Kieran,
I noticed that, despite the same input, different runs of clonealign
generate different results. Setting the seed does not help.
What do you think of this type of fluctuation?
I am not surprised that the same cell is assigned differently in different runs, partly because this is a probabilistic inference framework. In general, the assignments are consistent with each other, which is good. But would it be possible to make the results reproducible, the way tSNE gives the same result when the seed is fixed, for example?
library(SingleCellExperiment)
library(clonealign)
data(example_sce)
copy_number_data <- rowData(example_sce)[,c("A", "B", "C")]
set.seed(42)
cal <- clonealign(example_sce, copy_number_data)
print(cal)
table(cal$clone)
Hello, I have installed the "tensorflow" and "clonealign" packages as described in the guidance. However, I always get the error below:
Constructing tensorflow graph
Error in py_get_attr_impl(x, name, silent) :
AttributeError: module 'tensorflow' has no attribute 'reset_default_graph'
Then I looked at the Python configuration used by r-tensorflow:
>reticulate::py_config()
python: /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/bin/python
libpython: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/lib/libpython3.6m.dylib
pythonhome: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow:/Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow
version: 3.6.8 |Anaconda, Inc.| (default, Dec 29 2018, 19:04:46) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
numpy: /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/lib/python3.6/site-packages/numpy
numpy_version: 1.16.4
tensorflow: /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/lib/python3.6/site-packages/tensorflow
python versions found:
/Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/bin/python
/Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/bin/python
/anaconda/envs/r-tensorflow/bin/python
/usr/bin/python
/usr/local/bin/python
/usr/local/bin/python3
/Users/Amssbaixiangqi/miniconda2/envs/pyclone/bin/python
/anaconda/envs/python27/bin/python
/anaconda/envs/python35/bin/python
/Users/Amssbaixiangqi/miniconda2/bin/python
/Users/Amssbaixiangqi/server413/bin/python
Note that a different Python, the one in the virtual environment '/Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/bin/python', is being used. I therefore removed all paths related to the virtual environment. After that, clonealign ran successfully.
> Sys.getenv()
__CF_USER_TEXT_ENCODING 0x1F5:0x19:0x34
__KMP_REGISTERED_LIB_4892 0x1018c0c20-cafed01c-libomp.dylib
__KMP_REGISTERED_LIB_6593 0x1018c0c20-cafe7519-libomp.dylib
Apple_PubSub_Socket_Render /private/tmp/com.apple.launchd.8G2juMc7bT/Render
CLICOLOR_FORCE 1
DISPLAY /private/tmp/com.apple.launchd.sN5ofUkFip/org.macosforge.xquartz:0
DYLD_FALLBACK_LIBRARY_PATH /Library/Frameworks/R.framework/Resources/lib:/Library/Frameworks/R.framework/Resources/lib:/Users/Amssbaixiangqi/lib:/usr/local/lib:/usr/lib:::/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home/jre/lib/server:::/Library/Frameworks/R.framework/Resources/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home/jre/lib/server
EDITOR vi
GIT_ASKPASS rpostback-askpass
GITHUB_PAT <redacted>
HOME /Users/Amssbaixiangqi
LANG zh_CN.UTF-8
LC_CTYPE zh_CN.UTF-8
LN_S ln -s
LOGNAME Amssbaixiangqi
MAKE make
PAGER /usr/bin/less
PATH /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin:/Library/TeX/texbin
R_BROWSER /usr/bin/open
R_BZIPCMD /usr/bin/bzip2
R_DOC_DIR /Library/Frameworks/R.framework/Resources/doc
R_GZIPCMD /usr/bin/gzip
R_HOME /Library/Frameworks/R.framework/Resources
R_INCLUDE_DIR /Library/Frameworks/R.framework/Resources/include
R_LIBS_SITE
R_LIBS_USER ~/Library/R/3.6/library
R_PAPERSIZE a4
R_PDFVIEWER /usr/bin/open
R_PLATFORM x86_64-apple-darwin15.6.0
R_PRINTCMD lpr
R_QPDF /Library/Frameworks/R.framework/Resources/bin/qpdf
R_RD4PDF times,inconsolata,hyper
R_SESSION_INITIALIZED PID=6593:NAME="reticulate"
R_SESSION_TMPDIR /var/folders/md/rqyl44w50b5cpm9wtsh19q200000gn/T//RtmpLYCtf2
R_SHARE_DIR /Library/Frameworks/R.framework/Resources/share
R_STRIP_SHARED_LIB strip -x
R_STRIP_STATIC_LIB strip -S
R_SYSTEM_ABI osx,gcc,gxx,gfortran,gfortran
R_TEXI2DVICMD /usr/local/bin/texi2dvi
R_UNZIPCMD /usr/bin/unzip
R_ZIPCMD /usr/bin/zip
RETICULATE_REQUIRED_MODULE tensorflow
RMARKDOWN_MATHJAX_PATH /Applications/RStudio.app/Contents/Resources/resources/mathjax-26
RS_RPOSTBACK_PATH /Applications/RStudio.app/Contents/MacOS/rpostback
RS_SHARED_SECRET dbc39f44-b589-45e4-921b-10943a4f8c63
RSTUDIO 1
RSTUDIO_CONSOLE_COLOR 256
RSTUDIO_CONSOLE_WIDTH 111
RSTUDIO_PANDOC /Applications/RStudio.app/Contents/MacOS/pandoc
RSTUDIO_SESSION_PORT 25163
RSTUDIO_USER_IDENTITY Amssbaixiangqi
RSTUDIO_WINUTILS bin/winutils
SED /usr/bin/sed
SHELL /bin/bash
SSH_ASKPASS rpostback-askpass
SSH_AUTH_SOCK /private/tmp/com.apple.launchd.3tDs9H4Bj5/Listeners
TAR /usr/bin/tar
TERM xterm-256color
TMPDIR /var/folders/md/rqyl44w50b5cpm9wtsh19q200000gn/T/
USER Amssbaixiangqi
VIRTUAL_ENV /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow
XPC_FLAGS 0x0
XPC_SERVICE_NAME 0
> Sys.unsetenv('VIRTUAL_ENV')
> Sys.unsetenv('PATH')
> Sys.unsetenv('RETICULATE_REQUIRED_MODULE')
> Sys.getenv()
> reticulate::py_config()
python: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/bin/python
libpython: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/lib/libpython3.6m.dylib
pythonhome: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow:/Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow
version: 3.6.8 |Anaconda, Inc.| (default, Dec 29 2018, 19:04:46) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
numpy: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/lib/python3.6/site-packages/numpy
numpy_version: 1.16.4
tensorflow: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/lib/python3.6/site-packages/tensorflow
python versions found:
/Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/bin/python
/anaconda/envs/r-tensorflow/bin/python
/usr/bin/python
/usr/local/bin/python
/usr/local/bin/python3
/Users/Amssbaixiangqi/miniconda2/envs/pyclone/bin/python
/anaconda/envs/python27/bin/python
/anaconda/envs/python35/bin/python
/Users/Amssbaixiangqi/miniconda2/bin/python
/Users/Amssbaixiangqi/server413/bin/python
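Unsetting PATH entirely also hides system tools from the R session, so a narrower variant of the same workaround may be preferable. The following is a minimal sketch under that assumption: drop_path_entries is a hypothetical helper, the example path is made up, and only the virtualenv entry is removed.

```r
# Hypothetical helper: drop PATH entries containing a given substring
# instead of unsetting PATH as a whole.
drop_path_entries <- function(path, pattern, sep = ":") {
  entries <- strsplit(path, sep, fixed = TRUE)[[1]]
  keep <- !grepl(pattern, entries, fixed = TRUE)
  paste(entries[keep], collapse = sep)
}

# Made-up PATH resembling the one shown above
path <- "/Users/me/.virtualenvs/r-tensorflow/bin:/usr/bin:/bin:/usr/local/bin"
cleaned <- drop_path_entries(path, ".virtualenvs")
cleaned

# To apply it to the running session, before reticulate initialises Python:
# Sys.setenv(PATH = drop_path_entries(Sys.getenv("PATH"), ".virtualenvs"))
# Sys.unsetenv("VIRTUAL_ENV")
```

reticulate also reads the RETICULATE_PYTHON environment variable, so pointing it at the conda environment's Python before loading the package may be another way to pin the interpreter.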
Thank you for uploading this useful tool. I have a question about the code. When I try to run the clonealign function, I get the error message "copy_number_data must have same number of genes (rows) as gene_expression_data". My CNV matrix and my expression matrix have the same number of rows, and their rownames contain the same genes. Looking at the code, could it be that if(nrow(L) != G) should be changed to if(nrow(L) != N), or do you think this is caused by something else?
N <- nrow(Y)
G <- ncol(Y)
L <- copy_number_data
if(nrow(L) != G) {
stop("copy_number_data must have same number of genes (rows) as gene_expression_data")
}
Thank you for your time
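The quoted lines set N <- nrow(Y) and G <- ncol(Y), which suggests that Y is expected to be cells-by-genes at that point, so a genes-by-cells matrix would fail the check even when the gene lists agree. The sketch below illustrates this with toy matrices; it is an assumption about the cause, and how clonealign handles other input types has not been checked here.

```r
# Toy genes x cells count matrix (3 genes, 5 cells)
genes <- c("g1", "g2", "g3")
counts <- matrix(rpois(15, 5), nrow = 3,
                 dimnames = list(genes, paste0("cell_", 1:5)))

# Copy number matrix: genes x clones
L <- matrix(c(2, 2, 3, 2, 1, 3), nrow = 3,
            dimnames = list(genes, c("A", "B")))

# The quoted check uses G <- ncol(Y), so Y must be cells x genes
Y <- t(counts)
N <- nrow(Y)   # number of cells
G <- ncol(Y)   # number of genes

nrow(L) == G              # TRUE: the check passes
nrow(L) == ncol(counts)   # FALSE: the untransposed matrix would fail
```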