The clonealign from kieranrcampbell

The inference runs through one ELBO computation and then throws the error below if rownames(L) and colnames(Y) do not match or are both NULL.

Constructing tensorflow graph
Removing 0 genes with low counts
Optimizing ELBO
  running VB [=====================>] 100% | elbo -931590.1875 | chg elbo 0.0064%
ELBO converged or reached max iterations
Computing final ELBO
Error in L[tflow_res$retained_genes, ] : subscript out of bounds
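
One way to surface this problem before inference starts is to validate the gene names up front. The sketch below is only an illustration: check_gene_names is a hypothetical helper, not a clonealign function, and it assumes Y is a cell-by-gene matrix and L a gene-by-clone matrix as in the report above.

```r
# Hypothetical helper (not part of clonealign): fail early on bad gene names.
# Assumes Y is cell-by-gene and L is gene-by-clone.
check_gene_names <- function(Y, L) {
  if (is.null(colnames(Y)) || is.null(rownames(L))) {
    stop("colnames(Y) and rownames(L) must both be set")
  }
  if (!identical(colnames(Y), rownames(L))) {
    stop("colnames(Y) and rownames(L) must match, in the same order")
  }
  invisible(TRUE)
}

# Toy data: 2 cells, 3 genes, 2 clones
Y <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("g1", "g2", "g3")))
L <- matrix(2, nrow = 3, ncol = 2,
            dimnames = list(c("g1", "g2", "g3"), c("A", "B")))
ok <- check_gene_names(Y, L)
```

With a check like this, the mismatch is reported before the ELBO optimization runs rather than after it.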

Add test_that for gradient checking

delta-CN and delta-RNA

Hi @kieranrcampbell,

I have a question, possibly related to issue #13, about model fits and the interpretation of outputs. In the output of plot_clonealign (version 2.0), the delta in scaled log-normalized RNA expression between two clones does not proportionally reflect the CN delta. For example, on one chromosome where the CN is 3 in both clones (so the CN delta is zero), the delta between the RNA z-scores of the same two clones is not zero. The reverse also happens: where there is a variation in CN space, I do not necessarily see an approximately proportional variation in the corresponding RNA expression. What I can mostly see is that the signs of the DNA and RNA deltas are consistent.
Similar behaviour seems to be present in the plots of the main paper, so I guess it is fine and I just wanted to check. Is there a clonealign plotting function that adds RNA counts to the above plots?

When colnames(L) are NULL and get named with colnames(Y) <- paste0("gene_", letters[seq_len(ncol(Y))]), wouldn't that give gene_NA if there are more than 26 genes?

https://github.com/kieranrcampbell/clonealign/blob/master/R/clonealign.R#L257
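
The behaviour asked about can be reproduced in plain R: letters has only 26 entries, so indexing past position 26 yields NA. The snippet below is a small illustration with a made-up gene count (n_genes), together with one possible alternative that numbers the genes instead. It is not the fix adopted in the package.

```r
n_genes <- 30  # hypothetical number of genes, larger than 26

# letters has 26 entries, so positions 27..30 are NA and become "gene_NA"
bad_names <- paste0("gene_", letters[seq_len(n_genes)])

# One possible alternative: number the genes, which stays unique for any size
good_names <- paste0("gene_", seq_len(n_genes))
```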

Solve the problem in installing clonealign

Hello, I have installed the "tensorflow" and "clonealign" packages as described in the guide. However, I always get the error below:

Constructing tensorflow graph
Error in py_get_attr_impl(x, name, silent) :
AttributeError: module 'tensorflow' has no attribute 'reset_default_graph'

I then checked the Python configuration used by r-tensorflow:

>reticulate::py_config()
python: /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/bin/python
libpython: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/lib/libpython3.6m.dylib
pythonhome: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow:/Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow
version: 3.6.8 |Anaconda, Inc.| (default, Dec 29 2018, 19:04:46) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
numpy: /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/lib/python3.6/site-packages/numpy
numpy_version: 1.16.4
tensorflow: /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/lib/python3.6/site-packages/tensorflow
python versions found:
/Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/bin/python
/Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/bin/python
/anaconda/envs/r-tensorflow/bin/python
/usr/bin/python
/usr/local/bin/python
/usr/local/bin/python3
/Users/Amssbaixiangqi/miniconda2/envs/pyclone/bin/python
/anaconda/envs/python27/bin/python
/anaconda/envs/python35/bin/python
/Users/Amssbaixiangqi/miniconda2/bin/python
/Users/Amssbaixiangqi/server413/bin/python

I noticed that a different Python, the one in the virtual environment '/Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/bin/python', was being used. I therefore removed all paths that referred to the virtual environment. After that, clonealign was installed successfully.

> Sys.getenv()
__CF_USER_TEXT_ENCODING 0x1F5:0x19:0x34
__KMP_REGISTERED_LIB_4892 0x1018c0c20-cafed01c-libomp.dylib
__KMP_REGISTERED_LIB_6593 0x1018c0c20-cafe7519-libomp.dylib
Apple_PubSub_Socket_Render /private/tmp/com.apple.launchd.8G2juMc7bT/Render
CLICOLOR_FORCE 1
DISPLAY /private/tmp/com.apple.launchd.sN5ofUkFip/org.macosforge.xquartz:0
DYLD_FALLBACK_LIBRARY_PATH /Library/Frameworks/R.framework/Resources/lib:/Library/Frameworks/R.framework/Resources/lib:/Users/Amssbaixiangqi/lib:/usr/local/lib:/usr/lib:::/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home/jre/lib/server:::/Library/Frameworks/R.framework/Resources/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home/jre/lib/server
EDITOR vi
GIT_ASKPASS rpostback-askpass
GITHUB_PAT 34bba114f6b4e43b786ff2c41a82ca69c5dbe0ce
HOME /Users/Amssbaixiangqi
LANG zh_CN.UTF-8
LC_CTYPE zh_CN.UTF-8
LN_S ln -s
LOGNAME Amssbaixiangqi
MAKE make
PAGER /usr/bin/less
PATH /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin:/Library/TeX/texbin
R_BROWSER /usr/bin/open
R_BZIPCMD /usr/bin/bzip2
R_DOC_DIR /Library/Frameworks/R.framework/Resources/doc
R_GZIPCMD /usr/bin/gzip
R_HOME /Library/Frameworks/R.framework/Resources
R_INCLUDE_DIR /Library/Frameworks/R.framework/Resources/include
R_LIBS_SITE
R_LIBS_USER ~/Library/R/3.6/library
R_PAPERSIZE a4
R_PDFVIEWER /usr/bin/open
R_PLATFORM x86_64-apple-darwin15.6.0
R_PRINTCMD lpr
R_QPDF /Library/Frameworks/R.framework/Resources/bin/qpdf
R_RD4PDF times,inconsolata,hyper
R_SESSION_INITIALIZED PID=6593:NAME="reticulate"
R_SESSION_TMPDIR /var/folders/md/rqyl44w50b5cpm9wtsh19q200000gn/T//RtmpLYCtf2
R_SHARE_DIR /Library/Frameworks/R.framework/Resources/share
R_STRIP_SHARED_LIB strip -x
R_STRIP_STATIC_LIB strip -S
R_SYSTEM_ABI osx,gcc,gxx,gfortran,gfortran
R_TEXI2DVICMD /usr/local/bin/texi2dvi
R_UNZIPCMD /usr/bin/unzip
R_ZIPCMD /usr/bin/zip
RETICULATE_REQUIRED_MODULE tensorflow
RMARKDOWN_MATHJAX_PATH /Applications/RStudio.app/Contents/Resources/resources/mathjax-26
RS_RPOSTBACK_PATH /Applications/RStudio.app/Contents/MacOS/rpostback
RS_SHARED_SECRET dbc39f44-b589-45e4-921b-10943a4f8c63
RSTUDIO 1
RSTUDIO_CONSOLE_COLOR 256
RSTUDIO_CONSOLE_WIDTH 111
RSTUDIO_PANDOC /Applications/RStudio.app/Contents/MacOS/pandoc
RSTUDIO_SESSION_PORT 25163
RSTUDIO_USER_IDENTITY Amssbaixiangqi
RSTUDIO_WINUTILS bin/winutils
SED /usr/bin/sed
SHELL /bin/bash
SSH_ASKPASS rpostback-askpass
SSH_AUTH_SOCK /private/tmp/com.apple.launchd.3tDs9H4Bj5/Listeners
TAR /usr/bin/tar
TERM xterm-256color
TMPDIR /var/folders/md/rqyl44w50b5cpm9wtsh19q200000gn/T/
USER Amssbaixiangqi
VIRTUAL_ENV /Users/Amssbaixiangqi/.virtualenvs/r-tensorflow
XPC_FLAGS 0x0
XPC_SERVICE_NAME 0

> Sys.unsetenv('VIRTUAL_ENV')
> Sys.unsetenv('PATH')
> Sys.unsetenv('RETICULATE_REQUIRED_MODULE')
> Sys.getenv()

> reticulate::py_config()
python: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/bin/python
libpython: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/lib/libpython3.6m.dylib
pythonhome: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow:/Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow
version: 3.6.8 |Anaconda, Inc.| (default, Dec 29 2018, 19:04:46) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
numpy: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/lib/python3.6/site-packages/numpy
numpy_version: 1.16.4
tensorflow: /Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/lib/python3.6/site-packages/tensorflow
python versions found:
/Users/Amssbaixiangqi/miniconda2/envs/r-tensorflow/bin/python
/anaconda/envs/r-tensorflow/bin/python
/usr/bin/python
/usr/local/bin/python
/usr/local/bin/python3
/Users/Amssbaixiangqi/miniconda2/envs/pyclone/bin/python
/anaconda/envs/python27/bin/python
/anaconda/envs/python35/bin/python
/Users/Amssbaixiangqi/miniconda2/bin/python
/Users/Amssbaixiangqi/server413/bin/python

size_factors parameter

Hi @kieranrcampbell,
Congratulations on this amazing tool! I have a question about a parameter that I cannot find how to set. The help page for "clonealign::clonealign" discusses a "size_factors" parameter that can be set to "infer" or "fixed", which, if I understand correctly, is "s_n" in the main clonealign equation (correct?). However, the actual function definition does not have such a parameter. Has the option been removed, or is it perhaps set to "infer" by default?

Different runs upon the same input generated different results

Hi Kieran,
I noticed that, given the same input, different runs of clonealign generate different results. Setting the seed does not help.

What do you think of this type of fluctuation?

I am not surprised that the same cell is assigned differently in different runs, partly because this is a probabilistic inference framework. In general the assignments were consistent with each other, which is good. But would it be possible to make the results reproducible, in the way that tSNE becomes reproducible once the seed is fixed?

library(SingleCellExperiment)
library(clonealign)
data(example_sce)
copy_number_data <- rowData(example_sce)[,c("A", "B", "C")]
set.seed(42)
cal <- clonealign(example_sce, copy_number_data)
print(cal)
table(cal$clone)
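
To put a number on how consistent two runs are, the clone assignments can be cross-tabulated. This is a minimal sketch with made-up assignment vectors. In practice run1 and run2 would be the cal$clone vectors from two separate calls, which is an assumption about how the results are collected.

```r
# Hypothetical clone assignments from two runs on the same six cells
run1 <- c("A", "A", "B", "C", "A", "B")
run2 <- c("A", "A", "B", "C", "B", "B")

# Cross-tabulate the two runs: off-diagonal counts are cells that switched clone
confusion <- table(run1 = run1, run2 = run2)

# Fraction of cells assigned to the same clone in both runs
agreement <- mean(run1 == run2)
```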

Remove outlying genes now multinomial likelihood is used

Handle situation when copy number is 0

This is currently handled via setting lambda to 1. Should instead modify the likelihood so it's Pois(0.1) or similar

Initial elbo is NA

Hi,
I was wondering whether you have any hints as to why I get the error "Initial elbo is NA".

I hit this error occasionally. It went away when I removed features that had NAs or were zero in all cells, but sometimes it still occurs, and I wonder why. Could it also be because the tf$session did not run at all?

clonealign/R/inference-tflow.R

Line 369 in 289b9a2

stop("Initial elbo is NA")
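
The filtering step mentioned above (dropping features with NAs or with zeros in all cells) can be sketched in base R as follows. The matrix is toy data and assumed to be gene-by-cell. This only illustrates the workaround described and does not explain the remaining cases.

```r
# Toy gene-by-cell count matrix (hypothetical data)
counts <- matrix(c(1, 0,  3,
                   0, 0,  0,
                   2, NA, 1,
                   4, 5,  6),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene_", 1:4), paste0("cell_", 1:3)))

has_na   <- rowSums(is.na(counts)) > 0            # genes with any missing value
all_zero <- rowSums(counts, na.rm = TRUE) == 0    # genes with no counts at all

# Keep only genes that are fully observed and expressed in at least one cell
filtered <- counts[!has_na & !all_zero, , drop = FALSE]
```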

module 'tensorflow' has no attribute 'reset_default_graph'

Would you know how to fix this issue?

Constructing tensorflow graph
Error in py_get_attr_impl(x, name, silent) :
AttributeError: module 'tensorflow' has no attribute 'reset_default_graph'
Calls: run_clonealign ... py_get_attr_or_item -> py_get_attr -> py_get_attr_impl
Execution halted

Thanks
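
For context: in TensorFlow 2.x, reset_default_graph is no longer a top-level attribute of the module. It is still reachable as tf.compat.v1.reset_default_graph, so this error typically appears when code written for TensorFlow 1.x runs against a 2.x installation. The helper below is hypothetical and only encodes that version rule. It does not call TensorFlow, and the version strings are examples.

```r
# Hypothetical helper: given a TensorFlow version string, report where
# reset_default_graph lives in that release line.
reset_graph_location <- function(tf_version) {
  core <- sub("[^0-9.].*$", "", tf_version)  # drop suffixes such as "-rc1"
  if (package_version(core) >= "2.0.0") {
    "tf$compat$v1$reset_default_graph"
  } else {
    "tf$reset_default_graph"
  }
}

loc_v1 <- reset_graph_location("1.14.0")
loc_v2 <- reset_graph_location("2.0.0")
```

If the installed version turns out to be 2.x, installing a 1.x release of TensorFlow into the r-tensorflow environment is one option to try.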

Performance with multinomial likelihood

Hi Kieran,

Did you assess the performance of the new version of the package versus the previous one? I have noticed some things:

- it seems to be rather sensitive to the initial_shrinks parameter;
- the ELBO is not a good proxy for best assignments.

In my experiments, a high value of initial_shrinks seems to give more stability. Does this make sense? What does this parameter actually do? It's hard to get it from the code.

It would be nice to have some sort of idea about how this new model works. Maybe update the paper? Thanks!

How to quantify the clonal frequency consistency between DNA and CloneAlign?

Hi,
What do you think would be a proper way to quantify the consistency between DNA and RNA data in terms of clonal frequency?

Without SIDR-seq / G&T-seq technology, it is still possible to approximate G&T by performing single-cell DNA-seq and RNA-seq on two groups of cells randomly selected from the same cell suspension. In theory, the scDNA-seq and scRNA-seq should reflect the same cell content, e.g. clones, so the clonal frequencies estimated by DNA-seq and inferred by CloneAlign should be similar. This is also what your paper reports:

1152 single-cells post-QC (methods) were assigned to clones A, B, and C with prevalence of 80.6%, 13.8%, and 5.6%, closely matching the expected proportions inferred from the single-cell DNA-seq (82.3%, 10.8%, and 6.9%).

But my question is how to quantify the consistency. Your paper more or less eyeballs the similarity. I was thinking of chisq.test, but I don't know whether it makes sense.

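m <- 1152  # number of cells assigned by clonealign

# Clone prevalence inferred by clonealign (A, B, C)
pa <- 80.6/100
pb <- 13.8/100
pc <- 5.6/100
# Compare with a tolerance: exact == on floating-point sums is unreliable
stopifnot(isTRUE(all.equal(sum(c(pa, pb, pc)), 1)))

# Expected proportions from single-cell DNA-seq (A, B, C)
ea <- 82.3/100
eb <- 10.8/100
ec <- 6.9/100
stopifnot(isTRUE(all.equal(sum(c(ea, eb, ec)), 1)))

# Goodness-of-fit test of the clonealign counts against the DNA proportions
chisq.test(m * c(pa, pb, pc), p = c(ea, eb, ec), rescale.p = FALSE)
# X-squared = 12.826, df = 2, p-value = 0.00164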

The p-value suggests that the clonal frequencies from DNA and CloneAlign are significantly different, which goes against the assumption. I don't mean to challenge the result: in this specific case, I noticed that the cells may not have come from the same cell suspension, which violates the assumption.

We linked gene expression to clones in SA501 by generating single-cell RNA-seq from the SA501X2B xenograft passage using 10X genomics (methods) and assigned each cell to a clone (A, B or C) using clonealign.

In sum, my question is how to quantify the clonal frequency consistency rather than judging it by eye.
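
One descriptive option, offered here only as a sketch and not as the authors' method, is to report an effect size alongside the test, since with 1152 cells even small differences in proportions can reach significance. The example below uses the proportions quoted above and computes the total variation distance and Cohen's w. The variable names are made up for illustration.

```r
p_rna <- c(A = 0.806, B = 0.138, C = 0.056)  # clonealign prevalence (from the quote)
p_dna <- c(A = 0.823, B = 0.108, C = 0.069)  # scDNA-seq proportions (from the quote)

# Total variation distance: half the summed absolute differences, in [0, 1]
tvd <- 0.5 * sum(abs(p_rna - p_dna))

# Cohen's w: chi-square effect size, equal to sqrt(X-squared / m) for the
# goodness-of-fit test above, so it does not grow with the number of cells
cohen_w <- sqrt(sum((p_rna - p_dna)^2 / p_dna))
```

Here tvd works out to 0.03, meaning about 3% of the cells would have to change clone for the two sets of proportions to coincide. Note also that the goodness-of-fit test treats the DNA proportions as exactly known, although they are themselves estimated from a finite number of cells.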

row dimensionality issue

Thank you for uploading this useful tool. I have a question about the code. When I try to run the clonealign function, I get the error message "copy_number_data must have same number of genes (rows) as gene_expression_data". The rownames of my cnv matrix and my expression matrix are of the same size and contain the same genes. When looking in the code, could it be that if(nrow(L) != G) should be changed to if(nrow(L) != N), or do you think this is caused by something else?

N <- nrow(Y)
G <- ncol(Y)
L <- copy_number_data

if(nrow(L) != G) {
  stop("copy_number_data must have same number of genes (rows) as gene_expression_data")
}

Thank you for your time
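
Reading the quoted code, N is taken from the rows of Y and G from its columns, which suggests that Y is expected to be cell-by-gene while L is gene-by-clone. If the expression matrix is passed gene-by-cell, the check fails even when both matrices list the same genes. The sketch below uses hypothetical toy matrices to show the dimensions lining up after a transpose. Whether this applies to a given input is an assumption to verify against the package documentation.

```r
# Hypothetical toy data: 5 genes, 3 cells, 2 clones
genes <- paste0("gene_", 1:5)
expr_gene_by_cell <- matrix(1, nrow = 5, ncol = 3,
                            dimnames = list(genes, paste0("cell_", 1:3)))
L <- matrix(2, nrow = 5, ncol = 2, dimnames = list(genes, c("A", "B")))

# Transpose so that rows are cells and columns are genes
Y <- t(expr_gene_by_cell)
N <- nrow(Y)  # number of cells
G <- ncol(Y)  # number of genes

dims_ok <- nrow(L) == G  # the condition tested in the quoted check
```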
