Error while plotting boxplot and calculating InferPotency

chenweiyan commented on July 30, 2024

from landscent.

Comments (10)

ChenWeiyan commented on July 30, 2024

Hi Sinhal,

Thanks for trying out my package!

Judging from your output object SR.o$SR, the SR values were not all calculated properly; e.g. the 999th and 1000th are NULL.

Moreover, if the SR calculation had succeeded, SR.o$SR would be a vector rather than a list, which is what you have now.

I do not know the details of your data, though. Since 8k cells is not a large number, one suggestion is to extract the expression matrix with a command like assay(SingleCellExperiment object, "counts") and pass it to LandSCENT directly. That will simplify things, and you may find out why the error happened.

Hope this helps!

Best,
Weiyan
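
A quick way to confirm the symptom described above is to check whether the result is a list and which entries are NULL. This is a minimal sketch on a toy stand-in for SR.o$SR (the values below are made up for illustration, not taken from the thread):

```r
# Toy stand-in for SR.o$SR when some parallel jobs failed:
# a list containing NULL entries instead of a numeric vector.
sr <- list(0.91, 0.87, NULL, 0.89, NULL)

# Positions (cells) for which no SR value was returned.
failed <- which(vapply(sr, is.null, logical(1)))

# A healthy result would be a plain numeric vector without missing values.
sr_ok <- is.numeric(sr) && !anyNA(sr)

print(is.list(sr))
print(failed)
print(sr_ok)
```

If `failed` is non-empty, plotting with boxplot() is likely to fail or mislead, so the SR calculation should be fixed first.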

SinhaI commented on July 30, 2024

Dear Weiyan,

Thanks for your reply. As you suggested, I tried extracting the data as an expression matrix, but I am still getting that error. I think I am doing something terribly wrong here. My code is below. It would be great if you could suggest something more. Let me know if you need more information.

Thanks for your help.

Best regards,
Indranil

library(LandSCENT)
data(net13Jun12.m)
data(phenoExample.v)
library(Seurat)
require(scater)

# here I am getting the Seurat object
sub2.pbmc.integrated <- readRDS(file = "sub2_pbmc_SCT_14102020.rds")
sub2.sce <- as.SingleCellExperiment(sub2.pbmc.integrated)
example.m.sub2 <- assay(sub2.sce, "counts")
example.sce <- SingleCellExperiment(assay = list(counts = example.m.sub2))
counts(example.sce) <- as(counts(example.sce), "dgCMatrix")
sizeFactors(example.sce) <- librarySizeFactors(example.sce)
example.sce <- logNormCounts(example.sce)
example.m <- as.matrix(assay(example.sce, i = "logcounts"))
min(example.m)
example.m[example.m < 0.001] <- 0.1

require(AnnotationDbi)
require(org.Hs.eg.db)
anno.v <- mapIds(org.Hs.eg.db, keys = rownames(example.m), keytype = "SYMBOL",
                 column = "ENTREZID", multiVals = "first")
unique_anno.v <- unique(anno.v)
example_New.m <- matrix(0, nrow = length(unique_anno.v), ncol = dim(example.m)[2])

for (i in seq_len(length(unique_anno.v))) {
  tmp <- example.m[which(anno.v == unique_anno.v[i]) ,]
  if (!is.null(dim(tmp))) {
    tmp <- colSums(tmp) / dim(tmp)[1]
  }
  example_New.m[i ,] <- example_New.m[i ,] + tmp
}

rownames(example_New.m) <- unique_anno.v
colnames(example_New.m) <- colnames(example.m)
example_New.m <- example_New.m[-which(rownames(example_New.m) %in% NA) ,]
min(example_New.m)

Integration.l <- DoIntegPPI(exp.m = example_New.m, ppiA.m = net13Jun12.m, log_trans = TRUE)
str(Integration.l)
#save.image(file = "LandSCENT_work_space_7.RData")
SR.o <- CompSRana(Integration.l, local = TRUE, mc.cores = 40)

my.data <- FetchData(sub2.pbmc.integrated, c("type", "patient", "sample", "seurat_cluster", "orig.ident"))
boxplot(SR.o$SR ~ my.data$type, main = "SR values against cell types", xlab = "Cell Types", ylab = "SR values")

ChenWeiyan commented on July 30, 2024

Hi Indranil,

Since you are setting log_trans = TRUE in DoIntegPPI, you can skip the normalization and let the function do it itself.

I can also tell that you are trying to convert gene IDs, so I have attached a script I wrote (plus an additional resource file) for gene ID conversion. Maybe you can find something useful in it.

So basically, you only need to: extract the count matrix from the Seurat object ---> check that the row/column names are properly assigned ---> convert the gene IDs ---> pass the result to LandSCENT ---> wait and see what happens.

Best,
Weiyan

GeneMap.zip
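
The "check that the row/column names are properly assigned" step can be sketched in base R as follows. The helper name check_expr_matrix and the specific checks are assumptions made for illustration; they are not part of LandSCENT or of the attached script:

```r
# Hypothetical helper (not part of LandSCENT): returns a character vector
# describing problems found in an expression matrix, or character(0) if none.
check_expr_matrix <- function(expr) {
  stopifnot(is.matrix(expr), is.numeric(expr))
  problems <- character(0)
  if (is.null(rownames(expr))) {
    problems <- c(problems, "missing row names (genes)")
  }
  if (is.null(colnames(expr))) {
    problems <- c(problems, "missing column names (cells)")
  }
  if (anyDuplicated(rownames(expr)) > 0) {
    problems <- c(problems, "duplicated gene identifiers")
  }
  if (anyNA(rownames(expr))) {
    problems <- c(problems, "NA gene identifiers")
  }
  if (any(expr < 0, na.rm = TRUE)) {
    problems <- c(problems, "negative values (probably not raw counts)")
  }
  problems
}

# Toy matrices: one with proper dimnames, one without any.
good <- matrix(c(0, 1, 2, 3), nrow = 2,
               dimnames = list(c("NOC2L", "HES4"), c("cell1", "cell2")))
bad <- matrix(c(0, 1, 2, 3), nrow = 2)

print(check_expr_matrix(good))
print(check_expr_matrix(bad))
```

Running such a check on the matrix extracted from the Seurat object, before any ID conversion, narrows down where a later error comes from.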

KatharinaKohler commented on July 30, 2024

Hi Weiyan,

Thank you for this tool! I am currently also trying to use the LandSCENT package for a Seurat object and it seems that I am running into the same problem as Indranil. I am trying to get corresponding gene IDs by using the code from the tutorial (step 3.3).

My code:

> #reading in count data from Seurat object
> expression_matrix_ <- as.matrix(GetAssayData(integrated_new, slot="counts", assay="RNA"))
> 
> #get gene annotation fitting to the PPI
> require(AnnotationDbi)
> require(org.Hs.eg.db)
> anno.v <- mapIds(org.Hs.eg.db, keys = rownames(expression_matrix_), keytype = "SYMBOL", 
>                  column = "ENTREZID", multiVals = "first")
> unique_anno.v <- unique(anno.v)
> example_New.m <- matrix(0, nrow = length(unique_anno.v), ncol = dim(expression_matrix_)[2])
> 
> 
> for (i in seq_len(length(unique_anno.v))) {
>   tmp <- expression_matrix_[which(anno.v == unique_anno.v[i]) ,]
>   if (!is.null(dim(tmp))) {
>     tmp <- colSums(tmp) / dim(tmp)[1]
>   }
>   example_New.m[i ,] <- example_New.m[i ,] + tmp
> }
> 
> rownames(example_New.m) <- unique_anno.v
> colnames(example_New.m) <- colnames(expression_matrix_)
> example_New.m <- example_New.m[-which(rownames(example_New.m) %in% NA) ,]
> matrix_sce <- example_New.m
> str(matrix_sce)

My expression_matrix_ looks like this:

num [1:24113, 1:20286] 0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:24113] "RP11-34P13.7" "FO538757.2" "AP006222.2" "RP4-669L17.10" ...
  ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...

And this is what I get out of str(matrix_sce):

num [1:17215, 1:20286] 0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:17215] "26155" "339451" "84069" "57801" ...
  ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...

So I think there is a problem in the gene ID conversion step, but I can't find it.

I also tried your gene conversion script, Weiyan, but I seem to be doing something wrong: when I use it, I get the error "scRNA-seq data should have the same gene identifier with the network!" at the integration step.

I would appreciate any help!

Best,
Katharina
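
For reference, the averaging loop in the code above (rows that map to the same Entrez ID are averaged, unmapped rows are dropped) can be written more compactly with base R's rowsum(). This is only a sketch of the same idea on toy data; the function name collapse_by_id and the IDs are hypothetical:

```r
# Hypothetical helper: average all rows of `expr` that share the same ID
# and drop rows whose ID is NA.
collapse_by_id <- function(expr, ids) {
  keep <- !is.na(ids)
  expr <- expr[keep, , drop = FALSE]
  ids <- unname(ids[keep])
  # Sum rows per ID, keeping IDs in order of first appearance.
  sums <- rowsum(expr, group = ids, reorder = FALSE)
  # Number of input rows behind each ID, aligned to the rows of `sums`.
  counts <- as.vector(table(ids)[rownames(sums)])
  sums / counts
}

# Toy data: two symbols map to ID "10", one symbol has no mapping.
expr <- matrix(c(1, 3, 5, 7, 2, 4, 6, 8), nrow = 4,
               dimnames = list(c("GENE_A", "GENE_A2", "GENE_X", "GENE_B"),
                               c("cell1", "cell2")))
ids <- c("10", "10", NA, "20")

collapsed <- collapse_by_id(expr, ids)
print(collapsed)
```

The result has one row per distinct non-NA ID, with plain (unnamed) character row names.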

ChenWeiyan commented on July 30, 2024

Hi Katharina,

Thanks for your feedback.

For the gene ID conversion, here is a quick explanation of my script:

If you want homologous genes between human and mouse, first load the HomoG table with HomoG <- read.delim("homologene.data", header = FALSE)
Then pass your gene IDs to the function (you can set HomoG = NULL if you do not need homologous genes):
ID_conversion <- GeneMap_opt(HomoG = HomoG, geneID, input_type = "SYMBOL", output_type = "ENTREZID", input_species = "HUM", output_species = "HUM")
When it is done, ID_conversion$GeneMapG gives you the converted IDs, and you can use ID_conversion$Gene.idx to subset the original matrix, i.e. New.data <- Ori.data[ID_conversion$Gene.idx ,]. Then assign ID_conversion$GeneMapG as the row names of New.data.

Hope this can help you.

Best,
Weiyan
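
The subsetting step described above, followed by a check of how many converted IDs actually occur in the network, might look like this. The ID_conversion list below is a hand-made stand-in that only mimics the two fields named in the explanation ($Gene.idx and $GeneMapG), and ppi_ids is a toy replacement for the row names of the PPI matrix:

```r
# Toy expression matrix with gene symbols as row names.
Ori.data <- matrix(1:8, nrow = 4,
                   dimnames = list(c("NOC2L", "FAKE1", "KLHL17", "HES4"),
                                   c("cell1", "cell2")))

# Stand-in for the output of GeneMap_opt (structure assumed from the text):
# indices of rows that could be converted, and the converted IDs.
ID_conversion <- list(
  Gene.idx = c(1L, 3L, 4L),
  GeneMapG = c(NOC2L = "26155", KLHL17 = "339451", HES4 = "57801")
)

# Subset to convertible genes and assign the converted IDs as row names.
New.data <- Ori.data[ID_conversion$Gene.idx, , drop = FALSE]
rownames(New.data) <- ID_conversion$GeneMapG

# Toy stand-in for rownames(net17Jan16.m).
ppi_ids <- c("26155", "57801", "999999")

# How many genes are shared with the network?
n_shared <- sum(rownames(New.data) %in% ppi_ids)
print(n_shared)
```

If `n_shared` is zero or very small, the expression matrix and the network most likely use different identifier types.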

KatharinaKohler commented on July 30, 2024

Hi Weiyan,

Thank you for this explanation!
I tried this, but again it did not work. Here is my code:

expression_matrix_ <- as.matrix(GetAssayData(integrated_new, slot="counts", assay="RNA"))

ID_conversion <- GeneMap_opt(HomoG = NULL, rownames(expression_matrix_), input_type = "SYMBOL", output_type = "ENTREZID", input_species = "HUM", output_species = "HUM")

New.data <- expression_matrix_[ID_conversion$Gene.idx ,]
rownames(New.data) <- ID_conversion$GeneMapG

Integration.l <- DoIntegPPI(exp.m = New.data, ppiA.m = net17Jan16.m, log_trans = TRUE)

The error I get here is

Error in DoIntegPPI(exp.m = New.data, ppiA.m = net17Jan16.m, log_trans = TRUE) :
Non identical!!!

My New.data object looks like this:

num [1:17215, 1:20286] 0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, "dimnames")=List of 2
  ..$ : Named chr [1:17215] "26155" "339451" "84069" "57801" ...
  .. ..- attr(*, "names")= chr [1:17215] "NOC2L" "KLHL17" "PLEKHN1" "HES4" ...
  ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...

Can you see what is going wrong here?

Best,
Katharina

ChenWeiyan commented on July 30, 2024

Hi Katharina,

I forgot to mention that the output of GeneMap_opt under GeneMapG is a named vector.

You need to set

names(rownames(New.data)) <- NULL

right before

DoIntegPPI(exp.m = New.data, ppiA.m = net17Jan16.m, log_trans = TRUE)

Best,
Weiyan
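
Why the names matter can be seen in base R alone: identical() compares attributes as well as values, so a named character vector is never identical to an unnamed one with the same content. Whether DoIntegPPI performs exactly this kind of comparison internally is an assumption here; the sketch only illustrates the general behaviour on toy IDs:

```r
# The same two IDs, once with symbol names attached and once plain.
ids_named <- c(NOC2L = "26155", KLHL17 = "339451")
ids_plain <- c("26155", "339451")

# Element-wise comparison ignores the names attribute.
same_values <- all(ids_named == ids_plain)

# identical() also compares attributes, so the names make the vectors differ.
same_object <- identical(ids_named, ids_plain)

# Stripping the names makes them identical again.
same_after_unname <- identical(unname(ids_named), ids_plain)

print(c(same_values, same_object, same_after_unname))
```

Using unname() on the converted IDs before assigning them as row names has the same effect as resetting the names afterwards.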

KatharinaKohler commented on July 30, 2024

Hi Weiyan,

Thank you so much for that! This part works now but I am still having the same problems with CompSRana as before. I am sorry for bothering again but I am very stuck with this. When I try to run CompSRana, I get the warning:

Warning message:
In mclapply(idx.l, CompSRanaPRL, exp.m = Integration.l$expMC, adj.m = Integration.l$adjMC, :
scheduled cores 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40 did not deliver results, all values of the jobs will be affected

If I ignore this and continue I would get the same problem as Indranil. The SR values are NULL and SR.o$SR is a list instead of a vector.

My Integration.l object looks like this (which seems fine to me):

str(Integration.l)
List of 3
 $ expMC: num [1:9949, 1:20286] 0.138 0.138 2.712 0.138 0.138 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:9949] "29974" "339" "3837" "988" ...
  .. ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
 $ adjMC: num [1:9949, 1:9949] 0 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:9949] "29974" "339" "3837" "988" ...
  .. ..$ : chr [1:9949] "29974" "339" "3837" "988" ...
 $ data : num [1:17215, 1:20286] 0.138 0.138 0.138 0.138 0.138 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:17215] "26155" "339451" "84069" "57801" ...
  .. ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...

If you have any idea what I could be doing wrong, I would be grateful for any tip!

Best,
Katharina

ChenWeiyan commented on July 30, 2024

Hi Katharina,

The integration of data and PPI network looks fine now.

The warning itself indicates that you will not obtain correct SR values for all cells, so I am afraid it cannot simply be ignored.

Warning message:
In mclapply(idx.l, CompSRanaPRL, exp.m = Integration.l$expMC, adj.m = Integration.l$adjMC, :
scheduled cores 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 
37, 38, 39, 40 did not deliver results, all values of the jobs will be affected

In fact, I have encountered this situation before.
In my case, I was using a Linux server running CentOS 7, and the warning appeared because the parallel task consumed too much memory and the machine ran out of it. The warning went away after I set mc.cores to a smaller number.

I am not sure whether your case is exactly the same, but I suspect you are running on a server too, since you have at least 40 cores available :) So lowering the number of cores may help.
Your cell number is also reasonably large, i.e. ~20k. So you should probably check the memory use of your R process first and then estimate a reasonable number of cores from it, so that the total memory does not overflow.

Best,
Weiyan
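
A rough way to turn that advice into a number is sketched below. The helper suggest_cores, the 0.8 safety factor, and the worst-case assumption that every worker needs its own full copy of the data are all illustrative choices, not something LandSCENT prescribes; the available memory has to be looked up on the machine itself:

```r
# Hypothetical helper: upper bound on mc.cores given an estimated memory
# footprint per worker and the memory that is actually available.
suggest_cores <- function(per_worker_bytes, available_bytes, max_cores,
                          safety = 0.8) {
  stopifnot(per_worker_bytes > 0, available_bytes > 0, max_cores >= 1)
  by_memory <- floor(safety * available_bytes / per_worker_bytes)
  as.integer(max(1, min(max_cores, by_memory)))
}

# object.size() gives the in-memory size of an R object in bytes;
# shown here on a small toy matrix standing in for Integration.l$expMC.
toy_expMC <- matrix(0, nrow = 100, ncol = 50)
toy_bytes <- as.numeric(object.size(toy_expMC))
print(toy_bytes)

# Example with made-up figures: about 4 GB per worker, 64 GB free, 40 cores.
cores <- suggest_cores(per_worker_bytes = 4e9, available_bytes = 64e9,
                       max_cores = 40)
print(cores)
```

The resulting value is only a starting point for mc.cores; if the warning persists, the number should be lowered further.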

ChenWeiyan commented on July 30, 2024

I am closing this issue for now due to inactivity. Please reopen if needed.
