Comments (10)
Hi Sinhal,
Thanks for trying out my package!
It seems from your output object My SR.o$SR
. The SR values are not all properly calculated, e.g the 999th/1000th are NULL
.
Moreover, if the calculation of SR was correct, My SR.o$SR
would be vector
, rather than a list
, which is now your case.
But I do not know the detail of your data. So one suggestion is, since 8k cells are not a big number, you can probably extract the expr matrix with command like assay(SingleCellExperiment object, "count")
, and then give it to LandSCENT, which will make things simpler, and you may find out why the error happened.
Hope it be helpful!
Best,
Weiyan
from landscent.
from landscent.
Hi Indranil,
Since you are setting log_trans = TRUE
in DoIntegPPI
, you can actually leave the normalization aside and let the function do it by itself.
And I can tell that you are trying to convert gene IDs. So I attached a script I wrote(and additional resource file) for gene ID converting. Maybe you can grab something useful.
So basically, you only need to extract count matrix from Seurat object ---> check if the row/col names are properly assigned ---> convert the gene IDs ---> then give it to LandSCENT ---> wait to see what will happen.
Best,
Weiyan
from landscent.
Hi Weiyan,
Thank you for this tool! I am currently also trying to use the LandSCENT package for a Seurat object and it seems that I am running into the same problem as Indranil. I am trying to get corresponding gene IDs by using the code from the tutorial (step 3.3).
My code:
> #reading in count data from Seurat object
> expression_matrix_ <- as.matrix(GetAssayData(integrated_new, slot="counts", assay="RNA"))
>
> #get gene annotation fitting to the PPI
> require(AnnotationDbi)
> require(org.Hs.eg.db)
> anno.v <- mapIds(org.Hs.eg.db, keys = rownames(expression_matrix_), keytype = "SYMBOL",
> column = "ENTREZID", multiVals = "first")
> unique_anno.v <- unique(anno.v)
> example_New.m <- matrix(0, nrow = length(unique_anno.v), ncol = dim(expression_matrix_)[2])
>
>
> for (i in seq_len(length(unique_anno.v))) {
> tmp <- expression_matrix_[which(anno.v == unique_anno.v[i]) ,]
> if (!is.null(dim(tmp))) {
> tmp <- colSums(tmp) / dim(tmp)[1]
> }
> example_New.m[i ,] <- example_New.m[i ,] + tmp
> }
>
> rownames(example_New.m) <- unique_anno.v
> colnames(example_New.m) <- colnames(expression_matrix_)
> example_New.m <- example_New.m[-which(rownames(example_New.m) %in% NA) ,]
> matrix_sce <- example_New.m
> str(matrix_sce)
My expression_matrix_ looks like this:
num [1:24113, 1:20286] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:24113] "RP11-34P13.7" "FO538757.2" "AP006222.2" "RP4-669L17.10" ...
..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
And this is what I get out of str(matrix_sce):
num [1:17215, 1:20286] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:17215] "26155" "339451" "84069" "57801" ...
..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
So I think that there is a problem with the step of gene ID conversion, but I can't find it.
I also tried your script for gene conversion Weiyan, but I seem to do something wrong, as if I do this I get the error "scRNA-seq data should have the same gene identifier with the network!" when I do the integration step.
I would appreciate any help!
Best,
Katharina
from landscent.
Hi Katharina,
Thanks for your feedback.
So for the gene ID conversion, the following is a quick explanation for my script:
- If you want to get homologous genes between human and mouse, you may load the HomoG first with
HomoG <- read.delim("homologene.data", header = FALSE)
- Give the function your gene IDs with: (you can set
HomoG = NULL
if no need for homologous genes)
ID_conversion <- GeneMap_opt(HomoG = HomoG, geneID, input_type = "SYMBOL", output_type = "ENTREZID", input_species = "HUM", output_species = "HUM")
- After it is done,
ID_conversion$GeneMapG
will give you the transferred IDs and you can useID_conversion$Gene.idx
to subset the original matrix, i.eNew.data <- Ori.data[ID_conversion$Gene.idx ,]
. And the row names ofNew.data
should be assigned withID_conversion$GeneMapG
.
Hope this can help you.
Best,
Weiyan
from landscent.
Hi Weiyan,
Thank you for this explanation!
I tried this but again it did not work. Here my code:
`expression_matrix_ <- as.matrix(GetAssayData(integrated_new, slot="counts", assay="RNA"))
ID_conversion <- GeneMap_opt(HomoG = NULL, rownames(expression_matrix_), input_type = "SYMBOL", output_type = "ENTREZID", input_species = "HUM", output_species = "HUM")
New.data <- expression_matrix_[ID_conversion$Gene.idx ,]
rownames(New.data) <- ID_conversion$GeneMapGIntegration.l <- DoIntegPPI(exp.m = New.data, ppiA.m = net17Jan16.m, log_trans = TRUE)
`
The error I get here is
Fejl i DoIntegPPI(exp.m = New.data, ppiA.m = net17Jan16.m, log_trans = TRUE) :
Non identical!!!
My New.data object looks like this:
num [1:17215, 1:20286] 0 0 0 0 0 0 0 0 0 0 ...
- attr(, "dimnames")=List of 2
..$ : Named chr [1:17215] "26155" "339451" "84069" "57801" ...
.. ..- attr(, "names")= chr [1:17215] "NOC2L" "KLHL17" "PLEKHN1" "HES4" ...
..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
Can you see what is going wrong here?
Best,
Katharina
from landscent.
Hi Katharina,
I forgot to mention that the output of GeneMap_opt
under GeneMapG
is a named vector.
You need to set
names(rownames(New.data)) <- NULL
right before
DoIntegPPI(exp.m = New.data, ppiA.m = net17Jan16.m, log_trans = TRUE)
Best,
Weiyan
from landscent.
Hi Weiyan,
Thank you so much for that! This part works now but I am still having the same problems with CompSRana as before. I am sorry for bothering again but I am very stuck with this. When I try to run CompSRana, I get the warning:
Warning message:
In mclapply(idx.l, CompSRanaPRL, exp.m = Integration.l$expMC, adj.m = Integration.l$adjMC, :
scheduled cores 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40 did not deliver results, all values of the jobs will be affected
If I ignore this and continue I would get the same problem as Indranil. The SR values are NULL
and SR.o$SR
is a list
instead of a vector.
My Integration.l object look like this (which in my eyes seems fine):
str(Integration.l)
List of 3
$ expMC: num [1:9949, 1:20286] 0.138 0.138 2.712 0.138 0.138 ...
..- attr(, "dimnames")=List of 2
.. ..$ : chr [1:9949] "29974" "339" "3837" "988" ...
.. ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
$ adjMC: num [1:9949, 1:9949] 0 1 1 1 1 1 1 1 1 1 ...
..- attr(, "dimnames")=List of 2
.. ..$ : chr [1:9949] "29974" "339" "3837" "988" ...
.. ..$ : chr [1:9949] "29974" "339" "3837" "988" ...
$ data : num [1:17215, 1:20286] 0.138 0.138 0.138 0.138 0.138 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:17215] "26155" "339451" "84069" "57801" ...
.. ..$ : chr [1:20286] "AAACCTGAGTCGTTTG_1" "AAACCTGAGTGGTCCC_1" "AAACCTGAGTGTACCT_1" "AAACCTGCACAACGTT_1" ...
In case you have a clue what I could be doing wrong I would be very happy about any tip!
Best,
Katharina
from landscent.
Hi Katharina,
The integration of data and PPI network looks fine now.
The warning itself indicates you will not obtain SR correctly for all cells. So I am afraid it cannot be ignored for good.
Warning message:
In mclapply(idx.l, CompSRanaPRL, exp.m = Integration.l$expMC, adj.m = Integration.l$adjMC, :
scheduled cores 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40 did not deliver results, all values of the jobs will be affected
In fact, I have encountered such situation before.
In my case, I was using Linux Server with CentOS 7. And it threw such warning because of running out of memory since the parallel task consumed too much memory. It went away after I changed the mc.cores
into a smaller number.
In your case, I am not sure if it exactly the same. But I suspect you are running on a server too, since you have at least 40 cores to run :) So lowering down the core number may help.
And in your case, your cell number is reasonably large, i.e ~20k. So you probably need to check the memory use of your R process first, and then estimate a reasonable core number based on such memory, In case your overall memory would overflow.
Best,
Weiyan
from landscent.
I closed this issue for now since inactivation. Please reopen if needed.
from landscent.
Related Issues (15)
- install HOT 3
- Error at the InferLandmark() step HOT 6
- CellSR.o not showing the tsne plot and DoDiffusionMap.o$root is not calculated HOT 4
- Error: Failed to install 'unknown package' from GitHub: Line starting 'Lice ...' is malformed! HOT 2
- Error: Failed to install 'LandSCENT' from GitHub: stderr is not a pipe HOT 3
- Plot_DiffusionMap colour by phenotype HOT 2
- DoIntegPPI gene identifier error HOT 3
- Hi,
- Hi, HOT 2
- Out of memory HOT 4
- net13Jun12.m and net17Jan16.m contain duplicate rows and columns HOT 2
- Error: Failed to install 'unknown package' from GitHub HOT 8
- too many NULL in SR HOT 5
- Comparison between different Data sets HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from landscent.