jefworks-lab / stdeconvolve Goto Github PK

View Code? Open in Web Editor NEW

95.0 95.0 11.0 65.67 MB

Reference-free cell-type deconvolution of multi-cellular spatially resolved transcriptomics data

Home Page: http://jef.works/STdeconvolve/

R 100.00%

stdeconvolve's People

Contributors

Stargazers

Watchers

Forkers

gangeszs criphg lgl0522 h-wen wenmm pylyd bharatm26 rimanb namini94 phuongov kriquin-un

stdeconvolve's Issues

Tutorial requests

Ideas for additional tutorials that may be helpful for users:

Starting from 10X output (use mouse cortex Visium as demo example)
Starting from DBiT-seq output
Alternative, faster visualization of results by plotting each deconvolved cell-type proportion as a separate plot to avoid waiting on pie charts to render

choosing the optimal number of "Ks"

What is the best way to choose the optimal number of K's or cell-types for a dataset?
Is it just by observing the plot from the below code? How does one know what to set the upper limit to? i.e. in the below example, there could be more than 9 cell types present.
ldas <- fitLDA(t(as.matrix(cd)), Ks = seq(2, 9, by = 1))
Also, my R session often crashes when running the above code.

Match annotation and deconvolved cell types

Thanks for the tool, I am truly impressed by the remarkable findings presented in your work.

I have been working on implementing your simulation for MPOA. However, when attempting to compare the deconvolved cell-types with the ground truth, I've encountered a challenge. The heatmap generated does not exhibit a clearly defined diagonal block, making it difficult to match the deconvolved cell-types with the ground truth cell-types. In this context, I am seeking guidance on the process of matching the deconvolved cell-types and calculating the Root Mean Square Error (RMSE) concerning the ground truth for STdeconvolve.

Furthermore, I have been working through the GSEA tutorial to assign deconvolved cell types to their corresponding ground truth cell types. However, I have encountered challenges, as there are instances where the process does not yield the desired results.

Even when following the tutorial's provided examples, it remains a complex task to effectively match the deconvolved cell types to a specific annotation. I am seeking guidance on how to achieve a more reliable and accurate matching process.

Any insights or suggestions you can provide regarding this issue would be immensely valuable to me.

Thank you once again for your time and assistance.

Visium 'se' object creation error

Hi,

I have tried to run my data using your visium tutorial. I keep getting this error in the initial step. This happens with both your example data and my data. Could you please tell me how to sort this out.

f <- "visiumTutorial/"
if(!file.exists(f)){
dir.create(f)
}

se <- SpatialExperiment::read10xVisium(samples = f,
type = "sparse",
data = "filtered")
Error in names(xyz) <- names(sfs) <- sids :
'names' attribute [1] must be the same length as the vector [0]

Using stdeconvolve with normalized / integrated data

Dear Brendan,

I am using stdeconvolve with an 8-sample visium integrated seurat object. All samples were individually normalized with seurat's sctransform algorithm before anchor-based integration (also seurat toolkit).

The data output from sctransform (and consequently also after integration) can be negative and is of data format double. Hence, it cannot be used with stdeconvolve.

I have two questions:

why are non-negative integer required for stdeconvolve?
could you think of a way to transform the data in accordance with the algorithms requirements? Would a simple as.integer() + x be a valid way to do this?

Thank you so much in advance for your time!
Best,
Christoph

The order of MPOB slices

Hi @bmill3r , thanks for your excellent work. I saw that the MERFISH MPOB data are split as several different slices by the column called 'Bregma' and there are multimuple values which confused me. They are "-0.04" "-0.09" "-0.14" "-0.19" "-0.24" "-0.29" "0.01" "0.06" "0.11" "0.16" "0.21" "0.26". Could you help me sort these so that I could know the correct order from posterior to anterior? Thanks again for your time.

How to annotate cells in the STdeconvolve

Hi, @bmill3r
Sorry to trouble you. I'm new to analyzing spatial transcriptomes. I am trying to do it with your tools to deconvolve. I have two questions.First, I see the tutorial in “https://github.com/JEFworks-Lab/STdeconvolve/blob/devel/docs/visium_10x.md”， but I don't quite understand how you annotate spots .After I finished running the following code, I got a graph that splits spots into several topics.Are you annotating the genes that are highly expressed in each topic?How do you present the annotated results of each cell? Which object to add to, can you give an example?

optLDA <- optimalModel(models = ldas, opt = 15)
results <- getBetaTheta(optLDA, perc.filt = 0.05, betaScale = 1000)
deconProp <- results$theta
deconGexp <- results$beta
plt <- vizAllTopics(theta = deconProp,
pos = pos,
r = 45,
lwd = 0,
showLegend = TRUE,
plotTitle = NA) +
ggplot2::guides(fill=ggplot2::guide_legend(ncol=2)) +

outer border

ggplot2::geom_rect(data = data.frame(pos),
ggplot2::aes(xmin = min(x)-90, xmax = max(x)+90,
ymin = min(y)-90, ymax = max(y)+90),
fill = NA, color = "black", linetype = "solid", size = 0.5) +
ggplot2::theme(
plot.background = ggplot2::element_blank()
) +

remove the pixel "groups", which is the color aesthetic for the pixel borders

ggplot2::guides(colour = "none")

Second,in the tutorial "https://github.com/JEFworks-Lab/STdeconvolve/blob/devel/docs/visium_10x.md#compare-to-transcriptional-clustering" ，I don't quite understand the relationship between the topics in the diagram above and the cluster in this tutorial.What if the number of topics and clusters doesn't match?Can I determine which topic belongs to which cluster based on best match (highest correlation) of clusters (coms) and topics (decon)?

Cell type specific counts, and testing between conditions

Hello STdeconvolve devs,

I am looking for a good tool to use in my analysis of a large visium dataset, consisting of brain tissue from 2 different conditions. Ultimately, I am interested in comparing gene expression for a given cell type across these conditions.

I think STdeconvolve would be the perfect tool for this assuming it produces cell type specific counts that can be plugged into DE testing algorithms. Is this a feature of STdeconvolve, and if so, do you have any advice for how I can extract said counts and normalize them for DE testing? I have noticed it implemented in certain ref-based deconvolution algorithms, but I am specifically interested in the reference-free approach of this tool. Any advice would be appreciated!

-Dillon

Adding deconvolved cell types to Seurat `meta.data`

Hi @bmill3r ,

Thank you for this amazing tool!

I have run the workflow you have suggested and seems like the topics are closely related to the clusters identified by the Seurat workflow (at least for my Visium data). To double check this, it would be great if I can add a new column to the the meta.data of my Seurat object with the topic that each cell barcode belongs to, or better yet, both the topic and the final cell type annotation for that topic (including NAs) as two new columns to the [email protected].

Is there a wrapper somewhere that I missed to get this information summarises from the theta or beta matrices?

Many thanks,
Shani.

jef.works/stdeconvolve/

Visium image is flipped - mirror symmetry

I am following the Visium tutorial and the plots that are being generated with my Visium data are being flipped around (mirror-image) for each plot. When I read them into the Spatial Experiment object, they are orientated the correct way. How can I correct this? I am following the exact code in the visium tutorial

Could not find function "simulateBregmaSpots"

Dear Brendan,

Sorry to bother you.

I wanted to generate the gene counts and metadata tables of individual cells profiled by MERFISH from the Moffit et al. 2018 raw data. And I followed the code at https://jef.works/STdeconvolve/process_mpoa_data.html.

but when I generate hash tables of simulated spots for each bregma separately by:

 FN7_hash <- simulateBregmaSpots(annot.table_,
                                 counts = counts_, # gene counts matrix
                                 patch_size = 100) # size of the simulated pixels in um2 (units of the Centroids)

I get the following error:

Error in simulateBregmaSpots(annot.table_, counts = counts_, patch_size = 100) : 
  could not find function "simulateBregmaSpots"

I dont't konw how to solve this problem, could you please give me some help?

Thank you so much in advance for your time!
Best,
Yi Liu

ST deconvolve analyses with replicates

Dear all,

I am writing regarding a way of plotting results when having 2 replicates of the same condition. I am comparing healthy condition (n=2) with pathological condition (n=2), meaning 4 brain slices in total. I have run them individually first. I have used a K=9 (identifying 9 topics each). The most upregulated gene gives a name to each Topic and generally, due to the high variability in sample prep, it is hard to identify the same Topic gene label. Interestingly, one topic appeared to be present in both replicates of my pathological condition and not in my healthy one. Now I would like to plot this result but I don't have any good idea to highlight this result. Have you thought how to improve such thing when having biological replicates?

Thank you so much in advance,
Best,

@dcare91

How to make annote on my own data?

it is a great tools!If i want to use my own data, i would like to know how to make the input annotation? I have seen the issue on how to annotation celltype , but i can't understand the input format of annotation and the finnal output. Actually, in my opinion, the final result should be a dataframe, for example, containing the celltype(col), cell ID(row) and the proportion. Thanks.

Cell-type Annotation with STdeconvolve outputs

Hi,

Thank you very much for this amazing tool!
After running STdeconvolve on human femoral & carotid plaque Visium data, I'm thinking of doing cell-type annotation with Azimuth. Do you have any suggestions on how to implement Azimuth on STdeconvolve outputs?

Thanks,
Daniel Lee

Clustering a specific cell type

Hello Stdeconvolve developers,

I am interested in using Stdeconvolve to better understand a large spatial dataset I'm working with, and am most excited about its ability to generate cell type specific transcriptomes. However, when I tried it out, I had some difficulty isolating a cell type that interests me most. Would it be possible to tailor the analysis pipeline to focus in on a cell type I care most about, using known markers for that given cell type? I would be grateful for any help with this.

-Dillon

mOB dataset

Hello,
I am using Spatial 10X Visium data and I have been able to load my data most of the time, except when your example is set with the mOB or data(mOB) dataset. It works great and is very useful. Is there a way to understand what was done to obtain the mOB included in STdeconvolve or any other repositories as if I was processing it from a Spatial 10X Visium SpaceRanger output? Thank you!
Best regards,

Integrated Object Visualization

Hi Brendan,

I hope this inquiry finds you well, and thank you very much for kindly answering my questions. :)

I'm also trying to validate my cell type deconvolution results by integrating 3 Visium data before running the deconvolution function. However, when I followed the dataset integration instructions you've suggested (https://github.com/JEFworks-Lab/STdeconvolve/blob/devel/docs/process_bcl_data.md), ran the deconvolution functions (fitLDA, optimalModel, and getBetaTheta), and tried to visualize the deconvolution scene for each sample, I could not recall any spots. Here's the error message I've got: "Plotting scatterpies for 0 pixels with 20 cell-types...this could take a while if the dataset is large". I also looked into your dialogue with AteeqMKhaliq about this (#21) but I still couldn't make anything other than an empty plot. Could you give any pointers on how I may be able to fix this?

Here's the entire code I ran:
###########################################
#1 Data integration steps

Read the data

se1 <- SpatialExperiment::read10xVisium("/Users/.../Matrix/PIC83", "PIC83", type = "sparse", data = "filtered")
se2 <- SpatialExperiment::read10xVisium("/Users/.../Matrix/PIC84", "PIC84", type = "sparse", data = "filtered")
se3 <- SpatialExperiment::read10xVisium("/Users/.../Matrix/PIC85-1", "PIC85-1", type = "sparse", data = "filtered")

Combine counts and positions for all samples

all_counts <- cbind(as.matrix(counts(se1)), as.matrix(counts(se2)), as.matrix(counts(se3)))
all_pos <- rbind(spatialCoords(se1), spatialCoords(se2), spatialCoords(se3))

Assign slice identifiers

se1_slice <- rep(1, nrow(spatialCoords(se1)))
se2_slice <- rep(2, nrow(spatialCoords(se2)))
se3_slice <- rep(3, nrow(spatialCoords(se3)))
all_slice <- c(se1_slice, se2_slice, se3_slice)
names(all_slice) <- rownames(all_pos)

clean up poor spots and genes

allClean <- cleanCounts(counts = all_counts,
min.reads = 10,
min.lib.size = 10,
verbose = TRUE)

merge the 3 sections together

all_paths <- list()
all_paths[[1]] <- as.matrix(t(all_counts[, names(all_slice[all_slice == 1])]))
all_paths[[2]] <- as.matrix(t(all_counts[, names(all_slice[all_slice == 2])]))
all_paths[[3]] <- as.matrix(t(all_counts[, names(all_slice[all_slice == 3])]))

Selecting ODGenes across 3 samples

all_genes <- lapply(all_paths, function(p){
print(dim(p))
genes <- colnames(p)
print(length(genes))
genes
})
all_genes <- Reduce(intersect, all_genes)
length(all_genes)

all_ODgenes <- lapply(all_paths, function(p) {
dat <- preprocess(p,
extractPos = FALSE,
selected.genes = all_genes,
nTopGenes = 5,
genes.to.remove = NA,
removeAbove = 0.95,
removeBelow = NA,
min.reads = 10,
min.lib.size = 10,
min.detected = 1,
ODgenes = TRUE,
od.genes.alpha = 0.05,
gam.k = 5,
nTopOD = 500)
colnames(dat$corpus)
})
unionAllGenes <- Reduce(union, all_ODgenes)
length(unionAllGenes)

allCorpus <- preprocess(t(as.matrix(all_counts)),
extractPos = FALSE,
selected.genes = unionAllGenes,
min.reads = 0,
min.lib.size = 0)
allCorpus$pos <- all_pos[rownames(allCorpus$corpus), ]

filter spots in all_slice based on those kept in the corpus

all_slice_filt <- all_slice[names(all_slice) %in% rownames(allCorpus$pos)]
length(all_slice_filt)
allCorpus$slice <- all_slice_filt
dim(allCorpus$corpus)
allCorpus$slm
dim(allCorpus$pos)
length(allCorpus$slice)

###########################################
#2 Running the LDA model

fit LDA models to the data

ldas <- fitLDA(allCorpus$corpus, Ks = seq(20, 20, by = 1),
perc.rare.thresh = 0.05,
plot=TRUE,
verbose=TRUE)

optLDAS <- optimalModel(models = ldas, opt = 20)
results <- getBetaTheta(optLDAS, perc.filt = 0.05, betaScale = 1000)

Create a mapping between Ensembl IDs and gene symbols using the biomaRt package

ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
ensembl_gene_mapping <- getBM(attributes = c("ensembl_gene_id", "external_gene_name"), mart = ensembl)

Write a function to convert Ensembl IDs to gene symbols using the mapping

convert_ensembl_to_symbol <- function(ensembl_id, mapping) {
gene_symbol <- mapping$external_gene_name[match(ensembl_id, mapping$ensembl_gene_id)]
return(gene_symbol)
}

Convert the Ensembl IDs in the colnames(results$beta) into gene symbols

colnames(results$beta) <- sapply(colnames(results$beta), convert_ensembl_to_symbol, mapping = ensembl_gene_mapping)

Check for any NA

any_na <- any(is.na(colnames(results$beta)))

no NA value

Top 10 marker genes for each cluster

topGenes(results$beta, n=10)
###########################################
#3 Deconvolution Visualization

Extract the theta values and position information for the first sample

theta1 <- results$theta[rownames(allCorpus$pos)[allCorpus$slice == 1],]
pos_int1 <- allCorpus$pos[rownames(allCorpus$pos)[allCorpus$slice == 1],]
colnames(pos_int1) <- c("y", "x")

vizAllTopics(theta = theta1,
pos = pos_int1,
r = 5,
lwd = 0,
showLegend = TRUE,
plotTitle = NA) +
ggplot2::guides(fill = ggplot2::guide_legend(ncol = 1)) +
ggplot2::geom_rect(data = data.frame(pos_int1),
ggplot2::aes(xmin = min(x) - 90, xmax = max(x) + 90,
ymin = min(y) - 90, ymax = max(y) + 90),
fill = NA, color = "black", linetype = "solid", size = 0.5) +
ggplot2::theme(
plot.background = ggplot2::element_blank()
) +
ggplot2::guides(colour = "none")

###########################################
Here are the outputs I get from running "dim(theta1) and dim(pos_int1)":
[1] 1644 20
[1] 1644 2
Also, here are few first-line overview of theta1 and pos_int1 when I run head functions:
head(theta1):
1 2 3 4
AAACAAGTATCTCCCA-1 0.09521523 0.06524648 0.000000 0.0000000
AAACAGAGCGACTCCT-1 0.00000000 0.00000000 0.000000 0.0000000
AAACATTTCCCGGATT-1 0.00000000 0.00000000 0.000000 0.3612481
AAACCCGAACGAAATC-1 0.05972998 0.00000000 0.000000 0.0000000
AAACCTAAGCAGCCGG-1 0.00000000 0.00000000 0.000000 0.0000000
AAACCTCATGAAGTTG-1 0.00000000 0.00000000 0.254858 0.0000000
5 6 7 8 9 10
AAACAAGTATCTCCCA-1 0.0000000 0.00000000 0 0 0.0000000 0.2516840
AAACAGAGCGACTCCT-1 0.0000000 0.00000000 0 0 0.1006190 0.7903896
AAACATTTCCCGGATT-1 0.0000000 0.00000000 0 0 0.0000000 0.4040613
AAACCCGAACGAAATC-1 0.0000000 0.06014447 0 0 0.4313043 0.0000000
AAACCTAAGCAGCCGG-1 0.3908903 0.00000000 0 0 0.0000000 0.0000000
AAACCTCATGAAGTTG-1 0.0000000 0.00000000 0 0 0.0000000 0.0000000
11 12 13 14 15 16
AAACAAGTATCTCCCA-1 0.4222455 0 0 0.1656088 0.0000000 0.0000000
AAACAGAGCGACTCCT-1 0.0000000 0 0 0.0000000 0.0000000 0.0000000
AAACATTTCCCGGATT-1 0.0000000 0 0 0.0000000 0.2346906 0.0000000
AAACCCGAACGAAATC-1 0.2131955 0 0 0.0000000 0.0000000 0.0858006
AAACCTAAGCAGCCGG-1 0.1892510 0 0 0.0000000 0.0000000 0.2062755
AAACCTCATGAAGTTG-1 0.4748901 0 0 0.0000000 0.0000000 0.0000000
17 18 19 20
AAACAAGTATCTCCCA-1 0.0000000 0 0.0000000 0
AAACAGAGCGACTCCT-1 0.1089915 0 0.0000000 0
AAACATTTCCCGGATT-1 0.0000000 0 0.0000000 0
AAACCCGAACGAAATC-1 0.0000000 0 0.1498251 0
AAACCTAAGCAGCCGG-1 0.0000000 0 0.2135833 0
AAACCTCATGAAGTTG-1 0.0000000 0 0.2702520 0

head(pos_int1):
y x
AAACAAGTATCTCCCA-1 9041 14153
AAACAGAGCGACTCCT-1 16203 13154
AAACATTTCCCGGATT-1 6843 13606
AAACCCGAACGAAATC-1 10054 15630
AAACCTAAGCAGCCGG-1 6027 12012
AAACCTCATGAAGTTG-1 11521 4619

I know this is a wildly lengthy post, but I couldn't figure out what the issue is after spending almost the entire day debugging, so I would really appreciate your help!!

Thank you very much,
Daniel

Question about STdeconvolve: MOB layer transcriptional profiles

Hi，@bmill3r，

I recently read your paper "Reference-free cell type deconvolution of multi-cellular pixel-resolution spatially resolved transcriptomics data" and found it very insightful.

I tried to replicate the results of the MOB dataset using the code shared on GitHub and got good results. But I have a question: the transcriptional profiles of different layers of MOB were provided on github. How did you calculate these transcriptional profiles? Where can I find the corresponding single-cell data set about layer transcriptional profiles?

Error when loading counts matrix in "additional features" preprocessing step

Hello,

Hope you are all doing well. I've had success running MERINGUE and implementing the "getting started" tutorial for STdeconvolve with our data. I had no issues loading in our counts (cd) matrix (large dgCMatrix) until the pre-processing step in the "additional features" tutorial:

corpus1 <- preprocess(t(cd),
                       alignFile = NA, # if there is a file to adjust pixel coordinates this can be included.
                       extractPos = FALSE, # optional argument
                       selected.genes = NA, # 
                       nTopGenes = 3, # remove the top 3 expressed genes (genes with most counts) in dataset
                       genes.to.remove = c("^Trmt"), # ex: remove tRNA methyltransferase genes (gene names that begin with "Trmt")
                       removeAbove = 0.95, # remove genes present in 95% or more of pixels
                       removeBelow = 0.05, # remove genes present in 5% or less of pixels
                       min.reads = 10, # minimum number of reads a gene must have across pixels
                       min.lib.size = 100, # minimum number of reads a pixel must have to keep (before gene filtering)
                       min.detected = 1, # minimum number of pixels a gene needs to have been detected in
                       ODgenes = TRUE, # feature select for over dispersed genes
                       nTopOD = 100, # number of top over dispersed genes to use, otherwise use all that pass filters if `NA`
                       od.genes.alpha = 0.05, # alpha param for over dispersed genes
                       gam.k = 5, # gam param for over dispersed genes
                       verbose = TRUE)

I get the error:
Error in t.default(cd) : argument is not a matrix

Was wondering if you had any insight as to why it does not recognize the cd counts matrix considering that it was working for the other analyses-- can share what our matrix looks like if need be.

Thanks in advance!

Error in asMethod(object) : Cholmod error 'problem too large' at file

Hi,
Sorry to bother you, but I am stuck again at one point here, I am currently having a combined object with ~130,000 cells and when I run the restrictCorpus I am getting the following error

## filter for features in less than 100% of pixels but more than 5% of pixels
pdac.mat <- restrictCorpus(pdach1.mat, removeAbove = 1.0, removeBelow = 0.05)

and the error is as follows


Error in asMethod(object) :
  Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 102
Calls: restrictCorpus ... is.data.frame -> as.matrix -> as.matrix.Matrix -> as -> asMethod
Execution halted

Please let me know how best I can handle this error.

Thanks again, and have a great day.
Regards,
AMK

Question related to 10x Visium STdeconvolve

Hi,

I have a question related to STdeconvolve and if it's possible to fit the model on multiple samples at the same time? I have in total 16 samples that are all similar/equal to one another. Where I would like to run STdeconvolve and fit the model on all 16 samples. I've created a script for that to run it on the spatial raw data but I was wondering if the model can be used on 16 samples at the same time and still detect differences in cell type compositions between samples if there are any? The hypothesis is that we do not have cell type differences between the samples. However, we would like to know where which cell type is located within our samples.

Thank you,
Joy

Analyze specific area of the slide

Hi there, thanks for developing STdeconvolve. Is there a way to define a particular area on the slide and analyze only that, e.g. in case the slide has two sections of tissue or within the same section?

How to know what are topics when plotting pie charts on that pixel?

I tried different ways to define the topic cell type but not mentioned in the algorithm I guess.Wanted to ask you guys If we can actually say what a topic here instead of groups.

scatter plot circles instead of sctter pie circles

@bmill3r Would it be possible to plot the output of vizAllTopics as individual scatterplot circles instead of scatterpie circles? For example with geom_point() or some other way. Could I edit the vizAllTopics function to do this? And if so where can I access the code to do so?
Thanks.

Error when using annotateCellTypesGSEA

I'm trying to annotate cell types of visium 10x data using the annotateCellTypesGSEA function. This is a minimal script to reproduce the error

library(STdeconvolve)
library(SpatialExperiment)



se <- SpatialExperiment::read10xVisium(samples = "./116_run",
           type = "sparse",
           data = "filtered")
se

# this is the genes x barcode sparse count matrix
cd <- se@assays@data@listData$counts
pos <- SpatialExperiment::spatialCoords(se)

# change column names to x and y
# for this dataset, we will visualize barcodes using "pxl_col_in_fullres" = "y" coordinates, and "pxl_row_in_fullres" = "x" coordinates
colnames(pos) <- c("y", "x")

counts <- cleanCounts(cd, min.lib.size = 100, min.reads = 10)
corpus <- restrictCorpus(counts, removeAbove=1.0, removeBelow = 0.05, nTopOD = 1000)

#ldas <- fitLDA(t(as.matrix(corpus)), Ks = seq(2, 15, by = 1),
ldas <- fitLDA(t(as.matrix(corpus)), Ks = 10,
               perc.rare.thresh = 0.05,
               plot=TRUE,
               verbose=TRUE)

optLDA <- optimalModel(models = ldas, opt = "min")

results <- getBetaTheta(optLDA, perc.filt = 0.05, betaScale = 1000)
deconProp <- results$theta
deconGexp <- results$beta



gset <- list(
  test_type1 = c("FLG", "FLG2"),
  test_type2 = c("KTR25", "KTR71")
)

celltype_annotations <- annotateCellTypesGSEA(beta = results$beta, gset = gset, qval = 0.05)

The error itself is

initial: [1e+02 - Error in stats::ecdf(as.numeric(rvl$p/s.pm)) :
  'x' must have 1 or more non-missing values
Calls: annotateCellTypesGSEA -> <Anonymous> -> bulk.gsea -> <Anonymous>
Execution halted

I can provide the dataset by email if needed

deconvolved cell-types to seurat object

Hi Brendan,
Sorry for bothering you again, I have deconvolved the cell types and I have compared them in terms of clustering the barcodes transcriptionally as detailed in 10X tutorials. Now I have 18 beautiful clusters, is there a way I can add or convert or create a 2D t-SNE embedding into a Seurat object? I wanted to do the downstream analysis between these clusters. Sorry if I am asking too many questions, but trust me you are my last hope. Thanks again for creating such a beautiful tool.
Best wishes,
Regards,
AK

STdeconvolve and Seurat

Hi all!

I performed my ST analysis from 10X using Seurat framework. Then I discovered this amazing tool to deepen in my data without using Single Cell. Would have any option to combine Seurat objects with STdeconvolve?

Thanks in advance.

Error when running “vizAllTopics”

Hi,

This is a great software！
I am currently using 10x visium data for analysis and want to define cell types, but when running "vizAllTopics", I get an error:

“Error: 'transparentCol' is not an exported object from 'namespace:STdeconvolve'”

This is my code：

plt <- vizAllTopics(theta = deconProp, pos = pos, groups = rep("0", dim(deconProp)[1]), group_cols = c("0" = STdeconvolve::transparentCol("black", percent = 100)), r = 45, lwd = 0, showLegend = TRUE, plotTitle = NA)

Inconsistencies with Louvain Clustering

Hi there,
Thank you for developing such a useful and well-documented tool.

While processing my data, I have noticed inconsistencies in the transcriptional clustering.

Running this exact same block of code (adapted from the vignette) back-to-back on the same sample has resulted in several different figures, some with different numbers of clusters. See figures below.

pcs.info <- stats::prcomp(t(log10(as.matrix(counts)+1)), center=TRUE)
nPcs <- 8 
pcs <- pcs.info$x[,1:nPcs]
emb <- Rtsne::Rtsne(pcs,
                    is_distance=FALSE,
                    perplexity=30,
                    num_threads=1,
                    verbose=FALSE)$Y
rownames(emb) <- rownames(pcs)
colnames(emb) <- c("x", "y")
k <- 25
com <- MERINGUE::getClusters(pcs, k, weight=TRUE, method = igraph::cluster_louvain)
tempCom <- com
dat <- data.frame("emb1" = ppos[,"x"],
                  "emb2" = ppos[,"y"],
                  "Cluster" = tempCom)
plt <- ggplot2::ggplot(data = dat) +
  ggplot2::geom_point(ggplot2::aes(x = emb1, y = emb2,
                                   color = Cluster), size = 1.2) +
  ggplot2::scale_color_manual(values = rainbow(n = 15)) +
  # ggplot2::scale_y_continuous(expand = c(0, 0), limits = c( min(dat$emb2)-1, max(dat$emb2)+1)) +
  # ggplot2::scale_x_continuous(expand = c(0, 0), limits = c( min(dat$emb1)-1, max(dat$emb1)+1) ) +
  ggplot2::labs(title = "",
                x = "x",
                y = "y") +
  ggplot2::theme_classic() +
  ggplot2::theme(axis.text.x = ggplot2::element_text(size=15, color = "black"),
                 axis.text.y = ggplot2::element_text(size=15, color = "black"),
                 axis.title.y = ggplot2::element_text(size=15),
                 axis.title.x = ggplot2::element_text(size=15),
                 axis.ticks.x = ggplot2::element_blank(),
                 plot.title = ggplot2::element_text(size=15),
                 legend.text = ggplot2::element_text(size = 12, colour = "black"),
                 legend.title = ggplot2::element_text(size = 15, colour = "black", angle = 0, hjust = 0.5),
                 panel.background = ggplot2::element_blank(),
                 plot.background = ggplot2::element_blank(),
                 panel.grid.major.y =  ggplot2::element_blank(),
                 axis.line = ggplot2::element_line(size = 1, colour = "black")
                 # legend.position="none"
  ) +
  ggplot2::guides(colour = ggplot2::guide_legend(override.aes = list(size=2), ncol = 2)
  ) +
  ggplot2::coord_equal()
plt

Various figures produced:

I know the tSNE/UMAP plots are often vary slightly in their arrangement, but I haven't encountered a situation where the number of clusters change. Can you clarify if this is normal, or a bug, or something I'm missing/doing incorrectly? (I apologize in advance if this is an issue I should have opened in the Meringue GitHub.)

Thank you!

How to start with 10 x visium data

Hi,

Very useful software!
I just wonder how to use it with the 10 x visium spaceranger output?
Do you have a brief intro for that?
Thank you.

Superimpose scatterpies on an H&E image

Dear Dr. Miller (@bmill3r ),

I am working on spatial transcriptomics data that I created using the 10X visium kit.
I used this tool to deconvolute the cell types in my tissue and it works like a charm.
Now I am trying to superimpose the scatterpies I got on top of the H&E images.
I saw in the vizAllTopics function that you added an "overlay" argument but I didn't succeeded to place the scatterpies and the image on top of each other.
Can I please get your assistance on how to do so?

Thank you so much for creating STdeconvolve,
Best regards,
Peleg

integrated Object

Hi, Thanks for developing reference free Deconvolve method, I thoroughly enjoy running it on the example data. Going forward, I have an integrated spatial Seurat object, we have 9 spatial samples(images ) as one object. My question is how best I can use this method on our dataset. Should I run STdeconvolve on individual samples or can I run this together? My codes are as follows.


` extract the counts matrix

mat <- Matrix((st@assays$Spatial@counts), sparse = TRUE)

remove poor genes and pixels
mat <- cleanCounts(mat, min.lib.size = 100)
filter for features in less than 100% of pixels but more than 5% of pixels
pdach1.mat <- restrictCorpus(mat, removeAbove = 1.0, removeBelow = 0.05)


pdf("pdach1.perplexity.pdf")
ldas = fitLDA(mat, Ks = seq(2, 9, by=1), plot=T, verbose=TRUE)
dev.off()

optLDA <- optimalModel(models = ldas, opt = "kneed")

results <- getBetaTheta(optLDA, perc.filt = 0.05, betaScale = 1000)
deconvolution resultsattach(frame)
deconProp =results$theta
write.table(deconProp, "All_results_decon.txt", quote=F)

 predicted transcriptional profiles of the deconvolved cell-types as the beta matrix
deconGexp = results$beta

#SAMPLE 1
sample1.pos <- GetTissueCoordinates(st@images$sample1)

 change column names to x and y
colnames(sample1.pos) <- c('x','y')
`

Please let me know how best I can approach this issue.

Thank you for your help.
Regards,
AK

How to work on Slide-seq adata object?

Dear STdeconvolve developers,

I am very interested in using STdeconvolve to explore my dataset, I know this maybe a silly question.
But how to load Slide-seq data to the workflow of STdeconvolve?
I am familiar with R and Python, maybe point me to some documents or tell me the strategy would be fine.

Thanks!

Giotto Object to STdeconvolve, and general questions about workflow, choosing K, cell type annotation, etc

Thank you so much for creating this tool for deconvolution without scRNAseq data.

I am working with an extremely rare pediatric tumor where the samples are few and far between and there are basically no publicly available scRNAseq reference dataset.

I am currently running my samples through Visium platform for SRT and using Giottosuite to ultimate analyze my data. The output is a Giotto object. I have read through the documentation using 10x data but I wasn't sure how to utilize Giotto object in the code for setting up to perform STdeconvolve. Can I please get some assistance.

Thank you so much I cannot wait to use STdeconvolve

How do i view all log2fc gene.

Hi,

I've been following the 10x Visium protocol and it seems to work well, but before i annotate the cell type, i would love to do some sanity check of the log2fc genes expression. I am trying to pull all the log2fc gene expression but after following the instruction, it only allows me to view top 6 gene after applying the following command

"log2topgene2 <- lapply(1:20, function(i) {
head(sort(log2(deconGexp[i,]/colMeans(deconGexp[-i,])), decreasing=TRUE))
})"

"ps <- lapply(colnames(deconProp), function(celltype) {

celltype <- as.numeric(celltype)

highly expressed in cell-type of interest

highgexp <- names(which(deconGexp[celltype,] > 3))

high log2(fold-change) compared to other deconvolved cell-types

log2fc <- sort(log2(deconGexp[celltype,highgexp]/colMeans(deconGexp[-celltype,highgexp])), decreasing=TRUE)
markers <- names(log2fc)[1] ## label just the top gene"

Is there a way not limit the gene and maybe display top 200 genes?

Suggestions to improve user experienced

As I'm using STdeconvolve on new datasets, here are some enhancements that I believe will help improve the user experience. This is a running list. Please feel free to add and check off as needed.

fitLDA would benefit from a progress bar that shows if verbose=TRUE. See https://github.com/r-lib/progress
vizAllTopics should check if groups is a factor and cast if not (otherwise throws errors)
vizAllTopics would benefit from automatically setting an appropriate r based on the scale of pos unless users override with a specific choice (ex. max(0.5, max(pos)/nrow(pos)*10 or something)
restrictCorpus should limit the number of genes to some top most variable set (similar to how veloviz does it) in the event there are too many features (ex > 1000 by default or some other user-specified number)
fitLDA needs a legend to indicate the shaded regions which are where alpha > 1
progress bar for plotting? (I don't think this is possible with ggplot, so maybe some indication it is doing something?
function to rotate the coordinates for plotting (UPDATE: see responses below for quick solution to this)

From STdeconvolve/docs/visium_10x.md. Error when call function CleanCounts()

This is my SpatialExperiment object:

spatial_data <- read10xVisium(
    samples='data/sample',                  
    sample_id='10X_Mouse_Brain',  
    type='HDF5',
    data = 'filtered',
    images='hires',
    load=FALSE
)
spatial_data

class: SpatialExperiment
dim: 32285 2702
metadata(0):
assays(1): counts
rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ...
ENSMUSG00000095019 ENSMUSG00000095041
rowData names(1): symbol
colnames(2702): AAACAAGTATCTCCCA-1 AAACAATCTACTAGCA-1 ...
TTGTTTCCATACAACT-1 TTGTTTGTGTAAATTC-1
colData names(4): in_tissue array_row array_col sample_id
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
imgData names(4): sample_id image_id data scaleFactor

Using CleanCounts():

x <- spatial_data@assays@data@listData$counts
counts <- cleanCounts(x, min.lib.size = 100, min.reads = 10)

Error in base::rowSums(x, na.rm = na.rm, dims = dims, ...): 'x' must be an array of at least two dimensions
Traceback:

cleanCounts(x, min.lib.size = 100, min.reads = 10)
Matrix::rowSums(counts)
Matrix::rowSums(counts)
base::rowSums(x, na.rm = na.rm, dims = dims, ...)
stop("'x' must be an array of at least two dimensions")

I look at source code of cleanCounts() function, and found that the problem is due to the line `counts <- Matrix::Matrix(counts, sparse = TRUE)` which will convert count matrix of SpatialExperiment into a sparse matrix with shape (87234070 x 1). And when apply filter metrics to that matrix, the result will be 0 cells and 0 genes.

STdeconvolve

The differentially expressed genes for each deconvolved cell-type

Hi all,
I don't have single cell data. I want to use this tool to determine the cell type of my 10x visium data. I would like to get the top5 marker genes in each cluster instead of the top one. What should I do? Thanks!

Question about MERFISH benchmarking

Thanks for your great work. I'm wondering in the MERFISH benchmarking, how do you align the cell types of scRNA-seq which is used as a reference to the cell types of cells in MERFISH and where do you obtain the scRNA-seq data. Could you please explain this detailedly to me? Thanks.

Combination of STdeconvolve with BayesSpace

Hi bmill3r,

Thank you very much for this exciting package. It is very useful, especially when retrospectively analyzing archival tissue with Visium FFPE.

Since you feature the BayesSpace package in the Nat Com Paper (https://github.com/edward130603/BayesSpace), I was wondering if an upstream BayesSpace resolution enhancement would improve the performance of the STdeconvolve algorithm. I could imagine that STdeconvolve would in principle benefit from a higher (although predicted) resolution of the underlying gene expression matrix.

Do you think that this approach would be statistically sound or would it violate the underlying STdeconvolve assumptions?

Thank you very much for time in advance and thanks again for your great work.

Mismatching when trying to reproduce 10x Visium Coronal Mouse Brain tutorial

Hi,

I am trying to reproduce the coronal mouse brain section tutorial https://jef.works/STdeconvolve/visium_10x.html.

If I use exactly the same code as the one shown in that tutorial, I get some different results mainly because of the filtering step:

library(STdeconvolve)
library(MERINGUE)
library(MUDAN)

f <- "visiumTutorial/"

if(!file.exists(f)){
dir.create(f)
}

if(!file.exists(paste0(f, "V1_Adult_Mouse_Brain_filtered_feature_bc_matrix.tar.gz"))){
tar_gz_file <- "http://cf.10xgenomics.com/samples/spatial-exp/1.1.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_filtered_feature_bc_matrix.tar.gz"
download.file(tar_gz_file,
destfile = paste0(f, "V1_Adult_Mouse_Brain_filtered_feature_bc_matrix.tar.gz"),
method = "auto")
}
untar(tarfile = paste0(f, "V1_Adult_Mouse_Brain_filtered_feature_bc_matrix.tar.gz"),
exdir = f)

if(!file.exists(paste0(f, "V1_Adult_Mouse_Brain_spatial.tar.gz"))){
spatial_imaging_data <- "http://cf.10xgenomics.com/samples/spatial-exp/1.1.0/V1_Adult_Mouse_Brain/V1_Adult_Mouse_Brain_spatial.tar.gz"
download.file(spatial_imaging_data,
destfile = paste0(f, "V1_Adult_Mouse_Brain_spatial.tar.gz"),
method = "auto")
}
untar(tarfile = paste0(f, "V1_Adult_Mouse_Brain_spatial.tar.gz"),
exdir = f)

se <- SpatialExperiment::read10xVisium(samples = f,
type = "sparse",
data = "filtered")
se

cd <- se@assays@data@listData$counts

pos <- SpatialExperiment::spatialCoords(se)

change column names to x and y

for this dataset, we will visualize barcodes using "pxl_col_in_fullres" = "y" coordinates, and "pxl_row_in_fullres" = "x" coordinates

colnames(pos) <- c("y", "x")

counts <- cleanCounts(cd, min.lib.size = 100, min.reads = 10)

The above line outputs: > colnames(pos) <- c("y", "x")

counts <- cleanCounts(cd, min.lib.size = 100, min.reads = 10)
Filtering matrix with 2702 cells and 32285 genes ...
Resulting matrix has 26 cells and 1596 genes

Further to this,

corpus <- restrictCorpus(counts, removeAbove=1.0, removeBelow = 0.05, nTopOD = 1000)

outputs:

Removing 23 genes present in 100% or more of pixels...
1573 genes remaining...
Removing 0 genes present in 5% or less of pixels...
1573 genes remaining...
Restricting to overdispersed genes with alpha = 0.05...
Calculating variance fit ...
Using gam with k=5...
20 overdispersed genes ...
Using top 1000 overdispersed genes.

number of top overdispersed genes available: 20

However, in your tutorial the same lines of code produce:

Removing 21 genes present in 100% or more of pixels...

17429 genes remaining...

Removing 4472 genes present in 5% or less of pixels...

12957 genes remaining...

Restricting to overdispersed genes with alpha = 0.05...

Calculating variance fit ...

Using gam with k=5...

2348 overdispersed genes ...

Using top 1000 overdispersed genes

Can you please check this and help me understand why there is mismatching? Is it just a typo in your tutorial? If yes can you please let me know the correct filter you applied?

One more question!

Is it possible to run stDeconvolve on the whole dataset - i.e. apply no filtering at all ?

Thanks!!

InferCNV from STdeconvolve data

Hi @bmill3r ,

Thank you for developing this tool. Deconvolution and annotation have worked well for me thus far. I am now looking to attempt to do CNV analysis on the deconvolved cell types (I have already done them on the non-deconvolved samples) as a form of comparison and achieving more granularity. I am struggling with creating the initial infercnv Object. I'm not sure your familiarity with the package, so I'll briefly detail what is needed.

In order to create an infercnv object, I need a raw_counts_matrix which is the matrix of genes (rows) vs. cells (columns) containing the raw counts. If I am correct, I can use the corpus for my raw counts matrix so that isn't an issue.

You then need an annotations_file which for me is typically a matrix of the barcodes (rows) vs Seurat clusters (columns).
The goal would be to have the annotations file be the barcodes corresponding to each cell type. I know the results$theta are proportions of each cell type for each spot, so I am having difficulty trying to circumvent that issue either by making each barcode correspond to the cell type of the highest proportion, or another idea if you have one.

I greatly appreciate any assistance you can give.

Thanks,
Rob

Integration with STUtility

Hi, I'm wondering if there is a way for to integrate STdeconvolve with STUtility/Seurat?
I have already performed filtering and clustering using these with a saved s4 object (example below)

Formal class 'Seurat' [package "SeuratObject"] with 13 slots
..@ assays :List of 2
.. ..$ RNA:Formal class 'Assay' [package "SeuratObject"] with 8 slots
.. ..$ SCT:Formal class 'SCTAssay' [package "Seurat"] with 9 slots
..@ meta.data :'data.frame': 11714 obs. of 22 variables:
.. ..$ orig.ident : Factor w/ 1 level "SeuratProject": 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ nCount_RNA : num [1:11714] 4083 3920 8126 8507 4338 ...
.. ..$ nFeature_RNA : int [1:11714] 1429 1466 2578 2524 1752 1650 3310 4130 2476 2823 ...
.. ..$ slide : chr [1:11714] "D17" "D17" "D17" "D17" ...
.. ..$ treatment : chr [1:11714] "PBS" "PBS" "PBS" "PBS" ...
.. ..$ id : int [1:11714] 27 29 31 33 35 37 39 41 43 45 ...
.. ..$ labels : chr [1:11714] "Default" "Default" "Default" "Default" ...
.. ..$ sample_id : chr [1:11714] "section_1" "section_1" "section_1" "section_1" ...
.. ..$ slide1 : Factor w/ 6 levels "D17","D19","D54",..: 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ percent.mito : num [1:11714] 29.5 27.6 21.7 21.7 20.1 ...
.. ..$ percent.ribo : num [1:11714] 8.55 8.55 9.61 10.09 9.84 ...
.. ..$ nCount_SCT : num [1:11714] 11606 11413 11185 11262 11202 ...
.. ..$ nFeature_SCT : int [1:11714] 1991 2066 2589 2533 2202 2120 3310 3857 2486 2823 ...
.. ..$ SCT_snn_res.0.1: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ SCT_snn_res.0.2: Factor w/ 8 levels "0","1","2","3",..: 1 1 1 1 1 1 1 6 1 1 ...
.. ..$ SCT_snn_res.0.3: Factor w/ 10 levels "0","1","2","3",..: 1 1 5 5 1 5 5 7 5 5 ...
.. ..$ SCT_snn_res.0.4: Factor w/ 11 levels "0","1","2","3",..: 1 1 3 3 1 3 3 6 3 3 ...
.. ..$ SCT_snn_res.0.6: Factor w/ 13 levels "0","1","2","3",..: 1 1 2 2 1 2 2 5 2 2 ...
.. ..$ SCT_snn_res.0.8: Factor w/ 16 levels "0","1","2","3",..: 1 1 3 7 1 7 3 9 7 3 ...
.. ..$ SCT_snn_res.1 : Factor w/ 18 levels "0","1","2","3",..: 1 1 2 16 1 16 2 4 16 2 ...
.. ..$ SCT_snn_res.1.2: Factor w/ 21 levels "0","1","2","3",..: 1 1 2 7 1 7 2 21 7 2 ...
.. ..$ seurat_clusters: Factor w/ 8 levels "0","1","2","3",..: 1 1 1 1 1 1 1 6 1 1 ...
..@ active.assay: chr "SCT"
..@ active.ident: Factor w/ 8 levels "0","1","2","3",..: 1 1 1 1 1 1 1 6 1 1 ...
.. ..- attr(*, "names")= chr [1:11714] "CTGCACCTGGAACCGC-1_1" "GTTCTTCCCTCGATGT-1_1" "AACGATAGAAGGGCCG-1_1" "AGGCCCATTGTACAGG-1_1" ...
..@ graphs :List of 2
.. ..$ SCT_nn :Formal class 'Graph' [package "SeuratObject"] with 7 slots
.. ..$ SCT_snn:Formal class 'Graph' [package "SeuratObject"] with 7 slots
..@ neighbors : list()
..@ reductions :List of 4
.. ..$ pca :Formal class 'DimReduc' [package "SeuratObject"] with 9 slots
.. ..$ tsne :Formal class 'DimReduc' [package "SeuratObject"] with 9 slots
.. ..$ umap :Formal class 'DimReduc' [package "SeuratObject"] with 9 slots
.. ..$ umap.3d:Formal class 'DimReduc' [package "SeuratObject"] with 9 slots
..@ images : list()
..@ project.name: chr "SeuratProject"
..@ misc : list()
..@ version :Classes 'package_version', 'numeric_version' hidden list of 1
.. ..$ : int [1:3] 4 0 2
..@ commands :List of 6
.. ..$ SCTransform.RNA :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
.. ..$ RunPCA.SCT :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
.. ..$ FindNeighbors.SCT.pca:Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
.. ..$ FindClusters :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
.. ..$ RunTSNE :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
.. ..$ RunUMAP.SCT.pca :Formal class 'SeuratCommand' [package "SeuratObject"] with 5 slots
..@ tools :List of 1
.. ..$ Staffli:Formal class 'Staffli' [package "STutility"] with 12 slots

About loading 10X data

I wanted to load a filtered data from a spatial 10X experiment by:

se <- SpatialExperiment::read10xVisium(samples = "outs/filtered_feature_bc_matrix.h5", type = "sparse", data = "filtered")

However, I get en error:
Error in SpatialExperiment::read10xVisium(samples = "outs/filtered_feature_bc_matrix.h5", : No matching files found for 'images=c(“lowres”)

I tried to change the code to:
se <- SpatialExperiment::read10xVisium(samples = "outs/filtered_feature_bc_matrix.h5", type = "sparse", data = "filtered", images = c("lowres"), load = TRUE)
but didn't work. What would I do?

Thanks

STdeconvolve on multiple sections at the same time

Dear Team,

Thank you for your amazing package!

I wanted to ask you if it's possible to run STdeconvolve on multiple sections at the same time (I have done this by following the instructions from breast cancer sections. I have in total 16 similar sections obtained from 10x Visium that I put through. However, I'm trying to plot the topics on the ST data by the following code:

plt <- vizAllTopics(theta = deconProp,
pos = pos,
r = 45,
lwd = 0,
showLegend = TRUE,
plotTitle = NA) +
ggplot2::guides(fill=ggplot2::guide_legend(ncol=2)) +

outer border

ggplot2::geom_rect(data = data.frame(pos),
ggplot2::aes(xmin = min(x)-90, xmax = max(x)+90,
ymin = min(y)-90, ymax = max(y)+90),
fill = NA, color = "black", linetype = "solid", size = 0.5) +

ggplot2::theme(
plot.background = ggplot2::element_blank()
) +

remove the pixel "groups", which is the color aesthetic for the pixel borders

ggplot2::guides(colour = "none")

However, this results into 1 plot for all 16 samples. Is there a possibility to plot all 16 samples separately?

With kind regards,
Joy Otten

Visualization-deconvolved cell types

I attempted to visualize some of the results but when I apply the commands, I cannot see the deconvolved cell types properly or distinctly. Also, I should mention that:
1- I get no error. Just the visualized images do not seem to be proper ones.
2- I do follow this link for my analysis: https://github.com/JEFworks-Lab/STdeconvolve/blob/devel/docs/visium_10x.md
I hope I am on the right way.

Cell volumes for MPOA reproduction

Dear Dr. Miller (@bmill3r ),

Sorry to bother you, I am trying to run through the pseudo synthetic ST data (MPOA) analysis using STDeconvolve MPOA tutorial, but cannot find the cell volume data for animal FN7, datasets 171021_FN7_2_M22_M26 and 171023_FN7_1_M22_M26 in order to get discrete transcipt counts for each of the genes in order to continue with the LDA modelling. Could you perhaps share them?

Best regards,
paulmorio

vizAllTopics "groups" parameter not working correctly

Hi there, and thank you so much for developing such a great package! I have gone through the tutorial with my own data and everything is working well except for the groups parameter of the vizAllTopics() function. I have two different columns of meta.data that I would like to color code the scatterpies by: 1) cluster and 2) tissue region (like shown in this tutorial). However, when I attempt to color code my scatterpies by either of these meta.data values, the resulting plot is wrong and it looks like the annotations are being assigned to the completely wrong spots (this is especially obvious when looking at my tissue region annotations, which should be grouped together throughout the plot but are instead scattered like confetti). The resulting Topics that I get back look correct, and seem to be identifying known cell types in the correct spots. It's just the groups annotation that is not working. If this is a user issue, do you have any thoughts or advice on where I might need to tweak the code? I have followed everything according to the tutorials thus far.

jefworks-lab / stdeconvolve Goto Github PK

stdeconvolve's People

Contributors

Stargazers

Watchers

Forkers

stdeconvolve's Issues

outer border

remove the pixel "groups", which is the color aesthetic for the pixel borders

Read the data

Combine counts and positions for all samples

Assign slice identifiers

clean up poor spots and genes

merge the 3 sections together

Selecting ODGenes across 3 samples

filter spots in all_slice based on those kept in the corpus

fit LDA models to the data

Create a mapping between Ensembl IDs and gene symbols using the biomaRt package

Write a function to convert Ensembl IDs to gene symbols using the mapping

Convert the Ensembl IDs in the colnames(results$beta) into gene symbols

Check for any NA

no NA value

Top 10 marker genes for each cluster

Extract the theta values and position information for the first sample

highly expressed in cell-type of interest

high log2(fold-change) compared to other deconvolved cell-types

This is my SpatialExperiment object:

Using CleanCounts():

change column names to x and y

for this dataset, we will visualize barcodes using "pxl_col_in_fullres" = "y" coordinates, and "pxl_row_in_fullres" = "x" coordinates

Removing 21 genes present in 100% or more of pixels...

17429 genes remaining...

Removing 4472 genes present in 5% or less of pixels...

12957 genes remaining...

Restricting to overdispersed genes with alpha = 0.05...

Calculating variance fit ...

Using gam with k=5...

2348 overdispersed genes ...

Using top 1000 overdispersed genes

Hi, I'm wondering if there is a way for to integrate STdeconvolve with STUtility/Seurat? I have already performed filtering and clustering using these with a saved s4 object (example below)

outer border

remove the pixel "groups", which is the color aesthetic for the pixel borders

Recommend Projects

Recommend Topics

Recommend Org

Hi, I'm wondering if there is a way for to integrate STdeconvolve with STUtility/Seurat?
I have already performed filtering and clustering using these with a saved s4 object (example below)