five_masking_schemes's Introduction

Introduce-zeros-to-scRNA-seq-data

Five masking schemes that introduce zeros to scRNA-seq data

five_masking_schemes's People

Contributors

ruochenj

five_masking_schemes's Issues

Question about the five masking schemes

`# five ways of introducing zeros to a matrix:
set.seed(1234)

# Change this value to set a fraction zero_prop of the nonzero values to zero.

zero_prop = 0.5

# a toy example matrix
# (rnorm() draws 120 values here, but a 3 x 4 matrix only uses the first 12)

complete_mat <- matrix(rnorm(120,2,2), 3, 4)
dim(complete_mat)
complete_mat[complete_mat < 0.1] = 0

# random mask (all genes)

sce_ct_nzi <- complete_mat
zi_idx <- matrix(rbinom(dim(sce_ct_nzi)[1] * dim(sce_ct_nzi)[2], size = 1, prob = (1-zero_prop)), nrow = dim(sce_ct_nzi)[1], ncol = dim(sce_ct_nzi)[2])
sce_ct_zi <- sce_ct_nzi * zi_idx
sce_ct_zi1 <- sce_ct_zi
sum(sce_ct_zi1 == 0) / (dim(sce_ct_zi)[1] * dim(sce_ct_zi)[2])

# quantile mask (all genes)

# introduce zero by truncation

sce_ct_nzi2 <- complete_mat
sce_ct_zi2 <- sce_ct_nzi2
idx_nz1 <- which(sce_ct_nzi2 > 0)
cutoff <- quantile(sce_ct_nzi2[idx_nz1], zero_prop)
idx_zero2 <- idx_nz1[which(sce_ct_nzi2[idx_nz1] <= cutoff)]
# index via sample.int(): sample(k, n) on a single index k would draw from 1:k
idx_zero2 <- idx_zero2[sample.int(length(idx_zero2), floor(zero_prop * length(idx_nz1)))]
sce_ct_zi2[idx_zero2] <- 0
sum(sce_ct_zi2 == 0) / (dim(sce_ct_zi2)[1] * dim(sce_ct_zi2)[2])
sum(complete_mat == 0) / (dim(sce_ct_nzi)[1] * dim(sce_ct_nzi)[2])
zp_rec <- sum(sce_ct_zi2 == 0) / (dim(sce_ct_nzi)[1] * dim(sce_ct_zi)[2])

# random mask (gene specific)

sce_ct_nzi3 <- complete_mat
sce_ct_zi3 <- sce_ct_nzi3
ZC_avg <- apply(sce_ct_nzi3, 1, FUN = function(x){
  if(sum(x == 0) == length(x)){
    return(c(0,0))
  }else{
    return(c(sum(x > 0), mean(x[x>0])))
  }
})
# f(lambda): zero proportion after a random gene-specific mask, minus the target zp_rec
f <- function(lambda){
  sce_ct_zi3 <- apply(sce_ct_nzi3, 1, FUN = function(x){
    if(sum(x == 0) == length(x)){
      return(x)
    }else{
      nz_idx <- which(x > 0)
      x[nz_idx] <- x[nz_idx] * (1-rbinom(length(nz_idx), size = 1, prob = exp(-lambda * log(mean(x[nz_idx]) + 1.01)^2)))
      return(x)
    }
  })
  return(sum(sce_ct_zi3 == 0) / (dim(sce_ct_nzi3)[1] * dim(sce_ct_nzi3)[2]) - zp_rec)
}
lambda = uniroot(f, c(0,20))$root
sce_ct_zi3 <- t(apply(sce_ct_nzi3, 1, FUN = function(x){
  if(sum(x == 0) == length(x)){
    return(x)
  }else{
    nz_idx <- which(x > 0)
    x[nz_idx] <- x[nz_idx] * (1-rbinom(length(nz_idx), size = 1, prob = exp(-lambda * log(mean(x[nz_idx]) + 1.01)^2)))
    return(x)
  }
}))
sum(sce_ct_zi3 == 0) / (dim(sce_ct_nzi3)[1] * dim(sce_ct_nzi3)[2])

# quantile mask (same percentage)

sce_ct_nzi4 <- complete_mat
sum(sce_ct_nzi4 == 0) / (dim(sce_ct_nzi4)[1] * dim(sce_ct_nzi4)[2])
sce_ct_zi4 <- t(apply(sce_ct_nzi4, 1, FUN = function(x){
  idx_nz1 <- which(x > 0)
  cutoff <- quantile(x[idx_nz1], zero_prop)
  idx_zero2 <- idx_nz1[which(x[idx_nz1] <= cutoff)]
  # index via sample.int(): sample(k, n) on a single index k would draw from 1:k
  idx_zero2 <- idx_zero2[sample.int(length(idx_zero2), floor(zero_prop * length(idx_nz1)))]
  x[idx_zero2] <- 0
  return(x)
}))
sum(sce_ct_zi4 == 0) / (dim(sce_ct_zi4)[1] * dim(sce_ct_zi4)[2])

# quantile mask (gene specific)

sce_ct_nzi5 <- complete_mat
# g(lambda): zero proportion after a gene-specific quantile mask, minus the target zp_rec
g <- function(lambda){
  sce_ct_zi5 <- apply(sce_ct_nzi5, 1, FUN = function(x){
    if(sum(x == 0) == length(x)){
      return(x)
    }else{
      x[rank(x) <= (length(x) * exp(-lambda * log(mean(x[x > 0]))^2))] = 0
      return(x)
    }
  })
  return(sum(sce_ct_zi5 == 0) / (dim(sce_ct_nzi5)[1] * dim(sce_ct_nzi5)[2]) - zp_rec)
}
lambda = uniroot(g, c(0,200))$root
sce_ct_zi5 <- t(apply(sce_ct_nzi5, 1, FUN = function(x){
  if(sum(x == 0) == length(x)){
    return(x)
  }else{
    x[rank(x) <= (length(x) * exp(-lambda * log(mean(x[x > 0]))^2))] = 0
    return(x)
  }
}))
sum(sce_ct_zi5 == 0) / (dim(sce_ct_nzi5)[1] * dim(sce_ct_nzi5)[2])

sum(sce_ct_zi1 == 0) / (dim(sce_ct_nzi)[1] * dim(sce_ct_nzi)[2])
sum(sce_ct_zi2 == 0) / (dim(sce_ct_nzi)[1] * dim(sce_ct_nzi)[2])
sum(sce_ct_zi3 == 0) / (dim(sce_ct_nzi)[1] * dim(sce_ct_nzi)[2])
sum(sce_ct_zi4 == 0) / (dim(sce_ct_nzi)[1] * dim(sce_ct_nzi)[2])
sum(sce_ct_zi5 == 0) / (dim(sce_ct_nzi)[1] * dim(sce_ct_nzi)[2])

# Next steps

# Run imputation methods on the scRNA-seq data with increased zero values (sce_ct_zi1, sce_ct_zi2, sce_ct_zi3, sce_ct_zi4, sce_ct_zi5)

`
Hello! I would like to ask: why is the masking proportion I get from some of these five masking schemes not 0.5?
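
One likely reason, judging only from the code in this issue: `zero_prop` is applied to the nonzero entries, while the printed ratios count all zeros in the matrix, including those that were already there before masking. The random mask also fixes only the expected proportion, so on a 3 x 4 matrix a single run can land far from it. The gene-specific schemes tune `lambda` with `uniroot` to match `zp_rec` (the proportion produced by the global quantile mask) rather than `zero_prop` itself. The sketch below illustrates the first two points on a toy matrix like the one above. The names `expected_random`, `exact_quantile`, and `sim` are hypothetical and do not come from the repository, and the quantile figure ignores edge cases with very few nonzero entries.

```r
# Sketch under stated assumptions: a 3 x 4 toy matrix built the same way
# as in the issue, used to compare zero fractions before and after masking.
set.seed(1234)
zero_prop <- 0.5

complete_mat <- matrix(rnorm(12, 2, 2), 3, 4)
complete_mat[complete_mat < 0.1] <- 0

n_total <- length(complete_mat)
n_zero  <- sum(complete_mat == 0)
n_nz    <- n_total - n_zero
z0      <- n_zero / n_total   # zero fraction before any masking

# Random mask (all genes): each entry is kept with probability 1 - zero_prop,
# so the resulting zero fraction is random; only its expectation is fixed.
expected_random <- z0 + (1 - z0) * zero_prop

# Quantile mask (all genes): floor(zero_prop * n_nz) nonzero entries are
# zeroed, in addition to the zeros that were already present.
exact_quantile <- (n_zero + floor(zero_prop * n_nz)) / n_total

# Repeat the random mask many times to see how far single runs can stray.
sim <- replicate(2000, {
  keep <- matrix(rbinom(n_total, size = 1, prob = 1 - zero_prop), 3, 4)
  mean(complete_mat * keep == 0)
})

c(before = z0,
  random_expected = expected_random,
  random_simulated = mean(sim),
  quantile_exact = exact_quantile)
range(sim)   # spread of the zero fraction across single runs
```

With only 12 entries, every masked value moves the ratio by 1/12, so the per-row schemes and the `uniroot`-calibrated schemes can only approximate their target. A larger matrix should bring all five ratios closer together, although they would still converge on `zp_rec` rather than on 0.5 whenever the unmasked matrix already contains zeros.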
