
smile's Introduction

SMILE

SMILE: Mutual Information Learning for Integration of Single Cell Omics Data

Citation

Xu et al. "SMILE: mutual information learning for integration of single-cell omics data". Bioinformatics

Requirements

  • numpy
  • scipy
  • pandas
  • scikit-learn
  • scanpy
  • pytorch

Updates

Update 06/30/2022 (Identifying shared signatures across modalities)

To identify shared signatures across modalities via SMILE, please see tutorial

|----SMILE_identify_shared_signatures_across_modalities.ipynb

Update 05/09/2022 (Using joint-profiling data as reference for integration)

##rna_X: RNA-seq data; dna_X: ATAC-seq data, rna_X and dna_X are paired data
##rna_X_unpaired: RNA-seq data; dna_X_unpaired: ATAC-seq data, rna_X_unpaired and dna_X_unpaired are unpaired data, and we wish to integrate unpaired data
##Both rna_X and dna_X are matrices in which each row represents one cell while each column stands for a feature
##each row in rna_X and dna_X is paired for training purpose
##Within modality, for example rna_X and rna_X_unpaired, data share the same feature space

## Processed data can be found at https://doi.org/10.5281/zenodo.7776066

from SMILE import littleSMILE
from SMILE import ReferenceSMILE_trainer
integrater = littleSMILE(input_dim_a=rna_X.shape[1],input_dim_b=dna_X.shape[1],clf_out=20)
ReferenceSMILE_trainer(rna_X,dna_X,rna_X_unpaired,dna_X_unpaired, integrater, train_epoch=1000)
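The shape requirements stated in the comments above can be checked before training. The following is a minimal sketch using NumPy; `check_reference_inputs` is a hypothetical helper that is not part of SMILE, and the random matrices only stand in for real data.

```python
import numpy as np

def check_reference_inputs(rna_X, dna_X, rna_X_unpaired, dna_X_unpaired):
    """Hypothetical helper (not part of SMILE): validate input shapes."""
    # Paired data: row i of rna_X and row i of dna_X describe the same cell.
    if rna_X.shape[0] != dna_X.shape[0]:
        raise ValueError("rna_X and dna_X must have the same number of cells")
    # Within a modality, paired and unpaired data share the same feature space.
    if rna_X.shape[1] != rna_X_unpaired.shape[1]:
        raise ValueError("rna_X and rna_X_unpaired must have the same features")
    if dna_X.shape[1] != dna_X_unpaired.shape[1]:
        raise ValueError("dna_X and dna_X_unpaired must have the same features")
    # These two numbers are what input_dim_a and input_dim_b are set to above.
    return rna_X.shape[1], dna_X.shape[1]

# Placeholder matrices (cells x features); real data would be loaded instead.
rng = np.random.default_rng(0)
rna_X = rng.random((100, 50))
dna_X = rng.random((100, 80))
rna_X_unpaired = rng.random((30, 50))
dna_X_unpaired = rng.random((40, 80))

input_dim_a, input_dim_b = check_reference_inputs(
    rna_X, dna_X, rna_X_unpaired, dna_X_unpaired
)
```

The unpaired matrices may contain different numbers of cells; only the feature dimension within each modality has to match.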

Integrate RNA-seq/ATAC-seq from Granja et al 2019 with 10X multiome PBMC data as reference

|----SMILE_data_integration_withReference.ipynb

Tutorial

For quick start

##rna_X: RNA-seq data; dna_X: ATAC-seq data
##Both rna_X and dna_X are matrices in which each row represents one cell while each column stands for a feature
##each row in rna_X and dna_X is paired for training purpose
##rna_X_unpaired and dna_X_unpaired: unpaired RNA-seq and ATAC-seq data to integrate, in the same feature spaces as rna_X and dna_X

from SMILE import littleSMILE
from SMILE import ReferenceSMILE_trainer
net = littleSMILE(input_dim_a=rna_X.shape[1],input_dim_b=dna_X.shape[1],clf_out=30)
ReferenceSMILE_trainer(rna_X,dna_X,rna_X_unpaired,dna_X_unpaired,net,batch_size=1024, f_temp = 0.2)

For details

For use of SMILE in multi-source single-cell transcriptome data

|----SMILE_MouserCortex_RNAseq_example.ipynb

For use of SMILE in integration of multimodal single-cell data

|----SMILE_Celllines_RNA-ATAC-integration_example.ipynb

For screening key genes that contribute to co-embedding

|----screen_factors_for_coembedding.py
Processed data can be found at https://figshare.com/articles/dataset/Mouse_skin_data_by_SHARE-seq/16620367

smile's People

Contributors

rpmccordlab, imxman


smile's Issues

How to set hyperparameters to reproduce the results?

Thanks for sharing the work!

I am having trouble reproducing Fig. 2C in your paper. Here are my reproduction steps:

  1. the same preprocessing steps you proposed:
    use scanpy to finish LogNormalize(scale_factor=1e4), select hvgs by batch (n_hvgs=2000)
  2. the default model and training parameters:
    clf_out = 25
    learning_rate=1e-2,
    batch_size = 512,
    num_epoch=5,
    f_temp = 0.1
    p_temp = 0.15

I tried increasing the number of training epochs, lowering the learning rate, and changing the temperature parameters, but none of this helped, as the picture shows.
[Image: SMILE_d5]

So, could you share the settings you used to produce Fig. 2C?
thanks
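For reference, the "LogNormalize(scale_factor=1e4)" step named in this issue is usually understood as scaling each cell to 10,000 total counts and then applying log(1 + x). The sketch below shows that arithmetic in plain NumPy. It is an assumption about what the issue author ran, not the SMILE authors' pipeline, and `log_normalize` is a hypothetical name. In scanpy the same step is typically written as `sc.pp.normalize_total(adata, target_sum=1e4)` followed by `sc.pp.log1p(adata)`, and per-batch gene selection as `sc.pp.highly_variable_genes` with `n_top_genes=2000` and a `batch_key`.

```python
import numpy as np

def log_normalize(counts, scale_factor=1e4):
    """Scale each cell (row) to `scale_factor` total counts, then apply log(1 + x)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave all-zero cells as zeros instead of dividing by 0
    return np.log1p(counts / totals * scale_factor)

# Toy count matrix: 3 cells x 3 genes.
counts = np.array([[1, 3, 0],
                   [0, 0, 5],
                   [2, 2, 2]])
normalized = log_normalize(counts)
```

Undoing the log with `np.expm1(normalized)` gives rows that each sum to the scale factor, which is a quick way to confirm the normalization was applied per cell.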

single point for each cell in the UMAP

Hi, I'm using SMILE for integrating different single-cell omics and I find it very useful. One thing that biologists are asking me for is a UMAP in which each cell is a single point representing all of its omics together. Do you think this makes sense? If so, do you have any suggestions about how to proceed?
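If the cells are jointly profiled, so that every cell has an embedding from each modality, one possible way to get a single point per cell is to combine the per-modality embeddings before running UMAP. The sketch below averages L2-normalized embeddings. It is only one option under that assumption, not a method confirmed by the SMILE authors, and `joint_embedding`, `z_rna`, and `z_dna` are hypothetical names.

```python
import numpy as np

def joint_embedding(z_rna, z_dna):
    """Average L2-normalized per-modality embeddings of paired cells (one row per cell)."""
    z_rna = np.asarray(z_rna, dtype=float)
    z_dna = np.asarray(z_dna, dtype=float)
    if z_rna.shape != z_dna.shape:
        raise ValueError("paired embeddings must have identical shapes")

    def l2_normalize(z):
        norms = np.linalg.norm(z, axis=1, keepdims=True)
        norms[norms == 0] = 1.0  # keep all-zero rows unchanged
        return z / norms

    # Row i of the result is the single joint representation of cell i.
    return (l2_normalize(z_rna) + l2_normalize(z_dna)) / 2.0

# Toy embeddings for two cells in a 2-dimensional latent space.
z_rna = np.array([[3.0, 4.0], [1.0, 0.0]])
z_dna = np.array([[0.0, 2.0], [1.0, 0.0]])
z_joint = joint_embedding(z_rna, z_dna)
```

The resulting matrix has one row per cell and can be passed to UMAP in place of the separate per-modality embeddings. For unpaired cells there is no second embedding to combine, so this approach does not apply to them.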

Problem with GPU

Hi, I'm trying your software, but I get the error below. I don't have a GPU and no NVIDIA driver is installed.
I'm using conda and I installed all the required packages. Can you help me solve this issue?

PairedSMILE_trainer(X_a = rna_X, X_b = dna_X, model = net, num_epoch=20)##training for 20 epochs
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/imerelli/scvar/SMILE.py", line 124, in PairedSMILE_trainer
    f_con = ContrastiveLoss(batch_size = batch_size,temperature = f_temp)
  File "/home/imerelli/scvar/contrastive_loss_pytorch.py", line 8, in __init__
    self.register_buffer("temperature", torch.tensor(temperature).cuda())
  File "/DATA_NFS/anaconda3/envs/integration/lib/python3.8/site-packages/torch/cuda/__init__.py", line 196, in _lazy_init
    _check_driver()
  File "/DATA_NFS/anaconda3/envs/integration/lib/python3.8/site-packages/torch/cuda/__init__.py", line 98, in _check_driver
    raise AssertionError("""
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
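The traceback points at an unconditional `.cuda()` call, which fails on machines without an NVIDIA driver. A common PyTorch pattern is to choose the device at runtime and fall back to the CPU. The sketch below illustrates that pattern; `TemperatureHolder` is a hypothetical stand-in for the module in the traceback and is not SMILE's actual code.

```python
import torch

# Use the GPU only when a CUDA driver and device are actually available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class TemperatureHolder(torch.nn.Module):
    """Hypothetical stand-in for the module in the traceback (not SMILE's code)."""

    def __init__(self, temperature=0.1):
        super().__init__()
        # No .cuda() here: the buffer is created on the CPU and follows the
        # module when it is moved with .to(device).
        self.register_buffer("temperature", torch.tensor(temperature))

holder = TemperatureHolder(temperature=0.1).to(device)
```

Because buffers registered with `register_buffer` move together with the module, the same code runs on both CPU-only and GPU machines.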
