
dynbenchmark's Introduction


Benchmarking trajectory inference methods

This repository contains the scripts to reproduce the manuscript:

A comparison of single-cell trajectory inference methods
Wouter Saelens*, Robrecht Cannoodt*, Helena Todorov, Yvan Saeys
doi:10.1038/s41587-019-0071-9

Dynverse

Under the hood, dynbenchmark makes use of most dynverse packages for running the methods, comparing them to a gold standard, and plotting the output. Check out dynverse.org for an overview!

Experiments

From start to finish, the repository is divided into several experiments, each with its own scripts and results. These are documented with GitHub READMEs and can thus be easily explored by going to the appropriate folders:

 #   id                        scripts  results
 1   Datasets                  📄➡      📊➡
 2   Metrics                   📄➡      📊➡
 3   Methods                   📄➡      📊➡
 4   Method testing            📄➡      📊➡
 5   Scaling                   📄➡      📊➡
 6   Benchmark                 📄➡      📊➡
 7   Stability                 📄➡      📊➡
 8   Summary                   📄➡      📊➡
 9   Guidelines                📄➡      📊➡
10   Benchmark interpretation  📄➡      📊➡
11   Example predictions       📄➡      📊➡
12   Manuscript                📄➡      📊➡
 –   Varia                     📄➡

We also have several additional subfolders:

  • Manuscript: Source files for producing the manuscript.
  • Package: An R package with several helper functions for organizing the benchmark and rendering the manuscript.
  • Raw: Files generated by hand, such as figures and spreadsheets.
  • Derived: Intermediate data files produced by the scripts. These files are not committed to git.

Guidelines

Based on the results of the benchmark, we provide context-dependent user guidelines, available as a Shiny app. This app is integrated within the dyno pipeline, which also includes the wrappers used in the benchmark and other packages for visualising and interpreting the results.
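A minimal sketch of launching the app from R (assuming the dyno and dynguidelines packages are installed):

# Launch the interactive guidelines app; answering the questionnaire
# produces a ranked shortlist of suitable TI methods.
dynguidelines::guidelines_shiny()

# If you already have a dynwrap dataset object, some answers can be
# prefilled from it (hedged usage; check ?guidelines_shiny):
# dynguidelines::guidelines_shiny(dataset = dataset)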


Datasets

The benchmarking pipeline generates (and uses) the following datasets:

  • Gold standard single-cell datasets, both real and synthetic, used to evaluate the trajectory inference methods (DOI)


  • The performance of the methods, as used in the results overview figure and the dynguidelines app.

  • General information about trajectory inference methods, available as a data frame in dynmethods::methods
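For example, this metadata can be inspected directly from R (a minimal sketch; the exact set of columns may vary between dynmethods versions):

library(dynmethods)

# One row per wrapped TI method, with metadata such as identifiers and
# platform information.
methods <- dynmethods::methods
nrow(methods)      # number of wrapped methods
colnames(methods)  # which metadata fields are available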

Methods

All methods are wrapped as both Docker and Singularity containers, which can easily be run using dynmethods.
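For instance, a single method can be run through dyno (a minimal sketch, assuming `dataset` is a dynwrap dataset object and a Docker daemon is running; the choice of ti_slingshot() is purely illustrative):

library(dyno)  # loads dynwrap, dynmethods, dynplot, ...

# Run one wrapped method; the corresponding container is pulled
# automatically on first use.
model <- infer_trajectory(dataset, method = ti_slingshot())

# Visualise the inferred trajectory on a dimensionality reduction.
plot_dimred(model)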

Installation

dynbenchmark has been tested using R version 3.5.1 on Linux. While running the methods also works on Windows and macOS (see dyno), running the benchmark itself is currently not supported on these operating systems, as many of the commands are Linux-specific.

In R, you can install the dependencies of dynbenchmark from GitHub using:

# install.packages("devtools")
devtools::install_github("dynverse/dynbenchmark/package")

This will install several other “dynverse” packages. Depending on the number of R packages already installed, this installation should take approximately 5 to 30 minutes.

On Linux, you will need to install udunits and ImageMagick:

  • Debian / Ubuntu / Linux Mint: sudo apt-get install libudunits2-dev imagemagick
  • Fedora / CentOS / RHEL: sudo dnf install udunits2-devel ImageMagick-c++-devel

Docker or Singularity (version ≥ 3.0) has to be installed to run TI methods. We suggest Docker on Windows and macOS; on Linux, both Docker and Singularity work fine. Singularity is strongly recommended when running the methods on shared computing clusters.

On Windows 10 you can install Docker CE; older Windows installations require the Docker Toolbox.

You can test whether Docker is correctly installed by running:

dynwrap::test_docker_installation(detailed = TRUE)
## ✔ Docker is installed
## ✔ Docker daemon is running
## ✔ Docker is at correct version (>1.0): 1.39
## ✔ Docker is in linux mode
## ✔ Docker can pull images
## ✔ Docker can run image
## ✔ Docker can mount temporary volumes
## ✔ Docker test successful -----------------------------------------------------------------
## [1] TRUE

The same applies to Singularity:

dynwrap::test_singularity_installation(detailed = TRUE)
## ✔ Singularity is installed
## ✔ Singularity is at correct version (>=3.0): v3.0.0-13-g0273e90f is installed
## ✔ Singularity can pull and run a container from Dockerhub
## ✔ Singularity can mount temporary volumes
## ✔ Singularity test successful ------------------------------------------------------------
## [1] TRUE

These commands will give helpful tips if some parts of the installation are missing.

dynbenchmark's People

Contributors: helena-todd, rcannood, zouter


dynbenchmark's Issues

Failed Installation - Error : object ‘get_platform_from_counts’ is not exported by 'namespace:dyngen'

I got the following error when trying to install dynbenchmark:

> devtools::install_github("dynverse/dynbenchmark/package")
Downloading GitHub repo dynverse/dynbenchmark@master
✔  checking for file ‘/tmp/RtmpOkmvRG/remotes29e382fcc140/dynverse-dynbenchmark-74fe868/package/DESCRIPTION’ ...
─  preparing ‘dynbenchmark’:
✔  checking DESCRIPTION meta-information ...
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  looking to see if a ‘data/datalist’ file should be added
─  building ‘dynbenchmark_0.0.0.9000.tar.gz’ (766ms)
   
Installing package into ‘/home/pmonteagudo/R/x86_64-redhat-linux-gnu-library/3.5’
(as ‘lib’ is unspecified)
* installing *source* package ‘dynbenchmark’ ...
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
Warning: replacing previous import ‘dyneval::calculate_harmonic_mean’ by ‘dynutils::calculate_harmonic_mean’ when loading ‘dynbenchmark’
Warning: replacing previous import ‘dyneval::calculate_geometric_mean’ by ‘dynutils::calculate_geometric_mean’ when loading ‘dynbenchmark’
Warning: replacing previous import ‘dyneval::calculate_arithmetic_mean’ by ‘dynutils::calculate_arithmetic_mean’ when loading ‘dynbenchmark’
Error : object ‘get_platform_from_counts’ is not exported by 'namespace:dyngen'
ERROR: lazy loading failed for package ‘dynbenchmark’
* removing ‘/home/pmonteagudo/R/x86_64-redhat-linux-gnu-library/3.5/dynbenchmark’
Error in i.p(...) : 
  (converted from warning) installation of package ‘/tmp/RtmpOkmvRG/file29e3879a6627c/dynbenchmark_0.0.0.9000.tar.gz’ had non-zero exit status

Any thoughts on that? Thanks in advance

Synthetic dataset code

Hi,
It would be very helpful if you could share the code you used to create the synthetic datasets.
Specifically, I am interested in creating a dataset with a "gold standard" trajectory.
From my understanding, you used the Splatter framework in combination with a trajectory backbone.
I can't understand how you were able to specify all of the "from" and "to" nodes using Splatter's path simulation.
Any help / code would be appreciated!

How to use "evaluate_ti_method"?

evaluate_ti_method(dataset, method, parameters, metrics,
  give_priors = NULL, output_model = TRUE,
  seed = function() random_seed(), map_fun = map, verbose = FALSE)

It is hard to understand how to use this function from the signature alone.
Can you provide more documentation or an example?
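For reference, a call might look roughly like this (not an authoritative example; the toy dataset via dyntoy and the chosen metric ids are assumptions on my part):

library(dyneval)
library(dynmethods)
library(dyntoy)

# Generate a small synthetic dataset with a known gold-standard trajectory.
dataset <- dyntoy::generate_dataset(model = "linear")

# Evaluate a single TI method against that gold standard.
results <- dyneval::evaluate_ti_method(
  dataset = dataset,
  method = dynmethods::ti_comp1(),   # a simple, fast baseline method
  parameters = NULL,
  metrics = c("correlation", "him")  # metric ids used by dyneval
)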

Error: `id` must evaluate to column positions or names, not a function

Hi, thanks again for a great job!

I ran into the following problem; I'm not sure if I'm doing something wrong or if it is an actual error. I also have a couple of questions regarding the overall structure of the benchmarking repository and how to execute it.

When running dynbenchmark/scripts/06-benchmark/1-submit_jobs.R, at line 23:

 methods <-
    dynwrap::get_ti_methods(method_ids, evaluate = FALSE) %>%
    mapdf(function(m) {
      l <- m$fun()
      l$fun <- m$fun
      l$type <- "function"
      l
    }) %>%
    list_as_tibble() %>%
    select(id, type, fun, everything())

As I see it, after modifying the tibble obtained from dynwrap::get_ti_methods() with mapdf(), there is no 'id' column. The subsequent select() therefore raises the following error:

Error: `id` must evaluate to column positions or names, not a function

This could easily be fixed by keeping the 'id' info in the mapdf() call: l$id <- m$id

  methods <-
    dynwrap::get_ti_methods(method_ids, evaluate = FALSE) %>%
    mapdf(function(m) {
      l <- m$fun()
      l$fun <- m$fun
      l$type <- "function"
      l$id <- m$id
      l
    }) %>%
    list_as_tibble() %>%
    select(id, type, fun, everything())

Is this an actual error with a proper fix, or am I missing something?

  • In general, what would be the proper way to recreate your results?
    By sequentially executing your scripts, results are stored in the 'derived' folder, but it seems that some scripts point to the (initially empty) 'results' folder instead of 'derived'. I solved this by downloading the dynbenchmark_results repository into the 'results' folder. Still, I'm a little confused about how the scripts and the 'derived' and 'results' folders interact. Could you please say a few words about that?

Thanks again!!

Definitions of trajectory types

Hi,
Can you please help clarify the key difference between the tree trajectory type and the multifurcation? I was looking at supplementary fig. S2(a) but am not quite sure how you define bifurcation vs. multifurcation vs. tree.
So, for a multifurcation, can any node split into any number of branches (0 or more)?
And for a tree, into how many branches can a node split?

Thanks a lot

Will you keep updating on new methods?

Thanks for the great work!
There are more single-cell trajectory inference methods published recently, and surely more will follow. Will you keep adding new methods to your evaluation? Thanks!

Error in devtools::install_github("dynverse/dynbenchmark/package")

> devtools::install_github("dynverse/dynbenchmark/package")

Using github PAT from envvar GITHUB_PAT
Downloading GitHub repo dynverse/dynbenchmark@master
'/usr/bin/git' clone --depth 1 --no-hardlinks --recurse-submodules [email protected]:dynverse/dynbenchmark_results.git /tmp/RtmpbRc9jx/remotes2ce925b8666e8/dynverse-dynbenchmark-e16b237/package/../results
Warning: Permanently added the RSA host key for IP address '140.82.118.4' to the list of known hosts.
Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
Error: Failed to install 'dynbenchmark' from GitHub:
  Command failed (128)
In addition: Warning message:
In system(full, intern = TRUE, ignore.stderr = quiet) :
  running command ''/usr/bin/git' clone --depth 1 --no-hardlinks --recurse-submodules [email protected]:dynverse/dynbenchmark_results.git /tmp/RtmpbRc9jx/remotes2ce925b8666e8/dynverse-dynbenchmark-e16b237/package/../results' had status 128

Support for SLURM or Linux Server

I do not have much experience with cluster computing, but I have access to a SLURM cluster.
Is there any straightforward way to execute your pipeline there?

If that is not possible, I would still like to run a comprehensive analysis of all the methods and datasets. Is there any way to easily handle all dataset/method dependencies, re-using as much of your code as possible, to execute it on a regular Linux server?
