
dynbenchmark's Introduction


Benchmarking trajectory inference methods

This repository contains the scripts to reproduce the manuscript:

A comparison of single-cell trajectory inference methods
Wouter Saelens*, Robrecht Cannoodt*, Helena Todorov, Yvan Saeys
doi:10.1038/s41587-019-0071-9

Dynverse

Under the hood, dynbenchmark makes use of most dynverse packages for running the methods, comparing them to a gold standard, and plotting the output. Check out dynverse.org for an overview!

Experiments

From start to finish, the repository is divided into several experiments, each with its own scripts and results. These are documented with GitHub READMEs and can thus be easily explored by going to the appropriate folders:

 #   id                        scripts  results
 1   Datasets                  📄➡      📊➡
 2   Metrics                   📄➡      📊➡
 3   Methods                   📄➡      📊➡
 4   Method testing            📄➡      📊➡
 5   Scaling                   📄➡      📊➡
 6   Benchmark                 📄➡      📊➡
 7   Stability                 📄➡      📊➡
 8   Summary                   📄➡      📊➡
 9   Guidelines                📄➡      📊➡
10   Benchmark interpretation  📄➡      📊➡
11   Example predictions       📄➡      📊➡
12   Manuscript                📄➡      📊➡
 –   Varia                     📄➡

We also have several additional subfolders:

  • Manuscript: Source files for producing the manuscript.
  • Package: An R package with several helper functions for organizing the benchmark and rendering the manuscript.
  • Raw: Files generated by hand, such as figures and spreadsheets.
  • Derived: Intermediate data files produced by the scripts. These files are not committed to git.

Guidelines

Based on the results of the benchmark, we provide context-dependent user guidelines, available as a Shiny app. This app is integrated within the dyno pipeline, which also includes the wrappers used in the benchmark and other packages for visualising and interpreting the results.
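A minimal sketch of launching the app from R (assuming the dyno and dynguidelines packages are installed):

# Launch the interactive guidelines app; answering the questionnaire
# produces a ranked shortlist of suitable TI methods.
dynguidelines::guidelines_shiny()

# If you already have a dynwrap dataset object, some answers can be
# prefilled from it (hedged usage; check ?guidelines_shiny):
# dynguidelines::guidelines_shiny(dataset = dataset)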


Datasets

The benchmarking pipeline generates (and uses) the following datasets:

  • Gold standard single-cell datasets, both real and synthetic, used to evaluate the trajectory inference methods (DOI)


  • The performance of the methods, as used in the results overview figure and the dynguidelines app.

  • General information about trajectory inference methods, available as a data frame in dynmethods::methods
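For example, this metadata can be inspected directly from R (a minimal sketch; the exact set of columns may vary between dynmethods versions):

library(dynmethods)

# One row per wrapped TI method, with metadata such as identifiers and
# platform information.
methods <- dynmethods::methods
nrow(methods)      # number of wrapped methods
colnames(methods)  # which metadata fields are available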

Methods

All methods are wrapped as both Docker and Singularity containers, which can easily be run using dynmethods.
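For instance, a single method can be run through dyno (a minimal sketch, assuming `dataset` is a dynwrap dataset object and a Docker daemon is running; the choice of ti_slingshot() is purely illustrative):

library(dyno)  # loads dynwrap, dynmethods, dynplot, ...

# Run one wrapped method; the corresponding container is pulled
# automatically on first use.
model <- infer_trajectory(dataset, method = ti_slingshot())

# Visualise the inferred trajectory on a dimensionality reduction.
plot_dimred(model)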

Installation

dynbenchmark has been tested using R version 3.5.1 on Linux. While running the methods also works on Windows and macOS (see dyno), running the benchmark itself is currently not supported on these operating systems, as many of the commands are Linux-specific.

In R, you can install the dependencies of dynbenchmark from GitHub using:

# install.packages("devtools")
devtools::install_github("dynverse/dynbenchmark/package")

This will install several other “dynverse” packages. Depending on the number of R packages already installed, this installation should take approximately 5 to 30 minutes.

On Linux, you will need to install udunits and ImageMagick:

  • Debian / Ubuntu / Linux Mint: sudo apt-get install libudunits2-dev imagemagick
  • Fedora / CentOS / RHEL: sudo dnf install udunits2-devel ImageMagick-c++-devel

Docker or Singularity (version ≥ 3.0) has to be installed to run TI methods. We suggest Docker on Windows and macOS; on Linux, both Docker and Singularity work fine. Singularity is strongly recommended when running the methods on shared computing clusters.

On Windows 10 you can install Docker CE; older Windows installations require the Docker Toolbox.

You can test whether Docker is correctly installed by running:

dynwrap::test_docker_installation(detailed = TRUE)
## ✔ Docker is installed
## ✔ Docker daemon is running
## ✔ Docker is at correct version (>1.0): 1.39
## ✔ Docker is in linux mode
## ✔ Docker can pull images
## ✔ Docker can run image
## ✔ Docker can mount temporary volumes
## ✔ Docker test successful -----------------------------------------------------------------
## [1] TRUE

The same applies to Singularity:

dynwrap::test_singularity_installation(detailed = TRUE)
## ✔ Singularity is installed
## ✔ Singularity is at correct version (>=3.0): v3.0.0-13-g0273e90f is installed
## ✔ Singularity can pull and run a container from Dockerhub
## ✔ Singularity can mount temporary volumes
## ✔ Singularity test successful ------------------------------------------------------------
## [1] TRUE

These commands will give helpful tips if some parts of the installation are missing.

dynbenchmark's People

Contributors: helena-todd, rcannood, zouter


dynbenchmark's Issues

Failed Installation - Error : object ‘get_platform_from_counts’ is not exported by 'namespace:dyngen'

I got the following error when trying to install dynbenchmark:

> devtools::install_github("dynverse/dynbenchmark/package")
Downloading GitHub repo dynverse/dynbenchmark@master
✔  checking for file ‘/tmp/RtmpOkmvRG/remotes29e382fcc140/dynverse-dynbenchmark-74fe868/package/DESCRIPTION’ ...
─  preparing ‘dynbenchmark’:
✔  checking DESCRIPTION meta-information ...
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  looking to see if a ‘data/datalist’ file should be added
─  building ‘dynbenchmark_0.0.0.9000.tar.gz’ (766ms)
   
Installing package into ‘/home/pmonteagudo/R/x86_64-redhat-linux-gnu-library/3.5’
(as ‘lib’ is unspecified)
* installing *source* package ‘dynbenchmark’ ...
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
Warning: replacing previous import ‘dyneval::calculate_harmonic_mean’ by ‘dynutils::calculate_harmonic_mean’ when loading ‘dynbenchmark’
Warning: replacing previous import ‘dyneval::calculate_geometric_mean’ by ‘dynutils::calculate_geometric_mean’ when loading ‘dynbenchmark’
Warning: replacing previous import ‘dyneval::calculate_arithmetic_mean’ by ‘dynutils::calculate_arithmetic_mean’ when loading ‘dynbenchmark’
Error : object ‘get_platform_from_counts’ is not exported by 'namespace:dyngen'
ERROR: lazy loading failed for package ‘dynbenchmark’
* removing ‘/home/pmonteagudo/R/x86_64-redhat-linux-gnu-library/3.5/dynbenchmark’
Error in i.p(...) : 
  (converted from warning) installation of package ‘/tmp/RtmpOkmvRG/file29e3879a6627c/dynbenchmark_0.0.0.9000.tar.gz’ had non-zero exit status

Any thoughts on that? Thanks in advance

Synthetic dataset code

Hi,
It would be very helpful if you could share the code you used to create the synthetic datasets.
Specifically, I am interested in creating a dataset with a "gold standard" trajectory.
From my understanding, you used the Splatter framework in combination with a trajectory backbone.
I can't understand how you were able to specify all of the "from" and "to" nodes using Splatter's path simulation.
Any help / code would be appreciated!

How to use "evaluate_ti_method"?

evaluate_ti_method(dataset, method, parameters, metrics,
  give_priors = NULL, output_model = TRUE,
  seed = function() random_seed(), map_fun = map, verbose = FALSE)

It is hard to understand how to use this function from the signature alone.
Can you provide more documentation or an example?
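For reference, a call might look roughly like this (not an authoritative example; the toy dataset via dyntoy and the chosen metric ids are assumptions on my part):

library(dyneval)
library(dynmethods)
library(dyntoy)

# Generate a small synthetic dataset with a known gold-standard trajectory.
dataset <- dyntoy::generate_dataset(model = "linear")

# Evaluate a single TI method against that gold standard.
results <- dyneval::evaluate_ti_method(
  dataset = dataset,
  method = dynmethods::ti_comp1(),   # a simple, fast baseline method
  parameters = NULL,
  metrics = c("correlation", "him")  # metric ids used by dyneval
)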

Error: `id` must evaluate to column positions or names, not a function

Hi, thanks again for a great job!

I ran into the following problem; I'm not sure if I'm doing something wrong or if it is an actual error. I also have a couple of questions regarding the overall structure of the benchmarking repository and how to execute it.

When running dynbenchmark/scripts/06-benchmark/1-submit_jobs.R, at line 23:

 methods <-
    dynwrap::get_ti_methods(method_ids, evaluate = FALSE) %>%
    mapdf(function(m) {
      l <- m$fun()
      l$fun <- m$fun
      l$type <- "function"
      l
    }) %>%
    list_as_tibble() %>%
    select(id, type, fun, everything())

As I see it, after modifying the tibble obtained from dynwrap::get_ti_methods() with mapdf(), there is no 'id' column. The subsequent select() therefore raises the following error:

Error: `id` must evaluate to column positions or names, not a function

This could easily be fixed by keeping the 'id' info in the mapdf() call: l$id <- m$id

  methods <-
    dynwrap::get_ti_methods(method_ids, evaluate = FALSE) %>%
    mapdf(function(m) {
      l <- m$fun()
      l$fun <- m$fun
      l$type <- "function"
      l$id <- m$id
      l
    }) %>%
    list_as_tibble() %>%
    select(id, type, fun, everything())

Is this an actual error with a proper fix, or am I missing something?

  • In general, what would be the proper way to recreate your results?
    By sequentially executing your scripts, results are stored in the 'derived' folder, but it seems that some scripts point to the (initially empty) 'results' folder instead of 'derived'. I solved this by downloading the dynbenchmark_results repository into the 'results' folder. Still, I'm a little confused about how the scripts and the 'derived' and 'results' folders interact. Could you please say a few words about that?

Thanks again!!

Definitions of trajectory types

Hi,
Can you please help clarify the key difference between the tree trajectory type and the multifurcation? I was looking at supplementary fig. S2(a) but am not quite sure how you define bifurcation vs. multifurcation vs. tree.
So, for a multifurcation, can any node split into any number of branches (0 or more)?
And for a tree, into how many branches can a node split?

Thanks a lot

Will you keep updating on new methods?

Thanks for the great work!
There are more single-cell trajectory inference methods published recently, and surely more will follow. Will you keep adding new methods to your evaluation? Thanks!

Error in devtools::install_github("dynverse/dynbenchmark/package")

> devtools::install_github("dynverse/dynbenchmark/package")

Using github PAT from envvar GITHUB_PAT
Downloading GitHub repo dynverse/dynbenchmark@master
'/usr/bin/git' clone --depth 1 --no-hardlinks --recurse-submodules [email protected]:dynverse/dynbenchmark_results.git /tmp/RtmpbRc9jx/remotes2ce925b8666e8/dynverse-dynbenchmark-e16b237/package/../results
Warning: Permanently added the RSA host key for IP address '140.82.118.4' to the list of known hosts.
Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
Error: Failed to install 'dynbenchmark' from GitHub:
  Command failed (128)
In addition: Warning message:
In system(full, intern = TRUE, ignore.stderr = quiet) :
  running command ''/usr/bin/git' clone --depth 1 --no-hardlinks --recurse-submodules [email protected]:dynverse/dynbenchmark_results.git /tmp/RtmpbRc9jx/remotes2ce925b8666e8/dynverse-dynbenchmark-e16b237/package/../results' had status 128

Support for SLURM or Linux Server

I do not have much experience with cluster computing, but I have access to a SLURM cluster.
Is there any straightforward way to execute your pipeline there?

If that is not possible, I would still like to run a comprehensive analysis of all the methods and datasets. Is there any way to easily handle all dataset/method dependencies, re-using as much of your code as possible, to execute it on a regular Linux server?
