Syntumorsizer

Syntumorsizer is a Snakemake workflow for generating synthetic tumour sequencing data. It implements the procedure described in Cooke et al., which haplotypes and realigns germline reads before somatic mutation spike-in, producing more realistic haplotype structure than naive spike-in.

The basic input of the workflow is:

  • Germline sequencing reads of the sample to be mutated, in FASTQ format.
  • Somatic variants to spike into the germline sequencing reads, in VCF format.

and the output is:

  • Sequencing reads with somatic mutations spiked in, in FASTQ or BAM format.

Running

To run the workflow, you'll need Snakemake, Conda, and Singularity installed.

Configuration

You'll need to complete a YAML file specifying the inputs and parameters. An example is included in the config directory.

Example

To run the bundled example locally:

$ snakemake --configfile config/example.yaml --cores 16 --use-conda --use-singularity --singularity-args "-B /run"

Once the workflow is complete, you should find a results directory containing:

  • NA24385_COAD_R{1,2}.fastq.gz: raw spiked sequencing reads.
  • NA24385_COAD.hs37d5.bwa-mem.bam: remapped spiked sequencing reads.
  • NA24385_COAD.hs37d5.vcf.gz: spiked-in somatic mutations - a subset of the input variants with added annotations.
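
As a quick sanity check after the example run, a small script along the following lines can confirm that the outputs listed above exist. This is only a sketch: the helper name missing_outputs is hypothetical, and since the exact layout inside results is an assumption, the search is recursive rather than tied to a particular subdirectory.

```python
from pathlib import Path

# File names taken from the list of example outputs above;
# R{1,2} is expanded into the two paired-end FASTQ files.
EXPECTED_OUTPUTS = [
    "NA24385_COAD_R1.fastq.gz",
    "NA24385_COAD_R2.fastq.gz",
    "NA24385_COAD.hs37d5.bwa-mem.bam",
    "NA24385_COAD.hs37d5.vcf.gz",
]


def missing_outputs(results_dir="results"):
    """Return the expected output file names not found under results_dir.

    Hypothetical helper: searches recursively because the directory
    layout inside results_dir is assumed, not documented.
    """
    root = Path(results_dir)
    if not root.is_dir():
        return list(EXPECTED_OUTPUTS)
    return [
        name
        for name in EXPECTED_OUTPUTS
        if not any(p.is_file() for p in root.rglob(name))
    ]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing outputs:", ", ".join(missing))
    else:
        print("All expected example outputs are present.")
```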

Note that the input variants are sampled to satisfy the specified somatic mutation rate(s). If you want all the input variants to be included, make sure the mutation_rates are sufficiently high.
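
To illustrate why a high enough rate keeps every input variant, here is a minimal sketch of rate-based sampling. It is not the workflow's actual implementation: the function name sample_variants, the unit (mutations per megabase), the genome_size_bp parameter, and uniform sampling without replacement are all assumptions made for illustration.

```python
import random


def sample_variants(variants, mutation_rate_per_mb, genome_size_bp, seed=0):
    """Pick a subset of variants approximating a target mutation rate.

    Assumptions (not taken from the workflow): the rate is expressed in
    mutations per megabase, and variants are sampled uniformly without
    replacement. Input order is preserved in the result.
    """
    target = round(mutation_rate_per_mb * genome_size_bp / 1_000_000)
    if target >= len(variants):
        # The requested rate needs at least as many variants as were
        # supplied, so every input variant is kept.
        return list(variants)
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(variants)), target))
    return [v for i, v in enumerate(variants) if i in chosen]
```

Under these assumptions, 100 input variants at 2 mutations per megabase over a 10 Mb region would yield 20 sampled variants, while any rate of 10 per megabase or more would return all 100.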

Take a look at the Snakemake docs to learn other ways of executing the workflow, such as running on a cluster.

Limitations

  • Only paired-end Illumina-quality reads are supported.
  • Only BWA-MEM and Octopus are options for read mapping and haplotyping, respectively.
  • Tumour subclonal structure (aka tumour phylogeny) is not modelled (only local haplotype structure is respected).
  • No CNVs - only small variants (SNVs/MNVs/indels).
  • Human samples only.
