crest

Classification of RelationShip Types

Introduction

CREST (Classification of RelationShip Types) is a tool that uses identity-by-descent (IBD) segments to classify second-degree relatives as avuncular, half-siblings, or grandparent/grandchild.

Quick Start

Follow these steps to get CREST results quickly and easily. All file names and directories in brackets should be replaced with names and directories of your choosing.

Your Data

CREST (and IBIS, if you choose to use it) requires genotype data in PLINK binary format (.bed, .bim, and .fam files). CREST currently uses autosomal IBD only, so if your data include non-autosomal variants, be sure to exclude them before the analysis.
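As a rough illustration of the autosomal-only requirement, the sketch below lists the variant IDs in a .bim file that sit on chromosomes 1 to 22. It is not part of CREST. The function name `autosomal_variant_ids` is hypothetical, and it assumes the standard whitespace-delimited PLINK .bim layout (chromosome code, variant ID, genetic position, base-pair position, two alleles) with plain numeric autosome codes.

```python
def autosomal_variant_ids(bim_lines):
    """Return IDs of variants whose chromosome code is 1..22.

    Hypothetical helper, not part of CREST. Assumes PLINK .bim columns:
    chromosome, variant ID, genetic position, bp position, allele 1, allele 2.
    """
    autosomes = {str(c) for c in range(1, 23)}
    kept = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        chrom, variant_id = fields[0], fields[1]
        if chrom in autosomes:
            kept.append(variant_id)
    return kept


if __name__ == "__main__":
    example = [
        "1 rs1 0 1000 A G",
        "23 rsX 0 2000 C T",  # PLINK's numeric code for the X chromosome
        "22 rs22 0.5 3000 G A",
    ]
    print(autosomal_variant_ids(example))
```

A list produced this way could be written to a file and passed to whatever variant-filtering tool you already use.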

Getting IBD Segments

We recommend using IBIS to extract IBD information. It tends to infer contiguous segments, which is especially important for sex inference. Before running IBIS, we advise adding a genetic map to your .bim file, as described in the IBIS documentation:

./add-map-plink.pl [your data].bim [map directory]/genetic_map_GRCh37_chr{1..22}.txt > [your new data].bim

Then run IBIS itself:

ibis [your data].bed [your new data].bim [your data].fam -f [your IBIS data]

Alternatively, if you rename the new .bim file so that it shares a prefix with the .bed and .fam files, you can supply all three at once by giving that prefix:

ibis -b [your data] -f [your IBIS data]

...

For sex inference, you will need to convert sex-specific genetic maps of your choosing to a .simmap format file. As an example, the following bash commands build one from the refined genetic map of Bherer et al.:

wget https://github.com/cbherer/Bherer_etal_SexualDimorphismRecombination/raw/master/Refined_genetic_map_b37.tar.gz
tar xvzf Refined_genetic_map_b37.tar.gz
printf "#chr\tpos\tmale_cM\tfemale_cM\n" > [your map].simmap
for chr in {1..22}; do
  paste Refined_genetic_map_b37/male_chr$chr.txt Refined_genetic_map_b37/female_chr$chr.txt \
    | awk -v OFS="\t" 'NR > 1 && $2 == $6 {print $1,$2,$4,$8}' \
    | sed 's/^chr//' >> [your map].simmap;
done
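To make the paste/awk step above easier to follow, here is a minimal Python sketch of the same per-chromosome merge. It is an illustration only: the function name `merge_sex_maps` is hypothetical, and it assumes, as the awk command implies, that each input file has a header line followed by rows whose first, second, and fourth fields are chromosome, position, and cumulative cM.

```python
def merge_sex_maps(male_lines, female_lines):
    """Merge male and female map rows into .simmap rows.

    Hypothetical helper mirroring the awk step: skip the header (NR > 1),
    keep rows whose positions agree ($2 == $6), and emit
    chromosome, position, male cM, female cM with any 'chr' prefix removed.
    """
    rows = []
    for male, female in zip(male_lines[1:], female_lines[1:]):
        m = male.split()
        f = female.split()
        if m[1] != f[1]:
            continue  # positions differ between the two maps; drop the row
        chrom = m[0][3:] if m[0].startswith("chr") else m[0]
        rows.append("\t".join([chrom, m[1], m[3], f[3]]))
    return rows


if __name__ == "__main__":
    male = ["chr pos rate cM", "chr1 100 0.1 0.0", "chr1 200 0.2 0.5"]
    female = ["chr pos rate cM", "chr1 100 0.3 0.0", "chr1 250 0.4 0.9"]
    print("#chr\tpos\tmale_cM\tfemale_cM")
    for row in merge_sex_maps(male, female):
        print(row)
```

Only the first data row survives in this toy input, because the second rows disagree on position.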

Now you are ready to run the sex-inference script:

CREST_sex_inference.py -i [your IBIS data].seg -m [your map].simmap -b [your (new) data].bim -o [sex inference output]

Thorough Start

Pre-CREST Data Generation and Curation

Relationship Type Inference

Parental Sex Inference

Command line arguments:

  • -i or --input: name of the input file (required)

  • -o or --output: name of the output file (required)

  • -m or --map: name of a genetic map in .simmap format (required)

  • -b or --bim: name of the .bim file from your IBD data (required)

  • -w or --window: window size in kilobases (optional, integer, defaults to 500 kb)

    • Windows are symmetric about the IBD segment ends, so a 500 kb window extends 250 kb in both directions.

  • -k or --keep: list of sample pairs to keep for sex-inference analysis (optional, defaults to None)
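The window arithmetic described for -w can be sketched as follows. This is an illustration of the stated symmetry, not CREST's own code. The function name `window_bounds` is hypothetical, and clamping the lower bound at zero is an assumption made here for the example.

```python
def window_bounds(segment_end_bp, window_kb=500):
    """Return (start, stop) in base pairs for a window centred on a segment end.

    Hypothetical helper. A window of `window_kb` kilobases extends half that
    distance on each side of the segment end. The lower bound is clamped at 0,
    which is an assumption of this sketch.
    """
    half_bp = window_kb * 1000 // 2
    start = max(0, segment_end_bp - half_bp)
    stop = segment_end_bp + half_bp
    return start, stop


if __name__ == "__main__":
    # Default 500 kb window around a segment end at 1,000,000 bp
    print(window_bounds(1_000_000))
```

With the default of 500 kb, a segment end at 1,000,000 bp gives a window from 750,000 bp to 1,250,000 bp.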

Output file format

Each line of the output file has the format

id1 id2 segment_number GP_lod HS_lod

The last two columns are quasi-LOD scores under the grandparent/grandchild (GP) and half-sibling (HS) models, respectively, where LOD = log10(p(maternal) / p(paternal)). In other words, a positive score indicates that a pair is more probably maternal, while a negative score indicates that a pair is more probably paternal.
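The sign convention above can be illustrated with a short parser. This is a minimal sketch rather than a CREST utility: the names `interpret_lod` and `parse_output_line` are hypothetical, the fields are assumed to be whitespace-delimited in the documented order, and labelling a score of exactly zero as "uninformative" is a choice made here.

```python
def interpret_lod(lod):
    """Map a quasi-LOD score to the more probable parental sex."""
    if lod > 0:
        return "maternal"
    if lod < 0:
        return "paternal"
    return "uninformative"  # assumption: a zero score favours neither


def parse_output_line(line):
    """Parse one output line: id1 id2 segment_number GP_lod HS_lod.

    Hypothetical helper; assumes whitespace-delimited fields.
    """
    id1, id2, segment_number, gp_lod, hs_lod = line.split()
    return {
        "pair": (id1, id2),
        "segment_number": segment_number,  # kept as text, not interpreted
        "GP": interpret_lod(float(gp_lod)),
        "HS": interpret_lod(float(hs_lod)),
    }


if __name__ == "__main__":
    # Made-up example line, not real CREST output
    print(parse_output_line("sampleA sampleB 12 1.5 -0.3"))
```

Because the score is a base-10 logarithm of a ratio, a value of 2 corresponds to p(maternal) / p(paternal) = 100 under the given model, and a value of -2 to a ratio of 0.01.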

Contributors

jenssannerud, williamslab
