This pre-processing procedure requires:
- QCTOOL, version 1.0
- GTOOL, version 1.0
- PLINK, version 2.00
- Python, version 3.7.4
Before starting the pre-processing procedure, edit the script paths_and_parameters.sh
to set the following:
- paths:
  - main project directory: prj_dir
  - QCTOOL software: oxford_dir
  - GTOOL software: oxford_dir2
  - PLINK software: plink_dir
  - genotype data: geno_dir
  - imputed data: imp_dir
  - output data: out_dir
- parameters:
  - list of chromosomes: CHR_LIST
  - SNP call rate: SNP_CALLRATE
  - SNP minor allele frequency: SNP_MAF
  - sample call rate: SAMPLE_CALLRATE
  - window size for linkage disequilibrium analysis: LD_WINDOWSIZE
  - step size for linkage disequilibrium analysis: LD_STEPSIZE
  - coefficient of correlation squared for linkage disequilibrium analysis: LD_R2
  - threshold for Hardy-Weinberg equilibrium analysis: HWE
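As a rough illustration, a filled-in paths_and_parameters.sh might look like the sketch below. The variable names come from the lists above. Every path and threshold value is a placeholder chosen for this example, not a recommended setting, and the space-separated format of CHR_LIST is an assumption.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: all paths and values below are placeholders.

# Paths
prj_dir="/path/to/project"        # main project directory
oxford_dir="/path/to/qctool"      # QCTOOL software
oxford_dir2="/path/to/gtool"      # GTOOL software
plink_dir="/path/to/plink"        # PLINK software
geno_dir="${prj_dir}/genotypes"   # genotype data (assumed layout)
imp_dir="${prj_dir}/imputed"      # imputed data (assumed layout)
out_dir="${prj_dir}/output"       # output data (assumed layout)

# Parameters
CHR_LIST="1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22"
SNP_CALLRATE=0.95      # SNP call rate (placeholder)
SNP_MAF=0.01           # SNP minor allele frequency (placeholder)
SAMPLE_CALLRATE=0.95   # sample call rate (placeholder)
LD_WINDOWSIZE=50       # LD window size (placeholder)
LD_STEPSIZE=5          # LD step size (placeholder)
LD_R2=0.2              # LD r^2 threshold (placeholder)
HWE=1e-6               # Hardy-Weinberg threshold (placeholder)
```

With a space-separated CHR_LIST, a step script could iterate over chromosomes with `for chr in $CHR_LIST; do ...; done`.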
Pre-processing can be performed by running the script main_script.sh
as follows:
bash ./code/main_script.sh
It successively calls the scripts for the individual steps, which can be found under ./code/steps_code.
Below is a description of each step:
Output examples for each step can be found under ./code/README.md
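The idea of a driver that calls the step scripts one after another can be sketched as follows. This is not the actual content of main_script.sh: the function name run_steps is hypothetical, and the sketch assumes that the step scripts end in .sh and sort alphabetically in the order they should run.

```bash
#!/usr/bin/env bash
# Hypothetical driver sketch; the real main_script.sh may differ.

# Run every *.sh file in a directory, in glob (alphabetical) order,
# stopping at the first step that fails.
run_steps() {
    local steps_dir="$1"
    local step
    for step in "$steps_dir"/*.sh; do
        # Skip the unexpanded pattern when the directory has no scripts.
        [ -e "$step" ] || continue
        echo "Running ${step}"
        bash "$step" || { echo "Step failed: ${step}" >&2; return 1; }
    done
}

# Only run when the steps directory is actually present.
if [ -d ./code/steps_code ]; then
    run_steps ./code/steps_code
fi
```

Stopping at the first failing step avoids running later steps on incomplete intermediate files.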
This repository is maintained by:
Data acquisition and analyses in the present study were conducted under UK Biobank Application #14762.