Comments (13)
Hi @prasundutta87, given that --haploid_contigs and --par_regions_bed are arguments for postprocess_variants, you can try using:
`--postprocess_variants_extra_args="--haploid_contigs=yourValue,--par_regions_bed=yourValue"`
If you're using Docker, make sure the par_regions_bed path is one that the container can access.
from deepvariant.
And actually, if it's on the child sample or parent samples, you should use postprocess_variants_child_extra_args, postprocess_variants_parent1_extra_args, and postprocess_variants_parent2_extra_args.
Hi @pichuan , thanks a lot for the quick reply. I am running the "run_deeptrio" script from DeepTrio docker (v1.6.1). As I have trios, I am running it all together in one command as described in https://github.com/google/deepvariant/blob/r1.6.1/docs/deeptrio-pacbio-case-study.md. I will then be merging the single sample gvcfs using GLnexus as described there.
Since I am using trios, will `--postprocess_variants_extra_args="--haploid_contigs=yourValue,--par_regions_bed=yourValue"` alone be enough, or do I need to use the second suggestion? Here is the example command I am running, for your reference:
apptainer run \
SNV_analysis_DeepTrio/images/deepvariant_deeptrio-1.6.1.sif \
run_deeptrio \
--model_type=PACBIO \
--ref=hg38_reference/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna \
--reads_child=BAMS/Proband_sorted.bam \
--reads_parent1=BAMS/Parent-1_sorted.bam \
--reads_parent2=BAMS/Parent-2_sorted.bam \
--output_vcf_child output/Proband.output.vcf.gz \
--output_vcf_parent1 output/Parent-1.output.vcf.gz \
--output_vcf_parent2 output/Parent-2.output.vcf.gz \
--sample_name_child 'Proband' \
--sample_name_parent1 'Parent-1' \
--sample_name_parent2 'Parent-2' \
--intermediate_results_dir output/intermediate_results_dir \
--output_gvcf_child output/Proband.g.vcf.gz \
--output_gvcf_parent1 output/Parent-1.g.vcf.gz \
--output_gvcf_parent2 output/Parent-2.g.vcf.gz \
--regions chr21 \
--num_shards=5
Hi @prasundutta87 ,
Right, given that you're running on a trio, you'll definitely want to specify different settings for parent1 and parent2: only the dad would use the haploid_contigs and par_regions_bed flags. The mom has two copies of chrX, so you should not specify chrX as haploid for her.
And, for the child: you should only use those flags if the child is male.
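The per-sample rule above can be sketched as a small shell helper. This is only an illustration: `haploid_flags` is a hypothetical function name, the BED path is an assumed placeholder, and the contig names chrX and chrY assume a GRCh38-style reference. It prints the flags as they would be passed to postprocess_variants directly, not in the comma-separated extra_args form.

```shell
#!/usr/bin/env bash
# Sketch only: choose the postprocess_variants flags for one sample by sex.
# haploid_flags is a hypothetical helper, not part of DeepVariant or DeepTrio.
# $1 = sample sex ("male" or "female"), $2 = path to a PAR regions BED file.
haploid_flags() {
  if [ "$1" = "male" ]; then
    # One copy of chrX and chrY outside the PAR regions: call them as haploid.
    printf '%s\n' "--haploid_contigs=chrX,chrY --par_regions_bed=$2"
  else
    # Two copies of chrX: no haploid flags, so print an empty line.
    printf '\n'
  fi
}

PAR_BED="/input/par_regions.bed"  # assumed path, adjust to your setup
echo "parent1 (father): $(haploid_flags male "$PAR_BED")"
echo "parent2 (mother): $(haploid_flags female "$PAR_BED")"
echo "child (if male):  $(haploid_flags male "$PAR_BED")"
```

For a female child, the helper would be called with `female` and print no flags, matching the advice for the mother.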
Sounds good, thanks a lot @pichuan. Just an additional query: if I want to joint call SNVs/indels in a set of multiple trios, which pipeline is recommended, DeepVariant or DeepTrio? My understanding is that if we joint call multiple samples together, the cohort allele frequency can be used to filter rare variants. If we use DeepTrio, will the AF calculation be restricted to only the trio or family? What is the best practice here for DeepVariant?
Both DeepVariant and DeepTrio can produce gVCFs, so you can joint call in a similar manner. In both cases you should use the DeepVariant_unfiltered preset for GLnexus, because there is family structure present which the other filtering presets wouldn't know about.
Because you can joint genotype multiple trio gVCFs from either DeepVariant or DeepTrio in the same way, I would use DeepTrio to produce the gVCFs, take all of the gVCFs and run them together through glnexus, and then you can still use allele frequency information.
To be clear, the unfiltered preset looks like this:
sudo docker run \
-v "${PWD}/output":"/output" \
quay.io/mlin/glnexus:v1.2.7 \
/usr/local/bin/glnexus_cli \
--config DeepVariant_unfiltered \
/output/child_trio_1.g.vcf.gz \
/output/parent1_trio_1.g.vcf.gz \
/output/parent2_trio_1.g.vcf.gz \
/output/child_trio_2.g.vcf.gz \
/output/parent1_trio_2.g.vcf.gz \
/output/parent2_trio_2.g.vcf.gz \
| sudo docker run -i google/deepvariant:deeptrio-"${BIN_VERSION}" \
bcftools view - \
| sudo docker run -i google/deepvariant:deeptrio-"${BIN_VERSION}" \
bgzip -c > output/trio_cohort_merged.vcf.gz
Hi @AndrewCarroll, sorry for the late reply, but aren't the underlying models different for DeepTrio and DeepVariant?
Yes, they are different models. You would presumably want to use either one or the other (all DeepVariant or all DeepTrio). You can merge either with glnexus though.
Sounds good, thanks a lot, @AndrewCarroll
Hi @pichuan,
I tried to use postprocess_variants_child_extra_args, postprocess_variants_parent1_extra_args, and postprocess_variants_parent2_extra_args with DeepTrio, running the Docker image docker://google/deepvariant:deeptrio-1.6.1.
Here's my example code, where parent1 is the father in the trio:
/opt/deepvariant/bin/deeptrio/run_deeptrio \
--model_type "WGS" \
--ref {params.ref_genome} \
--reads_child {input.child_cram} \
--reads_parent1 {input.dad_cram} \
--reads_parent2 {input.mom_cram} \
--output_vcf_child {output}/{params.child_name}.vcf.gz \
--output_vcf_parent1 {output}/{params.dad_name}.vcf.gz \
--output_vcf_parent2 {output}/{params.mom_name}.vcf.gz \
--sample_name_child {params.child_name} \
--sample_name_parent1 {params.dad_name} \
--sample_name_parent2 {params.mom_name} \
--num_shards {threads} \
--intermediate_results_dir deeptrio_tmp/{wildcards.family} \
--postprocess_variants_parent1_extra_args="--haploid_contigs="chrX,chrY",--par_regions_bed={input.PAR}" \
--output_gvcf_child {output}/{params.child_name}.g.vcf.gz \
--output_gvcf_parent1 {output}/{params.dad_name}.g.vcf.gz \
--output_gvcf_parent2 {output}/{params.mom_name}.g.vcf.gz \
--novcf_stats_report
but I got the following error:
FATAL Flags parsing error: Unknown command line flag 'postprocess_variants_parent1_extra_args'. Did you mean: postprocess_variants_extra_args ?
Pass --helpshort or --helpfull to see help on flags.
Checking --helpfull, it seems that only --postprocess_variants_extra_args is available.
Did I misunderstand the use of postprocess_variants_parent1_extra_args?
Thank you.
Hi @zihhuafang ,
The flag was added after 1.6.1 was released. Sorry about that.
Note that run_deeptrio is a wrapper script that just runs the underlying binaries. So, for now, you can run run_deeptrio with the --dry_run flag, which will print out all the commands it is going to run for each of the steps.
From there, you can modify the commands to make sure postprocess_variants has the correct flags for the corresponding samples. Sorry for the inconvenience. The postprocess_variants_parent1_extra_args flag will be available in the next release.
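As a rough sketch of that workaround, the commands printed by --dry_run could be saved to a file and the father's postprocess_variants command patched before running it by hand. The helper below is hypothetical (`add_haploid_flags` is not part of DeepTrio), and it rests on two assumptions that should be checked against the actual dry-run output: each command sits on a single line, and the sample's command can be recognized by its output VCF path. The demo input is a made-up stand-in, not real DeepTrio output.

```shell
#!/usr/bin/env bash
# Sketch only: append haploid flags to one sample's postprocess_variants
# command in text captured from `run_deeptrio ... --dry_run`.
# Assumptions: one command per line; the sample is identified by its VCF path.
# $1 = output VCF path of the male sample, $2 = PAR BED path; reads stdin.
add_haploid_flags() {
  awk -v vcf="$1" -v bed="$2" '
    # Only touch postprocess_variants lines that mention this sample.
    index($0, "postprocess_variants") && index($0, vcf) {
      $0 = $0 " --haploid_contigs=chrX,chrY --par_regions_bed=" bed
    }
    { print }   # every line is passed through, patched or not
  '
}

# Made-up stand-in for two captured commands (father first, then mother).
printf '%s\n' \
  'postprocess_variants --outfile output/Parent-1.output.vcf.gz' \
  'postprocess_variants --outfile output/Parent-2.output.vcf.gz' \
  | add_haploid_flags output/Parent-1.output.vcf.gz /input/par_regions.bed
```

Only the line for the father's VCF gains the two flags; the mother's command passes through unchanged. The patched commands still need to be reviewed and run in the same order that run_deeptrio printed them.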
Hi @pichuan,
Thanks a lot for the explanation.
Will proceed as you suggested!
If I generated a VCF file for a trio with a father (using DeepTrio) or for a solo male (using DeepVariant) without the chrX,chrY haploid option, can I fix the VCF after the run is finished?
As a suggestion, it would be nice if the algorithm took care of this automatically :)