heuermh commented on September 23, 2024

GIAB has made available a "straw man" draft benchmark sequence-resolved SV callset (v0.4.0), built by integrating candidate SV and large indel calls of >=20bp:

ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/analysis/NIST_UnionSVs_05092017/Preliminary_Integrations_v0.4.0/

VCF header lines from the union (svanalyzer_union_170509_v0.4.0.vcf.gz)

FILTER=<ID=NoConsensusGT,Description="No individual had genotypes from svviz agree across all datasets with confident genotypes">
FILTER=<ID=LongReadHomRef,Description="Long reads supported homozygous reference for all individuals">
FORMAT=<ID=GTcons1,Number=1,Type=String,Description="Consensus Genotype using the GT from svviz2 rather than ref and alt allele counts, which is sometimes inaccurate for large variants">
FORMAT=<ID=PB_GT,Number=1,Type=String,Description="Genotype predicted by svviz from PacBio">
FORMAT=<ID=PB_REF,Number=1,Type=Integer,Description="Number of PacBio reads supporting the REF allele as predicted by svviz">
FORMAT=<ID=PB_ALT,Number=1,Type=Integer,Description="Number of PacBio reads supporting the ALT allele as predicted by svviz">
FORMAT=<ID=10X_GT,Number=1,Type=String,Description="Genotype predicted by svviz from 10X by combining the GT's from each haplotype">
FORMAT=<ID=10X_REF_HP1,Number=1,Type=Integer,Description="Number of 10X reads on haplotype 1 supporting the REF allele as predicted by svviz">
FORMAT=<ID=10X_ALT_HP1,Number=1,Type=Integer,Description="Number of 10X reads on haplotype 1 supporting the ALT allele as predicted by svviz">
FORMAT=<ID=10X_REF_HP2,Number=1,Type=Integer,Description="Number of 10X reads on haplotype 2 supporting the REF allele as predicted by svviz">
FORMAT=<ID=10X_ALT_HP2,Number=1,Type=Integer,Description="Number of 10X reads on haplotype 2 supporting the ALT allele as predicted by svviz">
FORMAT=<ID=ILL250bp_GT,Number=1,Type=String,Description="Genotype predicted by svviz from Illumina 250bp reads">
FORMAT=<ID=ILL250bp_REF,Number=1,Type=Integer,Description="Number of Illumina 250bp reads supporting the REF allele as predicted by svviz">
FORMAT=<ID=ILL250bp_ALT,Number=1,Type=Integer,Description="Number of Illumina 250bp reads supporting the ALT allele as predicted by svviz">
FORMAT=<ID=ILLMP_GT,Number=1,Type=String,Description="Genotype predicted by svviz from Illumina mate-pair reads">
FORMAT=<ID=ILLMP_REF,Number=1,Type=Integer,Description="Number of Illumina mate-pair reads supporting the REF allele as predicted by svviz">
FORMAT=<ID=ILLMP_ALT,Number=1,Type=Integer,Description="Number of Illumina mate-pair reads supporting the ALT allele as predicted by svviz">
FORMAT=<ID=BNG_LEN_DEL,Number=1,Type=Integer,Description="Length of a deletion predicted by BioNano in a region overlapping this variant">
FORMAT=<ID=BNG_LEN_INS,Number=1,Type=Integer,Description="Length of an insertion predicted by BioNano in a region overlapping this variant">
FORMAT=<ID=nabsys_svm,Number=1,Type=Float,Description="Nabsys SVM score for this variant if it was evaluated">
INFO=<ID=END,Number=1,Type=Integer,Description="End position of the structural variant">
INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of SV:DEL=Deletion, CON=Contraction, INS=Insertion, DUP=Duplication, INV=Inversion">
INFO=<ID=SVLEN,Number=.,Type=Integer,Description="Difference in length between REF and ALT alleles">
INFO=<ID=ClusterIDs,Number=1,Type=String,Description="IDs of SVs that cluster with this SV">
INFO=<ID=NumClusterSVs,Number=1,Type=Integer,Description="Total number of SV calls in this cluster">
INFO=<ID=ExactMatchIDs,Number=1,Type=String,Description="IDs of SVs that are exactly the same call as this SV">
INFO=<ID=NumExactMatchSVs,Number=1,Type=Integer,Description="Total number of SVs in this exact cluster">
INFO=<ID=ClusterMaxShiftDist,Number=1,Type=Float,Description="Maximum relative shift distance between two SVs in this cluster">
INFO=<ID=ClusterMaxSizeDiff,Number=1,Type=Float,Description="Maximum relative size difference between two SVs in this cluster">
INFO=<ID=ClusterMaxEditDist,Number=1,Type=Float,Description="Maximum relative edit distance between two SVs in this cluster">
INFO=<ID=PBcalls,Number=1,Type=Integer,Description="Number of PacBio calls in this cluster">
INFO=<ID=Illcalls,Number=1,Type=Integer,Description="Number of Illumina calls in this cluster">
INFO=<ID=TenXcalls,Number=1,Type=Integer,Description="Number of 10X Genomics calls in this cluster">
INFO=<ID=CGcalls,Number=1,Type=Integer,Description="Number of Complete Genomics calls in this cluster">
INFO=<ID=PBexactcalls,Number=1,Type=Integer,Description="Number of PacBio calls exactly matching the call output for this cluster">
INFO=<ID=Illexactcalls,Number=1,Type=Integer,Description="Number of Illumina calls exactly matching the call output for this cluster">
INFO=<ID=TenXexactcalls,Number=1,Type=Integer,Description="Number of 10X Genomics calls exactly matching the call output for this cluster">
INFO=<ID=CGexactcalls,Number=1,Type=Integer,Description="Number of Complete Genomics calls exactly matching the call output for this cluster">
INFO=<ID=HG2count,Number=1,Type=Integer,Description="Number of calls discovered in HG002 in this cluster">
INFO=<ID=HG3count,Number=1,Type=Integer,Description="Number of calls discovered in HG003 in this cluster">
INFO=<ID=HG4count,Number=1,Type=Integer,Description="Number of calls discovered in HG004 in this cluster">
INFO=<ID=NumTechs,Number=1,Type=Integer,Description="Number of technologies from which calls were discovered in this cluster">
INFO=<ID=NumTechsExact,Number=1,Type=Integer,Description="Number of technologies from which calls were discovered that exactly match the call output for this cluster">
INFO=<ID=DistBack,Number=1,Type=Integer,Description="Distance to the closest non-matching variant before this variant">
INFO=<ID=DistForward,Number=1,Type=Integer,Description="Distance to the closest non-matching variant after this variant">
INFO=<ID=DistMin,Number=1,Type=Integer,Description="Distance to the closest non-matching variant in either direction">
INFO=<ID=DistMinlt1000,Number=1,Type=String,Description="TRUE if Distance to the closest non-matching variant in either direction is less than 1000bp, suggesting possible complex or compound heterozygous variant">
INFO=<ID=MultiTech,Number=1,Type=String,Description="TRUE if callsets from more than one technology are in this cluster, i.e., NumTechs>1">
INFO=<ID=MultiTechExact,Number=1,Type=String,Description="TRUE if callsets from more than one technology exactly matches the call output for this cluster, i.e., NumTechsExact>1">
INFO=<ID=DistMinPASSlt1000,Number=1,Type=String,Description="TRUE if Distance to the closest non-matching PASS variant in either direction is less than 1000bp, suggesting possible complex or compound heterozygous variant or inaccurate call">
INFO=<ID=MendelianError,Number=1,Type=String,Description="TRUE if all individuals have a consensus GT and they are not consistent with Mendelian inheritance">
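For anyone wanting to compare these definitions against a schema, here is a minimal sketch of how header lines like the ones above could be split into key/value records. The function name `parse_header_line` and both regular expressions are illustrative assumptions, not part of any GIAB or bdg-formats tooling; the leading `##` is treated as optional because the lines are quoted here without it.

```python
import re

# KIND=<...>, with an optional leading "##" (the lines quoted above omit it).
_LINE = re.compile(r'^(?:##)?(?P<kind>[A-Za-z]+)=<(?P<body>.*)>$')
# key=value pairs; a value is either a double-quoted string (which may
# contain commas and ">") or an unquoted run of characters up to the next comma.
_PAIR = re.compile(r'(?P<key>[A-Za-z0-9_]+)=(?P<value>"(?:[^"\\]|\\.)*"|[^,]*)')


def parse_header_line(line):
    """Split one structured VCF header line into a dict (hypothetical helper)."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError("not a structured header line: %r" % line)
    record = {"kind": match.group("kind")}
    for pair in _PAIR.finditer(match.group("body")):
        value = pair.group("value")
        # Strip the surrounding quotes from quoted values such as Description.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        record[pair.group("key")] = value
    return record


example = parse_header_line(
    'INFO=<ID=MultiTech,Number=1,Type=String,Description="TRUE if callsets '
    'from more than one technology are in this cluster, i.e., NumTechs>1">'
)
print(example["kind"], example["ID"], example["Type"])
```

Quoted values are matched before unquoted ones, so the commas and `>` inside a `Description` do not break the split.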

and from one of the supporting files (union_170509_refalt.2.2.2.clustered.simpleINFO.techcounts.2techor5caller.vcf.gz)

FILTER=<ID=NOT2TECH,Description="All calls in this cluster were discovered from only one technology">
INFO=<ID=END,Number=1,Type=Integer,Description="End position of the structural variant">
INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of SV:DEL=Deletion, CON=Contraction, INS=Insertion, DUP=Duplication, INV=Inversion">
INFO=<ID=SVLEN,Number=.,Type=Integer,Description="Difference in length between REF and ALT alleles">
INFO=<ID=ClusterIDs,Number=1,Type=String,Description="IDs of SVs that cluster with this SV">
INFO=<ID=NumClusterSVs,Number=1,Type=Integer,Description="Total number of SV calls in this cluster">
INFO=<ID=ExactMatchIDs,Number=1,Type=String,Description="IDs of SVs that are exactly the same call as this SV">
INFO=<ID=NumExactMatchSVs,Number=1,Type=Integer,Description="Total number of SVs in this exact cluster">
INFO=<ID=ClusterMaxShiftDist,Number=1,Type=Float,Description="Maximum relative shift distance between two SVs in this cluster">
INFO=<ID=ClusterMaxSizeDiff,Number=1,Type=Float,Description="Maximum relative size difference between two SVs in this cluster">
INFO=<ID=ClusterMaxEditDist,Number=1,Type=Float,Description="Maximum relative edit distance between two SVs in this cluster">
INFO=<ID=PBcalls,Number=1,Type=Integer,Description="Number of PacBio calls in this cluster">
INFO=<ID=Illcalls,Number=1,Type=Integer,Description="Number of Illumina calls in this cluster">
INFO=<ID=TenXcalls,Number=1,Type=Integer,Description="Number of 10X Genomics calls in this cluster">
INFO=<ID=CGcalls,Number=1,Type=Integer,Description="Number of Complete Genomics calls in this cluster">
INFO=<ID=PBexactcalls,Number=1,Type=Integer,Description="Number of PacBio calls exactly matching the call output for this cluster">
INFO=<ID=Illexactcalls,Number=1,Type=Integer,Description="Number of Illumina calls exactly matching the call output for this cluster">
INFO=<ID=TenXexactcalls,Number=1,Type=Integer,Description="Number of 10X Genomics calls exactly matching the call output for this cluster">
INFO=<ID=CGexactcalls,Number=1,Type=Integer,Description="Number of Complete Genomics calls exactly matching the call output for this cluster">
INFO=<ID=HG2count,Number=1,Type=Integer,Description="Number of calls discovered in HG002 in this cluster">
INFO=<ID=HG3count,Number=1,Type=Integer,Description="Number of calls discovered in HG003 in this cluster">
INFO=<ID=HG4count,Number=1,Type=Integer,Description="Number of calls discovered in HG004 in this cluster">
INFO=<ID=NumTechs,Number=1,Type=Integer,Description="Number of technologies from which calls were discovered in this cluster">
INFO=<ID=NumTechsExact,Number=1,Type=Integer,Description="Number of technologies from which calls were discovered that exactly match the call output for this cluster">
INFO=<ID=DistBack,Number=1,Type=Integer,Description="Distance to the closest non-matching variant before this variant">
INFO=<ID=DistForward,Number=1,Type=Integer,Description="Distance to the closest non-matching variant after this variant">
INFO=<ID=DistMin,Number=1,Type=Integer,Description="Distance to the closest non-matching variant in either direction">
INFO=<ID=DistMinlt1000,Number=1,Type=String,Description="TRUE if Distance to the closest non-matching variant in either direction is less than 1000bp, suggesting possible complex or compound heterozygous variant">
INFO=<ID=MultiTech,Number=1,Type=String,Description="TRUE if callsets from more than one technology are in this cluster, i.e., NumTechs>1">
INFO=<ID=MultiTechExact,Number=1,Type=String,Description="TRUE if callsets from more than one technology exactly matches the call output for this cluster, i.e., NumTechsExact>1">
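Several of the String-typed INFO fields are described as thresholds on the numeric ones (`NumTechs>1`, `NumTechsExact>1`, distance less than 1000bp). A rough sketch of those relationships, assuming the INFO values have already been parsed into integers; `derive_flags` is a hypothetical helper, and the `"FALSE"` spelling is an assumption, since the headers only state when a value is `TRUE`. This illustrates the descriptions only and is not how the GIAB files were actually produced.

```python
def derive_flags(info):
    """Recompute the TRUE/FALSE INFO flags from their numeric counterparts.

    `info` is a dict holding integer NumTechs, NumTechsExact and DistMin.
    """
    def as_flag(condition):
        # The headers only say "TRUE if ..."; "FALSE" is an assumed spelling.
        return "TRUE" if condition else "FALSE"

    return {
        "MultiTech": as_flag(info["NumTechs"] > 1),
        "MultiTechExact": as_flag(info["NumTechsExact"] > 1),
        "DistMinlt1000": as_flag(info["DistMin"] < 1000),
    }


print(derive_flags({"NumTechs": 3, "NumTechsExact": 1, "DistMin": 250}))
```

Having the numeric fields alongside the flags means the flags are redundant for storage and could be recomputed on read.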

from bdg-formats.

fnothaft commented on September 23, 2024

...and just in time for your visit!


heuermh commented on September 23, 2024

Closing as WontFix

