reviewer's Introduction

Repeat Expansion Viewer (REViewer)

REViewer is a tool for visualizing alignments of reads in regions containing tandem repeats. REViewer requires a BAMlet with graph-realigned reads generated by ExpansionHunter and the corresponding variant catalog.

License

REViewer is provided under the terms and conditions of the GPLv3 license. It relies on several third-party packages provided under other open source licenses; please see COPYRIGHT.txt for additional details.

Installation

The simplest way of obtaining REViewer is by downloading a Linux binary corresponding to the latest release from the Releases page. The link to the binary is located in the Assets section.

REViewer can also be built from source with CMake.

cd REViewer
mkdir build; cd build
cmake ..; make

Usage

REViewer requires output files generated by ExpansionHunter v3.0.0 or above along with the matching variant catalog file and reference genome.

REViewer \
  --reads <BAMlet generated by ExpansionHunter> \
  --vcf <VCF file generated by ExpansionHunter> \
  --reference <FASTA file with reference genome> \
  --catalog <Variant catalog> \
  --locus <Locus to analyze> \
  --output-prefix <Prefix for the output files>

Note that the BAMlet generated by ExpansionHunter (--reads parameter) must be sorted and indexed.
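
Because the index requirement is easy to miss, it can help to assemble the command programmatically and check for the index first. The sketch below is a minimal illustration in Python, not part of REViewer: the function name build_reviewer_command is hypothetical, and it assumes the index sits next to the BAMlet under one of the conventional .bai names. Only the six options listed above are used.

```python
from pathlib import Path


def build_reviewer_command(reads, vcf, reference, catalog, locus,
                           output_prefix, executable="REViewer"):
    """Return the REViewer argument list, refusing to proceed without a BAM index.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    reads = Path(reads)
    # Assumption: the index is stored as <file>.bam.bai or <file>.bai
    # in the same directory as the BAMlet.
    candidates = [Path(str(reads) + ".bai"), reads.with_suffix(".bai")]
    if not any(path.exists() for path in candidates):
        raise FileNotFoundError(
            f"No index found for {reads}; sort and index the BAMlet first"
        )
    return [
        executable,
        "--reads", str(reads),
        "--vcf", str(vcf),
        "--reference", str(reference),
        "--catalog", str(catalog),
        "--locus", str(locus),
        "--output-prefix", str(output_prefix),
    ]
```

The returned list can be handed to subprocess.run(cmd, check=True) once the inputs are in place.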

Introductory guides

  • A blog post describing the method
  • Examples of read pileups corresponding to correctly and incorrectly genotyped repeats.
  • You can use the files under /reviewer/tests/inputs/ to test REViewer on your own machine. (Do not use the outputs from the ExpansionHunter repository example: they contain variant locus features that REViewer does not support and will crash it.)
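
One way to tell in advance whether a catalog entry is likely to be rejected is to look at its variant types. The following Python sketch is an illustration under stated assumptions, not REViewer's own validation: it assumes the catalog is a JSON list of locus records, that VariantType is a string or a list of strings, and that only Repeat and RareRepeat are handled. The function name unsupported_loci is hypothetical.

```python
import json

# Assumption: these are the only variant types REViewer can plot.
REPEAT_TYPES = {"Repeat", "RareRepeat"}


def unsupported_loci(catalog_text):
    """Map each locus ID to the variant types in it that are not repeats."""
    flagged = {}
    for record in json.loads(catalog_text):
        types = record.get("VariantType", [])
        # A single-variant locus may give the type as a bare string.
        if isinstance(types, str):
            types = [types]
        other = [t for t in types if t not in REPEAT_TYPES]
        if other:
            flagged[record["LocusId"]] = other
    return flagged
```

Loci that appear in the returned dictionary are candidates to skip when choosing a value for --locus.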

Companion tools

  • FlipBook is an image server for REViewer developed by Ben Weisburd. It provides a convenient way to inspect large quantities of read pileups.

  • Review BAMs is a script that applies REViewer to a regular BAM file (by running ExpansionHunter in the background). It was developed by Andreas Halman.

reviewer's People

Contributors

canyansi, ctsa, egor-dolzhenko, fo40225, sclamons

reviewer's Issues

Segmentation fault (core dumped)

Hi @egor-dolzhenko,
This is more of a combined REViewer/EH question.

I am trying to generate a REViewer plot for a locus from RNA-seq data. It runs until the plot blueprint stage and then fails with a segmentation fault.
[2021-11-08 17:42:57.730] [info] Loading specification of locus NIPA1
[2021-11-08 17:42:57.823] [info] Extracted 84 frags
[2021-11-08 17:42:57.823] [info] Calculating fragment length
[2021-11-08 17:42:57.824] [info] Fragment length is estimated to be 191
[2021-11-08 17:42:57.824] [info] Extracting genotype paths
[2021-11-08 17:42:57.826] [info] Phasing
[2021-11-08 17:42:57.830] [info] Found 2 paths defining genotype
[2021-11-08 17:42:57.830] [info] Projecting reads onto haplotype paths
[2021-11-08 17:42:57.831] [info] Projected 84 read pairs
[2021-11-08 17:42:57.831] [info] Generating fragment alignments
[2021-11-08 17:42:57.831] [info] Generated 84 fragment alignments
[2021-11-08 17:42:57.831] [info] Assigning fragment origins
[2021-11-08 17:42:57.833] [info] Found assignments for 84 frags
[2021-11-08 17:42:57.833] [info] Generating plot blueprint
Segmentation fault (core dumped)

The locus information from the ExpansionHunter run is as follows:
"NIPA1": {
  "AlleleCount": 2,
  "Coverage": 14.291508923742562,
  "FragmentLength": 191,
  "LocusId": "NIPA1",
  "ReadLength": 151,
  "Variants": {
    "NIPA1": {
      "CountsOfFlankingReads": "()",
      "CountsOfInrepeatReads": "()",
      "CountsOfSpanningReads": "()",
      "Genotype": "0/0",
      "GenotypeConfidenceInterval": "0-714/0-714",
      "ReferenceRegion": "chr15:22786677-22786701",
      "RepeatUnit": "GCG",
      "VariantId": "NIPA1",
      "VariantSubtype": "Repeat",
      "VariantType": "Repeat"

If the lack of flanking (FR), in-repeat (IRR), and spanning (SR) reads is causing the error, then I wonder how the coverage for the locus was calculated.

Regards,
Hasna
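
A quick way to find loci like this one before plotting is to scan the ExpansionHunter JSON output for variants whose three read-count fields are all empty. The Python sketch below assumes the layout visible in the excerpt above, nested under a top-level LocusResults key. The function name is hypothetical, and whether empty counts are what triggers the crash is not established here.

```python
import json

COUNT_FIELDS = (
    "CountsOfFlankingReads",
    "CountsOfInrepeatReads",
    "CountsOfSpanningReads",
)


def variants_without_reads(eh_json_text):
    """Return (locus ID, variant ID) pairs whose read-count fields are all "()"."""
    results = json.loads(eh_json_text).get("LocusResults", {})
    empty = []
    for locus_id, locus in results.items():
        for variant_id, variant in locus.get("Variants", {}).items():
            # A missing field is treated like an empty one.
            counts = [variant.get(field, "()").strip() for field in COUNT_FIELDS]
            if all(value == "()" for value in counts):
                empty.append((locus_id, variant_id))
    return empty
```

Any pair it reports has no supporting flanking, in-repeat, or spanning reads in the JSON, so a pileup for that variant would have little to show.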

Repeat units differ significantly in VCF and SVG

Hi,

We have a case where there seems to be a bug in REViewer.
The repeat units for C9ORF72 are 2/677 according to the VCF generated by ExpansionHunter 5.0.0:
chr9 27573528 . C , . PASS END=27573546;REF=3;RL=18;RU=GGCCCC;VARID=C9ORF72;REPID=C9ORF72 GT:SO:REPCN:REPCI:ADSP:ADFL:ADIR:LC 1/2:SPANNING/INREPEAT:2/677:2-2/628-970:27/0:7/41:0/612:57.641694

However, when running REViewer 0.2.7, the SVG image shows 386 repeat units for the second allele:
DNA2204013A1_02_repeats_expansionhunter_C9ORF72

This file should contain everything you need to replicate the issue:
REviewer_issue_.zip

If you need more information or data, I can provide that.

Best,
Marc

Note to me: it is in sample DNA2204013A1_02
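
To compare the two numbers directly, the per-allele repeat counts can be read from the REPCN entry of the sample column. This is a minimal Python sketch, assuming a single-sample, tab-separated VCF record with the FORMAT keys shown above. The helper repeat_counts is hypothetical and not part of ExpansionHunter or REViewer.

```python
def repeat_counts(vcf_line):
    """Extract the REPCN values of a single-sample VCF record as integers.

    Missing values (".") are returned as None.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected at least 10 tab-separated VCF columns")
    # Column 9 lists the FORMAT keys, column 10 holds the sample values.
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    return [None if n == "." else int(n) for n in sample["REPCN"].split("/")]
```

Applied to the record above (with its columns tab-separated), this gives the counts 2 and 677, which are the values to compare against the allele sizes drawn in the SVG.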

Can't Process example in ExpansionHunter

Here is my command:
../REViewer/build/install/bin/REViewer --reads example/output/repeats_realigned.srt.bam --vcf example/output/repeats.vcf --reference example/input/reference.fa --catalog example/input/variants.json --locus SNV_AND_STR --output-prefix test

Here is the response:
[2021-03-23 08:54:23.506] [info] Loading specification of locus SNV_AND_STR
[2021-03-23 08:54:23.537] [info] Extracted 549 frags
[2021-03-23 08:54:23.537] [info] Calculating fragment length
[2021-03-23 08:54:23.538] [info] Fragment length is estimated to be 351
[2021-03-23 08:54:23.538] [info] Extracting genotype paths
REViewer: /home/missionbio/REViewer/reviewer/app/GenotypePaths.cpp:106: std::map<std::pair<unsigned int, unsigned int>, std::vector<std::vector<unsigned int> > > getVariantPathPieces(int, const string&, const LocusSpecification&): Assertion `variantSpec.classification().type == VariantType::kRepeat' failed.
Aborted (core dumped)

REViewer v0.2.1
ExpansionHunter v4.0.2 / v3.1.2 / v3.2.0

Add --vcf to README Usage section.

Like other ExpansionHunter tools, it was really easy to install and start using REViewer.
The one error I ran into was that the --vcf arg is required.
Also, it might be worth noting that the BAMlet generated by ExpansionHunter has to be sorted and indexed.

"Missing flanking read fragments"

Hi Egor @egor-dolzhenko

I am using a BAM file containing manually selected paired-end reads that align to the repeat unit (at the genomic location defined in the variant catalog) to generate a REViewer plot. We extracted these reads from the realigned BAM file generated by ExpansionHunter.

I get the following error:

"Missing flanking read fragments"

What would this error message imply and is there a way to rectify this?

Attempting to run REViewer on ExpansionHunter's example data gives assertion failure

Heyo, I'm not very savvy with cpp so can't quite figure this one out on my own. Full error message:

REViewer-v0.2.7-linux_x86_64: app/GenotypePaths.cpp:119: std::map<std::pair<unsigned int, unsigned int>, std::vector<std::vector<unsigned int> > > getGenotypeNodesByNodeRange(int, const string&, const LocusSpecification&): Assertion `variantSpec.classification().type == VariantType::kRepeat' failed.
run_REViewer.scr: line 7:  7312 Aborted                 (core dumped) REViewer-v0.2.7-linux_x86_64 --reads repeats_realigned.sorted.bam --reference ../input/reference.fa --catalog ../input/variants.json --vcf repeats.vcf --locus SNV_AND_STR --output-prefix reviewer_out_

All the input files are identical to or derived from the example data bundled with the latest ExpansionHunter release. The only change is I sorted and indexed the bamlet with samtools.

Thanks in advance!

metrics.tsv AlleleDepth of head and v0.2.7 deviate significantly

I compiled version 0.2.7 and the head of the default branch. The AlleleDepth output differs between the two: the total AlleleDepth is the same, but the first allele always seems lower. A minor deviation was expected, but the variation is quite significant.

For example (variant names are changed because of privacy concerns):

==> sampleX.metrics.tsv_0.2.7 <==

VariantId       Genotype        AlleleDepth
var1            4/4             16.44/15.44
var2            6/6             24.11/18.72
var3            4/4             28.38/25.75
var4            9/9             22.39/21.39
var5            4/4             15.75/16.95
var6            2/2             15.45/19.91
var6       2/2     15.45/19.91

==> sampleX.metrics.tsv_head <==

VariantId       Genotype        AlleleDepth
var1            4/4             12.56/19.31
var2            6/6             19.39/23.44
var3            4/4             24.94/29.19
var4            9/9             18.22/25.56
var5            4/4             23.60/9.10
var6            2/2             19.82/15.55

The input files are identical. Both binaries were compiled in the same container (available upon request), and the v0.2.7 binary I compiled gives the same results as the statically compiled version downloaded from GitHub.

Which version should be considered the "gold standard"?

GenotypePaths.cpp:125: std::map<std::pair<unsigned int, unsigned int>, std::vector<std::vector<unsigned int> > > getVariantPathPieces(int, const string&, const LocusSpecification&): Assertion `variantPaths.size() == 2' failed

I ran into this error in the following run

REViewer --reads expansion_hunter4_realigned.sorted.bam --vcf expansion_hunter4.vcf --reference /gcsfuse_mounts/gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta --catalog repeat_specs__all_pathogenic_loci_variant_catalog.json --locus chrX-25013649-25013697-CGC --output-prefix chrX-25013649-25013697-CGC_ExpansionHunter4
[2021-01-15 12:30:28.244] [info] Loading specification of locus chrX-25013649-25013697-CGC
[2021-01-15 12:30:28.586] [info] Extracted 99 frags
[2021-01-15 12:30:28.586] [info] Calculating fragment length
[2021-01-15 12:30:28.586] [info] Fragment length is estimated to be 352
[2021-01-15 12:30:28.586] [info] Extracting genotype paths
REViewer: /REViewer-master/reviewer/app/GenotypePaths.cpp:125: std::map<std::pair<unsigned int, unsigned int>, std::vector<std::vector<unsigned int> > > getVariantPathPieces(int, const string&, const LocusSpecification&): Assertion `variantPaths.size() == 2' failed.
/bin/bash: line 65:  1604 Aborted                 (core dumped) REViewer --reads expansion_hunter4_realigned.sorted.bam --vcf expansion_hunter4.vcf --reference /gcsfuse_mounts/gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta --catalog /localized/gnomad-bw2/project1/ref/GRCh38/repeats_db/repeat_specs__all_pathogenic_loci_variant_catalog.json --locus chrX-25013649-25013697-CGC --output-prefix chrX-25013649-25013697-CGC-ExpansionHunter4

Diplotype score in phasing.tsv

How do we interpret the diplotype score reported in phasing.tsv?
What does it mean, and how can we use it for QC filtering?

Allow compressed VCFs

We are very used to having only compressed VCFs. While the ExHu-VCFs are tiny, it would be nifty if REViewer could read compressed files as well, whether via htslib or just libgz. Cheers, and thanks for the good work!
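
If the REViewer build at hand only reads uncompressed VCFs, a small wrapper can decompress the file before it is passed to --vcf. The Python sketch below uses only the standard library and relies on bgzip output being readable as ordinary gzip. The name ensure_plain_vcf is hypothetical, and nothing here changes REViewer itself.

```python
import gzip
import shutil
from pathlib import Path


def ensure_plain_vcf(vcf_path, out_dir=None):
    """Return a path to an uncompressed copy of vcf_path.

    Files that are not gzip-compressed are returned unchanged.
    """
    vcf_path = Path(vcf_path)
    with open(vcf_path, "rb") as handle:
        # gzip and bgzip files both start with the magic bytes 1f 8b.
        is_gzip = handle.read(2) == b"\x1f\x8b"
    if not is_gzip:
        return vcf_path
    name = vcf_path.name
    plain_name = name[:-3] if name.endswith(".gz") else name + ".decompressed"
    target_dir = Path(out_dir) if out_dir is not None else vcf_path.parent
    target = target_dir / plain_name
    with gzip.open(vcf_path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

The returned path can then be used as the --vcf argument, leaving the compressed original untouched.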

How to launch test data retrieved from Expansion Hunter repository

Hello again!

I am interested in how to set up a correct pipeline consisting of EH + Stranger (big thanks to @dnil) + REViewer, which should work on test data first.
I took the variant catalog from the example at https://github.com/Illumina/ExpansionHunter/blob/master/example/input/variants.json and got this error:
[error] Error loading locus SNV_AND_STR: Flanks can contain at most 5 characters N but found 2000 Ns.

Does REViewer work on test data from EH, or should I use another example?

Thanks.

Bamlet generation for Reviewer viewing

Hi Egor,
I am generating BAMlets to create plots with REViewer in support of ExpansionHunter results. EH generates a new BAM file with the suffix "_realigned.bam". Is that the BAMlet?
Should I use that file or use my original bam file as input for makebamlet.py (EHDN) to make bamlets for my desired genes?
-Hasna

Installation error

Hi @egor-dolzhenko
I am trying to install REViewer on macOS from source.
However, I am running into the following error, which is probably related to the Boost libraries:

Scanning dependencies of target Boost
[ 42%] Creating directories for 'Boost'
[ 45%] Performing download step (download, verify and extract) for 'Boost'
-- Downloading...
dst='/Users/hasnahana/Downloads/REViewer-0.1.1/build/Boost-prefix/src/boost_1_73_0.tar.bz2'
timeout='none'
-- Using src='https://dl.bintray.com/boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2'
-- Retrying...
-- Using src='https://dl.bintray.com/boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2'
-- Retry after 5 seconds (attempt #2) ...
-- Using src='https://dl.bintray.com/boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2'
-- Retry after 5 seconds (attempt #3) ...
-- Using src='https://dl.bintray.com/boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2'
-- Retry after 15 seconds (attempt #4) ...
-- Using src='https://dl.bintray.com/boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2'
-- Retry after 60 seconds (attempt #5) ...
-- Using src='https://dl.bintray.com/boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2'
CMake Error at Boost-stamp/download-Boost.cmake:159 (message):
Each download failed!

error: downloading 'https://dl.bintray.com/boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2' failed
     status_code: 22
     status_string: "HTTP response code said error"
     log:
     --- LOG BEGIN ---
       Trying 52.38.32.109:443...

TCP_NODELAY set

Connected to dl.bintray.com (52.38.32.109) port 443 (#0)

ALPN, offering http/1.1

TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

ALPN, server did not agree to a protocol

Server certificate: *.bintray.com

Server certificate: GeoTrust RSA CA 2018

Server certificate: DigiCert Global Root CA

GET /boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2 HTTP/1.1

Host: dl.bintray.com

User-Agent: curl/7.65.0

Accept: */*

Mark bundle as not supporting multiuse

The requested URL returned error: 403 Forbidden

Closing connection 0

     --- LOG END ---
     error: downloading 'https://dl.bintray.com/boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2' failed
     status_code: 22
     status_string: "HTTP response code said error"
     log:
     --- LOG BEGIN ---
       Trying 35.164.5.2:443...

TCP_NODELAY set

Connected to dl.bintray.com (35.164.5.2) port 443 (#0)

ALPN, offering http/1.1

TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

ALPN, server did not agree to a protocol

Server certificate: *.bintray.com

Server certificate: GeoTrust RSA CA 2018

Server certificate: DigiCert Global Root CA

GET /boostorg/release/1.73.0/source/boost_1_73_0.tar.bz2 HTTP/1.1

Host: dl.bintray.com

User-Agent: curl/7.65.0

Accept: */*

Mark bundle as not supporting multiuse

The requested URL returned error: 403 Forbidden

Closing connection 0

     --- LOG END ---

make[2]: *** [Boost-prefix/src/Boost-stamp/Boost-download] Error 1
make[1]: *** [CMakeFiles/Boost.dir/all] Error 2
make: *** [all] Error 2

Issues building from source with cmake

Hi there!

I'm trying to install this in my local space on an HPC and I'm getting this error both when I clone the repo and when I download the latest release:

make[4]: *** [thirdparty/graph-tools/CMakeFiles/graphtools.dir/all] Error 2
make[3]: *** [all] Error 2
make[2]: *** [reviewer-prefix/src/reviewer-stamp/reviewer-build] Error 2
make[1]: *** [CMakeFiles/reviewer.dir/all] Error 2
make: *** [all] Error 2

I'm setting up the environment using conda. Any thoughts as to what is causing the error?

Thanks!
Sheina

Error fix

Hi, I'm trying to visualize the repeat analyzed with ExpansionHunter.
However, whenever I try, the error below comes up and it doesn't work, even when I specify ATN1 as the locus.

[error] Field LocusId must be present in {"LocusResults":{"ATN1":{"AlleleCount":2,"Coverage":18.405405405405403,"FragmentLength":545,"LocusId":"ATN1","ReadLength":150,"Variants":{"ATN1":{"CountsOfFlankingReads":"(3, 1), (5, 2), (8, 1), (9, 1), (10, 2), (11, 1), (14, 4), (17, 2), (18, 2)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(17, 1), (19, 9), (22, 5), (23, 1)","Genotype":"19/22","GenotypeConfidenceInterval":"19-19/22-22","ReferenceRegion":"chr12:6936716-6936773","RepeatUnit":"CAG","VariantId":"ATN1","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"ATXN1":{"AlleleCount":2,"Coverage":24.486486486486488,"FragmentLength":565,"LocusId":"ATXN1","ReadLength":150,"Variants":{"ATXN1":{"CountsOfFlankingReads":"(2, 1), (4, 1), (5, 1), (7, 1), (8, 2), (9, 3), (12, 2), (15, 2), (16, 1), (17, 2), (18, 1), (19, 2), (20, 1), (21, 4), (22, 2), (23, 1), (24, 2), (25, 2), (26, 2), (28, 1), (29, 2)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(29, 6), (30, 5)","Genotype":"29/30","GenotypeConfidenceInterval":"29-29/30-30","ReferenceRegion":"chr6:16327633-16327723","RepeatUnit":"TGC","VariantId":"ATXN1","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"ATXN10":{"AlleleCount":2,"Coverage":25.7027027027027,"FragmentLength":564,"LocusId":"ATXN10","ReadLength":150,"Variants":{"ATXN10":{"CountsOfFlankingReads":"(1, 4), (2, 4), (3, 3), (5, 2), (6, 4), (7, 4), (8, 2), (9, 1), (10, 1), (11, 3), (12, 2), (13, 7), (14, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(7, 1), (14, 10), (16, 7)","Genotype":"14/16","GenotypeConfidenceInterval":"14-14/16-16","ReferenceRegion":"chr22:45795354-45795424","RepeatUnit":"ATTCT","VariantId":"ATXN10","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"ATXN3":{"AlleleCount":2,"Coverage":27.486486486486484,"FragmentLength":553,"LocusId":"ATXN3","ReadLength":150,"Variants":{"ATXN3":{"CountsOfFlankingReads":"(1, 1), (2, 1), (3, 1), (6, 1), (7, 3), (8, 2), (9, 1), (13, 2), (14, 1), (15, 3), (16, 
1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(11, 13), (16, 12)","Genotype":"11/16","GenotypeConfidenceInterval":"11-11/16-16","ReferenceRegion":"chr14:92071009-92071042","RepeatUnit":"GCT","VariantId":"ATXN3","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"ATXN7":{"AlleleCount":2,"Coverage":12.81081081081081,"FragmentLength":551,"LocusId":"ATXN7","ReadLength":150,"Variants":{"ATXN7":{"CountsOfFlankingReads":"(4, 1), (6, 1), (7, 2), (9, 2)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(0, 1), (2, 1), (10, 10)","Genotype":"10/10","GenotypeConfidenceInterval":"10-10/10-10","ReferenceRegion":"chr3:63912684-63912714","RepeatUnit":"GCA","VariantId":"ATXN7","VariantSubtype":"Repeat","VariantType":"Repeat"},"ATXN7_GCC":{"CountsOfFlankingReads":"(1, 1), (2, 2), (3, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(4, 14)","Genotype":"4/4","GenotypeConfidenceInterval":"4-4/4-4","ReferenceRegion":"chr3:63912714-63912726","RepeatUnit":"GCC","VariantId":"ATXN7_GCC","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"ATXN8OS":{"AlleleCount":2,"Coverage":32.67567567567568,"FragmentLength":546,"LocusId":"ATXN8OS","ReadLength":150,"Variants":{"ATXN8OS":{"CountsOfFlankingReads":"(1, 1), (2, 5), (3, 2), (5, 1), (7, 1), (8, 1), (9, 2), (11, 4), (12, 1), (13, 2), (14, 1), (15, 2), (16, 3), (18, 1), (20, 1), (21, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(14, 1), (15, 13), (23, 11)","Genotype":"15/23","GenotypeConfidenceInterval":"15-15/23-23","ReferenceRegion":"chr13:70139383-70139428","RepeatUnit":"CTG","VariantId":"ATXN8OS","VariantSubtype":"Repeat","VariantType":"Repeat"},"ATXN8OS_CTA":{"CountsOfFlankingReads":"(2, 2), (3, 1), (4, 3), (6, 1), (7, 1), (9, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(8, 16), (9, 
20)","Genotype":"8/9","GenotypeConfidenceInterval":"8-8/9-9","ReferenceRegion":"chr13:70139353-70139383","RepeatUnit":"CTA","VariantId":"ATXN8OS_CTA","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"C9ORF72":{"AlleleCount":2,"Coverage":20.35135135135135,"FragmentLength":547,"LocusId":"C9ORF72","ReadLength":150,"Variants":{"C9ORF72":{"CountsOfFlankingReads":"(1, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(2, 32)","Genotype":"2/2","GenotypeConfidenceInterval":"2-2/2-2","ReferenceRegion":"chr9:27573528-27573546","RepeatUnit":"GGCCCC","VariantId":"C9ORF72","VariantSubtype":"RareRepeat","VariantType":"Repeat"}}},"CACNA1A":{"AlleleCount":2,"Coverage":19.945945945945947,"FragmentLength":511,"LocusId":"CACNA1A","ReadLength":150,"Variants":{"CACNA1A":{"CountsOfFlankingReads":"(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (7, 3), (9, 2), (10, 2)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(13, 17)","Genotype":"13/13","GenotypeConfidenceInterval":"13-13/13-13","ReferenceRegion":"chr19:13207858-13207897","RepeatUnit":"CTG","VariantId":"CACNA1A","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"CBL":{"AlleleCount":2,"Coverage":17.513513513513512,"FragmentLength":541,"LocusId":"CBL","ReadLength":150,"Variants":{"CBL":{"CountsOfFlankingReads":"(2, 1), (4, 2), (7, 1), (8, 1), (9, 1), (10, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(11, 24)","Genotype":"11/11","GenotypeConfidenceInterval":"11-11/11-11","ReferenceRegion":"chr11:119206289-119206322","RepeatUnit":"CGG","VariantId":"CBL","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"CNBP":{"AlleleCount":2,"Coverage":25.62162162162162,"FragmentLength":570,"LocusId":"CNBP","ReadLength":150,"Variants":{"CNBP":{"CountsOfFlankingReads":"(1, 1), (2, 3), (3, 1), (4, 2), (6, 2), (7, 4), (10, 1), (15, 3), (19, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(15, 14), (16, 1), (19, 
7)","Genotype":"15/19","GenotypeConfidenceInterval":"15-15/19-19","ReferenceRegion":"chr3:129172576-129172656","RepeatUnit":"CAGG","VariantId":"CNBP","VariantSubtype":"Repeat","VariantType":"Repeat"},"CNBP_CA":{"CountsOfFlankingReads":"(3, 1), (4, 2), (5, 1), (6, 1), (7, 2), (8, 1), (9, 1), (10, 2), (11, 1), (14, 1), (15, 3), (17, 2), (18, 3), (19, 2), (20, 1), (23, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(20, 2), (22, 7), (24, 1)","Genotype":"20/22","GenotypeConfidenceInterval":"20-20/22-22","ReferenceRegion":"chr3:129172696-129172732","RepeatUnit":"CA","VariantId":"CNBP_CA","VariantSubtype":"Repeat","VariantType":"Repeat"},"CNBP_CAGA":{"CountsOfFlankingReads":"(1, 1), (2, 2), (4, 1), (6, 1), (8, 1), (9, 1), (10, 5)","CountsOfInrepeatReads":"(2, 1)","CountsOfSpanningReads":"(2, 9), (9, 1), (10, 10)","Genotype":"2/10","GenotypeConfidenceInterval":"2-2/10-10","ReferenceRegion":"chr3:129172656-129172696","RepeatUnit":"CAGA","VariantId":"CNBP_CAGA","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"CSTB":{"AlleleCount":2,"Coverage":18.56756756756757,"FragmentLength":553,"LocusId":"CSTB","ReadLength":150,"Variants":{"CSTB":{"CountsOfFlankingReads":"(1, 3), (2, 4)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(2, 13), (3, 14)","Genotype":"2/3","GenotypeConfidenceInterval":"2-2/3-3","ReferenceRegion":"chr21:43776443-43776479","RepeatUnit":"CGCGGGGCGGGG","VariantId":"CSTB","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"DIP2B":{"AlleleCount":2,"Coverage":22.054054054054053,"FragmentLength":551,"LocusId":"DIP2B","ReadLength":150,"Variants":{"DIP2B":{"CountsOfFlankingReads":"(1, 2), (4, 1), (5, 1), (7, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(7, 
18)","Genotype":"7/7","GenotypeConfidenceInterval":"7-7/7-7","ReferenceRegion":"chr12:50505001-50505022","RepeatUnit":"GGC","VariantId":"DIP2B","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"DMPK":{"AlleleCount":2,"Coverage":21.486486486486488,"FragmentLength":528,"LocusId":"DMPK","ReadLength":150,"Variants":{"DMPK":{"CountsOfFlankingReads":"(2, 1), (3, 1), (5, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 2)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(5, 11), (13, 8)","Genotype":"5/13","GenotypeConfidenceInterval":"5-5/13-13","ReferenceRegion":"chr19:45770204-45770264","RepeatUnit":"CAG","VariantId":"DMPK","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"FXN":{"AlleleCount":2,"Coverage":21.243243243243242,"FragmentLength":535,"LocusId":"FXN","ReadLength":150,"Variants":{"FXN":{"CountsOfFlankingReads":"(1, 2), (2, 3), (3, 1), (5, 4), (6, 1), (9, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(8, 7), (9, 9)","Genotype":"8/9","GenotypeConfidenceInterval":"8-8/9-9","ReferenceRegion":"chr9:69037286-69037304","RepeatUnit":"GAA","VariantId":"FXN","VariantSubtype":"Repeat","VariantType":"Repeat"},"FXN_A":{"CountsOfFlankingReads":"(4, 1), (14, 1), (17, 1), (20, 2), (21, 1), (27, 1), (38, 1)","CountsOfInrepeatReads":"(13, 1), (63, 1)","CountsOfSpanningReads":"(4, 3), (15, 1), (19, 1), (26, 1), (27, 15), (30, 1), (31, 1), (128, 1)","Genotype":"27/150","GenotypeConfidenceInterval":"27-27/58-259","ReferenceRegion":"chr9:69037261-69037286","RepeatUnit":"A","VariantId":"FXN_A","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"GIPC1":{"AlleleCount":2,"Coverage":19.864864864864867,"FragmentLength":536,"LocusId":"GIPC1","ReadLength":150,"Variants":{"GIPC1":{"CountsOfFlankingReads":"(1, 1), (2, 1), (3, 1), (4, 1), (7, 1), (9, 3), (11, 1), (12, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(11, 1), (12, 
22)","Genotype":"12/12","GenotypeConfidenceInterval":"12-12/12-12","ReferenceRegion":"chr19:14496041-14496074","RepeatUnit":"CCG","VariantId":"GIPC1","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"GLS":{"AlleleCount":2,"Coverage":23.513513513513516,"FragmentLength":559,"LocusId":"GLS","ReadLength":150,"Variants":{"GLS":{"CountsOfFlankingReads":"(1, 1), (2, 2), (3, 3), (4, 1), (5, 2), (8, 1), (10, 1), (11, 1), (12, 4), (16, 1), (18, 2), (19, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(15, 11), (19, 8), (20, 1)","Genotype":"15/19","GenotypeConfidenceInterval":"15-15/19-19","ReferenceRegion":"chr2:190880872-190880920","RepeatUnit":"GCA","VariantId":"GLS","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"HTT":{"AlleleCount":2,"Coverage":21.243243243243242,"FragmentLength":531,"LocusId":"HTT","ReadLength":150,"Variants":{"HTT":{"CountsOfFlankingReads":"(2, 1), (3, 2), (4, 2), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (11, 1), (14, 5), (15, 1), (16, 3), (17, 2), (19, 1), (20, 2)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(16, 1), (17, 7), (20, 8)","Genotype":"17/20","GenotypeConfidenceInterval":"17-17/20-20","ReferenceRegion":"chr4:3074876-3074933","RepeatUnit":"CAG","VariantId":"HTT","VariantSubtype":"Repeat","VariantType":"Repeat"},"HTT_CCG":{"CountsOfFlankingReads":"(1, 3), (5, 2), (6, 1), (7, 3), (8, 1), (9, 1), (10, 1), (11, 4), (12, 3)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(12, 17)","Genotype":"12/12","GenotypeConfidenceInterval":"12-12/12-12","ReferenceRegion":"chr4:3074939-3074966","RepeatUnit":"CCG","VariantId":"HTT_CCG","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"JPH3":{"AlleleCount":2,"Coverage":22.864864864864863,"FragmentLength":571,"LocusId":"JPH3","ReadLength":150,"Variants":{"JPH3":{"CountsOfFlankingReads":"(1, 1), (2, 1), (4, 1), (7, 1), (8, 1), (9, 2), (11, 2), (12, 1), (13, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(14, 14), (15, 
9)","Genotype":"14/15","GenotypeConfidenceInterval":"14-14/15-15","ReferenceRegion":"chr16:87604287-87604329","RepeatUnit":"CTG","VariantId":"JPH3","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"NIPA1":{"AlleleCount":2,"Coverage":16.62162162162162,"FragmentLength":575,"LocusId":"NIPA1","ReadLength":150,"Variants":{"NIPA1":{"CountsOfFlankingReads":"(1, 1), (2, 1), (4, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(3, 1), (5, 1), (8, 14)","Genotype":"8/8","GenotypeConfidenceInterval":"8-8/8-8","ReferenceRegion":"chr15:22786677-22786701","RepeatUnit":"GCG","VariantId":"NIPA1","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"NOP56":{"AlleleCount":2,"Coverage":23.513513513513516,"FragmentLength":543,"LocusId":"NOP56","ReadLength":150,"Variants":{"NOP56":{"CountsOfFlankingReads":"(2, 1), (3, 2), (4, 2), (5, 3), (6, 3), (8, 3)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(7, 10), (8, 16)","Genotype":"7/8","GenotypeConfidenceInterval":"7-7/8-8","ReferenceRegion":"chr20:2652733-2652757","RepeatUnit":"GGCCTG","VariantId":"NOP56","VariantSubtype":"Repeat","VariantType":"Repeat"},"NOP56_CGCCTG":{"CountsOfFlankingReads":"(1, 1), (2, 5)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(2, 28)","Genotype":"2/2","GenotypeConfidenceInterval":"2-2/2-2","ReferenceRegion":"chr20:2652757-2652775","RepeatUnit":"CGCCTG","VariantId":"NOP56_CGCCTG","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"NOTCH2NL":{"AlleleCount":2,"Coverage":17.594594594594593,"FragmentLength":557,"LocusId":"NOTCH2NL","ReadLength":150,"Variants":{"NOTCH2NL":{"CountsOfFlankingReads":"(1, 2), (3, 2), (4, 1), (5, 1), (6, 1), (7, 1), (8, 2), (10, 2), (13, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(1, 1), (9, 5), (15, 
7)","Genotype":"9/15","GenotypeConfidenceInterval":"9-9/15-15","ReferenceRegion":"chr1:149390802-149390841","RepeatUnit":"GGC","VariantId":"NOTCH2NL","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"PABPN1":{"AlleleCount":2,"Coverage":17.83783783783784,"FragmentLength":535,"LocusId":"PABPN1","ReadLength":150,"Variants":{"PABPN1":{"CountsOfFlankingReads":"(2, 1), (3, 1), (4, 2), (5, 1), (6, 1), (15, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(5, 1), (6, 18)","Genotype":"6/6","GenotypeConfidenceInterval":"6-6/6-6","ReferenceRegion":"chr14:23321472-23321490","RepeatUnit":"GCG","VariantId":"PABPN1","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"PHOX2B":{"AlleleCount":2,"Coverage":15.486486486486486,"FragmentLength":552,"LocusId":"PHOX2B","ReadLength":150,"Variants":{"PHOX2B":{"CountsOfFlankingReads":"(1, 1), (2, 1), (3, 1), (4, 4), (6, 1), (7, 1), (8, 1), (10, 1), (11, 1), (13, 2), (14, 1), (19, 4), (20, 1)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(20, 6), (34, 1)","Genotype":"20/20","GenotypeConfidenceInterval":"20-20/20-23","ReferenceRegion":"chr4:41745972-41746032","RepeatUnit":"GCN","VariantId":"PHOX2B","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"PPP2R2B":{"AlleleCount":2,"Coverage":22.702702702702705,"FragmentLength":550,"LocusId":"PPP2R2B","ReadLength":150,"Variants":{"PPP2R2B":{"CountsOfFlankingReads":"(3, 3), (5, 2), (6, 1), (7, 1), (8, 1), (9, 5), (10, 1), (12, 1), (13, 2)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(10, 17), (16, 7)","Genotype":"10/16","GenotypeConfidenceInterval":"10-10/16-16","ReferenceRegion":"chr5:146878727-146878757","RepeatUnit":"GCT","VariantId":"PPP2R2B","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"RFC1":{"AlleleCount":2,"Coverage":25.135135135135137,"FragmentLength":548,"LocusId":"RFC1","ReadLength":150,"Variants":{"RFC1":{"CountsOfFlankingReads":"(2, 4), (3, 3), (4, 2), (5, 1), (6, 3), (7, 1), (8, 2), (9, 6), (10, 1), (12, 1), (14, 1), (15, 1), (16, 1), (17, 1), (18, 
1), (20, 1), (21, 1), (22, 2), (23, 2), (26, 1), (27, 1), (28, 2)","CountsOfInrepeatReads":"(30, 24), (31, 4)","CountsOfSpanningReads":"(10, 9)","Genotype":"10/88","GenotypeConfidenceInterval":"10-10/69-124","ReferenceRegion":"chr4:39348424-39348479","RepeatUnit":"AARRG","VariantId":"RFC1","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"TBP":{"AlleleCount":2,"Coverage":28.135135135135133,"FragmentLength":540,"LocusId":"TBP","ReadLength":150,"Variants":{"TBP":{"CountsOfFlankingReads":"(1, 3), (2, 1), (3, 2), (4, 2), (5, 1), (7, 1), (9, 1), (10, 1), (11, 1), (12, 1), (13, 1), (16, 2), (17, 1), (19, 2), (20, 3), (22, 2), (25, 1), (26, 4), (27, 1), (30, 1), (33, 1), (36, 2)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(4, 2), (5, 1), (6, 2), (16, 1), (32, 5), (36, 1)","Genotype":"32/36","GenotypeConfidenceInterval":"32-32/36-47","ReferenceRegion":"chr6:170561906-170562017","RepeatUnit":"GCA","VariantId":"TBP","VariantSubtype":"Repeat","VariantType":"Repeat"}}},"TCF4":{"AlleleCount":2,"Coverage":22.702702702702705,"FragmentLength":576,"LocusId":"TCF4","ReadLength":150,"Variants":{"TCF4":{"CountsOfFlankingReads":"(1, 2), (2, 3), (3, 2), (5, 1), (6, 1), (7, 3), (9, 1), (10, 1), (11, 3), (13, 1), (14, 2), (15, 1), (18, 1), (20, 2), (21, 3), (23, 3), (24, 1), (25, 2), (26, 1), (32, 2)","CountsOfInrepeatReads":"()","CountsOfSpanningReads":"(19, 1), (26, 6), (29, 7)","Genotype":"26/29","GenotypeConfidenceInterval":"26-26/29-29","ReferenceRegion":"chr18:55586155-55586227","RepeatUnit":"CAG","VariantId":"TCF4","VariantSubtype":"Repeat","VariantType":"Repeat"}}}},"SampleParameters":{"SampleId":"DA0000005225","Sex":"Female"}}

Thank you in advance

Strand bias (QC metric) in REViewer

Hi Egor,

I have a request for a new QC metric that would be analogous to the 'Fisher Strand' metric in GATK (https://gatk.broadinstitute.org/hc/en-us/articles/360040096152-FisherStrand). Basically, one would expect an even ratio of forward and reverse reads to be assigned to each of the two alleles at an STR site, but if there is a bias toward one of those directionalities on one of the alleles, it could indicate a low quality call.

Thanks again for all the great work you are doing on these tools!

Best,
Matt
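
A rough sketch of the requested metric, assuming per-allele forward and reverse read counts are already available (REViewer does not report them in this thread, so the four inputs are hypothetical). It computes a two-sided Fisher's exact test on the 2x2 allele-by-strand table and Phred-scales the p-value, which is how GATK describes its FisherStrand annotation.

```python
from math import comb, log10


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-7))
    return min(1.0, p)


def strand_bias_phred(allele1_fwd, allele1_rev, allele2_fwd, allele2_rev):
    """Phred-scaled p-value; larger values suggest stronger strand bias."""
    p = fisher_exact_two_sided(allele1_fwd, allele1_rev, allele2_fwd, allele2_rev)
    return -10 * log10(max(p, 1e-300))
```

A balanced table such as `strand_bias_phred(10, 10, 10, 10)` scores close to zero, while a table where each allele is seen on one strand only scores higher.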

Display degenerate base pairs

Hi,

It would be really useful to have an option to display degenerate base pairs in the images generated by REViewer.

For example, this would help to distinguish AAAAG / AAGGG repeats at the RFC1 locus when reviewing these images
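
To illustrate what "degenerate" means here, the sketch below expands a repeat unit written with IUPAC ambiguity codes into the concrete units it can stand for. It is independent of REViewer; the function names are made up for this example, and the RFC1 unit `AARRG` is taken from the ExpansionHunter output quoted elsewhere on this page.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_motif(motif):
    """List every concrete repeat unit that a degenerate motif can match."""
    choices = (IUPAC[base] for base in motif.upper())
    return ["".join(bases) for bases in product(*choices)]


def unit_matches(motif, unit):
    """True if a concrete unit (A/C/G/T only) is compatible with the motif."""
    if len(motif) != len(unit):
        return False
    return all(u in IUPAC[m] for m, u in zip(motif.upper(), unit.upper()))
```

`expand_motif("AARRG")` yields `AAAAG`, `AAAGG`, `AAGAG` and `AAGGG`, so both units mentioned above are covered by the same degenerate motif, which is why they cannot be told apart unless the actual bases are drawn.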

Does REViewer plot reads from off-target regions?

I ran EHv4 on the FXN locus, using 2 repeat specs which are identical except that the 2nd one includes GAA off-target regions:

    {
        "LocusId": "FXN-chr9-69037286-69037304-GAA",
        "LocusStructure": "(GAA)*",
        "RepeatUnit": "GAA",
        "ReferenceRegion": "chr9:69037286-69037304",
        "VariantType": "RareRepeat",
        "OfftargetRegions": []
    },
    {
        "LocusId": "FXN-chr9-69037286-69037304-GAA-with-off-targets",
        "LocusStructure": "(GAA)*",
        "RepeatUnit": "GAA",
        "ReferenceRegion": "chr9:69037286-69037304",
        "VariantType": "RareRepeat",
        "OfftargetRegions": [
            "chr2:220546033-220546610",
            "chr5:127247161-127247640",
            "chrX:51621350-51621856",
            "chr1:101657701-101658187",
            "chr13:102161416-102161881",
            "chr7:37848005-37848522",
            "chrY:25645531-25646013",
            "chr7:84690949-84691442",
            "chrUn_KN707747v1_decoy:1062-2074",
            "chr6:50708070-50708556",
            "chrY:24024122-24024600"
        ]
    },

These are the EHv4 results for the (relatively rare) WGS sample where the genotypes from the 2 specs differed significantly. Without off-targets:

chr9	69037286	.	A	<STR9>,<STR110>	.	PASS	END=69037304;REF=6;RL=18;RU=GAA;VARID=FXN-chr9-69037286-69037304-GAA;REPID=FXN-chr9-69037286-69037304-GAA	GT:SO:REPCN:REPCI:ADSP:ADFL:ADIR:LC	1/2:SPANNING/INREPEAT:9/110:9-10/63-153:2/0:5/13:0/12:42.016851

and with off-targets:

chr9	69037286	.	A	<STR33>,<STR726>	.	PASS	END=69037304;REF=6;RL=18;RU=GAA;VARID=FXN-chr9-69037286-69037304-GAA-with-0.01-threshold-off-targets;REPID=FXN-chr9-69037286-69037304-GAA-with-0.01-threshold-off-targets	GT:SO:REPCN:REPCI:ADSP:ADFL:ADIR:LC	1/2:FLANKING/INREPEAT:33/726:33-111/658-1313:0/0:13/13:4/108:42.016851

I then ran REViewer for both outputs, and got these plots

no-off-targets:
CDS-nC6iXU_FXN-chr9-69037286-69037304-GAA_ExpansionHunter4

with-off-targets:
CDS-nC6iXU_FXN-chr9-69037286-69037304-GAA-with-0 01-threshold-off-targets_ExpansionHunter4

finally, this is the plot from when I used the standard FXN repeat spec included in the EHv4 repo:
CDS-nC6iXU_FXN-chr9-69037286-69037304-GAA-official_ExpansionHunter4

I'm wondering how to interpret the "with-off-targets" plot. In the REViewer docs, I saw:
..the current version of REViewer visualizes repeats whose span does not exceed the fragment length (longer repeats are capped at the fragment length).
Does REViewer not plot the off-target FRRs?

Thanks
-Ben
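
For comparing the two ExpansionHunter records above field by field, a small helper that pairs the FORMAT keys with the sample values can help. This is only a sketch and assumes a single-sample, tab-separated VCF line; the record is the no-off-targets line from this post. Reading ADSP, ADFL and ADIR as the spanning, flanking and in-repeat read counts per allele is my understanding of the ExpansionHunter VCF header and should be checked against the header of your own file.

```python
def parse_sample_fields(vcf_line):
    """Map FORMAT keys to the values of the first sample column."""
    cols = vcf_line.rstrip("\n").split("\t")
    keys = cols[8].split(":")    # FORMAT column
    values = cols[9].split(":")  # first (and here only) sample column
    return dict(zip(keys, values))


# The no-off-targets record quoted in this post, rebuilt with explicit tabs.
record = "\t".join([
    "chr9", "69037286", ".", "A", "<STR9>,<STR110>", ".", "PASS",
    "END=69037304;REF=6;RL=18;RU=GAA;VARID=FXN-chr9-69037286-69037304-GAA;"
    "REPID=FXN-chr9-69037286-69037304-GAA",
    "GT:SO:REPCN:REPCI:ADSP:ADFL:ADIR:LC",
    "1/2:SPANNING/INREPEAT:9/110:9-10/63-153:2/0:5/13:0/12:42.016851",
])
fields = parse_sample_fields(record)
```

Here `fields["REPCN"]` is `9/110` and `fields["ADIR"]` is `0/12`; the same lookup on the with-off-targets record gives `33/726` and `4/108`.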

unable to find the compatible libcurl4

Encountering an issue while installing.

[ 82%] Performing configure step for 'reviewer'
CMake Error at /usr/local/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find CURL (missing: CURL_LIBRARY CURL_INCLUDE_DIR)
Call Stack (most recent call first):
/usr/local/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
/usr/local/share/cmake-3.20/Modules/FindCURL.cmake:181 (find_package_handle_standard_args)
CMakeLists.txt:20 (find_package)

-- Configuring incomplete, errors occurred!
See also "/home/sngene1/REViewer/build/reviewer-prefix/src/reviewer-build/CMakeFiles/CMakeOutput.log".
make[2]: *** [CMakeFiles/reviewer.dir/build.make:91: reviewer-prefix/src/reviewer-stamp/reviewer-configure] Error 1
make[1]: *** [CMakeFiles/Makefile2:94: CMakeFiles/reviewer.dir/all] Error 2
make: *** [Makefile:91: all] Error 2

[error] Failed to extract reads from the specified region

Hello, I keep getting this error. I've tried three different loci and the error remains the same.

"LocusResults": {
    "AFF2": {
      "AlleleCount": 2,
      "Coverage": 31.54054054054054,
      "FragmentLength": 352,
      "LocusId": "AFF2",
      "ReadLength": 150,
      "Variants": {
        "AFF2": {
          "CountsOfFlankingReads": "()",
          "CountsOfInrepeatReads": "()",
          "CountsOfSpanningReads": "()",
          "Genotype": "0/0",
          "GenotypeConfidenceInterval": "0-714/0-714",
          "ReferenceRegion": "X:147582151-147582211",
          "RepeatUnit": "GCC",
          "VariantId": "AFF2",
          "VariantSubtype": "Repeat",
          "VariantType": "Repeat"
        }
      }
    },
"AR": {
      "AlleleCount": 2,
      "Coverage": 41.91891891891891,
      "FragmentLength": 350,
      "LocusId": "AR",
      "ReadLength": 150,
      "Variants": {
        "AR": {
          "CountsOfFlankingReads": "(1, 2), (2, 2), (3, 3), (4, 1), (5, 1), (6, 2), (8, 6), (9, 1), (10, 3), (15, 3), (17, 1), (18, 3), (19, 5), (20, 2), (23, 1), (24, 1), (26, 1), (28, 1)",
          "CountsOfInrepeatReads": "()",
          "CountsOfSpanningReads": "(27, 16), (28, 9)",
          "Genotype": "27/28",
          "GenotypeConfidenceInterval": "27-27/28-28",
          "ReferenceRegion": "X:66765158-66765227",
          "RepeatUnit": "GCA",
          "VariantId": "AR",
          "VariantSubtype": "Repeat",
          "VariantType": "Repeat"
        }
      }
    },

I thought maybe the problem was that there are no InrepeatReads, but the problem remains when using only an entry where all three read categories have members, e.g.:

"LocusResults": {
    "RFC1": {
      "AlleleCount": 2,
      "Coverage": 41.513513513513516,
      "FragmentLength": 351,
      "LocusId": "RFC1",
      "ReadLength": 150,
      "Variants": {
        "RFC1": {
          "CountsOfFlankingReads": "(1, 2), (2, 2), (3, 3), (4, 2), (5, 2), (7, 2), (8, 5), (10, 2), (11, 1), (15, 1), (16, 1), (17, 2), (20, 1), (21, 1), (22, 1), (25, 1), (29, 4)",
          "CountsOfInrepeatReads": "(30, 15), (31, 1), (32, 1)",
          "CountsOfSpanningReads": "(10, 11), (15, 1)",
          "Genotype": "10/65",
          "GenotypeConfidenceInterval": "10-10/53-102",
          "ReferenceRegion": "4:39350044-39350099",
          "RepeatUnit": "AARRG",
          "VariantId": "RFC1",
          "VariantSubtype": "Repeat",
          "VariantType": "Repeat"
        }
      }
    }
  }

[2022-03-25 14:42:00.258] [info] Loading specification of locus RFC1
[2022-03-25 14:42:00.260] [error] Failed to extract reads from the specified region

What to do? :-(

REViewer-v0.2.7-linux_x86_64\
	--reads "$out"/exphout_"${sample}"_realigned.sorted.bam\
	--reference "$genome"\
	--catalog "$catalog"\
	--vcf exphout_"${sample}".vcf\
	--locus "RFC1"\
	--output-prefix "$out"/reviewerout_"${sample}"

Grateful for help with this! Am excited to see the pretty diagrams...

Build fails due to fmt

CMake Error at CMakeLists.txt:23 (find_package):
By not providing "Findfmt.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "fmt", but
CMake did not find one.

Could not find a package configuration file provided by "fmt" with any of
the following names:

fmtConfig.cmake
fmt-config.cmake

Add the installation prefix of "fmt" to CMAKE_PREFIX_PATH or set "fmt_DIR"
to a directory containing one of the above files. If "fmt" provides a
separate development package or SDK, be sure it has been installed.

[error] Failed to extract reads from the specified region

Hello!

I have a problem executing the main command. As far as I understand, the --locus option requires a LocusId string from ExpansionHunter's output file. For instance, here is a fragment of my result file (.json):
.. }, "ATXN10": { "AlleleCount": 2, "Coverage": 0.32432432432432434, "FragmentLength": 215, "LocusId": "ATXN10", "ReadLength": 150, "Variants": { "ATXN10": { "CountsOfFlankingReads": "()", "CountsOfInrepeatReads": "()", "CountsOfSpanningReads": "()", "ReferenceRegion": "chr22:46191234-46191304", "RepeatUnit": "ATTCT", "VariantId": "ATXN10", "VariantSubtype": "Repeat", "VariantType": "Repeat" } } ..
I tried passing ATXN10 as the argument to the --locus option, but without success.

*By the way, I looked at Issue #2, in particular at the command from bw2. In that case, the EH output file had the contig name and coordinates in the LocusId:
"LocusId": "FXN-chr9-69037286-69037304-GAA". That makes me wonder whether my EH output file is wrong.

Thank you so much!
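
One quick sanity check for this kind of problem is to list the locus IDs defined in the variant catalog and confirm that the value given to `--locus` is among them. The sketch assumes the catalog is a JSON array of objects with a `LocusId` key, as in the catalog snippets quoted on this page; that `--locus` is matched against these IDs is an assumption based on the "Loading specification of locus ..." log line, and the function name is made up.

```python
import json


def catalog_locus_ids(catalog_text):
    """Return the LocusId of every entry in a variant catalog (JSON array)."""
    return [entry["LocusId"] for entry in json.loads(catalog_text)]


# Example with a one-entry catalog in the same shape as those quoted on this page.
example_catalog = """
[
    {
        "LocusId": "CSTB",
        "LocusStructure": "(CGCGGGGCGGGG)*",
        "ReferenceRegion": "chr21:45196324-45196360",
        "VariantType": "Repeat"
    }
]
"""
ids = catalog_locus_ids(example_catalog)
```

With a real catalog, `"ATXN10" in catalog_locus_ids(open(path).read())` shows whether the ID exists as written. A LocusId does not need to contain coordinates; it is whatever string the catalog author chose.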

Unable to create path

Hi

When I ran the command, the error message below was shown.

[image: error message]

What does this mean? Could you please give me any suggestions?

Thanks

stoi: no conversion error

I got this error on one out of ~300 samples. When I ran

REViewer --reads CDS-ulp4x8.expansion_hunter4_realigned.sorted.bam --vcf CDS-ulp4x8.expansion_hunter4.vcf --reference hg38.fa --catalog ./FXN_variant_catalog_with_0.01_threshold_off_targets.json --locus  FXN-chr9-69037286-69037304-GAA --output-prefix CDS-ulp4x8_FXN-chr9-69037286-69037304-GAA_ExpansionHunter4

the output was

[2021-01-10 18:14:17.325] [info] Loading specification of locus FXN-chr9-69037286-69037304-GAA
[2021-01-10 18:14:17.389] [info] Extracted 79 frags
[2021-01-10 18:14:17.389] [info] Calculating fragment length
[2021-01-10 18:14:17.389] [info] Fragment length is estimated to be 171
[2021-01-10 18:14:17.389] [info] Extracting genotype paths
[2021-01-10 18:14:17.393] [error] stoi: no conversion

for these input files:
files.zip

Invalid contig name for alternate contigs

Hi,

Our variant catalog contains OfftargetRegions on alternate contigs, e.g. "chr4_KI270790v1_alt:200469-201085"

ExpansionHunter works without issues and I get a valid VCF file.

When starting REViewer with the exact same assembly reference and catalog, I get the error:

[error] Invalid contig name chr4_KI270790v1_alt

Removing all alt. contigs from the catalog results in the error:
[error] Failed to extract reads from the specified region

This is the catalog file: variant_catalog_gnomAD_with_offtargets.GRCh38_202306.json

Adding secondary info for checking read and mapping quality

Looking at REViewer plots, I started wondering whether particular reads that contradict or support a given genotype are low quality. It might be useful if REViewer could (optionally?) draw extra info like:

  • plotting some number for each read that describes mapping quality
  • showing read pairs rather than reads
  • whether a mate or a read pair was recovered from a distant locus
  • read strand
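
Until something like this is drawn in the plots, part of the information in the list above can be read from the BAM records themselves. The sketch below decodes the SAM FLAG bits relevant to pairing and strand; the bit values come from the SAM specification, and the function name is made up. Flag values such as 77, 125 and 189 appear in a samtools listing further down this page. Note that those records also carry the "unmapped" bit, and for unmapped reads the strand bits are not guaranteed to be meaningful.

```python
def describe_flag(flag):
    """Decode the SAM FLAG bits that concern pairing, mapping and strand."""
    return {
        "paired": bool(flag & 0x1),
        "unmapped": bool(flag & 0x4),
        "mate_unmapped": bool(flag & 0x8),
        "reverse_strand": bool(flag & 0x10),
        "mate_reverse_strand": bool(flag & 0x20),
        "first_in_pair": bool(flag & 0x40),
        "second_in_pair": bool(flag & 0x80),
    }
```

For instance, `describe_flag(77)` reports a first-in-pair read with the reverse-strand bit unset, while `describe_flag(189)` reports a second-in-pair read with it set.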

BAM alignments are required to have "XG" auxiliary tag

Hi all,
When I run REViewer, the following error appears:
image

My BAM file was generated by minimap2, and when I added XG:i:0 to the BAM file, REViewer reported "Unexpected auxiliary tag XG:C".

So what does the "XG" tag mean? How do I get it?

Looking forward to your reply. Thank you.
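
For context, the README states that REViewer needs the BAMlet of graph-realigned reads produced by ExpansionHunter, and the records from such a BAMlet quoted elsewhere on this page carry a string-typed tag such as `XG:Z:CSTB,0,1[12M]2[1M1X61M]`. The sketch below splits a tag of that shape into its parts. Reading the three parts as locus ID, start offset, and a list of graph nodes with per-node CIGAR strings is my interpretation of those examples, not a documented API, and the function name is made up.

```python
import re


def parse_xg(tag):
    """Split an 'XG:Z:<locus>,<offset>,<node>[<cigar>]...' tag into its parts."""
    prefix = "XG:Z:"
    if not tag.startswith(prefix):
        raise ValueError("expected a string-typed tag of the form XG:Z:...")
    # Split from the right so that a locus ID containing commas stays intact.
    locus_id, offset, graph_cigar = tag[len(prefix):].rsplit(",", 2)
    nodes = [
        (int(node), cigar)
        for node, cigar in re.findall(r"(\d+)\[([^\]]*)\]", graph_cigar)
    ]
    return locus_id, int(offset), nodes
```

`parse_xg("XG:Z:CSTB,0,1[12M]2[1M1X61M]")` returns `("CSTB", 0, [(1, "12M"), (2, "1M1X61M")])`. An integer-typed tag like `XG:i:0` does not have this shape, which is consistent with the "Unexpected auxiliary tag" message above.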

Problem related to the pre-installed software and packages

Good day,

Thank you so much for the tools!
I've built a pipeline consisting of EH -> .. -> REViewer and ran into the problem that both EH and REViewer install the same software and packages (for instance, Boost). Moreover, I had already installed them myself. It would be great to be able to skip the installation step when it has already been done.

[error] Failed to extract reads from the specified region

Hello,

I get an error message when I try to run REViewer and I don't understand why. All files are present and seem to be complete. I have sorted and indexed my BAM. Thank you in advance for your help.

./REViewer-v0.2.7-linux_x86_64 --reads results/19A5405_realigned_sorted.bam --vcf results/19A5405.vcf --reference /srv/nfs/disnap/NGS/bioInfo/Bases_de_donnees/GRCh37_decoy/GRCh37Decoy.fasta --catalog CSTB.json --locus CSTB --output-prefix results/19A5405
[2023-09-21 15:43:32.147] [info] Loading specification of locus CSTB
[2023-09-21 15:43:32.159] [error] Failed to extract reads from the specified region
cat CSTB.json
[
    {
        "LocusId": "CSTB",
        "LocusStructure": "(CGCGGGGCGGGG)*",
        "ReferenceRegion": "chr21:45196324-45196360",
        "VariantType": "Repeat"
    }
]
samtools view results/19A5405_realigned_sorted.bam chr21:45196324-45196360 | head
NL500104:777:HCM3GAFX5:3:11502:24673:4992       77      chr21 45196325  0  *    *       0       0       CGCGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCTGGGTCTCCGCGCCCAGCGCTGGTGTCGGGAGGGAGCG     IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII!IIIIIIIIIIIIIIIIII!IIIIIIIIIIIIIIIIIIIIII   XG:Z:CSTB,0,1[12M]2[1M1X61M]
NL500104:777:HCM3GAFX5:3:11401:3971:20320       77      chr21 45196325  0  *    *       0       0       CGCGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCTGGGTCTCCGCGCCCGGCGCTGGTGTCGGGAGGGAGCG     IIIIIIIIIIII!IIIIIIII!II!IIIIIIII!III!IIIIIIIIIIIIII!IIIIIIIIIIIIIIIIII!III   XG:Z:CSTB,0,1[12M]2[1M1X38M1X22M]
NL500104:777:HCM3GAFX5:3:11407:21027:17686      125     chr21 45196325  0  *    *       0       0       GCGCGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCTGGGTCTCCGCGCCCAGCGCTGGTGTCGGGAGGGAGC     III!IIIIIIIIIIIII!IIIIIIIIIIIIIIIIIIIIIIII!IIIIIIII!IIIIIII!I!IIIIIIII!III!   XG:Z:CSTB,0,1[1S12M]2[1M1X60M]
NL500104:777:HCM3GAFX5:2:11211:22012:2769       125     chr21 45196325  0  *    *       0       0       CGCGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCTGGGTCTCCGCGCCCAGCGCTGGTGTCGGGAGGGAGCG     !I!IIII!IIIIIIII!IIII!IIIIIIIIIIIIIII!IIIIIIIIIIIIIIIIIII!IIIIIIIIIIIIIIIII   XG:Z:CSTB,0,1[12M]2[1M1X61M]
NL500104:777:HCM3GAFX5:2:21211:25239:12812      189     chr21 45196325  0  *    *       0       0       CGCGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCGGGGTCTCCGCGCCCAGCGCTGGTGTCGGGAGGGAGCG     !I!!II!!IIIIIIIIIIIIIIIIII!III!IIIII!!III!IIIIIIIIIIIII!IIIIII!IIIIIIIIIIII   XG:Z:CSTB,0,1[12M]2[1M1X23M1X37M]
NL500104:777:HCM3GAFX5:2:11202:2727:9211        189     chr21 45196325  0  *    *       0       0       GCGCGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCTGGGTCTCCGCGCCCAGCGCTGGTGTCGGGAGGGAGC     I!I!IIII!IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII!IIIIIIIIIIIIIIIIIIII!IIIIIIIIIII   XG:Z:CSTB,0,1[1S12M]2[1M1X60M]
NL500104:777:HCM3GAFX5:2:21308:17388:5398       125     chr21 45196325  0  *    *       0       0       GCGCGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCTGGGTCTCCGCGCCCAGCGCTGGTGTCGGGAGGGAGC     I!I!IIII!IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII!IIIIIII!IIIIIII!IIII!IIIIIIIIIII   XG:Z:CSTB,0,1[1S12M]2[1M1X60M]
NL500104:777:HCM3GAFX5:3:21407:19722:5663       125     chr21 45196325  0  *    *       0       0       CGGGGGGCGGGGCGGGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCTGGGTCTCCGCGCCCAGCGCTGGTGT     !I!III!!III!!I!IIII!IIIIIIII!IIIIIIIIIIIIIIIIIIII!III!IIIIIIIIIIIIIIIIIIIII   XG:Z:CSTB,0,1[2M1X9M]1[2M1X9M]2[1M1X49M]
NL500104:777:HCM3GAFX5:1:11102:16995:20161      189     chr21 45196325  0  *    *       0       0       CGCGGGGCGGGGCGCGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCTGGGTCTCCGCGCCCAGCGCTGGTGT     II!IIII!IIIII!!III!!IIIIIIIII!IIIIII!IIIIIIIIIIIIIIIIIIII!III!IIIII!I!IIIII   XG:Z:CSTB,0,1[12M]1[12M]2[1M1X49M]
NL500104:777:HCM3GAFX5:4:21512:6434:14465       125     chr21 45196325  0  *    *       0       0       GCGCGGGGCGGGGCGGGGGGCGGGGAGCCTGGCCACCACTCGCCGCAGGCTGGGTCTCCGCGCCCAGCGCTGGTG     I!IIIIII!IIII!I!IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII!I!IIIIIIIIIIIIIII!II!I   XG:Z:CSTB,0,1[1S12M]1[2M1X9M]2[1M1X48M]
grep -v "#" results/19A5405.vcf
chr21  45196324 .       G       <STR2>  .       PASS    END=45196360;REF=3;RL=36;RU=CGCGGGGCGGGG;VARID=CSTB;REPID=CSTB  GT:SO:REPCN:REPCI:ADSP:ADFL:ADIR:LC     1/1:SPANNING/SPANNING:2/2:2-2/2-2:112/112:184/184:0/0:55.288681
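
One thing worth ruling out with this error is a contig-naming mismatch between the catalog and the reference FASTA (for example `chr21` versus `21`). Nothing in this thread establishes that as the cause; the sketch below is only a way to check. It assumes a JSON-array catalog like the one shown above and a FASTA index (`.fai`, as written by `samtools faidx`) whose first tab-separated column holds the contig name. The function name is made up.

```python
import json


def contigs_missing_from_reference(catalog_text, fai_text):
    """Map each LocusId to the catalog contigs that are absent from the .fai index."""
    reference_contigs = {
        line.split("\t")[0] for line in fai_text.splitlines() if line.strip()
    }
    missing = {}
    for entry in json.loads(catalog_text):
        regions = entry["ReferenceRegion"]
        if isinstance(regions, str):  # some catalogs list several regions per locus
            regions = [regions]
        for region in regions:
            contig = region.rsplit(":", 1)[0]
            if contig not in reference_contigs:
                missing.setdefault(entry["LocusId"], []).append(contig)
    return missing
```

An empty result means every catalog contig is present in the reference index; a non-empty one names the loci whose regions cannot be found under that contig name.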

Compilation error

Hi,
I am facing an issue during compilation. Could you please help me with the error below?

make[5]: *** [thirdparty/graph-tools/CMakeFiles/graphtools.dir/src/graphalign/GappedAligner.cpp.o] Error 1
make[4]: *** [thirdparty/graph-tools/CMakeFiles/graphtools.dir/all] Error 2
make[3]: *** [all] Error 2
make[2]: *** [reviewer-prefix/src/reviewer-stamp/reviewer-build] Error 2
make[1]: *** [CMakeFiles/reviewer.dir/all] Error 2
make: *** [all] Error 2

Thanks and Regards,

ysayyed

Possibility to visualize more than 1 locus

Good day,

I would like to know whether it is possible to get results for several loci rather than just one. If not, would it be feasible to add an option such as --loci {LocusID1, LocusID2, .. LocusIDN} in the future?

Thanks a lot.
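
In the meantime, a wrapper that runs REViewer once per locus is one workaround. The sketch below only uses the options listed in the README usage section; the function names, the default binary name `REViewer`, and the per-locus output prefix are choices made for this example. The prefix is extended with the locus ID so that files from different runs cannot overwrite each other, regardless of how REViewer names its outputs.

```python
import subprocess


def build_reviewer_commands(loci, reads, vcf, reference, catalog,
                            output_prefix, binary="REViewer"):
    """Build one REViewer argument list per locus, dropping duplicate IDs."""
    commands = []
    for locus in dict.fromkeys(loci):  # de-duplicates while keeping the order
        commands.append([
            binary,
            "--reads", reads,
            "--vcf", vcf,
            "--reference", reference,
            "--catalog", catalog,
            "--locus", locus,
            "--output-prefix", f"{output_prefix}.{locus}",
        ])
    return commands


def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

A comma-separated list can be passed as `"AFF2,AR,ATN1".split(",")`, followed by `run_all(build_reviewer_commands(...))`.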

[E::bgzf_read_block] Invalid BGZF header at offset 590043

Hi,
I am new to REViewer and I encounter an error about the "BGZF header".
As output I get empty ".metrics.tsv" and ".phasing.tsv" files and no images.
Do you know a solution for this?

BAM_EXPANSIONHUNTER="patientID_realigned.bam"
samtools sort patientID_realigned.bam -o patientID_realigned.sorted.bam
samtools index patientID_realigned.sorted.bam patientID_realigned.bam.bai

$ REViewer \
  --reads $BAM_EXPANSIONHUNTER \
  --vcf $VCF_EXPANSIONHUNTER \
  --reference $REFERENCE_GENOME \
  --catalog $VARIANT_CATALOG \
  --locus AFF2,AR,ATN1,ATXN1,ATXN10,ATXN2,ATXN3,ATXN7,ATXN7,ATXN8OS,ATXN8OS,C9ORF72,CACNA1A,CBL,CNBP,CSTB,DIP2B,DMPK,FMR1,FXN,GIPC1,GLS,HTT,JPH3,NIPA1,NOP56,NOTCH2NL,PABPN1,PHOX2B,PPP2R2B,RFC1,TBP,TCF4 \
  --output-prefix $OUTPUT_PREFIX
[2022-10-28 13:21:13.966] [info] Loading specification of locus AFF2
[E::bgzf_read_block] Invalid BGZF header at offset 590043
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
[2022-10-28 13:21:13.968] [info] Extracted 0 frags
[2022-10-28 13:21:13.968] [info] Calculating fragment length
[2022-10-28 13:21:13.974] [error] There are no read alignments in the target region

I appreciate your help,
/Tine

Graph Interpretation

Hi,
I have a couple of questions about the interpretation of REViewer graphs.

1- How are base substitutions and deletions displayed in the graphs generated by REViewer?

2- What does this dark line signify?
image
image

3- Some graphs have three reads aligned (the ones in between are IRRs). How reliable are genotypes supported by such three reads (given that the flanks are 100% matching)?
image

4- Why are some reads of the same color but lighter in appearance?
image

Thank you,
Hasna

Repeat-size interval visualisation

Hello,

It would be fantastic to have a template with the repeat-size info (every 5 repeats or so).

--1----------------------5--------------------------10---------------------------15---------------------
CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-CGG-
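
The template above can be generated mechanically. The sketch below builds a ruler line with a label at repeat 1 and at every `step`-th repeat, aligned to the start of the corresponding unit in the line beneath it. It only illustrates the layout idea as plain text; the function name and parameters are made up, and it says nothing about how REViewer draws its images.

```python
def repeat_ruler(unit, count, step=5, sep="-"):
    """Return (ruler_line, units_line) with labels at repeat 1 and every step-th repeat."""
    width = len(unit) + len(sep)
    units_line = sep.join([unit] * count)
    ruler = ["-"] * len(units_line)
    for i in range(1, count + 1):
        if i == 1 or i % step == 0:
            label = str(i)
            start = (i - 1) * width  # column where the i-th unit begins
            ruler[start:start + len(label)] = label
    return "".join(ruler), units_line


ruler_line, units_line = repeat_ruler("CGG", 18)
print(ruler_line)
print(units_line)
```

Each label starts in the same column as the unit it counts, so repeat 5 sits directly above the fifth `CGG`.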
