gear-genomics / tracy

Basecalling, alignment, assembly and deconvolution of Sanger Chromatogram trace files

Home Page: https://www.gear-genomics.com/

License: BSD 3-Clause "New" or "Revised" License

Makefile 0.24% C++ 99.65% Dockerfile 0.12%

chromatogram pcr tracy sanger-trace-alignment sanger-sequencing sequencing variant-calling alignment genetic-engineering pcr-products

tracy's Issues

Couldn't anchor the Sanger trace in the selected reference genome

Hi,
I used "decompose" to call variants from a Sanger ab1 file (forward), but I got the following message:

Find Reference Match
Load FM-Index
Couldn't anchor the Sanger trace in the selected reference genome.

I have checked the file and it has no problem.

> The sequence is:
AGATGCCTTGCTGCCCGTGNCTTTTGCGTGCAAGAGAACTGAGAGCNCCCAANGGGATGGAGCTGTGTAAGAAGTACCAGCAGCAGACCGTGGTGGCCATTGACCTGGCTGGAGATGAGACCATCCCAGGAAGCAGCCTCTTGCCTGGACATGTCCAGGCCTACCAGGTGGGTCCTGTGAGAAGGAATGGAGAGGCTGGCCCTGGGTGAGCTTGTCTCCCACCCATAGTTGGTGGAGAACAGTGACGATCGCA

**Note: the reverse file has no problem and runs correctly.**

**Note 2: I have seen a closed issue where someone had the same problem. Unfortunately, I don't know the location of the sequence in the genome, so I can't use a custom FASTA file instead of the reference genome.**

My command is:
tracy decompose -o forward -a homo_sapiens -q 100 -u 100 -r GRch37/Homo_sapiens.GRCh37.dna.primary_assembly.fa.gz AD-F.ab1

tracy: command not found

I was able to install tracy on Linux, and I ran ./tracy_v0.5.7_linux_x86_64bit -h to open the software. The command prints "This is free software... etc.". I want to run an assembly, so I typed tracy assemble file1.ab1 file2.ab1, but I get an error that says: "tracy: command not found." What am I doing wrong?
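
The shell looks a command up by its exact file name in the directories listed in PATH, so a file called tracy_v0.5.7_linux_x86_64bit in the current directory is not reachable as tracy. A minimal sketch of one way around this is to link the download under the short name in a directory on PATH. The helper name link_as_tracy and the $HOME/bin directory are illustrative assumptions, not part of tracy.

```shell
# Hypothetical helper: expose a downloaded binary under the short name "tracy".
# $1 = path to the downloaded binary, $2 = directory that is (or will be) on PATH.
link_as_tracy() (
    bin=$1
    dir=$2
    mkdir -p "$dir" || exit 1
    chmod +x "$bin" || exit 1
    # Resolve to an absolute path so the link works from any directory.
    abs=$(cd "$(dirname "$bin")" && pwd)/$(basename "$bin")
    ln -sf "$abs" "$dir/tracy"
)

# Usage (paths are assumptions; adjust to where the file was downloaded):
#   link_as_tracy ./tracy_v0.5.7_linux_x86_64bit "$HOME/bin"
#   export PATH="$HOME/bin:$PATH"
#   tracy assemble file1.ab1 file2.ab1
```

Calling the binary by its full name, ./tracy_v0.5.7_linux_x86_64bit assemble file1.ab1 file2.ab1, should work without any linking.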

How can I provide a file with a list of input files to tracy?

Hi,

This might be a very basic question, but I can't see how to do it in the manual help files. I would like to provide tracy with over 200 .abi input files at one go. Is it possible to provide a file containing a list of all input files rather than having them listed separately within the input command?

Thanks,
Jenni
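
Whether tracy itself accepts a list file is not answered in this thread. Independently of that, the shell can turn a list file into command-line arguments. The sketch below assumes one path per line; the helper name with_file_list and the file name traces.txt are hypothetical.

```shell
# Hypothetical helper: append every non-empty line of a list file as a
# separate argument to the given command. Paths may contain spaces.
# $1 = list file, remaining arguments = command to run.
with_file_list() (
    list=$1
    shift
    # The extra test keeps a final line that lacks a trailing newline.
    while IFS= read -r path || [ -n "$path" ]; do
        [ -n "$path" ] && set -- "$@" "$path"
    done < "$list"
    "$@"
)

# Usage (assumes tracy is on PATH and traces.txt lists the .ab1 files):
#   with_file_list traces.txt tracy assemble
```

A few hundred file names is normally far below the operating system's limit on argument length, so no batching should be needed for a list of this size.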

Basecall on ab1 files with empty corrected signals (DATA9-12)

Hello,
I have a bunch of AB1 files without corrected intensity levels, so they only contain data in fields DATA1-4.
When I try tracy on these files I am getting a segfault error.
Do you know if it is possible to perform basecalling on them? I suspect I will have to generate them again.

Thanks!

sequence and qual not of the same length

Hi,
Thank you for a great tool.
I am using it daily on 50-300 Sanger files, and today one of the ab1 files produced a sequence and quality scores of different lengths in tracy basecall. The ab1 file did not contain useful data, but I thought it might be relevant to report that tracy basecall can produce this issue.
I can send you the ab1 file if you are interested?
I am using Tracy version: v0.5.3 (conda install)
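
A batch of basecalls can be screened for this condition before any downstream step. The sketch below assumes the basecalls were written as plain four-line FASTQ records (the post does not say which output format was used); the function name fastq_length_mismatches is hypothetical.

```shell
# Hypothetical check: print the name of every record whose sequence line and
# quality line differ in length. Assumes plain four-line FASTQ records.
fastq_length_mismatches() {
    awk 'NR % 4 == 1 { name = substr($0, 2) }
         NR % 4 == 2 { seqlen = length($0) }
         NR % 4 == 0 && length($0) != seqlen {
             printf "%s\tseq=%d\tqual=%d\n", name, seqlen, length($0)
         }' "$1"
}

# Usage (file name is an assumption):
#   fastq_length_mismatches sample.fastq
```

An empty result means every record in the file has matching lengths.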

Base quality and consensus generation

First off - thanks for all the great work on tracy. It's quite amazing to me how few tools there are for performing trace file assembly - so thanks for filling this void with a very nice tool!

I have been using tracy quite a bit recently to assemble trace files and perform variant calling relative to a reference sequence. Generally, this works very well, but I have a question (related to a previous issue) on the interplay between base-call confidence (on the chromatogram) and consensus formation.

I'm seeing incorrect consensus calls being made for a particular base where one of the trace files contains a low-confidence call and the other a high-confidence call. From what I understand (based on your previous explanation), tracy does not use the base quality from the chromatogram, and I guess it just chooses one base over the other when there's a disagreement?

Here's what I'm seeing:

This shows 2 trace files in Geneious. When I assemble these using tracy assemble --format fastq --inccons trace1.ab1 trace2.ab1 the resulting consensus contains insertions at both positions highlighted in red. This is strange to me - the base quality in trace 2 is clearly higher than in trace 1. Or is it the case that with insertions in one trace file, there is no base to compare to in the second trace file, so the insertion is included in the consensus, irrespective of quality?

Is this expected behaviour?

Thanks for any help!

-p signal to noise ratio

It's not made explicit which way round the input signal to noise ratio is interpreted - I assume for example that the default 0.33 means noise 1:3 signal for a base to be called. Since it's not documented, could you clarify for me? Thanks very much.

How is a strand orientation in de novo assemble outputs determined?

Hello

To assemble forward and reverse ab1 files from Sanger sequencing, I ran the assemble command without a reference (i.e. de novo assembly) like this:

$ tracy assemble forward.ab1 reverse.ab1

I expected the aligned FASTA to show the sequences on the same strand and in the same direction as the forward trace, but in fact it came out on the reverse strand.
This does not seem to be random, because I always get the output on the reverse strand, and when I switched the order of the arguments (reverse.ab1 first and forward.ab1 second), I got the aligned FASTA on the forward strand.

So I guess that the strand orientation of the de novo assembly output is always based on the last ab1 file given on the command line.
Is this correct?
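
Whatever rule decides the orientation (the question is not answered here), a sequence that comes out on the unwanted strand can be flipped afterwards. The sketch below reverse-complements a plain sequence string; the function name revcomp is hypothetical, and only A, C, G and T are complemented, so gaps and N pass through unchanged and other IUPAC codes are not handled.

```shell
# Hypothetical helper: reverse-complement a DNA string given as $1.
# Complements A/C/G/T in either case; any other character is kept as-is.
revcomp() {
    printf '%s\n' "$1" |
        tr 'ACGTacgt' 'TGCAtgca' |
        awk '{ out = ""
               for (i = length($0); i > 0; i--) out = out substr($0, i, 1)
               print out }'
}

revcomp AACG   # prints CGTT
```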

Failing to create reference due to lack of RAM

Hi!

I tried to create an fm9 index file for hg38, but the program quickly ate up all 16 GB of available RAM and the process was killed. Is this normal behavior?

The file was in fasta.gz format.

Thanks!

consensus sequence in out.vertical file

Hello

I have installed v0.6.1 to assemble forward and reverse Sanger sequences.
Because the source of the sequences is a cloning vector, I need to resolve mismatch sites between the forward and reverse overlaps (that is, no heterozygous sites are expected).
I think I should do this based on quality information, but now I'm wondering if the out.vertical file is what I'm looking for.
My question is whether the consensus sequence (the rightmost character) in the out.vertical file always chooses the base with the higher quality.

(example)
-T|T
-C|C
-A|A
-A|A
-A|A
GG|G
AA|A
AA|A
GG|G
TT|T
CC|C
TT|T
AC|A <- is this A chosen because its quality is higher than C?
CC|C
CC|C
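
To review only the positions that need a decision, rows where the traces disagree can be pulled out of a file laid out like the example above. This is a sketch that assumes each row holds one character per trace, then "|", then the consensus character; it says nothing about how tracy picks the consensus base. The function name vertical_disagreements is hypothetical.

```shell
# Hypothetical helper: list rows of a vertical alignment (trace columns, "|",
# consensus) where the traces that cover the position disagree.
# Gap characters ("-") are ignored, so a row like "-T|T" is not reported.
vertical_disagreements() {
    awk -F'|' 'NF == 2 {
        cols = $1
        gsub(/-/, "", cols)                    # keep called bases only
        differ = 0
        for (i = 2; i <= length(cols); i++)
            if (substr(cols, i, 1) != substr(cols, 1, 1)) differ = 1
        if (differ)
            printf "row %d: traces=%s consensus=%s\n", NR, $1, $2
    }' "$1"
}

# Usage (file name taken from the post):
#   vertical_disagreements out.vertical
```

Lines without exactly one "|" are skipped, so any header or summary lines in the real file would be ignored.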

Indigo

I'm using the FASTA sequence (single sequence, as you can see in the image below) as a reference to align my *ab1 file using the Indigo module (https://www.gear-genomics.com/indigo/). But I don't know why the program is returning the message: "Fasta file has incorrect file type!". Could you help me solve this problem?

My fasta file looks like this:

Peak percentage cut off

Thank you for the awesome tool! I'm using it to deconvolute SARS-CoV-2 Sanger data and I've found that sometimes I get erroneous variant calls because of some underlying noise in one of the reads that is not present in the other read.

For example if you look at the sequence AAACTG there is some underlying noise in the forward read (bottom) but not the reverse read, or there is contrasting noise that should really cancel out, but sometimes is called as a variant.

Ideally I would like to use Tracy like the Indigo tool (but I want to be able to have forward and reverse reads) and I would like to be able to tune the peak percentage cut off.

And another question: is there a way to have automatic annotation of amino acids in the VCF file? I see that this is done in similar pipelines with bcftools and a GFF file.

Issues Installing/Compiling on Mac

Hi,

Tracy seems awesome. However, I'm having trouble installing/compiling on both an Intel Mac and an M1 Mac. Unfortunately, I don't have much experience with compilers and C++.

Attempt Installing from Source

Following the instructions in the documentation.

Some issues:

I tried installing the required system libraries via homebrew. I got a few "Caveats" about 'zlib' and 'bzip2':

==> bzip2
zlib is keg-only, which means it was not symlinked into /opt/homebrew,
because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.

For compilers to find zlib you may need to set:
  export LDFLAGS="-L/opt/homebrew/opt/zlib/lib"
  export CPPFLAGS="-I/opt/homebrew/opt/zlib/include"

For pkg-config to find zlib you may need to set:
  export PKG_CONFIG_PATH="/opt/homebrew/opt/zlib/lib/pkgconfig"
==> bzip2
bzip2 is keg-only, which means it was not symlinked into /opt/homebrew,
because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.

If you need to have bzip2 first in your PATH, run:
  echo 'export PATH="/opt/homebrew/opt/bzip2/bin:$PATH"' >> ~/.zshrc

For compilers to find bzip2 you may need to set:
  export LDFLAGS="-L/opt/homebrew/opt/bzip2/lib"
  export CPPFLAGS="-I/opt/homebrew/opt/bzip2/include"

Do you have any recommendations about adding these compiler flags?

When compiling (i.e. make all), I got the following error

~/code/tracy (main) » make all                 
if [ -r src/htslib/Makefile ]; then cd src/htslib && autoreconf -i && ./configure --disable-s3 --disable-gcs --disable-libcurl --disable-plugins && /Library/Developer/CommandLineTools/usr/bin/make && /Library/Developer/CommandLineTools/usr/bin/make lib-static && cd ../../ && touch .htslib; fi
/bin/sh: autoreconf: command not found
make: *** [.htslib] Error 127

As such, following Stack Exchange, I downloaded the automake package which then provides the autoreconf command. This should be included in the list of packages (in the documentation) to be installed via homebrew.

After this, I got further in the compilation process but still failed. Here was the command that failed

g++ -std=c++14 -isystem /Users/adityaprasad/code/tracy/src/jlib/ -isystem /Users/adityaprasad/code/tracy/src/htslib/ -isystem /Users/adityaprasad/code/tracy/src/sdslLite//include -pedantic -W -Wall -O3 -fno-tree-vectorize -DNDEBUG src/tracy.cpp -o src/tracy -L/Users/adityaprasad/code/tracy/src/htslib/ -L/Users/adityaprasad/code/tracy/src/htslib//lib -L/Users/adityaprasad/code/tracy/src/sdslLite//lib -lboost_iostreams -lboost_filesystem -lboost_system -lboost_program_options -lboost_date_time -ldl -lpthread -lhts -lz -llzma -lbz2 -Wl,-rpath,/Users/adityaprasad/code/tracy/src/htslib/
In file included from src/tracy.cpp:13:
src/index.h:9:10: fatal error: 'boost/dynamic_bitset.hpp' file not found
#include <boost/dynamic_bitset.hpp>
         ^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make: *** [src/tracy] Error 1

I'm not sure what to do here. I have my own installation of boost. Do I have to link it to that somehow?

Attempt Installing via Conda

For the M1 Mac, this just fails to find the tracy package.

~ » conda install -c bioconda tracy            
Collecting package metadata (current_repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - tracy

Current channels:

  - https://conda.anaconda.org/bioconda/osx-arm64
  - https://conda.anaconda.org/bioconda/noarch
  - https://conda.anaconda.org/conda-forge/osx-arm64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/osx-arm64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/osx-arm64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

For the Intel Mac, it starts out better but still fails. In a pre-existing environment, I get the following error

~ » conda install -c bioconda tracy                         
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: | 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed                                                                          

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

Package libcxx conflicts for:
python=3.9 -> libffi[version='>=3.3,<3.4.0a0'] -> libcxx[version='>=4.0.1']
python=3.9 -> libcxx[version='>=10.0.0|>=12.0.0|>=14.0.6']

Package xz conflicts for:
python=3.9 -> xz[version='>=5.2.10,<6.0a0|>=5.4.2,<6.0a0|>=5.2.8,<6.0a0|>=5.2.6,<6.0a0|>=5.2.5,<6.0a0']
tracy -> htslib[version='>=1.17,<1.18.0a0'] -> xz[version='>=5.2.4,<5.3.0a0|>=5.2.5,<5.3.0a0|>=5.2.6,<5.3.0a0|>=5.2.6,<6.0a0']

Package zlib conflicts for:
python=3.9 -> sqlite[version='>=3.41.2,<4.0a0'] -> zlib[version='>=1.2.13,<2.0a0']
python=3.9 -> zlib[version='>=1.2.11,<1.3.0a0|>=1.2.12,<1.3.0a0|>=1.2.13,<1.3.0a0']

In a fresh environment, I get the following error

~ » conda install -c bioconda tracy                         
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: / 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed                                                                          

UnsatisfiableError:

Meaning of Trimming Stringency levels

Hello,

Tracy looks like a fantastic CLI, thank you for creating and maintaining it.

Can you please provide more details on the stringency -t option in the assembly subcommand? I am guessing 9 is the highest and 1 is the lowest. It would be great to know what the difference is between stringency 4 and 5, and whether trimming applies only at the start/end or to each trace/base.

I notice the decompose subcommand includes additional trimming parameters (trimLeft and trimRight); I think these would be handy to include in assembly too.

Thank you,

Ammar

Can tracy assemble trim option be disabled?

Hi,

Trimming is enabled by default in tracy assemble, but can the trim option be disabled?

Thank you!

--incref (to break consensus ties) is not a feature yet

Hi Tobias,

I know you added the --incref feature to include the reference in the consensus computation. However, I upgraded to the most recent tracy and --incref is still not an accessible feature. Do I have to manually install this feature somehow?

Cannot execute Tracy binary on mac

I'm trying to get Tracy to run (as part of a pipeline we're using in our lab) on a macOS Catalina system. Unfortunately the binary can't execute on a Mac. I imagine it simply wasn't designed to run on a Mac, but do you have any suggestions on how to get around this?

Bioconda tracy package runs very slowly where precompiled binary is fast

Per #62, running the bioconda tracy package is really slow for the consensus command, at approximately 15 s per run compared to about 1 s for the precompiled binary. I am calling tracy in a bash pipeline, within a loop, having activated the conda env prior to the loop, in the following way:

    if
    tracy consensus \
    -o "$out_dir"/assembly/"$code"_cons \
    -q 0 -u 0 -r 0 -s 0 \
    -b "$code" \
    "$ffile" \
    "$file" \
    >> "$out_dir"/logs/basecall_log.txt 2>&1
    then

    ...

The only thing that changes to use the precompiled binary is tracy consensus becomes ./tracy/tracy consensus. It is unclear to me why this change would result in such a massive slowdown. I could avoid the bioconda package, but I cannot get the precompiled binary to run on mac. Thanks!

I am performing testing on Ubuntu 20.04 with Tracy 0.7.1.

terminate called while attempting to run tracy on WSL

I cannot get tracy consensus to run on certain .ab1 files on a colleague's WSL install. I can, however, get them to run on a native Linux install. The error is the following:

terminate called after throwing an instance of 'std::length_error'
what(): cannot create std::vector larger than max_size()

I've attached a pair of traces from the failing batch below
004_B05.zip

Annotation error in the results generated sanger sequencer

The sequencer shows annotation for insertion one bp ahead instead of showing the position before and after the insertion. What could be the possible issue or is that an acceptable annotation?

Sample data for testing tracy

I am busy working on a Galaxy wrapper for the tracy subcommands. Could you perhaps provide some sample data to test with?

"File lacks basecalls!" error

Hi,
I have been trying to export ab1 basecalling into tsv, but I get the error "File lacks basecalls!". Is there anything I am missing to run it properly?
Here is what I used:
tracy basecall -f tsv -o testout.tsv 10F_Kata_Arg4_Arg_F.ab1
I attach the sample file.
Thanks!
10H_Kata_Ser1_Ser_F.zip

Can't download latest tracy 0.5.7 via conda

I tried to install Tracy with conda as documented here: https://anaconda.org/bioconda/tracy.

However, this installs the old version (0.5.3) and not the most recent version, 0.5.7. Is this a glitch on conda? I was able to download the recent version as a statically linked binary.

Returning error code from bcftools

I created a malformed reference file that had a whitespace after ">". When trying to write the variant in the file, BCFTOOLS returned ID -1 error and the variant didn't appear in the final BCF. Still, tracy returned error code 0.

Is there any way that tracy could somehow signal this in the error code, as well as any other problem that might arise? No need for specifics, just a non-zero code. If this is missed, I think there is no way to differentiate the homozygous reference call from this error.
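
Until the tool signals this itself, a calling script can refuse to trust exit code 0 alone. The sketch below is a generic wrapper, not a tracy feature: it fails when the command fails or when the expected output file is missing or empty. The name run_and_require is hypothetical.

```shell
# Hypothetical wrapper: run a command and fail unless it exits 0 AND the
# expected output file exists and is non-empty.
# $1 = expected output file, remaining arguments = command to run.
run_and_require() (
    expected=$1
    shift
    "$@" || exit $?
    if [ ! -s "$expected" ]; then
        echo "error: '$expected' is missing or empty" >&2
        exit 1
    fi
)

# Usage (output name and options are assumptions based on the posts here):
#   run_and_require out.bcf tracy decompose -o out -r ref.fa sample.ab1
```

This only catches a missing or empty file. A BCF that has a header but lacks the variant would still pass, so a record count (for example with bcftools view -H piped to wc -l) would be needed to tell those cases apart.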

rs identifiers

Hello here,
I am working with tracy v0.5.3 on the command line to call variants. I first downloaded a selected section of the human genome as the reference. When calling the variants, the results don't have the rs identifiers, yet the web results do give the rs identifiers.

I thought I would resort to the full human genome reference (GRCh38), but this returned the error below:
"terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc"

Triploids?

Hi,

Just a different question too.

We might have some individuals which are aneuploid, based on previous microsatellite work with my organism's populations.

I realise that Tracy wants a max of two alleles for decomposition. What happens if there are three? Does it just give a high decomposition error score?

Many thanks,
Jenni

docker images differences between quay.io and dockerhub

Hi @tobiasrausch
Thanks for developing and maintaining tracy. I've just discovered your tool and started trying it out, especially within a workflow manager like nextflow.

I noticed some differences between the docker containers of tracy that are made available to the community and I was wondering which one you were maintaining and would advise:

dockerhub: https://hub.docker.com/r/geargenomics/tracy/tags
quay.io: https://quay.io/repository/biocontainers/tracy?tab=tags

The dockerhub version is the most up-to-date with respect to the repository (built on 2023-10-27), it is really small in size (~15Mb on disk) and blazing fast! However, there are no tags to track the version of tracy, and the working directory is /root, which you need to look up in the Dockerfile.

The quay.io version has tags, and a working directory at the / but it is way (2-5x) slower for no reason that I could point to. This version was built in May but with the most recent tag of the repository.

One of the main differences between the two versions is that I can only use one of them within nextflow. There is no bash command in the dockerhub version. It is an executable container, which is deprecated in workflow managers (see nextflow-io/nextflow#529).

It is unfortunate that the two versions of the containers do not agree, as their combined pros would be awesome (tags, speed, small size, runnable in nextflow)!

Building the container using the Dockerfile in the repository creates a container that is very similar to the one on dockerhub. However, with a small workaround I could make the docker image usable by nextflow by adding bash to the alpine image via:

RUN apk add --no-cache bash

Let me know how you would envision this moving forward with the continuous releases of docker images.

Best,

Old index data format. Please rebuild your reference genome index!

I got this error with the latest code when trying to decompose a Sanger sequencing result. What is the matter? Is this a samtools version related issue?

Options for specifying overlap, 'fracmatch' in consensus subcommand

Hi @tobiasrausch - thanks for the great work on this package to date. I've been experimenting with tracy consensus and tracy assemble and have found myself wanting the level of control afforded in the latter when defining the conditions under which a pair of reads would assemble in the consensus command. Specifically being able to specify a minimum overlap length would be useful. Is there a specific reason why these aren't considered common 'alignment options' across methods?

Thanks in advance for considering!

Tracy assemble time

This is not so much of an issue, but I just wanted to dig a bit into expected completion times for tracy assemble.

I'm running tracy assemble (de novo assembly) on 4 .ab1 files, each with around 1Kb of high quality calls. The sequenced construct is ~2Kb (so there's a lot of overlap between the 4 .ab1 files).

Here's the command I'm running:

tracy assemble primer_f_1.ab1 primer_f_2.ab1 primer_r_1.ab1 primer_r_2.ab1 --format fastq

This takes around 2 minutes to complete, which seems like a long time. Is this expected?

cannot anchor to reference genome

Dear Tracy Team,

I have a .ab1 trace for which I can extract a fasta with Tracy basecall.
However, I am unable to get an anchor on my reference genome (a proprietary plant genome).
my command line is:

 tracy decompose  -v -g ref.fasta.gz -o BM21_G2_4-G2_F4.bcf trace_files/BM21_G2_4-G2_F4.ab1

Here is the error message from tracy:

[2020-Nov-30 12:57:37] Load ab1 file
[2020-Nov-30 12:57:37] Load FM-Index
[2020-Nov-30 12:57:40] Find Reference Match
Couldn't anchor the Sanger trace in the selected reference genome.

A BLAST search against the database nevertheless returns a hit with 2 HSPs:

>ref
Length=230546352

 Score = 948 bits (513),  Expect = 0.0
 Identities = 594/635 (94%), Gaps = 5/635 (1%)
 Strand=Plus/Plus

Query  12         TAT-TTCTAAA-ATTACTTTCAATAATGCCATTTATATTTACTTTGAAGCATATGTTGNT  69
                  ||| ||||||| ||||| || | ||||||||||||||||||||||||||||||||||| |
Sbjct  226270899  TATCTTCTAAATATTAC-TTAATTAATGCCATTTATATTTACTTTGAAGCATATGTTGTT  226270957

Query  70         TGAACTCTTCAAAACTATTGAAATAGGTGCATGTCGGATTCTCTAGAATTAAATTATTTT  129
                  |||||||||||||| ||||| |||||||||||||||||||||||||||||||||||||||
Sbjct  226270958  TGAACTCTTCAAAATTATTGTAATAGGTGCATGTCGGATTCTCTAGAATTAAATTATTTT  226271017

Query  130        GTATAATTTGACACCAACGCCATAACAATTTTCNAGAGTTCAAACAACATAGTTTGAAAA  189
                  ||| |||||||||||||||| |||||||||||  ||||||||||||||||||||||||||
Sbjct  226271018  GTAGAATTTGACACCAACGCGATAACAATTTTGGAGAGTTCAAACAACATAGTTTGAAAA  226271077

Query  190        CAATCATAATTGAAAAATTGCTGAAAAATATGTTACTCAAACTTTTTAAAAATTCTATCC  249
                  |||||||||||||||||||| ||||||||||||||||| |||||||||||||||||||||
Sbjct  226271078  CAATCATAATTGAAAAATTGGTGAAAAATATGTTACTCGAACTTTTTAAAAATTCTATCC  226271137

Query  250        AGTGTTTGTTGGATTCTCCATAAGTTGTACATTTTTGGAGGATCTAACACCAGCACGACC  309
                  |||||||||||||||||||| ||||||||||||||||||||||||||||| || ||||||
Sbjct  226271138  AGTGTTTGTTGGATTCTCCAAAAGTTGTACATTTTTGGAGGATCTAACACGAGTACGACC  226271197

Query  310        ATATTCTCGGAACTATATAAACCAAGTGTGTGTTTCATAGTAATTTTTTCTTATCAGATC  369
                  ||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  226271198  ATATTCTCGGAACAATATAAACCAAGTGTGTGTTTCATAGTAATTTTTTCTTATCAGATC  226271257

Query  370        CTTCCAAAATACACTATCACTATTCTGATGGATTTTTCTTTTGACCAAATTTTATTGCC-  428
                   ||| |||||||||||||||||||||||||||||||||||||||| ||||||||||||
Sbjct  226271258  TTTCAAAAATACACTATCACTATTCTGATGGATTTTTCTTTTGACAAAATTTTATTGCAG  226271317

Query  429        AATTTGTTCAATGCCAGTATATACACTTCATCTAGTACAATAAGAATGAGCGCTTTCAGA  488
                  ||||||||||||||||||||| ||||||||||||||||||| |||||| || ||||||||
Sbjct  226271318  AATTTGTTCAATGCCAGTATAGACACTTCATCTAGTACAAT-AGAATGGGCACTTTCAGA  226271376

Query  489        AATGATAAAAAATCCCAAAATTCTCGAAACATCCCAACAAGAAATGGACCATATTGTTGG  548
                  ||||||||||||||||||||||||| ||| ||| ||||||||||||||||| ||| ||||
Sbjct  226271377  AATGATAAAAAATCCCAAAATTCTCAAAAAATCACAACAAGAAATGGACCAAATTATTGG  226271436

Query  549        AAAAAATAGACGTCTAATTGAATCTGATATTCCAAATCTACCTTAATTACGAGTAGTTTG  608
                  ||||||||||||| ||||||||||||||||||||||||||||||| ||||| | | ||||
Sbjct  226271437  AAAAAATAGACGTTTAATTGAATCTGATATTCCAAATCTACCTTATTTACGTGCAATTTG  226271496

Query  609        CAAAGAANCATTTCNAAAACACCCTTCTACCCCAT  643
                  ||||||| |||||| |||||||||||| || ||||
Sbjct  226271497  CAAAGAAGCATTTCGAAAACACCCTTCAACACCAT  226271531


 Score = 211 bits (114),  Expect = 4e-52
 Identities = 181/215 (84%), Gaps = 1/215 (0%)
 Strand=Plus/Plus

Query  429        AATTTGTTCAATGCCAGTATATACACTTCATCTAGTACAATAAGAATGAGCGCTTTCAGA  488
                  |||||||| | |||  ||| | |||||||||||||  |||| |||||| || ||| ||||
Sbjct  226259276  AATTTGTTTACTGCTGGTACAGACACTTCATCTAGCGCAAT-AGAATGGGCACTTGCAGA  226259334

Query  489        AATGATAAAAAATCCCAAAATTCTCGAAACATCCCAACAAGAAATGGACCATATTGTTGG  548
                  |||||| |||||| | | |||| |  ||| | | |||||||||| |||||| ||| ||||
Sbjct  226259335  AATGATGAAAAATTCAACAATTTTGAAAAAAGCACAACAAGAAACGGACCAAATTATTGG  226259394

Query  549        AAAAAATAGACGTCTAATTGAATCTGATATTCCAAATCTACCTTAATTACGAGTAGTTTG  608
                  ||||||||||||| ||||||||||||||||||||||||||||||| ||||| | | ||||
Sbjct  226259395  AAAAAATAGACGTTTAATTGAATCTGATATTCCAAATCTACCTTATTTACGTGCAATTTG  226259454

Query  609        CAAAGAANCATTTCNAAAACACCCTTCTACCCCAT  643
                  ||||||| |||||| |||||||||||| || ||||
Sbjct  226259455  CAAAGAAACATTTCGAAAACACCCTTCAACACCAT  226259489

How are the base quality scores generated?

Hi,

I am using tracy assemble to assemble between 2 - 4 trace files. I am outputting the consensus as a .fastq file, and then aligning this to a reference sequence.

Downstream, I am performing some analysis that filters on per-nucleotide quality scores, and I am not sure that I understand how these are translated from the base signal in the chromatogram to the base quality of the consensus calculated within tracy assemble. Typically, I only see 2 different base quality scores on a consensus (e.g. 19 and 24).

Do you have any insight into this?

I'm calling tracy like so:

tracy assemble \
            --format fastq \
            --inccons \
            --trim 3 \
            --outprefix ${colony_id} \
            colony_1_p1.ab1 colony_1_p2.ab1

[feature request] Further options to control trimming with consensus

Would it be possible to implement a trim option to take a set number of bases after an initial trim threshold? e.g. for a sequence:

ACTGATCTACTAGATCCC

-q 5 -'crop' 10

we get:

~~ACTGA~~ TCTACTAGAT ~~CCC~~

Does that make sense?
Thanks!
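
The requested behaviour can be written out on the example sequence. This is only an illustration of the proposal, not an existing tracy option; the function name trim_then_crop is hypothetical, and it reads the 5 in the example as the number of leading bases to drop.

```shell
# Hypothetical illustration of the requested behaviour, not a tracy option:
# skip the first $2 bases of sequence $1, then keep the next $3 bases.
trim_then_crop() {
    printf '%s\n' "$1" | cut -c"$(( $2 + 1 ))-$(( $2 + $3 ))"
}

trim_then_crop ACTGATCTACTAGATCCC 5 10   # prints TCTACTAGAT
```

The first five bases (ACTGA) and everything after the ten kept bases (CCC) are dropped, matching the struck-through parts of the example.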

Consensus not assembling properly

Hi,

I put in two .ab1 files and a reference file to generate a consensus. Tracy wasn't able to generate the proper nucleotide base as shown below (there is a "T" where there should be an "A" if this is based on majority rule.) I am wondering what the problem is, thank you.

Can Tracy train itself to take into account previously identified alleles?

Hi,

I'm looking to use Tracy to decompose alleles from several hundred sequence traces. It looks like a very useful tool - thanks for writing it.

My organism is quite heterozygous. I could expect that there would be more than one (many) variants within an allele.

Am I right in understanding that during decomposition it takes the first haplotype to be the closest match to the reference sequence, and assigns all other mutations present to the second haplotype?

Is it possible to instead ask it to:

Do this for all traces
Then re-assess - starting with paired haplotypes with low levels of mutations (perhaps even one which is identical to the reference and the other with a single SNP for example) to identify potentially a pair of haplotypes in another sample, where each has mutations relative to the reference - for example one might have one or two SNPs, commonly found as a haplotype in other samples, while the other haplotype has an indel and a third SNP...

Does the pearl command work this way? Or does it just align already decomposed sequences/primary basecalls from a non-decomposed sequence?

I realise either way it is still guessing the haplotype for the sequence.

Many thanks,
Jenni

Only single-chromosome FASTA files are supported

Hi,

I am using Tracy (version 0.7.1) to call SNVs from Sanger sequencing ab1 files.

The command is as follows:
./tracy decompose -o forward -a homo_sapiens -r /bioinfo/data/Genomes/NCBI/build37/Sequence/WholeGenomeFasta/human_g1k_v37.fasta GW21T135C07-2_K-E11-F_TSS20210511-021-02103_G01.ab1

However, "Only single-chromosome FASTA files are supported" was returned, and the BCF file was not generated as expected.

Any suggestions to fix the issue?

Thanks,
Junfeng
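
The message suggests that the FASTA passed to the command holds more than one record, which is what a whole-genome file would contain. A quick way to confirm is to count the header lines. This is a sketch; the function name fasta_record_count is hypothetical, and gzip-compressed input is detected only by the .gz extension.

```shell
# Hypothetical helper: count the records (">" header lines) in a FASTA file,
# reading through gzip when the file name ends in .gz.
fasta_record_count() {
    case $1 in
        *.gz) gzip -cd "$1" | grep -c '^>' ;;
        *)    grep -c '^>' "$1" ;;
    esac
}

# Usage (path shortened from the post):
#   fasta_record_count human_g1k_v37.fasta
```

A count above 1 would be consistent with the error. Other threads on this page mention two routes in that situation: a prebuilt index for the whole genome, or a FASTA containing only the target region.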

How to get the name out of a trace file?

I used the following command:

tracy basecall -f json -o output_file_path input_file_path

I expected to get the name (identifier) in the output json file.

It would be useful if I could get the name using the same command.

Extract new traces from a "tracy decompose" run?

Can tracy decompose write separate output ab1 files for the decomposed alleles?

tracy assemble options.

Hello.
I assembled Sanger sequences using tracy assemble.
However, I want to use this tool myself, and I don't understand all of the options.
I looked in the manual on the tracy GitHub page and in the paper, but could not find an explanation anywhere.
I would like to know about these options in detail.

What is the basis of the trim option? The trim option (-t) ranges from 1 to 9. Does this value refer to a peak value, a length at the start of the read, or some other value? Please explain this option.
I want to assemble sequences based on peak values. Does tracy assemble have any option to select or sort by peak value?

Thanks.
young yu.

Not human samples

Hello,
I have used tracy to call and annotate variants from human samples and it worked well. I now have Sanger-sequenced samples of P. falciparum.
I would like to know if there is a way I could use tracy to call and annotate variants in these samples.

Cannot perform de novo assembly

Looking at the tutorial and examples, you should be able to perform de novo assembly by using tracy assemble without a reference file. But when I attempt to do this with my ab1 files, it says that I need to specify a reference file. Example:

$ tracy assemble *ab1
[2021-Mar-26 13:22:34] tracy assemble 275F.ab1 275F-RC.ab1 3AccOut-RC.ab1 3InOut-RC.ab1 4-3LTR.ab1 4-eGFP-C.ab1 4-Frag-21-L.ab1 4-Frag-26-R-RC.ab1 5AccOut.ab1 5INOut.ab1 5LTR.ab1 eGFP-C-RC.ab1 eGFP-N-RC.ab1 Frag-05-L.ab1 Frag-09-L.ab1 Frag-10-L.ab1 Frag-14-R.ab1 Frag-14-R-RC.ab1 Frag-17-L.ab1 Frag-17-L-RC.ab1 Frag-18-L.ab1 Frag-21-L.ab1 Frag-22-R.ab1 Frag-22-R-RC.ab1 Frag-25-L.ab1 Frag-26-R-RC.ab1 Frag-27-L.ab1 Frag-33-R-RC.ab1 Frag-36-R.ab1 Frag-36-R-RC.ab1 Frag-37-L.ab1 Frag-39-L.ab1 
Please specify a reference file!

Any insight would be greatly appreciated. Thanks!

Can tracy's consensus option be updated?

I am using Tracy to generate a consensus and then analysing the mutations with Mummer. When I compared the result with Pearl (the online tool from gear-genomics), some mutations were missing. I traced the problem back to the base-calling. If Tracy's base-calling algorithm could be updated to match Pearl's, that would be great.

It is just my observation. I am not an expert. Excuse me if I am wrong.

Decompose doesn't give a bcf file, gives json instead

tracy decompose -v -o outprefix -g ref.fna Sample.ab1
I used this command for variant calling and got a json file but no bcf file.

consensus not obtained from majority of bases present

I assembled the contigs with a reference to obtain the consensus shown below. However, the consensus contains an "A" where, by majority rule, there should be a "G". Is there a way to make the consensus call a "G" based on the majority of bases present? Are there parameters I can set for this?

`Couldn't anchor the Sanger trace in the selected reference genome.` error in tracy but not in Indigo

Hi,

I'm experiencing a very similar issue to #34.

We've sequenced a chunk of the human MBL2 gene of ~250 nt. However, the machine sequences almost 1000 nucleotides; therefore, most of the sequence in the ab1 file is just rubbish.
For that reason I've decided to set -q 50 -u 750, so that the first 50 low-quality bases and the last 750 (false) bases are excluded.
I first tried the whole GATK GRCh38 genome and got the error `Couldn't anchor the Sanger trace in the selected reference genome.` both in Indigo (with left and right trim sizes set to 50 and 750, respectively) and with tracy on the command line, as follows:

tracy decompose -o forward -r Homo_sapiens_assembly38.fasta.gz -q 50 -u 750 MF-102_MBL2.ab1
[2023-May-04 12:10:37] tracy decompose -o forward -a homo_sapiens -r Homo_sapiens_assembly38.fasta.gz -q 50 -u 750 MF-102_2MBL2.ab1 
[2023-May-04 12:10:37] Load ab1 file
[2023-May-04 12:10:37] Find Reference Match
[2023-May-04 12:10:37] Load FM-Index
Couldn't anchor the Sanger trace in the selected reference genome.

As you pointed out here, that issue could be circumvented using a shorter sequence as a reference file.

Then I downloaded and indexed the fasta file for the MBL2 gene and repeated the process with the same parameters. Although it works well now with Indigo, tracy still fails with the same error message in the command line.

I'm using tracy v0.7.5 singularity container in CentOS 7.9.

Kernel too old

Hello there,
I was getting started with Tracy, and the following error appears when I try to run the program from either the conda installation or the precompiled binary:

(base) [larteag7@apolo banano-cultivables]$ conda create -n sangering bioconda::tracy bioconda::seqtk
(sangering) [larteag7@apolo banano-cultivables]$ tracy basecall -f fastq B100_907R.ab1
FATAL: kernel too old
Aborted
(sangering) [larteag7@apolo banano-cultivables]$ tracy --help
FATAL: kernel too old
Aborted

Is there any way to address it?
Thanks in advance,
Luis Alfonso.

Alignment results differ when using indexed and unindexed genome

When I run these commands:

bgzip reference1.fasta
tracy index -o reference.fasta.fm9 reference.fasta.gz
tracy align --reference reference1.fasta.gz input1.ab1

I get a different result to doing

tracy align --reference reference1.fasta input1.ab1

The input files (reference1.fasta and input1.ab1) can be found here.

Here is the start of the alignment for the indexed case:

>input1
--------TTTTTTTTTGAGCGGGTCGAACCGTCACGAAAAGAAAAGGGGAAGAACCATCAGCAGGAGTAATCCGTATTTTAATTGGATCCACAT-TCATAGCAAACACCAAAAATCCATATTGGGACCACAATCCCAACAAAGACCACTGGAC-
AGAAGCCAACAAGGTAGGAGTGGGAG-CATTCGG-GCCTGGGTTC--ACTCCCCCACACGGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAAGGCATGCTGACAACATTACCAGCAAATCCGCCTCCTGCCTCCACCAATCGACAGTCAGGAAGG
CAGCCTACCCCAATCACTCCACCTTTGAG-AGACACTCATCCTCAGGCCATGCAGTGGAATTCCACAACATTCCACCAAGCTCTGCAGGATCCCAGAGTAAATCCTGCTGGTGGCTCCAGTTCCGGAACAGTGAACCCTG-TTCCGACTACTGCC
TCACTCATCTCGTCAATCTTCTCGAGGATTGGGGACCCTGCACCGAACATGGAAAGCATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCGGGGTTTTTCTTGTTGACAAAAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTG
GACTTCTCTCAATTTTCTAGGGGGAGCT--CCCGTGTGTCTTGGCCAAAATTCTCAGT--CCCAAACCTCCAGTCACTCACCAACCTCTTGTCCTCCAATTTGTCCTGCCTATCGCTGGATGTGTCTGCGGCGGTTTATCATATTCCT-CTTCAT
CGTGCTGCT------ATGCCTCATCATCCTGTTGGGTTCGTCTGCACCATCAAAAGAATGTTGCCCCGGGTTGTGATTAAAAATTCCAAGGAGCAAGAAAGCCACCCACTACGGGAACCAGGGCCGGAAGCTGAACAAGTCATTTTTCAAAGGAA
AATGAAAGATTTTCTTTCTTATTTGTGGGGGAAAAGCAAAAAAAGGAAAAAGGAAATTGGGGTTACAAACCCCACCCCCAAGGGATTTGGG--AAATACCATATTTTAAAGGGGAAAGGGCCGCATAACCCATTAAAAATTGCATATTTTAAATT
TTTTTTTTTTGAGAAAGAGGGGGGAC-------------------

and the same for the alignment without indexing:

>input1
TTTTTTTTTGAGCGGGTCGAACCGTCACGAAAAGAAAAGGGGAAGAACCATCAGCAGGAGTAATCCGTATTTTAATTGGATCCACATTCATAGCAAACACCAAAAATCCATATTGGGACCACAATCCCAACAAAGACCACTGGACAGAAGCCAAC
AAGGTAGGAGTGGGAGCATTCGGGCCTGGGTTCACTCCCCCACACGGAGGCCTTTTGGGGTGGAGCCCTCAGGCTCAAGGCATGCTGACAACATTACCAGCAAATCCGCCTCCTGCCTCCACCAATCGACAGTCAGGAAGGCAGCCTACCCCAAT
CACTCCACCTTTGAGAGACACTCATCCTCAGGCCATGCAGTGGAATTCCACAACATTCCACCAAGCTCTGCAGGATCCCAGAGTAAA------------TCCTGCTGGTGGCTCCAGTTCCGGAACAGTGAACCCTGTTCCGACTACTGCCTCAC
TCATCTCGTCAATCTTCTCGAGGATTGGGGACCCTGCACCGAACATGGAAAGCATCACATCAGGATTCCTAGGACCCCTGCTCGTGTTACAGGCGGGGTTTTTCTTGTTGACAAAAATCCTCACAATACCGCAGAGTCTAGACTCGTGGTGGACT
TCTCTCAATTTTCTAGGGGGAGCTCCCGTGTGTCTTGGCCAAAATTCTCAGTCCCAAACCTCCAGTCACTCACCAACCTCTTGTCCTCCAATTTGTCCTGCCTATCGCTGGATGTGTCTGCGGCGGTTTATCATATTCCTCTTCATCGTGCTGCT
ATGCCTCATCATCCTGTTGGGTTCGTCTGCACCATCAAAAGAATGTTGCCCCGGGTTGTGATTAAAAATTCCAAGGAGCAAGAAAGCCACCCACTACGGGAACCAGGGCCGGAAGCTGAACAAGTCATTTTTCAAAGGAAAATGAAAGATTTTCT
TTCTTATTTGTGGGGGAAAAGCAAAAAAAGGAAAAAGGAAATTGGGGTTACAAACCCCACCCCCAAGGGATTTGGGAAATACCATATTTTAAAGGGGAAAGGGCCGCATAACCCATTAAAAATTGCATATTTTA-AATTTTTTTTTTTTGAGAAA
GAGGGGGGAC--------

(Tracy version 0.6.1)
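
A quick way to narrow this kind of discrepancy down is to check whether the two aligned rows contain the same bases and differ only in where the gaps were placed. The sketch below is a generic string comparison, not a tracy feature; ungap and same_bases are hypothetical helper names.

```python
def ungap(row, gap="-"):
    """Remove whitespace (including line breaks) and gap characters from an aligned row."""
    return "".join(row.split()).replace(gap, "").upper()


def same_bases(row_a, row_b, gap="-"):
    """Return True if two aligned rows spell the same ungapped sequence."""
    return ungap(row_a, gap) == ungap(row_b, gap)
```

If same_bases returns True for the two input1 rows, the basecalls are identical and only the alignment against the reference changed between the indexed and unindexed runs.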

Support for generating consensus sequence from forward and reverse sanger

Hi,

Thanks for maintaining Tracy!

There are remarkably few (reliable) programs available for generating consensus sequences from forward and reverse Sanger data! Luckily, Tracy exists; I'm generating consensus sequences of fungal ITS from Sanger forward and reverse seqs in the following way:

# basecall the forward seq for use as reference
tracy basecall -f fasta -o ref.fa forward.ab1

# assemble reverse using forward seq as reference
tracy assemble -t 4 -d 1 -r ref.fa -o con reverse.ab1

Then I'm extracting the gap-free consensus sequence from the output JSON. I am assembling with -d 1 on the understanding that it will ensure the consensus sequence only includes bases where the reads match.

This works so far, but it feels a bit hacky and there are some niceties which would be extremely helpful. Most importantly, in situations such as A - N it would be nice to be able to take A into the consensus sequence rather than dropping it when using -d 1. It would also be nice to be able to output the consensus sequence directly. What do you think?
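
To make the requested A - N behaviour concrete, here is a minimal sketch of such a merge rule applied to two already-aligned rows of equal length. It is not how tracy builds its consensus; merge_aligned is a hypothetical helper, and letting a base win over a gap as well as over N is an assumption that suits read overhangs but would hide a real indel disagreement.

```python
def merge_aligned(fwd, rev, gap="-"):
    """Merge two aligned reads column by column into a gap-free consensus.

    A called base wins over N or a gap, two different called bases
    become N, and columns where both rows are gaps are dropped.
    """
    if len(fwd) != len(rev):
        raise ValueError("aligned rows must have the same length")
    out = []
    for a, b in zip(fwd.upper(), rev.upper()):
        informative = {a, b} - {"N", gap}
        if len(informative) == 1:
            # Either both rows agree, or only one row has a called base.
            out.append(informative.pop())
        elif len(informative) == 2:
            # Genuine disagreement between two called bases.
            out.append("N")
        elif "N" in (a, b):
            # No called base, but at least one row has an uncalled position.
            out.append("N")
        # Otherwise both rows are gaps and the column is skipped.
    return "".join(out)
```

With this rule, a column holding A in one read and N in the other contributes A to the consensus instead of being dropped.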

SCF version greater 2.9 required!

I am trying to assemble some old (ca. 2010) Sanger traces and I'm getting the following error.

[2022-Feb-17 15:06:11] tracy assemble --inccons -o data0/tracy_assemble/6_512_1A04 ./data0/traces/6_512_1A04_ITS4_R0.ab1 ./data0/traces/6_512_1A04_ITS1F_R0.ab1 
[2022-Feb-17 15:06:11] Load ab1 files
SCF version greater 2.9 required!

Why is this the case, and is there any way we can work around this? In my opinion, it's a limitation to Tracy since so much .ab1 data is now old/outdated.
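
One thing worth checking, on the assumption that the message means the files were recognised as SCF rather than ABIF despite the .ab1 extension, is what the file header actually says. The sketch below reads the magic bytes and, for SCF, the four-character version string that the SCF header stores at byte offset 36; sniff_trace is a hypothetical helper, not part of tracy.

```python
def sniff_trace(path):
    """Return (format, version) for a trace file based on its header bytes.

    format is "ABIF", "SCF" or "unknown"; version is only set for SCF.
    """
    with open(path, "rb") as fh:
        head = fh.read(40)
    if head[:4] == b"ABIF":
        return ("ABIF", None)
    if head[:4] == b".scf" and len(head) >= 40:
        # The SCF header holds nine 4-byte fields, then the version as ASCII text.
        return ("SCF", head[36:40].decode("ascii", "replace"))
    return ("unknown", None)
```

If the old traces report an SCF version such as 2.00, converting them to a newer SCF version or to another supported format with a separate tool may be a way around the limitation.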

Thanks for maintaining Tracy!

What is the meaning of the `trim` parameter in the `assemble` command? Trim values 1 and 9 produced identical consensus sequences.

What are the units of the trimming stringency in the assemble command?
-t [ --trim ] arg (=4) trimming stringency [1:9]
What do the values 1-9 mean? Could you please provide a short description of the trimming algorithm, or point us to one?

I have done assembly using different trimming stringency and got the following results:

	trim	length	f_leadingGaps	f_mid_gaps	m_f_gap_indexes	f_trailingGaps	r_leadingGaps	rev_mid_gaps	m_r_gap_indexes	r_trailingGaps	total_gaps	sequence
0	1	348	39	2	[4, 9]	0	0	0	[]	42	41	"TTTGATCGTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAGAACTGGTGCTTGCACCGGTTCAAGGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTACCTCATAGCGGGGGATAACTATTGGAAACGATAGCTAATACCGCATAAGAGAGACTAACGCATGTTAGTAATTTAAAAGGGGCAATTGCTCCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGCCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGCGGC"
1	2	315	133	0	[]	0	0	0	[]	120	133	"CGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAGAACTGGTGCTTGCACCGGTTCAAGGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTACCTCATAGCGGGGGATAACTATTGGAAACGATAGCTAATACCGCATAAGAGAGACTAACGCATGTTAGTAATTTAAAAGGGGCAATTGCTCCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAG"
2	3	320	57	0	[]	0	0	0	[]	52	57	"GGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAGAACTGGTGCTTGCACCGGTTCAAGGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTACCTCATAGCGGGGGATAACTATTGGAAACGATAGCTAATACCGCATAAGAGAGACTAACGCATGTTAGTAATTTAAAAGGGGCAATTGCTCCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGAC"
3	4	327	39	0	[]	0	0	0	[]	46	39	"CTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAGAACTGGTGCTTGCACCGGTTCAAGGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTACCTCATAGCGGGGGATAACTATTGGAAACGATAGCTAATACCGCATAAGAGAGACTAACGCATGTTAGTAATTTAAAAGGGGCAATTGCTCCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCC"
4	5	336	40	0	[]	0	0	0	[]	49	40	"GTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAGAACTGGTGCTTGCACCGGTTCAAGGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTACCTCATAGCGGGGGATAACTATTGGAAACGATAGCTAATACCGCATAAGAGAGACTAACGCATGTTAGTAATTTAAAAGGGGCAATTGCTCCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGG"
5	6	340	40	1	[3]	0	0	0	[]	49	41	"TCGTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAGAACTGGTGCTTGCACCGGTTCAAGGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTACCTCATAGCGGGGGATAACTATTGGAAACGATAGCTAATACCGCATAAGAGAGACTAACGCATGTTAGTAATTTAAAAGGGGCAATTGCTCCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGC"
6	7	343	40	1	[5]	0	0	0	[]	47	41	"GATCGTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAGAACTGGTGCTTGCACCGGTTCAAGGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTACCTCATAGCGGGGGATAACTATTGGAAACGATAGCTAATACCGCATAAGAGAGACTAACGCATGTTAGTAATTTAAAAGGGGCAATTGCTCCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGCG"
7	8	345	40	1	[6]	0	0	0	[]	45	41	"TGATCGTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATAAATGCAAGTAGAACGCTGAGAACTGGTGCTTGCACCGGTTCAAGGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTACCTCATAGCGGGGGATAACTATTGGAAACGATAGCTAATACCGCATAAGAGAGACTAACGCATGTTAGTAATTTAAAAGGGGCAATTGCTCCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGCGG"
8	9	348	39	2	[4, 9]	0	0	0	[]	42	41	"TTTGATCGTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTAGAACGCTGAGAACTGGTGCTTGCACCGGTTCAAGGAGTTGCGAACGGGTGAGTAACGCGTAGGTAACCTACCTCATAGCGGGGGATAACTATTGGAAACGATAGCTAATACCGCATAAGAGAGACTAACGCATGTTAGTAATTTAAAAGGGGCAATTGCTCCACTATGAGATGGACCTGCGTTGTATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATACATAGCCGACCTGAGAGGGTGATCGCCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGCGGC"

Trim 1 and trim 9 produced identical consensus sequences, which does not make sense.
Another interesting observation is that trim value 2 has very high leadingGaps and trailingGaps values for both the F and R strands.

Could you please provide a short description of the trimming algorithm so we can understand what's going on?
Thank you!