natir / yacrd

Yet Another Chimeric Read Detector

License: MIT License

Rust 100.00%

bioinformatics long-reads sequence chimera

yacrd's Issues

Compilation error due to incompatible clap version?

During an update due to migrating to GCC10 on Bioconda I'm seeing the following error:

2022-02-28T23:15:17.9297028Z 23:15:17 BIOCONDA INFO (OUT) error[E0432]: unresolved import `clap::Clap`
2022-02-28T23:15:17.9297782Z 23:15:17 BIOCONDA INFO (OUT)   --> src/main.rs:25:5
2022-02-28T23:15:17.9298056Z 23:15:17 BIOCONDA INFO (OUT)    |
2022-02-28T23:15:17.9298417Z 23:15:17 BIOCONDA INFO (OUT) 25 | use clap::Clap;
2022-02-28T23:15:17.9298802Z 23:15:17 BIOCONDA INFO (OUT)    |     ^^^^^^^^^^ no `Clap` in the root

Perhaps clap has changed its API?

trouble for installation

EDIT: False alarm, I forgot a parameter in the command. (chimeric).
It actually works very well. It was just the Readme that confused me a little bit 😅 .

Hi Pierre,

I'm very interested in your package, but the installation fails, or nearly fails. I tried installing from conda, from source, and with cargo on my Mac (10.14.4), and from source and with cargo via Docker (official Rust image). The next two commands output the same thing every time:

Get the help message

yacrd 0.5.1 Omanyte
Pierre Marijon <[email protected]>
Yet Another Chimeric Read Detector

USAGE:
    yacrd [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    chimeric     In chimeric mode yacrd detect chimera if coverage gap are in middle of read
    help         Prints this message or the help of the given subcommand(s)
    scrubbing    In scrubbing mode yacrd remove all part of read not covered

Run chimera detection

yacrd -i /data/mapping.paf -o /data/reads.yacrd
error: Found argument '-i' which wasn't expected, or isn't valid in this context

USAGE:
    yacrd [SUBCOMMAND]

For more information try --help

Do you have any idea of the origin of the problem?

Regards,

Error: --input provided more than once....

I'm getting this error:

$ yacrd scrubb -i run1.all.report.yacrd -i run1.fastq -o run2.scrubbed.fastq
error: The argument '--input <input>' was provided more than once, but cannot be used multiple times

USAGE:
    yacrd --input <input> --output <output> scrubb --input <input> --output <output>

For more information try --help

This seems odd, since all the readme examples show multiple inputs and outputs. I don't recall seeing this behavior on CentOS. This is installed on macOS via bioconda.

$ yacrd --version
yacrd 0.6.0

post-chimeric detection operation: extract

Write records to a file if they contain a chimeric read.

Provide clearer documentation on output file name munging (and/or a means to specify output file names)

Hi there,

I've just been testing out yacrd, with the following command:

$MINIMAP2 -x ava-ont -g 500 -t 72 $1.chopped.fastq.gz $1.chopped.fastq.gz | \
yacrd chimeric -f $1.chopped.fastq.gz > $1_dechimerized.fastq.gz

ERR3219853.fastq.gz.chopped.fastq.gz
I ran it on a file named ERR3219853.fastq.gz (after also running PoreChop over that).

What I expected, based on the readme, was for the fastq.gz to be written to ERR3219853.fastq.gz_dechimerized.fastq.gz. Instead, what I got was a text-based report written to ERR3219853.fastq.gz_dechimerized.fastq.gz, and the dechimerized fastq written to ERR3219853_filtered.fastq.gz.chopped.fastq.gz.

Re-reading the readme, and digging into the source code, I now understand what it's getting at, but I think it would be better to more explicitly state how yacrd munges the output filename by inserting _filtered before the ".". Even better would be to provide an option to specify the output filename (and report filename) explicitly.
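
For readers hitting the same surprise, here is a minimal sketch of the renaming described above. It is not yacrd's code: `insert_before_first_dot` is a hypothetical helper, and the rule "insert the suffix before the first dot" is only inferred from the file name observed in this issue.

```python
def insert_before_first_dot(filename: str, suffix: str = "_filtered") -> str:
    """Insert `suffix` before the first '.' in `filename`.

    Hypothetical helper: the rule is inferred from the file name reported
    in this issue, not taken from yacrd's source.
    """
    stem, dot, rest = filename.partition(".")
    return stem + suffix + dot + rest


# The input file name from the issue, with its stacked extensions.
observed = insert_before_first_dot("ERR3219853.fastq.gz.chopped.fastq.gz")
print(observed)
```

With stacked extensions such as `.fastq.gz.chopped.fastq.gz`, the suffix lands right after the accession, which is why the output was hard to find.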

Detection of chimeras in nanopore reads

Hi,

We are trying to detect chimeras in some nanopore 16s reads. Although we already know that there are some chimeras in our reads, we are not able to detect them using the parameters from the yacrd example:

minimap2 -x ava-ont -g 500 reads.fasta reads.fasta > overlap.paf
yacrd -i overlap.paf -o report.yacrd -c 4 -n 0.4 scrubb -i reads.fasta -o reads.scrubb.fasta

Could you please help us with the correct parameters to use for this detection?

Thanks in advance.

Regards

Filter from .yacrd report

Dear Natir,
Thanks for the great tool for removing chimera from nanopore reads! Amazing!
The scrubb and filter commands work nicely for me using .paf as an input. However, I want to filter my reads now by the .yacrd report file. I am getting an error: "src/stack.rs 205". The command I was using is:
yacrd -i test.yacrd -o chimera/test.output.yacrd filter -i test.fastq -o chimera/test.fastq

Maybe I am doing something wrong? I would be very grateful if you could help me out.
Thanks a lot in advance for your kind support.
Best,
Maraike

raw fastq file sequences ID did not fully appear in the *report.yacrd

Hi,
great tool, excited to get this to work!
I want to use yacrd to detect chimeric reads in my PacBio CCS data. The data is amplified and the coverage is a little low. After running yacrd, I noticed that not all sequence IDs from the raw fastq file appear in the *report.yacrd. Are there reads that have been filtered out without being labeled? If I just want to detect and remove chimeric reads, can I simply delete the reads labeled Chimeric? What pipeline should I use? Could you please give me some suggestions?

This is the command I used:
minimap2 -x ava-pb A.fastq.gz A.fastq.gz >A.overlap.paf
yacrd -i A.overlap.paf -o A.report.yacrd split -i A.fastq.gz -o A.filter.fastq.gz

Thank you very much!

read and write compressed file

we can use :

zlib
bz2
gzstream
boost::iostreams

post-detection operation code factorization

Create a PostDetectionOperation trait to factor out the code shared between the post-operation types and the supported formats.

1.0 progression

Multiple input #13
Multi-threading overlapping file parsing #14
Multi-threading post-chimeric detection operation #15
Implement post-chimeric detection:
- filtering
- splitting (only sequence data) #16
- extracting (invert operation of filtering) #17
Better panic message #18
Output write in json #20

Order of Chimera detection and scrubbing

I have been given some fastq files, demultiplexed with Guppy, and I was thinking of checking them for chimeric reads. This is amplicon data from FMD virus (amplicon size 400 bp). The genome of this mRNA virus is 8.3 kb long (I know you said here that yacrd is only for direct DNA sequencing, and ours is cDNA). I am looking for best practices for analyzing this data, and I would like to understand whether I should run chimera detection first and then read scrubbing, or vice versa. I realized from an issue here that scrubbing tends to split some chimeric reads. So I was thinking of first performing chimera detection, then splitting those reads as suggested in the issue above, then scrubbing the resulting fastq file?

The issue is that we don't know whether to expect chimeric reads or not, because we are still experimenting and would like to know if there are such reads. So this is an exploratory question.

Detect fasta or fastq without use extension

Issue #28 shows that yacrd scrubbing only accepts sequence files with a fasta or fastq extension.

We need to support sequence files with any type of extension.

Waiting for rust-bio/rust-bio#222 to be integrated into a stable version of rust-bio.
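
One extension-independent approach is to look at the first byte of the file: FASTA records start with `>` and FASTQ records start with `@`. The sketch below illustrates that idea only. `detect_format` is a hypothetical name, and this is not necessarily how rust-bio or yacrd implement it.

```python
import io


def detect_format(stream) -> str:
    """Guess the sequence format from the first byte of a binary stream.

    Hypothetical helper; assumes the stream is uncompressed and starts
    directly with a record header.
    """
    first = stream.read(1)
    if first == b">":
        return "fasta"
    if first == b"@":
        return "fastq"
    raise ValueError(f"unrecognized sequence format, first byte: {first!r}")


print(detect_format(io.BytesIO(b">read1\nACGT\n")))
print(detect_format(io.BytesIO(b"@read1\nACGT\n+\nIIII\n")))
```

A compressed file would need to be decompressed before this check, since its first byte belongs to the compression header.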

yacrd usable for Nanopore cDNA reads?

Hi!

I am wondering if I can use yacrd on my set of ONT cDNA reads that were generated using the SQK-PCS109 PCR-cDNA Sequencing Kit?
Will it remove rare splice variants?
Thanks for your thoughts on this.

Michael

Multi-threading overlapping file parsing

Two solutions:

Assign a thread to each file, merge the tables at the end
- Pros: easy to implement
- Cons: really effective?
Create a worker/dispatcher architecture
- Pros: works with one file or many
- Cons: requires more code
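
The first option (one thread per file, tables merged at the end) can be sketched as follows. This is an illustration in Python rather than yacrd's Rust code. `parse_paf` and `parse_many` are hypothetical names, and the table layout (read name mapped to a list of covered intervals) is an assumption. Only the PAF column positions come from the PAF format itself.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def parse_paf(path):
    """Return {read_name: [(start, end), ...]} for one PAF file."""
    table = defaultdict(list)
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9:
                continue  # skip malformed or truncated lines
            # PAF columns (1-based): 1 query name, 3-4 query start/end,
            # 6 target name, 8-9 target start/end.
            table[fields[0]].append((int(fields[2]), int(fields[3])))
            table[fields[5]].append((int(fields[7]), int(fields[8])))
    return table


def parse_many(paths, workers=4):
    """Parse each file in its own task, then merge the per-file tables."""
    merged = defaultdict(list)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for table in pool.map(parse_paf, paths):
            for name, intervals in table.items():
                merged[name].extend(intervals)
    return merged
```

The merge step is sequential, so this only pays off when parsing dominates the run time, which is the "really effective?" concern raised above. In Python specifically, threads mostly help with I/O, so the sketch shows the structure rather than the expected speed-up.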

json output

Write yacrd out in json format.

trouble in postdetection output generation

If the input is like ../something/blabla.paf, the result needs to be ../something/blabla_suffix.paf; actually it is _suffix../something/blabla.paf

other test case ../something.other/blabla.paf
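
A path-aware way to build the name is to split off the directory first and only touch the file name. The sketch below covers both test cases from this issue. `add_suffix` is a hypothetical helper written in Python for illustration, not the fix applied in yacrd.

```python
import posixpath


def add_suffix(path: str, suffix: str) -> str:
    """Insert `suffix` before the first '.' of the file name only.

    Hypothetical helper: dots in directory names are left alone.
    """
    directory, name = posixpath.split(path)
    stem, dot, extension = name.partition(".")
    return posixpath.join(directory, stem + suffix + dot + extension)


print(add_suffix("../something/blabla.paf", "_suffix"))
print(add_suffix("../something.other/blabla.paf", "_suffix"))
```

Splitting the directory off first is what keeps the dots in `..` and in `something.other` from being mistaken for the start of the extension.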

Can I use yacrd to remove palindromes in long reads obtained from MDA?

Hi!

I'd like to know if I can use yacrd to scrub long reads obtained from MDA samples, specifically in terms of removing palindromes.

If I run yacrd in a scrubbing mode, can I expect that yacrd would remove palindromes in long reads introduced by MDA?

Actually, I've tried to use Pacasus, a tool developed specifically to correct palindromes in long reads from MDA. (https://github.com/swarris/Pacasus)
But, unfortunately, I haven't succeeded in installing the tool on my system.

Thanks.

Ilnam

Deleted chimeric and no-coverage reads from a file in different file format

Add cli option :

-f --filter option takes the file to filter
-o --output takes the file name where the filtered data is written

Supported format :

fasta file (required seqan)
fastq file (required seqan)
paf
mhap

Comparisons with MiniScrub

Hello, @natir,

I'm currently looking into long read scrubbing and came across your tools and publications, and the comparison with DASCRUBBER, but you only briefly mention MiniScrub.

Do you have or know of any comparisons on effect of MiniScrub vs yacrd?

Greetings.

Better panic message

Wait for integration in rust stable channel of rust-lang/rust#44489

yacrd parameters optimization

Hi,

first of all, thank you for this tool. It's very helpful!
but I still don't get how it really works. I used yacrd on my nanopore genomic reads applying recommended parameters:
minimap2 -x ava-ont -g 500 reads.fasta reads.fasta > overlap.paf
yacrd -i overlap.paf -o report.yacrd -c 4 -n 0.4

in the output, I get a number of chimeric reads. One of these looks realistic with quite large bad regions in the alignment such as:

1) Chimeric 940ba8e3-795c-4739-8263-06f9f97f2a21 62054 2,0,2;2728,41848,44576;27,62027,62054
2) Chimeric 58077747-4acf-434b-8218-d34afda38a33 35725 39,0,39;3744,25149,28893;1,35724,35725

but some looks weird with just a little misalignment:

1) Chimeric 691a7c43-65da-4253-8d24-e483218e856a 71017 26,0,26;40,37103,37143;6,71011,71017
2) Chimeric aead8c29-bb59-4557-823c-2cb264d50148 41193 2,0,2;20,3635,3655;13,41180,41193

The zero-coverage regions are no more than 40 nt, which is less than 1% of the overall read length. Why does yacrd think that these reads are chimeric?
Could you explain how -c and -n parameters really work?
Thanks in advance!
Sergei
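
To make the report lines above easier to read, here is a small parsing sketch. The field layout is inferred from the quoted lines themselves: label, read id, read length, then semicolon-separated `length,begin,end` triples, where each first number equals `end - begin`. The rule that a gap in the middle of the read is what triggers the Chimeric label comes from yacrd's help text ("detect chimera if coverage gap are in middle of read"). That gap size plays no role is an assumption. `parse_report_line` and `interior_gaps` are hypothetical names.

```python
def parse_report_line(line):
    """Split one report line into (label, read_id, length, gaps)."""
    label, read_id, length, regions = line.split()
    gaps = []
    for triple in regions.split(";"):
        gap_length, begin, end = (int(value) for value in triple.split(","))
        gaps.append((gap_length, begin, end))
    return label, read_id, int(length), gaps


def interior_gaps(length, gaps):
    """Keep gaps that touch neither end of the read."""
    return [(begin, end) for _, begin, end in gaps if begin > 0 and end < length]


line = ("Chimeric 691a7c43-65da-4253-8d24-e483218e856a 71017 "
        "26,0,26;40,37103,37143;6,71011,71017")
label, read_id, length, gaps = parse_report_line(line)
middle = interior_gaps(length, gaps)
bad_fraction = sum(gap_length for gap_length, _, _ in gaps) / length
print(label, middle, f"{bad_fraction:.4f}")
```

Under that reading, the 40 nt gap at 37103-37143 sits in the middle of the read, so it would be enough on its own for the Chimeric label. The fraction of uncovered bases would only matter for the NotCovered decision, assuming that is what -n controls.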

What pipeline should I use?

Hi,

I have 77 GB of filtered ONT long reads (30x coverage of my target genome). I would like to know if there are chimeric reads, and if so I would like to either split/scrub them or discard them. What pipeline would you recommend in terms of using fpa and yacrd? I am also confused about the difference between split and scrub. To me, it looks like scrub also splits chimeric reads and does some extra trimming; is this correct?

Thank you very much.

Best wishes,
Yutang

Why the overlap?

Dear all,
I successfully used yacrd after running minimap2. It worked perfectly. I have the following question: why do we need the overlap step with minimap2? I refer to the instructions reported below (from the yacrd website):

minimap2 -x ava-ont my_reads.fq my_reads.fq > overlap.paf
yacrd -i overlap.paf -o reads.yacrd extract -i my_reads.fastq -o reads.extract.fastq

Here is my question again: why do we have to perform the overlap task in step 1? What information does it give us that is then used to identify the chimeric reads?

I am probably missing some information.

Thank you very much for your support.

Dr. Mastriani Emilio
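
As a rough illustration of what the overlap step provides: each line of the PAF file says which interval of a read is covered by another read, and stacking those intervals gives a per-base coverage profile. A stretch with little or no coverage in the middle of a read is the signal that yacrd's help text describes for chimera detection. The sketch below is a simplified Python model, not yacrd's implementation. `low_coverage_regions` is a hypothetical name, and `threshold` stands in for what the -c option appears to control.

```python
def low_coverage_regions(read_length, overlaps, threshold=0):
    """Return half-open (begin, end) regions whose coverage is <= threshold.

    `overlaps` is a list of (start, end) intervals on the read, as found
    in the start/end columns of a PAF file.
    """
    # Difference array: +1 where an overlap starts, -1 where it ends.
    delta = [0] * (read_length + 1)
    for start, end in overlaps:
        delta[start] += 1
        delta[end] -= 1

    regions = []
    depth = 0
    region_start = None
    for position in range(read_length):
        depth += delta[position]
        if depth <= threshold:
            if region_start is None:
                region_start = position
        elif region_start is not None:
            regions.append((region_start, position))
            region_start = None
    if region_start is not None:
        regions.append((region_start, read_length))
    return regions


# Toy read of 1000 bases: two groups of overlaps with nothing in between.
overlaps = [(0, 400), (50, 450), (600, 1000), (650, 1000)]
gaps = low_coverage_regions(1000, overlaps)
print(gaps)
```

In this toy example no other read spans positions 450 to 600, which is what one would expect if the two halves of the read came from different places in the genome.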

Compilation error: converting to 'std::priority_queue

Any ideas? Trying to make a homebrew package for it.

g++-5   -I/tmp/yacrd-20180421-139125-1jr8c12/yacrd-0.2/inc  -DNDEBUG -O3 -flto -march=native -mtune=native   -std=c++11 -o CMakeFiles/yacrd.dir/src/analysis.cpp.o -c /tmp/yacrd-20180421-139125-1jr8c12/yacrd-0.2/src/analysis.cpp
/tmp/yacrd-20180421-139125-1jr8c12/yacrd-0.2/src/analysis.cpp: 
In function 'std::unordered_set<std::__cxx11::basic_string<char> > yacrd::analysis::find_chimera(const string&, uint64_t, float)':
/tmp/yacrd-20180421-139125-1jr8c12/yacrd-0.2/src/analysis.cpp:52:15: 
error: converting to 'std::priority_queue<long unsigned int, std::vector<long unsigned int>, std::greater<long unsigned int> >' from initializer list would use explicit constructor 'std::priority_queue<_Tp, _Sequence, _Compare>::priority_queue(const _Compare&, _Sequence&&) [with _Tp = long unsigned int; _Sequence = std::vector<long unsigned int>; _Compare = std::greater<long unsigned int>]'
         stack = {};
               ^

Using gcc 5.5 on Linux:

cmake version 3.11.1
gcc version 5.5.0 (Homebrew gcc 5.5.0_4)

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value'

Hi, I installed yacrd and fpa by conda, fpa worked but yacrd failed, it reported error like this:
thread 'main' panicked at 'called Option::unwrap() on a None value', src/libcore/option.rs:355:21
note: Run with RUST_BACKTRACE=1 for a backtrace.
I tried on different machines but still, the same issue occurred . Could you please give any ideas about this? Thanks!

Reduce memory usage

At this stage all overlaps are loaded into memory, which is a huge problem for large overlap files.

Mismatched types

I imagine this is a change in clap, but on Bioconda I'm running into the following errors during compilation:

2021-03-31T20:34:40.9529590Z 20:34:40 BIOCONDA INFO (ERR) error[E0308]: mismatched types
2021-03-31T20:34:40.9533120Z 20:34:40 BIOCONDA INFO (ERR)   --> src/cli.rs:45:17
2021-03-31T20:34:40.9535020Z 20:34:40 BIOCONDA INFO (ERR)    |
2021-03-31T20:34:40.9536860Z 20:34:40 BIOCONDA INFO (ERR) 45 |         short = "i",
2021-03-31T20:34:40.9538000Z 20:34:40 BIOCONDA INFO (ERR)    |                 ^^^ expected `char`, found `&str`
2021-03-31T20:34:40.9538590Z 20:34:40 BIOCONDA INFO (ERR)

Perhaps you've already fixed this in the main branch (I don't know rust, but I assume this should be 'i' rather than "i") and if so it'd be great if you could tag a new release soon.

Add --version switch

% yacrd --version
yacrd 0.2

Helps a lot in pipeline audits.

Thread 'main' panicked at 'called `Option::unwrap() ERROR

Hello @natir ,

I wanted to test your tool on a set of contigs, to see whether it can detect "chimeric" contigs as well. But after just 2 min, yacrd crashed with this error message:

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', libcore/option.rs:345:21 note: Run with `RUST_BACKTRACE=1` for a backtrace.

The command I ran is :

yacrd -i sample.mecat2.racon.noN.self.olp.paf -o yacrd.out -f fasta -e fasta -s fasta
Any idea, what might have gone wrong?

Best,
Julien

thread 'main' panicked at 'slice index starts at 8092 but ends at 7507'

Hi,

I was running yacrd with the following commands:

${SINGULARITYdir}minimap2.simg minimap2 -x ava-ont -t $SLURM_CPUS_PER_TASK -g 500 ${TMPdir}filtered_reads.fq.gz ${TMPdir}filtered_reads.fq.gz >${TMPdir}overlap.paf

yacrd -i ${TMPdir}overlap.paf -o ${TMPdir}report.yacrd -c 4 -n 0.4 scrubb -i ${TMPdir}filtered_reads.fq -o ${TMPdir}reads.scrubb.fasta

and got the following error:

thread 'main' panicked at 'slice index starts at 8092 but ends at 7507', src/libcore/slice/mod.rs:2670:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
SIGABRT: abort
PC=0x47cdab m=0 sigcode=0

goroutine 1 [running, locked to thread]:
syscall.RawSyscall(0x3e, 0x10421, 0x6, 0x0, 0x0, 0xc000110180, 0xc000110180)
	/usr/lib/golang/src/syscall/asm_linux_amd64.s:78 +0x2b fp=0xc00023be70 sp=0xc00023be68 pc=0x47cdab
syscall.Kill(0x10421, 0x6, 0x0, 0x0)
	/usr/lib/golang/src/syscall/zsyscall_linux_amd64.go:597 +0x4b fp=0xc00023beb8 sp=0xc00023be70 pc=0x479bcb
github.com/sylabs/singularity/internal/app/starter.Master.func2()
	internal/app/starter/master_linux.go:152 +0x61 fp=0xc00023bf00 sp=0xc00023beb8 pc=0x7928f1
github.com/sylabs/singularity/internal/pkg/util/mainthread.Execute.func1()
	internal/pkg/util/mainthread/mainthread.go:21 +0x2f fp=0xc00023bf28 sp=0xc00023bf00 pc=0x790f4f
main.main()
	cmd/starter/main_linux.go:102 +0x5f fp=0xc00023bf60 sp=0xc00023bf28 pc=0x972bbf
runtime.main()
	/usr/lib/golang/src/runtime/proc.go:203 +0x21e fp=0xc00023bfe0 sp=0xc00023bf60 pc=0x433b4e
runtime.goexit()
	/usr/lib/golang/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc00023bfe8 sp=0xc00023bfe0 pc=0x45f7c1

goroutine 19 [syscall]:
os/signal.signal_recv(0xb9da80)
	/usr/lib/golang/src/runtime/sigqueue.go:147 +0x9c
os/signal.loop()
	/usr/lib/golang/src/os/signal/signal_unix.go:23 +0x22
created by os/signal.init.0
	/usr/lib/golang/src/os/signal/signal_unix.go:29 +0x41

goroutine 5 [chan receive]:
github.com/sylabs/singularity/internal/pkg/util/mainthread.Execute(0xc0003cc400)
	internal/pkg/util/mainthread/mainthread.go:24 +0xb4
github.com/sylabs/singularity/internal/app/starter.Master(0x7, 0x4, 0x10436, 0xc00000e140)
	internal/app/starter/master_linux.go:151 +0x44c
main.startup()
	cmd/starter/main_linux.go:75 +0x53e
created by main.main
	cmd/starter/main_linux.go:98 +0x35

rax    0x0
rbx    0x0
rcx    0xffffffffffffffff
rdx    0x0
rdi    0x10421
rsi    0x6
rbp    0xc00023bea8
rsp    0xc00023be68
r8     0x0
r9     0x0
r10    0x0
r11    0x202
r12    0xff
r13    0x0
r14    0xb83b64
r15    0x0
rip    0x47cdab
rflags 0x202
cs     0x33
fs     0x0
gs     0x0

Do you have any idea what could have triggered this error?
I am rerunning it now with backtrace enabled.

All the best and thank you, Dominik

yacrd for detecting chimeras in amplicon sequences

Hi,

I want to detect chimeras in 16S nanopore data, similar to this post. I've now tried vsearch, but as vsearch was developed for high-quality short reads, I think a lot of false-positive chimeric sequences are found.

@natir in that post you state "If a read has a poor-quality region in the middle, it's considered chimeric.". But - if I'm not mistaken - this does not lead to correct chimera detection of amplicon chimeras? In amplicon sequencing the error profile (i.e. the poor-quality region) is not related to the read being chimeric or not.

So can I state correctly that yacrd is not suitable for chimera detection of amplicon nanopore data?

how about using -X when running minimap2?

Thanks for this nice tool!
Since it starts with an all-vs-all comparison, is it OK to use the parameter -X in minimap2 ("skip self and dual mappings (for the all-vs-all mode)") to save time and disk space?

MDA chimeric reads

Hey, @natir,

Any experience with using yacrd to clean MDA derived reads from wgaDNA?

escape characters `/` and `\` in ondisk filename

Error while running yacrd

Hi there,

I have been running the spaghetti.sh script for my Nanopore sequencing data. All samples work fine except for one which produces this error:

Error: Error in compression detection of file barcode41_set2_comb-porechop-nanofilt.paf

Caused by:
File is too short, less than five bytes

The file doesn't seem to have any issues (similar size and data as the other ones).

Any suggestions?
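
The wording of the message suggests that the compression format is guessed from the first few bytes of the file, so a file that is empty or nearly empty at the moment it is opened cannot be classified. The Python sketch below shows that kind of check under this assumption. `sniff_compression` is a hypothetical name, the magic numbers are the standard gzip, bzip2 and xz signatures, and the five-byte minimum simply mirrors the error text.

```python
import bz2
import gzip
import lzma

# Standard leading bytes ("magic numbers") of common compression formats.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd7zXZ": "xz",
}


def sniff_compression(data: bytes) -> str:
    """Guess the compression format from the first five bytes of `data`."""
    if len(data) < 5:
        raise ValueError("file is too short, less than five bytes")
    head = data[:5]
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    return "uncompressed"


print(sniff_compression(gzip.compress(b"ACGT")))
print(sniff_compression(b"read1\t100\t0\t50\t+\n"))
```

If this is what happens here, checking the byte count of the .paf file (for example with `wc -c`) at the point where yacrd is launched would confirm or rule it out.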

Multi-threading post-chimeric detection operation

Two solutions:

Assign a thread to each input/post-operation pair:
- Pros: easy to implement
- Cons: can one file be read by multiple threads at the same time?
A thread for each file, with a sub-thread for each required post-operation:
- Pros: one reader per file
- Cons: more complex architecture

interpreting results

Hi,

so the tool ran easily - thanks - but I am a little concerned with the results.

wc -l *.yacrd
1964840 iddm_report.yacrd

grep -c Chimeric iddm_report.yacrd
114108

grep -c NotBad iddm_report.yacrd
454940

grep -c NotCov iddm_report.yacrd
1395792

As I understand it, out of 1.9m reads, only 454k are NotBad and can therefore be used in further analyses ? From work to date with the unfiltered data (WGS Rat, just genomic alignments), I think most reads are pretty decent.

Or should I be happy with the NotCov reads ?

Commands:


srun -c 16 minimap2 -t 16 -x ava-ont -g 500 iddm_30kbp_3325_comb.fastq.gz iddm_30kbp_3325_comb.fastq.gz > iddm_overlaps.paf &
yacrd -i iddm_overlaps.paf -o report.yacrd -c 4 -n 0.4 scrubb -i iddm_30kbp_3325_comb.fastq.gz -o iddm_30kbp_3325_comb.fastq.gz.scrubb.fasta
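
The three grep counts above can be tallied in one pass and turned into shares of the total. This is a small Python sketch using only the numbers quoted in this issue. `label_counts` is a hypothetical helper that assumes the label is the first whitespace-separated field of each report line.

```python
from collections import Counter


def label_counts(lines):
    """Count report lines by their first field (the label)."""
    return Counter(line.split(None, 1)[0] for line in lines if line.strip())


# Counts reported above by grep -c.
counts = {"Chimeric": 114108, "NotBad": 454940, "NotCov": 1395792}
total = sum(counts.values())
for label, number in counts.items():
    print(f"{label}: {number} ({number / total:.1%})")
```

The three counts add up to the 1964840 lines reported by wc -l, so every line of the report carries exactly one of these labels.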

Installation error

Hi. I'm trying to install yacrd on CentOS 7.4.1708 with cmake v2.8.12.2 and GNU make v3.82

$ git clone https://github.com/natir/yacrd.git
Cloning into 'yacrd'...
remote: Counting objects: 129, done.
remote: Compressing objects: 100% (82/82), done.
remote: Total 129 (delta 70), reused 93 (delta 42), pack-reused 0
Receiving objects: 100% (129/129), 31.66 KiB | 0 bytes/s, done.
Resolving deltas: 100% (70/70), done.
$ cd yacrd
$ ls -l
total 12
-rw-r--r--. 1 root root 548 Mar 29 16:15 CMakeLists.txt
drwxr-xr-x. 2 root root 34 Mar 29 16:15 image
drwxr-xr-x. 2 root root 97 Mar 29 16:15 inc
-rw-r--r--. 1 root root 1071 Mar 29 16:15 LICENSE
-rw-r--r--. 1 root root 2480 Mar 29 16:15 Readme.md
drwxr-xr-x. 2 root root 117 Mar 29 16:15 src
drwxr-xr-x. 2 root root 28 Mar 29 16:15 test
$ mkdir build
$ cd build
$ cmake ..
-- The C compiler identification is GNU 4.8.5
-- The CXX compiler identification is GNU 4.8.5
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/software/yacrd/build
$ make
Scanning dependencies of target yacrd
[ 20%] Building CXX object CMakeFiles/yacrd.dir/src/analysis.cpp.o
c++: error: unrecognized command line option ‘-Wodr’
c++: error: unrecognized command line option ‘-std=c++14’
make[2]: *** [CMakeFiles/yacrd.dir/src/analysis.cpp.o] Error 1
make[1]: *** [CMakeFiles/yacrd.dir/all] Error 2
make: *** [all] Error 2
$ ls -l
total 28
-rw-r--r--. 1 root root 12003 Mar 29 16:16 CMakeCache.txt
drwxr-xr-x. 6 root root 4096 Mar 29 16:26 CMakeFiles
-rw-r--r--. 1 root root 1582 Mar 29 16:16 cmake_install.cmake
-rw-r--r--. 1 root root 7669 Mar 29 16:16 Makefile

post-chimeric detection operation: spliting

If a read is a chimera, remove the not-covered region and split the read at this region.
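
A minimal sketch of that operation, in Python for illustration only: `split_read` is a hypothetical name, and the regions are assumed to be half-open (begin, end) intervals on the read.

```python
def split_read(sequence, bad_regions):
    """Cut `sequence` at each not-covered region and drop those regions."""
    pieces = []
    cursor = 0
    for begin, end in sorted(bad_regions):
        if begin > cursor:
            pieces.append(sequence[cursor:begin])
        cursor = max(cursor, end)
    if cursor < len(sequence):
        pieces.append(sequence[cursor:])
    return pieces


# One not-covered region in the middle of a 12-base read.
print(split_read("AAAACCCCGGGG", [(4, 8)]))
```

Each piece would then be written out as its own record, so a naming scheme for the pieces is also needed.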

Please clarify whether chimeric read detection also performs scrubbing

Could you please clarify (and maybe also mention in the documentation): when running yacrd chimeric, does it also perform read scrubbing?

I'd like to know whether I need to re-align the dechimerised reads -- ideally I'd like to not have to, as the all-to-all alignment is fairly expensive, even with minimap2.

Kernel panic at 'called `Option::unwrap()`

Hello, @natir,

I tried running yacrd to scrub my reads and got the following output:

thread 'main' panicked at 'called Option::unwrap() on a None value', src/libcore/option.rs:355:21
note: Run with RUST_BACKTRACE=1 for a backtrace.

I ran the script like this:

minimap2 -x ava-ont -g 500 SRR10150407_1_merged.fq.gz SRR10150407_1_merged.fq.gz > overlap.paf
yacrd scrubbing -c 3 -n 0.4 -m overlap.paf -s SRR10150407_1_merged.fq.gz -S reads_scrubbed.fasta -r scrubbed_report.yacrd

Cheers

example for splitting command line

What's the right way to obtain a splitted fasta?

I tried

yacrd -i 35k_all_vs_all.paf -s yacrd.fa

and it gives a few lines like:
Chimeric ERR1716491.27306 21246 152,0,152;187,2522,2709;13514,6541,20055;24,21222,21246
Chimeric ERR1716491.58156 26998 2361,0,2361;2992,2680,5672;3227,6300,9527;3615,9919,13534;2457,16289,18746;5,26993,26998
...

and then crashes with

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', libcore/option.rs:345:21 note: Run with `RUST_BACKTRACE=1` for a backtrace.

producing no fasta file. Dataset is https://transfer.sh/Renf8/35k_all_vs_all.paf in case you need it

Faster yacrd

Hi @jguhlin.

You seem to have some interesting ideas to improve yacrd runtime, (I stole one, by the way), can we discuss it somewhere?

Mail, twitter, this issue ?

Thanks for your interest in yacrd.