hmpnk / csa2.6

Chromosome Scale Assembler: A high-throughput chromosome scale genome assembly pipeline for vertebrate genomes

License: MIT License

Perl 93.50% Awk 3.63% Shell 2.87%

genome-assembly oxford-nanopore pacbio longread chromosome-level-assembly

csa2.6's Issues

crash every time

I am assembling a mouse genome with Nanopore long reads, but the pipeline crashes every time and I cannot figure out why.
Here is the error message:

`Fri Jun 19 09:20:34 CST 2020
MAP RAW READS TO ASSEMBLY

Fri Jun 19 09:53:52 CST 2020
WRITE GAP/CTGEND MATCHING RAW READS TO FILES

Fri Jun 19 09:58:39 CST 2020
RUN LOCAL ASSEMBLIES

*** Error in /home/yu/app/CSA2.6/INSTALL/../bin/wtdbg2.2/wtdbg-cns': free(): corrupted unsorted chunks: 0x00000000017deda0 *** *** Error in /home/yu/app/CSA2.6/INSTALL/../bin/wtdbg2.2/wtdbg-cns': free(): corrupted unsorted chunks: 0x0000000001d91da0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f50427fe7e5]
/lib/x86_64-linux-gnu/libc.so.6/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f504280737a]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f188afa737a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f5042d726ba]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f188afab53c]
/home/yu/app/CSA2.6/INSTALL/../bin/wtdbg2.2/wtdbg-cns[0x404415]
/home/yu/app/CSA2.6/INSTALL/../bin/wtdbg2.2/wtdbg-cns[0x4036e1]
/lib/x86_64-linux-gnu/libc.so.6(clone/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f188b5126ba]
00400000-00421000 r-xp 00000000 08:41 2139283 /home/yu/app/CSA2.6/bin/wtdbg2.2/wtdbg-cns
00620000-00621000 r--p 00020000 08:41 2139283 /home/yu/app/CSA2.6/bin/wtdbg2.2/wtdbg-cns
00621000-00622000 rw-p 00021000 08:41 2139283 /home/yu/app/CSA2.6/bin/wtdbg2.2/wtdbg-cns
017de000-01889000 rw-p 00000000 00:00 0 [heap]
7f502c000000-7f502c023000 rw-p 00000000 00:00 0`
Maybe my RAM (188 GB) is too low? If that is the reason, I think you should mention the minimum RAM requirement. Thank you!

CSA fails to go beyond step 2

Hi there

I have been trying to get CSA running for a while, but it always fails at step 2, although the test dataset runs through. It seems to have something to do with Ragout, but the log does not help me find the actual problem. It always fails with the error "ERROR: Error reading permutations". Previously my problem was that I was using zsh as my shell, and it choked on the bash scripts CSA generated, specifically on the multi-line commands. Using bash now, the test dataset works, but my actual dataset still does not. The logs:

cat parallel.log
index file FASTLAST.fai not found, generating...
awk: /home/ek/progz/CSA2.6/INSTALL/../script/update_maf_coords.awk:14: warning: escape sequence \.' treated as plain .'

cat ragout.log
[12:19:10] INFO: Cooking Ragout...
[12:19:10] INFO: Reading FASTA with contigs
[12:19:13] INFO: Converting MAF to synteny
Parsing MAF file
Started initial compression
Simplification with 30 10
Simplification with 100 100
Simplification with 500 1000
Simplification with 1000 5000
Simplification with 5000 15000
[12:19:15] INFO: Running Ragout with the block size 160000
[12:19:15] ERROR: Error reading permutations

In the folder mafworkdir/160000 the files are empty, while the ones for block sizes up to 80000 are OK. I guess this is the problem, but I don't know how to fix it.

The contents of the files in this folder are:
$ cat blocks_coords.txt
Seq_id Size Description

$ cat coverage_report.txt
Seq_id Size Description

genomes_permutations.txt is empty.

Any ideas how to fix this?
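
To see at a glance which block sizes ended up without permutations, the subdirectories of the work directory can be scanned for an empty permutations file. The following is a minimal sketch, assuming the layout described in this report (one subdirectory per block size under mafworkdir, each holding a genomes_permutations.txt). The function name find_empty_block_dirs is hypothetical and not part of CSA or Ragout, and the sketch only locates the affected directories; it does not explain why they are empty.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of CSA or Ragout.
# Prints every block-size subdirectory whose genomes_permutations.txt
# is missing or has zero bytes.
find_empty_block_dirs() {
    local workdir="$1"
    local dir
    for dir in "$workdir"/*/; do
        # Skip the literal pattern when no subdirectory matches.
        [ -d "$dir" ] || continue
        # -s is true only for an existing file larger than zero bytes.
        if [ ! -s "${dir}genomes_permutations.txt" ]; then
            printf '%s\n' "${dir%/}"
        fi
    done
}

# Usage (directory name taken from the report above):
#   find_empty_block_dirs mafworkdir
```

If only the largest block size is listed, that would match the observation that the directories up to 80000 look fine.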

problem if system language not en_US.UTF-8

Hi,
we stumbled across a small problem with your script 03_REASSEMBLE.pl. We were using your pipeline on a CentOS 7 system where the language was not set to en_US.UTF-8. This led to an error in step 3 because the script tried to process a file called 'insgesamt' (the German-locale label of the summary line that wc prints as 'total' in English), which it could not find.
We narrowed it down to line 100, where you use ' xargs wc -l | sort -k1,1rn | grep -vw total'.

Our solution was to set the system language at the beginning of the STEP_03.bash script:
export LANG='en_US.UTF-8'

Maybe you can think of a different solution or mention it in the Readme.
Thanks again for this nice Pipeline!
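
An alternative to exporting LANG for the whole script would be to force the C locale only for the pipeline that parses the wc output, so that the summary line is labelled total regardless of the system language. The following is a minimal sketch, not the actual code from 03_REASSEMBLE.pl; the function name count_lines_sorted is hypothetical. LC_ALL is used here because it takes precedence over LANG and the other LC_* variables.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating a locale-independent line count.
# Reads file names from stdin (one per line) and prints "<lines> <file>"
# sorted by line count, largest first, without wc's summary line.
count_lines_sorted() {
    # LC_ALL=C makes wc label its summary line "total" in every
    # system language, so the grep filter below keeps working.
    LC_ALL=C xargs wc -l | sort -k1,1rn | grep -vw total
}

# Usage:
#   printf '%s\n' a.txt b.txt | count_lines_sorted
```

As in the original command, any input file whose name contains total as a separate word would also be filtered out by grep.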

what is the difference between final assembly A and B?

I used a pre-assembled genome as input, starting from step 02, and ended up with two versions of the assembly that differ greatly in scaffold number. Why are there two different versions of the assembly, and what are the differences between them?
