nextpolish's Issues

compile error in ubuntu 18

Hi,

I have a compile error here. Any suggestions?

gcc -g -Wall -O2 -I. -I../../lib/htslib/ -I./lz4 -c -o bam_tview_curses.o bam_tview_curses.c
bam_tview_curses.c:41:10: fatal error: curses.h: No such file or directory
 #include <curses.h>
          ^~~~~~~~~~
compilation terminated.
Makefile:133: recipe for target 'bam_tview_curses.o' failed
make[2]: *** [bam_tview_curses.o] Error 1

Thanks
Bin

Question about -debug output

Hi,

I was wondering if you could tell me the specification for the stderr output when you run NextPolish with the -debug flag? It looks like it outputs 5 columns, and I have a vague idea what they mean, but I would like to make sure.

For context, I would like to make a file from my NextPolish run that is similar to the file Pilon outputs when run with --changes (see here). I think I can probably generate this from the -debug output, but to do so I need to be confident about its format.

Thanks! And awesome program by the way :)

make: no reaction

Hello, I have a problem running "make", as follows:
[Zahng@fat01 NextPolish] $ make
mkdir /lustre/Zahng/biosoft/NextPolish/bin
make -C util;
make[1]: Entering directory `/lustre/Zahng/biosoft/NextPolish/util'
gcc -Wall -O3 -fvisibility=hidden -s -pthread -o seq_split seq_split.c thpool.c -lz

It stays at this step and nothing happens.
Is this step expected to take a long time?

NextPolish was core dumped in ‘01.lgs_polish/05.polish.ref.sh.work’

When I run nextPolish, the job stops at 01.lgs_polish/05.polish.ref.sh.work, but there is no error message in nextPolish.sh.e. The Python script NextPolish/lib/nextpolish2.py is not using any CPU, and the job has been sitting at this step for a long time (more than 1 day).

The directory structure of 05.polish.ref.sh.work was:
polish_genome0
├── [ 49M] genome.nextpolish.part000.fasta
├── [ 787] nextPolish.sh
├── [ 0] nextPolish.sh.done
├── [ 46K] nextPolish.sh.e
└── [ 6] nextPolish.sh.o
polish_genome1
├── [2.7G] core.336510
├── [ 47M] genome.nextpolish.part001.fasta
├── [ 787] nextPolish.sh
├── [ 43K] nextPolish.sh.e
└── [ 6] nextPolish.sh.o
polish_genome2
├── [ 49M] genome.nextpolish.part002.fasta
├── [ 787] nextPolish.sh
├── [ 0] nextPolish.sh.done
├── [ 52K] nextPolish.sh.e
└── [ 6] nextPolish.sh.o
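
In the listing above, polish_genome1 is the only subtask without a nextPolish.sh.done marker, and it is also the one holding a core file. A minimal Python sketch of that check, assuming the directory layout shown above; the function name find_unfinished_subtasks is hypothetical and not part of NextPolish:

```python
from pathlib import Path


def find_unfinished_subtasks(workdir, done_name="nextPolish.sh.done"):
    """Return (subtask name, core dump file names) for each unfinished subtask.

    A subtask counts as unfinished when its directory has no done-marker file.
    """
    unfinished = []
    for sub in sorted(Path(workdir).iterdir()):
        if not sub.is_dir():
            continue
        # Core dumps are named core.<pid> in the listing above.
        cores = sorted(p.name for p in sub.glob("core.*"))
        if not (sub / done_name).exists():
            unfinished.append((sub.name, cores))
    return unfinished
```

Calling find_unfinished_subtasks("01.lgs_polish/05.polish.ref.sh.work") on a layout like the one above would list polish_genome1 together with its core file, which narrows down which nextPolish.sh.e log to read first.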

The last 3 lines of nextPolish.sh.e are as follows:
gap_aln:194 sup_aln:206 depth_cluster:2 gap_cluster:0 bin_len:500 median_depth:63
gap_aln:274 sup_aln:330 depth_cluster:2 gap_cluster:0 bin_len:500 median_depth:142
[INFO] 2020-06-08 23:17:04,873 Start a corrected worker in 336888 from parent 336295

nextPolish v1.2.1

My run.cfg file:
[General]
job_type = local
job_prefix = nextPolish
task = 55551212
rewrite = yes
rerun = 3
parallel_jobs = 3
multithread_jobs = 20
genome = ./A72_assembly.fasta
genome_size = auto
workdir = ./
polish_options = -p {multithread_jobs}

[sgs_option]
sgs_fofn = ./sgc.config
sgs_options = -max_depth 100

[lgs_option]
lgs_fofn = ./lgc.config
lgs_options = -min_read_len 1k -max_read_len 150k -max_depth 100
lgs_minimap2_options = -x map-ont

[polish_options]
ploidy=1

Is it OK to use HiFi reads to build a consensus of the genome?

Hi,
I have a genome assembled from Nanopore reads, and I want to build its consensus with PacBio reads. Is that OK? In addition, how many iterations are best for consensus or polishing, given that my assembly is large?

BUG for task=best

In the example, setting task = best means "551212" will be used:
task = best # task need to run [all, default, best, 1, 2, 5, 12, 1212...], all=[5]1234, default=[5]12, best=[55]1212. (default: default)

But in the script lib/configParser.py, downloaded from https://github.com/Nextomics/NextPolish/releases/download/v1.0.3/NextPolish-CentOS6.9.tgz:

def _settask(self):
    import re
    task = re.sub(r'[\s,;]+','', self.cfg['task'])
    if task == 'all':
        task = '1234'
    elif task == 'default':
        task = '12'
    elif task == 'best':
        ### task = '12121212'
When task = best, 12121212 is used. Is this a bug?
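
For comparison, here is a minimal sketch of alias expansion that follows the run.cfg comment quoted above (all=[5]1234, default=[5]12, best=[55]1212). Reading the bracketed prefix as "used only when long reads are supplied" is an assumption, and expand_task is a hypothetical name, not the function in lib/configParser.py:

```python
import re

# Alias table copied from the run.cfg comment:
# alias -> (bracketed long-read prefix, remaining tasks)
TASK_ALIASES = {
    "all": ("5", "1234"),
    "default": ("5", "12"),
    "best": ("55", "1212"),
}


def expand_task(raw, has_long_reads):
    """Expand a task setting into a list of integer task codes."""
    task = re.sub(r"[\s,;]+", "", raw)
    if task in TASK_ALIASES:
        prefix, suffix = TASK_ALIASES[task]
        # Assumption: the bracketed prefix applies only with long reads.
        task = (prefix if has_long_reads else "") + suffix
    if not re.fullmatch(r"[1-5]+", task):
        raise ValueError("unsupported task string: %r" % raw)
    return [int(c) for c in task]
```

Under this reading, expand_task("best", True) gives the 551212 sequence from the example, while the quoted v1.0.3 code maps best to 12121212.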

Haikuan

Some confusion about seq_split and paired FASTQ files

Hi sir,

I used seqkit to filter out sequences containing "N" from my short-read files.

At first I overlooked the pairing issue, so I used the filtered, no-longer-fully-paired short reads to polish my genome.

seq_split worked fine with these reads, and no errors were reported during the whole analysis.

But I am not sure whether this kind of input can produce a correctly polished genome.

After filtering out the reads containing "N", do I also need to filter out the unpaired reads to get properly paired NGS_1.fq and NGS_2.fq files?
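
One way to restore pairing after such a filter is to keep only reads whose ID survives in both files. A minimal sketch, assuming 4-line FASTQ records and mate IDs that match once a trailing /1 or /2 is removed; the function names are hypothetical, and whether NextPolish requires strictly paired input is not settled here:

```python
def read_id(header):
    """Return the read ID from a FASTQ header line, without a /1 or /2 suffix."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def parse_fastq(lines):
    """Yield (read_id, record) pairs; each record is a list of 4 lines."""
    lines = [ln.rstrip("\n") for ln in lines]
    for i in range(0, len(lines) - 3, 4):
        yield read_id(lines[i]), lines[i:i + 4]


def keep_pairs(fq1_lines, fq2_lines):
    """Return the records of both mates, restricted to IDs present in both."""
    r1 = dict(parse_fastq(fq1_lines))
    r2 = dict(parse_fastq(fq2_lines))
    shared = [rid for rid in r1 if rid in r2]  # keeps the order of file 1
    return [r1[rid] for rid in shared], [r2[rid] for rid in shared]
```

This version holds both files in memory, so it only illustrates the logic; real short-read sets would need a streaming or disk-backed approach.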

Thanks a lot.

The completeness evaluated by BUSCO

Hi, thanks for developing this tool. I compared the results of NextPolish and Pilon. The error rate of the NextPolish-polished assembly was lower than that of Pilon; however, its BUSCO completeness was also slightly lower than with Pilon. Could you please provide some information about this? Thanks a lot.

long-read only polishing error: Error, task only accept: [all,1,2,3,4]

Hi,
I am trying to polish an assembly with NextPolish version 1.0.3 on a single node with 24 cores. I am using only Canu error-corrected reads. As we do not have Illumina short reads and sgs_option seems mandatory, I just passed an empty fofn file.

This is the config file I provided with the intention of running 5 polishing iterations:

[General]
job_type = local
job_prefix = pacbio-polish-5-assembly
task = 55555
rewrite = yes
rerun = 2
parallel_jobs = 1
multithread_jobs = 24
genome = ./assembly.fa
genome_size = auto
workdir = ./polished_out
polish_options = -p {multithread_jobs}

[sgs_option]
sgs_fofn = ./assembly-illumina.fofn
sgs_options = -max_depth 100

[lgs_option]
lgs_fofn = ./assembly-pacbio-corr.fofn
lgs_options = -min_read_len 10k -max_read_len 150k -max_depth 60
lgs_minimap2_options = -x map-pb -t 6

I am getting this error after a minute:
[INFO] 2020-05-15 18:07:40,902 start...
[INFO] 2020-05-15 18:07:40,903 logfile: pid19519.log.info
[WARNING] 2020-05-15 18:07:40,922 Re-write workdir
[INFO] 2020-05-15 18:07:45,632 scheduled tasks:
[5, 5, 5, 5, 5]
[INFO] 2020-05-15 18:07:45,632 options:
[INFO] 2020-05-15 18:07:45,633 {'polish_options': '-p 24', 'rewrite': 1, 'job_prefix': 'pacbio-polish-5-assembly', 'job_type': 'local', 'cluster_options': '', 'snp_valid': '/home/devel/my_genome__NextPolish_v1.0.3_long_reads_5_iterations/./polished_out/%02d.snp_valid', 'kmer_count': '/home/devel/my_genome__NextPolish_v1.0.3_long_reads_5_iterations/./polished_out/%02d.kmer_count', 'lgs_fofn': '/home/devel/my_genome__NextPolish_v1.0.3_long_reads_5_iterations/./assembly-pacbio-corr.fofn', 'sgs_max_depth': '100', 'sgs_block_size': 500000000, 'lgs_max_read_len': '150k', 'parallel_jobs': '1', 'multithread_jobs': '24', 'snp_phase': '/home/devel/my_genome__NextPolish_v1.0.3_long_reads_5_iterations/./polished_out/%02d.snp_phase', 'genome': '/home/devel/my_genome__NextPolish_v1.0.3_long_reads_5_iterations/./assembly.fa', 'genome_size': 2464389961L, 'workdir': '/home/devel/my_genome__NextPolish_v1.0.3_long_reads_5_iterations/./polished_out', 'cleantmp': 0, 'sgs_align_options': 'minimap2 -a -x sr -t 24', 'sgs_unpaired': '0', 'sgs_fofn': '/home/devel/my_genome__NextPolish_v1.0.3_long_reads_5_iterations/./assembly-illumina.fofn', 'align_threads': 24, 'sgs_use_duplicate_reads': 0, 'score_chain': '/home/devel/my_genome__NextPolish_v1.0.3_long_reads_5_iterations/./polished_out/%02d.score_chain', 'task': [5, 5, 5, 5, 5], 'lgs_max_depth': '60', 'lgs_block_size': 500000000, 'lgs_minimap2_options': '-x map-pb -t 6', 'rerun': 2, 'lgs_min_read_len': '1k'}
[INFO] 2020-05-15 18:07:45,633 step 0 and task 5 start:
[ERROR] 2020-05-15 18:07:45,633 Error, task only accept: [all,1,2,3,4]

Is it possible to polish with just the PacBio data? Am I doing something wrong?

Thanks,
F.

Maybe a small bug?

Hi!
Thanks for maintaining NextPolish.
I think the issue below, which occurs when I try to polish with NGS data, is just a small one:
Error message

Traceback (most recent call last):
  File "/public/home/wusong/sfw/NextPolish/nextPolish", line 460, in <module>
    main(args)
  File "/public/home/wusong/sfw/NextPolish/nextPolish", line 358, in main
    task = Task(set_task(cfg, 'map_genome', path = path, genomefile = genomefile, gtask = gtask), prefix = 'map_genome', convertpath = False)
  File "/public/home/wusong/sfw/NextPolish/nextPolish", line 33, in set_task
    cmd = map_genome(cfg, genomefile = args['genomefile'], task = args['gtask'])
  File "/public/home/wusong/sfw/NextPolish/nextPolish", line 144, in map_genome
    THREADS = str(cfg['align_threads']) if cfg['align_threads'] < 5 else '5'
TypeError: '<' not supported between instances of 'str' and 'int'

I am using NextPolish 1.2.4.
I changed the code to
THREADS = str(cfg['align_threads']) if int(cfg['align_threads']) < 5 else '5'
and it works.
I am not sure whether this is caused by the requirements or something else.
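
The traceback is consistent with align_threads reaching this line as a string read from run.cfg rather than as an integer. A minimal sketch of the same fix as a helper that converts once and then caps the value; the name cap_threads is illustrative, the limit of 5 is taken from the quoted line, and none of this is NextPolish's own code:

```python
def cap_threads(value, limit=5):
    """Convert a config value (str or int) to int, cap it, and return a str."""
    threads = int(value)  # accepts both '2' and 2
    return str(min(threads, limit))
```

Converting at the point where the config is read, rather than at each comparison, would keep later code from hitting the same str/int mismatch.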

Best wishes
Song

I think there is a bug in nextpolish2.py

Hi sir,

I recently used NextDenovo to get a draft genome, so I decided to polish it with NextPolish.
I set "task=551212", "polish_options = -p 30" and "parallel_jobs = 3" in run.cfg.
When nextPolish reached polish.ref.sh.work in the 00.lgs_polish step, I found that each nextpolish2.py job was using almost 100 threads at the beginning of the process.
I had to stop it and run the jobs one by one.
After this, I resumed the whole polishing run by typing the command "nextPolish run.cfg".
The problem arose again when nextPolish reached polish.ref.sh.work in the 01.lgs_polish step.
I opened nextPolish.sh and found that the -p option is set to 30.
So I think this may be a bug in nextpolish2.py.
Maybe it is because my genome is huge, about 2 Gb?
Besides, I just downloaded version 1.2.3, and the same thing happened again.

Hoping for your reply, thanks.

[ERROR] 2020-01-13 15:40:14,058 Error, task only accept: [all,1,2,3,4]

Hello,

I have an issue with NextPolish when I try to use both long and short reads. Here is my run.cfg:

[General]
job_type = local
job_prefix = NP
task = 555512121212
rewrite = no
rerun = 10
parallel_jobs = 30
multithread_jobs = 30
genome = ref.fasta
genome_size = auto
polish_options = -p 30

[sgs_option]
sgs_fofn = ./sgs.fofn
sgs_options = -max_depth 500 -bwa

[lgs_option]
lgs_fofn = ./lgs.fofn    # input long reads file, one file one line.
lgs_options = -min_read_len 1k -max_read_len 150k -max_depth 200
lgs_minimap2_options = -x map-ont -t 20

However, it seems that NextPolish does not recognize the task "5":

[WARNING] 2020-01-13 15:40:14,058 mv ref.fasta to /path/to/4_NP.backup0
[INFO] 2020-01-13 15:40:14,058 step 0 and task 5 start:               
[ERROR] 2020-01-13 15:40:14,058 Error, task only accept: [all,1,2,3,4]

I also have this issue when using the --help / -h argument

./NextPolish/nextPolish -h                      
  File "../nextPolish", line 44                                  
    print '\033[35mPlease update to the latest version: %s, current version: %s \033[0m' % (latest, ver)
SyntaxError: invalid syntax
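
The quoted line uses the Python 2 print statement, so this SyntaxError is what a Python 3 interpreter reports when it parses the script. A minimal sketch of the same message written so that both interpreters accept it; update_notice is a hypothetical helper and the version strings are placeholders, not values from the script:

```python
def update_notice(latest, ver):
    """Build the coloured update message from the quoted line."""
    return ('\033[35mPlease update to the latest version: %s, '
            'current version: %s \033[0m' % (latest, ver))


# print(...) with a single argument is valid in both Python 2 and Python 3.
print(update_notice('LATEST', 'CURRENT'))
```

Running the original script with a Python 2 interpreter, or using a release that supports Python 3, avoids the error without editing the code.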

I downloaded and installed it as described on GitHub.

Cheers,

AH

make error

make -C util;
make[1]: Entering directory `/public-supool/home/shenchen/software/NextPolish/util'
make[2]: Entering directory `/public-supool/home/shenchen/software/NextPolish/util/bwa'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/public-supool/home/shenchen/software/NextPolish/util/bwa'
fatal: Not a git repository: ../../.git/modules/util/samtools
make[2]: Entering directory `/public-supool/home/shenchen/software/NextPolish/util/samtools'
cd ../../lib/htslib/ && make lib-static
fatal: Not a git repository: ../../.git/modules/lib/htslib
fatal: Not a git repository: ../../.git/modules/lib/htslib
make[3]: Entering directory `/public-supool/home/shenchen/software/NextPolish/lib/htslib'
gcc -g -Wall -O2 -I.  -c -o cram/cram_io.o cram/cram_io.c
cram/cram_io.c:61:18: fatal error: lzma.h: No such file or directory
 #include <lzma.h>
                  ^
compilation terminated.
make[3]: *** [cram/cram_io.o] Error 1
make[3]: Leaving directory `/public-supool/home/shenchen/software/NextPolish/lib/htslib'
make[2]: *** [../../lib/htslib//libhts.a] Error 2
make[2]: Leaving directory `/public-supool/home/shenchen/software/NextPolish/util/samtools'
make[1]: *** [samtools_] Error 2
make[1]: Leaving directory `/public-supool/home/shenchen/software/NextPolish/util'
make: *** [all] Error 2

Operating system
Which operating system and version are you using?
You can use the command lsb_release -a to get it.
centos7

GCC 4.9.0

NextPolish
What version of NextPolish are you using?
You can use the command nextPolish -v to get it.
nextPolish v1.2.4

ctypes.ArgumentError: argument 1: <class 'TypeError'>: wrong type

Running nextPolish v1.0.5

Execution of nextPolish

(base) berombau@agro:/data/dp1/public/output/nextpolish$ nextPolish run.cfg                                          
[INFO] 2019-12-08 20:45:12,272 start...                                                                              
[INFO] 2019-12-08 20:45:12,308 logfile: pid29840.log.info                                                            
[WARNING] 2019-12-08 20:45:12,309 Re-write workdir                                                                   
[INFO] 2019-12-08 20:45:12,309 scheduled tasks:                                                                      
[1, 2, 1, 2]                                                                                                         
[INFO] 2019-12-08 20:45:12,309 options:                                                                              
[INFO] 2019-12-08 20:45:12,309 {'polish_options': '-p 2', 'rewrite': 1, 'job_prefix': 'nextPolish', 'job_type': 'local', 'cluster_options': '', 'snp_valid': '/data/dp1/public/output/nextpolish/./01_rundir/%02d.snp_valid', 'kmer_count': '/data/dp1/public/output/nextpolish/./01_rundir/%02d.kmer_count', 'lgs_fofn': 0, 'sgs_max_depth': '100', 'sgs_block_size': 500000000, 'lgs_max_read_len': '150k', 'parallel_jobs': '10', 'multithread_jobs': '2', 'snp_phase': '/data/dp1/public/output/nextpolish/./01_rundir/%02d.snp_phase', 'genome': '/data/dp1/public/output/medaka/consensus.fasta', 'genome_size': 300000000, 'workdir': '/data/dp1/public/output/nextpolish/./01_rundir', 'cleantmp': 0, 'sgs_align_options': 'minimap2 --split-prefix tmp -a -x sr -t 2', 'sgs_unpaired': '0', 'sgs_fofn': '/data/dp1/public/output/nextpolish/sgs.fofn', 'align_threads': '2', 'sgs_use_duplicate_reads': 0, 'score_chain': '/data/dp1/public/output/nextpolish/./01_rundir/%02d.score_chain', 'task': [1, 2, 1, 2], 'lgs_max_depth': '60', 'lgs_block_size': '500M', 'lgs_minimap2_options': '-x map-ont', 'rerun': 3, 'lgs_min_read_len': '1k'}
[INFO] 2019-12-08 20:45:12,309 step 0 and task 1 start:                                                              
[INFO] 2019-12-08 20:45:12,310 analysis tasks done                                                                   
[INFO] 2019-12-08 20:45:12,311 total jobs: 3                                                                         
[INFO] 2019-12-08 20:45:12,312 Throw jobID:[29841] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/01.db_split.sh.work/db_split0/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 20:45:12,814 Throw jobID:[29849] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 20:45:13,316 Throw jobID:[29858] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:27,938 db_split done
[INFO] 2019-12-08 21:47:27,940 analysis tasks done
[INFO] 2019-12-08 21:47:27,986 total jobs: 10
[INFO] 2019-12-08 21:47:27,987 Throw jobID:[36460] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome00/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:28,489 Throw jobID:[36469] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome01/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:28,991 Throw jobID:[36481] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome02/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:29,494 Throw jobID:[36490] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome03/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:29,996 Throw jobID:[36499] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome04/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:30,499 Throw jobID:[36508] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome05/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:31,002 Throw jobID:[36517] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome06/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:31,504 Throw jobID:[36526] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome07/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:32,007 Throw jobID:[36535] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome08/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 21:47:32,509 Throw jobID:[36544] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/02.map.ref.sh.work/map_genome09/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-08 22:55:03,556 align_genome done
[INFO] 2019-12-08 22:55:03,628 analysis tasks done
[INFO] 2019-12-08 22:55:03,685 total jobs: 1
[INFO] 2019-12-08 22:55:03,686 Throw jobID:[14907] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/03.merge.bam.sh.work/merge_bam0/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:50,573 merge_bam done
[INFO] 2019-12-09 00:00:50,637 analysis tasks done
[INFO] 2019-12-09 00:00:50,683 total jobs: 10
[INFO] 2019-12-09 00:00:50,684 Throw jobID:[5428] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome00/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:51,186 Throw jobID:[5434] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome01/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:51,689 Throw jobID:[5440] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome02/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:52,191 Throw jobID:[5446] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome03/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:52,694 Throw jobID:[5453] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome04/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:53,196 Throw jobID:[5459] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome05/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:53,699 Throw jobID:[5465] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome06/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:54,201 Throw jobID:[5471] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome07/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:54,704 Throw jobID:[5477] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome08/nextPolish.sh] in the local_cycle.
[INFO] 2019-12-09 00:00:55,206 Throw jobID:[5483] jobCmd:[/data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome09/nextPolish.sh] in the local_cycle.
[ERROR] 2019-12-09 00:01:00,217 polish_genome failed: please check the following logs:
[ERROR] 2019-12-09 00:01:00,218 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome00/nextPolish.sh.e
[ERROR] 2019-12-09 00:01:00,283 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome01/nextPolish.sh.e
[ERROR] 2019-12-09 00:01:00,283 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome02/nextPolish.sh.e
[ERROR] 2019-12-09 00:01:00,283 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome03/nextPolish.sh.e
[ERROR] 2019-12-09 00:01:00,284 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome04/nextPolish.sh.e
[ERROR] 2019-12-09 00:01:00,284 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome05/nextPolish.sh.e
[ERROR] 2019-12-09 00:01:00,284 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome06/nextPolish.sh.e
[ERROR] 2019-12-09 00:01:00,284 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome07/nextPolish.sh.e
[ERROR] 2019-12-09 00:01:00,284 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome08/nextPolish.sh.e
[ERROR] 2019-12-09 00:01:00,284 /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome09/nextPolish.sh.e

Tail of the error:

(base) berombau@agro:/data/dp1/public/output/nextpolish$ tail /data/dp1/public/output/nextpolish/01_rundir/00.score_chain/04.polish.ref.sh.work/polish_genome09/nextPolish.sh.e
[INFO] 2019-12-09 00:00:55,274 Namespace(bam_lgs='/data/dp1/public/output/nextpolish/./01_rundir/00.score_chain/lgs.sort.bam', bam_sgs='/data/dp1/public/output/nextpolish/./01_rundir/00.score_chain/sgs.sort.bam', block='/data/dp1/public/output/nextpolish/./01_rundir/00.score_chain/input.genome.fasta.blc', block_index='9', count_read_ins_sgs=10000, debug=False, ext_len_edge=2, genome='/data/dp1/public/output/nextpolish/./01_rundir/00.score_chain/input.genome.fasta', indel_balance_factor_lgs=0.33, indel_balance_factor_sgs=0.5, max_clip_ratio_lgs=0.4, max_clip_ratio_sgs=0.15, max_count_kmer=50, max_indel_factor_lgs=0.21, max_ins_fold_sgs=5, max_ins_len_sgs=10000, max_len_kmer=50, max_snp_factor_lgs=0.53, max_variant_count_lgs=150000, min_count_ratio_skip=0.8, min_count_snp=5, min_count_snp_link=5, min_depth_snp=3, min_len_inter_kmer=5, min_len_ldr=3, min_map_quality=0, min_snp_factor_sgs=0.34, out='genome.nextpolish.part009.fasta', ploidy=2, process=2, task=1, trim_len_edge=2)
Traceback (most recent call last):
  File "/data/dp1/local/NextPolish/lib/nextPolish.py", line 331, in <module>
    main(args)
  File "/data/dp1/local/NextPolish/lib/nextPolish.py", line 216, in main
    CFG = P.config_init(args.genome, args.bam_sgs, args.bam_lgs)
ctypes.ArgumentError: argument 1: <class 'TypeError'>: wrong type
Command exited with non-zero status 1
0.05user 0.01system 0:00.07elapsed 98%CPU (0avgtext+0avgdata 12704maxresident)k
0inputs+0outputs (0major+1911minor)pagefaults 0swaps
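
Under Python 3, ctypes raises this ArgumentError when a str is passed to a C function whose parameter is declared as c_char_p, because that parameter type expects bytes. Whether that is the cause in this run is an assumption, since the traceback does not show how config_init is declared. A minimal sketch on a POSIX system, using the C library's strlen as a stand-in for P.config_init:

```python
import ctypes
import ctypes.util

# Load the C standard library; strlen stands in for the real C function.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(path):
    """Call strlen, encoding str arguments to bytes first."""
    if isinstance(path, str):
        path = path.encode("utf-8")
    return libc.strlen(path)


try:
    libc.strlen("genome.fasta")  # a str, not bytes: rejected by ctypes
except ctypes.ArgumentError as err:
    print("rejected:", err)

print(c_strlen("genome.fasta"))  # the encoded call goes through
```

If this is what happens here, encoding the path arguments before the call, or running the script under the Python version it was written for, would be the two directions to try.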

Error when installed

Hi,
When I installed NextPolish-1.0.3, there were some errors, as follows:

WeChat Image_20190725100149
Could you give me some advice on how to solve it? Thank you very much!
Wang

Parameters for PacBio long reads polishing

Hi,
I have PacBio reads at ~100x depth and a genome assembly generated by Flye, and I want to use NextPolish to improve my assembly. However, the parameters in the example run.cfg seem to be intended for Nanopore data, so I would like some advice on parameters for PacBio data.

Thx!
gr

polishing strategy and rounds

Hi,

I have an assembly generated using wtdbg2, which can be polished with wtpoa-cns. Will I get a better result by performing additional long- or short-read polishing steps with NextPolish? Would using different polishing tools produce conflicting results?
How many rounds are needed to reach the best result, and how can I determine that?

nextpolish1.py generates sequences with the wrong ID

Hi sir
I tried the newest NextPolish and found that nextpolish1.py generates a genome file in which every sequence is named "_np1".
I checked the test_data result, and the same problem is there as well.
Because of this, samtools cannot index the file correctly, so we can't get the right blc file either.
The problem does not stop the whole process, but the final file ends up containing a single sequence.
The previous version did not have this problem.

Is there a way to solve this in the newest version?

Waiting for your reply.
Thanks a lot.

type error '<' not supported between instances of 'str' and 'int'

Hi,
It looks like the db_split step takes a long time to run. May I skip the split step?
After the split step finished, the genome mapping step reported a TypeError related to align_threads.

File "/home/usr1/Documents/biosoft/NextPolish/nextPolish", line 144, in map_genome
THREADS = str(cfg['align_threads']) if cfg['align_threads'] < 5 else '5'
TypeError: '<' not supported between instances of 'str' and 'int'

Any suggestions?

What is a proper "-min_read_len" option?

Hi sir, sorry to bother you again.

I used NextDenovo to assemble my genome, and the recommended seed_cutoff, generated by seq_stat with "-d 45", was about 30k.
I got a pretty good assembly with a long N50.
Now I want to use NextPolish to polish my draft genome with my ONT and Illumina reads.
When I started to set up the NextPolish config, I became a little confused.
I see that "-min_read_len" here is 10k, but the default is 1k.
So how should I choose a proper value? Or should I use seq_stat to derive "-min_read_len" from the "-max_depth" in lgs_options?

Thanks a lot :)

I got a simple question to ask :)

Hi guys
I have just started doing genome assembly, so I am a bit confused about the data types.

In the README, should the long reads used to polish the genome be the corrected reads produced by NextDenovo?

And should the short reads be Illumina reads?

Do I understand it correctly?

Thanks a lot!

Two bugs for Nextpolish v1.2.4

Describe the bug
(1) bug1: make fails during bwa compilation on my Arch Linux workstation, but not on another CentOS computer.

Solution: I replaced the source folder with bwa from git clone (git clone https://github.com/lh3/bwa.git), and make now succeeds. The bwa folder from https://nchc.dl.sourceforge.net/project/bio-bwa/bwa-0.7.17.tar.bz2 does not work.

(2) bug2: the test fails when I run ./nextPolish test_data/run.cfg. If long Nanopore reads are not used, the error disappears and the test runs through.

Solution: use NextPolish v1.2.1 instead. Other versions have problems with long-read correction.

Error message
(1) Bug1

[hui@archlinux NextPolish]$ make -j 72
mkdir /home/hui/software/NextPolish/test5/test4/NextPolish/bin
make -C util;
make[1]: warning: jobserver unavailable: using -j1.  Add '+' to parent make rule.
make[1]: Entering directory '/home/hui/software/NextPolish/test5/test4/NextPolish/util'
gcc -Wall -O3 -s -pthread -o seq_split seq_split.c thpool.c -lz
seq_split.c: In function ‘get_fp_index’:
seq_split.c:74:6: warning: overflow in conversion from ‘uint32_t’ {aka ‘unsigned int’} to ‘int’ changes value from ‘count = 4294967295’ to ‘-1’ [-Woverflow]
74 | k = count = -1;
| ^~~~~
seq_split.c: In function ‘main’:
seq_split.c:369:25: warning: ‘sprintf’ may write a terminating nul past the end of the destination [-Wformat-overflow=]
369 | sprintf(opt.out, "%s/%s", opt.outdir, opt.outpre);
| ^
seq_split.c:369:2: note: ‘sprintf’ output 2 or more bytes (assuming 1025) into a destination of size 1024
369 | sprintf(opt.out, "%s/%s", opt.outdir, opt.outpre);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
seq_split.c:158:19: warning: ‘%03d’ directive writing between 3 and 10 bytes into a region of size between 0 and 1023 [-Wformat-overflow=]
158 | sprintf(fn, "%s.%03d.%s", opt->out, i, suffix);
| ^~~~
seq_split.c:158:15: note: directive argument in the range [0, 2147483647]
158 | sprintf(fn, "%s.%03d.%s", opt->out, i, suffix);
| ^~~~~~~~~~~~
seq_split.c:158:3: note: ‘sprintf’ output between 14 and 1044 bytes into a destination of size 1024
158 | sprintf(fn, "%s.%03d.%s", opt->out, i, suffix);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gcc -Wall -O3 -s -std=c99 -o seq_count seq_count.c -lz
make[2]: Entering directory '/home/hui/software/NextPolish/test5/test4/NextPolish/util/bwa'
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS utils.c -o utils.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS kthread.c -o kthread.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS kstring.c -o kstring.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS ksw.c -o ksw.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwt.c -o bwt.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bntseq.c -o bntseq.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwa.c -o bwa.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwamem.c -o bwamem.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwamem_pair.c -o bwamem_pair.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwamem_extra.c -o bwamem_extra.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS malloc_wrap.c -o malloc_wrap.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS QSufSort.c -o QSufSort.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwt_gen.c -o bwt_gen.o
bwt_gen.c: In function ‘BWTIncBuildRelativeRank’:
bwt_gen.c:879:10: warning: variable ‘oldInverseSa0RelativeRank’ set but not used [-Wunused-but-set-variable]
879 | bgint_t oldInverseSa0RelativeRank = 0;
| ^~~~~~~~~~~~~~~~~~~~~~~~~
bwt_gen.c: 在函数‘BWTIncMergeBwt’中:
bwt_gen.c:953:15: 警告:变量‘bitsInWordMinusBitPerChar’被设定但未被使用 [-Wunused-but-set-variable]
953 | unsigned int bitsInWordMinusBitPerChar;
| ^~~~~~~~~~~~~~~~~~~~~~~~~
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS rope.c -o rope.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS rle.c -o rle.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS is.c -o is.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwtindex.c -o bwtindex.o
ar -csru libbwa.a utils.o kthread.o kstring.o ksw.o bwt.o bntseq.o bwa.o bwamem.o bwamem_pair.o bwamem_extra.o malloc_wrap.o QSufSort.o bwt_gen.o rope.o rle.o is.o bwtindex.o
ar: `u' modifier ignored since `D' is the default (see `U')
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwashm.c -o bwashm.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwase.c -o bwase.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwaseqio.c -o bwaseqio.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwtgap.c -o bwtgap.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwtaln.c -o bwtaln.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bamlite.c -o bamlite.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwape.c -o bwape.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS kopen.c -o kopen.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS pemerge.c -o pemerge.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS maxk.c -o maxk.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwtsw2_core.c -o bwtsw2_core.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwtsw2_main.c -o bwtsw2_main.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwtsw2_aux.c -o bwtsw2_aux.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwt_lite.c -o bwt_lite.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwtsw2_chain.c -o bwtsw2_chain.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS fastmap.c -o fastmap.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwtsw2_pair.c -o bwtsw2_pair.o
gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS main.c -o main.o
gcc -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwashm.o bwase.o bwaseqio.o bwtgap.o bwtaln.o bamlite.o bwape.o kopen.o pemerge.o maxk.o bwtsw2_core.o bwtsw2_main.o bwtsw2_aux.o bwt_lite.o bwtsw2_chain.o fastmap.o bwtsw2_pair.o main.o -o bwa -L. -lbwa -lm -lz -lpthread -lrt
/usr/bin/ld: ./libbwa.a(rope.o):/home/hui/software/NextPolish/test5/test4/NextPolish/util/bwa/rle.h:33: multiple definition of `rle_auxtab'; ./libbwa.a(bwtindex.o):/home/hui/software/NextPolish/test5/test4/NextPolish/util/bwa/rle.h:33: first defined here
/usr/bin/ld: ./libbwa.a(rle.o):/home/hui/software/NextPolish/test5/test4/NextPolish/util/bwa/rle.h:33: multiple definition of `rle_auxtab'; ./libbwa.a(bwtindex.o):/home/hui/software/NextPolish/test5/test4/NextPolish/util/bwa/rle.h:33: first defined here
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:30: bwa] Error 1
make[2]: Leaving directory '/home/hui/software/NextPolish/test5/test4/NextPolish/util/bwa'
make[1]: *** [Makefile:19: bwa_] Error 2
make[1]: Leaving directory '/home/hui/software/NextPolish/test5/test4/NextPolish/util'
make: *** [Makefile:18: all] Error 2
[hui@archlinux NextPolish]$

(2) Bug2
[hui@archlinux NextPolish]$ ./nextPolish test_data/run.cfg
[INFO] 2020-07-28 01:30:46,296 start...
[INFO] 2020-07-28 01:30:46,297 logfile: pid1443877.log.info
[WARNING] 2020-07-28 01:30:46,298 Re-write workdir
[INFO] 2020-07-28 01:30:46,303 scheduled tasks:
[5, 1, 2]
[INFO] 2020-07-28 01:30:46,303 options:
[INFO] 2020-07-28 01:30:46,303 {'polish_options': '-p 3', 'rewrite': 1, 'job_prefix': 'nextPolish', 'job_type': 'local', 'cluster_options': '', 'snp_valid': '/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/%02d.snp_valid', 'kmer_count': '/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/%02d.kmer_count', 'lgs_fofn': '/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./lgs.fofn', 'sgs_max_depth': '100', 'align_threads': 3, 'sgs_block_size': 5556450L, 'lgs_max_read_len': '150k', 'parallel_jobs': '2', 'multithread_jobs': '3', 'snp_phase': '/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/%02d.snp_phase', 'genome': '/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./raw.genome.fasta', 'genome_size': 111129L, 'workdir': '/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir', 'cleantmp': 0, 'sgs_align_options': 'bwa mem -p -t 3', 'sgs_unpaired': '0', 'sgs_fofn': '/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./sgs.fofn', 'lgs_polish': '/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/%02d.lgs_polish', 'sgs_use_duplicate_reads': 0, 'score_chain': '/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/%02d.score_chain', 'task': [5, 1, 2], 'lgs_max_depth': '60', 'lgs_block_size': 3333870L, 'lgs_minimap2_options': '-x map-ont -t 3', 'rerun': 3, 'lgs_min_read_len': '1k'}
[INFO] 2020-07-28 01:30:46,304 step 0 and task 5 start:
[INFO] 2020-07-28 01:30:46,305 analysis tasks done
[INFO] 2020-07-28 01:30:46,309 total jobs: 4
[INFO] 2020-07-28 01:30:46,311 Throw jobID:[1443910] jobCmd:[/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split0/nextPolish.sh] in the local_cycle.
[INFO] 2020-07-28 01:30:46,814 Throw jobID:[1443918] jobCmd:[/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh] in the local_cycle.
[INFO] 2020-07-28 01:30:49,124 Throw jobID:[1443927] jobCmd:[/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split2/nextPolish.sh] in the local_cycle.
[INFO] 2020-07-28 01:30:49,676 Throw jobID:[1443940] jobCmd:[/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split3/nextPolish.sh] in the local_cycle.
[INFO] 2020-07-28 01:30:50,687 db_split done
[INFO] 2020-07-28 01:30:50,688 analysis tasks done
[INFO] 2020-07-28 01:30:50,691 total jobs: 2
[INFO] 2020-07-28 01:30:50,693 Throw jobID:[1443947] jobCmd:[/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/02.map.ref.sh.work/map_genome0/nextPolish.sh] in the local_cycle.
[INFO] 2020-07-28 01:30:51,196 Throw jobID:[1443959] jobCmd:[/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/02.map.ref.sh.work/map_genome1/nextPolish.sh] in the local_cycle.
[INFO] 2020-07-28 01:30:53,313 align_genome done
[INFO] 2020-07-28 01:30:53,314 analysis tasks done
[INFO] 2020-07-28 01:30:53,317 total jobs: 1
[INFO] 2020-07-28 01:30:53,319 Throw jobID:[1444012] jobCmd:[/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/03.merge.bam.sh.work/merge_bam0/nextPolish.sh] in the local_cycle.
[INFO] 2020-07-28 01:30:54,974 merge_bam done
[INFO] 2020-07-28 01:30:54,975 analysis tasks done
[INFO] 2020-07-28 01:30:54,979 total jobs: 2
[INFO] 2020-07-28 01:30:54,981 Throw jobID:[1444026] jobCmd:[/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome0/nextPolish.sh] in the local_cycle.
[INFO] 2020-07-28 01:30:55,483 Throw jobID:[1444035] jobCmd:[/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome1/nextPolish.sh] in the local_cycle.
[ERROR] 2020-07-28 01:30:56,672 polish_genome failed: please check the following logs:
[ERROR] 2020-07-28 01:30:56,672 /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome0/nextPolish.sh.e
[ERROR] 2020-07-28 01:30:56,672 /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome1/nextPolish.sh.e
[hui@archlinux NextPolish]$less /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome0/nextPolish.sh.e

hostname
+ hostname
cd /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome0
+ cd /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome0
/home/hui/.pyenv/versions/2.7.14/bin/python /home/hui/software/NextPolish/test5/test5/NextPolish/lib/nextpolish2.py -s -p 3 -g /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta -b /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc -i 0 -l /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list -r ont -o genome.nextpolish.part000.fasta
+ /home/hui/.pyenv/versions/2.7.14/bin/python /home/hui/software/NextPolish/test5/test5/NextPolish/lib/nextpolish2.py -s -p 3 -g /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta -b /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc -i 0 -l /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list -r ont -o genome.nextpolish.part000.fasta
[INFO] 2020-07-28 01:30:55,054 Corrected step options:
[INFO] 2020-07-28 01:30:55,054 Namespace(alignment_identity_ratio=0.8, alignment_score_ratio=0.8, auto=True, bam_list='/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list', block='/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc', block_index='0', genome='/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta', out='genome.nextpolish.part000.fasta', process=3, read_type=1, split=0, window=5000000)
/home/hui/software/NextPolish/test5/test5/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome0/nextPolish.sh: line 5: 1444030 Segmentation fault (core dumped) /home/hui/.pyenv/versions/2.7.14/bin/python /home/hui/software/NextPolish/test5/test5/NextPolish/lib/nextpolish2.py -s -p 3 -g /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta -b /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc -i 0 -l /home/hui/software/NextPolish/test5/test5/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list -r ont -o genome.nextpolish.part000.fasta

Operating system
[hui@archlinux NextPolish]$ uname -a
Linux archlinux 5.7.5-arch1-1 #1 SMP PREEMPT Mon, 22 Jun 2020 08:10:02 +0000 x86_64 GNU/Linux

GCC
[hui@archlinux NextPolish]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/10.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --with-isl --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-install-libiberty --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-libunwind-exceptions --disable-werror gdc_include_dir=/usr/include/dlang/gdc
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.1.0 (GCC)

Python
I am using pyenv, so I tried both python 2.7.14 and python 3.8.0.

NextPolish
v1.2.4

To Reproduce (Optional)
Steps to reproduce the behavior. Providing a minimal test dataset on which we can reproduce the behavior will generally lead to quicker turnaround time!

Additional context (Optional)
Add any other context about the problem here.

short reads polishing stopped for the second round

Hi,
I am using NextPolish on my genome, but it stalled in the second round of short-read polishing. I used the same command as in the first round and supplied the sorted BAM file. In my experience, polishing a genome with short reads does not take long, but this run has been going for 7 days without making progress. My command is python /he_lab/share/data/local/NextPolish/lib/nextpolish1.py -g ./wtd_assem_short_pol1.fasta -t 1 -p 20 -s SGS_mapped_sorted2.bam -o wtd_assem_short_pol2.fasta. Could you give me some advice?

Question about NextPolish

Hi developer,
I have some questions about NextPolish and would appreciate your suggestions:
1. Does NextPolish use base quality information when run with raw Nanopore data?
2. Your colleague told me that NextPolish is faster than Nanopolish. What did you improve to make it so much faster?
3. Does NextPolish only produce the consensus sequence, or other output as well?
4. How does NextPolish polish indels with Illumina data? In the end, will the indels resemble the Illumina data or the Nanopore data?
Thank you very much!

Errors or warnings?

Hi,
When I used NextPolish, there were some errors. What is the matter?

[image: samtool]
[image: capture]

Could you give me some advice on how to solve it? Thank you!

Wang

Has the bug been fixed?

I ran into the same problem mentioned in #34. Has the bug been fixed by now? If not, is there a workaround?

job_type = local

Hi, I tested NextPolish locally, but it does not work. The error is:
db_split failed: please check the following logs:
[ERROR] 2019-11-15 10:10:16,795 */NextPolish/test_data/01_rundir/00.score_chain/01.db_split.sh.work/db_split0/nextPolish.sh.e
[ERROR] 2019-11-15 10:10:16,795 */NextPolish/test_data/01_rundir/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e
[ERROR] 2019-11-15 10:10:16,795 */NextPolish/test_data/01_rundir/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e

What do the lowercase letters mean?

Hi,

I found that the genome contains some bases in lowercase letters after polishing. I would like to know what these lowercase letters represent: are they repeat sequences or sequences corrected by NextPolish?

In addition, I found that the genome is larger than the original after polishing (about 10 Mb longer for a 400 Mb genome). Did NextPolish extend the original genome using shotgun reads?

Best

Nextpolish unfinished

Question or Expected behavior
I launched NextPolish to correct a genome with PacBio data. I got a partial correction, but the process never ends.

This is the command:
$ singularity exec --bind /data1/nextpolish:/mnt /home/ulg/Documents/program/test-singu/nphase/nextpolish.sif python /opt/NextPolish/lib/nextpolish2.py -g /mnt/60444-hybrid-complete.fasta -l /mnt/pb.map.bam.fofn -r clr -p 25 -a -s -o /mnt/pb.asm.nextpolish1.fa

It produced a partial pb.asm.nextpolish1.fa with 6389 contigs (the genome has 6402 contigs). I got this message, and nothing has happened for 2 days:
gap_aln:0 sup_aln:0 depth_cluster:0 gap_cluster:0 split_count:0 bin_len:0 median_depth:0

Operating system

Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic

Singularity 3, CentOS 7

GCC

$ singularity exec --bind /data1/nextpolish:/mnt /home/ulg/Documents/program/test-singu/nphase/nextpolish.sif gcc -v

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC)

Python
What version of Python are you using?
$ singularity exec --bind /data1/nextpolish:/mnt /home/ulg/Documents/program/test-singu/nphase/nextpolish.sif python --version

Python 3.6.8

NextPolish
What version of NextPolish are you using?
$ singularity exec --bind /data1/nextpolish:/mnt /home/ulg/Documents/program/test-singu/nphase/nextpolish.sif nextPolish -v

nextPolish v1.2.4

Additional context (Optional)
My computer has 64 GB of RAM. Could that be the issue (for a 1.5 Gb genome)?

Thanks for the help,
Luc

Error occurred in test

Dear NextPolish developer,

Thank you for providing us with such a nice tool.

However, when I ran the test in the NextPolish folder, the following error occurred:

"File "../nextPolish", line 44
print '\033[35mPlease update to the latest version: %s, current version: %s \033[0m' % (latest, ver)
^
SyntaxError: invalid syntax"

I don't know why. Could you give me some advice?

Thank you again.

Ar
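
The `SyntaxError` above is what Python 3 reports for a Python 2 `print` statement, so the script was most likely started by a Python 3 interpreter. A minimal sketch of the same message written so that it parses under both Python 2 and Python 3 follows; the function name `version_notice` and the version strings are hypothetical and not taken from NextPolish.

```python
from __future__ import print_function  # no-op on Python 3, enables print() on Python 2


def version_notice(latest, ver):
    """Build the update notice; the ANSI codes colour it magenta in a terminal."""
    return '\033[35mPlease update to the latest version: %s, current version: %s \033[0m' % (latest, ver)


if __name__ == '__main__':
    # Placeholder version strings, for illustration only.
    print(version_notice('vX.Y.Z', 'vA.B.C'))
```

Running the original script with a Python 2 interpreter would avoid the error as well, since the statement form of `print` is valid there.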

job halted while running 05.polish.ref.sh

Hi,
I was running NextPolish, and when it reached the 05.polish.ref.sh step, the job could not finish and move on to the next step. The terminal showed the job was still running and gave no error, but the program was using no CPU or memory. The run.cfg is as follows:

[General]
job_type = local
job_prefix = nextPolish
task = 1212
rewrite = yes
rerun = 3
parallel_jobs = 5
multithread_jobs = 2
genome = /home/user/necat/stylo/medaka/polished.assembly.fasta
genome_size = 1200000000
workdir = ./01_rundir
polish_options = -p {multithread_jobs}

[sgs_option]
sgs_fofn = ./sgs.fofn
sgs_options = -max_depth 100
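
The `task` value in a run.cfg like this one is a string of digits, and NextPolish logs report the schedule as a list such as `scheduled tasks: [5, 1, 2]`. A rough sketch of that expansion, assuming each character is one step code; the helper name `expand_tasks` is hypothetical and this is not NextPolish's own parser:

```python
def expand_tasks(task):
    """Split a digit string such as '1212' into per-step integer codes."""
    task = str(task).strip()
    if not task.isdigit():
        raise ValueError('expected a string of digits, got %r' % task)
    return [int(ch) for ch in task]


print(expand_tasks('1212'))    # [1, 2, 1, 2]
print(expand_tasks('551212'))  # [5, 5, 1, 2, 1, 2]
```

Named presets such as `best` are deliberately rejected here, because how they map to step codes is not shown in this report.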

make: *** [Makefile:13: seq_split] Error 1

Hi there,
I encountered an error when I tried to install NextPolish. Any help would be appreciated. Thanks. Here's the error:

gcc -Wall -O3 -fvisibility=hidden -s -pthread -o seq_split seq_split.c thpool.c -lz
seq_split.c: In function 'main':
seq_split.c:369:23: warning: '%s' directive writing up to 1023 bytes into a region of size between 0 and 1023 [-Wformat-overflow=]
  sprintf(opt.out, "%s/%s", opt.outdir, opt.outpre);
                       ^~               ~~~
seq_split.c:369:2: note: 'sprintf' output between 2 and 2048 bytes into a destination of size 1024
  sprintf(opt.out, "%s/%s", opt.outdir, opt.outpre);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
seq_split.c:158:19: warning: '%03d' directive writing between 3 and 10 bytes into a region of size between 0 and 1023 [-Wformat-overflow=]
   sprintf(fn, "%s.%03d.%s", opt->out, i, suffix);
                   ^~~~
seq_split.c:158:15: note: directive argument in the range [0, 2147483647]
   sprintf(fn, "%s.%03d.%s", opt->out, i, suffix);
               ^~~~~~~~~~~~
seq_split.c:158:3: note: 'sprintf' output between 14 and 1044 bytes into a destination of size 1024
   sprintf(fn, "%s.%03d.%s", opt->out, i, suffix);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sujitmaiti/miniconda3/bin/../lib/gcc/x86_64-conda_cos6-linux-gnu/7.3.0/../../../../x86_64-conda_cos6-linux-gnu/bin/ld: cannot find -lz
collect2: error: ld returned 1 exit status
make: *** [Makefile:13: seq_split] Error 1

Genome size halved after polishing

Dear developer,

I used both filtered PacBio and Illumina reads for polishing and specified the genome size in the run.cfg file, but the final polished genome was about half the size of the input. I'm wondering if there is anything wrong in my cfg file, shown below. I'd appreciate any suggestions.

Best regards,
Sen

[General]
job_type = local
job_prefix = nextPolish
task = 551212
rewrite = no
rerun = 3
parallel_jobs = 2
multithread_jobs = 5
genome = ./in.pacbio.hic.fa
genome_size = 270M
workdir = ./bg.20200410
polish_options = -p {multithread_jobs}

[sgs_option]
sgs_fofn = ./sgs.fofn
sgs_options = -max_depth 300 -bwa

[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 10k -max_read_len 150k -max_depth 150
lgs_minimap2_options = -x map-pb -t {multithread_jobs}

[polish_options]
-min_map_quality 30

Questions about make

[image]
Dear sir,
When I installed the software, some errors appeared. Could you give me some suggestions? Thanks a lot!

Best,

Question on NextPolish output

Hi, Dr. Hu, thanks for your awesome polisher. I have a question about NextPolish output:
I found that NextPolish generates two output files. In my case it looks like this:
genome.nextpolish.part000.fasta
genome.nextpolish.part001.fasta
Why were my contigs split into two files? What does it mean? Does it matter for downstream analysis? What are the criteria for this split?
My output details:
wc ~/NextPolish/Acer_data/./01_rundir/02.kmer_count/polish.ref.sh.work/polish_genome/genome.nextpolish.part*.fasta
72 108 124308796 /home/crciv/AcerChrAssemb/NextPolish/Acer_data/./01_rundir/02.kmer_count/05.polish.ref.sh.work/polish_genome0/genome.nextpolish.part000.fasta
56 84 99657733 ~/NextPolish/Acer_data/./01_rundir/02.kmer_count/05.polish.ref.sh.work/polish_genome1/genome.nextpolish.part001.fasta
128 192 223966529 total
wc ~/NextPolish/Acer_data/01_rundir/02.kmer_count/05.polish.ref.sh.work/genome.nextpolish.part000_part001.fasta
128 192 223966529 ~/NextPolish/Acer_data/01_rundir/02.kmer_count/05.polish.ref.sh.work/genome.nextpolish.part000_part001.fasta

Thanks in advance!
Ural Yunusbaev.
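
The `wc` figures above already show that the merged file is the two parts concatenated: 72 + 56 = 128 lines and 124308796 + 99657733 = 223966529 bytes. A small sketch for checking the same thing at the level of FASTA records, assuming plain uncompressed FASTA where each record starts with `>`; the helper name and the sample data are illustrative only:

```python
def count_fasta_records(lines):
    """Count header lines (those starting with '>') in an iterable of FASTA lines."""
    return sum(1 for line in lines if line.startswith('>'))


# Tiny stand-ins for the two part files and the merged file.
part000 = ['>ctg1\n', 'ACGT\n', '>ctg2\n', 'GGCC\n']
part001 = ['>ctg3\n', 'TTAA\n']
merged = part000 + part001

assert count_fasta_records(merged) == count_fasta_records(part000) + count_fasta_records(part001)
print(count_fasta_records(merged))  # 3
```

With real files, an open file handle can be passed in place of the lists, since iterating over a text file yields its lines.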

NextPolish produces more contigs after polishing

Hi,
I used NextPolish to polish my primary contigs FASTA assembled by Canu 2.0. I followed the tutorial, using paired-end short reads and PacBio long reads for two rounds each.
Here is my config:
########################################################
job_type = local
job_prefix = nextPolish
task = best
rewrite = yes
rerun = 3
parallel_jobs = 8
multithread_jobs = 8
genome =genome.contigs.fasta
genome_size = auto
workdir = ./01_rundir
polish_options = -p 8

[sgs_option]
sgs_fofn = ./sgs.fofn
sgs_options = -max_depth 100

[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 10k -max_read_len 150k -max_depth 60
lgs_minimap2_options = -x map-ont -t 8
########################################################
But after I polished my genome contigs, I found that the number of contigs increased greatly. The following figures show the statistics for my genome.contigs.fasta and genome.nextpolish.contigs.fasta:
[image]
[image]

Could you tell me how to deal with it?

NextPolish-1.04 Segmentation fault (core dumped)

Hi,
I installed NextPolish-1.04 successfully, but got an error when I ran it. The command line was: ../nextpolish run.cfg, and the error was: Segmentation fault (core dumped). What is the matter? Can you help me solve it? Thank you.
wang

Getting error when running lib/nextpolish1.py using python3

Hi there, I am trying to run NextPolish using some previously created alignments, basically following your example, but I get an error if I run it with python3. It seems to work with python2, though.

Here is the traceback:

Traceback (most recent call last):
  File "/home/jesse/software/NextPolish/lib/nextpolish1.py", line 334, in <module>
    main(args)
  File "/home/jesse/software/NextPolish/lib/nextpolish1.py", line 229, in main
    print(seq_name + ' %d %d %c %c' % pbase, file=sys.stderr)
TypeError: %c requires int or char
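
In Python 3, `%c` accepts only an `int` or a one-character `str`, so a `bytes` value (which Python 2 treated as `str`) raises this `TypeError`. A minimal sketch of a workaround follows, assuming `pbase` is a 4-tuple of two integers followed by two bases that may arrive as `bytes`. The tuple layout and the helper names are guesses for illustration and are not confirmed from `nextpolish1.py`.

```python
from __future__ import print_function
import sys


def to_char(base):
    """Return a value that '%c' accepts: decode bytes, pass int/str through."""
    if isinstance(base, bytes) and not isinstance(base, str):
        return base.decode('ascii')
    return base


def format_change(seq_name, pbase):
    a, b, ref, alt = pbase  # assumed layout: two numbers, then two bases
    return seq_name + ' %d %d %c %c' % (a, b, to_char(ref), to_char(alt))


print(format_change('ctg1', (10, 30, b'A', b'T')), file=sys.stderr)
```

On Python 2 the `isinstance` check is false for ordinary strings, so values pass through unchanged and the behaviour there stays the same.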

Some errors

[image]
Dear sir,
The new one also shows some errors. Please give me some suggestions. The system is Ubuntu 16 and gcc is 5.4.0. Thanks a lot!

Best

nextPolish: command not found...

Hi, while following the workflow to install NextPolish, I got the error "nextPolish: command not found" when testing nextPolish after "make".
After the command "cd ./NextPolish && make", I didn't get a message saying the installation had succeeded; the output just ended with "make[2]: Leaving directory '/data/users/yurx/yanzhong/software/NextPolish/util/minimap2' make[1]: Leaving directory '/data/users/yurx/yanzhong/software/NextPolish/util'".
Then I tried to test nextPolish from the "NextPolish" directory with the command "nextPolish ./test_data/run.cfg" and got the message "nextPolish: command not found". I tried "chmod 777 nextPolish", but it didn't help.
I would appreciate any help.

Question about the number of long-read polishing iterations

Hi,
I am using NextPolish to build a consensus of my genome with PacBio long reads. The genome is very large (8 Gb), so running several rounds would take a very long time. Have you tested how many iterations are needed to get a satisfactory result? I see the recommended iterations for SGS data on GitHub, but no recommended number for long reads.

Question about polishing with long reads

I found that I can correct a genome using long reads with nextpolish2.py in the lib directory of NextPolish after mapping and indexing, but I can also do so by setting parameters in [lgs_option] in the config file (run.cfg). What is the difference between these two ways?

About running time

Hi!
Is it normal that samtools is sleeping and using little memory and few threads?
When I check the status, it shows the following:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
167899 song 20 0 9892424 4.793g 968 S 0.3 5.2 42:06.05 samtools

I am using 1.2.4.

error when run test data

Hi,
There is an error when I run the test data.
[ERROR] 2020-06-30 09:43:36,569 polish_genome failed: please check the following logs:
[ERROR] 2020-06-30 09:43:36,570 /NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome0/nextPolish.sh.e

The tail of the log:
Command terminated by signal 11
0.03user 0.00system 0:00.24elapsed 18%CPU (0avgtext+0avgdata 10812maxresident)k
0inputs+0outputs (0major+1743minor)pagefaults 0swaps

The nextPolish version is v1.2.4. The system is ubuntu 18.04.

Any suggestions?
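
`Command terminated by signal 11` means the process received SIGSEGV, that is, a segmentation fault. A short sketch for decoding such a log line with the standard `signal` module (Python 3.5+); the function name `describe_signal_line` is made up for this example:

```python
import re
import signal


def describe_signal_line(line):
    """Return the signal name for a 'terminated by signal N' line, or None."""
    match = re.search(r'terminated by signal (\d+)', line)
    if match is None:
        return None
    number = int(match.group(1))
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a signal known on this platform.
        return 'unknown signal %d' % number


print(describe_signal_line('Command terminated by signal 11'))  # SIGSEGV on Linux
```

The name only identifies the kind of crash; finding the cause still needs the job's `.e` log or a core dump.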
