zhangrengang / centromics
Centromics: visualizing centromeres with multiple omics data
License: GNU General Public License v3.0
Hi!
Centromics is a user-friendly tool for detecting presumed centromeric regions. The output file "{prefix}.trf.count" shows that the "CL" content differs among chromosomes, as follows:
#chrom start end CL1 CL2 CL3 CL4 CL5 CL6 CL7
A01 18870000 18880000 8724.0 8532.0 0 250.0 0 0 0
A01 18880000 18890000 9542.0 9587.0 0 0 0 0 0
A01 18890000 18900000 9878.0 9867.0 0 0 0 0 0
However, "CL" does not seem to be a formal name for tandem repeats. Is there a formal classification and naming system for tandem repeats?
Thanks!
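To compare the "CL" columns across chromosomes, the table can be summed per chromosome. This is a minimal sketch that assumes the whitespace-delimited layout shown in the excerpt above (a `#chrom start end CL1 ...` header followed by one row per window); `sum_cl_by_chrom` is a hypothetical helper and not part of Centromics.

```python
import io
from collections import defaultdict


def sum_cl_by_chrom(handle):
    """Sum every CL column per chromosome from a '{prefix}.trf.count'-style table."""
    header = None
    totals = defaultdict(lambda: defaultdict(float))
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("#"):
            header = fields[3:]  # column names after chrom, start, end
            continue
        if header is None:
            raise ValueError("missing '#chrom start end CL...' header line")
        chrom = fields[0]
        for name, value in zip(header, fields[3:]):
            totals[chrom][name] += float(value)
    return {chrom: dict(cols) for chrom, cols in totals.items()}


# The three rows quoted in the post above, used as demo input.
example = io.StringIO(
    "#chrom start end CL1 CL2 CL3 CL4 CL5 CL6 CL7\n"
    "A01 18870000 18880000 8724.0 8532.0 0 250.0 0 0 0\n"
    "A01 18880000 18890000 9542.0 9587.0 0 0 0 0 0\n"
    "A01 18890000 18900000 9878.0 9867.0 0 0 0 0 0\n"
)
totals = sum_cl_by_chrom(example)
```

For a real run, the same function can be given an open file handle for the `.trf.count` file instead of the in-memory example.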
Hello, I would like to use HiFi and Hi-C data to find the centromere sequences of a genome.
However, when using Hi-C data, a Juicebox interface may appear. I then imported the merged_nodups.hic file; what should be done afterwards?
Looking forward to your reply, thank you.
The command is as follows:
centromics -l /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifi.fq.gz -g /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/ref.fa -pre hifihic -hic /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic -outdir hifihic -tmpdir hifihic.tmp -ncpu 10
Running process:
(RepCent) hsqhsq /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data
$ centromics -l /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifi.fq.gz -g /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_dhic /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic -outdir hifihic -tmpdir hifihic.tmp -ncpu 10
23-04-11 18:56:58 [INFO] Command: /home/hsqhsq/miniconda3/envs/RepCent/bin/centromics -l /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/ex/home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/ref.fa -pre hifihic -hic /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/mhifihic -tmpdir hifihic.tmp -ncpu 10
23-04-11 18:56:58 [INFO] Version: 0.3
23-04-11 18:56:58 [INFO] Arguments: {'genome': '/home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/ref.fa', 'long': ['/home/hsq/hli/example_data/hifi.fq.gz'], 'hic': '/home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic', 'chip': None, 'prefi'hifihic', 'tmpdir': 'hifihic.tmp', 'subsample_x': 5, 'subsample_n': 100000, 'trf_opts': '1 1 2 80 5 200 2000 -d -h', 'min_cov': 0monomer_len': 1, 'clust_opts': '-m jaccard -k 15 -c 0.2 -x 2 -I 2', 'min_ratio': 0.1, 'window_size': 200000, 'chr_prefix': 'chr[\eanup': False, 'overwrite': False}
23-04-11 18:56:58 [INFO] ##Step: Processing long reads data
23-04-11 18:56:59 [INFO] Genome size: 132,081,078 bp
23-04-11 18:56:59 [INFO] Subsample 5x reads from ['/home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifi.fq.gz']
23-04-11 18:57:02 [INFO] Subsampled 42,150 reads(660,394,773 bases)
23-04-11 18:57:02 [INFO] Run TRF to identify tandem repeats in reads
23-04-11 18:57:16 [INFO] 20 commands in /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf/cmds.list, 0 chsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf/cmds.list.completed
23-04-11 18:57:16 [INFO] continue to run 20 commands
23-04-11 18:57:16 [INFO] VARS: {'tc_tasks': 10, 'mode': 'local', 'grid_opts': '-tc {tc}', 'cpu': 1, 'mem': '1g', 'cont': True, 're'out_path': '/home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf/cmds.list.out', 'completed': '/home/hsq/hli/example_data/hifihic.tmp/hifihic.trf/cmds.list.completed', 'cmd_sep': '\n\n\n', 'kargs': {}}
23-04-11 18:57:16 [INFO] running 20 commands: try 1
23-04-11 18:57:16 [INFO] reset tc_tasks to 10 by [10, 32, 129, 20]
23-04-11 18:57:16 [INFO] Start Pool with 10 process(es)
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.2.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_zhic.tmp/hifihic.trf/chunks.2.fasta.*.dat
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.1.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_zhic.tmp/hifihic.trf/chunks.1.fasta.*.dat
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.3.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_zhic.tmp/hifihic.trf/chunks.3.fasta.*.dat
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.4.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_zhic.tmp/hifihic.trf/chunks.4.fasta.*.dat
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.5.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_zhic.tmp/hifihic.trf/chunks.5.fasta.*.dat
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.6.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_zhic.tmp/hifihic.trf/chunks.6.fasta.*.dat
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.7.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_zhic.tmp/hifihic.trf/chunks.7.fasta.*.dat
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.8.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_zhic.tmp/hifihic.trf/chunks.8.fasta.*.dat
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.9.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_zhic.tmp/hifihic.trf/chunks.9.fasta.*.dat
23-04-11 18:57:16 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.10.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.10.fasta.*.dat
23-04-11 19:02:19 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.11.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.11.fasta.*.dat
23-04-11 19:02:28 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.12.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.12.fasta.*.dat
23-04-11 19:02:29 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.13.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.13.fasta.*.dat
23-04-11 19:02:31 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.14.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.14.fasta.*.dat
23-04-11 19:02:35 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.15.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.15.fasta.*.dat
23-04-11 19:02:39 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.16.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.16.fasta.*.dat
23-04-11 19:02:42 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.17.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.17.fasta.*.dat
23-04-11 19:02:44 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.18.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.18.fasta.*.dat
23-04-11 19:02:54 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.19.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.19.fasta.*.dat
23-04-11 19:03:10 [INFO] run CMD: cd /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf && trf /home/hsqsili/example_data/hifihic.tmp/hifihic.trf/chunks.20.fasta 1 1 2 80 5 200 2000 -d -h > /dev/null; ls /home/hsq/hhh/hsqhsqhsq/ceshi_ihic.tmp/hifihic.trf/chunks.20.fasta.*.dat
23-04-11 19:09:06 [INFO] finished with 0 commands uncompleted
23-04-11 19:09:07 [INFO] Cluster tandem repeats to identify TR families
23-04-11 19:09:07 [INFO] run CMD: REPclust /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.trf.fa -m jaccard -k 15 -c 0.2 -x 2 -I 2 -pre hifihic. -outdir /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.clust -tmpdir /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.clust -p 10
23-04-11 19:09:25 [INFO] Filter tandem repeats as putive centromeric
23-04-11 19:09:37 [INFO] Align with genome and count
23-04-11 19:09:37 [INFO] run CMD: makeblastdb -in /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.blast/hifihic..blastqry -dbtype nucl -out /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.blast/hifihic..blastqry
23-04-11 19:09:38 [INFO] run CMD: blastn -query /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/ref.fa -db /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.blast/hifihic..blastqry -out /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.blast/hifihic..blastqry.blastout -outfmt '6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen sstrand' -num_threads 10 -task blastn-short -word_size 9 -dust no -soft_masking false
23-04-11 19:09:53 [INFO] Parse blast out
23-04-11 19:09:54 [INFO] New check point file: /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hifihic.trf.count.ok
23-04-11 19:09:54 [INFO] ##Step: Processing Hi-C data
23-04-11 19:09:54 [INFO] 28 commands in /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix100000.sh, 0 commands in /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix100000.sh.completed
23-04-11 19:09:54 [INFO] VARS: {'tc_tasks': 10, 'mode': 'local', 'grid_opts': '-tc {tc}', 'cpu': 1, 'mem': '1g', 'cont': False, 'retry': 2, 'script': None, 'out_path': '/home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix100000.sh.out', 'completed': '/home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix100000.sh.completed', 'cmd_sep': '\n\n\n', 'kargs': {}}
23-04-11 19:09:54 [INFO] running 28 commands: try 1
23-04-11 19:09:54 [INFO] reset tc_tasks to 10 by [10, 32, 128, 28]
23-04-11 19:09:54 [INFO] Start Pool with 10 process(es)
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr1 Chr1 BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr1-Chr1.100000.mat
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr1 Chr2 BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr1-Chr2.100000.mat
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr1 Chr3 BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr1-Chr3.100000.mat
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr1 Chr4 BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr1-Chr4.100000.mat
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr1 Chr5 BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr1-Chr5.100000.mat
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr1 ChrM BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr1-ChrM.100000.mat
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr1 ChrC BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr1-ChrC.100000.mat
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr2 Chr2 BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr2-Chr2.100000.mat
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr2 Chr3 BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr2-Chr3.100000.mat
23-04-11 19:09:54 [INFO] run CMD: java -jar /home/hsqhsq/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/merged_nodups.hic Chr2 Chr4 BP 100000 /home/hsq/hhh/hsqhsqhsq/ceshi_zhuosili/example_data/hifihic.tmp/hifihic.hic.matrix/Chr2-Chr4.100000.mat
Hello, I ran the following command: centromics -l ~/wang/cleanData/singleCell/hybrid/pacbioHifi/AZ-2.hifi_reads.fq.gz -g ../AZ-2hap2chr18.fa, and an error occurred. How can I solve it? Thank you!
23-08-31 10:21:19 [INFO] run CMD: blastn -query ../AZ-2hap2chr18.fa -db /stor9000/apps/users/NWSUAF/2010110076/wang/hybrid/assembly/hifiasm16/AZ-2/singelCell/02_hic/011_juicerboxAZ-2h2/3d-dna/juicerbox/centromics/tmp/centomics.blast/centomics..blastqry -out /stor9000/apps/users/NWSUAF/2010110076/wang/hybrid/assembly/hifiasm16/AZ-2/singelCell/02_hic/011_juicerboxAZ-2h2/3d-dna/juicerbox/centromics/tmp/centomics.blast/centomics..blastqry.blastout -outfmt '6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen sstrand' -num_threads 28 -task blastn-short -word_size 9 -dust no -soft_masking false
23-08-31 10:21:26 [INFO] Parse blast out
23-08-31 10:21:26 [INFO] New check point file: /stor9000/apps/users/NWSUAF/2010110076/wang/hybrid/assembly/hifiasm16/AZ-2/singelCell/02_hic/011_juicerboxAZ-2h2/3d-dna/juicerbox/centromics/tmp/centomics.centomics.trf.count.ok
23-08-31 10:21:26 [INFO] Copy /stor9000/apps/users/NWSUAF/2010110076/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/circos to /stor9000/apps/users/NWSUAF/2010110076/wang/hybrid/assembly/hifiasm16/AZ-2/singelCell/02_hic/011_juicerboxAZ-2h2/3d-dna/juicerbox/centromics/cent-output
Traceback (most recent call last):
File "/stor9000/apps/users/NWSUAF/2010110076/anaconda3/envs/RepCent/bin/centromics", line 33, in <module>
sys.exit(load_entry_point('Centromics==0.3', 'console_scripts', 'centromics')())
File "/stor9000/apps/users/NWSUAF/2010110076/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 272, in main
pipeline.run()
File "/stor9000/apps/users/NWSUAF/2010110076/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 133, in run
self.run_circos(tr_bed=tr_bed, tr_labels=tr_labels,
File "/stor9000/apps/users/NWSUAF/2010110076/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 153, in run_circos
Circos.centomics_plot(self.genome, wkdir, *args, **kargs)
File "/stor9000/apps/users/NWSUAF/2010110076/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Circos.py", line 309, in centomics_plot
_n = n = stack_bed(tr_bed, tr_file, window_size=window_size)
File "/stor9000/apps/users/NWSUAF/2010110076/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Circos.py", line 427, in stack_bed
for i in range(counts.shape[1]):
IndexError: tuple index out of range
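The final frame fails on `counts.shape[1]`, and `tuple index out of range` is the message Python raises when a shape tuple has fewer than two entries, as it does for a one-dimensional array. Whether that is what happens inside `stack_bed` for this input is not confirmed by the log. A minimal pure-Python sketch of the failure and a defensive guard, where `column_count` is a hypothetical helper rather than Centromics code:

```python
def column_count(shape):
    """Return the number of columns implied by a 0-D, 1-D or 2-D shape tuple."""
    if len(shape) == 0:
        return 0
    if len(shape) == 1:
        return 1  # treat a 1-D array as a single column
    return shape[1]


one_d_shape = (12,)   # shape of a 1-D array with 12 values
two_d_shape = (12, 7)  # shape of a 12 x 7 table

# Indexing position 1 of a one-element tuple reproduces the reported message.
try:
    one_d_shape[1]
    message = None
except IndexError as err:
    message = str(err)

safe_one_d = column_count(one_d_shape)
safe_two_d = column_count(two_d_shape)
```

If this is the cause, checking how many rows and "CL" columns the `.trf.count` file actually contains would be a reasonable first diagnostic.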
Dear rengang,
When I ran Centromics under Slurm, an unexpected error occurred. The log is as follows:
23-11-27 20:38:34 [INFO] Command: /data/00/user/user225/mambaforge/envs/RepCent/bin/centromics -l hifi.fq.gz -g csa.t2t.fa -p 40
23-11-27 20:38:34 [INFO] Version: 0.3
23-11-27 20:38:34 [INFO] Arguments: {'genome': 'csa.t2t.fa', 'long': ['hifi.fq.gz'], 'hic': None, 'chip': None, 'prefix': 'centomics', 'outdir': 'cent-output', 'tmpdir': 'tmp', 'subsample_x': 5, 'subsample_n': 100000, 'trf_opts': '1 1 2 80 5 200 2000 -d -h', 'min_cov': 0.9, 'min_len': 100, 'min_monomer_len': 1, 'clust_opts': '-m jaccard -k 15 -c 0.2 -x 2 -I 2', 'min_ratio': 0.1, 'window_size': 200000, 'chr_prefix': 'chr[\\dXYZW]+', 'ncpu': 40, 'cleanup': False, 'overwrite': False}
23-11-27 20:38:34 [INFO] ##Step: Processing long reads data
23-11-27 20:38:48 [INFO] Check point file: `/data/01/user178/04.Csa.v2/13.t2t/tmp/centomics.centomics.trf.count.ok` exists; skip this step
23-11-27 20:38:48 [INFO] Copy `/data/00/user/user225/mambaforge/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/circos` to `/data/01/user178/04.Csa.v2/13.t2t/cent-output`
23-11-27 20:39:15 [INFO] run CMD: `cd /data/01/user178/04.Csa.v2/13.t2t/cent-output/centomics.circos && circos -conf ./circos.conf`
Traceback (most recent call last):
File "/data/00/user/user225/mambaforge/envs/RepCent/bin/centromics", line 33, in <module>
sys.exit(load_entry_point('Centromics==0.3', 'console_scripts', 'centromics')())
File "/data/00/user/user225/mambaforge/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 272, in main
pipeline.run()
File "/data/00/user/user225/mambaforge/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 133, in run
self.run_circos(tr_bed=tr_bed, tr_labels=tr_labels,
File "/data/00/user/user225/mambaforge/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 153, in run_circos
Circos.centomics_plot(self.genome, wkdir, *args, **kargs)
File "/data/00/user/user225/mambaforge/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Circos.py", line 399, in centomics_plot
os.link(figfile, dstfig)
PermissionError: [Errno 1] Operation not permitted: '/data/01/user178/04.Csa.v2/13.t2t/cent-output/centomics.circos/circos.png' -> '/data/01/user178/04.Csa.v2/13.t2t/cent-output/centomics.circos.png'
best,
kunjing
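The traceback ends in `os.link`, which creates a hard link. On Linux, `Errno 1` from a hard-link call can occur when the filesystem does not support hard links or when system policy forbids linking the file; which of these applies here is not known from the log. A minimal sketch of a link-with-copy-fallback pattern, where `link_or_copy` is a hypothetical helper and not the Centromics implementation:

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Hard-link src to dst; fall back to a plain copy if linking fails."""
    if os.path.lexists(dst):
        os.remove(dst)  # os.link refuses to overwrite an existing path
    try:
        os.link(src, dst)
        return "link"
    except OSError:  # PermissionError is a subclass of OSError
        shutil.copy2(src, dst)
        return "copy"


# Demo in a throwaway directory; the file names mirror those in the log.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "circos.png")
    dst = os.path.join(tmp, "centomics.circos.png")
    with open(src, "wb") as fh:
        fh.write(b"demo")
    method = link_or_copy(src, dst)
    with open(dst, "rb") as fh:
        content = fh.read()
```

As a manual workaround, copying `centomics.circos/circos.png` to the destination by hand gives the same file, since the failing step comes after `circos` has already been run.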
3-05-14 16:01:34 [INFO] run CMD: cd /media/ps/4T/Centromics/HapA/cent-output/hifi.circos && circos -conf ./circos.conf
Traceback (most recent call last):
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/shutil.py", line 791, in move
os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: '/media/ps/4T/Centromics/HapA/cent-output/hifi.circos/circos.svg' -> '/media/ps/4T/Centromics/HapA/cent-output/hifi.circos/circos.svg.bk'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ps/anaconda3/envs/RepCent/bin/centromics", line 33, in <module>
sys.exit(load_entry_point('Centromics==0.3', 'console_scripts', 'centromics')())
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 272, in main
pipeline.run()
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 133, in run
self.run_circos(tr_bed=tr_bed, tr_labels=tr_labels,
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 153, in run_circos
Circos.centomics_plot(self.genome, wkdir, *args, **kargs)
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Circos.py", line 386, in centomics_plot
fmt_svg(svgfile)
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Circos.py", line 633, in fmt_svg
bksvgfile, svgfile = backup_file(svgfile)
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/small_tools.py", line 288, in backup_file
shutil.move(input_file, input_file_bk)
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/shutil.py", line 805, in move
copy_function(src, real_dst)
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/shutil.py", line 435, in copy2
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/home/ps/anaconda3/envs/RepCent/lib/python3.8/shutil.py", line 264, in copyfile
with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '/media/ps/4T/Centromics/HapA/cent-output/hifi.circos/circos.svg'
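Both exceptions report that `circos.svg` does not exist when `backup_file` tries to move it, which suggests that the preceding `circos -conf ./circos.conf` command did not write an SVG file. Running that logged command by hand in the `.circos` directory should show any error Circos itself prints. A minimal sketch of an existence check before the backup step, where `backup_if_present` is a hypothetical helper and not the Centromics function:

```python
import os
import shutil
import tempfile


def backup_if_present(path, suffix=".bk"):
    """Move path to path + suffix and return the new path, or None if path is absent."""
    if not os.path.exists(path):
        return None
    backup = path + suffix
    shutil.move(path, backup)
    return backup


# Demo in a throwaway directory: one missing file, one existing file.
with tempfile.TemporaryDirectory() as tmp:
    missing = backup_if_present(os.path.join(tmp, "circos.svg"))
    present_path = os.path.join(tmp, "circos.png")
    with open(present_path, "w") as fh:
        fh.write("demo")
    moved = backup_if_present(present_path)
    moved_name = os.path.basename(moved)
    source_still_there = os.path.exists(present_path)
```

A `None` result here corresponds to the situation in the traceback, where the plotting step produced no file to back up.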
Hi, thank you so much for this tool for identifying centromeres. I have run it with our long reads and Hi-C reads. I would like to ask for help with interpreting the Centromics results and finding the centromere sequences. Many thanks.
centomics.trf.fa
centomics.trf.count
centomics.hic.count.100000.intra_chr
centomics.hic.count.100000.inter_chr
centomics.candidate_peaks.bed
centomics.circos.png
centomics.circos
centomics.circos.pdf
centomics.circos_legend.pdf
centomics.circos_legend.txt
Hi, I got an error about "NameError: name 'pp' is not defined".
My command: "centromics -p 128 -l $hifi -g V127_CBMhap1.fasta -pre V127_CBMhap1-hifi_hifi -hic ./juicer/aligned/merged_nodups.txt"
"###STDOUT:<< >>
###STDERR:<< Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.util.Arrays.copyOf(Arrays.java:3236)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:135)
at htsjdk.tribble.util.LittleEndianInputStream.readString(LittleEndianInputStream.java:121)
at juicebox.data.DatasetReaderV2.getMagicString(DatasetReaderV2.java:111)
at juicebox.data.HiCFileTools.extractDatasetForCLT(HiCFileTools.java:57)
at juicebox.tools.clt.old.Dump.readArguments(Dump.java:353)
at juicebox.tools.HiCTools.main(HiCTools.java:85)
Traceback (most recent call last):
File "/home/wangj/miniconda3/envs/RepCent/bin/centromics", line 33, in <module>
sys.exit(load_entry_point('Centromics==0.3', 'console_scripts', 'centromics')())
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 272, in main
pipeline.run()
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 120, in run
hic_bed = self.run_hic()
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 254, in run_hic
out1, out2 = count_obs(self.hic, chrLst=self.d_seqL.keys(), prefix=hic_count, tmpdir=tmpdir,
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Hic.py", line 17, in count_obs
hic2signals(fout1, fout2,
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Juicer.py", line 97, in hic2signals
d_files = hic2matrix(res=res, **kargs)
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Juicer.py", line 90, in hic2matrix
d_files = run_juicerbox(inHic, chr_combs, outdir=outdir, res=res, norm=norm, cmd_opts=cmd_opts)
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Juicer.py", line 23, in run_juicerbox
run_job(cmd_file, cmd_list=cmds, **cmd_opts)
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/RunCmdsMP.py", line 533, in run_job
exit = run_tasks(cmd_list, tc_tasks=tc_tasks, mode=mode, grid_opts=grid_opts,
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/RunCmdsMP.py", line 189, in run_tasks
job_status = pp_run(cmd_list, processors=tc_tasks)
File "/home/wangj/miniconda3/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/RunCmdsMP.py", line 322, in pp_run
job_server = pp.Server(processors, ppservers=ppservers)
NameError: name 'pp' is not defined"
The file "slurm-2527306_127.txt" contains the complete log. If you need more information, please let me know. Thanks!
hs_err_pid244723.log
slurm-2527306_127.txt
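The Java stack trace fails in `DatasetReaderV2.getMagicString`, that is, while reading the file header, and the command passes `merged_nodups.txt` to `-hic`. A minimal sketch for checking the input before a run, under the assumption that `.hic` files begin with the null-terminated magic string `HIC`; `looks_like_hic` is a hypothetical helper and not part of Centromics or Juicebox:

```python
import os
import tempfile

# Assumed header of a .hic file: the string "HIC" followed by a null byte.
HIC_MAGIC = b"HIC\x00"


def looks_like_hic(path):
    """Return True if the file starts with the assumed .hic magic string."""
    with open(path, "rb") as fh:
        return fh.read(len(HIC_MAGIC)) == HIC_MAGIC


# Demo with two synthetic files: a fake binary header and a text file.
with tempfile.TemporaryDirectory() as tmp:
    fake_hic = os.path.join(tmp, "demo.hic")
    fake_txt = os.path.join(tmp, "demo.txt")
    with open(fake_hic, "wb") as fh:
        fh.write(HIC_MAGIC + b"\x08\x00\x00\x00")
    with open(fake_txt, "wb") as fh:
        fh.write(b"0 Chr1 100 0 16 Chr1 500 1\n")
    hic_ok = looks_like_hic(fake_hic)
    txt_ok = looks_like_hic(fake_txt)
```

A `False` result for the file given to `-hic` would be consistent with the header-reading failure above, although it does not explain the separate `NameError: name 'pp' is not defined`.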
Hello!
Is the merged_nodups.hic passed to -hic on the command line the inter_30.hic file produced by Juicer, or merged_nodups.txt?
Hi, dear developer,
Thanks for the software!
There seems to be an error in the Circos plotting step, but I have no idea how to solve it. Could you please give me some advice?
23-09-07 11:33:38 [INFO] finished with 0 commands uncompleted
23-09-07 11:33:38 [INFO] Count inter and intra-chromosomal signals
23-09-07 11:42:06 [INFO] New check point file: `/data2/home/ydn/yty/pame/0reassembly/9.structure/centromere/tmp/cent.cent.hic.count.100000.ok`
23-09-07 11:42:06 [INFO] Copy `/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/circos` to `/data2/home/ydn/yty/pame/0reassembly/9.structure/centromere`
23-09-07 11:42:19 [INFO] run CMD: `cd /data2/home/ydn/yty/pame/0reassembly/9.structure/centromere/cent.circos && circos -conf ./circos.conf`
Traceback (most recent call last):
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/shutil.py", line 791, in move
os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: '/data2/home/ydn/yty/pame/0reassembly/9.structure/centromere/cent.circos/circos.svg' -> '/data2/home/ydn/yty/pame/0reassembly/9.structure/centromere/cent.circos/circos.svg.bk'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/export/home/ydn/.conda/envs/yty/envs/centromics/bin/centromics", line 33, in <module>
sys.exit(load_entry_point('Centromics==0.3', 'console_scripts', 'centromics')())
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 272, in main
pipeline.run()
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 133, in run
self.run_circos(tr_bed=tr_bed, tr_labels=tr_labels,
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/pipe.py", line 153, in run_circos
Circos.centomics_plot(self.genome, wkdir, *args, **kargs)
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Circos.py", line 386, in centomics_plot
fmt_svg(svgfile)
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/Circos.py", line 633, in fmt_svg
bksvgfile, svgfile = backup_file(svgfile)
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/small_tools.py", line 288, in backup_file
shutil.move(input_file, input_file_bk)
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/shutil.py", line 805, in move
copy_function(src, real_dst)
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/shutil.py", line 435, in copy2
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/export/home/ydn/.conda/envs/yty/envs/centromics/lib/python3.8/shutil.py", line 264, in copyfile
with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/home/ydn/yty/pame/0reassembly/9.structure/centromere/cent.circos/circos.svg'
Best regards.
How should Centromics be cited?
By directly referencing this GitHub page?
Hello, can HiFi, ONT, and Hi-C data be used at the same time? How can HiFi and ONT data be used together? Many thanks.
Hello, I recently encountered the same error when running Centromics on two different servers. My command is:
centromics -l hifi.clean.fasta.gz -g nogap_verkkofilling_Chr1_12.fasta -pre hifihic -hic merged_nodups.hic -tmpdir hifihic.tmp -ncpu 20
Here is the error message:
24-03-12 10:01:54 [INFO] run CMD: java -jar /public/home/zhaojiale/software/yes/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE merged_nodups.hic Chr2 Chr9 BP 100000 /public/home/zhaojiale/KLF/centromics/hifihic.tmp/hifihic.hic.matrix/Chr2-Chr9.100000.mat
24-03-12 10:01:55 [WARNING] exit code 1 for CMD java -jar /public/home/zhaojiale/software/yes/envs/RepCent/lib/python3.8/site-packages/Centromics-0.3-py3.8.egg/Centromics/bin/juicebox_tools.jar dump observed NONE merged_nodups.hic Chr2 Chr6 BP 100000 /public/home/zhaojiale/KLF/centromics/hifihic.tmp/hifihic.hic.matrix/Chr2-Chr6.100000.mat
:
24-03-12 10:01:55 [WARNING]
###STDOUT:<< HiC file version: 8
###STDERR:<< Exception in thread "main" java.lang.NullPointerException
at juicebox.tools.clt.old.Dump.extractChromosomeRegionIndices(Dump.java:467)
at juicebox.tools.clt.old.Dump.readArguments(Dump.java:364)
at juicebox.tools.HiCTools.main(HiCTools.java:85)
I'm not sure if this is a problem with the Java version, but the error I encountered is the same on both servers.
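Hello, my genome is very large, with every chromosome over 1 Gb long, so Centromics core-dumps at the blastn step. I therefore split the chromosomes and ran the pieces, but because there were too many contigs, the run aborted at the circos step. Is the file centomics.candidate_peaks.bed the final result? If so, how do I use it to locate the actual centromere positions? Thank you!
Part of this file is shown below: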
#chrom start end data_from peak_value sum_value
1000 80000 1930000 TR-CL1 0.0 0.0
1000 80000 1930000 TR-CL2 0.0 0.0
1000 80000 1930000 TR-CL3 0.0 0.0
1000 80000 1930000 TR-CL4 0.0 0.0
1023 560000 2120000 TR-CL1 0.0 0.0
1023 560000 2120000 TR-CL2 0.0 0.0
1023 560000 2120000 TR-CL3 0.0 0.0
1023 560000 2120000 TR-CL4 0.0 0.0
10291 0 30000 TR-CL1 0.0 0.0
10291 0 30000 TR-CL2 0.0 0.0
10291 0 30000 TR-CL3 0.0 0.0
10291 0 30000 TR-CL4 0.0 0.0
1038 330000 370000 TR-CL1 0.0 0.0
1038 330000 370000 TR-CL2 0.0 0.0
1038 330000 370000 TR-CL4 0.0 0.0
1076 40000 1020000 TR-CL6 6818.0 30136.0
1076 40000 1430000 TR-CL1 0.0 0.0
1076 40000 1430000 TR-CL2 0.0 0.0
1076 40000 1430000 TR-CL3 0.0 0.0
1076 40000 1430000 TR-CL4 0.0 0.0
1144 1630000 4980000 TR-CL1 0.0 0.0
1144 1630000 4980000 TR-CL2 0.0 0.0
1144 1630000 4980000 TR-CL3 0.0 0.0
1144 1630000 4980000 TR-CL4 0.0 0.0
1153 590000 6450000 TR-CL1 0.0 0.0
Is the region 1076:40000-1020000 here a candidate centromere?
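For readers skimming a long candidate_peaks.bed file like the one above, a minimal Python sketch for listing only the rows with a non-zero `peak_value` is shown here. Treating non-zero rows as the ones worth inspecting is an assumption made for illustration, not documented Centromics behavior, and the function name `nonzero_peaks` is hypothetical. The sample text reuses three rows from the excerpt above.

```python
import io

def nonzero_peaks(handle):
    """Yield rows of a candidate_peaks.bed-style file whose peak_value is > 0.

    Assumes whitespace-separated columns in the order shown in the post:
    chrom, start, end, data_from, peak_value, sum_value.
    """
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header
        chrom, start, end, source, peak, total = line.split()[:6]
        if float(peak) > 0:
            yield chrom, int(start), int(end), source, float(peak), float(total)

# Three rows copied from the excerpt above.
sample = """#chrom start end data_from peak_value sum_value
1076 40000 1020000 TR-CL6 6818.0 30136.0
1076 40000 1430000 TR-CL1 0.0 0.0
1144 1630000 4980000 TR-CL1 0.0 0.0
"""

hits = list(nonzero_peaks(io.StringIO(sample)))
for chrom, start, end, source, peak, total in hits:
    print(f"{chrom}:{start}-{end}\t{source}\tpeak={peak}\tsum={total}")
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` sample.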
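Hello, I could not find the example directory. Also, are the dependency versions for SubPhaser pinned too strictly? The installation keeps reporting package conflicts.
Hello! Thank you very much for your work!
I ran Centromics twice with the same input files (Hi-C, HiFi, and genome) and default values for all other parameters, and the two runs gave different results. I ran blastn between the most abundant CLs from the two runs, and their similarity was also very low.
The most abundant CLs in the first run were: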
`
CL8 N=3;L=273;center=TR250;weight=0.36;ratio=0.002%
AAGATGGTGCCACCTCTAAACGACAACCTAAACGGACCAAAACGTGACGAAAATGGTGCC
ACTCTCTTCACAACGGTTTGTGGAATATGCGACTCTTTCATCGCACAAAGAGGAGGAAAC
GGGACGAAGATGATGCCACATCTAAACGACACCCTAAACGGACCAAAACGTGACAAAAAT
GGTACCACTCTCTTCACAACGGTTTGTGGAATATTCGACTATTTCATCGCACAAAGAGGA
GGAAACGAGAGCCCAAAGTGGAGGAAACGGGCG
CL6 N=4;L=147;center=TR332;weight=0.26;ratio=0.002%
ATGGTGCTACCTTTAAACGACAACCTAAACGGACCAAAATGTGACGAAAATGGTGCACTC
TCTTCACAACGGTTTGTGGACTATGCGACTCTTTCATCGCACAAAGAGGAGGAAACGGGA
GCCCAAAGTGGAGGAAACGGGACGAAG
`
The most abundant CLs in the second run were:
`
CL6 N=4;L=147;center=TR227;weight=0.26;ratio=0.002%
ATGGTGCTACCTTTAAACGACAACCTAAACGGACCAAAATGTGACGAAAATGGTGCACTC
TCTTCACAACGGTTTGTGGACTATGCGACTCTTTCATCGCACAAAGAGGAGGAAACGGGA
GCCCAAAGTGGAGGAAACGGGACGAAG
CL10 N=2;L=297;center=TR545;weight=0.21;ratio=0.001%
TGAAGAGAGTGCACCATTTTCATCATGTTTTGGTCCGTTTAGAGTGCCGTTTAAAGGTAG
CACCATATTCGTCCCGTGTCCTCCACTTTGGGCTCCCGTTTGCTCCTCTTTGTGCACGAT
GAAAGAGTCGCATAATCCACAAACCGTTGTGAATAGAGTGCACCATTTTCGTCACGTTTT
GGTCAGTTTAATGTGTCGTTTAAATGTAGCACCATCTTCGTCCTGTTTCCTCCACTTTGG
GGTCCCGTTTGCTCCTCTTTGTGCGGATGAAAGAATTGCATAGTCCATAAACCATTG
CL8 N=2;L=1756;center=TR178;weight=0.74;ratio=0.001%
ACGGGAGCCCAAAGTGGAGGAAATGGGACGAAGATGGTGCTACCTTTAAACGACAACCTA
AACAAACAAAAACGTGACAAAAATGGTGCACTCTCTTCACATCGGTTTGCGGACTATGCA
ACTCTTTTAGCACACACAAAGAGGAACAAACCAGAGCCCAAAGTGGAGGAAACAGGACGA
AGATGGTGCTACCTTTAAACGACACACTAAACGGATCAAAACGTGATGAAAATGCTGCAA
TCTCTTACAATGGTTTGTGGACTATGCTACTCTTTAATCGTGCACAAAGAGGAGCAAACG
GGAGCCCAAAGTGGAGGAAATGGGACGAAGGTGGTGCTACTTTTAATCGAAACACTAAAT
GGACCAAAACGTGATGAAAATGGTGCAACTCTAATCACAATGGTTTTTGGACTATGTGAC
TCTTTCATCGTACATAGGGGAGGAAACGGGATCCCAAAGTGGAGGAAACGGGACAAATAT
GGTGCTACCTTAAAAGACACACTAAACACACCAAAACCTGACGATAATGGTGCATTCTCT
TCACAACGGTTGGTGGACTATGCGACTCTTTCATCGCACACAAAGAGGAGGAAACGGGAC
GAAGATGGTGCTACCTTTAAACGGCAAGCTAAACAGACCAAAATGTGACGAAAATGGTGC
ACTCTATTCACAATGGTTTGTGGAATATGCTACTCTTTCATCGCATAAAGAAAATGAAAC
GGAAGCCCAAAGTGGAGGAAATGGGACGAAGATGGTGCTACATTTAAATGACACACTAAA
CAGACCAAAATATGACAAAAATGGTACCACTCTCTTCACAATGGTTTGAGGAATATGCTA
CTCTTTCATCGCATAAAGAAAATCAAACGGAAGCCCAAAGTGGAGGAAATGGGACGAAGA
TGGTGCTACCTTTAAACGATAAACTAAACAAACCAAAACCTGATGAAAATGGTGCACTCT
CTTCACAACGGTTTGTGGACTATGCAACTCTTTCAGCACGCACAAAGAGGAGTAAACGAG
AGCCCAAATTGGAGGAAACGGGACAAATATAGTGCCACCTTTATACGACCACCTAAATGA
ACCAAAATGTGACAAAAATGGTACCACTCTCTTTACAATGGTTTGTGGAATATGCTACTC
TTTCATCGCATAAAGAGGAGGAAACTGGAGCCCAAAGTGGAGGAAATGGGATGAAGATGG
TGCTACCTTTAAACGACAACCTAAATAAACCAAAACGTGACGAAAATGGTGCACTCTCTT
GACAACGATTGATGCACTATGAAACACTTTTATCACGCACAAAGAATAGCAAACGAGATC
CCAAAATGGAGGAAACACGACGAAGATGGTGCTACCTTTAAACGACACACAAAACGGACC
AAAACGTGATGAAAATGGTGCACACTCTTACAACGGTTTGTGGACTATGCGACTCTTTAA
TCGCGCACAAAGAGGAGGAAACGGGAGCCCAAAGTGGAGGAAACAGGACGATGATGACGC
TACCTTTAAACGACAACCTAAACGAACCAAATGTGACAAAAATGGTGCACTCTCTTCACA
ACGGTTTAAGGACTATGCAACTCTTTCATCGCGCACAAAGAGGAGGAAACAGGAACCCAA
AGTGGAGGAAACGGGACGAAGATGGTGCCACCTTTAAACGACAACCTAAATGAACCAAAA
TGTGACTAAAATGGTACCACTATCTTCACAACGGTTTGTGGAATATGTTACTCTTTCATC
GCATTAAGAAGAGGAA
`
Their positions on the chromosomes are similar.
What could be the reason for this?
Thanks!
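One thing that may be worth checking when comparing CL sequences across runs is strand and phase. A tandem-repeat consensus can be reported on either strand and starting at any position within the repeat unit. In the sequences above, for instance, CL10 from the second run begins with TGAAGAGAGTGCACCATTTTC, which is the reverse complement of the stretch GAAAATGGTGCACTCTCTTCA found in CL6. The Python sketch below compares two sequences by their shared canonical k-mers, which is insensitive to strand and largely insensitive to the starting position. It is an illustrative check only, not how Centromics clusters repeats; the function names and the choice of k=11 are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def canonical_kmers(seq, k=11):
    """Return the set of canonical k-mers of seq.

    A canonical k-mer is the lexicographically smaller of a k-mer and its
    reverse complement, so both strands map to the same set.
    """
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, revcomp(kmer)))
    return kmers

def containment(a, b, k=11):
    """Fraction of the smaller k-mer set that is shared with the other set."""
    ka, kb = canonical_kmers(a, k), canonical_kmers(b, k)
    smaller = min(ka, kb, key=len)
    if not smaller:
        return 0.0
    return len(ka & kb) / len(smaller)

# Short stretches taken from CL6 and from the start of CL10 (second run).
cl6_part = "GAAAATGGTGCACTCTCTTCA"
cl10_start = "TGAAGAGAGTGCACCATTTTC"
print(containment(cl6_part, cl10_start))
```

A value close to 1 for two full CL sequences would suggest they describe the same repeat family despite looking different as plain strings; a value near 0 would suggest they really are different sequences.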
Hello, I would like to use HiFi and ChIP data to find the centromere sequence of the genome.
However, when I run the following command, an error occurs:
Command: centromics -l m64144_220615_194259.ccs.fastq.gz -g L14.Chr.fasta -pre hifi -chip chip.merged.sorted.bam -hic merged_nodups.hic
Error:
[INFO] Grid computing is not available because DRMAA not configured properly: Could not find drmaa library. Please specify its full path using the environment variable DRMAA_LIBRARY_PATH
[INFO] No DRMAA (see https://github.com/pygridtools/drmaa-python), Switching to local mode.
Looking forward to your reply, thank you~
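Hello, and thank you very much. Centromics is a very useful tool, and its input files and parameters are documented in detail. However, the output files lack a basic description, and I have the following two questions:
1. The output file *.trf.count is a BED file whose last six columns are tandem-repeat counts. However, my file contains rows where $4 > $3 - $2. Is this a counting error in the software, or have I misunderstood what the counts mean?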
HiC_scaffold_1 40260000 40270000 10140.0 9928.0 0 0 0
HiC_scaffold_1 40270000 40280000 9879.0 9684.0 0 0 0
HiC_scaffold_1 40280000 40290000 10129.0 9923.0 0 0 0
HiC_scaffold_1 40290000 40300000 9876.0 9681.0 0 0 0
HiC_scaffold_1 40300000 40310000 10137.0 9924.0 0 0 0
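2. The output file *.trf.fa contains the sequences of the tandem repeats, as shown below. How should the three fields in each header, for example N=2285, weight=0.42, and ratio=0.523%, be interpreted? In the *.trf.count file, the counts seem to be far larger than N.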
CL1 N=2285;L=262;center=TR3376;weight=0.42;ratio=0.523%
TAACAGAGTGTTACTGTTCACAGAAAGCGCTAAACTGAGCGTT……
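The observation in question 1 can be reproduced with a short Python sketch that flags bins whose value exceeds the bin width (end minus start). It only locates such rows and does not explain them, since the unit of the counts is not stated in this thread. The function name `bins_over_width` is hypothetical, and the sample reuses the five rows quoted above.

```python
import io

def bins_over_width(handle, column=3):
    """Yield (chrom, start, end, value) for rows where value > end - start.

    column is the 0-based index of the count column to test; 3 corresponds
    to $4 in awk notation, the first count column.
    """
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        value = float(fields[column])
        if value > end - start:
            yield fields[0], start, end, value

# The five rows quoted in question 1.
sample = """HiC_scaffold_1 40260000 40270000 10140.0 9928.0 0 0 0
HiC_scaffold_1 40270000 40280000 9879.0 9684.0 0 0 0
HiC_scaffold_1 40280000 40290000 10129.0 9923.0 0 0 0
HiC_scaffold_1 40290000 40300000 9876.0 9681.0 0 0 0
HiC_scaffold_1 40300000 40310000 10137.0 9924.0 0 0 0
"""

flagged = list(bins_over_width(io.StringIO(sample)))
for chrom, start, end, value in flagged:
    print(f"{chrom}:{start}-{end}\tvalue={value}\twidth={end - start}")
```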