
Comments (3)

zhangrengang commented on August 27, 2024

Yes, a mismatch between the sequence IDs in the input FASTA and those in a preexisting hmmscan output file (*.domtbl) can cause this issue. You can avoid it by deleting the *.domtbl file or by using the -fw option. I will add a warning for this case.
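To check for this kind of mismatch before rerunning, one option is to compare the query names in the *.domtbl file with the IDs in the input FASTA. The sketch below is only an illustration and is not part of TEsorter. It assumes the standard HMMER --domtblout layout, where the query name is the fourth whitespace-separated column. The helper names and the `strip_frame_suffix` rule (dropping a `|`-separated frame tag from translated sequence names) are assumptions that may need adapting to your files.

```python
from typing import Callable, Iterable, Set


def fasta_ids(lines: Iterable[str]) -> Set[str]:
    """Collect sequence IDs (first whitespace-delimited token after '>')."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def domtbl_query_names(lines: Iterable[str]) -> Set[str]:
    """Collect query names (4th column) from --domtblout rows, skipping comments."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        if len(fields) >= 4:
            names.add(fields[3])
    return names


def strip_frame_suffix(name: str) -> str:
    """Hypothetical normalizer: drop everything after the last '|' (assumed frame tag)."""
    return name.rsplit("|", 1)[0] if "|" in name else name


def stale_queries(
    fasta_lines: Iterable[str],
    domtbl_lines: Iterable[str],
    normalize: Callable[[str], str] = strip_frame_suffix,
) -> Set[str]:
    """Return domtbl query names whose normalized ID is absent from the FASTA."""
    known = fasta_ids(fasta_lines)
    return {q for q in domtbl_query_names(domtbl_lines) if normalize(q) not in known}
```

Both arguments can be open file handles, for example `stale_queries(open(fasta_path), open(domtbl_path))`. A non-empty result suggests the *.domtbl file was produced from a different input and should be deleted or regenerated with -fw.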

from tesorter.

zhangrengang commented on August 27, 2024

Hi Shujun,
Please update TEsorter (the latest conda version is v1.4.6). With version 1.4, it works:

2023-05-03 15:37:42,995 -INFO- Command: /media/40T/wlx/zrg/share/home/app/.local/python3/bin/TEsorter test.fa
2023-05-03 15:37:42,996 -INFO- VARS: {'sequence': 'test.fa', 'hmm_database': 'rexdb', 'db_hmm': None, 'seq_type': 'nucl', 'prefix': None, 'force_write_hmmscan': False, 'processors': 4, 'tmp_dir': './tmp-63472734-e985-11ed-8106-4cd98fb9bbe7', 'min_coverage': 20, 'max_evalue': 0.001, 'min_probability': 0.5, 'no_cleanup': False, 'disable_pass2': False, 'pass2_rule': '80-80-80', 'no_library': False, 'no_reverse': False, 'genome': False, 'win_size': 270000, 'win_ovl': 30000, 'p2_identity': 80.0, 'p2_coverage': 80.0, 'p2_length': 80.0}
2023-05-03 15:37:42,996 -INFO- checking dependencies:
2023-05-03 15:37:43,098 -INFO- hmmer    3.3     OK
2023-05-03 15:37:44,391 -INFO- blastn   2.13.0+ OK
2023-05-03 15:37:44,392 -INFO- check database /media/40T/wlx/zrg/share/home/app/.local/python3/lib/python3.6/site-packages/TEsorter-1.4.1-py3.6.egg/TEsorter/database/REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm
2023-05-03 15:37:44,392 -INFO- db file: /media/40T/wlx/zrg/share/home/app/.local/python3/lib/python3.6/site-packages/TEsorter-1.4.1-py3.6.egg/TEsorter/database/REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm
2023-05-03 15:37:44,392 -INFO- REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm    OK
2023-05-03 15:37:44,392 -INFO- Start classifying pipeline (ELEMENT mode)
2023-05-03 15:37:44,472 -INFO- total 1 sequences
2023-05-03 15:37:44,473 -INFO- HMM scanning against `/media/40T/wlx/zrg/share/home/app/.local/python3/lib/python3.6/site-packages/TEsorter-1.4.1-py3.6.egg/TEsorter/database/REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm`
2023-05-03 15:37:44,521 -INFO- Start Pool with 4 process(es)
2023-05-03 15:37:44,522 -INFO- translating `./tmp-63472734-e985-11ed-8106-4cd98fb9bbe7/chunk.1.fasta` in six frames
/media/40T/wlx/zrg/share/home/app/.local/python3/lib/python3.6/site-packages/Bio/Seq.py:2609: BiopythonWarning: Partial codon, len(sequence) not a multiple of three. Explicitly trim the sequence or add trailing N before translation. This may become an error in future.
  BiopythonWarning)
2023-05-03 15:37:44,627 -INFO- Start Pool with 4 process(es)
2023-05-03 15:37:44,628 -INFO- run CMD: `hmmscan --nobias --notextw --noali --domtblout ./tmp-63472734-e985-11ed-8106-4cd98fb9bbe7/chunk.1.fasta.aa.domtbl /media/40T/wlx/zrg/share/home/app/.local/python3/lib/python3.6/site-packages/TEsorter-1.4.1-py3.6.egg/TEsorter/database/REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm ./tmp-63472734-e985-11ed-8106-4cd98fb9bbe7/chunk.1.fasta.aa > /dev/null`
2023-05-03 15:37:44,829 -INFO- generating gene anntations
2023-05-03 15:37:44,830 -INFO- 1 sequences classified by HMM
2023-05-03 15:37:44,830 -INFO- see protein domain sequences in `test.fa.rexdb.dom.faa` and annotation gff3 file in `test.fa.rexdb.dom.gff3`
2023-05-03 15:37:44,831 -INFO- classifying the unclassified sequences by searching against the classified ones
2023-05-03 15:37:44,886 -INFO- using the 80-80-80 rule
2023-05-03 15:37:44,886 -INFO- run CMD: `makeblastdb -in ./tmp-63472734-e985-11ed-8106-4cd98fb9bbe7/pass1_classified.fa -dbtype nucl -out ./tmp-63472734-e985-11ed-8106-4cd98fb9bbe7/pass1_classified.fa`
2023-05-03 15:37:46,226 -INFO- run CMD: `blastn -query ./tmp-63472734-e985-11ed-8106-4cd98fb9bbe7/pass1_unclassified.fa -db ./tmp-63472734-e985-11ed-8106-4cd98fb9bbe7/pass1_classified.fa -out ./tmp-63472734-e985-11ed-8106-4cd98fb9bbe7/pass1_unclassified.fa.blastout -outfmt '6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen qcovs qcovhsp sstrand' -num_threads 4 `
2023-05-03 15:37:47,403 -INFO- 0 sequences classified in pass 2
2023-05-03 15:37:47,404 -INFO- total 1 sequences classified.
2023-05-03 15:37:47,404 -INFO- see classified sequences in `test.fa.rexdb.cls.tsv`
2023-05-03 15:37:47,404 -INFO- writing library for RepeatMasker in `test.fa.rexdb.cls.lib`
2023-05-03 15:37:47,404 -INFO- writing classified protein domains in `test.fa.rexdb.cls.pep`
2023-05-03 15:37:47,404 -INFO- Summary of classifications:
Order           Superfamily      # of Sequences  # of Clade Sequences  # of Clades  # of full Domains
DIRS            unknown                       1                     0            0                  0
2023-05-03 15:37:47,404 -INFO- cleaning the temporary directory ./tmp-63472734-e985-11ed-8106-4cd98fb9bbe7
2023-05-03 15:37:47,405 -INFO- Pipeline done.


oushujun commented on August 27, 2024

I think the issue was caused by preexisting TEsorter results whose names matched the input file name, while the input content had since changed, so the existing results no longer matched the new content. After deleting those results, TEsorter runs normally.
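For anyone who wants to spot such leftovers before rerunning, here is a minimal sketch that lists result files older than the input. It is not a TEsorter feature. The function name is hypothetical, and the `<input>.rexdb` prefix is inferred from the output names in the log above (for example `test.fa.rexdb.cls.tsv`). Modification time is only a heuristic, since a copied or touched file can make it misleading.

```python
import glob
import os
from typing import List


def results_older_than_input(fasta_path: str, prefix: str) -> List[str]:
    """Return files named `<prefix>.*` that were last modified before the input FASTA."""
    input_mtime = os.path.getmtime(fasta_path)
    stale = []
    # glob.escape guards against special characters such as '[' in the prefix.
    for path in sorted(glob.glob(glob.escape(prefix) + ".*")):
        if os.path.isfile(path) and os.path.getmtime(path) < input_mtime:
            stale.append(path)
    return stale
```

For example, `results_older_than_input("test.fa", "test.fa.rexdb")` returns the candidate files to review and delete by hand. The sketch deliberately removes nothing itself.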

