rrefinder's People

Contributors

alexamk, loraine-gueguen, prihoda

rrefinder's Issues

Missing files in exploratory mode

The following error appears when running the program in exploratory mode:

Reading in file ../rodeo2/region007.gbk
Continuing with 42 queries
Rewriting fasta
hmmsearch --cpu 1 -o output/tes5/results/RREfinder_hmm_results.txt --domtblout output/tes5/results/RREfinder_hmm_results.tbl -T 25 data/hmm/RRE_v7_phmms_3_iter.hmm output/tes5/fastas/fasta_all.fasta
Resubmitting 1 found RREs
- 12:41:59.246 ERROR: Input file (output/tes5/fastas/Contig253_7120-7474_kmtR_S1361_07170_RRE_expalign_ss.a3m) could not be opened!

Traceback (most recent call last):
  File "./RRE.py", line 1371, in <module>
    res,parsed_data_dict = main(settings)
  File "./RRE.py", line 1190, in main
    all_groups = rrefinder_main(settings,RRE_targets,all_groups)
  File "./RRE.py", line 945, in rrefinder_main
    resubmit_all(all_groups,RRE_targets,settings)
  File "./RRE.py", line 567, in resubmit_all
    resubmit_group(group,RRE_targets,settings,settings.cores)
  File "./RRE.py", line 515, in resubmit_group
    parse_hhpred_res(group,RRE_targets,settings,resubmit=True)
  File "./RRE.py", line 519, in parse_hhpred_res
    results = read_hhr(group.RRE_results_file)
  File "./RRE.py", line 365, in read_hhr
    with open(f) as inf:
FileNotFoundError: [Errno 2] No such file or directory: 'output/tes5/results/Contig253_7120-7474_kmtR_S1361_07170_RRE.hhr'

Can you help me solve it?
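
In the log above, the first error comes from the external tool (the `_expalign_ss.a3m` input could not be opened), and the Python traceback only appears later, when the `.hhr` result is read. A minimal sketch of a wrapper that would surface the failing step earlier is shown below. It is not RREFinder's code; `run_checked` and its parameters are hypothetical names used for illustration.

```python
import subprocess
from pathlib import Path


def run_checked(cmd, expected_outputs):
    """Run an external command and fail early with a readable message.

    Hypothetical helper: raises if the command exits with a non-zero code
    or if any file it was expected to write does not exist afterwards.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} exited with code {proc.returncode}:\n{proc.stderr.strip()}"
        )
    missing = [str(p) for p in expected_outputs if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(
            f"{cmd[0]} finished but did not write: {', '.join(missing)}"
        )
    return proc.stdout
```

Called with the command list and the `.a3m` or `.hhr` path it should produce, a wrapper like this would point at the step that failed rather than at the later `open()` call.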

Environment is missing Biopython

Hi guys, giving your tool a shot, looks pretty exciting!

Just a heads-up: the environment YAML file is missing the biopython package, so I needed to install it manually.

Cheers,
David
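
A quick way to confirm whether an environment has the package before running the tool is sketched below. The helper name `missing_modules` is hypothetical; the only fact relied on is that Biopython is imported under the module name `Bio`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Biopython installs the top-level module "Bio".
print(missing_modules(["Bio"]))
```

An empty list means the import should succeed in the active environment.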

Problems parsing gbk files - 'illegal characters'

I just installed RREFinder for a quick test run and was successful with fasta files, but not with genbank files. It looks to me like something has changed in how Biopython parses seq records from gbk files. I only installed up to the point where I can run precision mode. Full disclosure: I also commented out the line '- ld_impl_linux-64=2.34=h53a641e_0' in the RREfinder.yml file because it caused issues when installing, so there is an obvious chance that I'm causing the problem myself...

Files used:
protein fasta:
https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/017/068/035/GCF_017068035.1_ASM1706803v1/GCF_017068035.1_ASM1706803v1_protein.faa.gz
gbk:
https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/017/068/035/GCF_017068035.1_ASM1706803v1/GCF_017068035.1_ASM1706803v1_genomic.gbff.gz

Commands used:
protein fasta:
python RRE.py --infile path/to/GCF_017068035.1_ASM1706803v1_protein.faa --intype fasta --outputfolder path/to/outfiles/ --cores 4 --mode precision --verbosity 2 RRE_test_faa
genbank:
python RRE.py --infile path/to/GCF_017068035.1_ASM1706803v1_genomic.gbff --intype genbank --outputfolder path/to/outfiles/ --cores 4 --mode precision --verbosity 2 RRE_test_gbk

Output for genbank:
Parse failed (sequence file ../outfiles/RRE_test_gbk/fastas/fasta_all.fasta):
Line 72: illegal character :

Looking at the vicinity of line 72 of the fasta_all.fasta file showed the following:

[screenshot of the lines around line 72 of fasta_all.fasta; image not preserved]

I tried it with a few different genbank files and ended up running into the same problem sooner or later.
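
Since the screenshot is not available, the following sketch shows one way to locate such lines in a rewritten FASTA file. It does not identify the cause of the bug; it only reports sequence lines that contain characters outside a permissive alphabet. The function name and the `ALLOWED` set are assumptions made for illustration, not what HMMER or RREFinder actually use.

```python
import string

# Assumed permissive alphabet: letters plus '*' (stop) and '-' (gap).
ALLOWED = set(string.ascii_letters) | {"*", "-"}


def find_illegal_fasta_chars(lines):
    """Yield (line_number, current_header, bad_chars) for offending sequence lines."""
    header = None
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            header = line[1:]  # remember which record the following lines belong to
            continue
        bad = sorted(set(line) - ALLOWED)
        if bad:
            yield number, header, bad
```

Running it over an open file handle, for example `find_illegal_fasta_chars(open("fasta_all.fasta"))`, lists the record header next to each bad line, which helps trace the problem back to a specific GenBank feature.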

Missing file in exploratory mode (Could not solve by installing HHsuite in the same conda environment)

Hi,

It seems that we have a directory issue, apparently due to a missing file in exploratory mode. This problem may have come up in a previous issue, but applying the suggested solution (installing HHsuite in the same conda environment) did not solve it.

Any suggestion is appreciated. Thank you very much. Here is the log.

(RREfinder) ~/RREFinder$ python2 RRE.py -v2 -t fasta -c 10 -i test.fasta -m exploratory abcxyz
Warning! Output folder with name abcxyz already found - results may be overwritten
Reading in file test.fasta
Continuing with 6 queries
Skipped 0 genes
Rewriting fasta
Resubmitting 3 found RREs
hhblits -cpu 10 -d data/database/RRE_v5_iter_3 -i output/abcxyz/fastas/NonB_RRE.fasta -oa3m output/abcxyz/fastas/NonB_RRE_expalign.a3m -o output/abcxyz/fastas/NonB_RRE_expalign.hhr -v 0 -n 3
addss.pl output/abcxyz/fastas/NonB_RRE_expalign.a3m output/abcxyz/fastas/NonB_RRE_expalign_ss.a3m -a3m
Traceback (most recent call last):
  File "RRE.py", line 1298, in <module>
    res,parsed_data_dict = main(settings)
  File "RRE.py", line 1129, in main
    all_groups = rrefinder_main(settings,RRE_targets,all_groups)
  File "RRE.py", line 896, in rrefinder_main
    resubmit_all(all_groups,RRE_targets,settings)
  File "RRE.py", line 520, in resubmit_all
    resubmit_group(group,RRE_targets,settings,settings.cores)
  File "RRE.py", line 461, in resubmit_group
    add_ss(group,settings,resubmit=True)
  File "RRE.py", line 258, in add_ss
    p = Popen(cmds,stdout=PIPE,stderr=PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
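
An `OSError: [Errno 2]` raised from inside `Popen` typically means the executable itself could not be found, rather than one of its input files. Here the last command printed before the traceback is `addss.pl`. A minimal sketch for checking which tools are visible on `PATH` follows; the tool list is taken from the commands in the logs on this page, and `missing_tools` is a hypothetical helper name.

```python
import shutil

# Tool names as they appear in the logged commands; adjust as needed.
REQUIRED_TOOLS = ["hmmsearch", "hhblits", "addss.pl"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be resolved on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print(missing_tools())
```

Any name in the printed list is not callable from the active environment, even if the package that ships it is installed elsewhere.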

Bioconda package

Hi,

I work on the bioinformatics platform ABiMS in Roscoff, France.

We are interested in your tool RREFinder. To follow our installation policies (we use Conda env), I would like to add your tool as a package on the bioconda channel. I started a PR here: bioconda/bioconda-recipes#33650

For that, however, I need a release of the tool with a version number and a tar.gz archive. Would it be possible to tag a release in your GitHub repo?

Also feel free to push commits or comment in the PR!

Regards,

Loraine Guéguen

No _fas.ffindex files in Uniclust database

Hi,

I was trying to run the exploratory mode with the Uniclust database, as described in the 3rd step of the "Use RREFinder to detect RREs with the uniclust30 database (Advanced)" section:

 resubmit_database=path/to/uniclust30
 expand_database=path/to/uniclust30

But eventually the following error arises:

Traceback (most recent call last):
  File "RRE.py", line 1400, in <module>
    check_databases(settings)
  File "RRE.py", line 1374, in check_databases
    raise ValueError(f'File {file} for category {category} not found. Please make sure all required databases are downloaded and their paths indicated in the config file.')
ValueError: File UniRef30_2021_03_fas.ffindex for category exploratory_hhpred_resubmit_database not found. Please make sure all required databases are downloaded and their paths indicated in the config file.

And there are no _fas.ffindex files in the Uniclust30 database.

Best,
Pavlo
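
To see which files a downloaded database actually provides for a given prefix, a small check like the one below can help. This is a sketch only: the helper names are hypothetical, and the suffixes to test are supplied by the caller, because the only suffix confirmed by this report is the `_fas.ffindex` named in the error message.

```python
from pathlib import Path


def database_files(prefix):
    """Return the sorted names of files that share the given database prefix."""
    prefix = Path(prefix)
    return sorted(p.name for p in prefix.parent.glob(prefix.name + "*"))


def missing_suffixes(prefix, suffixes):
    """Return the suffixes for which prefix + suffix is not an existing file."""
    return [s for s in suffixes if not Path(str(prefix) + s).is_file()]
```

For example, `missing_suffixes("path/to/UniRef30_2021_03", ["_fas.ffindex"])` returns the suffix unchanged if that file is absent, matching the `ValueError` above.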

Support AntiSMASH 5 output format

Hi, I noticed you are using the old antiSMASH output format, where the final file ends with .final.gbk. Are you planning to support the new version as well?

In version 5.0, the output of antiSMASH has changed: https://github.com/antismash/antismash

Here is an example antiSMASH output file:
mibig_example_antismash_5.1.0.gbk.gz

The clusters are now provided in a more detailed form:

     protocluster    1..16387
                     /aStool="rule-based-clusters"
                     /contig_edge="True"
                     /core_location="[0:15047]"
                     /cutoff="20000"
                     /detection_rule="((YcaO or TIGR03882) and ((thio_amide and
                     (PF06968 or PF04055 or PF07366)) or Lant_dehydr_C or
                     Lant_dehydr_N or PF07366 or PF06968 or PF04055) or
                     thiostrepton)"
                     /neighbourhood="10000"
                     /product="thiopeptide"
                     /protocluster_number="2"
                     /tool="antismash"
     proto_core      1..15047
                     /aStool="rule-based-clusters"
                     /tool="antismash"
                     /cutoff="20000"
                     /detection_rule="((YcaO or TIGR03882) and ((thio_amide and
                     (PF06968 or PF04055 or PF07366)) or Lant_dehydr_C or
                     Lant_dehydr_N or PF07366 or PF06968 or PF04055) or
                     thiostrepton)"
                     /neighbourhood="10000"
                     /product="thiopeptide"
                     /protocluster_number="2"
     cand_cluster    1..16387
                     /candidate_cluster_number="1"
                     /contig_edge="True"
                     /detection_rules="((YcaO or TIGR03882) and ((thio_amide and
                     (PF06968 or PF04055 or PF07366)) or Lant_dehydr_C or
                     Lant_dehydr_N or PF07366 or PF06968 or PF04055) or
                     thiostrepton)"
                     /detection_rules="((goadsporin_like or PF00881 or
                     TIGR03605) and (YcaO or TIGR03882))"
                     /kind="chemical_hybrid"
                     /product="thiopeptide"
                     /product="LAP"
                     /protoclusters="2"
                     /protoclusters="1"
                     /tool="antismash"

You can contact the authors for more info.
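
As an illustration of the new layout, the sketch below pulls the cluster-level features out of GenBank text with a plain line scan. It is not a full GenBank parser and not RREFinder's code; the set of feature keys is limited to those shown in the excerpt above, and `cluster_features` is a hypothetical name.

```python
import re

# A feature line in a GenBank flat file: five spaces, the key, then the location.
# Qualifier and continuation lines are indented further and do not match.
FEATURE_LINE = re.compile(r"^ {5}(\S+)\s+(\S+)\s*$")

# Feature keys seen in the antiSMASH 5 excerpt above.
CLUSTER_KEYS = {"protocluster", "proto_core", "cand_cluster"}


def cluster_features(genbank_text, keys=CLUSTER_KEYS):
    """Return (feature_key, location) pairs for the requested feature keys."""
    found = []
    for line in genbank_text.splitlines():
        match = FEATURE_LINE.match(line)
        if match and match.group(1) in keys:
            found.append((match.group(1), match.group(2)))
    return found
```

A real implementation would more likely read the features through a GenBank parser, but the scan shows which keys and locations a loader for the new format has to recognise.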

Problem with installation ("FileNotFoundError: [Errno 2] No such file or directory" for .a3m file)

Hi,

I tried both installation methods (automatic and manual), but the following error arises:

Reading in file ../../04_bgc_annotation/gb6/data/gb1_bin.48.strict_filtered_kept_contigs.fasta
Continuing with 2227 queries
Rewriting fasta
Resubmitting 5 found RREs
Traceback (most recent call last):
  File "RRE.py", line 1403, in <module>
    res,parsed_data_dict = main(settings)
  File "RRE.py", line 1184, in main
    all_groups = rrefinder_main(settings,RRE_targets,all_groups)
  File "RRE.py", line 939, in rrefinder_main
    resubmit_all(all_groups,RRE_targets,settings)
  File "RRE.py", line 561, in resubmit_all
    resubmit_group(group,RRE_targets,settings,settings.cores)
  File "RRE.py", line 502, in resubmit_group
    add_ss(group,settings,resubmit=True)
  File "RRE.py", line 296, in add_ss
    with open(infile) as handle:
FileNotFoundError: [Errno 2] No such file or directory: 'output/gb1/fastas/NIMADCBA_00820_Ferric_uptake_regulation_protein_RRE_expalign.a3m'

The problem could be connected to the addss.pl script error:

$ addss.pl --help
<normal output>
.
.
. 
Filtering alignment to diversity 7 ...
$ hhfilter -v 1 -neff 7 -i /tmp/FySoZdSt9c/9q7oqmXpnV.in.a3m -o /tmp/FySoZdSt9c/9q7oqmXpnV.in.a3m

Error: command 'hhfilter -v 1 -neff 7 -i /tmp/FySoZdSt9c/9q7oqmXpnV.in.a3m -o /tmp/FySoZdSt9c/9q7oqmXpnV.in.a3m' returned error code 0
.
.
.
<several more errors arise, but this is the first one>

Best,
Pavlo
