antismash / antismash
antiSMASH
Home Page: https://antismash.secondarymetabolites.org
License: GNU Affero General Public License v3.0
Describe the bug
Running bacterial genomes in AS5-RC1 results in unexpected termination due to a bug in the lassopeptide prediction code.
Expected behavior
antiSMASH completes the run without terminating unexpectedly.
To Reproduce
Reproducible example here:
# get an example gbk and a script to run docker with the latest tagged release
wget -O bgc.gbk https://mibig.secondarymetabolites.org/repository/BGC0001307/BGC0001307.1.final.gbk
wget https://gist.githubusercontent.com/zachcp/7b5fa4286a5e6f3b08fe638bac4cc5fb/raw/76f6f3ff5e2b7ca9f6b612a54508062a5e1f5080/run_as5rc1.sh
# convert to fasta and run with gene finding
seqmagick convert bgc.gbk bgc.fasta
bash run_as5rc1.sh bgc.fasta . --genefinding-tool prodigal
Error message below:
○ → bash run_as5rc1.sh bgc.fasta . --genefinding-tool prodigal
Traceback (most recent call last):
File "/usr/local/bin/antismash", line 11, in <module>
sys.exit(entrypoint())
File "/usr/local/lib/python3.5/dist-packages/antismash/__main__.py", line 120, in entrypoint
sys.exit(main(sys.argv[1:]))
File "/usr/local/lib/python3.5/dist-packages/antismash/__main__.py", line 109, in main
antismash.run_antismash(sequence, options)
File "/usr/local/lib/python3.5/dist-packages/antismash/main.py", line 572, in run_antismash
result = _run_antismash(sequence_file, options)
File "/usr/local/lib/python3.5/dist-packages/antismash/main.py", line 637, in _run_antismash
analysis_timings = analyse_record(record, options, analysis_modules, module_results)
File "/usr/local/lib/python3.5/dist-packages/antismash/main.py", line 263, in analyse_record
run_module(record, module, options, previous_result, timings)
File "/usr/local/lib/python3.5/dist-packages/antismash/main.py", line 235, in run_module
results = module.run_on_record(record, results, options)
File "/usr/local/lib/python3.5/dist-packages/antismash/modules/lassopeptides/__init__.py", line 75, in run_on_record
return specific_analysis(record)
File "/usr/local/lib/python3.5/dist-packages/antismash/modules/lassopeptides/specific_analysis.py", line 716, in specific_analysis
motif = run_lassopred(record, cluster, candidate)
File "/usr/local/lib/python3.5/dist-packages/antismash/modules/lassopeptides/specific_analysis.py", line 644, in run_lassopred
result = determine_precursor_peptide_candidate(record, cluster, query, query.translation)
File "/usr/local/lib/python3.5/dist-packages/antismash/modules/lassopeptides/specific_analysis.py", line 630, in determine_precursor_peptide_candidate
valid, rodeo_score = run_rodeo(record, cluster, query, query_sequence[:end], query_sequence[end:])
File "/usr/local/lib/python3.5/dist-packages/antismash/modules/lassopeptides/specific_analysis.py", line 596, in run_rodeo
fimo_motifs, motif_score, fimo_scores = identify_lasso_motifs(leader, core)
File "/usr/local/lib/python3.5/dist-packages/antismash/modules/lassopeptides/specific_analysis.py", line 423, in identify_lasso_motifs
fimo_output = subprocessing.run_fimo_simple(motif_file, tempfile.name)
File "/usr/local/lib/python3.5/dist-packages/antismash/common/subprocessing.py", line 299, in run_fimo_simple
result = execute(command)
File "/usr/local/lib/python3.5/dist-packages/antismash/common/subprocessing.py", line 91, in execute
proc = Popen(commands, stdin=stdin_redir, stdout=stdout, stderr=stderr)
File "/usr/lib/python3.5/subprocess.py", line 676, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.5/subprocess.py", line 1282, in _execute_child
raise child_exception_type(errno_num, err_msg)
PermissionError: [Errno 13] Permission denied
Running antiSMASH FAILED
System (please complete the following information):
docker AS5 release
Additional context
Reproducible example in docker with public data attached above.
Describe the bug
I've installed 5-0-0rc1 on a Red Hat 6 system using Python 3.7.3.
Clusterblast fails to download:
$ env PATH=/disks/patric-common/runtime/bin:$PATH /disks/patric-common/runtime/antismash-5-0-0rc1/bin/download-antismash-databases --database-dir /vol/patric3/production/data/antismash/antismash-5-0-0rc1
/disks/patric-common/runtime/antismash-5-0-0rc1/lib/python3.7/site-packages/scss/selector.py:54: FutureWarning: Possible nested set at position 329
''', re.VERBOSE | re.MULTILINE)
Creating checksum of Pfam-A.hmm
PFAM file present and ok for version 27.0
Creating checksum of Pfam-A.hmm
PFAM file present and ok for version 31.0
Resfams database present and checked
Downloading ClusterBlast database.
Traceback (most recent call last):
File "/disks/patric-common/runtime/lib/python3.7/urllib/request.py", line 1317, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/disks/patric-common/runtime/lib/python3.7/http/client.py", line 1229, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/disks/patric-common/runtime/lib/python3.7/http/client.py", line 1275, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/disks/patric-common/runtime/lib/python3.7/http/client.py", line 1224, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/disks/patric-common/runtime/lib/python3.7/http/client.py", line 1016, in _send_output
self.send(msg)
File "/disks/patric-common/runtime/lib/python3.7/http/client.py", line 956, in send
self.connect()
File "/disks/patric-common/runtime/lib/python3.7/http/client.py", line 1392, in connect
server_hostname=server_hostname)
File "/disks/patric-common/runtime/lib/python3.7/ssl.py", line 412, in wrap_socket
session=session
File "/disks/patric-common/runtime/lib/python3.7/ssl.py", line 853, in _create
self.do_handshake()
File "/disks/patric-common/runtime/lib/python3.7/ssl.py", line 1117, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/disks/patric-common/runtime/antismash-5-0-0rc1/lib/python3.7/site-packages/antismash/download_databases.py", line 82, in download_file
req = request.urlopen(url)
File "/disks/patric-common/runtime/lib/python3.7/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/disks/patric-common/runtime/lib/python3.7/urllib/request.py", line 525, in open
response = self._open(req, data)
File "/disks/patric-common/runtime/lib/python3.7/urllib/request.py", line 543, in _open
'_open', req)
File "/disks/patric-common/runtime/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/disks/patric-common/runtime/lib/python3.7/urllib/request.py", line 1360, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/disks/patric-common/runtime/lib/python3.7/urllib/request.py", line 1319, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/disks/patric-common/runtime/antismash-5-0-0rc1/bin/download-antismash-databases", line 11, in <module>
load_entry_point('antismash==5.0.0rc1', 'console_scripts', 'download-antismash-databases')()
File "/disks/patric-common/runtime/antismash-5-0-0rc1/lib/python3.7/site-packages/antismash/download_databases.py", line 352, in _main
download(args)
File "/disks/patric-common/runtime/antismash-5-0-0rc1/lib/python3.7/site-packages/antismash/download_databases.py", line 330, in download
download_clusterblast(args.database_dir)
File "/disks/patric-common/runtime/antismash-5-0-0rc1/lib/python3.7/site-packages/antismash/download_databases.py", line 307, in download_clusterblast
download_if_not_present(CLUSTERBLAST_URL, archive_filename, CLUSTERBLAST_ARCHIVE_CHECKSUM)
File "/disks/patric-common/runtime/antismash-5-0-0rc1/lib/python3.7/site-packages/antismash/download_databases.py", line 207, in download_if_not_present
download_file(url, filename)
File "/disks/patric-common/runtime/antismash-5-0-0rc1/lib/python3.7/site-packages/antismash/download_databases.py", line 84, in download_file
raise DownloadError("ERROR: File not found on server.\nPlease check your internet connection.")
antismash.download_databases.DownloadError: ERROR: File not found on server.
Please check your internet connection.
I have installed the latest certifi package and verified that it holds the root certificate "DST Root CA X3" that your site uses via Let's Encrypt.
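A common workaround on hosts with a stale system trust store is to hand urllib an explicit CA bundle. A minimal sketch (this is not antiSMASH code; the SSL_CERT_FILE fallback is an assumption, and with certifi installed one would use `certifi.where()`):

```python
import os
import ssl

# Build an SSL context from an explicit CA bundle so urllib can verify
# Let's Encrypt certificates even when the system trust store is stale,
# as on older Red Hat 6 hosts. Falling back to the SSL_CERT_FILE
# environment variable here is an assumption; with certifi installed,
# certifi.where() would be the usual choice.
ca_bundle = os.environ.get("SSL_CERT_FILE")  # or: certifi.where()
context = ssl.create_default_context(cafile=ca_bundle)
# urllib.request.urlopen(url, context=context) would then use this bundle.
```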
Expected behavior
download-databases pulls the databases properly
To Reproduce
Install 5-0-0rc1 on a Redhat 6 system using python 3.7.3 via a python venv.
Screenshots
Pasted text above
Hi @kblin and team. We are happy about your involvement with the DeepBGC module; would you be interested in integrating the module into antiSMASH?
DeepBGC can be used programmatically, here's an example:
import os
from deepbgc import DeepBGCDetector, DeepBGCClassifier
os.environ["DEEPBGC_DOWNLOADS_DIR"] = os.path.join(os.path.dirname(__file__), 'data')
detector = DeepBGCDetector('deepbgc', merge_max_protein_gap=1, merge_max_nucl_gap=2000, score_threshold=0.5)
product_classifier = DeepBGCClassifier('product_class')
activity_classifier = DeepBGCClassifier('product_activity')
detector.run(record)
product_classifier.run(record)
activity_classifier.run(record)
If you are interested, please let me know and I will be happy to get involved and discuss next steps.
Hi,
I am doing a full-featured run of antismash 5.1 on a high-quality bacterial genome. I run the following command:
antismash -c 60 --fullhmmer --cf-create-clusters --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go --smcog-trees --output-dir 3_secondary_metab/C08 C08.gbk
and repeatedly get this error:
Process ForkPoolWorker-141:
Traceback (most recent call last):
File "/project02/miniconda3/envs/antismash/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/project02/miniconda3/envs/antismash/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/project02/miniconda3/envs/antismash/lib/python3.7/multiprocessing/pool.py", line 110, in worker
task = get()
File "/project02/miniconda3/envs/antismash/lib/python3.7/multiprocessing/queues.py", line 354, in get
return _ForkingPickler.loads(res)
File "/project02/miniconda3/envs/antismash/lib/python3.7/site-packages/antismash/common/secmet/qualifiers/nrps_pks.py", line 103, in extend
raise NotImplementedError("Extending this list won't work")
NotImplementedError: Extending this list won't work
It doesn't kill the run outright, but the job never finishes and there is no CPU usage after the last occurrence of the error.
I am using a recent conda installation of antismash on Linux Mint 19.2.
Hi guys,
I'm trying to use Record.from_genbank on this file. The run was generated at the beginning of February, and I just installed antismash again using pip (see below for versions).
Trying to load the file above, I get several exceptions. I used to run the function without problems a year or so ago.
from antismash.common.secmet import Record
rec = Record.from_genbank("Amycolatopsis_keratinaphila.gbk")
Traceback (most recent call last):
File "/.../lib/python3.7/site-packages/antismash/common/secmet/record.py", line 340, in get_domain_by_name
return self._domains_by_name[name]
KeyError: 'nrpspksdomains_ctg1_2108_Condensation_Starter.1'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/.../lib/python3.7/site-packages/antismash/common/secmet/features/module.py", line 201, in from_biopython
domains = [record.get_domain_by_name(domain) for domain in domain_names]
File "/.../lib/python3.7/site-packages/antismash/common/secmet/features/module.py", line 201, in <listcomp>
domains = [record.get_domain_by_name(domain) for domain in domain_names]
File "/.../lib/python3.7/site-packages/antismash/common/secmet/record.py", line 342, in get_domain_by_name
raise KeyError("record %s contains no domain named %s" % (self.id, name))
KeyError: 'record NZ_LT629789.1 contains no domain named nrpspksdomains_ctg1_2108_Condensation_Starter.1'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/.../lib/python3.7/site-packages/antismash/common/secmet/record.py", line 762, in from_biopython
record.add_feature(cls.from_biopython(feature, record=record))
File "/.../lib/python3.7/site-packages/antismash/common/secmet/features/module.py", line 204, in from_biopython
bio_feature.location, err))
ValueError: record does not contain domain referenced by module at [2331621:2334660](+): 'record NZ_LT629789.1 contains no domain named nrpspksdomains_ctg1_2108_Condensation_Starter.1'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/.../lib/python3.7/site-packages/antismash/common/secmet/record.py", line 774, in from_genbank
records.append(Record.from_biopython(bio, taxon))
File "/.../lib/python3.7/site-packages/antismash/common/secmet/record.py", line 764, in from_biopython
raise SecmetInvalidInputError(str(err)) from err
antismash.common.secmet.errors.SecmetInvalidInputError: record does not contain domain referenced by module at [2331621:2334660](+): 'record NZ_LT629789.1 contains no domain named nrpspksdomains_ctg1_2108_Condensation_Starter.1'
pip freeze
antismash==5.1.1
bcbio-gff==0.6.6
biopython==1.76
cycler==0.10.0
helperlibs==0.1.11
Jinja2==2.11.1
joblib==0.14.1
kiwisolver==1.1.0
MarkupSafe==1.1.1
matplotlib==3.1.3
numpy==1.18.1
pyparsing==2.4.6
pyScss==1.3.5
pysvg-py3==0.2.2.post3
python-dateutil==2.8.1
scikit-learn==0.22.1
scipy==1.4.1
six==1.14.0
Thank you very much for your efforts in getting antiSMASH v5 onto bioconda.
When running the new antismash via conda, however, I am getting these FutureWarning messages, indicating deprecations that will presumably become errors in future releases:
/home/lamma/miniconda3/envs/antismash_v5/lib/python3.7/site-packages/sklearn/externals/joblib/__init__.py:15: FutureWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
warnings.warn(msg, category=FutureWarning)
/home/lamma/miniconda3/envs/antismash_v5/lib/python3.7/site-packages/scss/selector.py:54: FutureWarning: Possible nested set at position 329
''', re.VERBOSE | re.MULTILINE)
Is your feature request related to a problem? Please describe.
We've noticed on a few clusters we've been working on that the Adenylation domain substrate predictions differ in AS5 relative to AS4.
Describe the solution you'd like
Would it be possible to restore the use of NRPSPredictor3 to the AS5 pipeline?
Describe alternatives you've considered
Additional context
Here is a breakdown of the Adenylation domain substrate differences between AS4 and AS5 on the MIBiG dataset: substrate_prediction_comparison. It seems most of the differences between the two versions are related to no_calls, which I would guess is due to differences in acceptance thresholds. However, there is also a small subset of sequences where the predictions themselves differ.
Describe the bug
ClusterBlast in antismash 5 returns only bacterial hits for fungal genomes
Expected behavior
ClusterBlast in antismash 4 returned fungal hits for fungal genomes
To Reproduce
Job ID: fungi-0d30459c-21b0-469a-84cf-146b64dcaebc
Two screenshots are attached for the same cluster.
The problem applies to all clusters :(
When run locally, antiSMASH 5 still retains the same bug as version 4's clusterblast module: the SVGs are not shown in the HTML output because loading them violates the browser's cross-origin resource sharing (CORS) policy.
Given that we now also store the clusterblast result as an object, would it be a good idea to reimplement the HTML visualization, e.g. by re-drawing the SVGs inline?
Are you planning on adding this py3 port of antismash to bioconda? The current version on bioconda is 4.2.
Hi,
I noticed this check inside the hmmer
module:
antismash/antismash/common/hmmer.py
Lines 115 to 118 in e6f6620
This is always true. It should have probably said:
if hsp.query_id not in feature_by_id:
continue
feature = feature_by_id[hsp.query_id]
Or even throw an exception since there would be something fishy going on :)
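For illustration, the suggested guard amounts to this pattern (a sketch with hypothetical stand-ins, not the actual hmmer module code):

```python
def feature_for_hit(feature_by_id, query_id):
    """Look up the feature a HMMer hit refers to, skipping unknown IDs.

    `feature_by_id` and `query_id` are hypothetical stand-ins for the
    mapping and `hsp.query_id` used in antismash/common/hmmer.py.
    """
    if query_id not in feature_by_id:
        return None  # caller would `continue` (or raise, as suggested above)
    return feature_by_id[query_id]
```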
Looking for this file in my antiSMASH outputs, I couldn't find it.
Is there an option to generate this file, or is it not available in the new version?
Do you have any post-processing script to do this?
Thank you in advance.
Describe the bug
@kblin I have been following your suggestion of using secmet to extract info from the GenBank files (see #216). This works for most files, but fails for files which contain the string leader, for example: prepeptide=leader, leader_location="[56991:57051](-)
For such files, I get the error:
antismash.common.secmet.errors.SecmetInvalidInputError: Features that bridge the record origin cannot be directly created: join{[56991:57051](-)}
Expected behavior
These files should be parsed like the others which don't contain this string.
Attached is a file which can help reproduce this error
CS1EDSR2D_564265.region001.gbk.txt
Describe the bug
I'm trying to run antiSMASH with the Docker image that was downloaded today around 15:00 (German time) as follows:
bash run_antismash-full input.fna '/output/folder' --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go
However, I keep getting the following error:
WARNING 12/07 13:14:50 Fasta header too long: renamed "Chlamy10_contig_108" to "c00108_Chlamy1.."
. . . many of these . . .
WARNING 12/07 13:14:50 Fasta header too long: renamed "Chlamy10_contig_109" to "c00109_Chlamy1.."
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.5/multiprocessing/pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/usr/local/lib/python3.5/dist-packages/antismash/common/record_processing.py", line 239, in ensure_cds_info
genefinding(sequence, options)
File "/usr/local/lib/python3.5/dist-packages/antismash/detection/genefinding/__init__.py", line 93, in run_on_record
raise ValueError("Called find_genes, but genefinding disabled")
ValueError: Called find_genes, but genefinding disabled
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/antismash", line 11, in <module>
sys.exit(entrypoint())
File "/usr/local/lib/python3.5/dist-packages/antismash/__main__.py", line 124, in entrypoint
sys.exit(main(sys.argv[1:]))
File "/usr/local/lib/python3.5/dist-packages/antismash/__main__.py", line 113, in main
antismash.run_antismash(sequence, options)
File "/usr/local/lib/python3.5/dist-packages/antismash/main.py", line 574, in run_antismash
result = _run_antismash(sequence_file, options)
File "/usr/local/lib/python3.5/dist-packages/antismash/main.py", line 628, in _run_antismash
cast(AntismashModule, genefinding))
File "/usr/local/lib/python3.5/dist-packages/antismash/common/record_processing.py", line 388, in pre_process_sequences
sequences = parallel_function(partial, ([sequence] for sequence in sequences))
File "/usr/local/lib/python3.5/dist-packages/antismash/common/subprocessing/base.py", line 132, in parallel_function
results = jobs.get(timeout=timeout)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
ValueError: Called find_genes, but genefinding disabled
Running antiSMASH FAILED
I'm fairly certain I've installed Docker and the image correctly, as I followed the instructions on the docs pages.
Antismash fails the prerequisite check during installation with the following traceback:
silwer@ibch@dna ~ $ antismash --check-prereqs
/usr/local/lib/python3.6/dist-packages/sklearn/externals/joblib/__init__.py:15: FutureWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
warnings.warn(msg, category=FutureWarning)
Traceback (most recent call last):
File "/home/domain/silwer/.local/bin/antismash", line 8, in <module>
sys.exit(entrypoint())
File "/home/domain/silwer/.local/lib/python3.6/site-packages/antismash/__main__.py", line 126, in entrypoint
sys.exit(main(sys.argv[1:]))
File "/home/domain/silwer/.local/lib/python3.6/site-packages/antismash/__main__.py", line 115, in main
antismash.run_antismash(sequence, options)
File "/home/domain/silwer/.local/lib/python3.6/site-packages/antismash/main.py", line 612, in run_antismash
result = _run_antismash(sequence_file, options)
File "/home/domain/silwer/.local/lib/python3.6/site-packages/antismash/main.py", line 619, in _run_antismash
_log_found_executables(options)
File "/home/domain/silwer/.local/lib/python3.6/site-packages/antismash/main.py", line 718, in _log_found_executables
version = " ({})".format(version_getter()) # pylint: disable=not-callable
File "/home/domain/silwer/.local/lib/python3.6/site-packages/antismash/common/subprocessing/java.py", line 21, in run_java_version
raise RuntimeError(msg % java)
RuntimeError: unexpected output from java: /usr/bin/java, check path
though java works just fine:
silwer@ibch@dna ~ $ java --version
Picked up _JAVA_OPTIONS: -Dswing.defaultlaf=com.sun.java.swing.plaf.gtk.GTKLookAndFeel -Dawt.useSystemAAFontSettings=on
openjdk 11.0.6 2020-01-14
OpenJDK Runtime Environment (build 11.0.6+10-post-Ubuntu-1ubuntu118.04.1)
OpenJDK 64-Bit Server VM (build 11.0.6+10-post-Ubuntu-1ubuntu118.04.1, mixed mode, sharing)
The reason for the crash is the presence of the custom _JAVA_OPTIONS, which were set through environment variables. If I unset them, antismash passes all the checks.
silwer@ibch@dna ~ $ unset _JAVA_OPTIONS
silwer@ibch@dna ~ $ java -version
openjdk version "11.0.6" 2020-01-14
OpenJDK Runtime Environment (build 11.0.6+10-post-Ubuntu-1ubuntu118.04.1)
OpenJDK 64-Bit Server VM (build 11.0.6+10-post-Ubuntu-1ubuntu118.04.1, mixed mode, sharing)
silwer@ibch@dna ~ $ antismash --check-prereqs
/usr/local/lib/python3.6/dist-packages/sklearn/externals/joblib/__init__.py:15: FutureWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
warnings.warn(msg, category=FutureWarning)
All prerequisites satisfied
It seems that the root of the problem is https://github.com/antismash/antismash/blob/master/antismash/common/subprocessing/java.py
with a very basic check:
def run_java_version() -> str:
""" Get the version of the java binary """
java = get_config().executables.java
command = [
java,
"-version",
]
version_string = execute(command).stderr
if not version_string.startswith("openjdk") and not version_string.startswith("java"):
msg = "unexpected output from java: %s, check path"
raise RuntimeError(msg % java)
# get rid of the non-version stuff in the output
return version_string.split()[2].strip('"')
which fails due to the use of the startswith function. The issue can be solved by this naive check:
if "openjdk" not in version_string and "java" not in version_string:
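A sketch of the fixed check (this is the reporter's suggestion applied to a stand-alone helper, not the exact antiSMASH code): use a substring test for the sanity check and scan line by line for the version, so notices like the `Picked up _JAVA_OPTIONS:` line printed before the version line don't break parsing.

```python
def parse_java_version(version_string: str) -> str:
    """Parse `java -version` stderr output into a version token.

    A substring check replaces startswith so extra lines such as
    `Picked up _JAVA_OPTIONS: ...` don't trip the sanity check.
    """
    if "openjdk" not in version_string and "java" not in version_string:
        raise RuntimeError("unexpected output from java, check path")
    for line in version_string.splitlines():
        if line.startswith(("openjdk", "java")):
            # e.g. 'openjdk version "11.0.6" 2020-01-14' -> '11.0.6'
            return line.split()[2].strip('"')
    raise RuntimeError("could not find a version line in java output")
```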
Describe the bug
antismash 5.0 runtime error:
RuntimeError: hmmscan returned 1: 'Error: TC bit thresholds unavailable on model Enediyne-KS' while scanning '>CMLKBGAL_01274\nMSSKLIYTGKAKDIYTTEDEHVIRSVYKDQATMLNGARKETIEGKGVLNNQISSLIFEKLNAAGVATHFIERISDTEQLNKKVT'
Expected behavior
The run completes without hmmscan failing on the Enediyne-KS model.
To Reproduce
For analysis issues, provide an accession number or a record fragment that can reproduce the problem.
antismash --cpus 8 --cf-create-clusters --smcog-trees --cb-knownclusters --asf --pfam2go --output-dir results/umgs/ERR2764806.metaspades.bin.4.fa --genefinding-gff3 ERR2764806.metaspades.bin.4.gff ERR2764806.metaspades.bin.4.fna
Describe the bug
I am seeing a 'Permission denied' error when using the --smcog-trees parameter; this was tested with both a manual installation of antiSMASH and a venv installation. The application finishes successfully without this parameter.
[cjfields@compute-8-0 2019-05-16-test-as]$ antismash --logfile as.log --debug --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go --smcog-trees -c $SLURM_NPROCS --genefinding-gff GCF_002157265.1_ASM215726v1_genomic.gff GCF_002157265.1_ASM215726v1_genomic.fna
...
DEBUG 16/05 10:40:44 annotating CDS features with resist info: 1 CDSes
DEBUG 16/05 10:40:44 annotating CDS features with smcogs info: 174 CDSes
DEBUG 16/05 10:40:44 Checking if antismash.modules.smcog_trees should be run
INFO 16/05 10:40:44 Running antismash.modules.smcog_trees
INFO 16/05 10:40:44 Calculating and drawing phylogenetic trees of cluster genes with smCOG members
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/apps/software/Python/3.6.1-IGB-gcc-4.9.4/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/apps/software/Python/3.6.1-IGB-gcc-4.9.4/lib/python3.6/multiprocessing/pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/modules/smcog_trees/trees.py", line 68, in smcog_tree_analysis
draw_tree(input_number, output_dir, gene_id)
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/modules/smcog_trees/trees.py", line 138, in draw_tree
run_result = subprocessing.execute(command)
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/common/subprocessing/base.py", line 86, in execute
proc = Popen(commands, stdin=stdin_redir, stdout=stdout, stderr=stderr)
File "/home/apps/software/Python/3.6.1-IGB-gcc-4.9.4/lib/python3.6/subprocess.py", line 707, in __init__
restore_signals, start_new_session)
File "/home/apps/software/Python/3.6.1-IGB-gcc-4.9.4/lib/python3.6/subprocess.py", line 1326, in _execute_child
raise child_exception_type(errno_num, err_msg)
PermissionError: [Errno 13] Permission denied
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/bin/antismash", line 11, in <module>
sys.exit(entrypoint())
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/__main__.py", line 124, in entrypoint
sys.exit(main(sys.argv[1:]))
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/__main__.py", line 113, in main
antismash.run_antismash(sequence, options)
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/main.py", line 574, in run_antismash
result = _run_antismash(sequence_file, options)
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/main.py", line 638, in _run_antismash
analysis_timings = analyse_record(record, options, analysis_modules, module_results)
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/main.py", line 264, in analyse_record
run_module(record, module, options, previous_result, timings)
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/main.py", line 236, in run_module
results = module.run_on_record(record, results, options)
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/modules/smcog_trees/__init__.py", line 128, in run_on_record
trees = generate_trees(smcogs_dir, record.get_cds_features_within_regions(), nrpspks_genes)
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/modules/smcog_trees/trees.py", line 48, in generate_trees
subprocessing.parallel_function(smcog_tree_analysis, args)
File "/home/groups/hpcbio/apps/antismash/antismash-github/install/lib/python3.6/site-packages/antismash/common/subprocessing/base.py", line 132, in parallel_function
results = jobs.get(timeout=timeout)
File "/home/apps/software/Python/3.6.1-IGB-gcc-4.9.4/lib/python3.6/multiprocessing/pool.py", line 608, in get
raise self._value
PermissionError: [Errno 13] Permission denied
Expected behavior
Well, that --smcog-trees would work, but I know this one is a difficult issue to wrangle, particularly with a complex tool with a lot of moving parts ;)
To Reproduce
Example data (GCF_002157265.1_ASM215726v1_genomic.fna and GCF_002157265.1_ASM215726v1_genomic.gff) from ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/002/157/265/GCF_002157265.1_ASM215726v1
Using Github checkout cbd4ad3 (May 14).
Screenshots
None
System (please complete the following information):
[cjfields@compute-8-0 2019-05-16-test-as]$ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
None (command line)
Additional context
The run passes all installation checks; I am attaching a log file for this run.
My suspicion is that an application is being called that isn't installed, but I can't home in on which one it is. muscle and FastTree are in the path.
as.log
The current antismash release (5.1.1) comes with an outdated version of the pfam2go database. The one distributed with the tarball is dated "!version date: 2018/02/24 12:32:44", while the current one (http://current.geneontology.org/ontology/external2go/pfam2go) is dated "!version date: 2019/12/14 15:47:44". A lot has changed: the versions differ not only in the number of lines (10570 vs. 10332 in the latter) but also in the naming of some pathways, e.g. (note the hyphen in "G-protein"):
< Pfam:PF00001 7tm_1 > GO:G-protein coupled receptor activity ; GO:0004930
< Pfam:PF00001 7tm_1 > GO:G-protein coupled receptor signaling pathway ; GO:0007186
---
> Pfam:PF00001 7tm_1 > GO:G protein-coupled receptor activity ; GO:0004930
> Pfam:PF00001 7tm_1 > GO:G protein-coupled receptor signaling pathway ; GO:0007186
See the attached diff (https://pastebin.com/4pKmuzmt) for a more detailed view.
Thus it would be very convenient to have a way to change the version or, at least, to use the most recent file in the corresponding directory. At the moment the name of the database file is hardcoded in several places:
modules/pfam2go/__init__.py:38: pfam2go-march-2018.txt: mapping file for Pfam to Gene Ontology mapping
modules/pfam2go/__init__.py:41: if path.locate_file(path.get_full_path(__file__, 'data', 'pfam2go-march-2018.txt')) is None:
modules/pfam2go/pfam2go.py:159: full_gomap_as_ontologies = construct_mapping(path.get_full_path(__file__, 'data', 'pfam2go-march-2018.txt'))
modules/pfam2go/test/test_pfam2go.py:84: data = path.get_full_path(os.path.dirname(__file__), 'data', 'pfam2go-march-2018.txt')
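Until that lands, one low-effort direction is to stop hardcoding the date-stamped name and instead resolve whichever pfam2go file is present in the data directory. A minimal sketch of such a lookup (a hypothetical helper, not current antiSMASH code):

```python
import os

def locate_pfam2go(data_dir: str) -> str:
    """Return the path of the most recently modified pfam2go mapping file
    in data_dir, so dropping in a fresh download is enough to use it."""
    candidates = [
        os.path.join(data_dir, name)
        for name in os.listdir(data_dir)
        if name.startswith("pfam2go") and name.endswith(".txt")
    ]
    if not candidates:
        raise FileNotFoundError("no pfam2go mapping file found in %s" % data_dir)
    # newest file wins
    return max(candidates, key=os.path.getmtime)
```

The three hardcoded call sites above could then share this single lookup.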
The "Most similar known cluster" column provides a % similarity to a significant BGC hit in BLAST, but when that column returns, say, 3% or even 30%, is that really enough to conclude that this BGC might produce that compound? Sorry if I am misunderstanding the purpose of that output.
Hello Team Antismash,
I am wondering if there is a way to combine antiSMASH results from different runs. For example, to take two JSON outputs, combine them, and then generate a new, combined HTML output? I played around a bit with the AntismashResults class, but it looks like module results are either not stored or not parsed, so while I was able to load the JSON files I was unable to generate fresh HTML. Is there some way to do this currently? If not, I think this might be a useful additional tool.
Regards,
zach cp
Apologies if my request is misplaced here. I see the TSV outputs were available in v3, with JSON being the preferred output format for v5.
I am looking for a simple script which can parse the antiSMASH output and return a frequency table of the detected BGCs along with feature data such as the predicted product. I am working with a bunch of metagenome-assembled genomes and am therefore trying to aggregate results in a meaningful way.
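In case it helps others, here is a rough stdlib-only sketch of such a tally, assuming the v5 JSON layout in which each record carries a features list and region features expose a product qualifier (verify against your own output first, as the layout may differ between versions):

```python
import json
from collections import Counter

def count_region_products(result: dict) -> Counter:
    """Tally predicted products across all region features in a parsed
    antiSMASH JSON result (assumed layout: records -> features -> qualifiers)."""
    counts = Counter()
    for record in result.get("records", []):
        for feature in record.get("features", []):
            if feature.get("type") != "region":
                continue
            for product in feature.get("qualifiers", {}).get("product", []):
                counts[product] += 1
    return counts

# usage: counts = count_region_products(json.load(open("genome.json")))
```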
Describe the bug
antiSMASH throws an error during check_prerequisites. It tries to update /usr/local/lib/python3.5/dist-packages/antismash/detection/hmm_detection/data/bgc_seeds.hmm, but this file is not writable for the user (it belongs to root).
Expected behavior
From my point of view antiSMASH should not update any installed files when a user runs the tool. Is there an option to place these prerequisites somewhere the user can write to? Or can I disable the check_prerequisites step?
To Reproduce
I am trying to install antiSMASH on our HPC cluster. Because antiSMASH has many dependencies, I wanted to use a Singularity container, so I pulled your Docker container with Singularity 3.2.1:
singularity pull docker://antismash/standalone:5.0.0
In order to use the Singularity container, I changed the container execution in your wrapper script to:
singularity --debug run --writable-tmpfs \
--bind ${INPUT_DIR}:${CONTAINER_SRC_DIR} \
--bind ${OUTPUT_DIR}:${CONTAINER_DST_DIR} \
standalone_5.0.0.sif \
${INPUT_FILE} \
$@
Then I execute the wrapper script with a small test data set, provided by the user:
./run_antismash ./example_input/CP026121sequence.fasta \
/example_output/ \
-c 1 -v -d \
--genefinding-tool prodigal \
--logfile test.log
The error message I get is:
ERROR 19/07 11:05:51 antismash.detection.hmm_detection: preqrequisite failure: Failed to generate file '/usr/local/lib/python3.5/dist-packages/antismash/detection/hmm_detection/data/bgc_seeds.hmm'
Traceback (most recent call last):
File "/usr/local/bin/antismash", line 11, in <module>
sys.exit(entrypoint())
File "/usr/local/lib/python3.5/dist-packages/antismash/__main__.py", line 124, in entrypoint
sys.exit(main(sys.argv[1:]))
File "/usr/local/lib/python3.5/dist-packages/antismash/__main__.py", line 113, in main
antismash.run_antismash(sequence, options)
File "/usr/local/lib/python3.5/dist-packages/antismash/main.py", line 574, in run_antismash
result = _run_antismash(sequence_file, options)
File "/usr/local/lib/python3.5/dist-packages/antismash/main.py", line 603, in _run_antismash
check_prerequisites(options.all_enabled_modules, options)
File "/usr/local/lib/python3.5/dist-packages/antismash/main.py", line 511, in check_prerequisites
raise RuntimeError("Modules failing prerequisites")
RuntimeError: Modules failing prerequisites
Running antiSMASH FAILED
System (please complete the following information):
Additional context
I think I would have the same problem if I installed an environment module, because the installation directory is also not writable for users of our HPC cluster.
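A possible direction for the tool itself: check writability up front and skip (or redirect) the data regeneration instead of failing. A sketch of such a check (hypothetical helper, not current antiSMASH code):

```python
import os

def can_regenerate(path: str) -> bool:
    """Return True if `path` (or its parent directory, when the file does not
    exist yet) is writable by the current user -- a check that could run
    before attempting to rebuild data files such as bgc_seeds.hmm."""
    if os.path.exists(path):
        return os.access(path, os.W_OK)
    return os.access(os.path.dirname(path) or ".", os.W_OK)
```

With that, read-only installs (Singularity, environment modules) could fall back to a user-writable cache directory rather than aborting the run.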
Describe the bug
I encountered a parsing error for a GFF3 file when running standalone antiSMASH 5.0.0 from https://hub.docker.com/r/antismash/standalone on Linux. The error message shows "could not parse records from GFF3".
Expected behavior
The same GFF3 file worked using the public web server for antiSMASH 5.0.0.
To Reproduce
Here is the command:
/home/antismash/bin/run_antismash ~/data/sequences/SPAdes_Pcocy_lophfium_scaffolds.fasta ~/data/maker/pcocy/antismash --taxon fungi --tta-threshold 0.65 --cb-general --cb-subclusters --cb-knownclusters --smcog-trees --fullhmmer --asf --pfam2go --cf-create-clusters --genefinding-tool none --genefinding-gff3 ~/data/maker/pcocy/pcocy.all.functional.ipr.sprot.gff
System (please complete the following information):
Additional context
The GFF3 file was generated by MAKER.
Is your feature request related to a problem? Please describe.
Let's say you have a number of gene clusters that cannot be shared publicly but which you would like to incorporate into antiSMASH. Ideally, there would be a way to take these clusters, extract the relevant information, and incorporate them into the known-cluster or sub-cluster files so that when you run antiSMASH your clusters are included in the results.
Describe the solution you'd like
Take as input one or more GBK/JSON files and:
Describe alternatives you've considered
Cluster and FASTA files need to be generated to be used in clusterblast. These files can be generated during build/install, or they can be specified at runtime. I have highlighted the option whereby we build the files beforehand. However, the ability to add files at runtime may be more attractive from an operational point of view: e.g. antiSMASH can still be built as before, but you could specify extra file paths to look in. This would require the same steps as above, except that step (4.) would now act at runtime. It would need to extend the clusterblast module to handle linting/validation of the files and to either combine these files on the fly or run two BLAST jobs and combine the results.
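As a sketch of the runtime variant, the combine step could be as simple as validating and appending the user files to a working copy of the shipped FASTA before the BLAST database is built (build_combined_fasta is a hypothetical helper, not clusterblast API):

```python
import shutil

def build_combined_fasta(base_fasta, extra_fastas, output_path):
    """Copy the shipped clusterblast FASTA and append user-supplied records,
    rejecting files that do not even look like FASTA."""
    shutil.copyfile(base_fasta, output_path)
    with open(output_path, "a") as out:
        for extra in extra_fastas:
            with open(extra) as handle:
                first = handle.readline()
                if not first.startswith(">"):
                    raise ValueError("%s does not look like FASTA" % extra)
                out.write(first)
                # stream the rest of the file across without loading it all
                shutil.copyfileobj(handle, out)
    return output_path
```

Real validation would of course need to be stricter (unique IDs, matching cluster metadata), but the on-the-fly combination itself is cheap.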
Additional context
I think this is a potentially useful feature that would be of value to a wide range of academic and industrial groups.
I manually installed antiSMASH v5.1 on a Linux server and did a test run with the minimal flag; the output looked fine.
But a full-featured run throws the following error:
Process ForkPoolWorker-62:
Traceback (most recent call last):
File "miniconda3/envs/antismash/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "miniconda3/envs/antismash/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "miniconda3/envs/antismash/lib/python3.7/multiprocessing/pool.py", line 110, in worker
task = get()
File "miniconda3/envs/antismash/lib/python3.7/multiprocessing/queues.py", line 354, in get
return _ForkingPickler.loads(res)
File "miniconda3/envs/antismash/lib/python3.7/site-packages/antismash/common/secmet/qualifiers/nrps_pks.py", line 103, in extend
raise NotImplementedError("Extending this list won't work")
NotImplementedError: Extending this list won't work
Describe the bug
No NRPS structures are drawn for NRPSs that did get predicted structures in antiSMASH 4.
To Reproduce
I've installed antiSMASH 5.0.0 from the Releases section of the GitHub antiSMASH site and it seems to work fine. I'm using GCA_000021785.1_ASM2178v1_genomic.gbff from NCBI (Bacillus cereus) for testing. When run in antiSMASH 4, the NRPS pages show predicted structures, whereas in antiSMASH 5 I get a link to the Norine web page that doesn't work.
I noticed that in the example antiSMASH output for S. coelicolor on the antiSMASH web site, the NRPSs do have structures available.
Is there something missing in the installation that I can add to get the structure prediction images? Going to the Norine web site is not useful in my context.
It would be very useful if those of us on clusters using conda could run the most current version of antiSMASH, so are there any plans to update the version available via conda?
Describe the bug
I added a custom Resfam pHMM model that overlaps semantically with an existing smCOG annotation. I see the smCOG annotation ("other") in the HTML output (the gene color is gray).
Expected behavior
I expect the resistance annotation to take priority (e.g. a pink gene in the HTML output). I can see both the smCOG and resist annotations in the side panel, and the resist annotation has the higher bitscore.
According to the comments in the GeneFunctionsAnnotations class, I would expect the Resfams to take priority:
antismash/antismash/common/secmet/qualifiers/gene_functions.py
Lines 181 to 183 in c72e99f
However, I believe the name of the tool is misspecified on line 182:
# then priority for resfam, then smcogs
- annotations = self._by_tool.get("resfam", self._by_tool.get("smcogs"))
+ annotations = self._by_tool.get("resist", self._by_tool.get("smcogs"))
if annotations:
return annotations[0].function
If both smcogs and resist are present, the annotation defaults to smcogs.
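The intended priority can be shown with a stripped-down stand-in for that lookup (a simplified illustration with plain values, not the real class):

```python
def priority_function(by_tool: dict):
    """Pick an annotation preferring the 'resist' tool over 'smcogs',
    mirroring the one-line fix proposed above; plain strings stand in
    for GeneFunction objects here."""
    annotations = by_tool.get("resist", by_tool.get("smcogs"))
    if annotations:
        return annotations[0]
    return None
```

With "resfam" as the key, the first get() always misses and the smCOG entry wins, which matches the behaviour I see.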
To Reproduce
I can create a reproducible example if you disagree that the line above is a straightforward bug.
System (please complete the following information):
Describe the bug
Trying to run a few strains from the command line, I got an error.
Expected behavior
I expected it to work fine as usual.
To Reproduce
My command line
for i in GCF_000954135.1 ; do run_antismash.py -c 8 --taxon bacteria --input-type nucl --transatpks_da --transatpks_da_cutoff 2 --clusterblast --subclusterblast --smcogs --inclusive --full-hmmer --asf -v --outputfolder antismash_results_156_ralstonia/${i}.antismash ${i}.gbff; done
File "/usr/local/bin/antismash-4.2.0/run_antismash.py", line 45, in <module>
from antismash.specific_modules import (
File "/usr/local/bin/antismash-4.2.0/antismash/specific_modules/lantipeptides/__init__.py", line 20, in <module>
from .specific_analysis import specific_analysis
File "/usr/local/bin/antismash-4.2.0/antismash/specific_modules/lantipeptides/specific_analysis.py", line 27, in <module>
from antismash.specific_modules.lassopeptides.specific_analysis import distance_to_pfam, find_all_orfs
File "/usr/local/bin/antismash-4.2.0/antismash/specific_modules/lassopeptides/__init__.py", line 20, in <module>
from .specific_analysis import specific_analysis
File "/usr/local/bin/antismash-4.2.0/antismash/specific_modules/lassopeptides/specific_analysis.py", line 33, in <module>
from svm_lasso import svm_classify
File "/usr/local/bin/antismash-4.2.0/antismash/specific_modules/lassopeptides/svm_lasso/svm_classify.py", line 54, in <module>
import sklearn
File "/usr/lib/python2.7/dist-packages/sklearn/__init__.py", line 57, in <module>
from .base import clone
File "/usr/lib/python2.7/dist-packages/sklearn/base.py", line 12, in <module>
from .utils.fixes import signature
File "/usr/lib/python2.7/dist-packages/sklearn/utils/__init__.py", line 10, in <module>
from .murmurhash import murmurhash3_32
File "__init__.pxd", line 155, in init sklearn.utils.murmurhash (sklearn/utils/murmurhash.c:6314)
System (please complete the following information):
Debian 9
The geneclusters.txt output of antiSMASH shows no "Compound with gene cluster of highest homology" despite the HTML output showing most similar known clusters with a % match. Is there a reason for this, or are they two different things?
I used the manual installation from https://docs.antismash.secondarymetabolites.org/install/
Execution command:
ls $INPUT/*.fna | sed 's/.*\///'| sed 's/.fna//'| parallel -j $CPU "echo ============ processing {};
time antismash --cpus $CPU -v --taxon fungi --fullhmmer --cassis --cf-create-clusters --smcog-trees --cb-general --cb-subclusters --cb-knownclusters --asf --pfam2go --output-dir $OUT/{} --genefinding-gff3 antismashIN/{}.gff --logfile $OUTPUT/{}.log antismashIN/{}.fna"
I got an error message:
INFO 05/06 03:00:36 Analysing record: EU1_51
INFO 05/06 03:00:36 Running antismash.detection.full_hmmer
INFO 05/06 03:00:36 Running whole-genome PFAM search
INFO 05/06 03:00:48 Detecting secondary metabolite clusters
INFO 05/06 03:00:48 Running antismash.detection.hmm_detection
INFO 05/06 03:00:49 Running antismash.detection.cassis
INFO 05/06 03:00:49 Detecting gene cluster regions using CASSIS
INFO 05/06 03:00:49 Running antismash.detection.clusterfinder_probabilistic
INFO 05/06 03:00:49 Running ClusterFinder to detect probabilistic gene clusters
INFO 05/06 03:00:49 No regions detected, skipping record
INFO 05/06 03:00:49 Analysing record: EU1_52
INFO 05/06 03:00:49 Running antismash.detection.full_hmmer
INFO 05/06 03:00:49 Running whole-genome PFAM search
INFO 05/06 03:01:03 Detecting secondary metabolite clusters
INFO 05/06 03:01:03 Running antismash.detection.hmm_detection
INFO 05/06 03:01:03 Running antismash.detection.cassis
INFO 05/06 03:01:03 Detecting gene cluster regions using CASSIS
INFO 05/06 03:01:03 Running antismash.detection.clusterfinder_probabilistic
INFO 05/06 03:01:03 Running ClusterFinder to detect probabilistic gene clusters
INFO 05/06 03:01:03 1 region(s) detected in record
INFO 05/06 03:01:03 Running antismash.detection.nrps_pks_domains
INFO 05/06 03:01:04 Running antismash.detection.genefunctions
INFO 05/06 03:01:05 Running antismash.modules.smcog_trees
INFO 05/06 03:01:05 Calculating and drawing phylogenetic trees of cluster genes with smCOG members
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/anaconda/envs/antismash5/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/opt/anaconda/envs/antismash5/lib/python3.6/multiprocessing/pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/modules/smcog_trees/trees.py", line 68, in smcog_tree_analysis
draw_tree(input_number, output_dir, gene_id)
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/modules/smcog_trees/trees.py", line 161, in draw_tree
label_func=lambda node: str(node).replace("|", " "))
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/Bio/Phylo/_utils.py", line 344, in draw
import matplotlib.pyplot as plt
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/matplotlib/pyplot.py", line 32, in
import matplotlib.colorbar
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/matplotlib/colorbar.py", line 29, in
import matplotlib.collections as collections
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/matplotlib/collections.py", line 2056, in
docstring.interpd.update(LineCollection=artist.kwdoc(LineCollection))
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/matplotlib/artist.py", line 1583, in kwdoc
leadingspace=2))
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/matplotlib/artist.py", line 1354, in pprint_setters
accepts = self.get_valid_values(prop)
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/matplotlib/artist.py", line 1247, in get_valid_values
return re.sub("\n *", " ", match.group(1))
File "/opt/anaconda/envs/antismash5/lib/python3.6/re.py", line 191, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/anaconda/envs/antismash5/bin/antismash", line 10, in
sys.exit(entrypoint())
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/main.py", line 124, in entrypoint
sys.exit(main(sys.argv[1:]))
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/main.py", line 113, in main
antismash.run_antismash(sequence, options)
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/main.py", line 574, in run_antismash
result = _run_antismash(sequence_file, options)
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/main.py", line 638, in _run_antismash
analysis_timings = analyse_record(record, options, analysis_modules, module_results)
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/main.py", line 264, in analyse_record
run_module(record, module, options, previous_result, timings)
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/main.py", line 236, in run_module
results = module.run_on_record(record, results, options)
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/modules/smcog_trees/__init__.py", line 128, in run_on_record
trees = generate_trees(smcogs_dir, record.get_cds_features_within_regions(), nrpspks_genes)
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/modules/smcog_trees/trees.py", line 48, in generate_trees
subprocessing.parallel_function(smcog_tree_analysis, args)
File "/opt/anaconda/envs/antismash5/lib/python3.6/site-packages/antismash/common/subprocessing/base.py", line 132, in parallel_function
results = jobs.get(timeout=timeout)
File "/opt/anaconda/envs/antismash5/lib/python3.6/multiprocessing/pool.py", line 670, in get
raise self._value
TypeError: expected string or bytes-like object
real 60m17.158s
user 144m35.199s
sys 15m16.297s
The error does not happen immediately; some trees were created.
Ubuntu 18.04
Describe the bug
I am using antiSMASH 4.2 (installed via conda), but the --help documentation says to use antismash -h [options]. This just results in the -h call showing all the help documentation, with no other execution or analysis occurring. How can I run antismash with custom options without this -h flag? The slurm-generated output log shows no errors of any kind.
Expected behavior
Antismash to begin analysis with the options I flagged
System (please complete the following information):
We are running antiSMASH to find the biosynthetic capabilities of different organisms, but we already have our own annotation of these genomes.
Is there a way to supply our own GenBank file with our annotation, so that the generated data is compatible with our own open reading frames?
I ran download-antismash-database --database-dir $MY_SPECIAL_PATH, which worked fine, but I can't find any docs on how to use a custom database location when running antismash. I don't see any such parameter in the antismash script docs. I'm guessing that I have to set some environment variable (e.g., ANTISMASH_DB_DIR=$MY_SPECIAL_PATH), but I can't find any information in the docs on this.
It would help to have clear docs (script and/or at https://docs.antismash.secondarymetabolites.org/install/) for using a custom database path.
antismash --check-prereqs
Traceback (most recent call last):
File "/usr/local/bin/antismash", line 9, in
entrypoint()
File "antismash-5.0.0/antismash/main.py", line 124, in entrypoint
sys.exit(main(sys.argv[1:]))
File "/antismash-5.0.0/antismash/main.py", line 113, in main
antismash.run_antismash(sequence, options)
File "antismash-5.0.0/antismash/main.py", line 574, in run_antismash
result = _run_antismash(sequence_file, options)
File "antismash-5.0.0/antismash/main.py", line 596, in _run_antismash
check_prerequisites(modules, options)
File "antismash-5.0.0/antismash/main.py", line 504, in check_prerequisites
res = module.check_prereqs(options)
File "antismash-5.0.0/antismash/outputs/html/__init__.py", line 66, in check_prereqs
return prepare_data()
File "antismash-5.0.0/antismash/outputs/html/__init__.py", line 57, in prepare_data
result = scss.Compiler(output_style="expanded").compile(flavour + ".scss")
AttributeError: module 'scss' has no attribute 'Compiler'
The file
https://github.com/antismash/antismash/blob/master/antismash/outputs/html/__init__.py
requires
https://github.com/Kronuz/pyScss
so the following should be run:
pip uninstall libsass
pip uninstall scss
pip install pyScss
and in
https://github.com/antismash/antismash/blob/master/antismash/outputs/html/__init__.py
add:
from scss import Compiler
under import scss
With that, the issue is fixed.
Greetings
Can we use fungiSMASH on the command line? I can't find a separate install for it, and nothing in the antiSMASH documentation seems to show that we can.
When is the updated version expected on bioconda?
Hello,
when I installed antiSMASH, a problem came up.
I followed this web site: https://docs.antismash.secondarymetabolites.org/install/
My OS is Ubuntu 18.04.
antismash --check-prereqs
When I run this command in the terminal, these results come out:
ERROR 19/04 17:22:20 Failed to locate executable for 'long-orfs'
ERROR 19/04 17:22:20 Failed to locate executable for 'extract'
ERROR 19/04 17:22:20 Failed to locate executable for 'build-icm'
ERROR 19/04 17:22:20 Failed to locate executable for 'glimmer3'
ERROR 19/04 17:22:20 Failed to locate executable for 'glimmerhmm'
ERROR 19/04 17:22:20 Failed to locate executable for 'hmmpfam2'
ERROR 19/04 17:22:20 Failed to locate executable for 'hmmpfam2'
ERROR 19/04 17:22:20 Failed to locate executable for 'java'
ERROR 19/04 17:22:20 Failed to locate executable for 'hmmpfam2'
ERROR 19/04 17:22:20 Failed to locate executable for 'hmmpfam2'
ERROR 19/04 17:22:20 Not all prerequisites met
I downloaded everything needed, following the steps, but the errors still come out.
Please help.
Thank you
Is your feature request related to a problem? Please describe.
Extract module nucleotide co-ordinates for NRPS-coding ORFs. Since v5, module co-ordinate inference has become difficult due to the lack of the txt folder from v4.1.2.
Describe the solution you'd like
Ideally, get a text file with tab separated output stating ORF-ID, Module-number, module.start, module.end
Describe alternatives you've considered
Smart JSON Editor (http://www.smartjsoneditor.com/) and extracting, per ORF, the co-ordinates from the modules section of the JSON tree.
(Note: this gives only amino acid positions and not nucleic acid positions.)
Additional context
I need to apply the alternative to a large number of gene clusters with long NRPSs, which makes manual co-ordinate extraction tedious and mistake-prone. This need not be a standard output, but if there were a flag in the antiSMASH offline run to output this info it would be great.
Advantages
This will enable meta-analysis on evolution of modules and help understand diversification mechanisms in NRPS.
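For the alternative above, the missing piece is converting the amino-acid positions from the JSON into genomic coordinates. A sketch of that conversion for the simple case (1-based inclusive coordinates, no introns, complete codons; protein_to_nucleotide is a hypothetical helper):

```python
def protein_to_nucleotide(cds_start, cds_end, strand, aa_start, aa_end):
    """Convert 1-based amino-acid positions within a CDS to genomic
    nucleotide coordinates, honouring the strand of the CDS."""
    if strand == 1:
        nt_start = cds_start + (aa_start - 1) * 3
        nt_end = cds_start + aa_end * 3 - 1
    else:
        # reverse strand: count back from the 3' end of the CDS
        nt_start = cds_end - aa_end * 3 + 1
        nt_end = cds_end - (aa_start - 1) * 3
    return nt_start, nt_end
```

Feeding each module's aa range through this, together with its ORF's CDS location, would give exactly the ORF-ID / module number / start / end table requested.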
Describe the bug
When running antiSMASH 5 in a docker container, I'm getting a KeyError when the update_prediction function from the nrps_pks/parsers module is called.
Expected behavior
I expect AD/KS domains to be sanitized and rebuilt from scratch, since --skip-sanitisation is not flagged.
To Reproduce
wget https://mibig.secondarymetabolites.org/repository/BGC0000001/BGC0000001.1.region001.gbk
mkdir BGC0000001
## Gives error
docker run -v ${PWD}:/input/ -v ${PWD}/BGC0000001:/output/ --detach=false --rm \
--user=$(id -u):$(id -g) antismash/standalone:5.0.0 BGC0000001.1.region001.gbk
## No error (but is the source of BGC0000001)
efetch -db nuccore -format gb -id JF752342.1 > JF752342.1.gbk
docker run -v ${PWD}:/input/ -v ${PWD}/BGC0000001:/output/ --detach=false --rm \
--user=$(id -u):$(id -g) antismash/standalone:5.0.0 JF752342.1.gbk
System (please complete the following information):
Additional context
Normally, I would think about just stripping out the annotations, but this is also interacting with issue #180, where we want to combine / reuse existing results.
thanks!
I ran antiSMASH 4.2 on 10 genomes downloaded from NCBI. After downloading, I ran Prokka and then used the *.gbk files to run antiSMASH. Nine of the files were fine, but for one file I got this error when running antiSMASH:
/home/vdsa/software/antismash-4.2.0/run_antismash.py -c 12 --transatpks_da --clusterblast --subclusterblast --knownclusterblast --smcogs --inclusive --borderpredict --full-hmmer --asf --tta -v --outputfolder GCA_003945505.1_ASM394550v1_genomic GCA_003945505.1_ASM394550v1_genomic.gbk
INFO 10/01 14:07:55 Loading detection plugins
INFO 10/01 14:07:55 Parsing the input sequence(s)
INFO 10/01 14:07:58 Analyzing record 1 (CP029618.1)
INFO 10/01 14:07:58 Detecting secondary metabolite signature genes for contig #1
INFO 10/01 14:08:12 Detecting secondary metabolite clusters using inclusive ClusterFinder algorithm for contig #1
INFO 10/01 14:08:12 Running whole-genome pfam search
INFO 10/01 14:43:38 Running ClusterFinder HMM to detect gene clusters
INFO 10/01 14:43:50 Running cluster-specific analyses
INFO 10/01 14:43:50 Calculating detailed predictions for Lanthipeptide clusters
INFO 10/01 14:48:04 Calculating detailed predictions for lasso peptide clusters
INFO 10/01 14:48:32 Predicting NRPS A domain substrate specificities with SANDPUMA
INFO 10/01 15:10:25 Predicting PKS AT domain substrate specificities by Yadav et al. PKS signature sequences
INFO 10/01 15:10:37 Predicting PKS AT domain substrate specificities by Minowa et al. method
INFO 10/01 15:10:59 Predicting CAL domain substrate specificities by Minowa et al. method
INFO 10/01 15:11:00 Predicting PKS KR activity and stereochemistry using KR fingerprints from Starcevic et al.
INFO 10/01 15:11:30 Aligning Tans-AT PKS domains
INFO 10/01 15:11:53 TransATPKS: constructing phylogeny tree for each KS domain
INFO 10/01 15:17:49 Phylogenetic analysis: predicting substrate specificity of KS
INFO 10/01 15:17:56 MAFFT: generating pairwise distance matrix of multiple sequence alignment of all domains
INFO 10/01 15:23:19 TRANSATPKS: generating distance matrix of assembly lines
Traceback (most recent call last):
File "/home/vdsa/software/antismash-4.2.0/run_antismash.py", line 1210, in
main()
File "/home/vdsa/software/antismash-4.2.0/run_antismash.py", line 559, in main
run_analyses(seq_record, options, plugins)
File "/home/vdsa/software/antismash-4.2.0/run_antismash.py", line 643, in run_analyses
cluster_specific_analysis(plugins, seq_record, options)
File "/home/vdsa/software/antismash-4.2.0/run_antismash.py", line 1191, in cluster_specific_analysis
plugin.specific_analysis(seq_record, options)
File "/home/vdsa/software/antismash-4.2.0/antismash/specific_modules/nrpspks/specific_analysis.py", line 99, in specific_analysis
classify_nrpspks_domains_ks(pksnrpsvars, seq_record, options)
File "/home/vdsa/software/antismash-4.2.0/antismash/specific_modules/nrpspks/nrpspks_classification.py", line 159, in classify_nrpspks_domains_ks
similar_bgc_per_cluster, new_cluster, new_cluster_index = cdt.run_calculate_distance(data_dir = data_dir, seq_simScore = KS_msa_dist, ksa_per_new_cluster=KS_per_cluster, cutoff_bgc_nr = bgcs_nr)
File "/home/vdsa/software/antismash-4.2.0/antismash/specific_modules/nrpspks/nrpspksdomainalign/calculate_distance_transATPKS.py", line 494, in run_calculate_distance
BGCs, DMS, cluster_seq, new_cluster, new_cluster_index = generate_BGCs_DMS(cluster=cluster, cluster_KSindex=cluster_KSindex, pseudo_aa=pseudo_aa, ksa_per_new_cluster=ksa_per_new_cluster)
File "/home/vdsa/software/antismash-4.2.0/antismash/specific_modules/nrpspks/nrpspksdomainalign/calculate_distance_transATPKS.py", line 119, in generate_BGCs_DMS
new_cluster, new_cluster_index = _get_new_cluster_index(ksa_per_new_cluster=ksa_per_new_cluster)
File "/home/vdsa/software/antismash-4.2.0/antismash/specific_modules/nrpspks/nrpspksdomainalign/calculate_distance_transATPKS.py", line 68, in _get_new_cluster_index
cluster_i_dict_wob[int(index)] = [ksa_index, cluster_i[k][0]]
ValueError: invalid literal for int() with base 10: 'KS?'
Describe the bug
Encountered the following error messages when running antiSMASH 5.0.0 from docker image with the option of --cassis.
antismash.detection.cassis: preqrequisite failure: Failed to locate executable for 'meme'
antismash.detection.cassis: preqrequisite failure: Failed to locate executable for 'fimo'
To Reproduce
/home/antismash/bin/run_antismash ~/data/maker/pcocy/SPAdes_Pcocy_lophfium_scaffolds.fasta ~/data/maker/pcocy/test --taxon fungi --cassis --tta-threshold 0.65 --cb-general --cb-subclusters --cb-knownclusters --smcog-trees --fullhmmer --asf --pfam2go --cf-create-clusters --genefinding-tool none --genefinding-gff3 /input/pcocy.all.functional.ipr.sprot.gff
System (please complete the following information):
I was testing out antiSMASH (downloaded using Bioconda) on a few known pathogens. I downloaded GenBank records from NCBI (e.g. Clostridium difficile: GCA_002073735.2) and ran antismash with default settings, and got the following warning. It doesn't keep antiSMASH from generating an output:
/usr/miniconda2/envs/antismash/lib/python2.7/site-packages/Bio/Seq.py:2576: BiopythonWarning: Partial codon, len(sequence) not a multiple of three. Explicitly trim the sequence or add trailing N before translation. This may become an error in future.
I downloaded 5 other genomes from NCBI and got the same warning for all of them.
Hello antiSMASH development team,
how can I do large-scale analysis of antiSMASH results for many genomes in a programmatic manner? Should I parse the *.gbk files, the *.json file, or regions.js?
Is your feature request related to a problem? Please describe.
Yes. Basically, I run antiSMASH on some RefSeq bacterial assemblies in FASTA format. Some of those have multiple FASTA records within a single assembly, and each FASTA record has an ID of the form XXXXXXXX.#. My issue is that antiSMASH removes the last part of the accession ID (the .#), so that output files get names like XXXXXXXX_BGC.txt and BGCs get names like XXXXXXXX_c1. Hence I have no direct way of matching my original FASTA record IDs to the filenames. I have resorted to changing the IDs in my original input files to some custom unique values and keeping a separate table that matches my unique accessions to the original ones. Obviously this is cumbersome and far from optimal, since I duplicate data and add extra steps whenever I want to match the antiSMASH results with other data.
Describe the solution you'd like
Simply keeping the original IDs of the input fasta records would be the ideal. This would be the best practice since software shouldn't modify the IDs of the input sequences, and the predictions technically correspond not only to an accession, but to a specific version as well.
Describe alternatives you've considered
I imagine there could be reasons why having a dot in the BGC_ID could cause some incompatibilities in the pipeline. An alternative would be to add a column to the output files that have the original unmodified ID of the input fasta record where a BGC is found.
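For anyone stuck with the current behaviour, the workaround described above can at least be automated: rewrite the headers to short unique IDs and emit the mapping table in one pass (rename_fasta_records is a hypothetical helper, stdlib only):

```python
def rename_fasta_records(in_path, out_path, map_path, prefix="seq"):
    """Rewrite FASTA headers to short unique IDs and record a TSV mapping
    of new ID -> original accession, automating the manual workaround."""
    counter = 0
    with open(in_path) as src, open(out_path, "w") as dst, open(map_path, "w") as tab:
        for line in src:
            if line.startswith(">"):
                counter += 1
                new_id = "%s%05d" % (prefix, counter)
                # keep only the first token of the original header as the accession
                original = line[1:].strip().split()[0]
                tab.write("%s\t%s\n" % (new_id, original))
                dst.write(">%s\n" % new_id)
            else:
                dst.write(line)
```

The mapping table can then be joined back onto the antiSMASH output filenames after the run.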
Great software
Sur
Bacteroides thetaiotaomicron VPI-5482 (NCBI assembly ID: GCA_000011065.1) has experimentally verified BGCs in its genome, which are also present in MIBiG. Why am I getting no antiSMASH results for this strain? I ran antiSMASH both locally (conda) and on the antiSMASH server.
I am running antiSMASH 4.2 installed through conda and saving only the txt output.
I cannot find explicit documentation indicating what the values in the "BGC_range" column in the _BGC.txt files mean.
For example, if I had 1:2 in the range and the sequence ACT, which bases are included in the feature?
AC
A
C
CT
Thanks
Whilst the HTML output is very useful to look at, having a tabular version of it would be more useful for generating summaries across samples or groups of samples, and just for general use in downstream analysis. I know you used to have the geneclusters.txt output but no longer generate it, so what is the best way to get such a tabular output now?
Is your feature request related to a problem? Please describe.
Some of my genomes have long contig headers, and preserving them as such is useful for downstream stats. I understand the rationale behind renaming is to conform with the GenBank limit, but I guess it becomes cumbersome for users who don't intend to work with or submit in the GenBank format.
Describe the solution you'd like
Renaming headers should be optional