:mag_right: :pill: Mass screening of contigs for antimicrobial and virulence genes

License: GNU General Public License v2.0

antimicrobial-resistance-genes virulence-genes contigs bioinformatics-tool genomic-regions

abricate's Introduction

ABRicate

Mass screening of contigs for antimicrobial resistance or virulence genes. It comes bundled with multiple databases: NCBI, CARD, ARG-ANNOT, Resfinder, MEGARES, EcOH, PlasmidFinder, Ecoli_VF and VFDB.

Is this the right tool for me?

It only supports contigs, not FASTQ reads
It only detects acquired resistance genes, NOT point mutations
It uses a DNA sequence database, not protein
It needs BLAST+ >= 2.7 and any2fasta to be installed
It's written in Perl 🐫

If you are happy with the above, then please continue! Otherwise consider using Ariba, Resfinder, RGI, SRST2, AMRFinderPlus, etc.

Quick Start

% abricate 6159.fna
Using database resfinder:  2130 sequences -  Mar 17, 2017
Processing: 6159.fna
Found 3 genes in 6159.fna
#FILE     SEQUENCE     START   END     STRAND GENE     COVERAGE     COVERAGE_MAP     GAPS  %COVERAGE  %IDENTITY  DATABASE  ACCESSION  PRODUCT        RESISTANCE
6159.fna  NC_017338.1  39177   41186   +      mecA_15  1-2010/2010  ===============  0/0   100.00     100.000    ncbi      AB505628   n/a	     FUSIDIC_ACID
6159.fna  NC_017338.1  727191  728356  -      norA_1   1-1166/1167  ===============  0/0   99.91      92.367     ncbi      M97169     n/a            FOSFOMYCIN
6159.fna  NC_017339.1  10150   10995   +      blaZ_32  1-846/846    ===============  0/0   100.00     100.000    ncbi      AP004832   betalactamase  BETA-LACTAM;PENICILLIN

Installation

Brew

If you are using the macOS Homebrew or LinuxBrew packaging system:

brew install brewsci/bio/abricate
abricate --check
abricate --list

Bioconda

If you use Conda, follow the instructions to add the Bioconda channel:

conda install -c conda-forge -c bioconda -c defaults abricate
abricate --check
abricate --list

Source

If you install from source, Abricate has the following package dependencies:

any2fasta for sequence file format conversion
BLAST+ >= 2.7.0 for blastn, makeblastdb, blastdbcmd
Perl modules: LWP::Simple, Bio::Perl, JSON, Path::Tiny
git, unzip, gzip for updating databases

Most of these are easy to install on an Ubuntu-based system:

sudo apt-get install bioperl ncbi-blast+ gzip unzip git \
  libjson-perl libtext-csv-perl libpath-tiny-perl liblwp-protocol-https-perl libwww-perl
git clone https://github.com/tseemann/abricate.git
./abricate/bin/abricate --check
./abricate/bin/abricate --setupdb
./abricate/bin/abricate ./abricate/test/assembly.fa

Input

Abricate takes any sequence file that any2fasta can convert to FASTA (e.g. GenBank, EMBL), optionally compressed with gzip or bzip2.

abricate assembly.fa 
abricate assembly.fa.gz
abricate assembly.gbk 
abricate assembly.gbk.bz2

It can take multiple files at once too:

abricate assembly.*
abricate /mnt/ncbi/bacteria/*.gbk.gz

Or you can provide it a "file of file names" (a "FOFN"):

% cat test/fofn.txt

assembly.fa
assembly.fa.gz
assembly.gbk
assembly.gbk.bz2

% abricate --fofn test/fofn.txt

It does not accept raw FASTQ reads; please use Ariba or SRST2 for that.

Output

Abricate produces a tab-separated output file with the following columns:

Column	Example	Description
FILE	`Ecoli.fna`	The filename this hit came from
SEQUENCE	`contig000324`	The sequence in the filename
START	`23423`	Start coordinate in the sequence
END	`24117`	End coordinate
STRAND	`+`	Strand + or -
GENE	`tet(M)`	AMR gene name
COVERAGE	`1-1920/1920`	What proportion of the gene is in our sequence
COVERAGE_MAP	`===============`	A visual representation of the hit. `=`=aligned, `.`=unaligned, `/`=has_gaps
GAPS	`1/4`	Openings / gaps in subject and query - possible pseudogene?
%COVERAGE	`100.00%`	Proportion of gene covered
%IDENTITY	`99.95%`	Proportion of exact nucleotide matches
DATABASE	`ncbi`	The database this sequence comes from
ACCESSION	`NC_009632:49744-50476`	The genomic source of the sequence
PRODUCT	`aminoglycoside O-phosphotransferase APH(3')-IIIa`	Gene product (if available)
RESISTANCE	`TETRACYCLINE;FUSIDIC_ACID`	putative antibiotic resistance phenotype, `;`-separated
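
As an illustration of how these columns might be consumed downstream, here is a minimal Python sketch that reads a report with the standard `csv` module. The function name `read_abricate_report` and the trimmed sample report are assumptions made for this example and are not part of ABRicate. It assumes the header line starts with `#FILE`, as in the Quick Start output.

```python
import csv
import io

def read_abricate_report(handle):
    """Yield one dict per hit from a tab-separated ABRicate report."""
    header = None
    # QUOTE_NONE: gene names such as aac(6')-aph(2'') contain quote characters
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:
            continue  # skip blank lines
        if row[0].startswith("#"):
            # Header line: strip the leading '#' so the first key is 'FILE'
            header = [row[0].lstrip("#")] + row[1:]
            continue
        if header is None:
            raise ValueError("no '#FILE' header line before the first hit")
        hit = dict(zip(header, (cell.strip() for cell in row)))
        # Convert coordinate and percentage columns when they are present
        for key in ("START", "END"):
            if key in hit:
                hit[key] = int(hit[key])
        for key in ("%COVERAGE", "%IDENTITY"):
            if key in hit:
                hit[key] = float(hit[key].rstrip("%"))
        yield hit

# Made-up, column-trimmed report reusing values from the Quick Start example
SAMPLE = (
    "#FILE\tSEQUENCE\tSTART\tEND\tSTRAND\tGENE\t%COVERAGE\t%IDENTITY\n"
    "6159.fna\tNC_017338.1\t39177\t41186\t+\tmecA_15\t100.00\t100.000\n"
    "6159.fna\tNC_017338.1\t727191\t728356\t-\tnorA_1\t99.91\t92.367\n"
)
hits = list(read_abricate_report(io.StringIO(SAMPLE)))
for hit in hits:
    print(hit["GENE"], hit["START"], hit["END"], hit["%IDENTITY"])
```

To read a real report, the same function could be given an open file handle instead of the in-memory sample.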

Caveats

Does not find mutational resistance, only acquired genes.
Gap reporting incomplete
Sometimes two heavily overlapping genes will be reported for the same locus
Possible coverage calculation issues

Databases

ABRicate comes with some pre-downloaded databases. You can check what you have installed with the --list command, which prints the available databases as TSV (or CSV with --csv) in four columns:

% abricate --list

DATABASE       SEQUENCES  DBTYPE  DATE
argannot       1749       nucl    2019-Jul-28
card           2241       nucl    2019-Jul-28
ecoh           597        nucl    2019-Jul-28
ecoli_vf       2701       nucl    2019-Jul-28
megares        6635       nucl    2020-Feb-20
ncbi           4324       nucl    2019-Jul-28
plasmidfinder  263        nucl    2019-Jul-28
resfinder      2434       nucl    2019-Jul-28
vfdb           2597       nucl    2019-Jul-28

The default database is ncbi. You can choose a different database using the --db option:

% abricate --db vfdb --quiet 6159.fna

6159.fna  NC_017338.1  2724620  2726149  aur      1-1530/1530     ===============  0/0    100.00     99.346     vfdb      NP_647375	zinc metalloproteinase aureolysin
6159.fna  NC_017338.1  2766595  2767155  icaR     1-561/561       ===============  0/0    100.00     98.930     vfdb      NP_647402	N-acetylglucosaminyltransferase
6159.fna  NC_017338.1  2767319  2768557  icaA     1-1239/1239     ===============  0/0    100.00     99.677     vfdb      NP_647403	n/a
6159.fna  NC_017338.1  2768521  2768826  icaD     1-306/306       ===============  0/0    100.00     99.020     vfdb      NP_647404	n/a
6159.fna  NC_017338.1  2768823  2769695  icaB     1-873/873       ===============  0/0    100.00     99.542     vfdb      NP_647405	n/a
6159.fna  NC_017338.1  2769682  2770734  icaC     1-1053/1053     ===============  0/0    100.00     98.955     vfdb      NP_647406	n/a
6159.fna  NC_017338.1  2771040  2773085  lip      1-2046/2046     ===============  0/0    100.00     98.778     vfdb      NP_647407	triacylglycerol lipase precursor

Combining reports across samples

ABRicate can combine results into a simple matrix of gene presence/absence. An absent gene is denoted `.` and a present gene is represented by its `%COVERAGE`. The input can be individual abricate reports, or a combined one.

# Run abricate on each .fna file
% abricate 1.fna > 1.tab
% abricate 2.fna > 2.tab

# Combine
% abricate --summary 1.tab 2.tab

#FILE     NUM_FOUND  aac(6')-aph(2'')_1  aadD_1  blaZ_32  blaZ_36  erm(A)_1  mecA_15  norA_1  spc_1  tet(M)_7
1.tab     8          100.00              100.00  .        100.00   100.00    100.00   99.91   100.00  100.00
2.tab     3          .                   .       100.00   .        .         100.00   99.91   .       .

Or if you ran everything in a single report, it will work too.

% abricate *.fna > results.tab
% abricate --summary results.tab > summary.tab
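
For readers who want to post-process hits themselves, the following Python sketch shows one way a presence/absence matrix of this shape could be assembled. It is not ABRicate's own implementation. The function names `summarise` and `format_matrix`, the tuple layout, and the `min_cov`/`min_id` parameters are assumptions for this example. As a simplification, when a gene is hit more than once in a file only the best coverage is kept.

```python
from collections import defaultdict

def summarise(hits, min_cov=0.0, min_id=0.0):
    """hits: iterable of (file, gene, coverage, identity) tuples.

    Returns (sorted gene names, {file: {gene: best coverage}}).
    """
    per_file = defaultdict(dict)
    genes = set()
    for fname, gene, cov, ident in hits:
        per_file[fname]  # make sure the file gets a row even with no passing hits
        if cov < min_cov or ident < min_id:
            continue  # hit fails the (hypothetical) thresholds
        genes.add(gene)
        best = per_file[fname].get(gene, 0.0)
        per_file[fname][gene] = max(best, cov)
    return sorted(genes), dict(per_file)

def format_matrix(genes, per_file):
    """Render the matrix as tab-separated text, '.' meaning absent."""
    lines = ["\t".join(["#FILE", "NUM_FOUND"] + genes)]
    for fname in sorted(per_file):
        found = per_file[fname]
        cells = [f"{found[g]:.2f}" if g in found else "." for g in genes]
        lines.append("\t".join([fname, str(len(found))] + cells))
    return "\n".join(lines)

# Made-up hits, loosely based on the genes in the example above
example_hits = [
    ("1.tab", "mecA_15", 100.0, 100.0),
    ("1.tab", "norA_1", 99.91, 92.367),
    ("2.tab", "blaZ_32", 100.0, 100.0),
    ("2.tab", "norA_1", 99.91, 92.367),
]
genes, per_file = summarise(example_hits)
matrix = format_matrix(genes, per_file)
print(matrix)

# Same hits, but requiring at least 95% coverage and 95% identity
strict_genes, strict_per_file = summarise(example_hits, min_cov=95, min_id=95)
```

Converting the result to a strict 0/1 matrix would only need the coverage values replaced by `1` and `.` by `0`.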

Updating the databases

# force download of latest version
% abricate-get_db --db ncbi --force

# re-use existing download and just regenerate the database
% abricate-get_db --db ncbi

Making your own database

Let's say you want to make your own database called tinyamr. All you need is a FASTA file of nucleotide sequences, say tinyamr.fa. Ideally the sequence IDs would have the format >DB~~~ID~~~ACC~~~RESISTANCES DESC where DB is tinyamr, ID is the gene name, ACC is an accession number of the sequence source, RESISTANCES is the phenotype(s) to report, and DESC can be any textual description.

% cd /path/to/abricate/db     # this is the --datadir default option
% mkdir tinyamr
% cp /path/to/tinyamr.fa tinyamr/sequences
% head -n 1 tinyamr/sequences
>tinyamr~~~GENE_ID~~~GENE_ACC~~~RESISTANCES some description here
% abricate --setupdb
% # or just do this: cd tinyamr && makeblastdb -in sequences -title tinyamr -dbtype nucl -hash_index

% abricate --list
DATABASE  SEQUENCES  DBTYPE  DATE
tinyamr   173        nucl    2019-Aug-28

% abricate --db tinyamr screen_this.fasta
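
The header convention described above can be sketched in Python as follows. The helper names `make_header` and `parse_header` are hypothetical, and joining multiple resistances with `;` is an assumption based on the `;`-separated RESISTANCE column in the output. The parser also accepts three-field IDs without a RESISTANCES part, since such headers appear in older database files.

```python
SEP = "~~~"

def make_header(db, gene_id, accession, resistances=(), description=""):
    """Build a FASTA header of the form >DB~~~ID~~~ACC~~~RESISTANCES DESC."""
    fields = [db, gene_id, accession, ";".join(resistances)]
    for field in fields:
        if SEP in field or any(ch.isspace() for ch in field):
            raise ValueError(f"field {field!r} must not contain '~~~' or whitespace")
    header = ">" + SEP.join(fields)
    return f"{header} {description}" if description else header

def parse_header(line):
    """Split a header back into its parts (3 or 4 '~~~'-separated fields)."""
    seq_id, _, description = line.strip().lstrip(">").partition(" ")
    parts = seq_id.split(SEP)
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected sequence ID format: {seq_id!r}")
    db, gene_id, accession = parts[:3]
    resistances = parts[3].split(";") if len(parts) == 4 and parts[3] else []
    return {
        "db": db,
        "gene": gene_id,
        "accession": accession,
        "resistances": resistances,
        "description": description,
    }

# Illustrative values only
header = make_header("tinyamr", "blaZ_32", "AP004832",
                     ["BETA-LACTAM", "PENICILLIN"], "betalactamase")
print(header)
print(parse_header(header))
```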

Etymology

The name "ABRicate" was chosen as the first 3 letters are a common acronym for "Anti-Biotic Resistance". It also has the form of an English verb, which suggests the tool is actually taking "action" against the problem of antibiotic resistance. It is also relatively unique in Google, and is unlikely to receive an infamous JABBA Award.

Citation

If you publish the results of Abricate, please cite both the software and the appropriate database you used with --db:

Seemann T, Abricate, Github https://github.com/tseemann/abricate
NCBI AMRFinderPlus - doi:10.1128/AAC.00483-19
CARD - doi:10.1093/nar/gkw1004
Resfinder - doi:10.1093/jac/dks261
ARG-ANNOT - doi:10.1128/AAC.01310-13
VFDB - doi:10.1093/nar/gkv1239
PlasmidFinder - doi:10.1128/AAC.02412-14
EcOH - doi:10.1099/mgen.0.000064
MEGARES 2.00 - doi:10.1093/nar/gkz1010

Issues

Please report problems to the Issues Page.

License

GPLv2

Author

Torsten Seemann | @torstenseemann | blog

abricate's Issues

Enable protein query with tblastn

Hi,
is it feasible to use abricate for mass screening of protein sequences instead of genes (i.e. use tblastn instead of blastn)?
Thanks,
Carlus

Don't distribute BLAST indices

Just send sequences and add --setupdb option?

downloading CARD using abricate-get_db - card format changed

Hi Torsten,

I was trying to update the CARD database to the most updated version using the abricate-get_db command but the command failed with the following error:

Downloading: https://card.mcmaster.ca/download/0/broadstreet-v1.1.9.tar.gz
Result: 501
Destination: card.tar.bz2
Filesize:  bytes
tar: card.tar.bz2: Cannot open: No such file or directory

Looks like the CARD database finally made the switch to tar.gz. Can you update the get_db script to change the .bz2 to .gz?

Thanks!

--list should print to STDOUT not STDERR

abricate --list
Database abricate has no sequences, please try: abricate-get_db --db abricate
argannot:  1749 sequences -  Aug 28, 2017
card:  2153 sequences -  Aug 28, 2017
ncbi:  4124 sequences -  Aug 28, 2017
ncbibetalactamase:  1557 sequences -  Aug 28, 2017
plasmidfinder:  263 sequences -  Aug 28, 2017
resfinder:  2228 sequences -  Aug 28, 2017
vfdb:  2597 sequences -  Aug 28, 2017

Thanks @schultzm

The first error line will still be STDOUT

Add licenses for the databases I include

Need to check I ain't breaking any rules :-/

CARD database is outdated

I am trying to use the latest version of CARD, which is 1.2, via

abricate-get_db --db card --force

Setting up 'card' in '/Users/pi/miniconda3/db/card'
Downloading: https://card.mcmaster.ca/download/0/broadstreet-v1.1.9.tar.gz
Result: 200
Destination: card.tar.bz2

However, the most recent version is 1.2.0.

I tried changing the card.tar.bz2 in the .../miniconda3/db/card/src/ and then reloaded the db with

abricate-get_db --db card

no effect, still uses 1.1.9

What am I doing wrong?

Add database EcOH

Issue #38 by @dutchscientist

https://github.com/katholt/srst2/tree/master/data

Remove ! from option in --help output

If you run abricate --help, the output contains exclamation marks after debug, quiet, version, etc. These should not be included.

When I was explaining this to students, they thought it was confusing. Why add the exclamation mark when it should not be used?

Coverage and Identity!

I am constructing a binary matrix from abricate output and I can't decide what percentages to go with. The idea was to report only genes that have >95% coverage and, of those, >95% identity. But I am concerned that it could be too strict - what's your opinion on the subject?

@tseemann please label this "Help wanted"!

Thanks in advance

abricate-get_db feature requests

Hi Torsten,
Feature requests:

add --help to abricate-get_db
allow user to specify db url or path to db if already downloaded (e.g., http://www.mgc.ac.cn/VFs/Down/VFDB_setB_nt.fas.gz instead of http://www.mgc.ac.cn/VFs/Down/VFDB_setA_nt.fas.gz )
during the running of abricate-get_db, append unique identifier to locus tag so duplicate error does not occur (BLAST Database creation error: Error: Duplicate seq_ids are found:)

Cheers,

Mark
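
One possible approach to the third request, sketched in Python: rename repeated sequence IDs before the FASTA file is handed to makeblastdb. The function name `uniquify_ids` and the `_dupN` suffix are assumptions for illustration and are not what abricate-get_db does. The sketch does not guard against a renamed ID colliding with an ID that already ends in `_dupN`.

```python
from collections import Counter

def uniquify_ids(lines):
    """Yield FASTA lines, appending _dupN to the 2nd, 3rd, ... use of an ID."""
    seen = Counter()
    for line in lines:
        if line.startswith(">"):
            # The ID is everything up to the first space; the rest is description
            seq_id, sep, desc = line[1:].rstrip("\n").partition(" ")
            seen[seq_id] += 1
            if seen[seq_id] > 1:
                seq_id = f"{seq_id}_dup{seen[seq_id]}"
            yield ">" + seq_id + sep + desc + "\n"
        else:
            yield line  # sequence lines pass through unchanged

# Made-up FASTA content with one repeated ID
fasta = [">geneA first copy\n", "ACGT\n",
         ">geneA second copy\n", "GGCC\n",
         ">geneB\n", "TTAA\n"]
fixed = list(uniquify_ids(fasta))
print("".join(fixed), end="")
```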

Fails on Biolinux8 due to BLAST+ version?

Biolinux8 is Ubuntu 14.04 LTS, and this one has BLAST+ v 2.2.28 as final one in its repositories. Hence Abricate won't run (it expects >= 2.2.30), and the check does not pick this up.

How easiest to correct?

no ncbi database?

Hi Torsten, Jason, just checking.. is the ncbi database no longer used in abricate 0.5? Thanks, Rosie

abricate only gives top hit for gene

In the output, only the top blast hit is shown.
For some genes eg. blaTEM, if both TEM-1 and TEM-45 were present in a genome, only TEM-1 would be reported. However, the activity and clinical significance of the two beta-lactamases is different.

Database abricate has no sequences, please try: abricate-get_db --db abricate

This is because I have a folder and a README

Make --list more machine readable

I propose TSV format for programs that need to parse for available DBs.

something wrong with mcr-1.6_1

123.fasta	NODE_126_length_2763_cov_117.927	775	2600	mcr-1.6_1	1-1826/1826	Err:510	0/0	100	99.89

according to KY352406, the length of mcr-1.6_1 should be 1626 bps, the same as the other mcr-1 variants. However, as you can see at the top, it shows 1826 bps.

I also double-checked it with the online version of resfinder, it gave a correct match with mcr-1 (1626), so I'm wondering where it went wrong.

Thanks so much

Add simple test suite

Formats

asm.{gbk,fna}[.gz]
ensure same result

Cases

missing file
not valid format

Add database: SerotypeFinder

Issue #38 by @dutchscientist
https://cge.cbs.dtu.dk/services/SerotypeFinder/

@dutchscientist do you have the URL for the amplicon sequences?

can't update database

Hi there,

I tried to update the database with command "abricate-get_db --force --db ncbi", however it failed throwing out 501 https error.
I then installed LWP::Protocol::https via "cpanm LWP::Protocol::https" and turned out the same error message persisted. After that I also tried "sudo cpanm LWP::Protocol::https" which installed it with the other perl location I suppose, but still not working.
Both ways successfully installed LWP::Protocol::https, but still did not resolve the problem.
I'm a bit new to all of these, can someone show me how to resolve it, thanks much!

Check FASTA file exists before running BLAST

% abricate olap.rm.fa
#FILE   SEQUENCE        START   END     GENE    COVERAGE        COVERAGE_MAP    GAPS    %COVERAGE       %IDENTITY
Processing: olap.rm.fa
Command line argument error: Argument "query". File is not accessible:  `olap.rm.fa'

Need BLAST > 2.2.27 (?) for "6 gaps" output attribute?

Output 'antibiotic class' in the TSV

feature request: send description from database headers to output table

Hi @tseemann ,

Feature request: add a new column to the abricate output that contains 'DESCRIPTION' information. For example, the ncbi database has

>ncbi~~~mcr-1~~~A7J11_03461 phosphoethanolamine--lipid A transferase MCR-1
>ncbi~~~mcr-1.2~~~A7J11_03754 phosphoethanolamine--lipid A transferase MCR-1.2
>ncbi~~~mcr-1.3~~~A7J11_04944 phosphoethanolamine--lipid A transferase MCR-1.3

in the headers. I would like to be able to extract from the abricate stdout that mcr-1.3 is a phosphoethanolamine--lipid A transferase

Thanks

Support for GENBANK, and GZ files?

It would make it easier to screen Genbank Assemblies.

--setupdb doesn't check for makeblastdb first

0.15s$ bin/abricate --setupdb
Database argannot has not been indexed; formatting now.
sh: 1: makeblastdb: not found
Database card has not been indexed; formatting now.
sh: 1: makeblastdb: not found
Database ncbi has not been indexed; formatting now.
sh: 1: makeblastdb: not found
Database ncbibetalactamase has not been indexed; formatting now.
sh: 1: makeblastdb: not found
Database plasmidfinder has not been indexed; formatting now.
sh: 1: makeblastdb: not found
Database resfinder has not been indexed; formatting now.
sh: 1: makeblastdb: not found
Database vfdb has not been indexed; formatting now.
sh: 1: makeblastdb: not found

dependency hdf5 is not available

Hello Torsten,
brew seems not to have hdf5 available. I installed it via Ubuntu, but it is still not recognized by abricate. Perhaps the dependencies need to be changed, along with the place where hdf5 is found?

brew install abricate
Updating Homebrew...
==> Installing abricate from tseemann/bioinformatics-linux
Error: No available formula with the name "hdf5" (dependency of tseemann/bioinformatics-linux/abricate)
It was migrated from homebrew/science to homebrew/core.

Install issue - newbie

Having trouble with the installation?
Used brew initially, had to do cpan -i Slurp/CSV/JSON as well. Then:

brew install abricate --HEAD
==> Installing abricate from tseemann/bioinformatics-linux
==> Installing dependencies for tseemann/bioinformatics-linux/abricate: blast
==> Installing tseemann/bioinformatics-linux/abricate dependency: blast
Error: homebrew/science/blast cannot be built with any available compilers.
Install Clang or brew install gcc

Tried brew install gcc, no joy.

Also tried installing BLAST into home directory.

No luck with git clone:
git clone https://github.com/tseemann/abricate.git./abricate/bin/abricate
Cloning into 'abricate'...
remote: Not Found
fatal: repository 'https://github.com/tseemann/abricate.git./abricate/bin/abricate/' not found

Or with conda:
~/miniconda2/bin$ ls
2to3 cmfetch c_rehash openssl smtpd.py
activate cmpress deactivate pip sqlite3
cmalign cmscan easy_install pydoc tclsh8.5
cmbuild cmsearch easy_install-2.7 python wheel
cmcalibrate cmstat f2py python2 wish8.5
cmconvert conda idle python2.7
cmemit conda-env integron_finder python-config

~/miniconda2/bin$ conda install abricate
Fetching package metadata .............

PackageNotFoundError: Package missing in current linux-64 channels:

abricate

Close matches found; did you mean one of these?

abricate: fabric

Can you please help? I'm a newbie, working on Ubuntu

Thanks

Automate finding latest CARD version

https://card.mcmaster.ca/download/0/broadstreet-v1.1.9.tar.gz

abricate --list says "abricate: corrupt database?"

@Slugger70 reports:

abricate --list
abricate: corrupt database? - /Users/Simon/miniconda3/bin/../db/abricate
argannot:  1749 sequences -  Jul 8, 2017
card:  2124 sequences -  Jul 8, 2017
ncbibetalactamase:  1557 sequences -  Mar 17, 2017
plasmidfinder:  263 sequences -  Mar 19, 2017
resfinder:  2228 sequences -  Jul 8, 2017
vfdb:  2597 sequences -  Mar 17, 2017

Returns 0 not 1 on bad command line arg

needs || usage(1)

Can not find sequence data in 'file'

Forgive me if this is a known or simple issue. I have all of the necessary Perl modules and Abricate itself installed. I am trying to feed it a FASTA file with ~6000 sequences in it. This FASTA was converted from FASTQ format. I am receiving an error message saying "WARNING: can not find sequence data in 'file_name'".

I have tried using EMBOSS' seqret program to reformat the file again. It's definitely a slightly different format, but the error persists.

It clearly has sequence data at roughly 150 bp per read and the standard Identifier line beginning with ">"

Any thoughts on this?

Thanks

Can the database be changed to ARGannot and/or CARD?

Is there any way to do this?
The script seems to allow for a database update, but it seems to only be for Res-Finder.
Thank you.

ARO accession numbers missing from CARD database

While using abricate to get resistance genes out of a set of sequences, I have noticed that card entries do not have aro accessions in the report file, instead have ncbi accessions (which is fine of course). However, aro accessions are also very useful when using card because they allow you to make direct links to CARD website. I have started digging your fasta files and found the one for card. However, each entry is like this one:

>card~~~AAC(1)~~~HM036080:132-597 Acetylation of paromomycin, and apramycin, on the amino group at position 1 in E. coli, Actinomycete, Campylobacter spp.

So, there is no aro accession at all. Therefore for my specific problem I just made a dictionary that matched ncbi accession numbers with the respective aro accession using cards file (available here). I used the aro_index.csv, but original card fastas already have a header like this:

>gb|GQ343019|+|132-1023|ARO:3002999|CblA-1 [mixed culture bacterium AX_gF3SD01_15]

Perhaps you can find a way to maintain these "links" on your card fasta database.
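
The lookup described in this post could be built directly from the original CARD FASTA headers, roughly as in the Python sketch below. The header layout is taken from the single example quoted above and is assumed to hold for every record. The function names `parse_card_header` and `aro_lookup` are hypothetical.

```python
def parse_card_header(line):
    """Parse a header like >gb|ACC|STRAND|START-END|ARO:NNNNNNN|NAME [ORGANISM]."""
    fields = line.strip().lstrip(">").split("|")
    if len(fields) < 6 or not fields[4].startswith("ARO:"):
        raise ValueError(f"unexpected CARD header: {line!r}")
    # Re-join in case the gene name itself contains a '|'
    rest = "|".join(fields[5:])
    name, _, organism = rest.partition(" [")
    return {
        "accession": fields[1],
        "strand": fields[2],
        "coords": fields[3],
        "aro": fields[4],
        "name": name,
        "organism": organism.rstrip("]"),
    }

def aro_lookup(lines):
    """Map 'ACCESSION:START-END' (as in the abricate CARD headers) to ARO IDs."""
    table = {}
    for line in lines:
        if line.startswith(">"):
            rec = parse_card_header(line)
            table[f"{rec['accession']}:{rec['coords']}"] = rec["aro"]
    return table

# The header quoted in the post above
card_header = ">gb|GQ343019|+|132-1023|ARO:3002999|CblA-1 [mixed culture bacterium AX_gF3SD01_15]"
record = parse_card_header(card_header)
lookup = aro_lookup([card_header, "ATGAAAGCATAT"])
print(lookup)
```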

get_db needs to have dir already created with --dbdir

abricate-get_db --db argannot --dbdir card
ERROR: --outdir 'card' does not exist

feature request: allow --summary to join csv tables

Hi @tseemann ,

I would like to place a feature request for abricate --summary *.csv.

Thanks,

Mark

unsatisfied perl dependencies

Hi Torsten

I couldn't install abricate via linuxbrew, any advice gratefully received, thanks

brew install abricate --HEAD
==> Installing abricate from tseemann/bioinformatics-linux
abricate: Unsatisfied dependency: File::Slurp
Homebrew does not provide special Perl dependencies; install with:
cpan -i File::Slurp
abricate: Unsatisfied dependency: Text::CSV
Homebrew does not provide special Perl dependencies; install with:
cpan -i Text::CSV
abricate: Unsatisfied dependency: JSON
Homebrew does not provide special Perl dependencies; install with:
cpan -i JSON
Error: Unsatisfied requirements failed this build.

I tried cpan -i File::Slurp with no effect on abricate installation

Loading internal null logger. Install Log::Log4perl for logging messages

CPAN.pm requires configuration, but most of it can be done automatically.
If you answer 'no' below, you will enter an interactive dialog for each
configuration option instead.

Would you like to configure as much as possible automatically? [yes] yes
Use of uninitialized value $what in concatenation (.) or string at /usr/share/perl/5.22/App/Cpan.pm line 633, line 1.

Warning: You do not have write permission for Perl library directories.

To install modules, you need to configure a local Perl library directory or
escalate your privileges. CPAN can help you by bootstrapping the local::lib
module or by configuring itself to use 'sudo' (if available). You may also
resolve this problem manually if you need to customize your setup.

What approach do you want? (Choose 'local::lib', 'sudo' or 'manual')
[local::lib] local::lib
Attempting to create directory /home/mike/perl5

Checking if your kit is complete...
Looks good
Generating a Unix-style Makefile
Writing Makefile for local::lib
Writing MYMETA.yml and MYMETA.json
cp lib/POD2/DE/local/lib.pod blib/lib/POD2/DE/local/lib.pod
cp lib/POD2/PT_BR/local/lib.pod blib/lib/POD2/PT_BR/local/lib.pod
cp lib/local/lib.pm blib/lib/local/lib.pm
cp lib/lib/core/only.pm blib/lib/lib/core/only.pm
Manifying 4 pod documents
PERL_DL_NONLAZY=1 "/usr/bin/perl" "-I/home/mike/perl5/lib/perl5" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/bad_variables.t ...... ok
t/carp-mismatch.t ...... ok
t/classmethod.t ........ ok
t/coderefs_in_inc.t .... ok
t/de-dup.t ............. ok
t/lib-core-only.t ...... ok
t/pipeline.t ........... ok
t/shell.t .............. ok
t/stackable.t .......... ok
t/subroutine-in-inc.t .. ok
t/taint-mode.t ......... ok
All tests successful.
Files=11, Tests=149, 2 wallclock secs ( 0.05 usr 0.01 sys + 0.94 cusr 0.06 csys = 1.06 CPU)
Result: PASS
Manifying 4 pod documents
Installing /home/mike/perl5/lib/perl5/POD2/PT_BR/local/lib.pod
Installing /home/mike/perl5/lib/perl5/POD2/DE/local/lib.pod
Installing /home/mike/perl5/lib/perl5/lib/core/only.pm
Installing /home/mike/perl5/lib/perl5/local/lib.pm
Installing /home/mike/perl5/man/man3/POD2::DE::local::lib.3pm
Installing /home/mike/perl5/man/man3/POD2::PT_BR::local::lib.3pm
Installing /home/mike/perl5/man/man3/lib::core::only.3pm
Installing /home/mike/perl5/man/man3/local::lib.3pm
Appending installation info to /home/mike/perl5/lib/perl5/x86_64-linux-gnu-thread-multi/perllocal.pod

Would you like me to append that to /home/mike/.bashrc now? [yes] yes
Running install for module 'File::Slurp'
Fetching with LWP:
http://www.cpan.org/authors/id/U/UR/URI/File-Slurp-9999.19.tar.gz
Fetching with LWP:
http://www.cpan.org/authors/id/U/UR/URI/CHECKSUMS
Checksum for /home/mike/.cpan/sources/authors/id/U/UR/URI/File-Slurp-9999.19.tar.gz ok
Configuring U/UR/URI/File-Slurp-9999.19.tar.gz with Makefile.PL
Checking if your kit is complete...
Looks good
Generating a Unix-style Makefile
Writing Makefile for File::Slurp
Writing MYMETA.yml and MYMETA.json
URI/File-Slurp-9999.19.tar.gz
/usr/bin/perl Makefile.PL INSTALLDIRS=site -- OK
Running make for U/UR/URI/File-Slurp-9999.19.tar.gz
cp lib/File/Slurp.pm blib/lib/File/Slurp.pm
Manifying 1 pod document
URI/File-Slurp-9999.19.tar.gz
/usr/bin/make -- OK
Running make test
PERL_DL_NONLAZY=1 "/usr/bin/perl" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/append_null.t ....... ok
t/binmode.t ........... ok
t/chomp.t ............. ok
t/data_list.t ......... ok
t/data_scalar.t ....... ok
t/edit_file.t ......... ok
t/error.t ............. ok
t/error_mode.t ........ ok
t/file_object.t ....... ok
t/handle.t ............ ok
t/inode.t ............. ok
t/large.t ............. ok
t/newline.t ........... ok
t/no_clobber.t ........ ok
t/original.t .......... ok
t/paragraph.t ......... ok
t/perms.t ............. ok
t/pod.t ............... skipped: Test::Pod 1.14 required for testing POD
t/pod_coverage.t ...... skipped: Test::Pod::Coverage 1.04 required for testing POD coverage
t/prepend_file.t ...... ok
t/pseudo.t ............ ok
t/read_dir.t .......... ok
t/signal.t ............ ok
t/slurp.t ............. ok
t/stdin.t ............. ok
t/stringify.t ......... ok
t/tainted.t ........... ok
t/write_file_win32.t .. ok
All tests successful.
Files=28, Tests=296, 3 wallclock secs ( 0.07 usr 0.02 sys + 0.72 cusr 0.05 csys = 0.86 CPU)
Result: PASS
URI/File-Slurp-9999.19.tar.gz
/usr/bin/make test -- OK
Running make install
Manifying 1 pod document
Installing /home/mike/perl5/lib/perl5/File/Slurp.pm
Installing /home/mike/perl5/man/man3/File::Slurp.3pm
Appending installation info to /home/mike/perl5/lib/perl5/x86_64-linux-gnu-thread-multi/perllocal.pod
URI/File-Slurp-9999.19.tar.gz
/usr/bin/make install -- OK

brew install abricate --HEAD

==> Installing abricate from tseemann/bioinformatics-linux
abricate: Unsatisfied dependency: File::Slurp etc etc

brew doctor
Your system is ready to brew
brew update
Already up-to-date

brew config
HOMEBREW_VERSION: 1.1.10
ORIGIN: https://github.com/Linuxbrew/brew.git
HEAD: d8c8b867bf99c604c89bc848f58d1fae8afd599c
Last commit: 5 weeks ago
Core tap ORIGIN: https://github.com/Linuxbrew/homebrew-core
Core tap HEAD: 9f0d64e6326d616c0c2ceb996d1903b6ae456aec
Core tap last commit: 4 days ago
HOMEBREW_PREFIX: /home/mike/.linuxbrew
HOMEBREW_REPOSITORY: /home/mike/.linuxbrew
HOMEBREW_CELLAR: /home/mike/.linuxbrew/Cellar
HOMEBREW_BOTTLE_DOMAIN: https://linuxbrew.bintray.com
CPU: quad-core 64-bit 0x65e
Homebrew Ruby: 2.3.1 => /usr/bin/ruby2.3
Clang: N/A
Git: 2.7.4 => /usr/bin/git
Perl: /usr/bin/perl
Python: /home/mike/.linuxbrew/bin/python => /home/mike/.linuxbrew/Cellar/python/2.7.13/bin/python2.7
Ruby: /usr/bin/ruby => /usr/bin/ruby2.3
Java: 1.8.0_112
Kernel: Linux 4.4.0-53-generic x86_64 GNU/Linux
OS: Ubuntu 16.04.1 LTS
Codename: xenial
OS glibc: 2.23
OS gcc: 5.4.0
Linuxbrew glibc: N/A
Linuxbrew gcc: 5.3.0
Linuxbrew xorg: N/A

Format for database

Hi!

I'd love to use this for other genes. Could you specify what the fasta description line should look like, for instance, which part(s) of it do you use where?

ERROR: abricate --list

abricate --list gives

abricate: corrupt database? - /usr/local/Cellar/abricate/0.5/bin/../db/abricate

how to resolve it?

Thanks

Fix IMP-4 until we wait for CGE to update the DB

Updated their IMP-4 to use the AF244145.1 sequence rather than the previous DQ307573 sequence (with 2 SNPs in it).

Add option to define minimum coverage?

Currently Abricate will also report partial hits. An option to define the minimum coverage required (similar to --minid=95 or so) would be a good option. So say --mincov=90 or so?

use multi-threaded blast

Currently, the blast search against the databases/query is only single-threaded. Blastn supports multi-threaded search natively. It would be great to expose -num_threads as an argument to abricate. Otherwise, search in metagenomic assemblies would take ages.

IDENTITY column has 3 decimal places

But COVERAGE only has 2 ?

run with multiple database

Is it possible to run with multiple databases? I tried to run with all the databases, using commas to separate them, but it did not work.

Card Summary Header

There seems to be a small issue with the formatting of the last columns in the card_summary output file header, I believe they are supposed to read smeR, and then sul1, but for me, they appear concatenated, with no tab in between. Extremely minor issue, but I thought I'd let you know!

Thank you, I really appreciate the software!

Interested in additional databases?

I have converted the SerotypeFinder and EcOH (SRST2) databases for E. coli serotyping to Abricate format. Are you interested in such databases to be made generally available? I am happy to share.

add indicator for stop codon in mini-map

Example abricate output:

2012-17035.fna	gnl|Prokka|2012-17035_4	51134	51309	aadA1	1-184/972	===...../......	8	18.11	94.565
2012-17035.fna	gnl|Prokka|2012-17035_2	27659	27769	aadA1	679-789/789	............===	0	14.07	99.099
2012-17035.fna	gnl|Prokka|2012-17035_4	52211	53002	aadA2	1-792/792	===============	0	100.00	99.874
2012-17035.fna	gnl|Prokka|2012-17035_4	61539	62354	aph(3')-Ia	1-816/816	===============	0	100.00	100.000
2012-17035.fna	gnl|Prokka|2012-17035_2	37261	38142	blaKPC-2	1-882/882	===============	0	100.00	100.000
2012-17035.fna	gnl|Prokka|2012-17035_2	26775	27614	blaOXA-9	1-840/840	===============	0	100.00	99.881
2012-17035.fna	gnl|Prokka|2012-17035_1	2712753	2713613	blaSHV-11	1-861/861	===============	0	100.00	100.000
2012-17035.fna	gnl|Prokka|2012-17035_3	6395	7237	blaSHV-12	1-861/861	========/======	18	97.91	97.909
2012-17035.fna	gnl|Prokka|2012-17035_2	25215	26075	blaTEM-1A	1-861/861	===============	0	100.00	99.884
2012-17035.fna	gnl|Prokka|2012-17035_4	44636	45295	catA1	1-660/660	===============	0	100.00	99.848
2012-17035.fna	gnl|Prokka|2012-17035_4	51306	51803	dfrA12	1-498/498	===============	0	100.00	100.000
2012-17035.fna	gnl|Prokka|2012-17035_1	4674590	4675009	fosA	1-420/420	===============	0	100.00	98.571
2012-17035.fna	gnl|Prokka|2012-17035_4	59641	60562	mph(A)	1-921/921	========/======	1	100.00	99.675
2012-17035.fna	gnl|Prokka|2012-17035_4	59657	60562	mph(A)	1-906/906	===============	0	100.00	100.000
2012-17035.fna	gnl|Prokka|2012-17035_1	1200950	1202125	oqxA	1-1176/1176	===============	0	100.00	100.000
2012-17035.fna	gnl|Prokka|2012-17035_1	1197774	1200926	oqxB	1-3153/3153	===============	0	100.00	100.000
2012-17035.fna	gnl|Prokka|2012-17035_4	53420	54346	sul1	1-927/927	===============	0	100.00	100.000

However, blaOXA-9 has a TAG stop codon in the middle, which probably renders the gene non-functional. It would be a useful feature to see this in the mini-map.

(Ariba currently detects this and outputs it in the table.)
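As a rough illustration of the check being requested, the following Python sketch scans a coding sequence for an in-frame stop codon that occurs before the final codon. It is not how abricate or Ariba work internally. It assumes the matched sequence has already been extracted, starts at the first base of the start codon, and uses the standard stop codons. The function name `first_premature_stop` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds):
    """Return the 1-based position of the first in-frame stop codon that
    occurs before the final codon, or None if there is none.

    Assumption: `cds` begins at the first base of the start codon, so
    frame 0 is the coding frame.
    """
    seq = cds.upper()
    # Start index of the last complete codon; the terminal stop is expected there.
    last_codon_start = len(seq) - len(seq) % 3 - 3
    for i in range(0, last_codon_start, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 1
    return None


# Toy sequences, not real gene data.
print(first_premature_stop("ATGAAATAGCCCTAA"))  # TAG in the third codon
print(first_premature_stop("ATGAAACCCTAA"))     # only the terminal stop
```

A hit could then be flagged in the report, for example with a dedicated character in the coverage map at the corresponding position.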

[bug] Summary option broken in version 0.7

Just upgraded to version 0.7, and found the summary option is broken.

Instead of one row per genome assembly tested, the summary lists only the report filename, followed by all of the percentages.

Restoring version 0.6 gives correct summary files, so it's not a problem with the source files.

Example:

#FILE	NUM_FOUND	Col(BS512)_1	Col(BS512)_1__NC_010656_dupe	Col(Ye4449)_1	Col156_1	Col8282_1	ColE10_1	ColRNAI_1	ColpVC_1	IncA/C2_1	IncFIB(AP001918)_1	IncFIB(K)_1_Kpn3	IncFIB(S)_1	IncFIB(pB171)_1_pB171	IncFIB(pHCM2)_1_pHCM2	IncFIC(FII)_1	IncFII(S)_1	IncFII(pCTU2)_1_pCTU2	IncFII(pKPX1)	IncFII_1	IncFII_1_pSFO	IncHI2A_1	IncHI2_1	IncI1_1_Alpha	IncI2_1_Delta	IncN2_1	IncN_1	IncP(6)_1	IncQ1_1	IncQ2_1	IncX1_1	IncX1_3	IncX3_1	IncY_1	TrfA_1	pENTAS02_1
filename_plasmidfinder.tsv	35	100.00;100.00	100.00;100.00	100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00	100.00;100.00;98.70;98.70;98.70;98.70;98.70;98.70;98.70;100.00;100.00	98.55	100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00	98.46;100.00;100.00;100.00;100.00;100.00;83.85;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;98.46;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;90.77;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;98.46;100.00;98.46;100.00;100.00;100.00;100.00;100.00;100.00;100.00;83.08;100.00;100.00;83.08;100.00;83.08;100.00;83.85;83.85;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;98.46;100.00;90.77;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;98.46;100.00;98.46;100.00;98.46;100.00;98.46;100.00;98.46;100.00;98.46;100.00;98.46;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;83.85;100.00;100.00;98.46;83.85;100.00;98.46;100.00;100.00;100.00;98.46;100.00;98.46;100.00;100.00;98.46;100.00;100.00;98.46;100.00;98.46;100.00;100.00	100.00;100.00	100.00;100.00;100.00;100.00;100.00	100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00	100.00;100.00;100.00;100.00;100.00;100.00;100.00	100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00	100.00;99.84;99.84;100.00	100.00	99.60;99.60;99.60;99.60;99.60;99.60;99.60;99.60;99.60;99.60;99.60;99.60;99.60;99.60;99.60;99.60	
100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00	100.00	100.00	100.00	99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;99.61;100.00	100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00	100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00	100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00	100.00;100.00	100.00	100.00;100.00;100.00;100.00;100.00;100.00;100.00	100.00	100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;100.00;95.33;100.00	98.67;98.67;98.67;98.67;98.67;98.67;98.67;98.67;98.67;98.67;98.67;98.67	100.00;100.00	100.00;100.00;100.00;100.00;100.00	90.37	100.00;100.00;100.00;100.00;100.00;100.00	99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66;99.66	98.47;98.57;98.57

Install also requires BioPerl

When installing via brew, the installation aborts complaining about BioPerl not being installed. You might want to add this to the dependencies?

Need to install JSON module on Ubuntu for updating databases

On Ubuntu 14.04 LTS2 (Biolinux8) and Ubuntu 16.04, I had to install the JSON module manually with cpanm before I could run "abricate-get_db --db resfinder --force".

Maybe add this to the readme?

"sudo cpanm install JSON".

--summary is buggy with respect to filenames vs #FILE

abricate --summary 20*-*_abricate.tab  | cut -f1-3
#FILE   NUM_FOUND       QnrS1_1
/mnt/seq/MDU/QC/2006-22873/contigs.fa   16      .
/mnt/seq/MDU/QC/2007-09580/contigs.fa   3       .
/mnt/seq/MDU/QC/2008-00824/contigs.fa   3       .
/mnt/seq/MDU/QC/2008-08584/contigs.fa   11      .
/mnt/seq/MDU/QC/2009-13468/contigs.fa   1       36.99;61.04;38.81;35.92
/mnt/seq/MDU/QC/2010-02703/contigs.fa   3       36.99;61.04;38.81;35.92
/mnt/seq/MDU/QC/2010-13898/contigs.fa   7       .
/mnt/seq/MDU/QC/2011-02423/contigs.fa   7       .
/mnt/seq/MDU/QC/2011-07982/contigs.fa   9       .
/mnt/seq/MDU/QC/2012-09399/contigs.fa   1       .
/mnt/seq/MDU/QC/2012-19021/contigs.fa   11      .
/mnt/seq/MDU/QC/2013-11571/contigs.fa   3       .
/mnt/seq/MDU/QC/2014-09379/contigs.fa   7       .
/mnt/seq/MDU/QC/2014-22940/contigs.fa   3       .
2006-22873_abricate.tab 0       .
2007-09580_abricate.tab 0       .
2007-16390_abricate.tab 0       .
2007-24238_abricate.tab 0       .
2008-00824_abricate.tab 0       .
2008-08584_abricate.tab 0       .
2009-10234_abricate.tab 0       .
2009-13468_abricate.tab 0       .
2010-02703_abricate.tab 0       .
2010-13898_abricate.tab 0       .
2011-02423_abricate.tab 0       .
2011-07982_abricate.tab 0       .
2012-09399_abricate.tab 0       .
2012-19021_abricate.tab 0       .
2013-01222_abricate.tab 0       .
2013-11571_abricate.tab 0       .
2014-09379_abricate.tab 0       .
2014-22940_abricate.tab 0       .
2015-03233_abricate.tab 0       .
2015-03995_abricate.tab 0       .
2015-04172_abricate.tab 0       .
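The output above mixes rows keyed on the #FILE column with rows keyed on the report filename. A minimal Python sketch of a summary that always groups hits by the #FILE value is shown below. It is an illustration, not abricate's implementation. It assumes tab-separated reports with a header line containing `#FILE`, `GENE` and `%COVERAGE`, and it treats NUM_FOUND as the number of distinct genes per sample. The function names are hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarise(report_texts):
    """Build {sample: {gene: [coverage, ...]}} from report contents.

    Rows are keyed on the value of the #FILE column, not on the name of
    the report file they were read from.
    """
    table = defaultdict(lambda: defaultdict(list))
    for text in report_texts:
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        for row in reader:
            table[row["#FILE"]][row["GENE"]].append(row["%COVERAGE"])
    return table


def format_summary(table):
    """Render one row per sample, with '.' for genes that were not found."""
    genes = sorted({gene for hits in table.values() for gene in hits})
    lines = ["\t".join(["#FILE", "NUM_FOUND", *genes])]
    for sample in sorted(table):
        hits = table[sample]
        cells = [";".join(hits[g]) if g in hits else "." for g in genes]
        lines.append("\t".join([sample, str(len(hits)), *cells]))
    return "\n".join(lines)


# Toy reports, not real results.
report_a = "#FILE\tGENE\t%COVERAGE\ns1.fa\tblaZ\t100.00\ns1.fa\tmecA\t99.91\n"
report_b = "#FILE\tGENE\t%COVERAGE\ns2.fa\tblaZ\t98.50\n"
print(format_summary(summarise([report_a, report_b])))
```

One limitation of this approach: a report with no hits contains no #FILE value, so the sample does not appear in the summary unless its name is supplied some other way.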
