crisprcasfinder's People

Contributors

bneron, dcouvin, duboism

crisprcasfinder's Issues

A question when doing test

Can't locate Data/Dumper.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at CRISPRCasFinder.pl line 23.
BEGIN failed--compilation aborted at CRISPRCasFinder.pl line 23.
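
The message above means the Perl interpreter that ran the script could not load Data::Dumper from any directory in its search path. A minimal sketch of a pre-flight check follows, assuming `perl` is on the PATH; the helper names and the module list are illustrative and are not part of CRISPRCasFinder. On Red Hat-style systems (which the `lib64` paths suggest) the module is usually shipped in the `perl-Data-Dumper` package, but that should be confirmed for the system at hand.

```python
import subprocess

# Modules named in error reports on this page; extend the list as needed.
MODULES = ["Data::Dumper", "Date::Calc"]


def module_check_cmd(module, perl="perl"):
    """Return the argv that asks `perl` to load `module` and then exit."""
    return [perl, f"-M{module}", "-e", "1"]


def missing_modules(modules, perl="perl"):
    """Return the modules that the given perl binary cannot load."""
    missing = []
    for module in modules:
        try:
            result = subprocess.run(module_check_cmd(module, perl),
                                    capture_output=True)
            loaded = result.returncode == 0
        except FileNotFoundError:
            loaded = False  # the perl binary itself was not found
        if not loaded:
            missing.append(module)
    return missing


if __name__ == "__main__":
    print("Missing Perl modules:", missing_modules(MODULES))
```

Running the check with the same `perl` that launches CRISPRCasFinder.pl matters, because a different interpreter (for example one from a conda environment) has a different search path.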

Too many intermediate files

Hi there, not sure if this has been addressed elsewhere, but your (very fine!) tool produces tens of thousands of intermediate files, a number that scales with the size of the input, to the point that it has hit my disk drive's file-count limit. This is mostly just a note that it would be cool if there were a way to avoid that, unless there is one that I missed.
Thanks!
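
To see which part of an output directory accounts for the file count, a small sketch like the one below can help. It only counts files; it does not rely on any CRISPRCasFinder option, and the function name is illustrative.

```python
import os


def count_files(root):
    """Count regular files under each top-level entry of `root`.

    Files sitting directly in `root` are reported under the key ".".
    """
    counts = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            # Walk the whole subtree and add up the files at every level.
            counts[entry.name] = sum(
                len(files) for _, _, files in os.walk(entry.path)
            )
        else:
            counts["."] = counts.get(".", 0) + 1
    return counts


if __name__ == "__main__":
    import sys
    # Print the busiest subdirectories first.
    for name, n in sorted(count_files(sys.argv[1]).items(),
                          key=lambda item: -item[1]):
        print(f"{n:10d}  {name}")
```

Pointing the run at a scratch location and copying back only the summary files is one possible way to stay under a file-count quota, though that is a workflow choice rather than a feature of the tool.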

Different results for CasFinder in online and local runs

Hello!

Thank you for your wonderful software!
I had a question about running Casfinder. When I run the following genome (https://drive.google.com/file/d/1MolqP-lrjhngn2mO_XLq3QCUX1Fp_rR8/view?usp=sharing) on the online system, it reports multiple Cas genes. But when I run it on my local CrisprCasFinder installation using -

/.../CRISPRCasFinder.pl -i /.../NZ_BJTO01000001.1.fa -out /.../NZ_BJTO01000001 -html 1 –keep 1 -cas 1 -ccvr 1 -ccc 1 -gscf 1 -cpuM 2
in the output, there are no Cas genes.

What do you think may be the cause of this discrepancy?

Thanks!
Chahat
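
One detail worth checking in the command above: as rendered on this page, `–keep` begins with an en dash rather than an ASCII hyphen. Whether that character was in the command as typed or was introduced when the issue was rendered is not known, and it may be unrelated to the missing Cas genes. The sketch below simply flags non-ASCII dash characters in a command string; the function name is illustrative.

```python
import unicodedata


def suspicious_dashes(command):
    """Return (position, Unicode name) for each non-ASCII dash in `command`."""
    found = []
    for pos, ch in enumerate(command):
        # Category "Pd" is dash punctuation; "-" (U+002D) is the plain hyphen
        # that command-line option parsers expect.
        if ch != "-" and unicodedata.category(ch) == "Pd":
            found.append((pos, unicodedata.name(ch)))
    return found


if __name__ == "__main__":
    print(suspicious_dashes("-html 1 \u2013keep 1 -cas 1"))
```

An empty result means every dash in the string is a plain ASCII hyphen.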

a particular set of parameters for CCF fails to produce results for all genomes we tried

We have discovered recently that a particular set of parameters (quite sensible, in our opinion) for CCF fails to produce any CRISPR arrays for all genomes we tried.

Here is the command line we used

        perl $installation_root/CRISPRCasFinder.pl \
            -in  "$fasta" \
            -cas -keep \
            -drpt  $installation_root/supplementary_files/repeatDirection.tsv \
            -rpts $installation_root/supplementary_files/Repeat_List.csv \
            -ccvr -dbc $installation_root/supplementary_files/CRISPR_crisprdb.csv \
            -levelMin 3 \
            -mismDRs 20 \
            -minNbSpacers 4 \
            -noMism \
            -truncDR 20 \
            -spSim 10 

Without the additional numeric parameters and the -noMism flag, it produces a sensible array for a sample input that I will attach later:

      "Crisprs": [
        {
          "Name": "NZ_BJDJ01000033_1",
          "Start": 1396,
          "End": 2620,
          "DR_Consensus": "GGGTTTAACCTTATTGATTTAACATCCTTCTAAAAC",
          "Repeat_ID": "R5170",
          "DR_Length": 36,
          "Spacers": 18,
          "Potential_Orientation": "-",
          "CRISPRDirection": "-",
          "Evidence_Level": 4,
          "Conservation_DRs": 99.1736881974464,
          "Conservation_Spacers": 0,
          "Regions": [
            {

As you can see, the output fits the input parameters. I am not sure how the -truncDR parameter is used, since the corresponding metric is not reported back in the JSON output.
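
Until the parameter interaction is understood, one possible workaround is to run with default settings and apply thresholds to the JSON afterwards. The sketch below assumes only the keys visible in the excerpt above (`Crisprs`, `Name`, `Evidence_Level`, `Spacers`); the nesting above `Crisprs` is not shown in the issue, so the code searches for that key at any depth. The default thresholds mirror `-levelMin 3` and `-minNbSpacers 4`, but this is post-filtering and is not claimed to reproduce what the tool does internally.

```python
import json


def iter_crisprs(node):
    """Yield every entry stored under any "Crisprs" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Crisprs" and isinstance(value, list):
                yield from value
            else:
                yield from iter_crisprs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_crisprs(item)


def select_arrays(result, min_level=3, min_spacers=4):
    """Return names of arrays meeting both thresholds."""
    return [
        crispr["Name"]
        for crispr in iter_crisprs(result)
        if crispr.get("Evidence_Level", 0) >= min_level
        and crispr.get("Spacers", 0) >= min_spacers
    ]


def select_from_file(path, **thresholds):
    """Load a result file (path supplied by the caller) and filter it."""
    with open(path) as handle:
        return select_arrays(json.load(handle), **thresholds)
```

For the array shown above (evidence level 4, 18 spacers), `select_arrays` would keep `NZ_BJDJ01000033_1` under the default thresholds.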

[FEATURE REQUEST] macsyfinder models custom dir

It would be nice to add a flag that lets one specify the path to a custom macsyfinder models dir.
This is my current installation solution as admin on Oracle Linux 8.8:

VERSION=4.3.2
URL=https://github.com/dcouvin/CRISPRCasFinder
conda env create --name crisprcasfinder-$VERSION --file https://raw.githubusercontent.com/dcouvin/CRISPRCasFinder/release-${VERSION}/ccf.environment.yml
conda activate crisprcasfinder-${VERSION}
conda install -c bioconda macsyfinder=2.0
macsydata install --models $CONDA_PREFIX/share/macsyfinder/models CASFinder==3.1.0
pushd $CONDA_PREFIX/share
git clone --quiet --depth 1 --branch release-$VERSION $URL crisprcasfinder
chmod +x crisprcasfinder/CRISPRCasFinder.pl
sed --in-place=.orig "s@macsyfinder -w@macsyfinder --models-dir ${CONDA_PREFIX}/share/macsyfinder/models/ -w@" ${CONDA_PREFIX}/share/crisprcasfinder/CRISPRCasFinder.pl
pushd ../bin
ln -s ../share/crisprcasfinder/CRISPRCasFinder.pl .
popd ; popd

And running with test user:

TEMPDIR=$(mktemp -d)
pushd $TEMPDIR
conda activate crisprcasfinder-4.3.2
git clone https://github.com/dcouvin/CRISPRCasFinder
cd CRISPRCasFinder
CRISPRCasFinder.pl -in install_test/sequence.fasta -cas -keep

It would be nice to omit the sed command and be able to do something like this:

TEMPDIR=$(mktemp -d)
pushd $TEMPDIR
conda activate crisprcasfinder-4.3.2
git clone https://github.com/dcouvin/CRISPRCasFinder
cd CRISPRCasFinder
CRISPRCasFinder.pl -in install_test/sequence.fasta -cas -keep -msfdir $CONDA_PREFIX/share/macsyfinder/models

Errors when using --faa and --gff options

Hi,
I'm trying to run the Singularity version of CRISPRCasFinder (version 4.2.20) with the -faa and -gff options (I generated my GFF file using the Dfast annotator, see below), but I'm getting the following errors:

COMMAND USED:
sudo singularity exec -B $PWD CrisprCasFinder.simg perl /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl -so /usr/local/CRISPRCasFinder/sel392v2.so -cf /usr/local/CRISPRCasFinder/CasFinder-2.0.3 -drpt /usr/local/CRISPRCasFinder/supplementary_files/repeatDirection.tsv -rpts /usr/local/CRISPRCasFinder/supplementary_files/Repeat_List.csv -cas -def G -out CrisprCasFinder2 -in MB0146_2.fasta -faa MB0146_protein_2.fasta -gff MB0146_short_2.gff --keep

STDOUT:
################################################################

--> Welcome to /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl (version 4.2.20)

################################################################

vmatch2 is...............OK
mkvtree2 is...............OK
vsubseqselect2 is...............OK
fuzznuc (from emboss) is...............OK
needle (from emboss) is...............OK

[23:39:36] ---> Results will be stored in CrisprCasFinderOut

Sequence number 1.. ( Input file: ppMB0146_1.fna, Sequence ID: ppMB0146_1, Sequence name = Unknown )
Nb of CRISPRs in this sequence = 0

prodigal installation is.............OK
macsyfinder installation is...........OK
MacSyFinder's results will be stored in MB0146_2_22_10_2020_23_39_36/casfinder_ppMB0146_1/
Analysis launched on /home/cris/installs/crisprcasfinder/MB0146_protein_2.fasta for system(s):
- General-Class1
- General-Class2


General-Class1


Accessory genes:
Csx17_0_IU 0
Cas3_0_IU 0
Csx10_0_IIID 0
Csm3_1_IIID 0
Cas3_0_ID 0
Csc1_0_ID 0
Csf1_1_IV 0
Csm3_1_IIIAD 0
Csm3_0_IIIA 0
Csm3_0_IIID 0
Csx10_1_IIID 0
Cas7_1_IB 0
Csb1_0_IU 0
Cas3-Cas2_0_IF 0
Cas7_1_IC 0
Cas10_1_IIIB 0
Cas7_0_IE 0
Cas7_0_IA 0
Cas7_0_IC 0
Cas1_0_I-II-III-V 0
Cas8a1_2_IB 0
Cas4_0_I-II 0
Cas5_0_I 0
Csf4_1_IV 0
Csx19_0_IIID 0
Csm4_0_IIIA 0
Csm2_0_IIID 0
Csm2_0_IIIA 0
Csm5_0_IIIA 0
Csf2_0_IV 0
Cas2_0_I-II-III-V 1
Cas6_0_IA 0
Cas6_0_IF 0
Cas6_0_IE 0
Cas10_0_IIIA 0
Cas10_0_IIIC 0
Cas10_0_IIIB 0
Cmr1_0_IIIC 0
Cmr1_0_IIIB 0
Csm2_1_IIIA 0
Cas5_0_IE 0
Cas5_0_IB 0
Cas5_0_IC 0
Cas5_0_IA 0
Csf5_0_IV 0
CsaX_0_IA 0
Csf4_0_IV 0
Cas8b_0_IB 0
Cmr8_0_IIIB 0
Csm5_1_IIIA 0
Cas7_3_IB 0
Cas1_0_I-II-III 0
Csc2_0_ID 0
Cas3_1_I 0
Cmr7_0_IIIB 0
Cas2_0_IE 0
Cmr6_0_IIIB 0
Cmr6_0_IIIC 0
Cas10d_0_ID 0
Csf3_0_IV 0
Cas5_1_IB 0
Cas3_0_I 0
Cas4_0_I-II-V 0
Cas4_0_IA 0
Csf2_1_IV 0
Cse2_0_IE 0
Cmr5_1_IIIB 0
Cmr5_1_IIIC 0
Cmr4_0_IIIB 0
Cas8c_1_IC 0
Csm3_0_IIIAD 0
Csf3_1_IV 0
Cas1_0_II 0
Csb2_0_IU 0
Cas7_0_I 0
Cmr3_0_IIIC 0
Cmr3_0_IIIB 0
Csy3_0_IF 0
Csa5_0_IA 0
Cas2_0_I-II-III 0
Cas1_0_IC 0
Csm6_0_IIIA 0
Cas1_0_IA 0
Cas1_0_IF 0
Cas1_0_IE 0
Cas8c_0_IC 0
Csf1_0_IV 0
Cas10_0_III 0
Csb3_0_IU 0
Cmr3_1_IIIB 0
Cas7_2_IB 0
Cmr5_0_IIIC 0
Cmr5_0_IIIB 0
Cas6_0_I-III 0
Csm4_1_IIIA 0
Csy2_0_IF 0
Cse1_0_IE 0
Cas8a1_0_IA 0
Cas8a1_0_IB 0
Cmr6_1_IIIB 0
Cas8a1_1_IB 0
Cas5_1_IC 0
Cas8a1_1_IA 0
Csy1_0_IF 0
Cmr4_1_IIIB 0


Building reports of detected systems


System: General-Class1 (General-Class1_putative)

#SequenceID Cas-type/subtype Gene status System Type Begin End Strand Other_information
Use of uninitialized value within %hashGeneType in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1778, line 92.
Use of uninitialized value within %hashGeneBegin in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1779, line 92.
Use of uninitialized value within %hashGeneEnd in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1780, line 92.
Use of uninitialized value within %hashGeneStrand in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1781, line 92.
Use of uninitialized value within %hashGeneOther in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1782, line 92.
MB0146_17175 Cas2_0_I-II-III-V accessory General-Class1
Use of uninitialized value within %hashGeneType in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1790, line 92.
Use of uninitialized value within %hashGeneBegin in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1791, line 92.
Use of uninitialized value within %hashGeneEnd in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1792, line 92.
Use of uninitialized value within %hashGeneStrand in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1793, line 92.
Use of uninitialized value within %hashGeneOther in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1794, line 92.
Use of uninitialized value in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1797, line 92.
Use of uninitialized value in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1797, line 92.
Use of uninitialized value in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1797, line 92.
Use of uninitialized value in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1800, line 92.
Use of uninitialized value in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1800, line 92.
Use of uninitialized value in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1800, line 92.
Use of uninitialized value in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1846.
Use of uninitialized value in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1846.
Use of uninitialized value $_[1] in numeric gt (>) at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 3521.
Use of uninitialized value $_[0] in numeric gt (>) at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 3521.
Use of uninitialized value $_[1] in numeric gt (>) at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 3516.
Use of uninitialized value $_[0] in numeric gt (>) at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 3516.
Use of uninitialized value $beginCasCluster in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1867.
Use of uninitialized value $endCasCluster in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1867.
####Summary system General-Class1:begin=;end=;sequenceID=ppMB0146_1

Use of uninitialized value $beginCasCluster in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1868.
Use of uninitialized value $endCasCluster in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1868.
Use of uninitialized value $beginCasCluster in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1871.
Use of uninitialized value $endCasCluster in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1871.
Use of uninitialized value $beginCasCluster in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1896.
Use of uninitialized value $endCasCluster in concatenation (.) or string at /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl line 1896.

Nb of Cas in this sequence = 1

Statistics on CRISPRs orientation by CRISPRCasFinder vs. CRISPRDirection

Total number of CRISPRs arrays found = 0
Number of perfect macthes between CRISPRCasFinder and CRISPRDirection = 0
Number of Forward by CRISPRCasFinder = 0
Number of Forward by CRISPRDirection = 0
Number of Reverse by CRISPRCasFinder = 0
Number of Reverse by CRISPRDirection = 0

Number of unoriented by CRISPRCasFinder = 0
Number of unoriented by CRISPRDirection = 0
Orientations count file created: MB0146_2_22_10_2020_23_39_36/crisprs_orientations_count.tsv

Secondary folders/files (Prodigal, CasFinder, rawFASTA, CRISPRFinderProperties) have been created

All CRISPRs = 0
All Cas = 1

[23:39:37] Thank you for using /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl! Thank you for your patience!

[23:39:37] The script lasted: 0 year(s) 0 month(s) 0 day(s) , 0 hour(s) 0 minute(s) 1 second(s)

Here is (part of) the GFF I'm using:
GFF:
##gff-version 3
ppMB0146_1 GAnn plasmid 1 48031 . . . ID=ppMB0146_1;Name=ppMB0146_1;circular=True;
ppMB0146_1 Prodigal:2.6.3 CDS 174 404 . + 0 ID=MB0146_16925;Name=MB0146_16925;inference=Prodigal ab initio prediction;product=hypothetical protein;
ppMB0146_1 Prodigal:2.6.3 CDS 1015 1524 . + 0 ID=MB0146_16930;Name=MB0146_16930;inference=INSD:KOR93936.1;product=dUTPase;
ppMB0146_1 Prodigal:2.6.3 CDS 1521 1832 . + 0 ID=MB0146_16935;Name=MB0146_16935;inference=Prodigal ab initio prediction;product=hypothetical protein;
ppMB0146_1 Prodigal:2.6.3 CDS 1989 2291 . + 0 ID=MB0146_16940;Name=MB0146_16940;inference=Prodigal ab initio prediction;product=hypothetical protein;
ppMB0146_1 Prodigal:2.6.3 CDS 2288 2638 . + 0 ID=MB0146_16945;Name=MB0146_16945;inference=Prodigal ab initio prediction;product=hypothetical protein;
ppMB0146_1 Prodigal:2.6.3 CDS 2643 3035 . + 0 ID=MB0146_16950;Name=MB0146_16950;inference=Prodigal ab initio prediction;product=hypothetical protein;
ppMB0146_1 Prodigal:2.6.3 CDS 3025 3279 . + 0 ID=MB0146_16955;Name=MB0146_16955;inference=Prodigal ab initio prediction;product=hypothetical protein;
ppMB0146_1 Prodigal:2.6.3 CDS 3285 4034 . + 0 ID=MB0146_16960;Name=MB0146_16960;inference=RefSeq:WP_011459086.1;product=thymidylate synthase (FAD);EC_number=2.1.1.148;
ppMB0146_1 Prodigal:2.6.3 CDS 4047 4478 . + 0 ID=MB0146_16965;Name=MB0146_16965;inference=INSD:KOR93955.1;product=RinA family phage transcriptional regulator;

Could there be something about the GFF that is causing this error (that appears to be related to parsing the GFF)?
Please feel free to let me know if you require more information, and thank you for your help!
Cris
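
The warnings name lookup tables such as `%hashGeneBegin`, which suggests (this is an inference from the log, not something confirmed from the script) that coordinates are looked up by gene ID and that the lookup came back empty for the reported hit, `MB0146_17175`. A quick cross-check is to confirm that every protein ID in the `-faa` file has a CDS entry with the same ID in the `-gff` file. The sketch below assumes tab-separated GFF3 with an `ID=` attribute, as in the excerpt; the function names are illustrative.

```python
def gff_cds_ids(gff_text):
    """Map each CDS ID in GFF3 text to (begin, end, strand)."""
    genes = {}
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # GFF3 feature lines have exactly nine tab-separated columns.
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        if "ID" in attrs:
            genes[attrs["ID"]] = (int(cols[3]), int(cols[4]), cols[6])
    return genes


def fasta_ids(fasta_text):
    """Return the first word of every FASTA header line."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                ids.append(words[0])
    return ids


def unmatched_proteins(fasta_text, gff_text):
    """List protein IDs that have no CDS with the same ID in the GFF."""
    known = gff_cds_ids(gff_text)
    return [pid for pid in fasta_ids(fasta_text) if pid not in known]
```

If `unmatched_proteins` returns IDs, the annotation and protein files are out of step (for instance because the GFF was shortened), which would be one plausible source of the empty Begin/End fields.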

Singularity image unable to find local files

Hello!

I am trying to use the Singularity image of CrisprCasFinder locally (since I had problems with the regular local installation - as described in another issue I have opened). I downloaded the Singularity image, and installed Vagrant, which is required to run Singularity on a Mac.

Now, to test if it works, I used the command -

singularity exec /vagrant/softwares/singularity/CrisprCasFinder.simg perl /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl -h

And this gave the output that we expect -

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = "en_US:",
	LC_ALL = (unset),
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Name:
  /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl standalone version 4.2.20

Synopsis:
  A perl script to identify CRISPR arrays and associated cas genes in DNA sequences

Usage:

Next I tried to check whether local files are accessible from within the virtual machine (the /vagrant/ folder was bound to my desired local folder), and that worked too -

vagrant@vagrant:/$ head /vagrant/NZ_BJTO01000001.1.fa 
>NZ_BJTO01000001.1 Enterococcus faecalis strain J190 sequence01, whole genome shotgun sequence
AAAGATAGCTTGAAAGATCGCTTAGATAAAGTGACAACCTCTGAAGTAACAGTGAATGATGCAGATAGCA
ATGGCAAAGCGGACGATGTAGATTTAGCTGAAAAAGCAGCGGCAGACGCAGTAAAAGCAGCAGAAGACGC
AGGCAAAGCTGGAGCAGATAAGAAAGCCGAAGTGGAAACCGACGGTTTAGTGACTCCAGAGGAAAAAGCG
GCAGTGGATGGCTTGTTAGAAATAAAACAGTCTTCATTTATGCCGTTTGAAAATTTATTTTCGACTACAA
ATGATTACTCACAGTTTCCTAAAACTGGTGAAAAATCTGATTCTATTTTAACCATTTATGGAGGTTTATT
ATTCTTAAGTAGTATAGGATTATTAGGAATAAAAAAAAGAAAAAATAATACGAATTAAGTTTGTTTGTAT
TCTTCTTTAAGAAAGGATAGGTGTATAAATTTTATCAATAAAAAGCTGACTATTTGTCAAATAGAGTTGA
TAGAATGATAATAAAAGATCATCTAAGGTAGGATTTCTCTTTTCTACTTTAGATGACTTTTTGTTTAAAT
ATCAGATAAATTTTGATTGGACCTGTAGTTGAATATAGATCTGTCGTAGTTACAGAAGGTAAATCGCATA
vagrant@vagrant:/$ 

But when I tried to run CrisprCasFinder, it returned a file-not-found error -

vagrant@vagrant:/$ singularity exec /vagrant/softwares/singularity/CrisprCasFinder.simg perl /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl -in /vagrant/NZ_BJTO01000001.1.fa
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = "en_US:",
	LC_ALL = (unset),
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").

Error: file /vagrant/NZ_BJTO01000001.1.fa not found. Please check that the file exists or enter a correct file name.
Usage: /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl [options] [-in] <filename>
type -version or -v to see the current version
type -help or -h for help
vagrant@vagrant:/$ 

Any ideas why this could be happening?
Thank you so much!
Chahat
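
A file that is visible in the virtual machine is not necessarily visible inside the container, since a container sees only the host paths that are bound into it. Other commands on this page pass `-B` to `singularity exec` for that purpose. Whether a missing bind is the cause here is an assumption; the sketch below just builds a command line that binds the directory containing the input file. The function name is illustrative.

```python
import os


def singularity_cmd(image, script, fasta):
    """Build a `singularity exec` argv that binds the FASTA file's directory."""
    fasta = os.path.abspath(fasta)
    bind_dir = os.path.dirname(fasta)
    return [
        "singularity", "exec",
        "-B", bind_dir,          # make the input directory visible in the container
        image,
        "perl", script,
        "-in", fasta,
    ]


if __name__ == "__main__":
    print(" ".join(singularity_cmd(
        "/vagrant/softwares/singularity/CrisprCasFinder.simg",
        "/usr/local/CRISPRCasFinder/CRISPRCasFinder.pl",
        "/vagrant/NZ_BJTO01000001.1.fa",
    )))
```

Passing the resulting list to `subprocess.run` avoids any shell quoting issues with the paths.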

CRISPRCasFinder.pl did not report any errors but produced three unknown core files

Hi,
This is my command: CRISPRCasFinder.pl -in tt.fna -def General -cas -ccvr -ccc 50000 -gscf -keep -html -log -cpuM 8 -cpuP 8 --so sel392v2.so>tt.log
CRISPRCasFinder.pl did not report any errors but produced three unknown core files:
core.304026 core.304028 core.305136
I don't know what these files represent. Are they normal output from the program?
Thanks

munmap_chunk Error

Hi! I've run your script a couple of times using the same command and I kept getting this error.

CRISPRCasFinder.pl -i 10000_contigs.fasta -out 10000_contigs -meta -log -cas -cpuM 8 -def S

prodigal installation is.............OK
macsyfinder installation is...........OK

PRODIGAL v2.6.2 [February, 2015]
Univ of Tenn / Oak Ridge National Lab
Doug Hyatt, Loren Hauser, et al.

Request: Metagenomic, Phase: Training
Initializing training files...done!

Request: Metagenomic, Phase: Gene Finding
Finding genes in sequence #1 (1598 bp)...done!
MacSyFinder's results will be stored in 97000_contigs_AUR_16_5_2019_13_0_51/casfinder_contig-100_48157/
Analysis launched on /media/2tb_Viejo/CAS_Find/CRISPrCASFinder/AUR/97000_contigs_AUR_16_5_2019_13_0_51/prodigal_contig-100_48157/contig-100_48157.faa for system(s):
- CAS-TypeIIU
- CAS-TypeIF
- CAS
- CAS-TypeIC
- CAS-TypeU
- CAS-TypeIB
- CAS-TypeIE
- CAS-TypeIIB
- CAS-TypeIIA
- CAS-TypeIIIA
- CAS-TypeIIIB
- CAS-TypeID
- CAS-TypeIU
- CAS-TypeIIIU
- CAS-TypeIA
*** Error in `python': munmap_chunk(): invalid pointer: 0x00007f02140010e0 ***
Terminated (killed)

The file has many sequences (10,000), and the error always occurred in a different sequence, until one time it worked just fine. Could you help me understand what's happening, so I don't get the same error in future runs?
Thanks!!

BioConda package

Hi and thanks for this great CRISPR tool.
I wonder if there might be a bioconda package for it soon. It would be very handy and useful for many people who are not into Docker/Singularity containers.

run command locally and submission to website generate different results.

Hi
I installed CRISPRCasFinder on my Mac and ran
perl ./../CRISPRCasFinder.pl -in ../../hybrid_assembly/assembly_fasta&gfa/hybrid_fasta/Yol001.fasta -cas -cf CasFinder-2.0.2 -def G -keep
locally, and it was not able to find any cas genes or CRISPR arrays. However, when I submit the same genome to the website, https://crisprcas.i2bc.paris-saclay.fr/CrisprCasFinder/Viewing/637185279057211761, it is able to find 2 CRISPRs and 1 Cas.
Could anyone please help me find the reason for this issue? Your help will be greatly appreciated.

By the way, when I install CRISPRCasFinder, I have to run
$ export PERL5LIB=/usr/local/libexec/crisprcas/lib/perl5:/Users/dklabuser/Scripts/cgview_comparison_tool/lib/bioperl-1.2.3:/Users/dklabuser/Scripts/cgview_comparison_tool/lib/perl_modules:

Then, run $ perl ./../CRISPRCasFinder.pl -in ../../hybrid_assembly/assembly_fasta&gfa/hybrid_fasta/Yol001.fasta -cas -cf CasFinder-2.0.2 -def G -keep

I don't know what is wrong with this issue either. If anyone can help, that will be really great.

Best
Limin
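
The input path in the command above contains an unquoted `&` (in `assembly_fasta&gfa`). In a POSIX shell, `&` ends the command and runs it in the background, so the script would receive a truncated `-in` path and none of the options that follow. Whether this fully explains the empty result is not certain, but quoting the path rules it out. A minimal sketch using the standard `shlex` module; the helper name is illustrative.

```python
import shlex


def build_command(script, fasta, extra_args):
    """Return a shell-safe command string with every argument quoted as needed."""
    argv = ["perl", script, "-in", fasta, *extra_args]
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    print(build_command(
        "./../CRISPRCasFinder.pl",
        "../../hybrid_assembly/assembly_fasta&gfa/hybrid_fasta/Yol001.fasta",
        ["-cas", "-cf", "CasFinder-2.0.2", "-def", "G", "-keep"],
    ))
```

Typing the command by hand with the path wrapped in single quotes has the same effect.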

"index files are missing" error when running test command

Hi,
I'm trying to run the command to test my local install of CRISPRCasFinder.pl.

perl CRISPRCasFinder.pl -in install_test/sequence.fasta -cas -cf CasFinder-2.0.3 -def G -keep -soFile $VMATCH_SEL392

but macsyfinder is telling me that it can't find some of the index files it needs (see error message below).

I read through the documentation to see if there is an option to tell CRISPRCasFinder to build these index files, but didn't see anything. Or maybe I got something wrong in my macsyfinder installation.

Would you have any idea of what I'm not doing correctly?
Thanks in advance for any help.

...
...

prodigal installation is.............OK 
macsyfinder installation is...........OK 
-------------------------------------
PRODIGAL v2.6.3 [February, 2016]         
Univ of Tenn / Oak Ridge National Lab
Doug Hyatt, Loren Hauser, et al.     
-------------------------------------
Request:  Single Genome, Phase:  Training
Reading in the sequence(s) to train...526169 bp seq created, 52.76 pct GC
Locating all potential starts and stops...27714 nodes
Looking for GC bias in different frames...frame bias scores: 1.12 0.26 1.63
Building initial set of genes to train from...done!
Creating coding model and scoring nodes...done!
Examining upstream regions and training starts...done!
-------------------------------------
Request:  Single Genome, Phase:  Gene Finding
Finding genes in sequence #1 (526169 bp)...done!
Traceback (most recent call last):
  File "/software/UHTS/Analysis/macsyfinder/1.0.5/bin/macsyfinder", line 329, in <module>
    idx.build(force = config.build_indexes)
  File "/software/UHTS/Analysis/macsyfinder/1.0.5/lib/python2.7/site-packages/macsypy/database.py", line 119, in build
    self._hmmer_indexes = self.find_hmmer_indexes()
  File "/software/UHTS/Analysis/macsyfinder/1.0.5/lib/python2.7/site-packages/macsypy/database.py", line 142, in find_hmmer_indexes
    raise RuntimeError(msg)
RuntimeError: some index files are missing. Delete all index files (*.phr, *.pin, *.psd, *.psi, *.psq, *.pal) and try to rebuild them.
No Cas results
Nb of Cas in this sequence = 0
Secondary folders/files (Prodigal, CasFinder, rawFASTA, CRISPRFinderProperties) have been created

All CRISPRs = 9
All Cas = 0

[14:55:46] Thank you for using CRISPRCasFinder.pl! Thank you for your patience!

[14:55:46] The script lasted: 0 year(s) 0 month(s) 0 day(s) , 0 hour(s) 0 minute(s) 4 second(s)
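
The RuntimeError itself suggests deleting the index files and rebuilding them. The sketch below lists files with the extensions named in that message and deletes them only when asked to. Which directory holds the stale indexes is not stated in the issue, so the directory is left for the caller to supply; the function names are illustrative.

```python
from pathlib import Path

# Extensions copied from the RuntimeError text above.
INDEX_SUFFIXES = {".phr", ".pin", ".psd", ".psi", ".psq", ".pal"}


def is_index_file(name):
    """True if the file name ends with one of the index extensions."""
    return Path(name).suffix in INDEX_SUFFIXES


def stale_index_files(directory):
    """Return all index files found anywhere below `directory`."""
    return sorted(
        path for path in Path(directory).rglob("*")
        if path.is_file() and is_index_file(path.name)
    )


def remove_index_files(directory, dry_run=True):
    """List index files; delete them only when dry_run is False."""
    files = stale_index_files(directory)
    if not dry_run:
        for path in files:
            path.unlink()
    return files
```

Leaving `dry_run=True` for a first pass makes it possible to review the list before anything is removed.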

Installation of Date::Calc failed

I'm working on a Mac. The first time I tried to install CRISPRCasFinder, it could not install JSON::Parse. I installed it with conda and moved on. The second time the error was with Try::Tiny. Same solution. The third time it was Test::Most. However, the fourth time it was Date::Calc, which I have not been able to install via conda or any other way.

Installation of Perl and packages
Installation of Date::Calc...
Installation of Date::Calc failed see /Users/JFF/Desktop/Programas/CRISPRCasFinder-master/installer.2022-10-21-14:28:14.log for details.

perl module issues

Hello there,

Perl modules required for CRISPRCasFinder.pl script are conflicting with my Miniconda environment.

CRISPRCasFinder works perfectly if I remove Miniconda, but when Miniconda is present, CRISPRCasFinder cannot locate its modules. The following error occurs:

shishir@d3w:~/CRISPRCasFinder$ perl CRISPRCasFinder.pl
Can't locate Date/Calc.pm in @INC (you may need to install the Date::Calc module) (@INC contains: /home/shishir/miniconda3/envs/genomics/lib/site_perl/5.26.2/x86_64-linux-thread-multi /home/shishir/miniconda3/envs/genomics/lib/site_perl/5.26.2 /home/shishir/miniconda3/envs/genomics/lib/5.26.2/x86_64-linux-thread-multi /home/shishir/miniconda3/envs/genomics/lib/5.26.2 .) at CRISPRCasFinder.pl line 25.
BEGIN failed--compilation aborted at CRISPRCasFinder.pl line 25.

This happens not only for the Date::Calc module, but also for some others.
CRISPRCasFinder is looking for Perl modules in the Conda environment instead of the system (root) directories.

How can I make CRISPRCasFinder look for modules in the system directories?
Here is the path to the Perl modules:

shishir@d3w:~/CRISPRCasFinder$ locate Date/Calc.pm
/home/shishir/.cpan/build/Date-Calc-6.4-0/blib/lib/Date/Calc.pm
/home/shishir/.cpan/build/Date-Calc-6.4-0/lib/Date/Calc.pm
/home/shishir/.cpan/build/Date-Calc-6.4-1/blib/lib/Date/Calc.pm
/home/shishir/.cpan/build/Date-Calc-6.4-1/lib/Date/Calc.pm
/home/shishir/.cpanm/work/1595698278.17997/Date-Calc-6.4/lib/Date/Calc.pm
/usr/local/share/perl/5.30.0/Date/Calc.pm
/usr/share/perl5/Date/Calc.pm

shishir@d3w:~/CRISPRCasFinder$ locate Unix/Sysexits.pm
/usr/local/lib/x86_64-linux-gnu/perl/5.30.0/Unix/Sysexits.pm

N.B.: I am new to Linux, so I am running into issues. Sorry for the inconvenience.

The CRISPR-Cas_summary.tsv result doesn't have Cas Types/Subtypes

Hi, all

I used this nice tool to identify CRISPR-Cas systems for my assembled MAGs. However, the result of the output file (CRISPR-Cas_summary.tsv) doesn't have Cas and Cas Types/Subtypes. Why? Where is it wrong?

command:
perl CRISPRCasFinder.pl -in MAG1.fa -out output -log -so CRISPRCasFinder/sel392v2.so -cas -keep -gscf -meta
CRISPR-Cas_summary.txt

Best wishes!

Thanks!

Jay

error in using crisprcasfinder with subprocess, snakemake and Rscript

When I use subprocess, the error is that permission is required for /home/DRs_1.
When I use Snakemake, the error occurs after the tool refines my input and generates the FASTA file; it says: no such file or directory 'CPXXXXXX.fna'.
When I use Rscript, the error is: please provide a FASTA file with -in, even though one is provided.

I am trying to batch-analyse multiple FASTA files.
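
The `/home/DRs_1` message suggests the script writes intermediate files into whatever directory it is started from, which a workflow manager may set to somewhere unwritable. That reading is an assumption based on the error text alone. A minimal batch sketch under that assumption gives each genome its own writable working directory and passes absolute paths, using only the `-in`, `-out`, `-cas` and `-keep` options that appear elsewhere on this page. The function names and the `result` subdirectory are hypothetical.

```python
import subprocess
from pathlib import Path


def build_job(script, fasta, out_root):
    """Return (argv, workdir) for one genome, using absolute paths only."""
    fasta = Path(fasta).resolve()
    workdir = Path(out_root).resolve() / fasta.stem
    argv = [
        "perl", str(Path(script).resolve()),
        "-in", str(fasta),
        "-out", str(workdir / "result"),
        "-cas", "-keep",
    ]
    return argv, workdir


def run_batch(script, fastas, out_root):
    """Run every genome in its own working directory; return exit codes."""
    codes = {}
    for fasta in fastas:
        argv, workdir = build_job(script, fasta, out_root)
        workdir.mkdir(parents=True, exist_ok=True)
        # cwd=workdir keeps files written to the current directory in a
        # writable, per-sample location.
        done = subprocess.run(argv, cwd=workdir)
        codes[str(fasta)] = done.returncode
    return codes
```

Passing the arguments as a list rather than a single string also avoids the case where `-in` and its value are merged or dropped by a shell.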

BUG: Invoking the perl scripts from a different location than the source folder results in default macsyfinder CRISPR search

If the perl script is invoked from a location different than the source folder the following line of code will always fail:

if ( (-d $casfinder) and (-d $casdb) and (-d $profiles) ) 

This is because the directories checked here are relative and not absolute. Therefore macsyfinder is not called with these options and defaults to its own profiles and definitions, which are older than the ones from CRISPRCasFinder. In addition the -definition option is not used.
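
The effect described here can be reproduced in a few lines: a relative directory test succeeds or fails depending on the current working directory, whereas a path anchored at the script's own location does not. The sketch below illustrates the idea in Python; the directory names passed in are placeholders, since the values of `$casfinder`, `$casdb` and `$profiles` are not shown in the issue. In Perl, the same anchoring is commonly done with the core `FindBin` module, though how the maintainers would choose to fix it is up to them.

```python
import os


def anchored(script_dir, relative):
    """Resolve `relative` against the script's directory, not the cwd."""
    return os.path.join(os.path.abspath(script_dir), relative)


def all_dirs_exist(script_dir, names):
    """Mirror the three-way `-d` test, but on anchored paths."""
    return all(os.path.isdir(anchored(script_dir, name)) for name in names)


if __name__ == "__main__":
    here = os.path.dirname(os.path.abspath(__file__))
    # "CasFinder-2.0.3" is the directory name used in commands on this page.
    print(all_dirs_exist(here, ["CasFinder-2.0.3"]))
```

With anchored paths, the result of the check no longer changes when the script is launched from another directory.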

Getting multiple CAS types for a single CRISPR array

What does it mean when multiple Cas systems/types are reported in overlapping positions? Below is an example from the CRISPR-Cas_summary.tsv of my results.

Sequence(s) CRISPR array(s) Nb CRISPRs Evidence-levels Cas cluster(s) Nb Cas Cas Types/Subtypes
NODE_16_length_64907_cov_27 NODE_16_length_64907_cov_27_1[39949;42929] (evidence-level=4), 1 Nb_arrays_evidence-level_1=0,Nb_arrays_evidence-level_2=0,Nb_arrays_evidence-level_3=0,Nb_arrays_evidence-level_4=1 CAS-TypeIC[43401;48298], CAS-TypeIA[8791;10563], CAS[8791;50522], 3 CAS (n=1), CAS-TypeIA (n=1), CAS-TypeIC (n=1),
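
Looking only at the coordinates in this row, the generic `CAS[8791;50522]` span encloses both subtype clusters, while the two subtype clusters do not overlap each other. The sketch below extracts the `name[begin;end]` entries from the "Cas cluster(s)" field and reports which clusters are enclosed by which. It describes the intervals only and does not explain how the tool assigns types; the function names are illustrative.

```python
import re

# Matches entries such as CAS-TypeIC[43401;48298]
CLUSTER = re.compile(r"([\w-]+)\[(\d+);(\d+)\]")


def parse_clusters(field):
    """Return (name, begin, end) tuples from a 'Cas cluster(s)' field."""
    return [(name, int(b), int(e)) for name, b, e in CLUSTER.findall(field)]


def enclosing(clusters):
    """For each cluster name, list the other clusters whose span encloses it.

    Names are used as dictionary keys, so repeated names in one row would
    overwrite each other.
    """
    result = {}
    for name, begin, end in clusters:
        result[name] = [
            other for other, ob, oe in clusters
            if other != name and ob <= begin and end <= oe
        ]
    return result


if __name__ == "__main__":
    field = "CAS-TypeIC[43401;48298], CAS-TypeIA[8791;10563], CAS[8791;50522],"
    print(enclosing(parse_clusters(field)))
```

Applied to the row above, both `CAS-TypeIC` and `CAS-TypeIA` are reported as enclosed by `CAS`, and `CAS` is enclosed by nothing.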

trouble with multifasta.fna

Hi, thank you for developing such a useful tool !

I successfully installed this software, and the functional test passed.

But when I try to run a test on a multifasta file like this:

>A1_scaffold1
TGTTTGTTTGTTAAAGAACTGCGCGATCAACTTTGTTCATCGCGTCGCTGCGT...
>A1_scaffold10
CGCACTCCGTCCGTATCGCCTGAATTCAAGATCATAGGTGGACCTGTGTCTTC...
>A1_scaffold100
GAGGCGAGGGTGGGGGACCTGCGCGTGCAGGTGCCGGGCCTGTTCGGGCAACT...
>A1_scaffold1008
CGGCCGACTTCTCGGAGACGATGGTCAGGCCGAGCCGCTTCTTGAAGGTGCCG...

It seems that only the first sequence, A1_scaffold1, was read, as all temporary files created were for A1_scaffold1 only.

My command is like this:
perl CRISPRCasFinder.pl -i A1.seq.fna -out A1 -cas -def -rpts -rcfowce -gff A1.crispr.gff -faa A1.crispr.faa -meta -ccvr

With this output :

################################################################
# --> Welcome to /home/umaru/software/CRISPRCasFinder/CRISPRCasFinder/CRISPRCasFinder.pl (version 4.2.17)
################################################################

vmatch2 is...............OK
mkvtree2 is...............OK
vsubseqselect2 is...............OK
fuzznuc (from emboss) is...............OK
needle (from emboss) is...............OK

[16:31:19] ---> Results will be stored in A1

Sequence number 1.. ( Input file: A1_scaffold1.fna, Sequence ID: A1_scaffold1, Sequence name = Unknown )
The shared object file (./sel392v2.so) must be available in your current directory. Otherwise, you must use option -soFile (or -so)!

So, is anything wrong?

CRISPRCasFinder.pl: command not found

Hi! Today I've been struggling to install CRISPRCasFinder on my Ubuntu computer. I've been following the manual step by step: I uncommented that one line in the sources.list file, ran installer_UBUNTU.sh, and set the environment variables. Everything worked just fine, but when I try to run the install test, it refuses to work, replying "CRISPRCasFinder.pl: command not found". I am very new to Ubuntu (I installed it today) and there's a lot I'm missing, so I would desperately like to understand why this error is occurring.
I've tried to install this program many times now, so that might be the problem, as I don't know exactly how the installation process works. I'll try my best to answer your questions, but I might not be as skilled as needed.

ENHANCEMENT: Add option CasFinder to assume the set of sequences is from the same replicon

It seems that the CasFinder module always evaluates each sequence from the fasta file separately. (There is a loop, and for each individual sequence prodigal and macsyfinder are executed.)

This is fine if all the sequences from the fasta are unrelated to each other, for example in a metagenome. However, in reality a fasta file can contain multiple sequences from the same organism (or replicon).

So, would it be possible to include an option to specify that the whole set of sequences in the provided fasta file is from the same replicon?

In addition, support for the gembase format could be included: a format proposed by the developers of macsyfinder to provide a single file with multiple sequences from multiple replicons, in which a subset of the sequences can originate from the same replicon.

No CRISPR/CAS detection on fasta files with more than 100 contigs

Hello,

I installed CRISPRCasFinder using conda. I am running the script on draft genomes. The script runs fine on FASTA files containing fewer than 100 contigs but detects no CRISPR arrays or cas genes in files with more than 100 contigs. The majority of my genomes have more than 100 contigs; how can I run the script on these files?

I would be highly obliged if you could help me through this.
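
Whether a 100-contig limit actually exists is only the reporter's observation. To test it, or to work around it in the meantime, a multi-FASTA file can be split into smaller batches that are run separately. The sketch below is a plain FASTA reader plus a batching helper; the batch size of 99 simply follows the "fewer than 100" observation and is not a documented threshold.

```python
def fasta_records(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def batches(records, size=99):
    """Group an iterable of records into lists of at most `size` items."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def write_batch(records, path):
    """Write one batch of (header, sequence) pairs as a FASTA file."""
    with open(path, "w") as handle:
        for header, sequence in records:
            handle.write(f">{header}\n{sequence}\n")
```

Note that splitting a draft genome separates contigs that belong together, so Cas clusters are still evaluated per sequence; results from the batches need to be merged by hand.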

vmatch2

I have been trying to use this program with Singularity. I thought all the dependencies were going to be inside that package. I'm not sure how to install vmatch2 with Singularity. I'm including my command to show what's going on:

(base) David-2:Downloads DavidMRobinson$ singularity exec -B $PWD CrisprCasFinder.simg perl /Users/DavidMRobinson/Downloads/CRISPRCasFinder/CRISPRCasFinder.pl -so /Users/DavidMRobinson/Downloads/CRISPRCasFinder/sel392v2.so -cf /Users/DavidMRobinson/Downloads/CRISPRCasFinder/CasFinder-2.0.3 -drpt /Users/DavidMRobinson/Downloads/CRISPRCasFinder/supplementary_files/repeatDirection.tsv -rpts /Users/DavidMRobinson/Downloads/CRISPRCasFinder/supplementary_files/Repeat_List.csv -cas -def G -out RES21092020_2 -in megahit_frx_bon_final_assembly.fasta
################################################################

--> Welcome to /Users/DavidMRobinson/Downloads/CRISPRCasFinder/CRISPRCasFinder.pl (version 4.2.19)

################################################################

vmatch2 is not installed, please install it and try again.
[ 5.367732] reboot: Power down

json output sometimes "forgets" closing ] in "Cas": [] line

We were testing about 1900 accessions with CRISPRCasFinder, and we found that in 72 cases JSON output was corrupted.

Command line:


        perl $installation_root/CRISPRCasFinder.pl \
            -in  "$fasta" \
            -cas -keep \
            -out "$dir"

where $fasta is a FASTA file with the sequence of RefSeq (NCBI) accession NZ_CP016476.1

The output JSON file in "$dir" ends with the snippet:

"Cas":[
}
]
}

As you can see, the left square bracket on the "Cas" line is not matched by a right square bracket, and the next control character is a right curly bracket.

Workaround:


grep -Pl '"Cas":\[\s*$' *json  | xargs perl -i  -pe 's{"Cas":\[\s*$}{"Cas": []\n}g' 
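
For anyone who would rather not edit files in place with a one-liner, the same repair can be sketched in Python. This is only an illustration of the workaround above, not a fix in CRISPRCasFinder itself. The function name `repair_cas_array` is made up for this example, and it assumes the only corruption is a `"Cas":[` line with nothing after the bracket. Unlike the one-liner, it leaves files that already parse as JSON untouched and raises an error if the file is still invalid after the substitution.

```python
import json
import re

# A line that ends right after the opening bracket of "Cas",
# i.e. the corrupted form described in this issue.
BROKEN_CAS = re.compile(r'"Cas":\[[ \t]*$', re.MULTILINE)


def repair_cas_array(text: str) -> str:
    """Return `text` with an unterminated "Cas":[ line closed as "Cas": []."""
    try:
        json.loads(text)
        return text  # already valid JSON, leave it untouched
    except json.JSONDecodeError:
        pass
    fixed = BROKEN_CAS.sub('"Cas": []', text)
    json.loads(fixed)  # raises if the file is broken in some other way
    return fixed


if __name__ == "__main__":
    # Hypothetical, minimal stand-in for the tail of a corrupted result.json.
    broken = '{\n"Sequences":[\n{\n"Cas":[\n}\n]\n}\n'
    print(json.loads(repair_cas_array(broken)))
```

To apply it to real output, read each JSON file, pass its content through `repair_cas_array`, and write the result back only if it changed.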

Date::Calc perl module not located

d3w@d3w:~/CRISPRCasFinder$ perl CRISPRCasFinder.pl -cf CasFinder-2.0.3 -def General -cas -i install_test/sequence.fasta -out Results_test_install -keep
Can't locate Date/Calc.pm in @INC (you may need to install the Date::Calc module) (@INC contains: /home/d3w/miniconda3/envs/crispr/lib/site_perl/5.26.2/x86_64-linux-thread-multi /home/d3w/miniconda3/envs/crispr/lib/site_perl/5.26.2 /home/d3w/miniconda3/envs/crispr/lib/5.26.2/x86_64-linux-thread-multi /home/d3w/miniconda3/envs/crispr/lib/5.26.2 .) at CRISPRCasFinder.pl line 25.
BEGIN failed--compilation aborted at CRISPRCasFinder.pl line 25.

d3w@d3w:~/CRISPRCasFinder$ sudo cpanm Date::Calc
Date::Calc is up to date. (6.4)

The Date::Calc module is already there. I even tried force-reinstalling it, but I get the same error. What am I missing? Help, please.

Flank sequence near contig edge has negative coordinate

Hello! Thank you for this great tool. I am using CRISPRCasFinder v.4.3.2 and I noticed that flank sequence coordinates are reported without taking into account how close to the edge of the contig they are. One of my CRISPR hits begins at position 23, and the left flank sequence is reported to start at position -77 (making an invalid GFF) and its sequence is reported as "unknown". Would it be possible to shorten reported flank sequences in future releases if they are near the edge?
Thank you!
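
To make the request concrete, here is a minimal sketch of the clipping being asked for. It is not taken from the CRISPRCasFinder source. It assumes 1-based inclusive coordinates (as in GFF) and a flank length of 100, which is inferred from the numbers above (23 - 100 = -77). The function names are hypothetical.

```python
from typing import Optional, Tuple


def left_flank(crispr_start: int, flank_len: int = 100) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the left flank, clipped at position 1.

    Returns None when the array starts at position 1 and there is no flank.
    """
    end = crispr_start - 1
    if end < 1:
        return None
    start = max(1, crispr_start - flank_len)
    return start, end


def right_flank(crispr_end: int, contig_len: int,
                flank_len: int = 100) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the right flank, clipped at the contig end."""
    start = crispr_end + 1
    if start > contig_len:
        return None
    end = min(contig_len, crispr_end + flank_len)
    return start, end


if __name__ == "__main__":
    # The case from this report: the array begins at position 23.
    print(left_flank(23))
```

With this clipping, the array starting at position 23 would get a 22-nt left flank spanning positions 1 to 22 instead of a start coordinate of -77.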

Permission issues while running the Singularity container

Dear developers,

I am trying to run your singularity container on our HPC cluster.
Pulling the image works fine, but then whenever I try to run it, I get the error:

Possible precedence issue with control flow operator at /usr/local/share/perl/5.22.1/Bio/DB/IndexedBase.pm line 845.
################################################################
# --> Welcome to /usr/local/bin/CRISPRCasFinder (version 4.2.18) 
################################################################


vmatch2 is...............OK 
mkvtree2 is...............OK 
vsubseqselect2 is...............OK 
fuzznuc (from emboss) is...............OK 
needle (from emboss) is...............OK 


open : Permission denied at /usr/local/bin/CRISPRCasFinder line 550.

On that line the script tries to open a json file

   550	open (JSONRES, ">$jsonResult") or die "open : $!";

Do you have any idea what could be causing this error?
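
One possible explanation, offered as a guess rather than a confirmed diagnosis: another report on this page quotes the script as setting `$jsonResult` to the relative name `result.json`, which would be created in the current working directory. If that directory is not writable from inside the container, the `open` would fail with this message. Below is a minimal Python sketch for checking writability before launching a job. `can_write_here` is a hypothetical helper, not part of CRISPRCasFinder.

```python
import os
import tempfile


def can_write_here(directory: str = ".") -> bool:
    """Return True if a new file can actually be created in `directory`."""
    try:
        # Creating (and auto-deleting) a real file is more reliable than
        # inspecting permission bits, for example on read-only mounts.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(os.getcwd(), "writable:", can_write_here())
```

If the check fails, changing into a writable, bind-mounted directory before calling `singularity exec` may be worth trying.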

General Search not working

I am trying to identify all cas genes in a sequence even if they don't belong to a subgroup, or constitute all of the mandatory genes for a subgroup. For example, there is a sequence that I know has csy3 and cas6, but as those do not constitute all of the necessary components for a I-F system, there are no Cas clusters returned.

On the online version, you can change "Subtyping" to "General" to return these clusters. However, when I use -def G or --definition General on the command line version, it still does not seem to return these clusters. I suspect that the General feature is broken?

Dependency issues using Singularity image

Hello,

I am trying to use the singularity image approach to run this program as I do not have root access in CentOS. I am using the latest image available and these files are all in the same directory. This is the command I am running:

singularity exec -B $PWD CrisprCasFinder.simg perl CRISPRCasFinder.pl -in fasta.fasta

I get this message when I run this:
vmatch2 is not installed, please install it and try again

I looked through the perl script and it looks like there is a whole section of dependency checks where this is getting caught. I was under the impression the singularity image should cover the dependencies, but it looks like for some reason it is not.

I tried to get around this by installing the dependencies using a conda environment, but it looks like not all of the dependencies can be installed this way? (mkvtree2, vsubseqselect2)

Please let me know if you have a solution to this.

can't find the shared object file (./sel392v2.so)

Dear @dcouvin

When I try to run CRISPRCasFinder, it shows the error below:
"The shared object file (./sel392v2.so) must be available in your current directory"

I want to give a path to the shared object file (./sel392v2.so), but I can't find that file anywhere in the folder.

Can you help me with this problem?

Thanks a lot

Is there a way to simply get a list of spacers?

Hello,

I have downloaded CCfinder since I want to run it on hundreds of draft genomes. I am mainly interested in finding the list of spacers in each of them. After running CCfinder and seeing the results, I couldn't find a simple way to get to such a list. In the online version, there is a simple button to download spacers. Is it possible to achieve that in the downloadable version of the software?

I can't find the spacers in either Crisprs_REPORT.tsv or CRISPR-Cas_summary.tsv.

For reference, here is the command I am running (on Mac) -

./CRISPRCasFinder.pl -i /path/to/my/input.fa -out /path/to/my/output
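
Other reports on this page mention that each run writes a result.json file, so one route may be to pull the spacers out of that file. The sketch below is untested against real output and rests on an assumption that is not confirmed anywhere in this thread: that the JSON is laid out as Sequences -> Crisprs -> Regions, with spacer regions marked by `"Type": "Spacer"` and carrying a `"Sequence"` field. The function name `iter_spacers` is hypothetical.

```python
from typing import Iterator, Tuple


def iter_spacers(result: dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (sequence_id, crispr_name, spacer_sequence) triples.

    ASSUMPTION: the key names used here ("Sequences", "Crisprs", "Regions",
    "Type", "Sequence", "Id", "Name") describe the result.json layout.
    Check them against a real file before relying on this.
    """
    for seq in result.get("Sequences", []):
        for crispr in seq.get("Crisprs", []):
            for region in crispr.get("Regions", []):
                if region.get("Type") == "Spacer":
                    yield (seq.get("Id", ""),
                           crispr.get("Name", ""),
                           region.get("Sequence", ""))


if __name__ == "__main__":
    # Hypothetical miniature of result.json following the assumed layout.
    demo = {"Sequences": [{"Id": "seq1", "Crisprs": [{"Name": "seq1_1", "Regions": [
        {"Type": "DR", "Sequence": "GTTT"},
        {"Type": "Spacer", "Sequence": "ACGT"},
    ]}]}]}
    for row in iter_spacers(demo):
        print("\t".join(row))
```

For a real run, load the file with `json.load` and pass the resulting dictionary to `iter_spacers`.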

Recent Singularity container is not recognized by Singularity versions <2.6.1

I downloaded the singularity image from https://crisprcas.i2bc.paris-saclay.fr/Home/DownloadFile?filename=CrisprCasFinder.simg recently and it is not recognized by Singularity versions 2.6.1 and 2.4.*. An older image I downloaded last year still works just fine, so I assume this has to do with how it was built (perhaps a Singularity version >=3.0).
command:
singularity exec -B $PWD CrisprCasFinder.simg perl /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl -so /usr/local/CRISPRCasFinder/sel392v2.so -cf /usr/local/CRISPRCasFinder/CasFinder-2.0.3 -drpt /usr/local/CRISPRCasFinder/supplementary_files/repeatDirection.tsv -rpts /usr/local/CRISPRCasFinder/supplementary_files/Repeat_List.csv -def G -html -cas -out crisprcasfinder -in fasta.fasta

error:
ERROR : Unknown image format/type: CrisprCasFinder.simg
ABORT : Retval = 255

Just thought I'd pass this along, as I'll probably be fine just using the older image.
Thanks,
Cris

Error in `muscle': double free or corruption (out): 0x00007ffceb581bf0

Hi,
I always get error messages when I run CRISPRCasFinder. Is there any way I can fix this?
Although I get these error messages, the program does not stop running and I still get results. In this case, I do not know whether my results are still reliable.

** Error in `muscle': double free or corruption (out): 0x00007ffceb581bf0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x2af91d2af329]
muscle(+0x4d2b3)[0x55cad3ec72b3]
muscle(+0x149bd)[0x55cad3e8e9bd]
muscle(+0x15930)[0x55cad3e8f930]
muscle(+0xe805)[0x55cad3e88805]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2af91d250555]
muscle(+0x12aad)[0x55cad3e8caad]

Using new version of MacSyFinder on CRISPRCasFinder.pl

Dear all

I ran CRISPRCasFinder on some contig sequences that I know contain Cas proteins of various subtypes with the options -cas -def G, but I am still getting no Cas type information in my [output]/TSV/CRISPR-Cas_summary.tsv. I just want to make sure I am not missing any optional flags in the command:

perl CRISPRCasFinder.pl --version
version 4.2.20

perl CRISPRCasFinder.pl -log \
-out [output_dir] \
-so [/path/to]/sel392v2.so \
-cas -cf CasFinder-2.0.3 -getSummaryCasfinder -def S \
./input.fasta

With this command I am able to detect a large number of CRISPRs, but no indication of any Cas proteins.

Thanks

Marcus

How to use CRISPRCasFinder through singularity container

Thanks for your nice tool! I have only used Conda, and I have never used Singularity Container.

First, I installed Singularity via mamba.

mamba create -n singularity -y singularity

Then, I downloaded the CrisprCasFinder.simg image from the CRISPR-Cas++ Download page.

When I ran the following code, an error happened.

mamba activate singularity

singularity exec -B $PWD CrisprCasFinder.simg perl /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl -so /usr/local/CRISPRCasFinder/sel392v2.so -cf /usr/local/CRISPRCasFinder/CasFinder-2.0.3 -drpt /usr/local/CRISPRCasFinder/supplementary_files/repeatDirection.tsv -rpts /usr/local/CRISPRCasFinder/supplementary_files/Repeat_List.csv -cas -def G -out RES21092020_2 -in sequence.fasta

ERROR  : Image path doesn't exists
ABORT  : Retval = 255

Please tell me how to use CRISPRCasFinder through the singularity container. Thank you!

Mixed-up results when launching on multiple files in parallel

Hi,

I am trying to use the software in parallel on several FASTA files located in a single folder, and the results appear to be mixed up.
I'm using the Singularity container, but I see the same problem with the Ubuntu version.
Watching live what was happening in the directory containing my FASTA files, I saw that the "result.json" file and other temporary files are created in the working directory instead of the result directory. This is quite confusing, as all jobs write to the same result.json file.
The only trick I found was to process each FASTA file in a separate folder, which is quite tedious when you analyse hundreds of genomes.

Would it be possible to redirect the temporary outputs to the results folder instead of generating them in the working directory?

Thanks for your help
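
Until the temporary files are written elsewhere, the separate-folder trick described above can at least be automated. The following is a rough sketch, not an official wrapper. It gives each FASTA file its own scratch working directory and runs the script from there, so that result.json and the other temporary files cannot collide. `plan_jobs` and `run_jobs` are hypothetical names, only the `-in` and `-out` flags quoted on this page are used, and the output folder name `results` is arbitrary.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List, Tuple

Job = Tuple[Path, List[str]]


def plan_jobs(fasta_files: List[Path], work_root: Path, script: str) -> List[Job]:
    """Create one scratch directory per FASTA file and return (cwd, command) pairs.

    `script` should be an absolute path, because each job runs in its own cwd.
    Files sharing the same stem would share a directory, so names must be unique.
    """
    jobs: List[Job] = []
    for fasta in fasta_files:
        workdir = work_root / fasta.stem
        workdir.mkdir(parents=True, exist_ok=True)
        # -in and -out are the flags used in the commands quoted on this page.
        cmd = ["perl", script, "-in", str(fasta.resolve()), "-out", "results"]
        jobs.append((workdir, cmd))
    return jobs


def run_jobs(jobs: List[Job], max_workers: int = 4) -> None:
    """Run every job in its own working directory, a few at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(subprocess.run, cmd, cwd=str(cwd), check=True)
            for cwd, cmd in jobs
        ]
        for future in futures:
            future.result()  # re-raises if a job failed
```

A typical call would be `run_jobs(plan_jobs(sorted(Path("genomes").glob("*.fasta")), Path("work"), "/abs/path/to/CRISPRCasFinder.pl"))`, where both paths are placeholders. With the Singularity container, the command list would need the `singularity exec` prefix instead of plain `perl`.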

MacSyFinder link outdated

Hello,

Is an update coming any time soon? It seems installation will not work, mainly because the installation file no longer contains a working MacSyFinder source link.
Let me know how to proceed.
Thanks in advance!

[in]
wget https://dl.bintray.com/gem-pasteur/MacSyFinder/macsyfinder-1.0.5.tar.gz >> $LOGFILE

[out]
Forbidden!

Can I just comment out all of the following and then install MacSyFinder manually through pip3?

```
#install macsyfinder
echo "Installation of MacSyFinder" >> $LOGFILE
cd ${CURDIR}
wget https://dl.bintray.com/gem-pasteur/MacSyFinder/macsyfinder-1.0.5.tar.gz >> $LOGFILE
tar -xzf macsyfinder-1.0.5.tar.gz
test -d bin ||  mkdir bin
cd bin
ln -s ../macsyfinder-1.0.5/bin/macsyfinder
cd ${CURDIR}
echo "add definition of MACSY_HOME (${CURDIR}/macsyfinder-1.0.5/) in .profile" >> $LOGFILE
echo "export MACSY_HOME=${CURDIR}/macsyfinder-1.0.5/" >> $HOME/.profile

echo "add bin folder ($CURDIR/bin) to the definition of PATH in $HOME/.profile" >> $LOGFILE
echo "export PATH=${CURDIR}/bin:${PATH}" >> $HOME/.profile
```

Could not open output GFF in result/result2/ because No such file or directory

Hello, I get an error when executing this command:

command:
perl ../CRISPRCasFinder.pl -in /data_alluser/QK/NewMAGs/ALL_represent_AND_MINE_MAG_GTDB-tk/Archaea_fna/MGYG000000522.fna -out result/result2 -so ../sel392v2.so

error:
image

This problem occurs when I generate the result in a two-tier directory:

  1. -out result/result2: an error is reported
  2. -out result: the program runs successfully

Thank you very much for your answer!

Singularity: Device or resource busy // FAIL TO BUILD

I'm on a CentOS HPC, so installing with the regular CentOS instructions isn't an option for me. Unfortunately, building the Singularity image gives me an array of errors, which I'm trying to resolve at the moment. When I use the pre-built Singularity image from https://crisprcas.i2bc.paris-saclay.fr/Home/DownloadFile?filename=CrisprCasFinder.simg with this command in a wrapper bash script

singularity exec -B $PWD /apps/crisprcasfinder/4.2.20/singularity/crisprcasfinder.simg perl /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl -so /usr/local/CRISPRCasFinder/sel392v2.so -cf /usr/local/CRISPRCasFinder/CasFinder-2.0.3 -drpt /usr/local/CRISPRCasFinder/supplementary_files/repeatDirection.tsv -rpts /usr/local/CRISPRCasFinder/supplementary_files/Repeat_List.csv -cas -def G -out RES21092020_2 -in sequence.fasta $@

I always get the following error:

Nb of Cas in this sequence = 2
mv: cannot move 'CP014688.fna' to 'sequence_17_3_2022_7_47_18/analyzedSequences/CP014688.fna': Device or resource busy
failure: "mv CP014688.fna sequence_17_3_2022_7_47_18/analyzedSequences", errorcode 256

EDIT: after such a failed execution the result.json only consists of {

I've managed to build the image, up to the %test by applying this patch:
patch.txt

patch --backup Singularity patch.txt

But then I get this error:

open : Read-only file system at /usr/local/bin/CRISPRCasFinder line 419.
Test failed see /tmp/test_CRISPRCasFinder_2022-03-17-06:50:50 for details.
FATAL:   While performing build: failed to execute %test script: exit status 30

Unfortunately, I can't have a look at the mentioned /tmp/test_CRISPRCasFinder_2022-03-17-06:50:50.
I had a look at lines 417-420:

## JSON file "result.json"
my $jsonResult = "result.json"; # JSON result to get all info concerning CRISPR-Cas and sequences JSON files
open (JSONRES, ">$jsonResult") or die "open : $!";
print JSONRES "{\n";

Any help or tips in the right direction would be appreciated.

BUG: RuntimeError: an error occurred during databases indexation see formatdb.log

Sequence number 34..Traceback (most recent call last):
  File "/home/ciillab/huangle/CRISPRCasFinder/bin/macsyfinder", line 329, in <module>
    idx.build(force = config.build_indexes)
  File "/home/ciillab/huangle/CRISPRCasFinder/macsyfinder-1.0.5/macsypy/database.py", line 118, in build
    raise RuntimeError(msg)
RuntimeError: an error occurred during databases indexation see formatdb.log

Singularity image not capable of being pulled

Hello!

I have tried pulling from your singularity image using the following command

singularity pull --name CRISPRCasFinder shub://dcouvin/CRISPRCasFinder:4.2.18

However I receive the following error:

ERROR Cannot find image. Is your capitalization correct?

Is it possible that your shub acct is not publicly available or something is not correct in the command provided? If the fix is on my end, please just let me know!

Thanks!

Discrepancy between Direct Repeat sequences

Hello!

I ran CrisprCasFinder on a bacterial genome. A picture of the output folder is attached.
Screen Shot 2020-08-17 at 8 57 23 PM

The content of the highlighted files however is not the same. Aren't the files DRs_2 and DRs_2_fasta supposed to be the same thing? Here is the content of those two files -

DRs_2
Screen Shot 2020-08-17 at 9 00 00 PM

and DRs_2_fasta -
Screen Shot 2020-08-17 at 9 00 22 PM

Why are the two slightly different?

Thank you!

MSG: No file or directory called

Because I want to find the CRISPR/Cas systems in genomes (RefSeq bacteria and archaea from NCBI), I ran this command:
perl CRISPRCasFinder.pl -in ~/database_db/refseq/archaea_library.fna -so sel392v2.so -cas -keep -out ~/software/crisprcasfinder/test
but I got this error:

################################################################
# --> Welcome to CRISPRCasFinder.pl (version 4.3.2)
################################################################


vmatch is...............OK
mkvtree is...............OK
vsubseqselect is...............OK
fuzznuc (from emboss) is...............OK
needle (from emboss) is...............OK


 ---> Results will be stored in /beegfs/home/syl/software/crisprcasfinder/test

  ( Input file: NC_002607.fna, Sequence ID: NC_002607, Sequence name = Halobacterium salinarum NRC-1, complete sequence )
Sequence number 1..
muscle 5.1.linux64 []  132Gb RAM, 72 cores
Built Feb 24 2022 03:16:15
(C) Copyright 2004-2021 Robert C. Edgar.
https://drive5.com

Input: 5 seqs, avg length 35, max 70

00:00 18Mb   CPU has 72 cores, defaulting to 20 threads

WARNING: Max OMP threads 2

00:00 93Mb    100.0% Calc posteriors
00:00 93Mb    100.0% Consistency (1/2)
00:00 93Mb    100.0% Consistency (2/2)
00:00 93Mb    100.0% UPGMA5
00:00 93Mb    100.0% Refining

muscle 5.1.linux64 []  132Gb RAM, 72 cores
Built Feb 24 2022 03:16:15
(C) Copyright 2004-2021 Robert C. Edgar.
https://drive5.com

Input: 2 seqs, avg length 31, max 40

00:00 18Mb   CPU has 72 cores, defaulting to 20 threads

WARNING: Max OMP threads 2

00:00 26Mb    100.0% Calc posteriors
00:00 26Mb    100.0% UPGMA5

muscle 5.1.linux64 []  132Gb RAM, 72 cores
Built Feb 24 2022 03:16:15
(C) Copyright 2004-2021 Robert C. Edgar.
https://drive5.com

Input: 2 seqs, avg length 41, max 41

00:00 18Mb   CPU has 72 cores, defaulting to 20 threads

WARNING: Max OMP threads 2

00:00 26Mb    100.0% Calc posteriors
00:00 26Mb    100.0% UPGMA5

------------- EXCEPTION -------------
MSG: No file or directory called 'NC_002607.fna'
STACK Bio::DB::IndexedBase::new /beegfs/home/syl/anaconda3/envs/crisprcasfinder/lib/perl5/site_perl/Bio/DB/IndexedBase.pm:368
STACK main::reportToGff CRISPRCasFinder.pl:2545
STACK main::makeGff CRISPRCasFinder.pl:2427
STACK toplevel CRISPRCasFinder.pl:653
-------------------------------------

The sequence names in archaea_library.fna look like this:

kraken:taxid|64091|NC_002607.1 Halobacterium salinarum NRC-1, complete sequence
kraken:taxid|64091|NC_001869.1 Halobacterium salinarum NRC-1 plasmid pNRC100, complete sequence
kraken:taxid|64091|NC_002608.1 Halobacterium salinarum NRC-1 plasmid pNRC200, complete sequence
kraken:taxid|273057|NC_002754.1 Saccharolobus solfataricus P2, complete sequence
kraken:taxid|192952|NC_003901.1 Methanosarcina mazei Go1, complete sequence
kraken:taxid|190192|NC_003551.1 Methanopyrus kandleri AV19, complete sequence
kraken:taxid|178306|NC_003364.1 Pyrobaculum aerophilum str. IM2, complete sequence
kraken:taxid|186497|NC_003413.1 Pyrococcus furiosus DSM 3638, complete sequence
kraken:taxid|188937|NC_003552.1 Methanosarcina acetivorans C2A, complete sequence
kraken:taxid|263820|NC_005877.1 Picrophilus torridus DSM 9790, complete sequence

I don't know how to solve this, can you help me?
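
One thing that may be worth trying, although it is unverified as the cause: the error shows the tool looking for a file named after the accession (`NC_002607.fna`), so the `kraken:taxid|...|` prefix in the headers might interfere with how that name is derived. The sketch below rewrites each header so that the identifier is only the last `|`-separated field. It assumes the real headers start with `>` as in any FASTA file, and the helper names are hypothetical.

```python
def simplify_header(line: str) -> str:
    """Turn '>kraken:taxid|64091|NC_002607.1 desc' into '>NC_002607.1 desc'.

    Lines that are not FASTA headers are returned unchanged.
    """
    if not line.startswith(">"):
        return line
    first, _, rest = line[1:].rstrip("\n").partition(" ")
    seq_id = first.split("|")[-1]  # keep only the accession
    return ">" + seq_id + (" " + rest if rest else "") + "\n"


def rewrite_fasta(src: str, dst: str) -> None:
    """Copy a FASTA file from `src` to `dst`, simplifying every header line."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(simplify_header(line))


if __name__ == "__main__":
    print(simplify_header(
        ">kraken:taxid|64091|NC_002607.1 Halobacterium salinarum NRC-1, complete sequence\n"
    ), end="")
```

Running CRISPRCasFinder on the rewritten copy would show whether the header format is involved at all.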

yours tk,

mv error

Approximately 50 out of my 2000 CRISPRCasFinder jobs did not finish due to the error shown below. Do you know what could be causing this? Thank you

Traceback (most recent call last):
  File "/projects/js66/software/envs/crisprcasfinder/bin/macsyfinder", line 329, in <module>
    idx.build(force = config.build_indexes)
  File "/projects/js66/software/envs/crisprcasfinder/lib/python2.7/site-packages/macsypy/database.py", line 118, in build
    raise RuntimeError(msg)
RuntimeError: an error occurred during databases indexation see formatdb.log
No Cas results
mv: cannot move ‘ERR720254_29_1_2019_15_29_31/prodigal_size309’ to ‘ERR720254_29_1_2019_15_29_31/Prodigal/prodigal_size309’: Directory not empty
failure: "mv ERR720254_29_1_2019_15_29_31/prodigal_size309 ERR720254_29_1_2019_15_29_31/Prodigal", errorcode 256

CRISPRCasFinder does not handle vmatch results with multi-line sequences correctly

Hello,

While running CRISPRCasFinder we hit the following error:

Can't call method "DRseq" on an undefined value at /opt/gensoft/exe/CRISPRCasFinder/4.2.20/bin/CRISPRCasFinder line 4031, <FD> line 57

We traced the problem to cases where vmatch_result.txt contains multi-line results.
See this excerpt from vmatch_result.txt, opened in vi with line numbers displayed:

     54 
     55 >   75 1328940   D    75 1329039
     56 TTCTTGTATCTTCAATTTTTTTCTCTAATTCTCCCCTCACTGTATTAATTTCTGTTTTAA
     57 GTTCTGTTCTTGTAT
     58 

My Perl coding is not good enough for me to try to fix the trans_data function.

regards

Eric
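
To illustrate the parsing behaviour that seems to be needed, here is a small Python sketch that joins wrapped sequence lines into one record. It is not a patch for the Perl `trans_data` function. It assumes, based only on the excerpt above, that each record is a header line starting with `>` followed by one or more sequence lines, and that blank lines separate records. `read_vmatch_records` is a hypothetical name.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def read_vmatch_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs, joining sequences wrapped over lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif line and header is not None:
            # Continuation of the current sequence (a wrapped line).
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    excerpt = [
        "",
        ">   75 1328940   D    75 1329039",
        "TTCTTGTATCTTCAATTTTTTTCTCTAATTCTCCCCTCACTGTATTAATTTCTGTTTTAA",
        "GTTCTGTTCTTGTAT",
        "",
    ]
    for head, seq in read_vmatch_records(excerpt):
        print(head, len(seq))
```

For the excerpt, the two sequence lines (60 and 15 characters) are joined into a single 75-character repeat, which agrees with the 75 shown in the header line.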
