backofenlab / crispridentify
License: MIT License
Hi,
Does the option “--cas True” only work if CRISPRcasIdentifier is installed?
Regards
Hello,
You provide prebuilt binaries for:
blast
fasta
hmmer
prodigal
clustalo
rnafold
Those tools are available via conda. Why not list them in environment.yml and use the conda versions instead?
NB: among the binaries you provide, some are statically linked and some are not. If you wish to provide binaries, please at least try to provide statically linked ones.
E.g. on a stock ubuntu-20.04 docker image, blastn fails due to a missing libidn:
Singularity> ldd /opt/CRISPRidentify/tools/blasting/blastn
linux-vdso.so.1 (0x00007ffc78f42000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fd60ee73000)
libbz2.so.1 => /lib/x86_64-linux-gnu/libbz2.so.1 (0x00007fd60ee60000)
libidn.so.11 => not found
libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00007fd60ee43000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fd60ee38000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd60ece7000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd60ecc4000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd60ead2000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fd60eab7000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd60ee95000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd60eab1000)
regards
Eric
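To check every bundled binary the same way, the ldd output can be scanned for "not found" entries. This is only a minimal sketch: the helper name `missing_libraries` is made up for illustration, and the sample text is a shortened copy of the ldd output above.

```python
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libidn.so.11 => not found"
        if "=>" in line and "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


# Shortened copy of the ldd output quoted in this issue.
sample = """\
linux-vdso.so.1 (0x00007ffc78f42000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fd60ee73000)
libidn.so.11 => not found
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd60ead2000)
"""

print(missing_libraries(sample))
```

On a real system the input could come from `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`.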
Hi,
I can't install the environment, even after removing all build strings from the environment file.
Do you have any idea what I could try?
Hello.
If one tries to run CRISPRidentify.py, the working directory must be the one that contains CRISPRidentify.py.
E.g.:
No input was provided
Elapsed time: 2.193450927734375e-05
(crispr_identify_env) > cd ../
(crispr_identify_env) > python3.7 foo/CRISPRidentify.py
Traceback (most recent call last):
File "CRISPRidentify/CRISPRidentify.py", line 13, in <module>
from pipeline import Pipeline
ModuleNotFoundError: No module named 'pipeline'
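This is due to the way the module search path is handled: 'components/' is added to sys.path relative to the current working directory.
The following patch allows running the script from anywhere: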
--- CRISPRidentify.py.ori 2021-06-21 17:41:51.922067095 +0000
+++ CRISPRidentify.py 2021-06-21 17:46:53.788085259 +0000
@@ -10,7 +10,8 @@
import subprocess
import warnings
warnings.filterwarnings("ignore")
-sys.path.insert(0, 'components/')
+dir_path = os.path.dirname(os.path.realpath(__file__))
+sys.path.insert(0, os.path.join(dir_path, 'components/'))
from pipeline import Pipeline
from components_ml import ClassifierWrapper
import shutil
regards
Eric
Hi folks! Thanks for putting together this excellent repo. Have you picked a license to release this code under? I checked the repo and I'm not finding a license file with terms anywhere.
Dear all,
Is it possible to add an output option that generates only the "Complete_summary" files? I tried to use CRISPRidentify on assemblies containing multiple contigs, and it generated hundreds of folders with empty files.
Best wishes,
Sofia
Hi,
Thanks for this good tool. I noticed that if the FASTA file has more than 42 lines, the tool produces no output. Is that expected?
Best,
Xichuan
Hello,
master and the last tagged version differ a lot.
Can you please tag a new release?
It helps package maintainers to have a tagged version.
regards
Eric
Running a single FASTA file with the arguments below fails with the following error:
python CRISPRidentify.py --file TestInput/NC_019693.fa --json_report output.json
Traceback (most recent call last):
File "CRISPRidentify.py", line 265, in <module>
run_over_one_file(complete_path_file, folder_result, pickle_folder, json_folder)
TypeError: run_over_one_file() takes 3 positional arguments but 4 were given
It seems that run_over_one_file() is being called with one argument too many (four instead of the three it accepts)?
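A mismatch like this can be confirmed without running the pipeline by binding a dummy argument list against the function signature. This is a sketch only: the three-parameter `run_over_one_file` below is a hypothetical stand-in (the real parameter names are not shown in the traceback), and `accepts_n_positional` is a made-up helper.

```python
import inspect


def run_over_one_file(file_path, folder_result, pickle_folder):
    """Hypothetical stand-in with three positional parameters."""
    return file_path, folder_result, pickle_folder


def accepts_n_positional(func, n):
    """Return True if func can be called with n positional arguments."""
    signature = inspect.signature(func)
    try:
        # bind() raises TypeError for too many or too few arguments,
        # just like an actual call would, but without executing func.
        signature.bind(*([None] * n))
    except TypeError:
        return False
    return True


print(accepts_n_positional(run_over_one_file, 3))
print(accepts_n_positional(run_over_one_file, 4))
```

If the fourth argument (`json_folder` in the traceback) is meant to be supported, the definition would need a matching parameter; if not, the call site would need to drop it.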
Hi,
Thanks for providing such a powerful tool!
I installed it with the suggested command "conda env create -f environment.yml", and it finished successfully. However, I cannot run "python CRISPRidentify.py" after activating the environment. In fact, I cannot find "CRISPRidentify.py" in any of the directories either. Could you kindly help with this problem?
Thank you very much in advance.
Best,
Ling-Dong
When using the JSON option the output format seems to be scrambled in some way.
For example:
"4265926 ...................... GGG s:0 i:0 d:0\n4265951 .........C..T.-T..--C. CCCTTCCCTAAGAGGGAAGGGGGCTGGGGGGTTAGGTCTCTTTTTCAAACA s:4 i:0 d:3\n4266021 ...........A.......... s:1 i:0 d:0\n____________________________________________________________________________________________________\n GGAATAAATATCGTTGCTGTAC
If I read it correctly, this is an entry from the results folder, but it is just one long string with all sub-elements glued together. Is this how it is supposed to be?
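For what it's worth, the `\n` sequences in the raw file are the standard JSON escape for a line break, so a JSON parser turns them back into separate lines. A minimal sketch, assuming the report stores the alignment as one string value; the key name `alignment` and the shortened text are made up for illustration:

```python
import json

# Shortened, made-up stand-in for one entry of the JSON report.
# In the file on disk, line breaks appear as the two characters "\n".
raw = '{"alignment": "4265926 ...... GGG s:0 i:0 d:0\\n4265951 ....C. CCCTT s:4 i:0 d:3"}'

report = json.loads(raw)

# After parsing, the escapes are real newline characters again,
# so the string can be split back into its original rows.
rows = report["alignment"].split("\n")
for row in rows:
    print(row)
```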
Hi,
First of all, thanks for this pipeline!
I used it on a multi-FASTA file and got a result folder for each contig, but no summary files.
It would be great to generate a summary of the results for all contigs (in one file) containing DR, spacer, start, end, type, etc. Do you think that would be possible?
Cheers,
Nico
Hello,
I was trying to install CRISPRidentify following standard Unix conventions
so that our users can run it from anywhere.
The paths to the tools and to tool-specific data are broken because they are hard-coded relative to the archive tree. E.g.:
tools/vmatch/vmatch
tool hardcoded
tools/prodigal/prodigal
tool hardcoded
tools/hmm_search/hmmsearch --tblout result_hmm.out tools/hmm_search/models_tandem.hmm protein_results.fa
hmmsearch and model hardcoded
tools/hmm_search/models_tandem.hmm
data hardcoded
Tools and data should be searched for relative to the install directory.
regards
Eric
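One way to do this is to build every tool and data path from a single install directory instead of from the current working directory. The sketch below is an assumption about how that could look, not the project's code: `tool_path` and `hmmsearch_command` are hypothetical helpers, and the directory layout is taken from the hard-coded paths listed above.

```python
from pathlib import Path


def tool_path(install_dir, *parts):
    """Build a path below <install_dir>/tools, independent of the cwd."""
    return Path(install_dir).joinpath("tools", *parts)


def hmmsearch_command(install_dir, protein_fasta, out_table="result_hmm.out"):
    """Assemble the hmmsearch call with absolute tool and model paths."""
    return [
        str(tool_path(install_dir, "hmm_search", "hmmsearch")),
        "--tblout",
        out_table,
        str(tool_path(install_dir, "hmm_search", "models_tandem.hmm")),
        protein_fasta,
    ]


# Inside the real script the install directory could be derived with
# Path(__file__).resolve().parent; a fixed example path is used here.
command = hmmsearch_command("/opt/CRISPRidentify", "protein_results.fa")
print(command)
```

The resulting list can be passed to `subprocess.run` as is, which also avoids quoting problems when the install path contains spaces.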
Hi
Are you planning on releasing CRISPRidentify on conda?
It would make it much easier to include in other tools.
Cheers,
Russel
I am trying to build the environment on an OS system. I am unable to install the dependencies in the YAML file, and I believe this is due to the platform-specific build constraints on the dependencies. Can you share an environment.yaml without the constraints?
https://stackoverflow.com/questions/55554431/conda-fails-to-create-environment-from-yml
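Until such a file is available, the build strings can be stripped locally. This is a rough sketch that assumes the dependencies use the `name=version=build` form that `conda env export` writes; `strip_build` is a made-up helper and the sample lines are illustrative, not copied from the repository's environment.yml.

```python
def strip_build(line):
    """Drop the trailing '=build' part of a conda dependency line.

    Lines that are not 'name=version=build' entries (section keys,
    the 'pip:' marker, pip-style 'name==version' pins) are returned
    unchanged.
    """
    indent, dash, spec = line.partition("- ")
    if not dash or indent.strip() or "==" in spec:
        return line
    parts = spec.split("=")
    if len(parts) == 3:
        return f"{indent}- {parts[0]}={parts[1]}"
    return line


# Illustrative sample lines (assumed, not the project's real file).
sample_lines = [
    "dependencies:",
    "  - python=3.7.6=h0371630_2",
    "  - pip:",
    "    - somepackage==1.0",
]

for original in sample_lines:
    print(strip_build(original))
```

On the machine where the environment already works, `conda env export --no-builds` produces the same kind of file directly.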
Hi
I tried to analyse the CRISPR arrays of 6 complete E. coli genomes.
4 worked, but 2 showed mkvtree/vmatch errors.
It seems to be related to the number of contigs in the file.
1/GCF_000005845.2_ASM584v2_genomic.ID.fasta -works fine
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000005845.2/
2/GCF_000007445.1_ASM744v1_genomic.ID.fasta -works fine
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000007445.1/
3/GCF_000008865.2_ASM886v2_genomic.ID.fasta - not working
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000008865.2/
4/GCF_000009565.1_ASM956v1_genomic.ID.fasta -works fine
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000009565.1/
5/GCF_000010245.2_ASM1024v1_genomic.ID.fasta -works fine
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000010245.2/
6/GCF_000010385.1_ASM1038v1_genomic.ID.fasta -not working
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000010385.1/
Executing file 1 out of 6 (GCF_000005845.2_ASM584v2_genomic.fna)
Run initial array detection
Refine detected arrays
Evaluate candidates
Enhance evaluated arrays
Complement arrays with additional info
Write down the results
Executing file 2 out of 6 (GCF_000007445.1_ASM744v1_genomic.fna)
Run initial array detection
Refine detected arrays
Evaluate candidates
Enhance evaluated arrays
Complement arrays with additional info
Write down the results
Executing file 3 out of 6 (GCF_000008865.2_ASM886v2_genomic.fna)
Run initial array detection
mkvtree: Illegal character '>' in file "new_input.fa" line 2
vmatch: cannot open file "new_input.fa.prj": No such file or directory
Refine detected arrays
Evaluate candidates
Enhance evaluated arrays
Complement arrays with additional info
Write down the results
Executing file 4 out of 6 (GCF_000009565.1_ASM956v1_genomic.fna)
Run initial array detection
Refine detected arrays
Evaluate candidates
Enhance evaluated arrays
Complement arrays with additional info
Write down the results
Executing file 5 out of 6 (GCF_000010245.2_ASM1024v1_genomic.fna)
Run initial array detection
Refine detected arrays
Evaluate candidates
Enhance evaluated arrays
Complement arrays with additional info
Write down the results
Executing file 6 out of 6 (GCF_000010385.1_ASM1038v1_genomic.fna)
Run initial array detection
mkvtree: Illegal character '>' in file "new_input.fa" line 2
vmatch: cannot open file "new_input.fa.prj": No such file or directory
Refine detected arrays
Evaluate candidates
Enhance evaluated arrays
Complement arrays with additional info
Write down the results
Thank you
G
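The mkvtree message points at a '>' on line 2 of the intermediate file, which would fit a header line that is directly followed by another header. That is only a guess at the cause, not something confirmed here. A small sketch for checking an input file for that pattern (`headers_without_sequence` is a made-up helper):

```python
def headers_without_sequence(fasta_text):
    """Return 1-based line numbers of headers that directly follow
    another header, i.e. records that have no sequence lines."""
    hits = []
    previous_was_header = False
    for number, line in enumerate(fasta_text.splitlines(), start=1):
        is_header = line.startswith(">")
        if is_header and previous_was_header:
            hits.append(number)
        previous_was_header = is_header
    return hits


# Made-up miniature inputs for illustration.
good = ">contig1\nACGTACGT\n>contig2\nTTGACA\n"
bad = ">contig1\n>contig2\nTTGACA\n"

print(headers_without_sequence(good))
print(headers_without_sequence(bad))
```

For a real genome, the text can be read with `open(path).read()`; running the check on the two failing files and on one working file would show whether this pattern is actually what differs.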