CaSilico R Package

Introduction

The efficiency of a CRISPR-Cas system depends heavily on a well-designed CRISPR RNA (crRNA). To make the various types of CRISPR-Cas systems accessible to a wide range of researchers, computational tools are needed that design crRNAs for different CRISPR-Cas systems and include off-target analysis. Numerous crRNA design tools have been developed, but nearly all of them are dedicated to designing crRNAs for genome editing.

Hence, we developed CaSilico, a tool intended to meet the needs of both beginners and experts. It was motivated by the limitations of current gRNA design tools in designing crRNAs for the Cas12, Cas13, and Cas14 CRISPR-Cas systems. Using a list of important features, such as mismatch tolerance rules, self-complementarity, GC content, frequency of the cleaving base around the target site, target accessibility, and the PFS (protospacer flanking site) or PAM (protospacer adjacent motif) requirement, CaSilico searches for all potential crRNAs in a user-supplied sequence. Considering these features helps users rank all crRNAs for a sequence and make an informed decision about whether a crRNA is suited to an experiment. The tool is flexible enough to let users tune key parameters governing crRNA design and off-target identification, which can increase the chances of a successful CRISPR-Cas experiment.
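To make two of the features listed above concrete, here is a minimal base-R sketch of a GC-content calculation and a 5' PAM check. The function names `gc_content` and `has_5prime_pam` are hypothetical and are not part of the CaSilico API. The `TTT[ACG]` (TTTV) motif is only an illustrative assumption, since it is commonly reported for Cas12a. CaSilico's own rules and thresholds may differ.

```r
# Hypothetical helper (not a CaSilico function): fraction of G/C bases.
gc_content <- function(candidate) {
  bases <- strsplit(toupper(candidate), "")[[1]]
  sum(bases %in% c("G", "C")) / length(bases)
}

# Hypothetical helper (not a CaSilico function): TRUE when the pam_len bases
# immediately upstream (5') of position `start` match `pam_regex`.
# The default motif TTT[ACG] is an assumption used for illustration only.
has_5prime_pam <- function(target, start, pam_regex = "TTT[ACG]", pam_len = 4) {
  if (start <= pam_len) {
    return(FALSE)  # not enough upstream sequence to hold a PAM
  }
  flank <- substr(toupper(target), start - pam_len, start - 1)
  grepl(paste0("^", pam_regex, "$"), flank)
}

# Usage: a candidate protospacer starting at position 5 of a toy sequence
toy <- "TTTACGGATCCGGATTA"
gc_content(substr(toy, 5, 12))
has_5prime_pam(toy, start = 5)
```

Per-candidate values like these could then be combined to rank candidates; how CaSilico weights them is not specified here.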

CaSilico outperforms previous crRNA design tools in the following respects: 1) supporting any reference genome/transcriptome for which a FASTA file is available; 2) designing crRNAs that simultaneously target multiple sequences through detection of conserved regions among a set of sequences; 3) considering new CRISPR-Cas subtypes; 4) reporting a list of different features for each candidate crRNA, which can help the user select the best one. Given these capabilities, CaSilico addresses end-user concerns arising from the use of sophisticated bioinformatics algorithms and has a wide range of potential research applications in different areas, especially the design of crRNAs for pathogen diagnosis.

CaSilico was successfully applied to design crRNAs for different genes in the SARS-CoV-2 genome; some of these crRNAs have been experimentally tested in previous studies.

CaSilico workflow. (A) CaSilico accepts a single sequence or a set of DNA or RNA sequences to be scanned for crRNA design. (B) When more than one sequence is given as input, the conserved regions among them are detected automatically, based on a conservation threshold and one of two approaches for identifying conserved regions. (C) A sliding window (stride of 1 nt) is moved across the single sequence or the conserved regions of multiple sequences to identify potential target sites. (D) CaSilico applies multiple criteria for crRNA design, performs off-target analysis, and returns its outputs in an interactive graphical interface along with files such as the MSA and secondary structure (E & F).
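Step (C) of the workflow can be illustrated with a short base-R sketch that enumerates every window of a fixed width at a stride of 1 nt. The function name `enumerate_windows` and the default width of 20 nt are assumptions made for this example; they are not taken from the CaSilico source, where spacer length presumably depends on the selected CRISPR subtype.

```r
# Hypothetical helper (not a CaSilico function): list all candidate target
# sites of length `width` by sliding a window one nucleotide at a time.
enumerate_windows <- function(target, width = 20) {
  target <- toupper(target)
  n <- nchar(target)
  if (n < width) {
    # Sequence shorter than one window: return an empty table
    return(data.frame(start = integer(0), end = integer(0),
                      site = character(0), stringsAsFactors = FALSE))
  }
  starts <- seq_len(n - width + 1)  # stride of 1 nt
  data.frame(
    start = starts,
    end   = starts + width - 1,
    site  = substring(target, starts, starts + width - 1),
    stringsAsFactors = FALSE
  )
}

# Usage: a 10-nt toy sequence scanned with an 8-nt window yields 3 sites
windows <- enumerate_windows("ACGTACGTAC", width = 8)
print(windows)
```

A sequence of length n yields n - width + 1 candidate sites, each of which would then be scored against the design criteria in step (D).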

Installation

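# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")

library("devtools")

install_github("mrb20045/CaSilico")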

A quick example of using the CaSilico package

library("CaSilico")

############# (1) #############
# Run with TargetFasta and without off-target analysis

# Full path of target sequences
data <- paste0(system.file(package = "CaSilico"),"/data/3D.fasta")

# Run CaSilico for subtype VI_A with ConservationThreshold = 0.98 and ConservationMethod = 1
CaSilico(ResultsFolder="Example",
         TargetFasta=data,
         TargetAccession=NULL,
         CrisprTypes=c("casVI_A"),
         ConservationMethod = 1,
         ConservationThreshold=0.98,
         OffTarget = FALSE,
         OffAsk = FALSE)
##############################
#
#
#
############# (2) #############

# Run with TargetFasta and offline off-target analysis

# Full path of target sequences
data <- paste0(system.file(package = "CaSilico"),"/data/3D.fasta")

# Full path to fasta file sequences for local off target analysis
genome_dir <- paste0(system.file(package = "CaSilico"),"/data/genomes_off_target")
Local_Fasta <- file.path(genome_dir,list.files(genome_dir))

# Names related to the LocalFasta argument to determine the name of the results. (optional)
Local_Name <- c("Genome1","Genome2","Genome3")

# Run CaSilico for subtypes V_A and V_B with ConservationThreshold = 0.98 and ConservationMethod = 1
CaSilico(ResultsFolder="Example1",
         TargetFasta=data,
         TargetAccession=NULL,
         CrisprTypes=c("casV_A","casV_B"),
         ConservationMethod = 1,
         ConservationThreshold=0.98,
         OffTarget = TRUE,
         OffAsk = FALSE,
         Organism=NULL,
         LocalOff=TRUE,
         LocalFasta=Local_Fasta,
         LocalName=Local_Name)
##############################
#
#
#
############# (3) #############
# Run with TargetFasta and online off-target analysis

# Full path of target sequences
data <- paste0(system.file(package = "CaSilico"),"/data/3D.fasta")

# Run CaSilico for subtype VI_A with ConservationThreshold = 0.98 and ConservationMethod = 1
# and online off-target analysis for Homo sapiens
CaSilico(ResultsFolder="Example2",
         TargetFasta=data ,
         TargetAccession=NULL,
         CrisprTypes=c("casVI_A"),
         ConservationMethod = 1,
         ConservationThreshold=0.98,
         OffTarget = TRUE,
         OffAsk = FALSE,
         Organism="Homo sapiens",
         LocalOff=FALSE)
##############################
#
#
#
############# (4) #############
 # Run with accession numbers

# Run CaSilico for subtypes VI_A and VI_B with ConservationThreshold = 0.98 and ConservationMethod = 1
# The accession numbers used are: "U15717" and "U15718"
 CaSilico(ResultsFolder="Example3",
          TargetAccession=c("U15717", "U15718"),
          CrisprTypes=c("casVI_A","casVI_B"),
          ConservationMethod = 1,
          ConservationThreshold=0.98,
          OffTarget = FALSE,
          OffAsk = FALSE)
##############################
#
#
#
############# (5) #############
 # Run with sequence coordinate
 
 # Run CaSilico for subtypes VI_A and VI_B with ConservationThreshold = 0.98 and ConservationMethod = 1
 # The desired coordinates are: 
 # (1) chromosome = 1, start = 2000, end = 2150, strand = "-", species = "bos_taurus"
 # (2) chromosome = 1, start = 2000, end = 2150, strand = "-", species = "bos_taurus"
 CaSilico(ResultsFolder="Example4",
          TargetCoordinate=list(chromosome=c("1","1"),
                             start=c("2000","2000"),
                             end=c("2150","2150"),
                             strand=c("-","-"),
                             species = c("bos_taurus","bos_taurus")),
          CrisprTypes=c("casVI_B","casVI_A"),
          ConservationMethod = 1,
          ConservationThreshold=0.98,
          OffTarget = FALSE,
          OffAsk = FALSE)
##############################
          
          
      

Citation

If you use CaSilico in your study, please consider citing:

Asadbeigi A, Norouzi M, Vafaei Sadi MS, Saffari M, Bakhtiarizadeh MR. 2022. CaSilico: A versatile CRISPR package for in silico CRISPR RNA designing for Cas12, Cas13, and Cas14. Frontiers in Bioengineering and Biotechnology, 10:957131.

DOI: 10.3389/fbioe.2022.957131

License

GPL-3

casilico's Issues

Possibility to use an existing BLAST database

We are very happy with your tool. It was easy to use with the examples provided. It was also very fast with the offline offtarget analysis with the blast database.

I was wondering if it is possible to define an existing blast database to make the tool even faster (no need to build the database)? When I use the LocalFasta option and define a reference which has a blast database it does not try to use this database.

Thanks in advance!

Error in plot.window(...) : need finite 'xlim' values

Hi,
I am encountering an issue similar to https://github.com/mrb20045/CaSilico/issues/3:
Error in plot.window(...) : need finite 'xlim' values
Interestingly, this issue does not occur on a Windows system; the code executes without any errors there.
To address this issue on Linux, I added the following lines to my script to ensure the software continues running even if this error occurs:
run_CaSilico_with_error <- function(file_path) {
  tryCatch({
    run_CaSilico(file_path)
  }, error = function(e) {
    if (grepl("Error in plot.window", e$message)) {
      message("Plotting error ignored: ", e$message)
    } else {
      stop(e)
    }
  })
}

However, despite this attempt, the software fails to produce the final output files, such as Spacers_Information.Html and Off_Targets_Information.Html.

Could you please provide guidance on how to resolve this issue? Any assistance would be greatly appreciated.

Thank you.

Error in plot.window(...)

Hi,
I'm using RStudio Server from a Docker image (rocker/rstudio). I'm trying to run the first example from the package help page, but I'm getting the following error:
Error in plot.window(...) : need finite 'xlim' values

The first step in the workflow seems to be OK; I even get the output folder with the alignments and subfolders, but somehow the plot is failing.

These are the final lines of the output I am getting:

final_structure
ggguugauuguugacaccagagauguggaagagcgcgugcacgucaugcgcaaaaccaagcuugcacccaccguggcacacgguguguuuaacccugaauuugggccugccgccuuguccaacaaggacccgcgccugaacgaagguguuguccucgacgaagucaucuucuccaaacacaagggagacaccaagaugucugaggaggacaaagcgcuguuccgccgcugcgcugcugacuacgcgucgcgccugcacagcgugcuggggacagcaaaugccccacugagcaucuaugaggccaucaagggcgucgacggacucgacgccauggagccugacaccgcgcccggccuccccugggcccuccaggggaagcgccgcggcgcgcucaucgacuucgagaacggcacugucggacccgagguugaggcugcccugaagcucauggagaaaagagaguacaaguuugcuugucaaaccuuccugaaggacgagauucgcccgauggagaaaguacgugccggcaagacucgcauugucgacgucuugccuguugaacacauucucuacaccaggaugaugauuggcagauuuugugcacaaaugcacucaaacaacggaccgcaaauuggcucggcggucgguugcaacccugauguugauuggcaaagauuuggcacacauuuugcccaguacaaaaacgugugggacguggacuauucggccuuugaugcuaaccacugcagugacgccaugaacaucauguuugaggagguguuccgcacggaauuugguuuccacccgaacgcugaguggauucugaagacucucgugaacacggaacacgccuacgagaacaagcgcaucacuguugaaggcggaaugccgucugguuguuccgcaacaagcaucaucaacacaauuuugaacaacaucuacgugcucuacgcccugcguagacacuaugagggaguugagcuggacaccuacaccaugaucuccuacggagacgacaucgugguggcaagugauuacgauuuggacuuugaggcucucaagccucacuuuaaaucucuuggucaaaccaucacuccagcugacaaaagcgacaaagguuuuguucuuggucacuccauuaccgaugucacuuuccucaaaagacacuuccacauggauuauggaacuggguuuuacaaaccugugauggcuucgaagacccucgaggcuauccucuccuuugcacgccgugggaccauacaggagaaguugaucuccguggcaggacucgccguccacucuggaccugacgaguaccggcgucucuucgagcccuuccagggccucuuugagauuccaagcuacagaucacuuuaccugcguugggugaacgccgugugcggugacgc
(((((((...))))).))......(((.(((.(((((((.....))))))).........))).))).((((...(((((((((((.....((((.......((((..(((.((((((...))))))....((((((......))))))(((((((.....((.((((((((((.........))))).....))))).))...)))))))....(((((((((((((((.((.((((((.....)))).))..)).))))).))..))))))))..((((........)))).....((((..((...((((((((.....))))))))...))))))((...(((((((((((((((((((.....))))))).)).)))).)))))).)).(((((((((.(..((((...))))..).)))))))))))).)))).....(((((((..(((.(((((...(((((((((.((((...(.(((((..((.(((...))).))..(((((((.((....((.(((((((((((......))).)))))))).)).....)).)))))))...))))).).)))).))))))))).(((((....))))).....((((.((((((...........)))))).))))((((......))))..((.(((....))).))........(((((((.((((..(((((((((.((((((...((((....(((((((.......(((((((((((((((...)))))(((((((..((((((((((((((.(((((..(((((.((.....)).)))))..)))))...)).))))....)))))))).))).)))).....)))..))))))).....((((((((((....)).))))))))....)))))))((((.......)))).((((.(((.((((.(((((((...)))))))...)))).))).))))((((((((((...(((((((((((((...))))).....))))))))....))).....((((((((...(((((((....))))))).))))))))...((((...))))....)))))))(((((((.(......).)))))))..(((.....)))...)))).))).((((.....)))).))))))))))))..)).))..)))))))...(((....)))(((((((((((......)))))))))))))))).))).....)))))))......((((.(((((.(((((..(((((.((((((((.(((((...)))))..).))))).))((.(((((...(((((((...)))).)))...))))).))..))))))))))))))).))))....))))..))))))))))))))).... (-542.40)
[1] "no sequences to match"
[1] "default ranges used"
[1] "no sequences to match"
[[1]]
[1] "Gap between bound NTs is less than 2"
[1] "no sequences to match"
Error in plot.window(...) : need finite 'xlim' values

Any idea what the cause could be?

Batch Downloading of Analysis Results

Hi,
I am reaching out to discuss the analysis results generated by your bioinformatics software. The results are currently accessible through individual web pages, each corresponding to specific analysis outputs.

While the web interface provides a convenient way to view individual results, I am facing a challenge with efficiently downloading these results for further offline analysis or archiving. My workflow would greatly benefit from a feature that allows for batch downloading of these web-based results.

Could you please advise if there is an existing functionality within the software that supports this use case?
