cogeqc

The goal of cogeqc is to facilitate systematic quality checks on standard comparative genomics analyses to help researchers detect issues and select the most suitable parameters for each data set. Currently, cogeqc can be used to assess:

  1. Genome assembly and annotation quality, using two approaches:

    • Statistics in a context: users can extract summary assembly and annotation statistics for genomes on NCBI (via the NCBI Datasets API) and compare their observed values (e.g., genome size, number of genes, contiguity measures) with previously reported values on NCBI genomes.

    • Gene space completeness with BUSCOs: users can assess gene space completeness using Benchmarking Universal Single-Copy Orthologs (BUSCOs) through wrapper functions that run BUSCO from within an R session and create publication-ready plots with summary statistics.

  2. Orthogroup inference: orthogroups are assessed based on the percentage of shared protein domains in all orthogroups. The rationale for this approach is that genes in the same orthogroup evolved from a common ancestor, so the percentage of conserved protein domains in an orthogroup should be as high as possible.

  3. Synteny detection: synteny detection is assessed using network-based approaches, namely the clustering coefficient and degree of a synteny network.
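To make the two network metrics mentioned in item 3 concrete, here is a minimal base-R sketch that computes node degree and the local clustering coefficient on a toy network. It does not use cogeqc's own functions or reproduce its scoring; the edge list and gene IDs are hypothetical, and the network is assumed to be undirected and unweighted.

```r
# Toy synteny network as an undirected edge list (hypothetical gene IDs)
edges <- data.frame(
    from = c("geneA", "geneA", "geneB", "geneC"),
    to   = c("geneB", "geneC", "geneC", "geneD"),
    stringsAsFactors = FALSE
)

# Build a symmetric adjacency matrix indexed by node name
nodes <- sort(unique(c(edges$from, edges$to)))
adj <- matrix(
    0, nrow = length(nodes), ncol = length(nodes),
    dimnames = list(nodes, nodes)
)
adj[cbind(edges$from, edges$to)] <- 1
adj[cbind(edges$to, edges$from)] <- 1

# Degree: number of neighbors of each node
degree <- rowSums(adj)

# Triangles through each node: diagonal of A^3 divided by 2
triangles <- diag(adj %*% adj %*% adj) / 2

# Local clustering coefficient: closed triangles over possible neighbor pairs
# (set to 0 for nodes with fewer than two neighbors)
possible <- degree * (degree - 1) / 2
local_cc <- ifelse(possible > 0, triangles / possible, 0)

# Network-level summaries
mean_degree <- mean(degree)
mean_cc <- mean(local_cc)
```

In this toy example, geneA, geneB, and geneC form a closed triangle, while geneD hangs off geneC, which lowers geneC's clustering coefficient and gives geneD a coefficient of 0.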

Installation instructions

Get the latest stable R release from CRAN. Then install cogeqc from Bioconductor using the following code:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("cogeqc")

To install the development version from GitHub, use:

BiocManager::install("almeidasilvaf/cogeqc")

Citation

Below is the citation output from using citation('cogeqc') in R. Please run this yourself to check for any updates on how to cite cogeqc.

print(citation('cogeqc'), bibtex = TRUE)
#> 
#> To cite package 'cogeqc' in publications use:
#> 
#>   Almeida-Silva F, Van de Peer Y (2022). _cogeqc: Systematic quality
#>   checks on comparative genomics analyses_. R package version 1.3.1,
#>   <https://github.com/almeidasilvaf/cogeqc>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {cogeqc: Systematic quality checks on comparative genomics analyses},
#>     author = {Fabrício Almeida-Silva and Yves {Van de Peer}},
#>     year = {2022},
#>     note = {R package version 1.3.1},
#>     url = {https://github.com/almeidasilvaf/cogeqc},
#>   }

Please note that cogeqc was only made possible thanks to many other R and bioinformatics software authors, who are cited in the vignettes and/or the paper(s) describing this package.

Code of Conduct

Please note that the cogeqc project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Development tools

For more details, check the dev directory.

This package was developed using biocthis.

cogeqc's People

Contributors

almeidasilvaf, jwokaty, nturaga

cogeqc's Issues

Add license to software?

Hi there - first of all, thank you for putting together this package! I'm currently using it to assess orthogroup inference quality and it's been a fantastic tool.

I was looking to see whether the software included a license, and I noticed that you haven't yet added one - I was wondering whether you might be willing to do so?

Thanks again!
Austin
