

le-yuan commented on June 8, 2024

Hi,

I am sorry that you had a bad experience with our computational toolbox HGTphyloDetect, but some of your comments are too harsh.

First, consider the title of your comment: "Don't waste your time, software not usable in current form". The recently published paper "Long-read genome sequencing provides novel insights into the harmful algal bloom species Prymnesium parvum", in the journal Science of The Total Environment (impact factor about 10), shows that other scientists are already using HGTphyloDetect in their own research.

I would also like to respond to some of the issues you listed, though not one by one:

Currently, the input protein accessions must come from the widely used GenBank database. The protein ID matters in our pipeline because we use it to retrieve taxonomy information. GenBank, which is maintained by NIH and has been updated since December 1982, contained 247777761 sequences as of version 258 (October 2023). I believe the current version of the toolbox can therefore serve many users in the community.

Right now, BLASTP is run through the NCBI remote blastp function. As you may know, there are two main ways to run this kind of search. The first is NCBI BLASTP: with the remote option, users do not need to download the large NCBI database, which is convenient. The second is DIAMOND. It is very fast, but users must download a very large database before they can use it. I tried this option before and, if I remember correctly, downloading the database took more than 24 hours.
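To make the trade-off concrete, here is a minimal sketch of how the two searches might be invoked. It is not the exact command line used inside HGTphyloDetect: the function names, the `nr` database, the e-value, and the tabular output format are assumptions chosen for illustration. Only the `-remote` flag of BLAST+ and the need for a local database with DIAMOND come from the discussion above.

```python
import subprocess


def remote_blastp_cmd(query_fasta, out_path, db="nr", evalue="1e-5"):
    """Build a BLAST+ blastp command that searches on NCBI's servers.

    -remote sends the query to NCBI, so no local database is needed.
    The db, evalue and output format are illustrative assumptions.
    """
    return [
        "blastp", "-query", query_fasta, "-db", db, "-remote",
        "-evalue", evalue, "-outfmt", "6", "-out", out_path,
    ]


def diamond_blastp_cmd(query_fasta, out_path, local_db):
    """Build a DIAMOND blastp command against a locally downloaded database.

    local_db is a hypothetical path to a database built beforehand
    (for example with `diamond makedb`).
    """
    return [
        "diamond", "blastp", "--query", query_fasta, "--db", local_db,
        "--outfmt", "6", "--out", out_path,
    ]


def run(cmd):
    """Run a command built above; raises CalledProcessError on failure."""
    subprocess.run(cmd, check=True)
```

For example, `run(remote_blastp_cmd("input.fasta", "hits.tsv"))` would need only the BLAST+ binaries and a network connection, while the DIAMOND variant needs the database on disk first.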

To parse the taxonomy information, I used the ETE toolkit (http://etetoolkit.org/docs/latest/tutorial/tutorial_ncbitaxonomy.html), which is very fast and very convenient. I also added a "try, except" block to the script so that entries that cannot be found do not throw an error.
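The "try, except" idea can be sketched as follows. This is an illustration of the pattern, not the code from the HGTphyloDetect scripts: `safe_lineage` and `toy_get_lineage` are hypothetical names, and the toy lineage table is abbreviated so the example runs without downloading the NCBI taxonomy. With ETE, the lookup passed in would be `NCBITaxa().get_lineage`, which, as far as I know, raises `ValueError` for an unknown taxid; treat that as an assumption.

```python
def safe_lineage(taxid, get_lineage):
    """Return the lineage for taxid, or None if the entry is not found.

    get_lineage is any callable mapping a taxid to a list of ancestor
    taxids, e.g. ete3's NCBITaxa().get_lineage (assumed to raise
    ValueError when the taxid is missing).
    """
    try:
        return list(get_lineage(taxid))
    except (KeyError, ValueError):
        # Missing entry: report "not found" instead of crashing the run.
        return None


# Toy stand-in for the real taxonomy database (abbreviated, illustrative).
_TOY_LINEAGES = {9606: [1, 131567, 2759, 9606]}


def toy_get_lineage(taxid):
    """Hypothetical lookup that raises KeyError for unknown taxids."""
    return _TOY_LINEAGES[taxid]


found = safe_lineage(9606, toy_get_lineage)    # known entry
missing = safe_lineage(-1, toy_get_lineage)    # unknown entry
```

Here `found` holds the toy lineage and `missing` is `None`, so a pipeline can skip hits without taxonomy rather than stop.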

Although HGT prediction with the software is not very fast, its accuracy and sensitivity were quite good in case studies on specific species. We describe this in the discussion section of our published paper.

As you noted, HGTphyloDetect requires a particular directory structure. However, we provide detailed documentation for it, including step-by-step examples that show how to run each stage and what output to expect.

We look forward to releasing a future version of the software, and we hope that you and many other people will like it. Thanks also for your feedback!
