hite's People

Contributors

boredma, csu-kanghu

Forkers

minghuaxu

hite's Issues

Incorporating known TEs

I'm curious as to whether one can use a library of already curated TEs to enhance the analysis and eliminate duplication with previous library work.

I have several species that we have manually curated and I'm hoping to use HiTE. I plan to compare the HiTE libraries to our curated TEs using any of several tools but was wondering if there is a mechanism built in that would allow me to do this automatically.
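
Whether HiTE has a built-in mechanism for this is not stated in this thread. As a manual cross-check, a minimal sketch is shown below, assuming the HiTE library has been aligned against the curated library with a tool that writes standard 12-column BLAST tabular output (`-outfmt 6`). The function name, the `query_lengths` dictionary, and the 80% identity / 80 bp / 80% coverage thresholds are illustrative assumptions, not HiTE options.

```python
def known_te_matches(blast_lines, query_lengths,
                     min_identity=80.0, min_aln_len=80, min_query_cov=0.8):
    """Map each HiTE sequence ID to its best-matching curated TE.

    blast_lines: iterable of BLAST tabular (outfmt 6) lines, where the
        query is a HiTE sequence and the subject is a curated TE.
    query_lengths: dict of HiTE sequence ID -> sequence length in bp.
    The thresholds are hypothetical defaults; tune them for your data.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        pident = float(fields[2])
        aln_len = int(fields[3])
        qstart, qend = int(fields[6]), int(fields[7])
        bitscore = float(fields[11])
        # Fraction of the HiTE sequence covered by this single alignment.
        coverage = (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if (pident >= min_identity and aln_len >= min_aln_len
                and coverage >= min_query_cov):
            # Keep only the highest-scoring qualifying hit per query.
            if qseqid not in best or bitscore > best[qseqid][1]:
                best[qseqid] = (sseqid, bitscore)
    return {q: subject for q, (subject, _) in best.items()}
```

HiTE sequences returned by this function could be treated as duplicates of curated entries, and the remainder as candidate new families. Note that the sketch judges each alignment on its own, so a match split across several HSPs would be missed.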

How to deal with huge differences in the running results of different software?

Dear developer

Thank you for designing such great software, which excels in several respects, including performance and running time!

When I performed TE detection on an animal genome of about 2.5 Gb, I ran both HiTE and EDTA2.2 and got the results shown in the figures:

image

image

The total number of TEs seems similar, but the ratio of SINEs to LINEs is quite different. Because these two TE types account for a large proportion of the genome, this nearly twofold difference confuses me. To be honest, the SINE ratio from HiTE is similar to that of previous studies, but the LINE ratio is much lower. Do I need to manually curate the TEs predicted by EDTA and HiTE? Or is there a strategy to integrate the results of the two tools to help me recover as many real TE sequences as possible?
In addition, I used the RepeatModeler+RepeatMasker strategy, and the SINE and LINE ratios obtained were closer to those of EDTA2.2.

Sincerely hope to get your professional advice, which will be very helpful to me.

Best wishes
yulong

Classification and insertion time of transposons

Hello, I have two questions about the HiTE output files:

First, the naming of transposon classifications seems to be inconsistent between the HiTE.tbl and HiTE.out files. Where can I find the correspondence between the two classification schemes?

Second, HiTE accepts a --miu parameter, but I did not find any statistics on transposon insertion time in the output files.

Looking forward to your response.
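
For reference, LTR retrotransposon insertion time is commonly estimated as T = K / (2r), where K is the divergence between the two LTRs of one element and r is the neutral substitution rate per site per year. The sketch below assumes that --miu corresponds to r and applies a Jukes-Cantor correction to the raw mismatch fraction; whether HiTE uses exactly this formula, and where it writes the result, is not confirmed in this thread. The function names are illustrative.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed mismatch fraction p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in the range [0, 0.75)")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def ltr_insertion_time(p, miu):
    """Estimated insertion time in years, T = K / (2 * miu).

    p: fraction of mismatched sites between the 5' and 3' LTRs.
    miu: substitution rate per site per year (assumed to be what
        --miu denotes; pick a value appropriate for your species).
    """
    if miu <= 0:
        raise ValueError("miu must be positive")
    return jukes_cantor_distance(p) / (2.0 * miu)


if __name__ == "__main__":
    # 1.3e-8 is only a placeholder rate for illustration.
    print(round(ltr_insertion_time(0.02, 1.3e-8)))
```

The factor of 2 reflects that both LTRs accumulate substitutions independently after insertion.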

Output files

Hello,

I've just tried HiTE on a single chromosome and am now trying to run it on whole assemblies.

I would like a bit more information on the output files:

Am I right that the output
confident_TE.cons.fa.classified
is the final, most complete set of TEs found?

However, it doesn't give the coordinates on the genome like confident_TE.cons.fa.domain does.

I would like the coordinates of the TEs on the genome.

Best wishes,

Isabella
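
A consensus library by itself carries no genomic positions; coordinates come from annotating the genome with that library. As a minimal sketch, assuming the annotation step yields a RepeatMasker-style table (such as the HiTE.out file mentioned in another issue here) with the usual 15 whitespace-separated columns, each hit can be turned into a BED-like record. The function name is illustrative, and the column layout should be checked against your actual file.

```python
def out_line_to_bed(line):
    """Convert one RepeatMasker-style .out line to a BED-like tuple.

    Returns (chrom, start, end, name, score, strand) with a 0-based,
    half-open interval, or None for header and blank lines.
    """
    fields = line.split()
    # Data lines start with an integer Smith-Waterman score.
    if len(fields) < 15 or not fields[0].isdigit():
        return None
    chrom = fields[4]
    start = int(fields[5]) - 1   # .out coordinates are 1-based
    end = int(fields[6])
    strand = "-" if fields[8] == "C" else "+"
    name = f"{fields[9]}#{fields[10]}"   # repeat name and class/family
    return (chrom, start, end, name, fields[0], strand)
```

Applying this to every line of the table and dropping the `None` results gives one record per annotated TE copy.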

Constructing a panTElib

Dear Professor Hukang,

I am delighted to see that your work has developed a pipeline capable of accurately identifying full-length TEs. I have also been troubled by the overly fragmented TEs annotated by previous software like EDTA2. However, I believe EDTA2 has its advantages, such as their panEDTA pipeline, which allows the construction of a TE library at the pangenome level. This enables the use of a single library to annotate multiple genomes, facilitating comparison and analysis.

I would like to know if it is possible to use the panEDTA pipeline to cluster libraries constructed by HiTE for multiple genomes, thereby generating a panTElib. Perhaps you could consider adding this functionality in future updates? Additionally, I noticed that your article and the peer review comments highlight the high annotation accuracy of your pipeline. However, focusing only on full-length TEs may overlook many non-full-length TEs that are still abundant in the genome. Could I combine your annotation results with the more comprehensive results from EDTA2 to achieve a more complete and accurate annotation?

Thank you again for your work. Best wishes!
yfchen

Are there any summary statistics?

Hi, thanks for this software. Does it report any summary statistics, such as the percentage of each type of TE, or the total percentage of the genome occupied by TEs?
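
If the annotation is available as a RepeatMasker-style table (15 whitespace-separated columns, as assumed here for the HiTE.out file mentioned in another issue), per-class percentages can be approximated by summing hit lengths. This is a rough sketch with an illustrative function name: overlapping hits are counted more than once, so the figures are an upper bound.

```python
from collections import defaultdict


def class_percentages(out_lines, genome_size):
    """Percent of the genome covered by each top-level TE class.

    out_lines: iterable of RepeatMasker-style .out lines.
    genome_size: total assembly length in bp.
    Overlapping annotations are not merged in this sketch.
    """
    covered = defaultdict(int)
    for line in out_lines:
        fields = line.split()
        # Skip headers and blank lines; data lines start with a score.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        te_class = fields[10].split("/")[0]   # e.g. "LTR" from "LTR/Gypsy"
        covered[te_class] += int(fields[6]) - int(fields[5]) + 1
    return {c: 100.0 * bp / genome_size for c, bp in covered.items()}
```

Summing the returned values gives an estimate of the total TE percentage of the genome under the same no-overlap caveat.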
