Comments (5)

BinchaoPeng commented on August 26, 2024

Sorry to disturb you. I have read issues #5, #11, #16, and #21.
First, in #5 (11 Dec 2020), the advice was to set MODEL_TYPE to dnalongcat to process sequences longer than 512; then, in #11 (17 Jan 2021), the advice was to truncate or split; and in #18 (8 Apr 2021), you said to use --model_type dnalong. With one question and three different answers, I am confused and don't know how to process long sequences correctly.

Second, following your answer in #18 (8 Apr 2021), since it is the latest, I changed "model_type": "bert" to "model_type": "dnalongcat" in config.json and changed "max_len": 512 to "max_len": 3072 in tokenizer_config.json. However, it doesn't work. Could you provide a demo or write a detailed note? After all, you are the most familiar with how to do it.
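
For reference, the "truncate or split" option from #11 could look roughly like the sketch below. It is only an illustration, not DNABERT's own preprocessing: the helper names `seq_to_kmers` and `split_kmers` are made up here, and it assumes overlapping 6-mers, a 512-token limit, and two positions reserved for [CLS] and [SEP].

```python
def seq_to_kmers(seq, k=6):
    """Return the overlapping k-mers of seq, in order (hypothetical helper)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def split_kmers(kmers, max_tokens=512, special_tokens=2):
    """Split a k-mer list into consecutive chunks that fit the token limit.

    Assumes special_tokens positions are reserved for [CLS] and [SEP].
    """
    window = max_tokens - special_tokens
    return [kmers[i:i + window] for i in range(0, len(kmers), window)]


seq = "ACGT" * 400               # toy sequence, 1600 nt
kmers = seq_to_kmers(seq, k=6)   # 1600 - 6 + 1 = 1595 k-mers
chunks = split_kmers(kmers)
print([len(c) for c in chunks])  # [510, 510, 510, 65]
```

Each chunk can then be encoded separately. How the per-chunk outputs are combined afterwards (concatenation, averaging, or something else) is a separate choice that this sketch does not cover.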

Third, in #16, is the [CLS] token's hidden state output[1]? Does that also mean it is the sentence vector? And which is better for a classification task: output[0] (word vectors) or output[1] (sentence vector)?

Finally, a suggestion: sometimes it is more convenient to call your code from within our own project to solve a problem than to run a final command line.

Please reply soon if you see this. Thanks very much!

Sincerely, PBC

from dnabert.

sheetalgiri commented on August 26, 2024

@BinchaoPeng did you find a workaround?

victormaricato commented on August 26, 2024

Third, in #16, is the [CLS] token's hidden state output[1]? Does that also mean it is the sentence vector? And which is better for a classification task: output[0] (word vectors) or output[1] (sentence vector)?

About this:

According to recent Transformer papers and the Hugging Face documentation, it is better to average the last hidden states (mean(output[0])) than to use [CLS] (output[1]), as the average is a better "semantic" representation.

Since DNABERT uses the [CLS] embedding, and transformers for DNA sequences are quite new and may behave differently from NLP transformers, it is probably best to test both aggregation techniques.
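
To make the two aggregation options concrete, here is a minimal framework-free sketch using plain Python lists, so it runs without PyTorch. The function names `cls_pool` and `mean_pool` and the toy numbers are made up for illustration. Note one simplification: in Hugging Face's BertModel, output[1] is the [CLS] hidden state passed through an extra dense layer and tanh, whereas `cls_pool` below just takes the raw [CLS] vector from output[0].

```python
def cls_pool(last_hidden):
    """Take the first token's vector ([CLS]) of each sequence."""
    return [tokens[0] for tokens in last_hidden]


def mean_pool(last_hidden, attention_mask):
    """Average token vectors per sequence, skipping padded positions."""
    pooled = []
    for tokens, mask in zip(last_hidden, attention_mask):
        kept = [vec for vec, m in zip(tokens, mask) if m]
        dim = len(tokens[0])
        pooled.append([sum(v[d] for v in kept) / len(kept) for d in range(dim)])
    return pooled


# Toy stand-in for output[0]: 1 sequence, 3 tokens, hidden size 2.
# The last token is padding (mask 0), so it must not affect the mean.
last_hidden = [[[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]]
attention_mask = [[1, 1, 0]]

print(cls_pool(last_hidden))                    # [[1.0, 2.0]]
print(mean_pool(last_hidden, attention_mask))   # [[2.0, 3.0]]
```

Either pooled vector can be fed to the same classifier head, which makes it straightforward to compare the two on a validation set.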

BinchaoPeng commented on August 26, 2024

So that means output[0] is better. Thanks!

jerryji1993 commented on August 26, 2024

Closed.
