Comments (5)

JasonMengcp commented on June 10, 2024

@JoaoLages
I encountered the same NaN problem with some parameter settings (it usually happens when the hidden dimension is small). After debugging, I found the cause: two parameters (bias_word and bias_sent) are never initialized, so they may contain NaN.
Add self.bias_word.data.uniform_(-0.1, 0.1) to __init__() of AttentionWordRNN.
Add self.bias_sent.data.uniform_(-0.1, 0.1) to __init__() of AttentionSentRNN.

It solved my problem. Hope this can help yours!
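
For anyone who wants to see the fix in isolation, here is a minimal sketch. The class AttentionBias and its hidden_dim argument are hypothetical stand-ins, not the repository's code, and the sketch assumes the bias is created with torch.Tensor(...), which allocates memory without initializing it.

```python
import torch
import torch.nn as nn


class AttentionBias(nn.Module):
    """Hypothetical stand-in for the bias handling in AttentionWordRNN."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        # Assumption: the bias is allocated like this. torch.Tensor(n, 1)
        # does not initialize its memory, so the values are arbitrary
        # and may include NaN.
        self.bias_word = nn.Parameter(torch.Tensor(hidden_dim, 1))
        # The fix from this comment: overwrite the values in place with
        # samples drawn uniformly between -0.1 and 0.1.
        self.bias_word.data.uniform_(-0.1, 0.1)


layer = AttentionBias(hidden_dim=8)
has_nan = bool(torch.isnan(layer.bias_word).any())
print(has_nan)
```

The same line would apply to bias_sent in AttentionSentRNN.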

from attention-networks-for-classification.

Sandeep42 commented on June 10, 2024

Can you give me some more context? Is it starting as NaN, or is it converging to NaN?

JoaoLages commented on June 10, 2024

It starts with NaN, and then it cannot converge anymore. If it starts with anything other than NaN, I have never seen it converge to NaN.
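
One way to confirm that the NaN is present before any training step is to scan the model's parameters right after construction. This is only a sketch: nan_parameter_names is a hypothetical helper, and nn.Linear stands in for the real model.

```python
import torch
import torch.nn as nn


def nan_parameter_names(model: nn.Module) -> list:
    """Return the names of parameters that contain at least one NaN."""
    return [
        name
        for name, param in model.named_parameters()
        if torch.isnan(param).any()
    ]


# Hypothetical model used only to exercise the check.
model = nn.Linear(4, 2)
clean = nan_parameter_names(model)

# Simulate a badly initialized bias by writing NaN into it.
with torch.no_grad():
    model.bias.fill_(float("nan"))
bad = nan_parameter_names(model)
print(clean, bad)
```

Running the check once before the first batch and again after a few updates would separate "starts as NaN" from "converges to NaN".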

Sandeep42 commented on June 10, 2024

This problem didn't occur when I tested it. Which dataset were you using?

JoaoLages commented on June 10, 2024

It's true that I have been using another dataset, which unfortunately I cannot share. I was wondering, though, whether you have any idea why this could happen and how to avoid it.
