memnn's Introduction

Memory-Augmented Neural Networks

This project contains implementations of memory-augmented neural networks, with the code organized into per-model subdirectories.

Other third-party implementations

  • python-babi: MemN2N implementation on the bAbI tasks, with a very nice interactive demo.
  • theano-babi: MemN2N implementation in Theano for the bAbI tasks.
  • tf-lang: MemN2N language-model implementation in TensorFlow.
  • tf-babi: Another MemN2N implementation in TensorFlow, this one for the bAbI tasks.

memnn's People

Contributors

alexholdenmiller, facebook-github-bot, jaseweston, saravananselvamohan, tesatory

memnn's Issues

Help with KB-based KVMemNN for QA experiments

I want to use a KB for question answering with KVMemNN. This repository only contains code for experimenting with KVMemNN on Wikipedia documents. I would appreciate it if you could provide the source code for the KB-based experiments mentioned in the paper. Thank you.

Couldn't position encoding be like temporal encoding?

Why should the position encoding change along the dimensions of the word embedding? Shouldn't the entire embedding be multiplied element-wise by a single constant? Consider the sentence "john, went, to, the, hallway": doesn't it suffice to multiply "john" element-wise by a small constant, say 0.1, and the last word "hallway" by a larger one? I am trying to understand the reason for varying the weight of the position encoding along the dimensions of a word embedding.
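
For reference, the position encoding (PE) of Section 4.1 of the End-To-End Memory Networks paper is l_kj = (1 - j/J) - (k/d)(1 - 2j/J), where j indexes the word position and k the embedding dimension. A minimal numpy sketch (names are illustrative, not from this repository):

import numpy as np

def position_encoding(J, d):
    # l[j-1, k-1] = (1 - j/J) - (k/d) * (1 - 2*j/J), with 1-based j (word
    # position, J words total) and k (embedding dimension, d dims total).
    l = np.empty((J, d))
    for j in range(1, J + 1):
        for k in range(1, d + 1):
            l[j - 1, k - 1] = (1 - j / J) - (k / d) * (1 - 2 * j / J)
    return l

# A temporal-style scheme would scale each word's entire embedding by one
# scalar; PE instead gives every embedding dimension its own position
# profile, so the summed sentence vector mixes the words differently in
# each dimension rather than by a single per-word weight.
print(position_encoding(5, 4))  # "john went to the hallway", toy d = 4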

Dataset can no longer be downloaded

The human-annotated dataset is no longer available, and without it the published results are difficult to reproduce. Could you please provide a workaround?

<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>AllAccessDisabled</Code>
  <Message>All access to this object has been disabled</Message> 
  <RequestId>00EC33BECAAF3FB7</RequestId>
  <HostId>3gzW5QZH/lqRs4tq5zuQcaFbrQtrjgluiSx/leIG3SW9IRtAniZZ10iW3kCZyums5G29LV9gnJs=</HostId>
</Error>

Cannot read "dict-hash.txt"

Hello,

I have a problem while running "build_hash.sh". When it tries to run "build_hash.lua", it does not detect dictFile="./data/torch/dict-hash.txt"; I get an error that dict-hash.txt is nil, even though the file exists and contains the processed information. I cannot fix the error; do you know how I can solve it?

Thank you so much,

Andrea

Errors when running AskingQuestions/reinforce

First I ran ./setup_data.sh to download the data, then ran AskingQuestions/reinforce/try.sh, but I get the following errors:
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu line=64 error=59 : device-side assert triggered
/home/wuqiong/torch/install/bin/luajit: /home/wuqiong/.luarocks/share/lua/5.1/nn/Normalize.lua:40: cuda runtime error (59) : device-side assert triggered at /tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:64
stack traceback:
[C]: in function 'abs'
/home/wuqiong/.luarocks/share/lua/5.1/nn/Normalize.lua:40: in function 'func'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:380: in function 'func'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:380: in function 'forward'
...g/Mem/MemNN-raw/AskingQuestions/reinforce/RL_memmnet.lua:248: in function 'Forward_Policy_AQorQA'
...g/Mem/MemNN-raw/AskingQuestions/reinforce/RL_memmnet.lua:128: in function 'test'
...g/Mem/MemNN-raw/AskingQuestions/reinforce/RL_memmnet.lua:451: in function 'train'
train_RL.lua:18: in main chunk
[C]: in function 'dofile'
...iong/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
THCudaCheckWarn FAIL file=/tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/THCCachingHostAllocator.cpp line=196 error=59 : device-side assert triggered
THCudaCheckWarn FAIL file=/tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/THCCachingHostAllocator.cpp line=211 error=59 : device-side assert triggered

Is there a bug in the code?

Vocabulary size of z/R in output module of EntNet

Hi,

I have been looking at the code and I'm not sure why the output vocabulary size includes both the word and key embeddings when the keys are not tied -- link to code. That step is followed by a narrow operation limiting the logsoftmax to only the words. Is there a reason for this design choice, or can we get rid of the extra rows of z/R?
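
One observation, hedged since it may not capture the authors' intent: narrowing after the logsoftmax is not the same as restricting the vocabulary before it. A toy numpy illustration (not the repository's code):

import numpy as np

def log_softmax(x):
    x = x - x.max()                          # numerical stability
    return x - np.log(np.exp(x).sum())

# Toy scores over 3 word rows followed by 2 key rows.
scores = np.array([1.0, 0.5, 0.2, 2.0, 1.5])

full_then_narrow = log_softmax(scores)[:3]   # normalizer includes key rows
words_only       = log_softmax(scores[:3])   # normalizer over words only

# The two differ by a constant offset, so the argmax is unchanged, but the
# NLL loss is not: narrowing after the logsoftmax still penalizes any
# probability mass the model places on the key rows.
print(full_then_narrow - words_only)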

NaN in gradient on A matrix

For a model with adjacent weight tying, as in Section 2.2.1, the gradient goes to NaN after a while. The model targets bAbI (the 1k dataset). I tried lowering the learning rate from 1e-2 to 1e-5, which didn't help.
The parameters are initialized according to Section 4.2 of the paper: the weights A, C, T_A (temporal encoding), and T_C are drawn from a Gaussian with mean 0 and std 0.1; the number of hops is 3; the maximum gradient norm is 40; the batch size is 32; and the embedding dimension is 40.
During training, the gradients of A and T_A become NaN after about 10 epochs; this doesn't happen for C and T_C. The learning rate is annealed by a factor of 0.5 every 15 epochs.

  1. What can I try to address the NaN gradients of A and T_A? These weights are used only during the first hop.

  2. The paper states: "On some tasks, we observed a large variance in the performance of our model (i.e. sometimes failing badly, other times not, depending on the initialization). To remedy this, we repeated each training 10 times with different random initializations, and picked the one with the lowest training error." What were the other initializations that worked for you?
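
No authoritative answer here, but since gradient clipping cannot repair a gradient that is already NaN, asserting finiteness before the clipped update helps localize the first bad step. A minimal numpy sketch, assuming gradients are plain arrays and using the max norm of 40 quoted above:

import numpy as np

def clip_by_global_norm(grads, max_norm=40.0):
    # Fail fast on the step where the first non-finite value appears.
    if any(not np.all(np.isfinite(g)) for g in grads):
        raise FloatingPointError("non-finite gradient before clipping")
    total = np.sqrt(sum(float(np.sum(g * g)) for g in grads))
    if total > max_norm:
        grads = [g * (max_norm / total) for g in grads]
    return grads

If the first failure turns out to be in the softmax over memories, subtracting the per-example maximum score before exponentiating is the standard stabilization.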

Training EntNet on CBT dataset

I have read the paper "Tracking the World State with Recurrent Entity Networks". In the Section 5 experiments, Table 4 reports EntNet results on CBT (the Children's Book Test), and I have a question about training on it. A CBT example looks like {story, query, candidates, answer}, while bAbI is {story, query, answer}. If I want to train on CBT, how should I feed the candidates to the model? Or are the candidates only used to prepare the data as window sentences?

Thanks!
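
Not an authoritative answer, but a common way to use the candidates at training and test time is to restrict the output distribution to them rather than scoring the full vocabulary. A hypothetical numpy sketch (vocab_logits and candidate_ids are illustrative names, not this repository's API):

import numpy as np

def candidate_probs(vocab_logits, candidate_ids):
    # Keep only the logits of the 10 CBT candidate words and renormalize
    # over that subset, so the model is scored on the candidates alone.
    scores = vocab_logits[candidate_ids]
    scores = scores - scores.max()       # numerical stability
    p = np.exp(scores)
    return p / p.sum()                   # distribution over the candidates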

KVmemnn setup.sh

setup.sh installs the C library into the torch path, but not the library/*.lua files.

For now, I can arrange for the library/*.lua files to be findable under ${s%luarocks}../share/lua/5.1/library by continuing the setup.sh approach.

A better approach might install libmemnn and its Lua interface via a .spec file (in which case "library/" is probably not a great name for the Lua interface).

When will it be converted to a non-Matlab language?

I can't afford Matlab, and I can't read Matlab code. When will this be ported to another language such as Python, C++, or Java? I'd like to see it implemented in a free and open-source language.

Running the model

Would it be possible to provide more details about how to train/run the model and what output to expect? For example, what will running th online_simulate.lua [params] produce, and how long should it run?

Human-in-the-Loop data cannot be downloaded

The data referenced in setup_data.sh and setup_turk_data.sh is no longer available for download:

<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>AllAccessDisabled</Code>
  <Message>All access to this object has been disabled</Message> 
  <RequestId>00EC33BECAAF3FB7</RequestId>
  <HostId>3gzW5QZH/lqRs4tq5zuQcaFbrQtrjgluiSx/leIG3SW9IRtAniZZ10iW3kCZyums5G29LV9gnJs=</HostId>
</Error>

Why is the stored memory used twice in a single layer?

Hi MemNN authors (@tesatory)!

I read the paper "End-To-End Memory Networks", and I have one question: why is the stored memory x_1, ..., x_i used twice in a single layer, as in Figure 1?

Why twice, and not once, or three or more times? Is there a mathematical reason, or is it just a rule of thumb?
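
As I read Figure 1, the memory is attended to once, but each stored sentence is embedded twice: once through A to produce the keys matched against the query, and once through C to produce the values summed into the output. A toy numpy sketch of one hop, with one-word "sentences" for brevity:

import numpy as np

def memn2n_hop(u, story_ids, A, C):
    # A and C are (vocab, d) embedding matrices; u is the (d,) query.
    m = A[story_ids]                     # m_i = A x_i  (addressing keys)
    c = C[story_ids]                     # c_i = C x_i  (output values)
    s = m @ u                            # match scores
    p = np.exp(s - s.max())
    p = p / p.sum()                      # p_i = softmax(u . m_i)
    o = p @ c                            # o = sum_i p_i c_i
    return u + o                         # input to the next hop

Reading memory more times corresponds to stacking more hops, which the paper treats as a hyperparameter rather than a mathematical necessity.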

KVMemNN for KBQA

Excuse me, I am applying the key-value memory network model to KBQA, but my experiment reaches only 46% accuracy on the test set, while the Key-Value MemNN paper reports 93% accuracy on the KB task. Does it really achieve such high accuracy on KBQA? In addition, I trained the model on the QA dataset alone, without any pre-training; does that matter? Thank you very much to anyone who can give me a reply or some guidance.

Hash-lookup with KVMemNN

Hi,

I have a question regarding the hash-lookup performed by KVMemNNs.
Do you compute the hashes based on the actual words or on the embeddings?
And what kind of hashing function do you use?

Best regards,
Sebastian
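
For what it's worth, the Key-Value Memory Networks paper describes key hashing as an inverted index over the actual words, not the embeddings: a memory slot is retrieved if it shares at least one word with the question, ignoring words above a frequency cutoff (1,000 in the paper) so that stopwords don't match everything. An illustrative sketch, not this repository's code:

from collections import defaultdict

def build_inverted_index(keys_tokenized, max_freq=1000):
    # Count how many keys each word appears in, then index each slot under
    # its non-frequent (informative) words.
    df = defaultdict(int)
    for words in keys_tokenized:
        for w in set(words):
            df[w] += 1
    index = defaultdict(set)
    for slot, words in enumerate(keys_tokenized):
        for w in set(words):
            if df[w] < max_freq:
                index[w].add(slot)
    return index

def lookup(index, question_words):
    # Union of all slots sharing at least one informative word.
    slots = set()
    for w in question_words:
        slots |= index.get(w, set())
    return slots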
