facebookarchive / memnn
Memory Networks implementations
License: Other
The human-annotated dataset is no longer available, and without it the given work is difficult to reproduce. Could you please provide a workaround for that?
The data downloaded by setup_data.sh and setup_turk_data.sh is no longer available:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>AllAccessDisabled</Code>
<Message>All access to this object has been disabled</Message>
<RequestId>00EC33BECAAF3FB7</RequestId>
<HostId>3gzW5QZH/lqRs4tq5zuQcaFbrQtrjgluiSx/leIG3SW9IRtAniZZ10iW3kCZyums5G29LV9gnJs=</HostId>
</Error>
For a model with adjacent weight tying, as in section 2.2.1, the gradients go to NaN after a while.
The model is designed to work on bAbI (1k dataset). I tried lowering the learning rate from 1e-2 to 1e-5; that didn't help.
The parameters are initialized according to section 4.2 of the paper. The weights A, C, T_A (temporal encoding), and T_C are initialized from a Gaussian with mean 0 and std 0.1. The number of hops is set to 3, the maximum gradient norm to 40, the batch size to 32, and the embedding dimension to 40.
During training, the gradients of A and T_A become NaN after about 10 epochs. This doesn't happen for C and T_C. The learning rate is annealed by a factor of 0.5 every 15 epochs.
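For reference, a minimal Torch sketch of the gradient handling described above: global norm clipping at 40 plus a NaN guard that skips the update rather than corrupting the weights. The names model and lr are assumptions, not from this repository's training loop.

-- Sketch only; `model` and `lr` are assumed to exist.
local params, gradParams = model:getParameters()
local maxNorm = 40
local gnorm = gradParams:norm()
if gnorm > maxNorm then
  gradParams:mul(maxNorm / gnorm)  -- rescale gradient to the maximum norm
end
-- NaN ~= NaN, so x:ne(x) marks exactly the NaN entries
if gradParams:ne(gradParams):sum() > 0 then
  gradParams:zero()                -- skip this update entirely
else
  params:add(-lr, gradParams)      -- plain SGD step
end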
On some tasks, we observed a large variance in the performance of our model (i.e. sometimes failing badly, other times not, depending on the initialization). To remedy this, we repeated each training 10 times with different random initializations, and picked the one with the lowest training error.
What were the other initializations that worked for you?
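For what it's worth, the restart procedure quoted above amounts to something like the following sketch, where trainModel is a hypothetical function returning a trained model and its training error:

-- Sketch of the paper's 10-restart selection; `trainModel` is assumed.
local bestErr, bestModel = math.huge, nil
for seed = 1, 10 do
  torch.manualSeed(seed)                -- a different initialization each run
  local model, trainErr = trainModel()
  if trainErr < bestErr then
    bestErr, bestModel = trainErr, model
  end
end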
Excuse me, I am currently applying the Key-Value Memory Networks model to KBQA,
but my experiment only reaches 46% accuracy on the test set, while the Key-Value MemNN paper reports 93% accuracy on the KB task.
Does it really reach such a high accuracy on KBQA?
In addition, I only train the model on the QA dataset without any pre-training; does that matter?
Thank you very much if anyone can give me a reply or guidance.
I can't afford Matlab, and I can't read Matlab code. When will this get ported to a different language like Python, C++, or Java? I'd like to see it in use with one of the free and open-source languages.
Hi MemNN authors (@tesatory)!
I read the paper "End-To-End Memory Networks", and I have one question.
Why do you use the stored memories x_1, ..., x_i twice in a single layer, as in Figure 1? (I attached the figure just in case.)
Why not once, or three or more times? Is there a mathematical reason, or is it just a rule of thumb?
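For context, the two appearances of the memories in one layer correspond to two different embeddings of the same sentences: one (via A) to address the memory, and one (via C) to build the output. A rough Torch sketch with assumed dimensions and random stand-ins for the embedded inputs:

-- Sketch only: random stand-ins, dimensions assumed.
local d, n = 40, 10            -- embedding size, number of memories
local m = torch.randn(n, d)    -- m_i = embedding A applied to x_i (addressing)
local c = torch.randn(n, d)    -- c_i = embedding C applied to x_i (output)
local u = torch.randn(d)       -- embedded query

local scores = m * u                       -- inner products <u, m_i>
local p = torch.exp(scores - scores:max())
p:div(p:sum())                             -- softmax over the memories
local o = c:t() * p                        -- o = sum_i p_i * c_i
local nextU = u + o                        -- fed to the next hop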
Would it be possible to provide some more details about how to train/run the model and what output to expect? For example, what will running the script th online_simulate.lua [params] produce, and how long should it run for?
Why should the position encoding change along the dimension of the word embedding?
Shouldn't the entire embedding be multiplied elementwise by a constant value?
Consider the sentence "john, went, to, the, hallway": doesn't it suffice to multiply "john" element-wise by a small constant value, say 0.1, and the last word "hallway" by a larger one?
I am trying to understand the reason for varying the weight of the position encoding along the dimension of a word embedding.
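For reference, the position encoding in section 4.1 of the paper is l_kj = (1 - j/J) - (k/d)(1 - 2j/J), where J is the sentence length and d the embedding dimension, so each word position j gets a full d-dimensional weight vector rather than one scalar; the weight varies linearly along the embedding dimension k. A small sketch with assumed sizes:

-- Sketch of the section 4.1 position-encoding matrix (sizes assumed).
local J, d = 5, 40                -- sentence length, embedding dimension
local l = torch.Tensor(d, J)
for k = 1, d do
  for j = 1, J do
    l[k][j] = (1 - j / J) - (k / d) * (1 - 2 * j / J)
  end
end
-- memory vector: m = sum over j of the elementwise product of column l[{{}, j}]
-- with the embedded word A*x_j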
The link to fb.ai/babi is a relative link, actually pointing to https://github.com/facebook/MemNN/blob/master/MemN2N-babi-matlab/fb.ai/babi, which does not work.
How can I download the mentioned dataset? Need help!
Is temporal encoding equivalent to using "time words"?
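For context, temporal encoding in the End-To-End Memory Networks paper adds a learned per-slot vector T_A(i) to each memory m_i, whereas "time words" would append a special token (e.g. time3) to sentence i before the bag-of-words embedding; with a bag-of-words encoder the two are closely related, since the time word just contributes one extra embedding row. A sketch of the former, with assumed sizes and random stand-ins:

-- Sketch only: temporal encoding as an additive learned embedding.
local d, n = 40, 10             -- embedding size, number of memory slots
local m   = torch.randn(n, d)   -- sentence embeddings A*x_i (stand-ins)
local T_A = torch.randn(n, d)   -- learned temporal embeddings, one row per slot
local mWithTime = m + T_A       -- m_i + T_A(i)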
Hi,
I have a question regarding the hash-lookup performed by KVMemNNs.
Do you compute the hashes based on the actual words or on the embeddings?
And what kind of hashing function do you use?
Best regards,
Sebastian
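For what it's worth, the Key-Value Memory Networks paper describes hashing on the actual words (an inverted index over the key words, with very frequent words dropped), not on the embeddings. A minimal sketch of that kind of lookup in plain Lua, with hypothetical names:

-- Sketch of a word-based inverted index for key hashing (names assumed).
local index = {}  -- word -> list of key ids
local function addKey(id, words)
  for _, w in ipairs(words) do
    index[w] = index[w] or {}
    table.insert(index[w], id)
  end
end
local function lookup(questionWords)
  local hits = {}  -- set of candidate key ids sharing a word with the question
  for _, w in ipairs(questionWords) do
    for _, id in ipairs(index[w] or {}) do
      hits[id] = true
    end
  end
  return hits
end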
setup.sh installs the C library into the torch path, but not library/*.lua.
For now, I can just set things up so the library/*.lua files are findable under
${s%luarocks}../share/lua/5.1/library, by continuing the setup.sh approach.
A better approach might be to install libmemnn and its Lua interface via a .spec file
(in which case "library/" is probably not a great name for the Lua interface).
Hello,
I have a problem while running the file "build_hash.sh". When it tries to run "build_hash.lua", it does not detect dictFile="./data/torch/dict-hash.txt"; I get the error that dict-hash.txt is nil, even though it contains the processed information. I cannot fix the error; do you know how I can solve it?
Thank you so much,
Andrea
I want to use a KB for question answering with KVMemNN. This repository only contains code for experimenting with KVMemNN on wiki documents. I would appreciate it if you could provide the source code for conducting the KB-based experiments mentioned in the paper. Thank you.
Hi, thanks for sharing the code. Do lines 9-17 in MemN2N-babi-matlab/build_model.m implement the position encoding? It is different from the equation in the paper (section 4.1). Can you explain why they are not the same, and why you chose to implement a different one here?
First I ran ./setup_data.sh to download the data, and then
I ran AskingQuestions/reinforce/try.sh,
but I hit the following errors:
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu line=64 error=59 : device-side assert triggered
/home/wuqiong/torch/install/bin/luajit: /home/wuqiong/.luarocks/share/lua/5.1/nn/Normalize.lua:40: cuda runtime error (59) : device-side assert triggered at /tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:64
stack traceback:
[C]: in function 'abs'
/home/wuqiong/.luarocks/share/lua/5.1/nn/Normalize.lua:40: in function 'func'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:380: in function 'func'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:380: in function 'forward'
...g/Mem/MemNN-raw/AskingQuestions/reinforce/RL_memmnet.lua:248: in function 'Forward_Policy_AQorQA'
...g/Mem/MemNN-raw/AskingQuestions/reinforce/RL_memmnet.lua:128: in function 'test'
...g/Mem/MemNN-raw/AskingQuestions/reinforce/RL_memmnet.lua:451: in function 'train'
train_RL.lua:18: in main chunk
[C]: in function 'dofile'
...iong/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
THCudaCheckWarn FAIL file=/tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/THCCachingHostAllocator.cpp line=196 error=59 : device-side assert triggered
THCudaCheckWarn FAIL file=/tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/THCCachingHostAllocator.cpp line=211 error=59 : device-side assert triggered
Does the code have an error somewhere?
Hi,
I'm a beginner with Torch. While running MemN2N-lang-model on Ubuntu 15.10 (CUDA 7.5 is installed, and my data is copied from https://github.com/wojzaremba/lstm/tree/master/data ), I get an error:
Could you tell me what's wrong with this?
Thanks~
Hi,
I have been looking at the code, and I'm not sure why the output vocabulary consists of both the word and key embeddings when the key is not tied -- link to code. The step is followed by a narrow operation limiting the logsoftmax to only the words. Is there a reason for this design choice, or can we get rid of the extra rows from z/R?
I have read the paper "Tracking the World State with Recurrent Entity Networks".
In the section 5 experiments, Table 4 shows the results of testing EntNet on CBT (the Children's Book Test).
I have some questions about training it.
In the CBT dataset, each example looks like {story, query, candidates, answer},
while bAbI is {story, query, answer}.
If I want to train on the CBT dataset, how can I feed the candidates to the model?
Or are the candidates only used to prepare the data as window sentences?
Thanks!
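For context, one common way to use the CBT candidates (not necessarily what EntNet does) is to score the full vocabulary as usual but pick the answer only among the ten candidates. A small sketch, where scores is a 1D tensor of per-word scores and candIds holds the candidates' word ids (both names hypothetical):

-- Sketch only: restrict the argmax to the candidate set (names assumed).
local function pickAnswer(scores, candIds)
  local bestId, bestScore = nil, -math.huge
  for _, id in ipairs(candIds) do
    if scores[id] > bestScore then
      bestId, bestScore = id, scores[id]
    end
  end
  return bestId
end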