facebookarchive / memnn
Memory Networks implementations
License: Other
The human-annotated dataset is no longer available, and without it the given work is difficult to reproduce. Could you please provide a workaround for that?
The data downloaded by setup_data.sh and setup_turk_data.sh is no longer available:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>AllAccessDisabled</Code>
<Message>All access to this object has been disabled</Message>
<RequestId>00EC33BECAAF3FB7</RequestId>
<HostId>3gzW5QZH/lqRs4tq5zuQcaFbrQtrjgluiSx/leIG3SW9IRtAniZZ10iW3kCZyums5G29LV9gnJs=</HostId>
</Error>
For a model with adjacent weight tying, as in section 2.2.1, the gradients go to NaN after a while.
The model is designed to work on bAbI (1k dataset). I tried lowering the learning rate from 1e-2 to 1e-5; that didn't help.
The parameters are initialized according to section 4.2 of the paper. The weights A, C, T_A (temporal encoding), and T_C are initialized from a Gaussian with mean 0 and std 0.1. The number of hops is set to 3, the maximum gradient norm to 40, the batch size to 32, and the embedding dimension to 40.
During training, the gradients of A and T_A become NaN after about 10 epochs. This doesn't happen for C and T_C. The learning rate is annealed by a factor of 0.5 every 15 epochs.
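For reference, a minimal Torch sketch of the gradient handling described above: global norm clipping at 40 plus a NaN guard that skips the update rather than corrupting the weights. The names model and lr are assumptions, not from this repository's training loop.

-- Sketch only; `model` and `lr` are assumed to exist.
local params, gradParams = model:getParameters()
local maxNorm = 40
local gnorm = gradParams:norm()
if gnorm > maxNorm then
  gradParams:mul(maxNorm / gnorm)  -- rescale gradient to the maximum norm
end
-- NaN ~= NaN, so x:ne(x) marks exactly the NaN entries
if gradParams:ne(gradParams):sum() > 0 then
  gradParams:zero()                -- skip this update entirely
else
  params:add(-lr, gradParams)      -- plain SGD step
end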
On some tasks, we observed a large variance in the performance of our model (i.e. sometimes failing badly, other times not, depending on the initialization). To remedy this, we repeated each training 10 times with different random initializations, and picked the one with the lowest training error.
What were the other initializations that worked for you?
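For what it's worth, the restart procedure quoted above amounts to something like the following sketch, where trainModel is a hypothetical function returning a trained model and its training error:

-- Sketch of the paper's 10-restart selection; `trainModel` is assumed.
local bestErr, bestModel = math.huge, nil
for seed = 1, 10 do
  torch.manualSeed(seed)                -- a different initialization each run
  local model, trainErr = trainModel()
  if trainErr < bestErr then
    bestErr, bestModel = trainErr, model
  end
end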
Excuse me, I am currently applying the Key-Value Memory Networks model to KBQA,
but my experiment only reaches 46% accuracy on the test set, while the Key-Value MemNN paper reports 93% accuracy on the KB task.
Does it really reach such a high accuracy on KBQA?
In addition, I only train the model on the QA dataset without any pre-training; does that matter?
Thank you very much if anyone can give me a reply or guidance.
I can't afford Matlab, and I can't read Matlab code. When will this get ported to a different language like Python, C++, or Java? I'd like to see it in use with one of the free and open-source languages.
Hi MemNN authors (@tesatory)!
I read the paper "End-To-End Memory Networks", and I have one question.
Why do you use the stored memories x_1, ..., x_i twice in a single layer, as in Figure 1? (I attached the figure just in case.)
Why not once, or three or more times? Is there a mathematical reason, or is it just a rule of thumb?
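For context, the two appearances of the memories in one layer correspond to two different embeddings of the same sentences: one (via A) to address the memory, and one (via C) to build the output. A rough Torch sketch with assumed dimensions and random stand-ins for the embedded inputs:

-- Sketch only: random stand-ins, dimensions assumed.
local d, n = 40, 10            -- embedding size, number of memories
local m = torch.randn(n, d)    -- m_i = embedding A applied to x_i (addressing)
local c = torch.randn(n, d)    -- c_i = embedding C applied to x_i (output)
local u = torch.randn(d)       -- embedded query

local scores = m * u                       -- inner products <u, m_i>
local p = torch.exp(scores - scores:max())
p:div(p:sum())                             -- softmax over the memories
local o = c:t() * p                        -- o = sum_i p_i * c_i
local nextU = u + o                        -- fed to the next hop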
Would it be possible to provide some more details about how to train/run the model and what output to expect? For example, what will running the script th online_simulate.lua [params] produce, and how long should it run for?
Why should the position encoding change along the dimension of the word embedding?
Shouldn't the entire embedding be multiplied elementwise by a constant value?
Consider the sentence "john, went, to, the, hallway": doesn't it suffice to multiply "john" element-wise by a small constant value, say 0.1, and the last word "hallway" by a larger one?
I am trying to understand the reason for varying the weight of the position encoding along the dimension of a word embedding.
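For reference, the position encoding in section 4.1 of the paper is l_kj = (1 - j/J) - (k/d)(1 - 2j/J), where J is the sentence length and d the embedding dimension, so each word position j gets a full d-dimensional weight vector rather than one scalar; the weight varies linearly along the embedding dimension k. A small sketch with assumed sizes:

-- Sketch of the section 4.1 position-encoding matrix (sizes assumed).
local J, d = 5, 40                -- sentence length, embedding dimension
local l = torch.Tensor(d, J)
for k = 1, d do
  for j = 1, J do
    l[k][j] = (1 - j / J) - (k / d) * (1 - 2 * j / J)
  end
end
-- memory vector: m = sum over j of the elementwise product of column l[{{}, j}]
-- with the embedded word A*x_j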
The link to fb.ai/babi is a relative link, actually pointing to https://github.com/facebook/MemNN/blob/master/MemN2N-babi-matlab/fb.ai/babi, which does not work.
How can I download the mentioned dataset? Need help!
Is temporal encoding equivalent to using "time words"?
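For context, temporal encoding in the End-To-End Memory Networks paper adds a learned per-slot vector T_A(i) to each memory m_i, whereas "time words" would append a special token (e.g. time3) to sentence i before the bag-of-words embedding; with a bag-of-words encoder the two are closely related, since the time word just contributes one extra embedding row. A sketch of the former, with assumed sizes and random stand-ins:

-- Sketch only: temporal encoding as an additive learned embedding.
local d, n = 40, 10             -- embedding size, number of memory slots
local m   = torch.randn(n, d)   -- sentence embeddings A*x_i (stand-ins)
local T_A = torch.randn(n, d)   -- learned temporal embeddings, one row per slot
local mWithTime = m + T_A       -- m_i + T_A(i)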
Hi,
I have a question regarding the hash-lookup performed by KVMemNNs.
Do you compute the hashes based on the actual words or on the embeddings?
And what kind of hashing function do you use?
Best regards,
Sebastian
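For what it's worth, the Key-Value Memory Networks paper describes hashing on the actual words (an inverted index over the key words, with very frequent words dropped), not on the embeddings. A minimal sketch of that kind of lookup in plain Lua, with hypothetical names:

-- Sketch of a word-based inverted index for key hashing (names assumed).
local index = {}  -- word -> list of key ids
local function addKey(id, words)
  for _, w in ipairs(words) do
    index[w] = index[w] or {}
    table.insert(index[w], id)
  end
end
local function lookup(questionWords)
  local hits = {}  -- set of candidate key ids sharing a word with the question
  for _, w in ipairs(questionWords) do
    for _, id in ipairs(index[w] or {}) do
      hits[id] = true
    end
  end
  return hits
end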
setup.sh installs the C library into the torch path, but not library/*.lua.
For now, I can just set things up so the library/*.lua files are findable under
${s%luarocks}../share/lua/5.1/library, by continuing the setup.sh approach.
A better approach might be to install libmemnn and its Lua interface via a .spec file
(in which case "library/" is probably not a great name for the Lua interface).
Hello,
I have a problem while running the file "build_hash.sh". When it tries to run "build_hash.lua", it does not detect dictFile="./data/torch/dict-hash.txt"; I get the error that dict-hash.txt is nil, even though it contains the processed information. I cannot fix the error; do you know how I can solve it?
Thank you so much,
Andrea
I want to use a KB for question answering with KVMemNN. This repository only contains code for experimenting with KVMemNN on wiki documents. I would appreciate it if you could provide the source code for conducting the KB-based experiments mentioned in the paper. Thank you.
Hi, thanks for sharing the code. Do lines 9-17 in MemN2N-babi-matlab/build_model.m implement the position encoding? It is different from the equation in the paper (section 4.1). Can you explain why they are not the same, and why you chose to implement a different one here?
First I ran ./setup_data.sh to download the data, and then
I ran AskingQuestions/reinforce/try.sh,
but I hit the following errors:
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu line=64 error=59 : device-side assert triggered
/home/wuqiong/torch/install/bin/luajit: /home/wuqiong/.luarocks/share/lua/5.1/nn/Normalize.lua:40: cuda runtime error (59) : device-side assert triggered at /tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:64
stack traceback:
[C]: in function 'abs'
/home/wuqiong/.luarocks/share/lua/5.1/nn/Normalize.lua:40: in function 'func'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:380: in function 'func'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
/home/wuqiong/.luarocks/share/lua/5.1/nngraph/gmodule.lua:380: in function 'forward'
...g/Mem/MemNN-raw/AskingQuestions/reinforce/RL_memmnet.lua:248: in function 'Forward_Policy_AQorQA'
...g/Mem/MemNN-raw/AskingQuestions/reinforce/RL_memmnet.lua:128: in function 'test'
...g/Mem/MemNN-raw/AskingQuestions/reinforce/RL_memmnet.lua:451: in function 'train'
train_RL.lua:18: in main chunk
[C]: in function 'dofile'
...iong/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
THCudaCheckWarn FAIL file=/tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/THCCachingHostAllocator.cpp line=196 error=59 : device-side assert triggered
THCudaCheckWarn FAIL file=/tmp/luarocks_cutorch-scm-1-8495/cutorch/lib/THC/THCCachingHostAllocator.cpp line=211 error=59 : device-side assert triggered
Does the code have an error somewhere?
Hi,
I'm a beginner with Torch. While running MemN2N-lang-model on Ubuntu 15.10 (CUDA 7.5 is installed, and my data is copied from https://github.com/wojzaremba/lstm/tree/master/data ), I get an error:
Could you tell me what's wrong with this?
Thanks~
Hi,
I have been looking at the code, and I'm not sure why the output vocabulary consists of both the word and key embeddings when the key is not tied -- link to code. The step is followed by a narrow operation limiting the logsoftmax to only the words. Is there a reason for this design choice, or can we get rid of the extra rows from z/R?
I have read the paper "Tracking the World State with Recurrent Entity Networks".
In the section 5 experiments, Table 4 shows the results of testing EntNet on CBT (the Children's Book Test).
I have some questions about training it.
In the CBT dataset, each example looks like {story, query, candidates, answer},
while bAbI is {story, query, answer}.
If I want to train on the CBT dataset, how can I feed the candidates to the model?
Or are the candidates only used to prepare the data as window sentences?
Thanks!
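For context, one common way to use the CBT candidates (not necessarily what EntNet does) is to score the full vocabulary as usual but pick the answer only among the ten candidates. A small sketch, where scores is a 1D tensor of per-word scores and candIds holds the candidates' word ids (both names hypothetical):

-- Sketch only: restrict the argmax to the candidate set (names assumed).
local function pickAnswer(scores, candIds)
  local bestId, bestScore = nil, -math.huge
  for _, id in ipairs(candIds) do
    if scores[id] > bestScore then
      bestId, bestScore = id, scores[id]
    end
  end
  return bestId
end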