
deep-qa's Introduction

OVERVIEW

This code implements a convolutional neural network architecture for learning to match question and answer sentences described in the paper:

Aliaksei Severyn and Alessandro Moschitti. Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. SIGIR, 2015

The network combines a state-of-the-art convolutional sentence model with an advanced question-answer matching model, and introduces a novel relational model to encode related words in a question-answer pair.

The addressed task is a popular answer sentence selection benchmark, where the goal is, for each question, to select the relevant answer sentences. The dataset was first introduced by (Wang et al., 2007) and further elaborated by (Yao et al., 2013). It is freely available.

Evaluation is performed using the standard 'trec_eval' script.
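
For reference, trec_eval takes a relevance-judgments (qrel) file and a run file; a typical invocation looks like this (file names here are hypothetical):

$ trec_eval qrels.test run.test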

DEPENDENCIES

  • python 2.7+
  • numpy
  • theano
  • scikit-learn (sklearn)
  • pandas
  • tqdm
  • fish
  • numba

The Python packages can be easily installed using the standard tool, pip.
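
For example, one way to install them all at once (the repo does not pin package versions):

$ pip install numpy theano scikit-learn pandas tqdm fish numba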

EMBEDDINGS

The pre-initialized word2vec embeddings have to be downloaded from here.

BUILD

To build the required train/dev/test sets in the format expected by the network, run:

$ sh run_build_datasets.sh

This parses the raw XML files containing QA pairs and converts them into a format suitable for the deep learning model. The output files are stored under the folders TRAIN and TRAIN-ALL, corresponding to the TRAIN and TRAIN-ALL training settings described in the paper.

Next, the script extracts word embeddings for all words in the vocabulary. We use pre-trained word embeddings obtained by running the word2vec tool on a merged Wikipedia dump and the AQUAINT corpus (provided under the 'embeddings' folder). Missing words are randomly initialized from the uniform distribution [-0.25, +0.25]. For further details, please refer to the paper.
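
As a minimal sketch of the out-of-vocabulary handling described above (only the 50-dimensional embeddings and the [-0.25, +0.25] range come from this setup; all names and data are hypothetical):

import numpy as np

ndim = 50                                   # embedding dimensionality
rng = np.random.RandomState(1234)           # arbitrary seed
word2vec = {'question': rng.randn(ndim)}    # stand-in for the pre-trained vectors
vocab = ['question', 'answerability']       # stand-in for the dataset vocabulary

emb = {}
for word in vocab:
    if word in word2vec:
        emb[word] = word2vec[word]
    else:
        # missing word: random init from U[-0.25, +0.25]
        emb[word] = rng.uniform(-0.25, 0.25, size=ndim)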

TRAIN AND TEST

To train the model in the TRAIN setting, run:

$ python run_nnet.py TRAIN

To train in the TRAIN-ALL setting, which uses 53,417 QA pairs, run:

$ python run_nnet.py TRAIN-ALL

The parameters of the trained network are dumped under the 'exp.out' folder.

The results reported by the 'trec_eval' script should be around these numbers:

TRAIN: MAP: 0.7325 MRR: 0.8018

TRAIN-ALL: MAP: 0.7654 MRR: 0.8186

NOTE: Small variations on different platforms are expected due to differences in random seeds which affect random initialization of network weights.

REFERENCES

Xuchen Yao, Benjamin Van Durme, Chris Callison-Burch, and Peter Clark. Answer extraction as sequence tagging with tree edit distance. In NAACL, 2013.

Mengqiu Wang, Noah A. Smith, and Teruko Mitamura. What is the Jeopardy model? A quasi-synchronous grammar for QA. In EMNLP, 2007.

License

This software is licensed under the Apache 2 license.

deep-qa's People

Contributors

aseveryn, dbonadiman


deep-qa's Issues

Can this work on Paragraph Level?

Great work, great algorithms, great paper Severyn!

I have a question though: can this work for reranking pairs of paragraphs (Q+A), say of 2-3 sentences each?

Question on '/tmp/trec-merged.txt'

A question on '/tmp/trec-merged.txt'. I have installed Theano as well as most of the other dependencies by installing WinPython. After running the first step,

To build the required train/dev/test sets in the format expected by the network, run:
$ sh run_build_datasets.sh

via "os.system('run_build_datasets.sh')" in the main directory of the project, I got:

jacana-qa-naacl2013-data-results/train.xml
outdir TRAIN
Traceback (most recent call last):
  File "parse.py", line 178, in <module>
    qids, questions, answers, labels = load_data(all_fname)
  File "parse.py", line 15, in load_data
    lines = open(fname).readlines()
IOError: [Errno 2] No such file or directory: '/tmp/trec-merged.txt'
Vocab size 17022
embeddings/aquaint+wiki.txt.gz.ndim=50.bin
vocab_size, layer1_size 2470719 50
. . . . . . . . . . . . . . . . . . . . . . . . . done
Words found in wor2vec embeddings 16201
ndim 50
Using zero vector as random
random_words_count 821
(17023L, 50L)
TRAIN\emb_aquaint+wiki.txt.gz.ndim=50.bin.npy
Vocab size 56952
embeddings/aquaint+wiki.txt.gz.ndim=50.bin
vocab_size, layer1_size 2470719 50
. . . . . . . . . . . . . . . . . . . . . . . . . done
Words found in wor2vec embeddings 51250
ndim 50
Using zero vector as random
random_words_count 5702
(56953L, 50L)
TRAIN-ALL\emb_aquaint+wiki.txt.gz.ndim=50.bin.npy
bash: make: command not found
bash: make: command not found

The first problem is '/tmp/trec-merged.txt', which I failed to find.
What is '/tmp/trec-merged.txt'? Is it inside the downloaded zip file, or how do I create it?
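
Not resolved in the thread, but note that a Unix-style path like '/tmp/trec-merged.txt' generally does not exist for a WinPython interpreter, and the trailing 'bash: make: command not found' lines point to a missing Unix build toolchain; on Debian/Ubuntu it can be installed with:

$ sudo apt-get install build-essential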

Implementation doubt

Hi,

Please excuse me if this doubt is very naive. When evaluating MAP, for each test query, should we ideally compare it against every answer text in the training set, or only against the answers in the mini-batch?

Architecture question

Pretty cool stuff.
Reading the code, I'm just wondering why there are so many levels of indirection from indices to word2vec sentence matrices.

It goes: parsing -> creation of an "alphabet" to map words to indices -> creation of questions/answers as series of alphabet indices -> creation of an alphabet-index-to-word2vec mapping. This also requires an NN layer that performs the index-to-word2vec lookup before the convolution.

Is there a reason to bother with indices at all, rather than transforming everything straight into a word2vec matrix, either at parsing time or just before the feed-forward phase? It seems the code would then be more tolerant of being fed new document pairs containing words that exist in word2vec but not in the "alphabet" mapping.
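
For concreteness, the indirection being discussed looks roughly like this (a toy sketch with hypothetical names, not the repo's actual code):

import numpy as np

alphabet = {'what': 0, 'is': 1, 'a': 2, 'cat': 3}        # word -> index
E = np.random.uniform(-0.25, 0.25, (len(alphabet), 50))  # index -> embedding row
E = E.astype('float32')

question = ['what', 'is', 'a', 'cat']
idxs = np.array([alphabet[w] for w in question], dtype='int32')  # sentence as indices
sent_matrix = E[idxs]   # lookup-layer output: shape (4, 50), ready for convolution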

NotImplementedError: The image and the kernel must have the same type.inputs(float64), kerns(float32)

rzai@rzai00:~/prj/deep-qa$ python run_nnet.py TRAIN
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled)
Running training in the TRAIN setting
y_train (array([0, 1], dtype=int32), array([4370, 348]))
y_dev (array([0, 1], dtype=int32), array([926, 222]))
y_test (array([0, 1], dtype=int32), array([1233, 284]))
q_train (4718, 33)
q_dev (1148, 33)
q_test (1517, 33)
a_train (4718, 40)
a_dev (1148, 40)
a_test (1517, 40)
Generating random vocabulary for word overlap indicator features with dim: 5
Gaussian
Loading word embeddings from TRAIN/emb_aquaint+wiki.txt.gz.ndim=50.bin.npy
Word embedding matrix size: (17023, 50)
batch_size 50
n_epochs 25
learning_rate 0.1
max_norm 0
Traceback (most recent call last):
  File "run_nnet.py", line 500, in <module>
    main()
  File "run_nnet.py", line 189, in main
    nnet_q.set_input((x_q, x_q_overlap))
  File "/home/rzai/prj/deep-qa/nn_layers.py", line 64, in set_input
    self.output = self.output_func(input)
  File "/home/rzai/prj/deep-qa/nn_layers.py", line 88, in output_func
    layer.set_input(cur_input)
  File "/home/rzai/prj/deep-qa/nn_layers.py", line 64, in set_input
    self.output = self.output_func(input)
  File "/home/rzai/prj/deep-qa/nn_layers.py", line 102, in output_func
    layer.set_input(input)
  File "/home/rzai/prj/deep-qa/nn_layers.py", line 64, in set_input
    self.output = self.output_func(input)
  File "/home/rzai/prj/deep-qa/nn_layers.py", line 88, in output_func
    layer.set_input(cur_input)
  File "/home/rzai/prj/deep-qa/nn_layers.py", line 64, in set_input
    self.output = self.output_func(input)
  File "/home/rzai/prj/deep-qa/nn_layers.py", line 435, in output_func
    image_shape=self.input_shape)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 151, in conv2d
    return op(input, filters)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 509, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 626, in make_node
    "inputs(%s), kerns(%s)" % (_inputs.dtype, _kerns.dtype))
NotImplementedError: The image and the kernel must have the same type.inputs(float64), kerns(float32)
rzai@rzai00:~/prj/deep-qa$
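
Not confirmed in this thread, but this dtype mismatch (float64 inputs meeting float32 kernels) commonly goes away in Theano when the default float type is pinned to float32, e.g.:

$ THEANO_FLAGS=floatX=float32 python run_nnet.py TRAIN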

some doubt of adadelta

In the function get_adagrad_updates, classical adadelta uses only the recent (exponentially decayed) exp_sqr_grads, but in your implementation the exp_sqr_grads are accumulated from the first step onward, which is how adagrad is expressed.
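
For reference, the distinction being raised, as a minimal numeric sketch (all names and constants hypothetical):

import numpy as np

grad = np.array([0.1, -0.2])   # current gradient
acc = np.zeros_like(grad)      # squared-gradient accumulator
rho = 0.95                     # adadelta decay constant

# AdaGrad: squared gradients accumulate from the first step and never decay.
acc_adagrad = acc + grad ** 2

# AdaDelta: an exponentially decaying average keeps only recent history.
acc_adadelta = rho * acc + (1.0 - rho) * grad ** 2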

downsample has been moved to pool

Relating to Theano/Theano#4337: when I run 'python2.7 run_nnet.py TRAIN', it gives this:
Traceback (most recent call last):
  File "run_nnet.py", line 15, in <module>
    import nn_layers
  File "/media/sf_D_DRIVE/installed/githubs/deep-qa/nn_layers.py", line 6, in <module>
    from theano.tensor.signal import downsample
ImportError: cannot import name downsample

I then rewrote 'from theano.tensor.signal import downsample' in nn_layers.py to 'from theano.tensor.signal import pool as downsample', and everything works.
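
A version-tolerant variant of that one-line fix (a sketch; like the fix above, it assumes the pool module exposes the function that nn_layers.py actually calls):

# in nn_layers.py
try:
    from theano.tensor.signal import pool as downsample   # Theano >= 0.8
except ImportError:
    from theano.tensor.signal import downsample           # older Theano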

How to show the low embedding of a question or an answer in the process of test ?

I'm very interested in this work and want to do some follow-up work on it.

What confuses me is how to obtain the low-dimensional embedding of a question or an answer at test time.

In the file 'run_nnet.py', 'train_nnet' is the whole CNN.
If I want the low-dimensional embedding of a question, I should look at the input of 'classifier' (equivalently, the output of 'hidden_layer'). Would you kindly tell me how to extract and print it?

Thank you very much for your help~

Best,
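
One way to extract such intermediate values in Theano (a hedged sketch, not code from the repo; x_q and x_q_overlap are the symbolic inputs visible in run_nnet.py, while x_a, x_a_overlap and hidden_layer stand for the corresponding answer inputs and the hidden-layer variable, whatever they are named there):

import theano

# Compile a function that stops at hidden_layer instead of the classifier.
# In nn_layers.py each layer stores its symbolic result in `layer.output`
# after set_input() has been called, so that attribute can be used directly.
get_embedding = theano.function(
    inputs=[x_q, x_q_overlap, x_a, x_a_overlap],
    outputs=hidden_layer.output,
)
# Calling get_embedding on a batch then returns the joint representation
# that feeds the classifier.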

Explanation for map_score() method.

I looked at the map_score() method used to compute mean average precision:

  1. It sorts the query-label pairs in decreasing order of predicted score.
  2. Then, if the original label > 0, i.e. if it was a correct answer for the question, it increments the correct count and computes precision@current_index. This means:
  3. Incorrect behaviour: for any predicted score (predicted scores are in [0, 1]), if the original label is 1 the pair is counted towards precision. It could be that predicted_score=0.2, i.e. the current answer is not predicted as correct/relevant for the question, but since its original label=1 it is still used to calculate precision.
  4. Ideally, predicted scores should be rounded to 0 or 1 based on some threshold, and an item should count as relevant only if label == rounded score.

Original code (deep-qa/run_nnet.py, lines 403-418 at commit 249a1ec):

def map_score(qids, labels, preds):
    qid2cand = defaultdict(list)
    for qid, label, pred in zip(qids, labels, preds):
        qid2cand[qid].append((pred, label))

    average_precs = []
    for qid, candidates in qid2cand.iteritems():
        average_prec = 0
        running_correct_count = 0
        for i, (score, label) in enumerate(sorted(candidates, reverse=True), 1):
            if label > 0:
                running_correct_count += 1
                average_prec += float(running_correct_count) / i
        average_precs.append(average_prec / (running_correct_count + 1e-6))
    map_score = sum(average_precs) / len(average_precs)
    return map_score
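
A toy invocation of the function above (hypothetical data; it additionally needs 'from collections import defaultdict'):

# Both questions have their relevant answer ranked first by the predictions,
# so MAP comes out at ~1.0 (the 1e-6 smoothing makes it fractionally lower).
qids   = [1, 1, 1, 2, 2]
labels = [0, 1, 0, 1, 0]
preds  = [0.2, 0.9, 0.4, 0.7, 0.1]
print map_score(qids, labels, preds)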

Reworked Code: https://github.com/gvishal/rank-text-cnn/blob/master/code/utils.py#L24-L44

pip install numba --user failed

envy@ub1404:/media/envy/data1t/os_prj/github/deep-qa$ pip install numba --user
Requirement already satisfied (use --upgrade to upgrade): numba in /home/envy/.local/lib/python2.7/site-packages
Downloading/unpacking llvmlite (from numba)
Downloading llvmlite-0.10.0.tar.gz (92kB): 92kB downloaded
Running setup.py (path:/tmp/pip_build_envy/llvmlite/setup.py) egg_info for package llvmlite

Requirement already satisfied (use --upgrade to upgrade): numpy in /home/envy/.local/lib/python2.7/site-packages (from numba)
Requirement already satisfied (use --upgrade to upgrade): enum34 in /home/envy/.local/lib/python2.7/site-packages (from numba)
Downloading/unpacking singledispatch (from numba)
Downloading singledispatch-3.4.0.3-py2.py3-none-any.whl
Downloading/unpacking funcsigs (from numba)
Downloading funcsigs-1.0.2-py2.py3-none-any.whl
Requirement already satisfied (use --upgrade to upgrade): six in /home/envy/.local/lib/python2.7/site-packages (from singledispatch->numba)
Installing collected packages: llvmlite, singledispatch, funcsigs
Running setup.py install for llvmlite
usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: -c --help [cmd1 cmd2 ...]
or: -c --help-commands
or: -c cmd --help

error: option --single-version-externally-managed not recognized
Complete output from command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_envy/llvmlite/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-4f3RJo-record/install-record.txt --single-version-externally-managed --compile --user:
usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]

or: -c --help [cmd1 cmd2 ...]

or: -c --help-commands

or: -c cmd --help

error: option --single-version-externally-managed not recognized


Cleaning up...
Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_envy/llvmlite/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-4f3RJo-record/install-record.txt --single-version-externally-managed --compile --user failed with error code 1 in /tmp/pip_build_envy/llvmlite
Storing debug log for failure in /home/envy/.pip/pip.log
envy@ub1404:/media/envy/data1t/os_prj/github/deep-qa$
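
Not resolved in the thread, but the 'option --single-version-externally-managed not recognized' failure while building llvmlite is typical of an outdated pip/setuptools; a common first step (an assumption, not a confirmed fix here) is:

$ pip install --upgrade pip setuptools
$ pip install numba --user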

How can I test on Microblog dataset?

Hello.
I'm having trouble with an experiment on this code.
I'm interested in the paper "Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks" and have successfully run the answer sentence selection experiment.
But I can't run the TREC Microblog retrieval test, because it requires the raw rankings of the top 30 systems on TMB 2011.
Can anyone tell me how I can get this data to finish the experiment?

Floating point exception (core dumped)

After running 'run_build_datasets.sh', all the files were generated inside the TRAIN and TRAIN-ALL folders, but when I try to train the network with 'python run_nnet.py TRAIN' or 'python run_nnet.py TRAIN-ALL', it stops after printing 'Generating adadelta updates' (see below):


Zero out dummy word: True
1%|▎ | 8/1122 [00:03<08:00, 2.32it/s]
Floating point exception (core dumped)


Can someone help me out with this?
