QANet: https://arxiv.org/abs/1804.09541
This Keras model is a port of the TensorFlow QANet implementation (https://github.com/NLPLearn/QANet). The self-attention and position embedding were originally taken from (https://kexue.fm/archives/4765, https://github.com/bojone/attention), and have since been revised to follow (https://github.com/NLPLearn/QANet).
We find that the conv-based multi-head attention in (https://github.com/NLPLearn/QANet/blob/master/layers.py) performs 3%~4% better than the matrix-multiplication-based one in (https://github.com/bojone/attention/blob/master/attention_keras.py).
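Both attention variants compute the same scaled dot-product core; a plain NumPy sketch of that core (names and shapes here are illustrative, not the repo's actual layer API):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(x, Wq, Wk, Wv, num_heads):
    # x: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_model).
    # A kernel-size-1 Conv1D projection is this same position-wise matmul.
    seq_len, d_model = x.shape
    head_dim = d_model // num_heads
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # split into heads: (num_heads, seq_len, head_dim)
    split = lambda t: t.reshape(seq_len, num_heads, head_dim).transpose(1, 0, 2)
    q, k, v = split(q), split(k), split(v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)   # (heads, seq, seq)
    out = softmax(scores) @ v                               # (heads, seq, head_dim)
    # merge heads back to (seq_len, d_model)
    return out.transpose(1, 0, 2).reshape(seq_len, d_model)

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 8))
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
out = multi_head_self_attention(x, Wq, Wk, Wv, num_heads=2)
print(out.shape)  # (6, 8)
```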
- Download the SQuAD data from (https://rajpurkar.github.io/SQuAD-explorer/).
- Run `preprocess.ipynb` and `handcraft.ipynb` to get npys of the preprocessed data and the handcraft features.
- Run `train_QANet.py` to start training.
- Fast demo: use the built-in `model.fit()` in `QANet_fit_demo.py` with random numpy data.
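The fast-demo idea looks roughly like this. This is a minimal sketch with a toy stand-in model, not the actual QANet architecture from `QANet_fit_demo.py`; all layer sizes are made up:

```python
import numpy as np
from tensorflow import keras

max_len, vocab, dim = 20, 100, 32

# toy stand-in model: embed context tokens, predict an answer-start position
inp = keras.Input(shape=(max_len,), dtype="int32")
x = keras.layers.Embedding(vocab, dim)(inp)
x = keras.layers.GlobalAveragePooling1D()(x)
out = keras.layers.Dense(max_len, activation="softmax")(x)
model = keras.Model(inp, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# random numpy inputs and random answer-start labels, as in the demo
X = np.random.randint(0, vocab, size=(64, max_len))
y = np.random.randint(0, max_len, size=(64,))
history = model.fit(X, y, batch_size=16, epochs=1, verbose=0)
```

Feeding random data this way is a quick sanity check that the graph compiles and the training loop runs, without downloading SQuAD first.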
- Add EMA (with about 3% improvement)
- Add multi gpu (speed up)
- Support adding handcraft features
- Revise the MultiHeadAttention and PositionEmbedding in Keras
- Support parallel multi-gpu training and inference
- Add layer dropout and revise the dropout bug (with about 2% improvement)
- Update the experimental results and related hyper-parameters (Coming soon)
- Revise the output layer `QAoutputBlock.py` (with about 1% improvement)
- Add a slice operation to QANet (dynamically get the max context length in each batch to speed up the model)
- Add data augmentation
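The slice operation above can be sketched in NumPy: trim each padded batch to its longest real sequence so later layers do less work. The helper name and the pad id of 0 are assumptions for illustration:

```python
import numpy as np

def slice_batch(batch, pad_id=0):
    """Trim a padded batch to the longest non-padding sequence it contains."""
    lengths = (batch != pad_id).sum(axis=1)   # true length of each row
    max_len = int(lengths.max())              # longest sequence in this batch
    return batch[:, :max_len]

batch = np.array([[5, 3, 0, 0, 0],
                  [7, 2, 9, 0, 0]])
print(slice_batch(batch).shape)  # (2, 3): padded length 5 cut down to 3
```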
I find that EMA is hard to implement efficiently on GPU in Keras, and it slows training down considerably. The slice op is also hard to add in Keras, which slows training further (it costs about twice as much time as the optimized TensorFlow version). Moreover, there is still a ~2% gap between the Keras version and the TensorFlow version (https://github.com/NLPLearn/QANet).
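For reference, the EMA being discussed keeps a shadow copy of the weights updated every step, which is why a Keras-side implementation incurs so many extra transfers. A minimal NumPy sketch of the update rule (class name and decay value are illustrative):

```python
import numpy as np

class EMA:
    """Exponential moving average of model weights (shadow copy).
    The shadow weights are swapped into the model at evaluation time."""
    def __init__(self, weights, decay=0.999):
        self.decay = decay
        self.shadow = [w.copy() for w in weights]

    def update(self, weights):
        # shadow = decay * shadow + (1 - decay) * current, per weight tensor
        for s, w in zip(self.shadow, weights):
            s *= self.decay
            s += (1.0 - self.decay) * w

ema = EMA([np.array([1.0])], decay=0.9)
ema.update([np.array([0.0])])
print(ema.shadow[0])  # [0.9]
```

In Keras this update has to run in a callback on every batch, pulling weights between host and device, which is the main source of the slowdown mentioned above.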