sordonia / hred-qs

Hierarchical Recurrent Encoder Decoder for Query Suggestion

License: BSD 3-Clause "New" or "Revised" License

Python 99.51% Shell 0.46% Forth 0.03%

hred-qs's Introduction

Hierarchical Recurrent Encoder-Decoder code (HRED) for Query Suggestion.

This code accompanies the paper:

"A Hierarchical Recurrent Encoder-Decoder For Generative Context-Aware Query Suggestion", by Alessandro Sordoni, Yoshua Bengio, Hossein Vahabi, Christina Lioma, Jakob G. Simonsen, Jian-Yun Nie, to appear in CIKM'15.

The pre-print of the paper is available at: http://arxiv.org/abs/1507.02221.

-- Data processing

The dataset must consist of two files:

data.ses: each line is a query session, written as a sequence of tab-separated strings (queries).

data.rnk: each line is a sequence of tab-separated integers giving the rank of the clicked documents for each of the queries. The model does not currently use this file, so each line can be set to a tab-separated list of 0s.

To preprocess the dataset, run:

./convert-text2dict.py data

This will create the preprocessed dataset for training.
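As a rough illustration of the input format described above, the following Python 3 sketch writes a data.ses file and a matching placeholder data.rnk file from an in-memory list of sessions. It is a standalone helper, not part of the repository: the function name write_dataset and the example queries are made up, and writing one 0 per query in data.rnk is an assumption (the description only asks for a tab-separated list of 0).

```python
import os
import tempfile


def write_dataset(sessions, prefix):
    """Write <prefix>.ses and <prefix>.rnk from a list of query sessions.

    Hypothetical helper: `sessions` is a list of lists of query strings.
    """
    with open(prefix + ".ses", "w", encoding="utf-8") as ses, \
            open(prefix + ".rnk", "w", encoding="utf-8") as rnk:
        for queries in sessions:
            # Tabs and newlines are separators in the file format, so collapse
            # any whitespace inside a query to single spaces.
            cleaned = [" ".join(q.split()) for q in queries]
            ses.write("\t".join(cleaned) + "\n")
            # Assumption: one placeholder rank (0) per query in the session.
            rnk.write("\t".join("0" for _ in cleaned) + "\n")


# Made-up example sessions.
sessions = [
    ["cheap flights", "cheap flights to paris", "paris hotels"],
    ["python tutorial", "python list comprehension"],
]
out_dir = tempfile.mkdtemp()
prefix = os.path.join(out_dir, "data")
write_dataset(sessions, prefix)
```

The resulting prefix (here the path ending in "data") is what would then be passed to the conversion script.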

-- Training

Create a prototype by modifying state.py and launch:

python train.py --prototype your_prototype
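The general idea of a named prototype can be sketched as follows. This is only an illustration under assumptions: it supposes that a prototype is a function returning a dictionary of settings and that the name given to --prototype is used to look that function up. The keys (train_file, hidden_dim, batch_size), their values, and the load_prototype helper are hypothetical and are not taken from state.py or train.py.

```python
import argparse


def prototype_state():
    """Hypothetical base configuration; keys and values are placeholders."""
    return {
        "train_file": "path/to/train_data",  # hypothetical key and path
        "hidden_dim": 256,                   # hypothetical key
        "batch_size": 64,                    # hypothetical key
    }


def your_prototype():
    """A custom prototype: start from the base state and override some fields."""
    state = prototype_state()
    state["hidden_dim"] = 512
    state["batch_size"] = 32
    return state


# Explicit registry mapping prototype names to the functions that build them.
PROTOTYPES = {
    "prototype_state": prototype_state,
    "your_prototype": your_prototype,
}


def load_prototype(name):
    """Look up a prototype by name and call it to build the state dictionary."""
    if name not in PROTOTYPES:
        raise ValueError("unknown prototype: %s" % name)
    return PROTOTYPES[name]()


parser = argparse.ArgumentParser()
parser.add_argument("--prototype", default="prototype_state")
# Parse a fixed argument list here so the sketch runs without a command line.
args = parser.parse_args(["--prototype", "your_prototype"])
state = load_prototype(args.prototype)
```

Deriving each prototype from a shared base keeps experiment-specific changes small and easy to compare.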


hred-qs's Issues

Missing evaluation include

Looks like the evaluation file is missing

Traceback (most recent call last):
  File "train.py", line 7, in <module>
    from evaluation import *
ImportError: No module named evaluation

'context_to_indices' is not defined in suggest.py

Hi sordonia, thanks for your comment on the last issue.
However, a NameError occurred when running suggest.py.

Error message:
Traceback (most recent call last):
  File "suggest.py", line 110, in <module>
    main()
  File "suggest.py", line 102, in main
    seqs = context_to_indices(lines, model)
NameError: global name 'context_to_indices' is not defined

documentation

First, I would like to thank you for releasing this great work as open-source.

Would you mind adding some documentation to the project (docstrings for functions and modules, comments in the code, a project wiki, an explanation of the order of execution, etc.) to improve reproducibility?

It is really time-consuming to read the whole project line by line and try to infer the meaning of every variable.

Thank you in advance!

About test results

I trained and tested the model on the AOL data.

However, the results after running sample.py are not very good.

Can you help me analyze them?

Alternatively, could you explain how to use the "Learning to Rank" method in this experiment?

Dataset

Hi,
Could you please provide the processed train and test data that you used in the paper?

background data

Hello sordonia,
you wrote:

"we sort the query log by query timestamp and we use the queries submitted before 1 May, 2006 as our background data to estimate the proposed model and the baselines. The next two weeks of data are used as a training set for tuning the ranking models. The remaining two weeks are split into the validation and the test set."

What does "background data" mean? Why didn't you use the background data in the training phase?

an error

Hi,
When I run your project (python train.py --prototype prototype_test), I encounter the following error. How should I fix it? Thank you!

Traceback (most recent call last):
  File "train.py", line 204, in <module>
    main(args)
  File "train.py", line 93, in main
    model = SessionEncoderDecoder(state)
  File "/home/dixin/work/hred-qs/session_encdec.py", line 586, in __init__
    self.encoder.build_encoder(training_x, xmask=training_hs_mask)
  File "/home/dixin/work/hred-qs/session_encdec.py", line 167, in build_encoder
    f_enc, sequences=[xe, xmask], outputs_info=o_enc_info)
  File "/home/work/anaconda/lib/python2.7/site-packages/theano/scan_module/scan.py", line 1041, in scan
    scan_outs = local_op(*scan_inputs)
  File "/home/work/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line 611, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/home/work/anaconda/lib/python2.7/site-packages/theano/scan_module/scan_op.py", line 538, in make_node
    inner_sitsot_out.type.dtype))
ValueError: When compiling the inner function of scan the following error has been encountered: The initial state (outputs_info in scan nomenclature) of variable IncSubtensor{Set;:int64:}.0 (argument number 2) has dtype float32, while the result of the inner function (fn) has dtype float64. This can happen if the inner function of scan results in an upcast or downcast.
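
The ValueError above reports that the initial state has dtype float32 while the step function returns float64. The snippet below is only a NumPy illustration of how such an upcast can arise and how an explicit cast avoids it. It does not use Theano or the repository code, the names h0, mask, step and step_cast are hypothetical, and whether this is the actual cause in this issue is not confirmed.

```python
import numpy as np

# Hypothetical stand-ins: a float32 recurrent state and a float64 mask.
h0 = np.zeros(4, dtype=np.float32)
mask = np.ones(4, dtype=np.float64)


def step(h, m):
    # Mixing float32 and float64 arrays promotes the result to float64.
    return h * m + 1.0


def step_cast(h, m):
    # Casting the float64 operand to the state's dtype keeps the result float32.
    return h * m.astype(h.dtype) + 1.0


upcast = step(h0, mask)
kept = step_cast(h0, mask)
print(upcast.dtype, kept.dtype)
```

In Theano, the analogous things to check would be the dtypes of the inputs passed to scan and the floatX configuration value.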
