This is the dataset for the experiments in "Learning to Rank for Question Oriented Software Text Retrieval".
We provide three datasets here: the Lucene Tag dataset, the Java Tag dataset, and the FAQs of 7 projects.
Each question is stored in its own file, as is each answer.
For the Lucene Tag and Java Tag datasets, the questions and answers are named as follows:
(1) the file name of a question is "QuestionId_Q.txt"
(2) the file name of an answer is "AnswerId_QuestionId_A.txt"
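The naming rules above can be turned into a small parser. The following Python sketch is only an illustration and is not part of the dataset: the function name parse_tag_filename is hypothetical, and the IDs are assumed to be numeric, which the rules above do not state.

```python
import re

# Assumed patterns, derived from the naming rules above.
# Assumption: QuestionId and AnswerId consist of digits only.
QUESTION_RE = re.compile(r"^(\d+)_Q\.txt$")
ANSWER_RE = re.compile(r"^(\d+)_(\d+)_A\.txt$")


def parse_tag_filename(name):
    """Classify a file name from the Lucene Tag or Java Tag dataset.

    Returns ("question", question_id), ("answer", answer_id, question_id),
    or None when the name follows neither pattern.
    """
    m = QUESTION_RE.match(name)
    if m:
        return ("question", m.group(1))
    m = ANSWER_RE.match(name)
    if m:
        # The answer ID comes first, then the ID of the question it answers.
        return ("answer", m.group(1), m.group(2))
    return None
```

Because each answer file name carries its QuestionId, answers can be grouped under their question without opening any file.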
For FAQs, the questions and answers are named as follows:
(1) the file name of a question is "ProjectName-Q"+"QAId"+".txt"
(2) the file name of an answer is "ProjectName-A"+"QAId"+".txt"
(3) the classification results are stored in files named "InterrogativeNames-Q.txt"
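A question and its answer in the FAQs share the same ProjectName and QAId, so they can be paired by file name alone. The Python sketch below is a hedged illustration, not part of the dataset: the function name pair_faq_files is hypothetical, the QAId is assumed to be numeric, and the sample names used with it ("ProjectX", "What") are invented.

```python
import re
from collections import defaultdict

# Assumed pattern for FAQ files: ProjectName, a hyphen, Q or A, then the QAId.
# Assumption: the QAId consists of digits only.
FAQ_RE = re.compile(r"^(?P<project>.+)-(?P<kind>[QA])(?P<qa_id>\d+)\.txt$")


def pair_faq_files(names):
    """Group FAQ file names by (project, qa_id).

    Each value maps "Q" and/or "A" to the matching file name.
    Names that do not fit the pattern, such as the classification
    files (which carry no QAId), are skipped.
    """
    pairs = defaultdict(dict)
    for name in names:
        m = FAQ_RE.match(name)
        if m is None:
            continue
        key = (m.group("project"), m.group("qa_id"))
        pairs[key][m.group("kind")] = name
    return dict(pairs)
```

For example, calling pair_faq_files on the hypothetical names "ProjectX-Q1.txt", "ProjectX-A1.txt" and "What-Q.txt" would group the first two under the key ("ProjectX", "1") and skip the third.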
Thanks.
If you have any questions, please email me: "[email protected]"
Ting Ye
20150516