Source code for the paper "You Truly Understand What I Need: Intellectual and Friendly Dialogue Agents grounding Knowledge and Persona", accepted at EMNLP 2022 Findings.
The code runs with Python 3.6. All dependencies are listed in requirements.txt and can be installed with:
pip install -r requirements.txt
You can download the FoCus dataset (Persona-Knowledge Chat) here.
Since we use RAG for dialogue generation, you need to create a knowledge index file before running generation.
1) The preprocessing code for creating the raw knowledge is in the knowledge_index folder:
create_knowledge_index_for_github.ipynb
2) The code for creating a knowledge index file is as below
use_own_knowledge_dataset.py
We used the same file as in the transformers GitHub repository, but modified it slightly to preprocess the raw knowledge.
3) After creating the knowledge index for the FoCus dataset, set 'knowledge_dataset_path' and 'knowledge_index_path' in the config folder to your own paths.
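For orientation, the upstream transformers example script, to the best of our knowledge, reads a tab-separated file with title and text columns and splits each text into passages of roughly 100 words before embedding and indexing them. The sketch below reproduces only that splitting step with the standard library. The sample row and the helper names are hypothetical, and the modified script in this repository may behave differently.

```python
import csv
import io


def split_text(text, n=100, sep=" "):
    """Split text into chunks of at most n whitespace-separated words."""
    words = text.split(sep)
    return [sep.join(words[i:i + n]).strip() for i in range(0, len(words), n)]


def split_documents(rows, n=100):
    """Turn (title, text) rows into a flat list of (title, passage) rows."""
    passages = []
    for title, text in rows:
        for passage in split_text(text, n):
            passages.append((title or "", passage))
    return passages


# Hypothetical raw knowledge in tab-separated form: title <TAB> text
raw_tsv = "Example Landmark\t" + " ".join(f"w{i}" for i in range(250)) + "\n"
rows = [tuple(r) for r in csv.reader(io.StringIO(raw_tsv), delimiter="\t")]

passages = split_documents(rows)
print(len(passages))  # 250 words -> passages of 100, 100 and 50 words
```

Embedding the passages and building the actual index is left to use_own_knowledge_dataset.py; this sketch only shows what one raw knowledge row turns into.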
Before you train the model, please modify the config file.
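A quick pre-flight check can catch a stale path before a long training run starts. The sketch below is a minimal example that assumes the config values have already been loaded into a plain dict; the function name missing_paths and the demo values are hypothetical, and loading the config is omitted because it depends on the format used in the config folder.

```python
import os
import tempfile

# Keys named in step 3 above
REQUIRED_KEYS = ("knowledge_dataset_path", "knowledge_index_path")


def missing_paths(config):
    """Return the required keys that are unset or point to nothing on disk."""
    problems = []
    for key in REQUIRED_KEYS:
        path = config.get(key)
        if not path or not os.path.exists(path):
            problems.append(key)
    return problems


# Demo: a temporary directory stands in for the knowledge dataset,
# while the index path deliberately points to a file that does not exist.
with tempfile.TemporaryDirectory() as tmp:
    demo_config = {
        "knowledge_dataset_path": tmp,
        "knowledge_index_path": os.path.join(tmp, "missing_index.faiss"),
    }
    result = missing_paths(demo_config)

print(result)
```

An empty list means both paths exist; any key in the list should be fixed in the config before running the training script.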
sh train.sh
sh evaluate.sh