
INFO: Intellectual and Friendly Dialogue Agents Grounding Knowledge and Persona

Source code for the paper "You Truly Understand What I Need: Intellectual and Friendly Dialogue Agents Grounding Knowledge and Persona", accepted at EMNLP 2022 Findings.

1. Setup

1.1 Environmental Setup

The code runs with Python 3.6. All dependencies are listed in requirements.txt:

pip install -r requirements.txt

1.2 Dataset

You can download the FoCus dataset (Persona-Knowledge Chat) here.

1.3 Create a knowledge index

Since we use RAG for dialogue generation, you need to create a knowledge index file before generation.
Before creating the knowledge index, move the FoCus dataset into the data/ folder:

|-- data
    |-- FoCus
        |-- train_focus.json
        `-- valid_focus.json
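As a quick sanity check that the files are in place, you can parse a split with the standard json module (a minimal sketch; `load_focus_split` is a hypothetical helper, and nothing is assumed about the schema beyond the files being valid JSON):

```python
import json

def load_focus_split(path):
    """Parse one FoCus split file, e.g. data/FoCus/train_focus.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Usage (after placing the dataset as shown above):
#   train = load_focus_split("data/FoCus/train_focus.json")
#   valid = load_focus_split("data/FoCus/valid_focus.json")
```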

1) The preprocessing code for creating the raw knowledge is in the knowledge_index folder:

create_knowledge_index_for_github.ipynb

2) The code for creating the knowledge index file is run as below:

python use_own_knowledge_dataset.py --csv_path=<your_csv_file> --output_dir=<your_output_dir>

or you can simply run the shell script:

sh create_knowldege_index.sh

We used the same file as in the Transformers GitHub repository, modified slightly to preprocess the raw knowledge.
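Conceptually, the script embeds each knowledge passage into a dense vector and builds a nearest-neighbour index that RAG queries at generation time. A minimal NumPy sketch of that retrieval step (random vectors stand in for the DPR passage encoder, and `build_index` / `retrieve` are hypothetical names, not functions from the script):

```python
import numpy as np

def build_index(passage_embeddings):
    """L2-normalise passage vectors so inner product equals cosine similarity."""
    norms = np.linalg.norm(passage_embeddings, axis=1, keepdims=True)
    return passage_embeddings / norms

def retrieve(index, query_embedding, k=2):
    """Return indices of the k passages most similar to the query."""
    q = query_embedding / np.linalg.norm(query_embedding)
    scores = index @ q                 # cosine similarity against every passage
    return np.argsort(-scores)[:k]     # top-k by descending score

rng = np.random.default_rng(0)
passages = rng.standard_normal((100, 768))              # stand-ins for DPR embeddings
index = build_index(passages)
query = passages[42] + 0.01 * rng.standard_normal(768)  # a query close to passage 42
print(retrieve(index, query, k=3))
```

The real pipeline stores the embeddings in a FAISS index on disk, which is what the config paths in the next step point at.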

3) After creating the knowledge index for the FoCus dataset, update the following paths in config/rag-tok-base-ct.json:

"data_dir": 
"save_dirpath": 
"knowledge_dataset_path": 
"knowledge_index_path": 
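For example, with the directory layout above, the entries might look like the following (all paths are illustrative placeholders, not values from the repository):

```json
{
  "data_dir": "data/FoCus",
  "save_dirpath": "models/rag-tok-base-ct",
  "knowledge_dataset_path": "knowledge_index/my_knowledge_dataset",
  "knowledge_index_path": "knowledge_index/my_knowledge_dataset_hnsw_index.faiss"
}
```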

2. Training

Before you train the model, please modify the config file.

sh train.sh

3. Evaluation

sh evaluate.sh

Contributors

dlawjddn803, metalchaos8527


Issues

Input Encoding Conflict in PolyEncoder with Mismatched Vocab Sizes

Hi,

I'm encountering an issue that appears to stem from a mismatch in vocab sizes between different parts of the pipeline. In my case, the input encoder handles a vocab size of 50265, while the poly_encoder seems to only cover 30522.

To give some context, here's the input that gets passed to the poly_encoder:

context_input_ids: tensor([[  101,  5320,   625,  1499,  1215,   448,   324, 21978,  3144, 48124,
           534,  6106,  1277, 11936,  7771,    43,   102,  1437,  2264,    16,
             5,  2148,     9,     5,  8410,   116]], device='cuda:0') torch.Size([1, 26])
context_input_masks: tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1]], device='cuda:0') torch.Size([1, 26])

However, upon feeding this into the BERT model within the poly_encoder, I'm hit with the following error:

/opt/conda/conda-bld/pytorch_1670525552411/work/aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [72,0,0], thread: [31,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
CUDA error: device-side assert triggered

This suggests an out-of-range error, which is consistent with a vocab size mismatch. I suspect that the poly_encoder's smaller vocab size of 30522 is causing the failure when handling inputs processed with a larger 50265 vocab.

Any insights into why this mismatch is occurring and how it can be resolved would be greatly appreciated. I am happy to provide further information if required.
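The symptom is consistent with an embedding lookup receiving token ids outside its table: ids produced by a 50265-entry tokenizer are indexed into a 30522-row embedding matrix. A small NumPy sketch of the failure mode and an early guard (the sizes mirror the issue above; `check_token_ids` is a hypothetical helper, not part of this repository):

```python
import numpy as np

SMALL_VOCAB = 30522  # e.g. a BERT-style vocabulary
LARGE_VOCAB = 50265  # e.g. a BART/RoBERTa-style vocabulary

def check_token_ids(ids, vocab_size):
    """Raise early with a clear message instead of a CUDA device-side assert."""
    if ids.max() >= vocab_size:
        raise ValueError(
            f"token id {ids.max()} >= embedding vocab size {vocab_size}; "
            "the tokenizer and the encoder's embedding table do not match"
        )

embedding_table = np.zeros((SMALL_VOCAB, 8))
ids = np.array([101, 48124, 102])  # 48124 comes from the larger tokenizer

try:
    check_token_ids(ids, embedding_table.shape[0])
except ValueError as e:
    print("caught:", e)
```

A guard like this localises the problem on CPU; the usual resolution is to tokenize the poly-encoder's inputs with the tokenizer that matches its own pretrained weights, rather than reusing ids from the generator's tokenizer.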
