Comments (7)
In multi-domain dialogue, a single turn often contains multiple dialogue acts. DQN cannot handle this directly, because it selects one discrete action per step. In ConvLab, the most frequent dialogue act combinations are treated as actions, which cannot cover every possible combination. We welcome contributions that add more RL methods; you can follow the structure of the existing RL policies in ConvLab-2. We look forward to your success!
from convlab-2.
Hey, my focus is on the RL reward function, and there are not many RL policies I can use here: only PPO and PG. Are there any other RL methods I could implement on this platform?
I gathered some statistics, and you are right: there are more than 8000 actions in this multi-domain dataset.
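For anyone who wants to reproduce this kind of count, here is a minimal sketch. It assumes each system turn is already available as a list of `(domain, intent, slot)` tuples; that representation, the helper name `combination_stats`, and the toy data are illustrative assumptions, not ConvLab-2's actual data format or API.

```python
from collections import Counter


def combination_stats(turns, top_k):
    """Count distinct dialogue act combinations and top-k coverage.

    `turns` is a list of turns; each turn is a list of hashable dialogue
    acts (assumed here to be (domain, intent, slot) tuples).
    Returns (number of distinct combinations,
             fraction of turns covered by the top_k most frequent ones).
    """
    # A frozenset makes the combination independent of act order.
    counts = Counter(frozenset(turn) for turn in turns)
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0
    covered = sum(count for _, count in counts.most_common(top_k))
    return len(counts), covered / total


# Toy data, purely illustrative.
toy_turns = [
    [("hotel", "inform", "price")],
    [("hotel", "inform", "price")],
    [("hotel", "inform", "price"), ("hotel", "request", "area")],
    [("train", "request", "day")],
]
n_distinct, coverage = combination_stats(toy_turns, top_k=1)
print(n_distinct, coverage)
```

The coverage figure shows how much of the data a reduced action set would still represent, which helps when choosing how many combinations to keep.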
You could try DQN, A3C, A2C, or any other method you are interested in.
Thanks, but I am not sure DQN will work. In ConvLab there are only 300 action combinations in total, but in ConvLab-2 there are more than 8000. Since both use the same dataset, do you know why ConvLab-2 has so many more combinations?
Hey, do you guys have any clues about the action space? Should I add some kind of action encoder to ConvLab-2, so that I can narrow the action space from 8000 to 300 by considering only the most frequent actions?
In ConvLab, we select the most frequent action combinations from all combinations in the dataset (https://github.com/ConvLab/ConvLab/blob/master/data/multiwoz/da_slot_cnt.json). You can try this approximation in ConvLab-2, too.
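As a rough illustration of this approximation, the sketch below builds a fixed action vocabulary from the most frequent combinations, so that a single-action method such as DQN can pick one index per turn. It does not read `da_slot_cnt.json`; the tuple representation of dialogue acts and the names `build_action_vocab`, `encode`, and `decode` are hypothetical and not part of ConvLab or ConvLab-2.

```python
from collections import Counter


def build_action_vocab(turns, top_k):
    """Map the top_k most frequent act combinations to integer indices."""
    counts = Counter(frozenset(turn) for turn in turns)
    return {combo: idx for idx, (combo, _) in enumerate(counts.most_common(top_k))}


def encode(turn, vocab):
    """Return the action index for a turn, or None if it is not covered."""
    return vocab.get(frozenset(turn))


def decode(index, vocab):
    """Return the set of dialogue acts for an action index."""
    for combo, idx in vocab.items():
        if idx == index:
            return set(combo)
    raise KeyError(index)


# Toy data, purely illustrative: (domain, intent, slot) tuples.
price = ("hotel", "inform", "price")
area = ("hotel", "request", "area")
day = ("train", "request", "day")
toy_turns = [[price], [price], [price], [price, area], [area, price], [day]]

vocab = build_action_vocab(toy_turns, top_k=2)
print(encode([price], vocab), encode([area, price], vocab), encode([day], vocab))
```

Turns whose combination falls outside the vocabulary come back as `None`; how to treat them (drop them, or map them to the closest kept combination) is a design choice this sketch leaves open.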
Please refer to #96