mutual's Introduction

  • 👋 Hi, I’m @Nealcly, a senior researcher at the Natural Language Processing Center, Tencent AI Lab.
  • 👀 Please email me ([email protected]) if you would like to work with us.

mutual's Issues

about matching score

Hello, thanks for the great dataset. You said that "a fully-connected layer is used for transforming the [CLS] token representation to the matching score". Could you explain this step in more detail? Thank you very much.
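For readers with the same question, here is a minimal sketch of one common way such a step is set up. It is not confirmed to be the authors' exact implementation. It assumes each (context, candidate) pair is encoded separately, the [CLS] vector of each pair goes through a fully-connected layer with a single output unit, and the four resulting scores are compared with a softmax. The random vectors stand in for encoder output, and `HIDDEN_SIZE`, `matching_score`, `weights`, and `bias` are hypothetical names chosen for illustration.

```python
import math
import random

HIDDEN_SIZE = 8   # assumption: toy size for illustration only
NUM_CHOICES = 4   # four candidate responses per context, as in the R4_1 / R4_2 metrics


def matching_score(cls_vector, weights, bias):
    """Fully-connected layer with one output unit: score = w . h_cls + b."""
    return sum(w * h for w, h in zip(weights, cls_vector)) + bias


def softmax(scores):
    """Turn the per-candidate scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
# Stand-ins for the encoder output: one [CLS] vector per (context, candidate) pair.
cls_vectors = [
    [random.gauss(0, 1) for _ in range(HIDDEN_SIZE)] for _ in range(NUM_CHOICES)
]
# Hypothetical learned parameters of the scoring layer, shared by all candidates.
weights = [random.gauss(0, 0.1) for _ in range(HIDDEN_SIZE)]
bias = 0.0

scores = [matching_score(h, weights, bias) for h in cls_vectors]
probs = softmax(scores)
predicted = max(range(NUM_CHOICES), key=lambda i: scores[i])
```

Under this setup the layer maps a hidden-size vector to one scalar per candidate, and the candidate with the highest score is the prediction.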

A bug

Description of bug:

I used run_multiple_choice.py and utils_multiple_choice.py to fine-tune RoBERTa on the train set and evaluate it on the dev set, and I found that the logits are identical for every choice of every example. I cannot find where my error is. Is there a parameter that I forgot to set or adjust?

To reproduce the issue

I ran run_multiple_choice.py on one GPU with:

python run_multiple_choice.py \
--data_dir './MuTual/data/mutual' \
--model_type roberta \
--model_name_or_path roberta-large \
--task_name mutual \
--output_dir ./OutputRoberta \
--do_train \
--evaluate_during_training \
--overwrite_cache \
--overwrite_output_dir \
--per_gpu_train_batch_size 1

Because I set --evaluate_during_training, I have evaluation results for three epochs:

1st epoch
05/29/2020 20:55:47 - INFO - main - MRR: = 0.6967644845748685
05/29/2020 20:55:47 - INFO - main - R4_1 = 0.23589164785553046
05/29/2020 20:55:47 - INFO - main - R4_2: = 0.6173814898419865
05/29/2020 20:55:47 - INFO - main - eval_loss = 1.3862943617073265

2nd epoch
05/29/2020 21:18:06 - INFO - main - MRR: = 0.8475357411587667
05/29/2020 21:18:06 - INFO - main - R4_1 = 0.24379232505643342
05/29/2020 21:18:06 - INFO - main - R4_2: = 0.2652370203160271
05/29/2020 21:18:06 - INFO - main - eval_loss = 1.3862943660031568

3rd epoch
05/29/2020 21:40:22 - INFO - main - MRR: = 1.0
05/29/2020 21:40:22 - INFO - main - R4_1 = 0.23927765237020315
05/29/2020 21:40:22 - INFO - main - R4_2: = 0.23927765237020315
05/29/2020 21:40:22 - INFO - main - eval_loss = 1.3862942457199097

Because the MRR, R4_1 and R4_2 values looked wrong, I printed the loss and logits during evaluation and found that the logits are the same for all texts and all choices. For example, in the third epoch the logits are [[2.822, 2.822, 2.822, 2.822], [2.822, 2.822, 2.822, 2.822], ...].
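As a quick sanity check on the reported numbers, the sketch below computes the softmax cross-entropy for four identical logits. It assumes the evaluation loss is the standard cross-entropy over the four choices. The helper `cross_entropy` is written here for illustration and is not taken from the script.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example: logsumexp(logits) - logits[label]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


logits = [2.822, 2.822, 2.822, 2.822]  # values reported for the third epoch
loss = cross_entropy(logits, label=0)
print(loss)          # ln(4), about 1.386294
print(math.log(4))   # same value
```

Identical logits give a uniform distribution over the four choices, so the loss is ln(4) ≈ 1.386294 whichever choice is correct. That matches the eval_loss values in the logs, which is consistent with the model producing the same score regardless of input. It does not by itself explain why.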

Parameters

Would you be willing to share the detailed parameters used for RoBERTa on MuTual? Thanks a lot for your help.

Environment

Hi, thanks for your contribution and the baselines. I would like to rerun the baselines. Could you please share the environment you used? Thanks a lot.