Comments (2)
You're right that we could use some more explanation, or links, for the supported metrics.
For examples of using just one metric, the README links to the test cases, which show more detailed examples than would be appropriate in the README itself:
https://github.com/Maluuba/nlg-eval#usage links to https://github.com/Maluuba/nlg-eval/blob/master/nlgeval/tests/test_nlgeval.py, which has a test case called test_compute_metrics_omit:
https://github.com/Maluuba/nlg-eval/blob/master/nlgeval/tests/test_nlgeval.py#L88
so to test just one metric:

from nlgeval import NLGEval

metric_to_use = 'ROUGE_L'
# Omit every valid metric except the one to keep.
n = NLGEval(metrics_to_omit=NLGEval.valid_metrics - {metric_to_use})
scores = n.compute_individual_metrics(references, hypothesis)

should work.
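A minimal self-contained sketch of that idea (assuming nlg-eval is installed and has been set up with nlg-eval --setup; the example strings here are invented):

from nlgeval import NLGEval

# Omitting the skip-thought and embedding metrics should also avoid
# loading their large models.
n = NLGEval(metrics_to_omit=NLGEval.valid_metrics - {'ROUGE_L'})

# compute_individual_metrics scores one hypothesis string against a
# list of reference strings for the same example.
references = ['the quick brown fox jumps over the lazy dog']
hypothesis = 'a quick brown fox jumped over a lazy dog'
print(n.compute_individual_metrics(references, hypothesis))  # dict with just a 'ROUGE_L' key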
Our paper http://arxiv.org/pdf/1706.09799 describes all the metrics briefly and cites the papers that first proposed them, so you can read those for more detail. In the research community there is not much consensus on which of these metrics works better: people measure correlation with human evaluation to figure out which metrics are more suited to their task, and the results vary a lot, so people usually report several metrics. From what I have observed, BLEU-4 and METEOR are the most widely used, but CIDEr usage has been increasing.
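For example, to report several of the commonly used overlap metrics at once on a small corpus (a sketch under the same assumptions as above, plus Java being available for METEOR; the data is made up):

from nlgeval import NLGEval

# Keep the commonly reported overlap metrics; Bleu_1..Bleu_4 are
# produced together by a single scorer.
keep = {'Bleu_1', 'Bleu_2', 'Bleu_3', 'Bleu_4', 'METEOR', 'CIDEr'}
n = NLGEval(metrics_to_omit=NLGEval.valid_metrics - keep)

# compute_metrics takes a list of reference lists (one inner list per
# reference set, each aligned with the hypotheses) and a list of hypotheses.
references = [['the cat sat on the mat', 'a dog barked at the moon']]
hypotheses = ['the cat is on the mat', 'the dog barked at the moon']
print(n.compute_metrics(references, hypotheses))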
Related Issues (20)
- Add metrics to compute fluency of references HOT 6
- Getting error "ValueError: could not convert string to float: ''" HOT 2
- Why do I only output Bleu when I use it on Mac?
- download for glove 6B fails HOT 3
- ModuleNotFoundError: No module named 'nlgeval' HOT 7
- Problem with "the object oriented API for repeated calls in a script - multiple examples" HOT 4
- _pickle.UnpicklingError: pickle data was truncated HOT 1
- zipfile.BadZipFile: File is not a zip file
- TypeError: compute_individual_metrics() missing 1 required positional argument: 'hyp' HOT 2
- Assertion Error HOT 1
- about the files downloaded HOT 1
- nlg-eval --setup
- BrokenPipeError HOT 1
- CIDEr score evaluates to 0.0 no matter what references and metrics I use
- thanks for the codes! I have a question: should I tokenize the predictions and reference texts before using this api? HOT 1
- nlg-eval --setup error can't download glove.6B.zip HOT 3
- Compatibility with gensim 4 HOT 1
- New releases? HOT 1
- v2.4.0 tag does not have the right version info HOT 1
- "Not found for url" while downloading weights HOT 6