Comments (6)
We have added this functionality now, and the instructions are in the updated README.md. The update is also pasted below for reference. NLGEval() will load all the models you require, and then you can just call the evaluate method repeatedly. Since the models are no longer loaded on each call, this should be efficient for your use case. Let us know if there is any other issue. Thanks for bringing this use case to our notice!
Object-oriented API for repeated calls in a script:

```python
from nlgeval import NLGEval

nlgeval = NLGEval()  # loads the models
metrics_dict = nlgeval.evaluate(references, hypothesis)
```

where `references` is a list of ground-truth reference text strings and `hypothesis` is the hypothesis text string.
from nlg-eval.
Thank you very much! This will be very useful for my planned work.
Hi.
Calling compute_individual_metrics is going to load the models each time, which will be very slow if your corpus is large. compute_metrics loads them only once, so it is much more efficient. Your approach of creating files up to the maximum reference number is going to be much faster in that case.
If you are generating sentences, and the loading time of the models is less than the generation time, and you can do that in parallel, then calling compute_individual_metrics might be better. It depends on your setup; hard to say without knowing more about it.
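The trade-off above boils down to where the expensive model load happens. A minimal sketch of the load-once pattern, with made-up names (load_model, Evaluator, and the dummy metric are illustrative, not nlg-eval's API):

```python
# Toy sketch: the expensive "model load" is paid once in __init__,
# not on every call, which is why the object-oriented API is faster
# for repeated evaluations.
import time

def load_model():
    time.sleep(0.01)  # stand-in for an expensive model load
    return {"loaded": True}

class Evaluator:
    def __init__(self):
        self.model = load_model()  # paid once

    def evaluate(self, hyp, ref):
        # Dummy metric: count of shared tokens (not a real NLG metric).
        return len(set(hyp.split()) & set(ref.split()))

ev = Evaluator()  # one load
scores = [ev.evaluate("a cat sat", "a cat slept") for _ in range(100)]  # no reloads
```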
Thanks!
Is there a way to use compute_metrics on arrays instead of files? It's just easier to handle that way. (Plus, I don't have to wait on file reads and writes every time; although it is just a few MB, over something like 1000 epochs it adds up.)
Hi @kracwarlock ,
I followed @AmitMY 's method: computing the metrics for every tuple of (hyp, refs) and eventually averaging them. However, the averaged performance is worse than evaluating on the entire hyp and ref lists (via the nlg-eval standalone command). Which one is correct?
I think it's because for BLEU we're using option='closest':
https://github.com/Maluuba/nlg-eval/blob/master/nlgeval/pycocoevalcap/bleu/bleu.py#L40
Sorry, I don't know which one is "correct".
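Beyond the 'closest' reference-length option, there is a more general reason the two numbers differ: corpus-level metrics like BLEU pool n-gram counts across all sentences before dividing, while per-sentence averaging divides first and then averages, and those two orders of operations generally give different results. A toy illustration with made-up counts (not actual BLEU):

```python
# Hypothetical per-sentence n-gram counts: (matched n-grams, total n-grams).
matches = [1, 9]
totals = [2, 10]

# Corpus-level precision: pool counts across sentences, then divide.
corpus_level = sum(matches) / sum(totals)  # 10/12

# Per-sentence averaging: divide per sentence, then average the ratios.
averaged = sum(m / t for m, t in zip(matches, totals)) / len(matches)  # (0.5 + 0.9) / 2

print(round(corpus_level, 3), round(averaged, 3))  # 0.833 0.7
```

Sentences with few n-grams get the same weight as long ones under averaging, which is why the averaged score can come out lower (or higher) than the corpus-level one.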