Hi @logan-markewich, thanks for your question; I am always happy to discuss these things.
When evaluating Relation Extraction, especially in the end-to-end setting, the comparison is always with work that extracts both entities and relations.
Of course, an interesting "ablation" is to assume NER is solved and see how the model performs on RE alone, to study how much of the error rate comes from entity recognition. With REBEL there is the particularity that both tasks are decoded simultaneously, so evaluating RE in isolation is tricky, and it wasn't the point of our work.
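For context on the "decoded simultaneously" point: REBEL linearizes all triplets into one output string using special tokens (`<triplet>`, `<subj>`, `<obj>`), so entity mentions and relation labels come out of a single decoder pass and cannot be scored independently without first parsing that string. A simplified parser for this format (loosely following the parsing logic shown on the `Babelscape/rebel-large` model card; this sketch handles the single-tail-per-head case):

```python
def extract_triplets(text: str):
    """Parse REBEL's linearized output into (head, relation, tail) triplets.

    Expected format: "<triplet> head <subj> tail <obj> relation ..."
    """
    triplets = []
    head = tail = relation = ""
    current = None
    for token in text.replace("<s>", "").replace("<pad>", "").replace("</s>", "").split():
        if token == "<triplet>":
            # a new head entity starts; flush the previous triplet if complete
            if relation:
                triplets.append((head.strip(), relation.strip(), tail.strip()))
                relation = ""
            head = tail = ""
            current = "head"
        elif token == "<subj>":
            # a new tail entity starts (a head can have several tails)
            if relation:
                triplets.append((head.strip(), relation.strip(), tail.strip()))
                relation = ""
            tail = ""
            current = "tail"
        elif token == "<obj>":
            current = "relation"
        elif current == "head":
            head += " " + token
        elif current == "tail":
            tail += " " + token
        elif current == "relation":
            relation += " " + token
    if relation:
        triplets.append((head.strip(), relation.strip(), tail.strip()))
    return triplets

triplets = extract_triplets("<s><triplet> Punta Cana <subj> Higuey <obj> located in</s>")
# [("Punta Cana", "located in", "Higuey")]
```

Because NER and RE errors are entangled in this one string, there is no natural place to "plug in" gold entities, which is exactly why the separate evaluation is tricky.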
Nevertheless, there is Relation Classification (RC), which is more akin to what you describe. Benchmarks such as DocRED in its original setting, or Re-TACRED, look at RC, and we also evaluate REBEL on it by marking the spans of the entities in the input. However, since pre-training wasn't designed for this, there isn't much of a contribution from REBEL there, just competitive results that serve as a comparison. Basically, if I needed a system for Relation Classification between pairs of given entities, I would not use REBEL.
Many systems perform RE in a pipeline fashion, where you first perform NER and then RE. Therefore, I think it would be fair to compare what you describe with such a system on, let's say, CONLL-04, by providing the gold entity spans to its RE module. The table-sequence paper actually has a very nice ablation section where they perform this experiment on ACE-05: in Table 3 they get around 3 points of improvement on RE when the gold entities are provided. And you are in luck, because someone had the same question on their GitHub, and you can run the same experiment on other datasets (like CONLL-04): LorrinWWW/two-are-better-than-one#18
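The fairness question above ultimately comes down to the scoring criterion: in the strict end-to-end setting, a predicted triplet only counts as correct if the head, relation, and tail all match the gold exactly, so NER errors directly hurt RE scores. A minimal micro-averaged scorer under that convention (function and data names are mine, not from any of the cited repos):

```python
def triplet_micro_f1(gold, pred):
    """Micro precision/recall/F1 over (head, relation, tail) triplets.

    gold, pred: lists of sets of triplets, one set per sentence.
    A prediction counts only on an exact match of all three elements
    (the 'strict' end-to-end criterion).
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)  # exact matches
        fp += len(p - g)  # spurious predictions
        fn += len(g - p)  # missed gold triplets
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# toy example: one correct triplet, one wrong prediction, one missed gold
gold = [{("John", "works_for", "Acme"), ("Acme", "based_in", "Boston")}]
pred = [{("John", "works_for", "Acme"), ("John", "lives_in", "Boston")}]
p, r, f = triplet_micro_f1(gold, pred)  # p = r = f = 0.5
```

A system given gold entity spans can only fail on the relation label, which is why feeding gold spans to a pipeline's RE module (as in the table-sequence ablation) isolates the RE component's contribution.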
Sorry if I went off on a tangent. To sum up, imo, directly comparing a system that considers NER solved against REBEL or the other benchmarks in the paper wouldn't be entirely fair, but you could simply take some previous work and use its RE module once the entities are identified, so that the comparison is fair. If you still want to compare with REBEL, you could follow the setup we had for Re-TACRED: mark the entity spans in the input and re-train on a dataset like CONLL-04. I recall trying that myself at some point and the results didn't differ much (i.e., NER is usually not the bottleneck), but I can't remember more details.
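For the Re-TACRED-style setup mentioned above, "marking the entity spans" just means wrapping the head and tail mentions in marker tokens before feeding the text to the model, so the model knows which pair to classify. A sketch of that preprocessing (the marker strings here are illustrative, not REBEL's actual special tokens):

```python
def mark_entities(text, head_span, tail_span,
                  head_tokens=("[E1]", "[/E1]"), tail_tokens=("[E2]", "[/E2]")):
    """Insert marker tokens around two character-level (start, end) spans."""
    # insert from the rightmost span first so earlier offsets stay valid
    spans = sorted(
        [(head_span, head_tokens), (tail_span, tail_tokens)],
        key=lambda item: item[0][0],
        reverse=True,
    )
    for (start, end), (open_tok, close_tok) in spans:
        text = text[:start] + open_tok + " " + text[start:end] + " " + close_tok + text[end:]
    return text

sent = "Bill Gates founded Microsoft."
marked = mark_entities(sent, head_span=(0, 10), tail_span=(19, 28))
# "[E1] Bill Gates [/E1] founded [E2] Microsoft [/E2]."
```

Re-training on input preprocessed this way turns the end-to-end model into a relation classifier over given entity pairs, which is the comparable setting for a system that assumes NER is solved.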
Finally, I can't recommend enough checking the following paper, which we also cite in our work:
"Let's Stop Incorrect Comparisons in End-to-end Relation Extraction!". I think it is a crucial paper for understanding the current state of RE (despite being a bit old now) and how many times comparisons have not been done properly.
Hope my opinion helps, best of luck in your work and let me know if I can help with anything else.
from rebel.
Wow, thanks for the details and links! Super helpful! Definitely answers my questions and should help a ton with my work.