
Comments (3)

JulianSlzr commented on May 13, 2024

Thanks for the suggestion! I updated to Transformers 3.3.1 and added DistilBERT & ALBERT. The main change is defining a `*BertForMaskedLMOptimized` class for speed, as sketched below. You can follow my example to add support for other MLMs. Pull requests welcome 🙂.
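The pattern looks roughly like this (a sketch against the Transformers 3.3.1 BERT classes, not the repo's exact code; `select_positions` is an illustrative argument name):

```python
import torch
from transformers import BertForMaskedLM

class BertForMaskedLMOptimized(BertForMaskedLM):
    def forward(self, input_ids, attention_mask=None, select_positions=None):
        # Run the encoder as usual to get per-token hidden states
        sequence_output = self.bert(
            input_ids, attention_mask=attention_mask
        )[0]  # (batch, seq_len, hidden)
        if select_positions is not None:
            # Gather only the hidden states we need scored (e.g. the [MASK]
            # slots), so the vocab-sized LM head runs on far fewer tokens
            idx = select_positions.unsqueeze(-1).expand(
                -1, -1, sequence_output.size(-1)
            )
            sequence_output = torch.gather(sequence_output, 1, idx)
        # LM head: transform + project to vocabulary logits
        prediction_scores = self.cls(sequence_output)
        return (prediction_scores,)
```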

I also ran the two on BLiMP. Though ALBERT outperforms RoBERTa on downstream tasks like GLUE, its BLiMP scoring is only on par with BERT's. Likewise, DistilBERT performs similarly to BERT on GLUE, but is much worse on BLiMP (78% vs. 84%).

Possible takeaways:

  • DistilBERT's knowledge-distillation objective and its initialization from alternating BERT layers have an effect. Performance on quantifiers and island effects degrades significantly (maybe the knowledge was encoded in the layers that were skipped? maybe the output probabilities are now too soft?).
  • Matching the pre-training corpus to the evaluation data is likely more important. We saw this with LibriSpeech in our paper. Here, ALBERT and BERT are trained on the same corpus, while RoBERTa is trained on a larger corpus that may cover BLiMP better.
Per-phenomenon BLiMP accuracies (rounded to three decimals):

| BLiMP phenomenon       | distilbert-base-cased | albert-xxlarge-v2 |
|------------------------|-----------------------|-------------------|
| anaphor_agreement      | 0.983                 | 0.956             |
| argument_structure     | 0.786                 | 0.838             |
| binding                | 0.734                 | 0.791             |
| control_raising        | 0.779                 | 0.865             |
| determiner             | 0.970                 | 0.940             |
| ellipsis               | 0.915                 | 0.874             |
| filler_gap             | 0.746                 | 0.819             |
| irregular_forms        | 0.956                 | 0.926             |
| island_effects         | 0.549                 | 0.750             |
| npi_licensing          | 0.790                 | 0.912             |
| quantifiers            | 0.590                 | 0.674             |
| subject_verb_agreement | 0.897                 | 0.881             |
| **overall**            | **0.783**             | **0.844**         |
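To reproduce, scoring follows the README's interface (a sketch assuming `get_pretrained` accepts these Transformers model names):

```python
import mxnet as mx
from mlm.models import get_pretrained
from mlm.scorers import MLMScorerPT

ctxs = [mx.cpu()]  # or, e.g., [mx.gpu(0)]
model, vocab, tokenizer = get_pretrained(ctxs, 'distilbert-base-cased')
scorer = MLMScorerPT(model, vocab, tokenizer, ctxs)
# Higher (less negative) pseudo-log-likelihood = more acceptable sentence
print(scorer.score_sentences(["Hello world!"]))
```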


gerardb7 commented on May 13, 2024

Grand job, thanks a lot!


Ago3 commented on May 13, 2024

Hi,

I'm extending the framework to include another PyTorch model. When using MLMScorerPT, we don't need to pass a vocab, do we? I couldn't find any function where it is actually used.
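For concreteness, this is the kind of call I mean (illustrative only; I haven't confirmed that passing `None` is safe):

```python
from mlm.scorers import MLMScorerPT

# model, tokenizer, ctxs loaded as usual; is None acceptable here,
# given that vocab appears unused for PyTorch models?
scorer = MLMScorerPT(model, None, tokenizer, ctxs)
scores = scorer.score_sentences(["Hello world!"])
```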

Thank you!

PS: Very cool work :)

