Thanks for publishing this excellent work. If I understand correctly, you run LASER in


Generic model? about laser HOT 3 CLOSED

forresti commented on May 20, 2024

Generic model?


Comments (3)

forresti commented on May 20, 2024

Thanks so much!!!


dkmisra commented on May 20, 2024

That is correct. We do pick LASER hyperparameters for each task, and this is important for seeing the huge gains we report. There is an alternative method called LaserRMT, which is not from us, that provides a different, task-agnostic way to select hyperparameters. I haven't tried it myself, but its authors have reported some results.

The simplest way to try LASER across a range of tasks is to compute a meta-score on a task like AGIEval and then use it to select the hyperparameters. I am optimistic that we will still see gains across a range of tasks, since we find that the gains typically all come from intervening in the later MLP layers, so the optimal hyperparameters tend to follow a pattern. The gains might be more modest than when focusing on a single task, though.
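The selection idea above could be sketched roughly as follows. This is only an illustration, not code from the LASER repository: it assumes the meta-score is the mean of per-task scores, and the names `Setting`, `select_by_meta_score`, `toy_task_scores`, and the matrix labels are all hypothetical. The toy evaluator returns made-up numbers so the sketch runs; in practice it would be replaced by a real evaluation of the edited model.

```python
from itertools import product
from statistics import mean
from typing import Callable, Dict, Iterable, Optional, Tuple

# One candidate LASER setting: (layer index, matrix name, fraction of rank kept).
# The tuple layout and the matrix names are assumptions made for this sketch.
Setting = Tuple[int, str, float]


def select_by_meta_score(
    settings: Iterable[Setting],
    task_scores: Callable[[Setting], Dict[str, float]],
) -> Tuple[Setting, float]:
    """Return the setting with the highest mean score across tasks."""
    best_setting: Optional[Setting] = None
    best_score = float("-inf")
    for setting in settings:
        # Assumed meta-score: the unweighted mean over all evaluated tasks.
        meta = mean(task_scores(setting).values())
        if meta > best_score:
            best_setting, best_score = setting, meta
    if best_setting is None:
        raise ValueError("no candidate settings were provided")
    return best_setting, best_score


def toy_task_scores(setting: Setting) -> Dict[str, float]:
    """Stand-in evaluator with made-up numbers (NOT measurements)."""
    layer, matrix, keep = setting
    base = 0.5 + 0.01 * layer
    bonus = 0.05 if matrix == "mlp_out" else 0.0
    return {"task_a": base + bonus, "task_b": base - 0.1 * keep}


# Small hypothetical grid: two layers, two matrix types, two rank fractions.
candidates = list(product([20, 26], ["attn_out", "mlp_out"], [0.01, 0.1]))
best_setting, best_meta = select_by_meta_score(candidates, toy_task_scores)
print(best_setting, round(best_meta, 4))
```

Because the search only needs a callable that maps a setting to per-task scores, the same loop works whether the scores come from one benchmark's sub-tasks or from several separate datasets.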

For most experiments in our paper, we apply LASER to a single layer, and in fact we apply a single LASER intervention, i.e., we edit only a single matrix. We have an experiment on GPT-J + CounterFact where we composed multiple LASER interventions; see the paragraph "Composing reductions across layers" in the paper. @pratyushasharma has released a script here with details for this experiment, and the upcoming refactoring will support composing LASER in a proper, generalizable way.
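To make the difference between a single intervention and a composed one concrete, here is a minimal NumPy sketch. It assumes, following the paper's description of LASER as rank reduction, that one intervention replaces one weight matrix with a truncated-SVD approximation. It is not the released script: the function names, the dictionary of weights, and keys such as `layer26.mlp_out` are hypothetical, and the matrices are random placeholders rather than model weights.

```python
import numpy as np


def low_rank_approx(weight: np.ndarray, keep_fraction: float) -> np.ndarray:
    """Return the best rank-k approximation of `weight` via truncated SVD."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Keep at least one singular component.
    k = max(1, int(round(keep_fraction * s.size)))
    return (u[:, :k] * s[:k]) @ vt[:k, :]


def apply_interventions(weights: dict, interventions: list) -> dict:
    """Apply (matrix_name, keep_fraction) reductions one after another.

    Returns a new dict; matrices that are not named are left untouched.
    """
    edited = dict(weights)
    for name, keep_fraction in interventions:
        edited[name] = low_rank_approx(edited[name], keep_fraction)
    return edited


rng = np.random.default_rng(0)
# Random placeholder matrices standing in for model weights.
weights = {
    "layer5.mlp_out": rng.standard_normal((8, 6)),
    "layer26.mlp_out": rng.standard_normal((8, 6)),
}

# A single intervention edits exactly one matrix.
single = apply_interventions(weights, [("layer26.mlp_out", 0.5)])

# A composed intervention edits several matrices in sequence.
composed = apply_interventions(
    weights, [("layer5.mlp_out", 0.5), ("layer26.mlp_out", 0.5)]
)
```

Applying the reductions is the easy part; choosing which matrices to combine, and in what order to search for them, is where the per-experiment details in the released script matter.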


dkmisra commented on May 20, 2024

Related to #19

