ericwtodd / function_vectors
Function Vectors in Large Language Models (ICLR 2024)
Home Page: https://functions.baulab.info/
Thanks for the great work!

I get the same `clean_nll` and `intervention_nll` when I run `n_shot_eval` with `compute_nll=True` in `eval_utils.py`.

I think this is because `intervention_fv` sets a wrong `idx` in `add_function_vector` for the case of `compute_nll=True` in `intervention_utils.py`. In this case, the input of the model is `nll_inputs`, which contains `target`. The intervention should no longer be applied to the last token of the model input (`nll_inputs`) as in the other cases, but to the last token of the original sentence (`inputs`).
function_vectors/src/utils/intervention_utils.py
Lines 164 to 169 in 54ec3bf
What I think should fix the bug is replacing

```python
intervention_fn = add_function_vector(edit_layer, function_vector.reshape(1, model_config['resid_dim']), model.device)
```

with

```python
if compute_nll:
    idx = -1 - target_len
else:
    idx = -1
intervention_fn = add_function_vector(edit_layer, function_vector.reshape(1, model_config['resid_dim']), model.device, idx=idx)
```
Does this make sense to you? It would also be helpful if you could fix any other code possibly affected by this bug. Thanks!
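For what it's worth, the index arithmetic in the proposed fix can be checked with a tiny dependency-free sketch (`fv_injection_index` is a hypothetical helper written for illustration, not part of the repo):

```python
def fv_injection_index(seq_len, target_len, compute_nll):
    """Return the negative index where the function vector should be added.

    With compute_nll=True the model input is the prompt followed by the
    target tokens, so the last prompt token sits target_len positions
    before the end; otherwise the prompt's last token is simply the last
    input token.
    """
    idx = -1 - target_len if compute_nll else -1
    # Sanity check: idx must still point inside the sequence.
    assert -seq_len <= idx <= -1
    return idx

# Example: a 10-token prompt followed by a 3-token target (13 tokens total).
print(fv_injection_index(seq_len=13, target_len=3, compute_nll=True))   # -4
print(fv_injection_index(seq_len=13, target_len=3, compute_nll=False))  # -1
```

With 13 input tokens, index -4 is position 9, i.e. the last of the 10 prompt tokens — exactly where the other code paths apply the vector.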
I saw in Section 3.1 and Appendix B (Table 8) of your paper that you experimented with different input formats and tested robustness. Did you do that by changing the `--prefixes` argument of `src/portability_eval.py`?
I am assuming that you used causal mediation to identify the attention heads, but do you have dedicated scripts to do that? Thanks in advance for your help!
Great work! Really enjoyed reading the paper & already have some followup ideas.
Thanks for publishing your code & data!
Best,
Chris
Thanks for the great work!

I'm trying to recompute function vectors for GPT-NeoX (and for Pythia models, using the same setup you have for GPT-NeoX). However, after running `compute_indirect_effect.py`, I get all-zero CIE matrices (`indirect_effect.pt`).

I think this is because the ablation intervention is not working on these models: the clean probs and the intervention probs are always equal, so the difference is always 0. Other models I tried (GPT-2, LLaMA) don't have this issue.

Do you know how to make the intervention work for GPT-NeoX models? Thanks!
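One way to narrow this down, assuming the intervention is installed by module name (as baukit-style tracing does): check that the names the hook asks for actually exist among `model.named_modules()` — if none match, the hook silently never fires, and clean and intervened probabilities come out identical, giving exactly an all-zero CIE. A dependency-free sketch, where the name patterns are illustrative examples of the usual Hugging Face layouts:

```python
# Illustrative module-name layouts (GPT-2 vs GPT-NeoX style attention paths).
GPT2_STYLE = [f"transformer.h.{i}.attn" for i in range(2)]
NEOX_STYLE = [f"gpt_neox.layers.{i}.attention" for i in range(2)]

def missing_hooks(model_modules, requested):
    """Names the intervention asks for that the model does not actually have."""
    return [name for name in requested if name not in model_modules]

# Hooking a NeoX-style model with GPT-2-style names matches nothing,
# so every intervention is a silent no-op.
print(missing_hooks(NEOX_STYLE, GPT2_STYLE))
```

If the real check (against `[n for n, _ in model.named_modules()]`) reports missing names, the fix is to supply the model's own attention-module paths in the config.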
In `eval_utils.py`, function `sentence_eval` (lines 138-189), why are the logits taken at the last token of `sentence` (the ICL prompt), and not at the last token of `target_completion` (which consists of the sentence plus the target)? I have attached the relevant lines below for reference:

```python
# Line 157
inputs = tokenizer(sentence, return_tensors='pt').to(device)
# Line 158
original_pred_idx = len(inputs.input_ids.squeeze()) - 1
# Line 170
clean_output = output.logits[:,original_pred_idx,:]
# Line 181
clean_output = model(**inputs).logits[:,-1,:]
```
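For context on the indexing: in a causal LM, the logits at position t predict the token at position t+1, so the distribution over the first target token lives at the prompt's last position, `len(prompt) - 1`, not at the last position of prompt-plus-target. A toy one-hot "model" (purely illustrative, not the repo's code) makes this concrete:

```python
prompt = ["France", ":", "Paris", ",", "Italy", ":"]
target = ["Rome"]
vocab = ["Paris", "Rome", "Italy", ":", ",", "France", "<pad>"]

def toy_logits(tokens):
    """logits[t] is a one-hot over the token AFTER position t, mimicking
    the next-token shift in causal language models."""
    out = []
    for t in range(len(tokens)):
        nxt = tokens[t + 1] if t + 1 < len(tokens) else "<pad>"
        out.append([1.0 if v == nxt else 0.0 for v in vocab])
    return out

full = prompt + target
logits = toy_logits(full)
# The prediction for the FIRST target token sits at the prompt's last position.
pred = vocab[max(range(len(vocab)), key=logits[len(prompt) - 1].__getitem__)]
print(pred)  # Rome
```

Reading the logits at the very last position of prompt+target would instead give the prediction for the token *after* the target, which is not what the evaluation needs.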
Hello, could you provide the code for the Section 3.2 experiment reported in Table 6? Thank you so much.