Comments (5)
Hi, this file has been updated recently; we will check the code ASAP.
from easyedit.
Hello, thank you very much for your attention to EasyEdit.
I believe there is a slight misunderstanding of this part of the code.
We first concatenate the prompt and target_new and then convert the result into tokens. All of these tokens are then fed into the model for generation, and finally we truncate the prompt's length from the beginning of the sequence, so that only the target portion remains. When we perform the editing, we edit the model to generate ' ' + target_new.
Therefore, I don't think there is a bug with this part.
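The concatenate-then-truncate step described above can be sketched as follows. This is only an illustration, not EasyEdit's actual code: `toy_tokenize` is a hypothetical stand-in for a GPT-2-style tokenizer (it simply keeps a leading space attached to each word), and `target_span` is a made-up helper name. It assumes the prompt and target_new are joined with a single space.

```python
import re

def toy_tokenize(text):
    # Hypothetical stand-in for a GPT-2-style tokenizer: every token is a word
    # with its preceding space (if any) kept attached.
    return re.findall(r" ?[^ ]+", text)

def target_span(prompt, target_new):
    # Tokenize the prompt alone to learn how many positions it occupies.
    prompt_tokens = toy_tokenize(prompt)
    # Tokenize the concatenated text (assumption: joined by a single space).
    full_tokens = toy_tokenize(prompt + " " + target_new)
    # Truncate the prompt's length from the beginning; the rest is the target.
    return full_tokens[len(prompt_tokens):]

prompt = "The capital of France is"
target_new = "Lyon"
span = target_span(prompt, target_new)
print(span)
```

With a real subword tokenizer the token boundaries would differ, but the same idea applies: the positions after the prompt's length are the ones compared against the target.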
Last night, I borrowed EasyEdit's evaluation code for my own method.
When I ran this code, the input_ids produced by tok(prompt_target) and tok(targets) were different, which affected the evaluation results.
I was very confused at the time, and found that this inconsistency was caused by the way the two are concatenated.
However, if the editing process also concatenates them with a space, there is no mistake, since editing and evaluation are consistent.
I am very sorry for this misunderstanding.
But again, would it be better if prompts and targets were passed as a tuple? The tokenized results of tok(prompt_target) and tok(targets) would then be the same.
This type of concatenation is already used in many places, e.g. ("premise", "hypothesis") to avoid misunderstandings.
Hi Songlin,
No worries at all. EasyEdit thrives on open sharing and discussion, and your contributions are greatly appreciated. Regarding your concerns, let's see if we can reach the following consensus:
- The ("premise", "hypothesis") input format you mentioned was indeed very popular during the BERT era because BERT's pre-training objective included Next Sentence Prediction. Therefore, the input format was typically (text_a, text_b). However, in the GPT (autoregressive) era, almost all language models model a single sentence, text.
- Hence, we cannot treat the prompt and target_new as two independent sentences; instead, target_new should be seen as a continuation of the prompt's text. This is also the optimization goal for all model editing. You can refer to GRACE (NeurIPS 2023), which also concatenates the two rather than treating them as a tuple. Therefore, we adjust target_new to ' ' + target_new to ensure that the tokenizer does not treat target_new as the beginning of a new sentence.
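To make the role of the leading space concrete, here is a small sketch. It is an assumption-laden illustration rather than EasyEdit's code: `toy_tokenize` is a hypothetical stand-in for a GPT-2-style tokenizer that attaches the preceding space to each word, which is enough to show why a bare target_new and ' ' + target_new tokenize differently.

```python
import re

def toy_tokenize(text):
    # Hypothetical stand-in for a GPT-2-style tokenizer: a word in the middle of
    # a sentence carries its leading space, a word at the very start does not.
    return re.findall(r" ?[^ ]+", text)

prompt = "The capital of France is"
target_new = "New York"

# Target tokens as they appear when target_new continues the prompt.
continuation = toy_tokenize(prompt + " " + target_new)[len(toy_tokenize(prompt)):]
# Target tokenized on its own, as if it started a new sentence.
standalone = toy_tokenize(target_new)
# Target tokenized on its own, but with the leading space added.
spaced = toy_tokenize(" " + target_new)

print(continuation)
print(standalone)
print(spaced)
```

In this toy setting only the space-prefixed version matches the continuation, so editing and evaluation agree as long as both use the same convention.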
If you don't have any further questions, please help close this issue.