Comments (4)
It appears your model does not list <|im_start|> or <|im_end|> as a special token. llama.cpp has logic that handles a token differently when it is not marked special.
If you're able, maybe try setting the special flag to true for the necessary tokens.
from llama.cpp.
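One way to check the suggestion above is to look at how the added tokens are flagged in tokenizer_config.json. The sketch below is only an illustration: it assumes the `added_tokens_decoder` layout that recent Transformers versions write out, the helper name `non_special_added_tokens` is made up for this example, and `example_config` is a hypothetical excerpt, not the actual file from the model in question.

```python
def non_special_added_tokens(tokenizer_config: dict, wanted: list[str]) -> list[str]:
    """Return the tokens in `wanted` that exist as added tokens but are not flagged special."""
    # Assumed layout: {"added_tokens_decoder": {"<id>": {"content": ..., "special": ...}}}
    decoder = tokenizer_config.get("added_tokens_decoder", {})
    by_content = {entry["content"]: entry for entry in decoder.values()}
    return [
        tok
        for tok in wanted
        if tok in by_content and not by_content[tok].get("special", False)
    ]


# Hypothetical excerpt of a tokenizer_config.json, already parsed with json.load().
example_config = {
    "added_tokens_decoder": {
        "32000": {"content": "<|im_start|>", "special": False},
        "32001": {"content": "<|im_end|>", "special": False},
    }
}

result = non_special_added_tokens(example_config, ["<|im_start|>", "<|im_end|>"])
print(result)
```

To run it against a real model, replace `example_config` with the parsed contents of that model's tokenizer_config.json.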
That wouldn't explain this though.
It's like it's hardcoded to set <|im_start|> to 32001 and <|im_end|> to 32000 even if that's not what the model uses.
My model also was not trained with those tokens set as special, so I shouldn't need to change that to get things to work.
The link to your model is 404 not found.
Anyway, did you check if added_tokens.json is set correctly? (The JSON you posted above is from tokenizer_config.json.)
> The link to your model is 404 not found.
Sorry, I unprivated it.
> Anyway, did you check if added_tokens.json is set correctly? (The JSON you posted above is from tokenizer_config.json.)
My model is fine, and the added_tokens.json is also set correctly. The issue here is that llama.cpp's conversion does not match Transformers at all when it comes to added tokens.
{
  "<|im_end|>": 32001,
  "<|im_start|>": 32000
}
from transformers import AutoTokenizer
import requests

string_to_test = "<|im_start|>user\nTest Input<|im_end|><|im_start|>assistant\nTest Response<|im_end|>"
tokenizer = AutoTokenizer.from_pretrained("PJMixers/MV02-PB-Mixture-v1-run_15-SFT-7B-Latest-QLoRA")

# Model is converted and quantized with lcpp, running on the latest kcpp
koboldcpp_string_to_test = (
    requests.post(
        "http://127.0.0.1:5001/api/extra/tokencount",
        json={"prompt": string_to_test},
    ).json()["ids"]
)

# Transformers output (Correct)
print(tokenizer.encode(string_to_test))
# [1, 32000, 2188, 13, 1963, 11232, 32001, 32000, 13892, 13, 1963, 12107, 32001]
# ['<s>', '<|im_start|>', '▁user', '<0x0A>', 'Test', '▁Input', '<|im_end|>', '<|im_start|>', '▁assistant', '<0x0A>', 'Test', '▁Response', '<|im_end|>']

# KoboldCPP/llama.cpp output (Very incorrect)
print(koboldcpp_string_to_test)
# [1, 32001, 1838, 13, 1963, 11232, 32000, 32001, 489, 11143, 13, 1963, 12107, 32000]
# ['<s>', '<|im_end|>', 'user', '<0x0A>', 'Test', '▁Input', '<|im_start|>', '<|im_end|>', 'ass', 'isstant', '<0x0A>', 'Test', '▁Response', '<|im_start|>']
You can also see the legacy: true issue appear here.
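The swap can also be read off mechanically from the two ID lists above. The following is a minimal sketch that uses only the IDs posted in this comment and the added_tokens.json mapping; the variable names and the helper `added_token_sequence` are made up for the illustration. It keeps just the added-token IDs from each list and pairs them up in order.

```python
# ID lists copied from the outputs above.
hf_ids = [1, 32000, 2188, 13, 1963, 11232, 32001, 32000, 13892, 13, 1963, 12107, 32001]
gguf_ids = [1, 32001, 1838, 13, 1963, 11232, 32000, 32001, 489, 11143, 13, 1963, 12107, 32000]

# Mapping from added_tokens.json.
added_tokens = {"<|im_end|>": 32001, "<|im_start|>": 32000}
added_ids = set(added_tokens.values())


def added_token_sequence(ids, added):
    """Keep only the IDs that belong to added tokens, preserving order."""
    return [i for i in ids if i in added]


hf_seq = added_token_sequence(hf_ids, added_ids)
gguf_seq = added_token_sequence(gguf_ids, added_ids)

# Pair each Transformers ID with the ID llama.cpp produced at the same position.
id_map = dict(zip(hf_seq, gguf_seq))
swapped = {hf: gguf for hf, gguf in id_map.items() if hf != gguf}
print(swapped)
```

Every added token in the Transformers output lines up with the opposite ID in the llama.cpp output, which is consistent with the two IDs being exchanged during conversion rather than with a one-off tokenization glitch.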