okuvshynov / slowllama
Finetune llama2-70b and codellama on MacBook Air without quantization
License: MIT License
Hello,
Can we apply this method to fine-tune models other than llama2 and codellama, such as Mistral 7B?
Many thanks in advance!
Now that Mojo is available for M1/M2 platforms, have you considered attempting this with Mojo for improved performance? (Though it's questionable how much it would help, given all the shuffling to the SSD.)
https://www.modular.com/blog/mojo-is-now-available-on-mac
Here is a llama2 implementation: https://github.com/tairov/llama2.mojo
In order to merge the LoRA checkpoint for the llama 2 7B model, I ran python merge_lora.py, but an error occurred:
Traceback (most recent call last):
File "/Users/xxx/llama/slowllama/merge_lora.py", line 14, in <module>
add_lora(model_path, lora_path, out_model_path)
File "/Users/xxx/llama/slowllama/loader.py", line 188, in add_lora
lora = lora_weights[b_key].mm(lora_weights[a_key]) * lora_scale
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
So I modified the code as below and got the merged model file:
lora = lora_weights[b_key].to(torch.float32).mm(lora_weights[a_key].to(torch.float32)) * lora_scale
But I'm not sure whether this is okay. Could you give your opinion or the right solution?
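For what it's worth, that upcast is a common workaround: torch's CPU backend does not implement mm/addmm for float16, and promoting to float32 only adds precision before the result is cast back down. A minimal sketch of the same idea, using stand-in tensors rather than slowllama's actual weights:

import torch

# Stand-ins for lora_weights[b_key] / lora_weights[a_key]; shapes are illustrative.
lora_b = torch.randn(4096, 16, dtype=torch.float16)
lora_a = torch.randn(16, 4096, dtype=torch.float16)
lora_scale = 2.0

# Promote to float32 for the matmul (not implemented for float16 on CPU),
# then cast back to the checkpoint's dtype before merging.
lora = lora_b.to(torch.float32).mm(lora_a.to(torch.float32)) * lora_scale
lora = lora.to(torch.bfloat16)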
Hi, when I try to run prepare_model.py, which is the first script, I get an error that /slowllama/logs/prepare_model.log does not exist. I can't find that log file anywhere.
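One likely cause (an assumption based on the error message, not a confirmed diagnosis): logging.basicConfig does not create intermediate directories, so if the logs/ folder is missing, the file handler cannot open the log file. A minimal sketch of the usual fix:

import os
import logging

# Create the folder first; logging won't create it itself.
os.makedirs("logs", exist_ok=True)
logging.basicConfig(filename="logs/prepare_model.log", level=logging.INFO)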
I hit a tensor size mismatch error when I tried to merge the LoRA checkpoint for the llama 2 13B model:
Traceback (most recent call last):
File "/Users/xxx/llama/slowllama-13b/merge_lora.py", line 14, in <module>
add_lora(model_path, lora_path, out_model_path)
File "/Users/xxx/llama/slowllama-13b/loader.py", line 190, in add_lora
checkpoint[checkpoint_key] = checkpoint[checkpoint_key] + lora[subset].to(torch.bfloat16)
RuntimeError: The size of tensor a (2560) must match the size of tensor b (5120) at non-singleton dimension 0
Llama 2 13B ships as two consolidated checkpoints, so do we need to account for the shards when we merge the weights?
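A minimal sketch of what shard-aware merging might look like, assuming the two 13B shards split each weight evenly along one dimension (which dimension varies by layer type in llama's model-parallel checkpoints); the helper name here is hypothetical:

import torch

def shard_slice(delta, shard_index, num_shards, dim):
    # Take the slice of the full-size LoRA delta that belongs to this shard.
    size = delta.shape[dim] // num_shards
    return delta.narrow(dim, shard_index * size, size)

delta = torch.randn(5120, 5120)          # full-size B @ A * scale for a 13B weight
piece = shard_slice(delta, 0, 2, dim=0)  # (2560, 5120), matching shard 0's tensor

Sliced this way, the (2560, 5120) piece lines up with the shard's checkpoint tensor from the error above.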
I am trying to run finetune.py and getting a segmentation fault. Can anyone help? I am on an Apple M2 Mac mini with 24 GB of memory.
% python finetune.py
loc("mps_transpose"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":206:0)): error: 'anec.transpose' op Invalid configuration for the following reasons: Tensor dimensions N1D1C4096H1W32000 are not within supported range, N[1-65536]D[1-16384]C[1-65536]H[1-16384]W[1-16384].
loc("mps_matmul"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":39:0)): error: 'anec.matmul' op Invalid configuration for the following reasons: Tensor dimensions N1D1C4096H1W32000 are not within supported range, N[1-65536]D[1-16384]C[1-65536]H[1-16384]W[1-16384].
zsh: segmentation fault python finetune.py
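The failing op looks like the 4096 x 32000 output projection (hidden dim times vocab size): W=32000 exceeds MPSGraph's supported maximum of 16384 reported in the error. One possible workaround (an assumption, not slowllama's actual code) is to chunk that matmul along the vocab dimension so each kernel stays within the supported range:

import torch

def chunked_output_projection(x, weight, chunk=16384):
    # x: (..., 4096), weight: (32000, 4096) -> logits: (..., 32000)
    # Split the weight along the vocab dimension and concatenate the results.
    pieces = [x @ w.t() for w in weight.split(chunk, dim=0)]
    return torch.cat(pieces, dim=-1)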
Is there a particular dataset format required for finetuning codellama? I have the dataset in the OpenAI-suggested format, which is basically a jsonl where each entry is a {messages: [{role: 'system', content: ''}, {role: 'user', content: ''}, {role: 'assistant', content: ''}]} object. Will this format work?
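If the finetune script expects a plain-text corpus rather than chat-structured jsonl (the repo's examples suggest as much, though treat that as an assumption), the OpenAI-format file can be flattened first. A minimal sketch; the paths and the role-prefix template are placeholders:

import json

with open("dataset.jsonl") as src, open("dataset.txt", "w") as dst:
    for line in src:
        record = json.loads(line)
        for message in record["messages"]:
            dst.write(f"{message['role']}: {message['content']}\n")
        dst.write("\n")  # blank line between conversations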
When I use CodeLlama-7b to run prepare_model.py, an exception occurs:
RuntimeError: The expanded size of the tensor (32000) must match the existing size (32016) at non-singleton dimension 0. Target sizes: [32000, 4096]. Tensor sizes: [32016, 4096]
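CodeLlama's tokenizer adds 16 special tokens, so its embedding table is (32016, 4096) where llama2's is (32000, 4096); the mismatch suggests a vocab size of 32000 is hardcoded somewhere in the config. A minimal sketch of reading the size from the checkpoint itself instead (the path is illustrative; the tensor key is standard in Meta's llama checkpoints):

import torch

checkpoint = torch.load("CodeLlama-7b/consolidated.00.pth", map_location="cpu")
vocab_size = checkpoint["tok_embeddings.weight"].shape[0]  # 32016 for CodeLlama-7b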