
llama-trl's Introduction

Jason Van

Oh, hi there,

I am Jason, an Algorithm (NLP) Engineer; my graduate research area is AI & ML.




⚙️ Development Stack
       - AI: Natural Language Processing, Machine Learning
       - Programming: Python, C/C++, PHP, SQL, Shell, JavaScript
       - Frameworks: PyTorch, Paddle
☕️ Coffee Geek: Espresso, Pour-over, Japanese Dark Roast

📫 Gmail · Instagram: @jasonvanf




llama-trl's People

Contributors

dkqkxx, jasonvanf


llama-trl's Issues

Is there a complete, successful end-to-end run of tuning_lm_with_rl?

accelerate launch --multi_gpu --num_machines 1 --num_processes 8 \
    tuning_lm_with_rl.py \
    --log_with wandb \
    --model_name <LLAMA_FINETUNED_MODEL> \
    --reward_model_name <LLAMA_RM_MODEL> \
    --adafactor False \
    --tokenizer_name <LLAMA_TOKENIZER> \
    --save_freq 100 \
    --output_max_length 128 \
    --batch_size 8 \
    --gradient_accumulation_steps 8 \
    --batched_gen True \
    --ppo_epochs 4 \
    --learning_rate 1.4e-5 \
    --early_stopping True \
    --output_dir './checkpoints/tuning_llama_rl/'

Which file does <LLAMA_RM_MODEL> refer to? Is it "Wenzhong-GPT2-110M_peft_gpt-4-llm_rm_xxx_xx" or the original base model? Any help would be appreciated.
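
Not an answer from the thread, but one quick sanity check: a checkpoint written through PEFT normally contains adapter_config.json (and no config.json), while a full base model ships config.json plus its weights. A minimal sketch under that assumption, reusing the folder name quoted above as a hypothetical path:

import os

# Sketch: report whether a directory looks like a PEFT adapter checkpoint
# (what training_reward_model.py typically writes) or a full Hugging Face model.
def describe_checkpoint(path: str) -> str:
    if os.path.isfile(os.path.join(path, "adapter_config.json")):
        return "PEFT adapter checkpoint (load it on top of its base model)"
    if os.path.isfile(os.path.join(path, "config.json")):
        return "full Hugging Face model checkpoint"
    return "no adapter_config.json or config.json found"

# Hypothetical path reusing the folder name quoted in the question.
print(describe_checkpoint(
    "./checkpoints/training_reward_model/Wenzhong-GPT2-110M_peft_gpt-4-llm_rm_xxx_xx"))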

I would like to get in touch and collaborate with you, thank you. My WeChat: xyj15764222030

tuning_lm_with_rl.py does not appear to have a file named config.json

Hi Jason,

I followed these steps.
Step 1 - Supervised Fine-tuning generated "/checkpoints/supervised_llama/", which contains the folders:

checkpoint-2000
checkpoint-3000
checkpoint-4000
final_checkpoint

Step 2 - Training Reward Model generated "/checkpoints/training_reward_model/", which contains the folders:

llama-7b-hf_peft_gpt-4-llm_rm_0_2e-05
peft_last_checkpoint

Step 3 - Tuning LM with PPO:

accelerate launch --multi_gpu --num_machines 1 --num_processes 2 \
    tuning_lm_with_rl.py \
    --log_with wandb \
    --model_name ./checkpoints/supervised_llama/ \
    --reward_model_name ./checkpoints/training_reward_model/ \
    --adafactor False \
    --tokenizer_name ./data/model/ \
    --save_freq 100 \
    --output_max_length 128 \
    --batch_size 8 \
    --gradient_accumulation_steps 8 \
    --batched_gen True \
    --ppo_epochs 4 \
    --learning_rate 1.4e-5 \
    --early_stopping True \
    --output_dir './checkpoints/tuning_llama_rl/'

But there is an error:

CUDA SETUP: Loading binary /usr/local/lib/python3.8/dist-packages/bitsandbytes/libbitsandbytes_cuda117_nocublaslt.so...
Traceback (most recent call last):
  File "tuning_lm_with_rl.py", line 159, in <module>
    tokenizer = AutoTokenizer.from_pretrained(script_args.model_name)
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/tokenization_auto.py", line 657, in from_pretrained
    config = AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/configuration_auto.py", line 916, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/configuration_utils.py", line 573, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/configuration_utils.py", line 628, in _get_config_dict
    resolved_config_file = cached_file(
  File "/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py", line 380, in cached_file
    raise EnvironmentError(
OSError: ./checkpoints/supervised_llama does not appear to have a file named config.json. Checkout 'https://huggingface.co/./checkpoints/supervised_llama/None' for available files.

There is no config.json under supervised_llama or training_reward_model.
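
Not from the thread, but a common cause with PEFT-based fine-tuning: the SFT step saves only a LoRA adapter (adapter_config.json, adapter_model.bin), not a full model, so there is no config.json for AutoTokenizer/AutoModel to read. One possible workaround, sketched below under that assumption, is to merge the adapter into the base LLaMA weights and point --model_name at the merged directory (the base-model path here is a placeholder):

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical paths: <PATH_TO_BASE_LLAMA> stands in for your llama-7b-hf directory.
base_model_path = "<PATH_TO_BASE_LLAMA>"
adapter_path = "./checkpoints/supervised_llama/final_checkpoint"
merged_path = "./checkpoints/supervised_llama_merged"

# Load the base model, attach the LoRA adapter, then fold the adapter weights in.
base = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_path)
model = model.merge_and_unload()

# Saving the merged model writes config.json plus full weights, which is what
# AutoTokenizer/AutoModel expect when the directory is passed as --model_name.
model.save_pretrained(merged_path)
AutoTokenizer.from_pretrained(base_model_path).save_pretrained(merged_path)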

IndexError: index out of range in self on training_reward_model

(gh_llama-trl) ub2004@ub2004-B85M-A0:~/llm_dev/llama-trl$ python training_reward_model.py --model_name '/data-ssd-1t/hf_model/llama-7b-hf' --dataset_name './data/comparison_data.json' --output_dir './checkpoints/training_reward_model/'

/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/transformers/optimization.py:407: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
warnings.warn(
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose "Don't visualize my results"
wandb: Tracking run with wandb version 0.15.2
wandb: W&B syncing is set to offline in this directory.
wandb: Run wandb online or set WANDB_MODE=online to enable cloud syncing.
0%| | 0/11166 [00:00<?, ?it/s]You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:2382: UserWarning: max_length is ignored when padding=True and there is no truncation strategy. To pad to max length, use padding='max_length'.
warnings.warn(
Traceback (most recent call last):
File "/home/ub2004/llm_dev/llama-trl/training_reward_model.py", line 307, in
trainer.train(script_args.resume_from_checkpoint)
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/transformers/trainer.py", line 1664, in train
return inner_training_loop(
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/transformers/trainer.py", line 1940, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/transformers/trainer.py", line 2735, in training_step
loss = self.compute_loss(model, inputs)
File "/home/ub2004/llm_dev/llama-trl/training_reward_model.py", line 198, in compute_loss
rewards_j = model(input_ids=inputs["input_ids_j"], attention_mask=inputs["attention_mask_j"])[0]
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/peft/peft_model.py", line 566, in forward
return self.base_model(
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 821, in forward
transformer_outputs = self.model(
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 531, in forward
inputs_embeds = self.embed_tokens(input_ids)
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 160, in forward
return F.embedding(
File "/home/ub2004/anaconda3/envs/gh_llama-trl/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync /home/ub2004/llm_dev/llama-trl/wandb/offline-run-20230612_091914-tynnd9g5
wandb: Find logs at: ./wandb/offline-run-20230612_091914-tynnd9g5/logs
(gh_llama-trl) ub2004@ub2004-B85M-A0:~/llm_dev/llama-trl$
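
Not a confirmed fix for this repo, but "IndexError: index out of range in self" inside F.embedding usually means some token id is greater than or equal to the model's embedding table size, most often because a pad token was added to the tokenizer without resizing the model's embeddings. A minimal sketch of the check and the usual remedies, reusing the model path from the command above (the scalar-reward head setup is an assumption):

from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification

# Path taken from the command above; the cause below is an assumption, not verified.
model_path = "/data-ssd-1t/hf_model/llama-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_path)
config = AutoConfig.from_pretrained(model_path)
print(len(tokenizer), config.vocab_size)  # a mismatch here would explain the IndexError

# num_labels=1 assumed, as is typical for a scalar reward head.
model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=1)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # reuse eos so no new token id is introduced
if len(tokenizer) > model.config.vocab_size:
    model.resize_token_embeddings(len(tokenizer))  # grow embeddings if extra tokens were added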
