Comments (8)

hiyouga commented on May 18, 2024

For multi-GPU training, every GPU needs 24 GB of memory.

from llama-factory.

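starphantom666 commented on May 18, 2024

For multi-GPU training, every GPU needs 24 GB of memory.

Dalao, I also have a problem on a single GPU: with a 13B model under 4-bit quantization, input length 512 and output length 512, fine-tuning runs out of memory (OOM) no matter what I try o(╥﹏╥)o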

hiyouga commented on May 18, 2024

How much free memory does the GPU have?
Could you try reducing the length from 512 to 256?
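As a rough way to answer the free-memory question, here is a minimal sketch, not part of the project. It estimates the memory taken by the weights alone (activations, gradients, optimizer state, and the KV cache come on top and grow with sequence length) and, if PyTorch with CUDA is available, reads the free memory via `torch.cuda.mem_get_info`. The helper names `weight_memory_gib` and `free_gpu_memory_gib` are made up for illustration.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


def free_gpu_memory_gib(device_index: int = 0):
    """Free memory on one GPU in GiB, or None if PyTorch/CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    free_bytes, _total_bytes = torch.cuda.mem_get_info(device_index)
    return free_bytes / 2**30


# Weights-only estimate for a 13B-parameter model at two quantization levels.
print(f"13B weights at 4-bit: {weight_memory_gib(13e9, 4):.2f} GiB")
print(f"13B weights at 8-bit: {weight_memory_gib(13e9, 8):.2f} GiB")
print("free GPU memory (GiB):", free_gpu_memory_gib())
```

If the free memory is only slightly above the weights-only figure, shortening the sequences (for example 512 to 256) is the cheapest thing to try first, since activation memory shrinks with sequence length.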

starphantom666 commented on May 18, 2024

How much free memory does the GPU have? Could you try reducing the length from 512 to 256?

Reducing it works. I have another question.

from transformers import LlamaForCausalLM, AutoTokenizer
import torch

ckpt = './bloom_13b/'
device = torch.device('cuda')
# Load the checkpoint in 8-bit and place the whole model on GPU 0.
model = LlamaForCausalLM.from_pretrained(ckpt, device_map={"": 0}, load_in_8bit=True, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model.eval()
prompt = "XXXXXXXXXXXXXXXXXXXXXXXXXXX"
# The raw prompt is tokenized as-is, with no template wrapped around it.
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
# Greedy decoding (do_sample=False), generating at most 500 new tokens.
generate_ids = model.generate(input_ids, max_new_tokens=500, do_sample=False, repetition_penalty=1.0, eos_token_id=2, bos_token_id=1, pad_token_id=0)
output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# Drop the echoed prompt and keep only the generated continuation.
response = output[len(prompt):]
print(response)

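The output of the official example above does not match what I get from the project's web demo (both 8-bit, do_sample=False), and the web demo's output is much worse than the official example's. What is going on?

I changed all of the web demo's parameters in the backend, as follows: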
gen_kwargs = {
    "input_ids": input_ids,
    "do_sample": False,
    "top_p": 0.01,
    "temperature": 0.99,
    "num_beams": 1,
    # "max_length": max_length,
    "max_new_tokens": 500,
    "repetition_penalty": 1.0,
    "logits_processor": get_logits_processor(),
    "streamer": streamer,
    "eos_token_id": 2,
    "bos_token_id": 1,
    "pad_token_id": 0
}
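A side note on these settings, as a toy sketch rather than the library's actual code path: with do_sample=False and num_beams=1 the decoder simply takes the highest-scoring token at each step, so a positive temperature cannot change the choice, and top_p has nothing to sample from. The function `greedy_pick` and the logits below are invented for illustration.

```python
def greedy_pick(logits, temperature=1.0):
    """Return the index of the largest logit after temperature scaling."""
    scaled = [x / temperature for x in logits]
    return max(range(len(scaled)), key=scaled.__getitem__)


# Toy logits for a 4-token vocabulary (made-up numbers).
toy_logits = [1.2, 3.4, 0.5, 3.1]

# Dividing by any positive temperature preserves the ordering, so the pick is the same.
picks = {t: greedy_pick(toy_logits, t) for t in (0.5, 0.99, 2.0)}
print(picks)
```

So the temperature and top_p values in the dictionary are unlikely to explain the difference between the two setups; what differs has to be something else, such as the input text itself.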

starphantom666 commented on May 18, 2024

The web demo's answers are extremely terse..

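starphantom666 commented on May 18, 2024

The web demo's answers are extremely terse..

Solved. It turns out the project's code automatically wraps the question before sending it to the model, which is why the results differ from the official example and why the answers are so terse.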
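To illustrate the kind of wrapping described here, a minimal sketch follows. The template text is hypothetical (the real template used by the web demo is not shown in this thread), and `wrap_prompt` is an invented helper name.

```python
# Hypothetical template, for illustration only; not the project's actual template.
HYPOTHETICAL_TEMPLATE = "Human: {query}\nAssistant: "


def wrap_prompt(query: str, template: str = HYPOTHETICAL_TEMPLATE) -> str:
    """Insert the user's question into a prompt template."""
    return template.format(query=query)


raw = "What is quantization?"
wrapped = wrap_prompt(raw)

# The model sees different input text in the two cases, so even greedy
# decoding can produce different outputs.
print(repr(raw))
print(repr(wrapped))
```

A model that was not fine-tuned on the wrapped format may respond to it quite differently than it does to the raw prompt, which would be consistent with the short answers reported above.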

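dengfenglai321 commented on May 18, 2024

The web demo's answers are extremely terse..

Solved. It turns out the project's code automatically wraps the question before sending it to the model, which is why the results differ from the official example and why the answers are so terse.

Hello, what did you change so that this project's answers are basically consistent with the official ones?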

hiyouga commented on May 18, 2024

@yumulinfeng1 When using an instruction-tuned model, you should add the --prompt_template argument to the command-line arguments.
