
yi's People

Contributors

0x1111, 202030481266, angeligareta, anonymitaet, cvyl, eltociear, findmyway, forpanyang, glorialee01, hiyouga, jack-devnlp, jiangchengsilent, kai01ai, loofahcus, markli404, mayankdebnath, petter529, pitaokkk, renxiaoyi, richardllin, shiyue-loop, soulteary, statelesshz, tdolan21, tpoisonooo, waytoowill, xffxff, yimi81, yudongliu97, zhaofancy


yi's Issues

prompt format?

this is not an issue but did not know where to put it. Is there a specific prompt format to use?

Clarifying max_len in generate

In the example code, does max_len mean max_new_tokens, or does it mean prompt + max_new_tokens?

outputs = model.generate(
    inputs.input_ids.cuda(),
    max_length=max_length,
    eos_token_id=tokenizer.eos_token_id
)
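For reference, the Hugging Face generation documentation describes max_length as the prompt length plus the generated tokens, while max_new_tokens counts only the generated tokens and takes precedence when both are set. The sketch below only models that arithmetic; new_token_budget is a hypothetical helper written for illustration, not part of transformers, and the 19-token prompt length is just an example value.

```python
from typing import Optional


def new_token_budget(
    prompt_len: int,
    max_length: Optional[int] = None,
    max_new_tokens: Optional[int] = None,
) -> int:
    """Return how many tokens generation may add after the prompt.

    Hypothetical helper modelling the documented semantics:
    - max_new_tokens counts only newly generated tokens;
    - max_length counts prompt tokens plus generated tokens;
    - if both are given, max_new_tokens takes precedence.
    """
    if max_new_tokens is not None:
        return max_new_tokens
    if max_length is None:
        raise ValueError("set max_length or max_new_tokens")
    # A prompt that already reaches max_length leaves no room to generate.
    return max(max_length - prompt_len, 0)


# Example: a 19-token prompt under each setting.
print(new_token_budget(19, max_length=256))      # budget shrinks by the prompt length
print(new_token_budget(19, max_new_tokens=256))  # budget is independent of the prompt
```

Under this reading, the max_length=256 call above would cap the whole sequence (prompt included) at 256 tokens, so longer prompts leave fewer tokens for the continuation.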

Can huggingface/tokenizers be supported?

I have recently been using candle and would like to add support for the Yi series. candle uses the https://github.com/huggingface/tokenizers library, which requires a tokenizer.json file. The Yi series does not ship this file, whereas some other models, such as https://huggingface.co/bert-base-chinese and https://huggingface.co/Salesforce/blip-image-captioning-large, do.
From the transformers documentation, this appears to be handled by the fast tokenizers module: https://huggingface.co/docs/transformers/fast_tokenizers

When I asked about ChatGLM earlier, the candle maintainers replied as linked below. Could the Yi series support this?
candle issue:
huggingface/candle#1177 (comment)

Some related transformers code: https://github.com/huggingface/transformers/blob/main/src/transformers/convert_slow_tokenizer.py

Below is the convert_slow_tokenizer.py that candle modified to support marian-mt:
https://github.com/huggingface/candle/blob/main/candle-examples/examples/marian-mt/convert_slow_tokenizer.py#L1262C32-L1262C32

eos token is not set in the demo code.

Output of the base model Yi-6B:

There's a place where time stands still. A place of breath taking wonder, but also a place of great danger. A place where the past and the future meet. A place where the dead and the living walk together. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible becomes possible. A place where the impossible

Once I add the EOS token:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model
model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-6B", device_map="auto", torch_dtype="auto", trust_remote_code=True)
# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-6B", trust_remote_code=True)
# Encode the input text
inputs = tokenizer("There's a place where time stands still. A place of breath taking wonder, but also", return_tensors="pt")
# Set a hard limit for the maximum length of the sequence
max_length = 256  

# Generate output with the end-of-sequence token
outputs = model.generate(
    inputs.input_ids.cuda(),
    max_length=max_length,
    eos_token_id=tokenizer.eos_token_id  # Use the EOS token ID which is 2
)
# Decode and print the output
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

I get this output:

There's a place where time stands still. A place of breath taking wonder, but also a place of great danger. A place where the past and the future collide. A place where the dead walk the earth.
The place is called the Forbidden City.
The Forbidden City is a place of great power. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept. It is a place where the dead are kept, and where the living are kept.

Once the EOS token is set, the model generates correctly.

I would be happy to make a pull request and update the demo code, but either way it should be updated.

Model architecture?

Hi there.

This model looks pretty interesting. Are there any details available about the model architecture so other projects (like llama.cpp) could potentially support it?

I saw your information says it was trained from scratch. Is it also a custom type of model, or does it use the same architecture as some other common model type such as LLaMA2?

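Question about the pretraining corpus

Was a large amount of CoT data included in the pretraining stage?
After full-parameter fine-tuning of the 34B model, I found that it strongly tends to output a CoT process.
As a result, I often cannot get output in the format my prompt asks for.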

Tokenizer class YiTokenizer does not exist or is not currently imported.

Thank you for your contributions to the community.

I tried loading Yi for inference, but I got the following error:

tokenizer = self.AUTO_TOKENIZER_CLASS.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 748, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class YiTokenizer does not exist or is not currently imported.

I am using transformers 4.34.0 and I set trust_remote_code=True.

I am aware that since this is a "custom" model, files like "configuration_yi.py", "tokenization_yi.py", and "modeling_yi.py" will be executed.

In addition, I am aware that AutoTokenizer does NOT have YiTokenizer pre-registered (see the source code of AutoTokenizer: https://github.com/huggingface/transformers/blob/main/src/transformers/models/auto/tokenization_auto.py).

Can you please provide your valuable insights? Thank you very much!

YiModel's `forward` cannot work on `input_ids==None` when not using flash attention

YiModel's forward function supports input_ids == None together with inputs_embeds != None, but _prepare_decoder_attention_mask does not work when input_ids is None and flash attention is not used.
As a result, running without flash attention raises an error.

    def _prepare_decoder_attention_mask(
        self, attention_mask, input_ids, inputs_embeds, past_key_values_length
    ):
        input_shape = input_ids.shape
        # create causal mask
        # [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len]
        combined_attention_mask = None
        if input_shape[-1] > 1:
            combined_attention_mask = _make_causal_mask(
                input_shape,
                inputs_embeds.dtype,
                device=inputs_embeds.device,
                past_key_values_length=past_key_values_length,
            )

        if attention_mask is not None:
            # [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len]
            expanded_attn_mask = _expand_mask(
                attention_mask, inputs_embeds.dtype, tgt_len=input_shape[-1]
            ).to(inputs_embeds.device)
            combined_attention_mask = (
                expanded_attn_mask
                if combined_attention_mask is None
                else expanded_attn_mask + combined_attention_mask
            )

        return combined_attention_mask
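One possible workaround, sketched under assumptions, is to derive the mask shape from whichever input is present instead of always reading input_ids.shape. The helper name resolve_input_shape and the FakeTensor stand-in are hypothetical and exist only so the sketch runs without torch; this is not the maintainers' fix, and it assumes inputs_embeds has shape [bsz, seq_len, hidden].

```python
from typing import Tuple


class FakeTensor:
    """Hypothetical stand-in exposing only .shape, so the sketch runs without torch."""

    def __init__(self, *shape: int) -> None:
        self.shape = tuple(shape)


def resolve_input_shape(input_ids, inputs_embeds) -> Tuple[int, int]:
    """Return (batch_size, seq_len) from whichever input was provided."""
    if input_ids is not None:
        # input_ids is [bsz, seq_len].
        return tuple(input_ids.shape[:2])
    if inputs_embeds is not None:
        # inputs_embeds is assumed to be [bsz, seq_len, hidden]; drop the hidden dimension.
        return tuple(inputs_embeds.shape[:2])
    raise ValueError("either input_ids or inputs_embeds must be provided")


# Embeddings-only call: the shape comes from inputs_embeds.
print(resolve_input_shape(None, FakeTensor(2, 5, 4096)))
```

Inside _prepare_decoder_attention_mask, the line input_shape = input_ids.shape could then be replaced by a call like this, so the causal mask is still built when only inputs_embeds is passed.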

Hallucination after finetuning

After LoRA SFT of Yi-34B on the sharegpt and oaast_sft datasets, the model's self-cognition looks like this:
[Screenshot 2023-11-07 at 12 05 31]

Going back to the base model, I used generate() to answer the self-cognition question:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("/models/Yi-34B", device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("/models/Yi-34B", trust_remote_code=True)
inputs = tokenizer("我是一个AI助手,可以回答您的问题并提供信息。我由", return_tensors="pt")
outputs = model.generate(inputs.input_ids.cuda(), max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The output looks like this:

我是一个AI助手,可以回答您的问题并提供信息。我由OpenAI开发,旨在提供准确、无偏见的信息。如果您有任何问题,请随时提问。

Isn't the 6B model's output rather strange?

model_path = './01ai/Yi-6B/'
model = AutoModelForCausalLM.from_pretrained(model_path, device_map={'': 0}, torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
prompt = '你会做什么?'
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(inputs.input_ids.cuda(), max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

This is the reply I got:

你会做什么?
A. 把车开到路边,然后打电话给保险公司
B. 把车开到路边,然后打电话给警察
C. 把车开到路边,然后打电话给朋友
D. 把车开到路边,然后打电话给家人
E. 把车开到路边,然后打电话给保险公司,然后打电话给警察
F. 把车开到路边,然后打电话给警察,然后打电话给保险公司
G. 把车开到路边,然后打电话给朋友,然后打电话给保险公司
H. 把车开到路边,然后打电话给家人,然后打电话给保险公司
I. 把车开到路边,然后打电话给保险公司,然后打电话给警察,然后打电话给朋友
J. 把车开到路边,然后打电话给保险公司,然后打电话给警察,然后打电话给家人
K. 把车开到路边,然后打电话给保险公司,然后打电话给警察,然后打电话给朋友,然后打电话给家人
L. 把车开到路边,然后打电话给保险公司,然后打电话给警察,然后打电话给朋友,然后打电话给家人,然后打电话给朋友
M. 把车开到路边,然后打电话给

How to use 200K model

Will the 6B model with a 200K context length be released? I heard that this version is a fine-tuned model, and I'm not sure whether there will be a checkpoint.

About 200k context length

Thanks for your models.

Are there any details available about how the model supports a 200K context length?

Thanks.

test

Hello, could you tell me about socialism in the new era?

Languages in the pretrain

Hello 🖐️

Did you explicitly filter other languages (non-English and non-Chinese) out of the pretraining dataset?

If not, what are their proportions?

Coding / HumanEval Model

Hello,

Do we plan to have a fine-tuned version for coding and HumanEval for the Yi-6B-200K-Chat model?

If yes, is the end of November 2023 a likely release date?

Thank you!

Abnormal inference output from Yi-6B

The code is the same as in the README:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-34B", device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-34B", trust_remote_code=True)
inputs = tokenizer("There's a place where time stands still. A place of breath taking wonder, but also", return_tensors="pt")
max_length = 256  

outputs = model.generate(
    inputs.input_ids.cuda(),
    max_length=max_length,
    eos_token_id=tokenizer.eos_token_id 
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The output is as follows:

There's a place where time stands still. A place of breath taking wonder, but also a are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is are is

The result of tokenizer.encode on the input is:

tensor([[ 6444, 59610, 59575,   562,  1700,  1151,   922,  8954,  1451,    98,
           647,  1700,   593,  8253,  2863,  3755,    97,   796,   962]])

The output before tokenizer.decode is:

tensor([[ 6444, 59610, 59575,   562,  1700,  1151,   922,  8954,  1451,    98,
           647,  1700,   593,  8253,  2863,  3755,    97,   796,   962,   562,
           678,   620,   678,   620,   678,   620,   678,   620,   678,   620,
           678,   620,   678,   620,   678,   620,   678,   620,   678,   620,
           678,   620,   678,   620,   678,   620,   678,   620,   678,   620,
           ..........
           678,   620,   678,   620,   678,   620]], device='cuda:0')

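So I conclude that the problem lies in the generate process; 01-ai/Yi-6B and 01-ai/Yi-34B give the same output.

I have already pulled the latest model from HuggingFace; the key dependency versions are: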

torch==1.13.1
transformers==4.34.1

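200K context

How many resources does inference with a 200K context require? Looking at the model files, I did not see any of the usual techniques for extending the context length.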
