
llm-export


llm-export is a tool for exporting LLM models: it converts an LLM into onnx and mnn models.

  • 🚀 All exports pass onnxruntime correctness tests
  • 🚀 Optimizes the original model code to support dynamic shapes
  • 🚀 Optimizes the original model code to reduce the constant portion of the graph
  • 🚀 Uses OnnxSlim to optimize the onnx model, improving performance by about 5%; by @inisis (see the example after this list)
  • 🚀 Supports exporting lora weights to onnx and mnn
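
OnnxSlim can also be run on an exported model by hand. A minimal sketch, assuming OnnxSlim is installed from PyPI; the llm.onnx file name is illustrative, standing in for whichever onnx file the exporter produced:

pip install onnxslim
# slim the exported graph; input and output paths are illustrative
onnxslim llm.onnx llm_slim.onnx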

Model Support & Downloads

The supported models are the --type choices listed under Arguments below: chatglm-6b, chatglm2-6b, chatglm3-6b, codegeex2-6b, Qwen-7B-Chat, Qwen-1_8B-Chat, Qwen-VL-Chat, Baichuan2-7B-Chat, Llama-2-7b-chat-ms, internlm-chat-7b, TinyLlama-1_1B-Chat, Yi-6B-Chat, deepseek-llm-7b-chat, phi-2, bge-large-zh.

Usage

  1. Clone this project locally:
git clone git@github.com:wangzhaode/llm-export.git
  2. Clone the LLM you want to export locally, e.g. chatglm2-6b:
git clone https://huggingface.co/THUDM/chatglm2-6b
# if downloading from huggingface is slow, use modelscope instead
git clone https://modelscope.cn/ZhipuAI/chatglm2-6b.git
  3. Run llm_export.py to export the model:
cd llm-export
# split chatglm2-6b into embedding, blocks and lm, export each part to onnx and convert to mnn, and export tokenizer.txt
python llm_export.py \
        --path ../chatglm2-6b \
        --export_split \
        --export_token \
        --export_mnn \
        --onnx_path ./chatglm2-6b-onnx \
        --mnn_path  ./chatglm2-6b-mnn 
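
To export the whole model as a single onnx file instead of split parts, the flags documented under Arguments below suggest an invocation along these lines (a sketch; the output directory is illustrative):

# export chatglm2-6b as a single onnx model and verify the result with onnxruntime
python llm_export.py \
        --path ../chatglm2-6b \
        --export \
        --export_test \
        --onnx_path ./chatglm2-6b-onnx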

Features

  • Export the model as a single onnx model, with --export
  • Export the model split into several onnx models, with --export_split
  • Export the model's vocabulary to a text file, one token per line with each token base64-encoded, with --export_token (a decoding example follows the Arguments section)
  • Export the model's embedding layer as an onnx model, with --export_embed; bf16 is also supported, with --embed_bf16
  • Export the model's blocks layer by layer: --export_blocks exports all of them, --export_block $id exports the given layer
  • Export the model's lm_head layer as an onnx model, with --export_lm
  • Export the visual model of a multimodal model as an onnx model, with --export_visual
  • Run a chat test against the model with --test $query, which returns the LLM's reply
  • After exporting an onnx model, check the results for consistency using onnxruntime, with --export_test
  • Export the tokenizer to a text file, with --export_token
  • Convert the exported onnx model to an mnn model, by default with asymmetric 4bit quantization, with --export_mnn (a rough MNNConvert equivalent is sketched after this list)
  • Specify export paths with --onnx_path and --mnn_path
  • onnx-slim is used to optimize the onnx model by default; skip this step with --skip_slim
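
For reference, the onnx-to-mnn conversion corresponds to running MNN's MNNConvert tool. The sketch below is an assumption about a roughly equivalent manual invocation, not necessarily what --export_mnn executes; it presumes an MNN build with the MNNConvert binary available, the file names are illustrative, and quantization flag names can differ between MNN versions:

# convert one exported onnx part to mnn with 4bit asymmetric weight quantization (flag names assumed)
MNNConvert -f ONNX \
           --modelFile ./chatglm2-6b-onnx/block_0.onnx \
           --MNNModel ./chatglm2-6b-mnn/block_0.mnn \
           --weightQuantBits 4 \
           --weightQuantAsymmetric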

Arguments

usage: llm_export.py [-h] --path PATH
                     [--type {chatglm-6b,chatglm2-6b,chatglm3-6b,codegeex2-6b,Qwen-7B-Chat,Qwen-1_8B-Chat,Qwen-VL-Chat,Baichuan2-7B-Chat,Llama-2-7b-chat-ms,internlm-chat-7b,TinyLlama-1_1B-Chat,Yi-6B-Chat,deepseek-llm-7b-chat,phi-2,bge-large-zh}]
                     [--onnx_path ONNX_PATH] [--mnn_path MNN_PATH] [--export_mnn] [--export_verbose] [--export_test] [--test TEST] [--export] [--export_split] [--export_token] [--export_embed] [--export_visual] [--export_lm]
                     [--export_block EXPORT_BLOCK] [--export_blocks] [--embed_bf16] [--skip_slim]

llm_exporter

optional arguments:
  -h, --help            show this help message and exit
  --path PATH           path(`str` or `os.PathLike`):
                        Can be either:
                                - A string, the *model id* of a pretrained model like `THUDM/chatglm-6b`. [TODO]
                                - A path to a *directory* cloned from a repo, like `../chatglm-6b`.
  --type {chatglm-6b,chatglm2-6b,chatglm3-6b,codegeex2-6b,Qwen-7B-Chat,Qwen-1_8B-Chat,Qwen-VL-Chat,Baichuan2-7B-Chat,Llama-2-7b-chat-ms,internlm-chat-7b,TinyLlama-1_1B-Chat,Yi-6B-Chat,deepseek-llm-7b-chat,phi-2,bge-large-zh}
                        type(`str`, *optional*):
                                The pretrained llm model type.
  --onnx_path ONNX_PATH
                        export onnx model path, default is `./onnx`.
  --mnn_path MNN_PATH   export mnn model path, default is `./mnn`.
  --export_mnn          Whether or not to export mnn model after onnx.
  --export_verbose      Whether or not to export onnx with verbose output.
  --export_test         Whether or not to test the exported onnx model using onnxruntime.
  --test TEST           test model inference with query `TEST`.
  --export              export model to an `onnx` model.
  --export_split        export model split to some `onnx` models:
                                - embedding model.
                                - block models.
                                - lm_head model.
  --export_token        export llm tokenizer to a txt file.
  --export_embed        export llm embedding to an `onnx` model.
  --export_visual       export llm visual model to an `onnx` model.
  --export_lm           export llm lm_head to an `onnx` model.
  --export_block EXPORT_BLOCK
                        export llm block [id] to an `onnx` model.
  --export_blocks       export llm all blocks to `onnx` models.
  --embed_bin           export the embedding weights as a bin file with dtype `bfloat16`.
  --embed_bf16          use `bfloat16` in place of `float32` for the embedding.
  --skip_slim           Whether or not to skip onnx-slim.
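
As the features list notes, the tokenizer.txt produced by --export_token stores one token per line, base64-encoded. A minimal way to inspect it, assuming a POSIX shell with GNU coreutils base64:

# decode tokenizer.txt line by line, printing one raw token per line
while IFS= read -r line; do
    printf '%s' "$line" | base64 -d
    printf '\n'
done < tokenizer.txt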
