Code Monkey home page Code Monkey logo

💡 [REQUEST] - <title> 关于lora 模型合并的几个问题 about qwen HOT 3 OPEN

wangyao123456a avatar wangyao123456a commented on May 27, 2024
💡 [REQUEST] - 关于lora 模型合并的几个问题 <p>from qwen.</p></section> </section> </article> <article> <h2 class="h2">Comments (3)</h2> <section class="issue-comment"> <section id="2066237830" class="issue-head"> <img class="issue-avatar" src="https://avatars.githubusercontent.com/u/37737955?s=30&v=4" alt="wangyao123456a avatar" /> <a class="issue-username" href="/wangyao123456a">wangyao123456a</a> <span class="issue-time"> commented on May 27, 2024 </span> </section> <section class="markdown markdown-js p-5"><p dir="auto">1、请问base model 和lora model 合并完后,新模型的大小、参数量和base model 一样的吗?<br> 2、推理时如果lora 和base 模型不合并,输出结果和模型合并后推理结果是否一致,有精度上的损失吗?</p><p>from qwen.</p></section> </section> <section class="issue-comment"> <section id="2066269883" class="issue-head"> <img class="issue-avatar" src="https://avatars.githubusercontent.com/u/17811943?s=40&u=3b6af818ba8722012d21562ac7865472356d9757&v=4" alt="jklj077 avatar" /> <a class="issue-username" href="/jklj077">jklj077</a> <span class="issue-time"> commented on May 27, 2024 </span> </section> <section class="markdown markdown-js p-5"><ol dir="auto"> <li>It should be the same.</li> <li>The difference should be negligible.</li> </ol><p>from qwen.</p></section> </section> </article> <section> <h2 class="h2">Related Issues (20)</h2> <div class="issue"> <ul> <li> <a href="/qwenlm/qwen/issues/1220">[BUG] lora微调后,合并成一个模型。这种方式如何加载且推理</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 4</span> </li> <li> <a href="/qwenlm/qwen/issues/1222">[BUG] Qwen/Qwen-72B-Chat-Int8,不能多GPU并行计算</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 1</span> </li> <li> <a href="/qwenlm/qwen/issues/1223">Qwen/eval中的评测CEval和CMMLU,开大推理的batchsize评测指标会显著降低</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 1</span> </li> <li> <a href="/qwenlm/qwen/issues/1224">请问基于qwen-72b-chat,基于怎样的配置可以在一台4090上训练起来?</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 4</span> </li> <li> <a href="/qwenlm/qwen/issues/1231">[BUG] <关于model.generate时发现的源码错误></a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 2</span> </li> <li> <a href="/qwenlm/qwen/issues/1232">[BUG] <Qwen-14B-Chat 输入长文本时无输出结果></a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 5</span> </li> <li> <a href="/qwenlm/qwen/issues/1234">[BUG] Function Calling 示例有错误,最新的 openai sdk 运行时提示 api 已经废弃</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 2</span> </li> <li> <a href="/qwenlm/qwen/issues/1236">请问哪里可以找到qwen用于vllm的jinja template?</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 1</span> </li> <li> <a href="/qwenlm/qwen/issues/1239">[BUG] <title>执行eval中的eval_plugin进行评测 有一个agent从huggingface_hub拉包错误</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 1</span> </li> <li> <a href="/qwenlm/qwen/issues/1240">请问可以使用高通的npu进行部署和推理吗?</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 1</span> </li> <li> <a href="/qwenlm/qwen/issues/1241">微调完成后使用llama_factory的vllm和qwen官方的vllm部署方式启动返回的不一样</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 2</span> </li> <li> <a href="/qwenlm/qwen/issues/1243">💡 [REQUEST] - <使用ollama来调用qwen:14B时,怎么设置输出文本长度呢></a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 1</span> </li> <li> <a href="/qwenlm/qwen/issues/1244">[BUG] <title>fastchat + vLLM +OpenAI API 调用qwen模型,数据不需要预先处理吗</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 1</span> </li> <li> <a href="/qwenlm/qwen/issues/1245">本地部署后,运行很慢啊 </a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 4</span> </li> <li> <a href="/qwenlm/qwen/issues/1246">请问下 2.5什么时候开源呀?</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 1</span> </li> <li> <a href="/qwenlm/qwen/issues/1248">File "finetune.py", line 412, in <module> train() File "finetune.py", line 384, in train model = get_peft_model(model, lora_config) File "/opt/conda/envs/qwen/lib/python3.8/site-packages/peft/mapping.py", line 123, in get_peft_model peft_config.base_model_name_or_path = model.__dict__.get("name_or_path", None) AttributeError: 'NoneType' object has no attribute '__dict__'[BUG] <title></a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 2</span> </li> <li> <a href="/qwenlm/qwen/issues/1249">qwen 14b 不微调的情况下,问相同的问题,模型输出也不太一致,是为什么?温度已经设置成0了</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 2</span> </li> <li> <a href="/qwenlm/qwen/issues/1250">[BUG] <title>torch.cuda.OutOfMemoryError: CUDA out of memory.</a> <span class="text-red-600 text-xs font-normal py-0.5 px-1 border border-red-600 rounded-md">HOT 1</span> </li> <li> <a href="/qwenlm/qwen/issues/1251">RuntimeError: Failed to import transformers.models.qwen2.modeling_qwen2 because of the following error (look up to see its traceback): Failed to import transformers.integrations.peft because of the following error (look up to see its traceback): No module named 'torch.distributed.checkpoint.format_utils'</a> </li> </ul> </div> </section> </main> <section id="more" class="flex-none w-full md:w-60 text-gray-600 bg-gray-50 px-5 md:px-3 rounded-md dark-color"> <div class="w-full md:w-60 h-0.5"></div> <section> <!-- recommend projects --> <h2 class="h2 py-3.5">Recommend Projects</h2> <ul> <li class="mb-4"> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/facebook/react"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://raw.githubusercontent.com/facebook/create-react-app/master/packages/cra-template/template/public/logo192.png" alt="React photo" /> React </a> </h3> <p class="article-more pt-1">A declarative, efficient, and flexible JavaScript library for building user interfaces.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/vuejs/vue"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://camo.githubusercontent.com/c8f91d18976e27123643a926a2588b8d931a0292fd0b6532c3155379e8591629/68747470733a2f2f7675656a732e6f72672f696d616765732f6c6f676f2e706e67" alt="Vue.js photo" /> Vue.js </a> </h3> <p class="article-more pt-1">🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/microsoft/TypeScript"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://www.typescriptlang.org/favicon-32x32.png" alt="Typescript photo" /> Typescript </a> </h3> <p class="article-more pt-1">TypeScript is a superset of JavaScript that compiles to clean JavaScript output.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/tensorflow/tensorflow"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://camo.githubusercontent.com/c04e16c05de80dadbdc990884672fc941fdcbbfbb02b31dd48c248d010861426/68747470733a2f2f7777772e74656e736f72666c6f772e6f72672f696d616765732f74665f6c6f676f5f736f6369616c2e706e67" alt="TensorFlow photo" /> TensorFlow </a> </h3> <p class="article-more pt-1">An Open Source Machine Learning Framework for Everyone</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/django/django"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://avatars2.githubusercontent.com/u/27804?s=200&v=4" alt="Django photo" /> Django </a> </h3> <p class="article-more pt-1">The Web framework for perfectionists with deadlines.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/laravel/laravel"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://laravel.com/img/logomark.min.svg" alt="Laravel photo" /> Laravel </a> </h3> <p class="article-more pt-1">A PHP framework for web artisans</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/d3/d3"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://camo.githubusercontent.com/586ccf0aad9684edc821658cee04146cf36d1f1d5ec904bbefd72728909ccb2e/68747470733a2f2f64336a732e6f72672f6c6f676f2e737667" alt="D3 photo" /> D3 </a> </h3> <p class="article-more pt-1">Bring data to life with SVG, Canvas and HTML. 📊📈🎉</p> </article> </li> <li> <div> </div> </li> </ul> </section> <section> <!-- recommend topics --> <h2 class="h2 py-3.5">Recommend Topics</h2> <ul> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/topic/javascript"> javascript </a> </h3> <p class="article-more pt-1">JavaScript (JS) is a lightweight interpreted programming language with first-class functions.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/topic/web"> web </a> </h3> <p class="article-more pt-1">Some thing interesting about web. New door for the world.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/topic/server"> server </a> </h3> <p class="article-more pt-1">A server is a program made to process requests and deliver data to clients.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/topic/machine-learning"> Machine learning </a> </h3> <p class="article-more pt-1">Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/topic/visualization"> Visualization </a> </h3> <p class="article-more pt-1">Some thing interesting about visualization, use data art</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/topic/game"> Game </a> </h3> <p class="article-more pt-1">Some thing interesting about game, make everyone happy.</p> </article> </li> <li> </li> </ul> </section> <section> <!-- recommend users --> <h2 class="h2 py-3.5">Recommend Org</h2> <ul> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/facebook"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://avatars.githubusercontent.com/u/69631?v=4" alt="Facebook photo" /> Facebook </a> </h3> <p class="article-more pt-1">We are working to build community through open source technology. NB: members must have two-factor auth.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/microsoft"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://avatars.githubusercontent.com/u/6154722?v=4" alt="Microsoft photo" /> Microsoft </a> </h3> <p class="article-more pt-1">Open source projects and samples from Microsoft.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/google"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://avatars.githubusercontent.com/u/1342004?v=4" alt="Google photo" /> Google </a> </h3> <p class="article-more pt-1">Google ❤️ Open Source for everyone.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/alibaba"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://avatars.githubusercontent.com/u/1961952?v=4" alt="Alibaba photo" /> Alibaba </a> </h3> <p class="article-more pt-1">Alibaba Open Source for everyone</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/d3"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://avatars.githubusercontent.com/u/1562726?v=4" alt="D3 photo" /> D3 </a> </h3> <p class="article-more pt-1">Data-Driven Documents codes.</p> </article> </li> <li> <article class="small-box"> <h3 class="article-title"> <a class="block break-all" href="/tencent"> <img loading="lazy" class="inline-block w-6 h-6 rounded-md border border-white" width="24" height="24" src="https://avatars.githubusercontent.com/u/18461506?v=4" alt="Tencent photo" /> Tencent </a> </h3> <p class="article-more pt-1">China tencent open source team.</p> </article> </li> <li> </li> </ul> </section> </section> </div> </div> <!-- footer --> <footer class="sizeing text-xs text-center p-5"> <div>Friends: <a class="hover:underline" target="_blank" href="https://www.chanpinqingbaoju.com">ProductDiscover</a> </div> Copyright © 2024 Code Monkey <!-- & <span class="block md:inline">Data Power by github.com</span> --> ❤️ <a class="hover:underline block md:inline" href="mailto:cs.victor.edison@gmail.com">Mail to me</a> </footer> </body> </html>