Comments (4)
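Hello, and thank you for your interest. The script you ran performs instruction fine-tuning on data generated from literature. For various reasons, the literature-generated data we have open-sourced so far is only a sample, not the full dataset, so a model trained on it will have limited quality. If needed, you can collect some literature yourself and construct a batch of instruction fine-tuning data, which should give better results. Thank you.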
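As a rough illustration of what constructing such instruction fine-tuning data could look like, here is a minimal sketch. The field names `instruction`, `input`, and `output` follow the common Alpaca-style layout and are an assumption here, as are the helper names `build_record` and `to_jsonl`; check the repository's own data files for the exact format it expects.

```python
import json


def build_record(question, answer, context=""):
    """Build one instruction-tuning record.

    The keys below are an assumed Alpaca-style layout, not a format
    confirmed by the project.
    """
    return {
        "instruction": question.strip(),
        "input": context.strip(),
        "output": answer.strip(),
    }


def to_jsonl(records):
    """Serialize records as JSON Lines, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    # Placeholder question/answer pairs; in practice these would be
    # written or generated from the literature you collected.
    pairs = [
        ("<question derived from a paper>", "<answer supported by that paper>"),
    ]
    print(to_jsonl([build_record(q, a) for q, a in pairs]))
```

Each line of the output is one training example, so the file can be extended incrementally as more literature is processed.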
from huatuo-llama-med-chinese.
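Hello, which operating system are you using, and how did you deploy it locally? I am trying to deploy locally, but bitsandbytes keeps throwing errors.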
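Repetitive output is a very common problem when instruction fine-tuning LLaMA-based models, and it has several possible causes. In our experience, it can be mitigated by extending the model's Chinese vocabulary, adding more training data, and adjusting hyperparameters.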
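The comment does not say which hyperparameters are meant. One decoding-time option, shown here purely as an assumption and not as the project's own method, is a repetition penalty: before sampling each new token, the scores of tokens that were already generated are reduced. The function name `apply_repetition_penalty` and the default value `1.2` are illustrative only.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Return a copy of `logits` with already-generated tokens down-weighted.

    logits: list of raw scores, one per vocabulary token.
    generated_ids: token ids produced so far.
    penalty: values above 1.0 discourage repetition (illustrative default).
    """
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Divide positive scores and multiply negative ones, so the
        # token becomes less likely in both cases.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


if __name__ == "__main__":
    # Tokens 0 and 1 were already generated; token 2 is untouched.
    print(apply_repetition_penalty([2.0, -1.0, 0.5], [0, 1], penalty=2.0))
```

Generation libraries such as Hugging Face Transformers expose a comparable `repetition_penalty` setting, so in practice this is usually a configuration change rather than custom code.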