Comments (13)

guotong1988 commented on June 19, 2024

Whether training gets faster is determined by the learning rate.

from bert-gpu.

guotong1988 commented on June 19, 2024

With num_train_steps=10000 on 3 GPUs, training is effectively 30000 steps.
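
The arithmetic behind that statement can be sketched as follows. This is only an illustration, assuming each GPU processes its own batch at every step; the function names and the per-GPU batch size of 1 are hypothetical and not taken from bert-gpu.

```python
def effective_single_gpu_steps(num_train_steps: int, num_gpus: int) -> int:
    """Single-GPU-equivalent step count, assuming every GPU
    consumes its own batch at each training step."""
    return num_train_steps * num_gpus


def examples_seen(num_train_steps: int, num_gpus: int, per_gpu_batch_size: int) -> int:
    """Total number of examples consumed over the whole run."""
    return num_train_steps * num_gpus * per_gpu_batch_size


# The case from the comment above: 10000 steps on 3 GPUs.
print(effective_single_gpu_steps(10000, 3))  # 30000

# Hypothetical case: 100 steps on 3 GPUs with a per-GPU batch size of 1.
print(examples_seen(100, 3, per_gpu_batch_size=1))  # 300
```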

guotong1988 commented on June 19, 2024

It's written in the README.

image

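rxc205 commented on June 19, 2024

It's written in the README.

image

What do "one data LR fixed" and "one batch LR fixed" each mean here?
I have a question. My dataset has 100 examples in total, and I train on 3 GPUs with num_train_steps=100. Do you mean that these 100 examples run for 100 steps on each GPU, so that the same data amounts to 3*100=300 steps? Is that what you mean?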

guotong1988 commented on June 19, 2024

300 examples.

guotong1988 commented on June 19, 2024

It means literally what it says.

rxc205 commented on June 19, 2024

300 examples.

My dataset has only 100 examples in total. What do you mean by 300 examples?

guotong1988 commented on June 19, 2024

It is equivalent to 300.

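rxc205 commented on June 19, 2024

It is equivalent to 300.

OK, sorry to bother you with one more question:
Your multi-GPU setup is not like the usual one. The 100 examples are not split across the 3 GPUs to be processed together; they are expanded to 300 examples for the 3 GPUs to process.
If training time is reduced by increasing the LR, won't performance in domain training and fine-tuning necessarily drop, going by the logic of your code?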

guotong1988 commented on June 19, 2024

Nobody is forcing you to use my code.

guotong1988 commented on June 19, 2024

image

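rxc205 commented on June 19, 2024

Nobody is forcing you to use my code.

Sorry, my wording was a bit harsh. I really would like to hear your view on this question, though.
With 100 examples and 3 GPUs, the data is expanded to 300 examples for the 3 GPUs to process.
If training time is reduced by increasing the LR, won't performance in domain training and fine-tuning necessarily drop? I am not sure what I am misunderstanding.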

guotong1988 commented on June 19, 2024

The larger learning rate is a consequence of the larger batch size. The backpropagated gradient produced by each data example is unchanged, so model quality does not drop.
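
One way to read this claim, as a rough sketch rather than a description of the bert-gpu code: under plain SGD with a loss averaged over the batch, each example's gradient enters one update with weight learning_rate / batch_size. Scaling the learning rate and the global batch size by the same factor leaves that weight unchanged. The values `base_lr` and `per_gpu_batch` below are hypothetical, and adaptive optimizers such as Adam follow this rule only approximately.

```python
import math


def per_example_update_weight(learning_rate: float, global_batch_size: int) -> float:
    """Weight of one example's gradient in a single SGD update,
    assuming the loss is averaged over the global batch."""
    return learning_rate / global_batch_size


base_lr = 2e-5       # hypothetical single-GPU learning rate
per_gpu_batch = 32   # hypothetical per-GPU batch size
num_gpus = 3

# Single GPU: base learning rate, one per-GPU batch.
single_gpu = per_example_update_weight(base_lr, per_gpu_batch)

# Three GPUs: global batch is 3x larger, learning rate scaled 3x to match.
multi_gpu = per_example_update_weight(base_lr * num_gpus, per_gpu_batch * num_gpus)

# Each example contributes the same amount per update in both setups
# (compared with a tolerance because of floating-point rounding).
print(math.isclose(single_gpu, multi_gpu))
```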
