

JingChaoLiu commented on August 17, 2024

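This commit is an ablation study of the plane clustering step in PMTD. The model is still the one trained with pyramid labels, and its output is still a pyramid mask. The box, however, is not predicted by plane clustering but by direct thresholding. In principle the cut should be made where the z value of the pyramid mask is 0, but for robustness we changed the z threshold to 0.01. In other words, the 0.01 here stands for "obtain the text boundary by thresholding".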
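As a rough illustration of the thresholding described above, here is a minimal sketch. It assumes the pyramid mask is a 2D grid of z values in [0, 1] and returns a plain axis-aligned box. The function name `threshold_box`, the toy mask, and the axis-aligned output are assumptions made for this example, not the repository's actual post-processing.

```python
def threshold_box(mask, z_threshold=0.01):
    """Return (x_min, y_min, x_max, y_max) over pixels with z > z_threshold.

    Hypothetical helper for illustration; returns None if no pixel passes.
    """
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, z in enumerate(row):
            if z > z_threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# Toy pyramid mask: a small peak in the middle plus one pixel of
# near-zero noise in the bottom-right corner.
pyramid_mask = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.3, 0.0, 0.0],
    [0.0, 0.3, 1.0, 0.3, 0.0],
    [0.0, 0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.004],
]

box_robust = threshold_box(pyramid_mask, z_threshold=0.01)
box_at_zero = threshold_box(pyramid_mask, z_threshold=0.0)
```

With the threshold at exactly 0 the stray 0.004 pixel stretches the box to the corner, while 0.01 ignores it, which is the kind of robustness the small positive threshold is meant to buy.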

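As for why the sigmoid was moved from post-processing into the main model, the reason is this:
When you do a binary classification task,
under model.train(), the binary cross entropy loss (in its with-logits form) actually does two things to pred:
pred = pred.sigmoid(); loss = -(gt * log(pred) + (1 - gt) * log(1 - pred))
whereas under model.eval(), post-processing does one thing to pred:
pred = pred.sigmoid()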
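The classification case can be sketched on a single scalar as follows. This is a plain-Python illustration under the assumption that the model outputs a raw logit. The names `bce_with_logits` and `postprocess_classification` are made up for this example and are not functions from the repository.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce_with_logits(logit, gt):
    """Training side: the loss applies sigmoid itself, then cross entropy."""
    pred = sigmoid(logit)
    return -(gt * math.log(pred) + (1.0 - gt) * math.log(1.0 - pred))


def postprocess_classification(logit):
    """Eval side: the model emitted a raw logit, so sigmoid is applied here."""
    return sigmoid(logit)


raw_logit = 0.0                                # model output, no sigmoid inside the model
train_loss = bce_with_logits(raw_logit, 1.0)   # equals log(2) for logit 0, gt 1
eval_prob = postprocess_classification(raw_logit)
```

In both modes the sigmoid lives outside the model, which is why post-processing has to apply it.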

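But when you instead regress a pyramid mask \in [0, 1], the loss function becomes L1 loss:
under model.train(), the pred = pred.sigmoid() that binary cross entropy used to do for you now has to be done by yourself;
and under model.eval(), since the main model already contains pred = pred.sigmoid(), post-processing no longer needs to call sigmoid again.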
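The regression case can be sketched the same way. Here the sigmoid is assumed to sit inside a hypothetical `model_forward`, so both the L1 loss and the post-processing see values already in (0, 1). All names are illustrative only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def model_forward(logit):
    """Hypothetical model head: sigmoid is now part of the model itself."""
    return sigmoid(logit)


def l1_loss(pred, gt):
    """L1 loss does not apply sigmoid, so pred must already lie in (0, 1)."""
    return abs(pred - gt)


def postprocess_regression(pred):
    """Eval side: pred is already squashed, so it is passed through as-is."""
    return pred


pred = model_forward(0.0)          # 0.5
train_loss = l1_loss(pred, 0.5)    # 0.0
z = postprocess_regression(pred)   # still 0.5, no second sigmoid

# Applying sigmoid a second time would map every value into (0.5, 0.74),
# so a background pixel could never fall below a small threshold such as 0.01.
double_squashed_background = sigmoid(model_forward(-10.0))
```

The last line shows why the extra call has to go: a second sigmoid pushes even a near-zero background value above 0.5.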

