
Comments (4)

xmyqsh commented on May 25, 2024

Think of it by analogy with FPN: FPN can be used in both two-stage and one-stage detectors.

from simpledet.

TWDH commented on May 25, 2024

@xmyqsh So do you mean the branches are not actually fused into one, and each branch goes to its own RPN and RCNN head, even without the scale-aware training scheme? Thanks


xmyqsh commented on May 25, 2024

@TWDH
Aha, I got you.
TridentNet is built on the two-stage detector, inherited from Faster R-CNN rather than FPN, but it could be viewed as another take on FPN. It adopts a training scheme similar to the one SNIP introduced, although SNIP uses Faster R-CNN or R-FCN, not FPN. The innovation of TridentNet is that it uses dilation to build a feature pyramid instead of the image pyramid used in SNIP or SNIPER, and that it is pretrained on ImageNet. I'd like to see someone pretrain FPN on ImageNet to see how much gain that gives.

I cannot say whether TridentDilation is better than FPN or vice versa; both use a feature pyramid. TridentDilation can detect small objects at a lower resolution than FPN, but for extremely small objects it has to fall back on an image pyramid. FPN has a similar problem, though it offers higher resolution for small objects. For large objects, TridentDilation uses the same resolution, which is neither flexible nor efficient. For extremely large objects, TridentNet has to fall back on an image pyramid again. But for a specific object scale, TridentNet is definitely better than FPN. For diverse scales, an image pyramid is more suitable for TridentNet because of its scale-aware training scheme.

What is the scale-aware training scheme? It shouts at the detector: Be stupid! Do what you should do! Do what you are good at! Be a scale-specific detector! :)

If I remember correctly, the scale-aware training scheme mainly applies to the RPN phase: it removes the harder extreme-scale examples for a specific feature map to ease the model's learning. The dropped extreme-scale objects can then be handled by other, more suitable feature maps or by an image pyramid.
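To make the filtering idea concrete, here is a minimal sketch of how ground-truth boxes could be assigned to branches by scale. It is not taken from the simpledet code: the branch names, the `VALID_RANGES` values, and the helper functions are all assumptions chosen for illustration, and box scale is assumed to be the square root of the box area.

```python
import math

# Hypothetical valid scale ranges (in pixels) for each branch.
# The numbers are illustrative assumptions, not values read from simpledet.
VALID_RANGES = {
    "small": (0.0, 90.0),
    "medium": (30.0, 160.0),
    "large": (90.0, math.inf),
}


def box_scale(box):
    """Scale of an (x1, y1, x2, y2) box, assumed here to be sqrt(width * height)."""
    x1, y1, x2, y2 = box
    return math.sqrt(max(x2 - x1, 0.0) * max(y2 - y1, 0.0))


def select_boxes_for_branch(boxes, valid_range):
    """Keep only the boxes whose scale falls inside the branch's valid range."""
    lo, hi = valid_range
    return [b for b in boxes if lo <= box_scale(b) <= hi]


def assign_boxes(boxes, valid_ranges=VALID_RANGES):
    """Map each branch name to the ground-truth boxes it is allowed to train on."""
    return {name: select_boxes_for_branch(boxes, r) for name, r in valid_ranges.items()}


gt_boxes = [(0, 0, 20, 20), (0, 0, 100, 100), (0, 0, 400, 400)]
per_branch = assign_boxes(gt_boxes)
```

With overlapping ranges like these, a mid-sized object can train more than one branch, while extreme-scale objects are dropped from the branches that are poorly suited to them.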

For RCNN, all two-stage detectors are the same. RPN runs on several branches/feature maps, and RoI pooling maps every proposal to the same 7x7 size, which should be the fusion you were asking about.
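As a rough illustration of why proposals from different branches end up in a common shape, the sketch below max-pools an arbitrary region of a feature map into a fixed 7x7 grid. It is a simplified, pure-Python version written for this comment (integer cell coordinates, exclusive right/bottom edges, no batching or channels); `roi_max_pool` is a hypothetical helper, not a simpledet function.

```python
def roi_max_pool(feature, roi, out_size=7):
    """Max-pool the region `roi` of a 2D feature map into an out_size x out_size grid.

    feature: list of rows (2D list of numbers).
    roi: (x1, y1, x2, y2) in feature-map cells, with x2 and y2 exclusive.
    The region must be at least 1x1 and lie inside the feature map.
    """
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_size):
        # Row range of bin i: floor for the start, ceil for the end,
        # and at least one cell so small regions still fill every bin.
        r0 = y1 + (i * h) // out_size
        r1 = max(y1 + ((i + 1) * h + out_size - 1) // out_size, r0 + 1)
        row = []
        for j in range(out_size):
            c0 = x1 + (j * w) // out_size
            c1 = max(x1 + ((j + 1) * w + out_size - 1) // out_size, c0 + 1)
            row.append(max(feature[r][c] for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Toy feature map with 20 rows and 30 columns; each cell stores its flat index.
feature_map = [[r * 30 + c for c in range(30)] for r in range(20)]
small_roi = roi_max_pool(feature_map, (5, 5, 8, 8))
medium_roi = roi_max_pool(feature_map, (2, 3, 16, 17))
full_roi = roi_max_pool(feature_map, (0, 0, 30, 20))
```

Whatever the size of the input region, the output is always 7x7, so the same RCNN head can consume proposals coming from any branch.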

Now, to conclude: TridentNet and its scale-aware training scheme could be used in a one-stage detector. You can find some clues in the FCOS anchor selection scheme, which has more or less adopted the scale-aware training scheme.

Lastly, I have developed a detector called CropNet, which can double boost APs without an extra order of computation, targeting autonomous driving scenarios. Instead of pretraining it on ImageNet, we could train it on a larger autonomous driving dataset.

I'm not the author of TridentNet, so I may have misinterpreted some of it. I'd love to see the author correct me :)

Oops...
I missed an important feature of TridentNet: the weight sharing in TridentDilation. I have to say, this is the most innovative part of the design and the one I like most. It allows objects of different scales to train the same weights. As a result, using only one branch, trained with objects from all three branches, can give very promising performance at fast speed.
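The weight-sharing idea can be sketched with a toy 1D example: one set of kernel weights is applied at several dilation rates, so each "branch" sees a different receptive field without adding parameters. This is a pure-Python illustration under my own assumptions (a 3-tap kernel, dilations 1, 2 and 3, and the hypothetical helpers `dilated_conv1d` and `receptive_field`), not the TridentNet implementation.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, weights, dilation):
    """Valid 1D correlation whose taps are spaced `dilation` samples apart."""
    k = len(weights)
    span = receptive_field(k, dilation)
    return [
        sum(weights[j] * signal[i + j * dilation] for j in range(k))
        for i in range(len(signal) - span + 1)
    ]


# One shared kernel, reused by every branch.
shared_weights = [1.0, 0.0, -1.0]
signal = [float(v * v) for v in range(12)]

# Each "branch" differs only in its dilation rate, not in its weights.
branch_outputs = {d: dilated_conv1d(signal, shared_weights, d) for d in (1, 2, 3)}
branch_fields = {d: receptive_field(len(shared_weights), d) for d in (1, 2, 3)}
```

Because the weights are identical across branches, a gradient from any branch updates the same parameters, and at inference time a single branch can be run on its own.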


TWDH commented on May 25, 2024

Thanks for the comments. It seems TridentNet splits the original ResNet into 3 branches, and each branch connects to its own RPN and RCNN head, which means there are 3 RPNs and 3 RCNNs altogether that do not interfere with each other. I notice that scale-aware training actually improves results by only about 0.3%, which is not that important :) Not sure if I'm right.

