
Comments (4)

hellbell commented on July 28, 2024

@vinson2233
Hi, thank you for your interest in our work!

> Hi, I really like your paper, but I have a question.
> I understand how CutMix is applied to the image classification task, but I'm confused about how you implement CutMix for the image captioning task.
> Given λ sampled from the uniform distribution on (0, 1), how do you merge two captions?
>
> It would be great if you could provide an example like:
> with λ = 0.2, y1 = "A man wearing blue jeans standing inside the bus" and y2 = "A bird is flying in the sky" will be combined into xxx
> Thanks :)

Actually, our image captioning experiments do not apply CutMix during training. Only the model pretrained with CutMix is used, in order to validate its transferability.
I think your idea of combining two caption labels is quite interesting, and I hope to see your further work!
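
For reference, the classification-side mixing that the question contrasts against can be sketched roughly as follows. This is a minimal pure-Python sketch under stated assumptions, not the repository's code: `rand_bbox` and `cutmix` are illustrative names that operate on nested lists, λ is passed in by the caller rather than sampled, and labels are assumed to be one-hot vectors. It covers image classification only, since the captioning experiments did not mix captions.

```python
import math
import random


def rand_bbox(width, height, lam, rng=random):
    """Sample a box covering roughly (1 - lam) of a width x height image."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_w = int(width * cut_ratio)
    cut_h = int(height * cut_ratio)
    # Box centre is uniform over the image; the box is clipped at the borders.
    cx = rng.randrange(width)
    cy = rng.randrange(height)
    x1 = max(cx - cut_w // 2, 0)
    y1 = max(cy - cut_h // 2, 0)
    x2 = min(cx + cut_w // 2, width)
    y2 = min(cy + cut_h // 2, height)
    return x1, y1, x2, y2


def cutmix(image_a, image_b, label_a, label_b, lam, rng=random):
    """Paste a box from image_b into a copy of image_a and mix the labels."""
    height, width = len(image_a), len(image_a[0])
    x1, y1, x2, y2 = rand_bbox(width, height, lam, rng)
    mixed = [row[:] for row in image_a]  # leave image_a untouched
    for y in range(y1, y2):
        mixed[y][x1:x2] = image_b[y][x1:x2]
    # Clipping can shrink the box, so recompute lam from the actual area.
    lam = 1.0 - (x2 - x1) * (y2 - y1) / (width * height)
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label, lam


if __name__ == "__main__":
    a = [[0] * 8 for _ in range(8)]  # stand-in for image A
    b = [[1] * 8 for _ in range(8)]  # stand-in for image B
    mixed, label, lam = cutmix(a, b, [1.0, 0.0], [0.0, 1.0], lam=0.75)
    print(lam, label)
```

The label weight for the second image equals the fraction of pixels actually pasted from it, which is why λ is recomputed after clipping. There is no obvious analogue of this area-weighted mixing for two caption sentences, which is the open question raised above.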

from cutmix-pytorch.

hellbell commented on July 28, 2024

> From my understanding, the NIC workflow is Image -> CNN -> LSTM -> Caption.
> So the CNN part is replaced with your CutMix-pretrained model?

Yes, thanks.


vinson2233 commented on July 28, 2024

Thanks for the fast response :)
I have a follow-up about this section of the paper:

> Transferring to MS-COCO image captioning:
> We used Neural Image Caption (NIC) [43] as the base model for image captioning experiments. We have changed the backbone network of encoder from GoogLeNet [43] to ResNet-50.

From my understanding, the NIC workflow is Image -> CNN -> LSTM -> Caption.
So the CNN part is replaced with your CutMix-pretrained model?


vinson2233 commented on July 28, 2024

Great, thank you :)
