Hi, thanks for the great paper. In Fig. 4 you state that "red denotes high scores while blue denotes low scores." Shouldn't it be the opposite, with blue denoting high scores and red denoting low scores? Otherwise, why is the background getting high scores?
Thanks for releasing this nice work! I tested the maskedVectorQuantization module on my task. However, the masked version is 4x slower than the version without the masker and demasker modules. Are there any suggestions for accelerating training? Thank you in advance!
Hello author, I am very interested in your project! Because my machine has limited compute, could you share your pre-trained DQVAE checkpoint (ckpt) so that I can debug the code for the second stage?
Hi, I have a question, as described in the issue title: I am confused by the rules used to separate the feature vectors in the encoder's grid feature map into important and unimportant samples. Sec. 3.2 states that the larger the score s_l is, the more important the region feature z_l is. However, I don't understand the label assignment strategy used to distinguish important samples from unimportant ones when training the lightweight scoring function. Could you please clarify it in this issue? Thank you in advance.