Thanks for your awesome work! I'm wondering how you simulate a missing sensor (e.g., a multi-modal model that receives only one modality as input). Do you set all pixel values to 0 to simulate a missing camera, or do you do something else? And how do you handle a missing LiDAR point cloud? I couldn't find this information in the paper.
Thank you for your great work. The training scheme is so interesting.
But I am curious whether the MoE block really resolves task conflicts in the multi-task setting. It seems to be just a different version of an attention block.
Anyway, when will you release the code for your great work?