The regularizer term U is the number of parameters used in the convolution filter: U_regularizer = 2**(self.K + torch.sum(self.gate))
Every DGConv block records this regularizer term, and these terms are meant to be used as complexity constraints. I am confused about how to add them to the final loss function. I have read some discussions of this paper, and they said that the authors used a NAS method to search the architecture. What do you think about this?
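To make the question concrete, here is a minimal sketch of one common way to fold per-block complexity terms into a loss: sum them and add the sum with a small weight. This is only an assumption about how it could be done, not something the authors have confirmed. The names `block_regularizer`, `total_loss`, and `weight`, as well as the example values, are hypothetical.

```python
def block_regularizer(K, gate):
    """Return U = 2 ** (K + sum(gate)), mirroring the expression quoted above."""
    return 2 ** (K + sum(gate))


def total_loss(task_loss, regularizers, weight):
    """Add a weighted sum of the per-block regularizers to the task loss.

    `weight` is a hypothetical trade-off hyperparameter, not a value
    taken from the paper or the repository.
    """
    return task_loss + weight * sum(regularizers)


# Hypothetical example: two blocks with K = 3 and different gate settings.
regs = [
    block_regularizer(3, [1, 0, 1]),  # 2 ** (3 + 2)
    block_regularizer(3, [0, 0, 0]),  # 2 ** (3 + 0)
]
loss = total_loss(2.0, regs, weight=0.25)
```

The sketch uses plain Python numbers. Since it relies only on `+`, `*`, `**`, and `sum`, the same functions should also accept 1-D torch tensors for `gate` and a scalar tensor for `task_loss`, but whether gradients flow through the gates depends on how `self.gate` is produced in the actual model.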
Hello @d-li14, the Google Drive link you provided is hard to access for users in mainland China, and we are struggling to download g_resnext50_cosine.pth. Could you provide a Baidu Yunpan download link?
Thanks!