Greetings, This is Aman Goyal. I am currently pursuing research in M

Regarding training using BDD100K dataset about diode HOT 9 CLOSED

nvlabs commented on August 11, 2024

Regarding training using BDD100K dataset

from diode.

Comments (9)

akshaychawla commented on August 11, 2024

Thank you for the interest in our work!

We can perform mean-squared-error-based knowledge distillation with the BDD100K dataset. Please check out the instructions at https://github.com/NVlabs/DIODE/tree/yolo/knowledge_distillation/yolov3-master that describe how to distill a Yolo-v3 teacher model trained on the COCO dataset into a student Yolo-v3 network, using images from the BDD100K dataset as the proxy distillation dataset. Note that we use BDD100K as the proxy dataset because of the data-free assumption in our paper: once the teacher has been trained, we discard the original dataset (hence, data-free) and are left with only the teacher model weights.

However, if you wish to train the teacher on BDD100K and then distill it into a student using the BDD100K dataset, it would be better to use a more up-to-date repository for Yolo-v3, https://github.com/ultralytics/yolov3 , and adapt our distillation code, https://github.com/NVlabs/DIODE/blob/yolo/knowledge_distillation/yolov3-master/utils/distill_utils.py , for your purposes. Also note that you may need to convert the BDD100K dataset into a format consistent with the one used by https://github.com/ultralytics/yolov3 .
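As a rough illustration of the format conversion mentioned above, the sketch below turns one corner-format box into a normalized "class cx cy w h" label line of the kind YOLO-style loaders typically read. It assumes that a BDD100K label stores its box under the keys x1, y1, x2, y2 in pixels and that images are 1280x720; both are assumptions to check against your copy of the dataset. The function names and the class id are hypothetical.

```python
def bdd_box_to_yolo(box2d, img_w=1280, img_h=720):
    """Convert a corner-format box (pixels) to normalized (cx, cy, w, h).

    Assumes box2d is a dict with keys x1, y1, x2, y2 (an assumption about
    the label schema) and that the image size is img_w x img_h.
    """
    x1, y1, x2, y2 = (float(box2d[k]) for k in ("x1", "y1", "x2", "y2"))
    cx = (x1 + x2) / 2.0 / img_w   # box centre, as a fraction of image width
    cy = (y1 + y2) / 2.0 / img_h   # box centre, as a fraction of image height
    w = (x2 - x1) / img_w          # box width, normalized
    h = (y2 - y1) / img_h          # box height, normalized
    return cx, cy, w, h


def to_yolo_line(class_id, box2d, img_w=1280, img_h=720):
    """Format one object as a 'class cx cy w h' text line."""
    cx, cy, w, h = bdd_box_to_yolo(box2d, img_w, img_h)
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # Hypothetical object: class id 3, box covering the centre of the frame.
    print(to_yolo_line(3, {"x1": 320, "y1": 180, "x2": 960, "y2": 540}))
```

The mapping from BDD100K category names to integer class ids is left out on purpose, since it depends on which categories you keep.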

Let me know if I can help you in any other way. Thanks!

AmanGoyal99 commented on August 11, 2024

Greetings,

I want to perform distillation between two backbones.
I already have a Resnet-50 trained on BDD100K. I want to use this Resnet-50 model as the teacher and distill it into my own architecture as the student.
Could you please guide me on how I can achieve this?
Thanks

animesh-007 commented on August 11, 2024

Hi @akshaychawla. Can you share any resources for the SelfSimilarityHook which you have in the deepinversion code? In the original code, it was not present, and I am not able to find relevant papers for it.

akshaychawla commented on August 11, 2024

@AmanGoyal99 I'd like to know a little more about the problem you are trying to solve before recommending a solution and pointing you towards a snippet in our repository that might be helpful.

Typically, to make a hard decision (e.g., through an argmax layer), a network must suppress information in its output space that might reveal contextual details about the input. To distill this hidden/suppressed information into a student network, we must enhance it and then make the student network imitate the enhanced outputs using an appropriate loss function.
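To make the idea concrete for the simplest case, a classifier: one common way to expose information that an argmax would hide is to soften the logits with a temperature before the softmax, and then train the student against those softened targets. The comment above does not name a specific technique, so treat this as an assumed illustration in plain Python rather than what the DIODE code does; the function names and the default temperature are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a flatter output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy of the student's softened prediction against the
    teacher's softened output (the 'enhanced' targets)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


if __name__ == "__main__":
    teacher = [5.0, 1.0, 0.0]
    print(softmax(teacher, 1.0))   # sharply peaked on the first class
    print(softmax(teacher, 4.0))   # flatter: relative class similarity is visible
    print(soft_target_loss([0.0, 0.0, 0.0], teacher))
```

For detection or other structured outputs the "enhancement" step looks different, which is why the details of the teacher's task matter.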

To enhance the output space, we must understand the teacher's task, output space, and loss function. Can you tell me:

1. What is the task of the Resnet-50 backbone (e.g., detection, segmentation, classification)?
2. Specifics about (1) (e.g., if object detection, are you using Yolo, RPN, or something else?)
3. What repository was used to train (1)?
4. Any other information you can provide regarding (1) that might be useful for distillation.

Please note that this repository does not support easily loading arbitrary backbones for detection training, and it does not support training models on the BDD100K dataset.
We only support the distillation of a pre-trained COCO Yolo-v3 model into another Yolo-v3 model, using proxy datasets such as BDD100K with their bbox labels discarded.

akshaychawla commented on August 11, 2024

@animesh-007 Please refer to issue #7 to discuss the self-similarity hook. Each thread is restricted to one issue as much as possible.

AmanGoyal99 commented on August 11, 2024

Greetings @akshaychawla ,

1. The Resnet-50 is used as the backbone for Faster-RCNN.
2. It comes from the paper 'Quasi-Dense Similarity Learning for Multiple Object Tracking' (QDTrack), a tracking method that uses Faster-RCNN and an RPN with Resnet-50 as the backbone.
3. The QDTrack repository was used for training: https://github.com/SysCV/qdtrack
4. My objective is to replace the Resnet-50 (trained on BDD100K) with a lighter backbone, and I am trying to obtain that lighter backbone using KD.

Please do let me know if you have any queries and would want any other info about it.

Thanks

akshaychawla commented on August 11, 2024

Thanks for the quick response @AmanGoyal99 . After looking at the problem, it seems that our repository is not appropriate for knowledge distillation of Faster-RCNN-based networks. Our repository only supports Yolo-v3 single-stage object detection models with a DarkNet backbone.

In order to distill a Faster-RCNN model, you will need to distill 3 items: the backbone, the RPN head and the ROI detection head. I suggest you look at the following papers which distill Faster-RCNN teacher and student models:

I didn't search for the code of these papers, but it should be fairly easy to find and/or implement. In our code base, only one file, https://github.com/NVlabs/DIODE/blob/yolo/knowledge_distillation/yolov3-master/utils/distill_utils.py , implements the hint learning approach described in Figure 1 of [1], and you can adapt it when implementing [1].

AmanGoyal99 commented on August 11, 2024

So I just want to distill the backbone actually

akshaychawla commented on August 11, 2024

Apologies for the delay @AmanGoyal99 . If you just want to distill the backbone, you can use the Distillation.mse loss from this module, https://github.com/NVlabs/DIODE/blob/yolo/knowledge_distillation/yolov3-master/utils/distill_utils.py , in your distillation code. While this module was designed to distill the outputs of a single-stage detector, it should still work well for distilling just the backbone. The rest of our repository is not relevant to your particular problem.
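For readers who want to see the shape of an MSE-based backbone loss, here is a minimal sketch in plain Python. It does not reproduce Distillation.mse from distill_utils.py, whose signature is not shown in this thread. It only shows the general idea: compare student and teacher feature vectors element-wise, optionally passing the student features through a linear adapter when the two backbones have different channel counts. All names here are hypothetical, and a real training loop would use a tensor library and a learned adapter.

```python
def mse(student_feats, teacher_feats):
    """Mean squared error between two equal-length feature vectors."""
    if len(student_feats) != len(teacher_feats) or not teacher_feats:
        raise ValueError("feature vectors must be non-empty and the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_feats, teacher_feats)]
    return sum(squared) / len(squared)


def project(feats, weights):
    """Linear adapter: weights has one row per teacher channel and one
    column per student channel, so the output matches the teacher's width."""
    return [sum(w * f for w, f in zip(row, feats)) for row in weights]


def backbone_distill_loss(student_feats, teacher_feats, adapter=None):
    """MSE between (optionally adapted) student features and teacher features."""
    if adapter is not None:
        student_feats = project(student_feats, adapter)
    return mse(student_feats, teacher_feats)


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]               # 3 teacher channels
    student = [1.0, 2.0]                    # 2 student channels
    adapter = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(backbone_distill_loss(student, teacher, adapter))
```

The adapter is only needed when the lighter backbone produces features of a different width than the Resnet-50 teacher; with matching widths the loss reduces to a plain MSE.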
