spherical-knowledge-distillation's Issues
Replicating the Results as Reported in the Paper
Hi there @forjiuzhou,
The idea of normalising logits is an interesting one.
However, I am unable to replicate the results using the code provided in this repo. I set the teacher network to ResNet50 and the student network to ResNet18.
The only other change I made was to switch the DALI Pipeline from reading TFRecords to reading raw ImageNet data, as in this example: https://github.com/NVIDIA/DALI/blob/eb712f593d98afb87ea56700be7cfd83f512a5f8/docs/examples/use_cases/pytorch/resnet50/main.py
All other hyper-parameters are set to match those in the paper.
--LN argument
In the Training section of README.md, the command for training ResNet18 uses a --LN argument, but I cannot find it in the code. Also, -a should be ResNet18, not ResNet50. Could you address this? @forjiuzhou
Question about minimal code for SKD
Hi there, I've read your interesting paper, and thank you for sharing the code.
I'm rewriting the code to fit the RepDistiller format, and I have a question.
As mentioned in the paper, the teacher's average norm (l_avg) is needed to apply your idea.
Then we can form the new logits (f^_i(x) * l_avg, f^_j(x) * l_avg),
but your simplified code doesn't seem to reflect these operations.
Is this operation replaced by multiplying by a constant between 2 and 3?
Please bear with me if I have not fully understood the paper. :)
Thanks.
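For readers following this question, here is a minimal pure-Python sketch of the operation as the question describes it: normalise each logit vector, rescale it by the teacher's average norm l_avg, then compare teacher and student with a KL divergence. It is not taken from the repository. The function names, the mean-centring step, and the way l_avg is computed are all assumptions made for illustration.

```python
import math


def center_and_normalize(logits):
    """Subtract the mean, then scale to unit L2 norm (centring is an assumption)."""
    mean = sum(logits) / len(logits)
    centered = [z - mean for z in logits]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        return centered  # all logits equal: nothing to scale
    return [c / norm for c in centered]


def average_norm(batch_logits):
    """Mean L2 norm of the centred logits over a batch (hypothetical l_avg)."""
    norms = []
    for logits in batch_logits:
        mean = sum(logits) / len(logits)
        norms.append(math.sqrt(sum((z - mean) ** 2 for z in logits)))
    return sum(norms) / len(norms)


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def spherical_kd_loss(student_logits, teacher_logits, l_avg):
    """KL(teacher || student) on logits normalised and rescaled by l_avg."""
    s = [l_avg * z for z in center_and_normalize(student_logits)]
    t = [l_avg * z for z in center_and_normalize(teacher_logits)]
    p_s = softmax(s)
    p_t = softmax(t)
    return sum(pt * (math.log(pt) - math.log(ps)) for pt, ps in zip(p_t, p_s))


if __name__ == "__main__":
    teacher = [2.0, 1.0, -3.0]
    student = [0.5, 0.2, -0.7]
    l_avg = average_norm([teacher])
    print(spherical_kd_loss(student, teacher, l_avg))
```

Under this reading, the student's own logit magnitude has no effect on the loss, and replacing l_avg with a fixed constant would only change the scale applied to both normalised vectors. Whether the simplified code does exactly that is for the author to confirm.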
Setting of learning rate and KD-Loss weight
The learning rate and KD-loss weight schedules in your code are quite unusual. Are they consistent with the settings used in the paper? Could you also provide the exact training parameters for ResNet18/ResNet50?
if epoch < 30:
    args.alpha = 0.9
elif epoch < 60:
    args.alpha = 0.9
elif epoch < 80:
    args.alpha = 0.5
elif epoch < 100:
    args.alpha = 0.1
factor = epoch // 30
# factor = epoch // 100
if epoch >= 80:
    factor = factor + 1
# if epoch >= 90:
#     factor = factor + 1
lr = args.lr * (0.1 ** factor)
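To make the quoted schedule easier to inspect, here is a small sketch that wraps the same logic in two standalone functions and prints the values at a few epochs. The function names and the base learning rate of 0.1 are assumptions for illustration. The thresholds are copied from the snippet above.

```python
def kd_alpha(epoch, current_alpha):
    """KD-loss weight from the quoted snippet; unchanged from epoch 100 on."""
    if epoch < 60:  # the quoted branches for <30 and <60 both set 0.9
        return 0.9
    if epoch < 80:
        return 0.5
    if epoch < 100:
        return 0.1
    return current_alpha  # the snippet assigns nothing for epoch >= 100


def learning_rate(epoch, base_lr):
    """Step decay by 10x every 30 epochs, with one extra step from epoch 80."""
    factor = epoch // 30
    if epoch >= 80:
        factor += 1
    return base_lr * (0.1 ** factor)


if __name__ == "__main__":
    base_lr = 0.1  # assumed value, not taken from the repository
    for epoch in (0, 30, 60, 80, 90):
        print(epoch, kd_alpha(epoch, 0.9), learning_rate(epoch, base_lr))
```

Read this way, the learning rate drops at epochs 30, 60, 80 and again at 90, because 90 // 30 is 3 and the extra step from epoch 80 still applies. The KD weight falls from 0.9 to 0.5 at epoch 60 and to 0.1 at epoch 80.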
About another paper you submitted to ICLR 2022
Hello, your work on ATKD is very interesting.
Reducing the Teacher-Student Gap via Adaptive Temperatures.
I could only find it on OpenReview, so perhaps you have since revised it.
Do you plan to revise it following the reviewers' advice and upload it to arXiv?
I am looking forward to your reply! Thank you.