deep-learning-essay's Issues
Understanding and Robustifying Differentiable Architecture Search
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
One sentence
Learns a model-agnostic meta-learner that enables fast adaptation to similar tasks. In principle, this meta-learner can be wrapped around any model trained with gradient descent.
Paper
https://arxiv.org/abs/1703.03400
Code
https://github.com/cbfinn/maml
Motivation
Novelties & Key contribution
Method
Experiments
Rethinking
Reference
https://zhuanlan.zhihu.com/p/57864886
https://towardsdatascience.com/paper-repro-deep-metalearning-using-maml-and-reptile-fd1df1cc81b0
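The adapt-then-meta-update loop can be sketched as follows. This is a minimal first-order MAML sketch on a toy 1-D regression task family; the task distribution, learning rates, and the first-order simplification are illustrative assumptions, not details from the paper:

```python
import numpy as np

# First-order MAML sketch: meta-learn an initialization w such that one
# inner gradient step adapts well to any task y = a * x with a in [1, 3].
rng = np.random.default_rng(0)

def loss_grad(w, x, y):
    # d/dw mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
    return np.mean(2 * (w * x - y) * x)

w = 0.0                      # meta-parameter (the shared initialization)
alpha, beta = 0.05, 0.1      # inner / outer learning rates
for step in range(500):
    a = rng.uniform(1.0, 3.0)                  # sample a task (true slope)
    x = rng.normal(size=10); y = a * x         # task support set
    w_task = w - alpha * loss_grad(w, x, y)    # inner adaptation step
    x2 = rng.normal(size=10); y2 = a * x2      # task query set
    # first-order meta-update: gradient evaluated at the adapted params
    w = w - beta * loss_grad(w_task, x2, y2)

print(w)  # settles near the middle of the task distribution
```

After meta-training, `w` sits near the center of the slope range, so a single gradient step moves it close to any sampled task's optimum.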
Single Path One-Shot Neural Architecture Search with Uniform Sampling
From Zhang Xiangyu's team at Megvii.
"One-Shot" means that once the supernet is trained, subnets are generated in some way (e.g., by sampling) and their inference accuracy is evaluated directly, without any fine-tuning, which greatly speeds up architecture search. The supernet's weights are copied into each subnet, and the resulting accuracy stands in for what the subnet would reach if trained to convergence; the quality of this proxy directly determines how well a One-Shot method works.
"Single Path" refers to how the supernet is trained: instead of mixing all candidate operations into the network with learned weights as DARTS does, each training step activates only one randomly sampled path.
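The uniform single-path sampling described above can be sketched as follows; the candidate op names, layer count, and `train_step` placeholder are illustrative, not from the paper:

```python
import random

# Single-Path One-Shot supernet training (minimal sketch): every layer
# holds several candidate ops, and each training step activates exactly
# one uniformly sampled op per layer, instead of mixing all candidates
# with learned weights as DARTS does.
random.seed(0)
CANDIDATES = ["conv3x3", "conv5x5", "identity", "maxpool"]
NUM_LAYERS = 4

def sample_single_path():
    # one active op per layer, sampled uniformly at random
    return [random.choice(CANDIDATES) for _ in range(NUM_LAYERS)]

def train_step(path):
    # placeholder: in a real supernet only the weights on `path` update
    return path

for step in range(3):
    print(train_step(sample_single_path()))
```

After training, the same `sample_single_path` (or an evolutionary search over paths) produces subnets that inherit the supernet weights for direct evaluation.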
ADVERSARIAL AUTOAUGMENT
Automatic data augmentation implemented with a GAN-style adversarial setup.
SM-NAS: Structural-to-Modular Neural Architecture Search for Object Detection
One sentence
A paper from Huawei Noah's Ark Lab. The search is split into two stages, coarse to fine:
1) Stage One searches the input resolution, the backbone type (ResNet18, ResNet50, ResNeXt, or MobileNetV2), feature fusion (with or without), RPN (with or without), and the R-CNN head; it fine-tunes from ImageNet-pretrained models.
2) Stage Two searches the backbone itself, training from scratch directly on COCO.
Paper
https://arxiv.org/abs/1911.09929
Code
Motivation
Novelties & Key contribution
Method
Experiments
Rethinking
Reference
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
BlockQNN: Efficient Block-wise Neural Network Architecture Generation
Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity
Understanding Architectures Learnt by Cell-based Neural Architecture Search
Lineage Stash: Fault Tolerance Off the Critical Path
Res2Net: A New Multi-scale Backbone Architecture
Random Search and Reproducibility for Neural Architecture Search
AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
MixMatch: A Holistic Approach to Semi-Supervised Learning
Single-Path Mobile AutoML: Efficient ConvNet Design and NAS Hyperparameter Optimization
Can search over channel counts, feature-map sizes, ...
A Closer Look at Deep Policy Gradients
Understanding and Robustifying Differentiable Architecture Search
MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning
Neural Optimizer Search with Reinforcement Learning
ONCE FOR ALL: TRAIN ONE NETWORK AND SPECIALIZE IT FOR EFFICIENT DEPLOYMENT
Train one network, then sample different subnets from it to serve different latency requirements.
https://arxiv.org/abs/1908.09791
Multi-objective Neural Architecture Search via Predictive Network Performance Optimization
SEARCHING FOR ACTIVATION FUNCTIONS
Bounding Box Regression with Uncertainty for Accurate Object Detection
MobileNetV2: Inverted Residuals and Linear Bottlenecks
https://arxiv.org/pdf/1801.04381.pdf
Linear Bottlenecks: the final 1x1 projection in each block omits the nonlinearity, since ReLU destroys information in low-dimensional feature spaces.
Inverted Residual:
In a ResNet Residual Block, the main branch consists of a 1x1, a 3x3, and a 1x1 convolution, where the first 1x1 reduces the channel count to 1/4 of the input. The Inverted Residual does the opposite: the first 1x1 increases the channel count.
The point of this is to give the (depthwise) 3x3 convolution a higher-dimensional space to operate in, compensating for its limited per-channel capacity.
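The channel flow through the two block types can be traced with toy numbers (illustrative channel counts; expansion factor 6 as in MobileNetV2):

```python
# Channel counts along the main branch of each block:
# ResNet bottleneck: 1x1 reduce -> 3x3 -> 1x1 restore
# Inverted residual: 1x1 expand -> 3x3 depthwise -> 1x1 linear project
def resnet_bottleneck(c):
    return [c, c // 4, c // 4, c]

def inverted_residual(c, t=6):
    return [c, c * t, c * t, c]

print(resnet_bottleneck(256))   # [256, 64, 64, 256]
print(inverted_residual(24))    # [24, 144, 144, 24]
```

The bottleneck narrows in the middle; the inverted residual widens there, which is cheap because the 3x3 in the middle is depthwise.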
Identity Mappings in Deep Residual Networks
The second ResNet paper, following up on "Deep Residual Learning for Image Recognition".
Deep Residual Learning for Image Recognition
Accelerating Neural Architecture Search using Performance Prediction
R-FCN: Object Detection via Region-based Fully Convolutional Networks
https://arxiv.org/pdf/1605.06409.pdf
An improved variant of Faster R-CNN.
IRLAS: Inverse Reinforcement Learning for Architecture Search
Enhancing Generalization of First-Order Meta-Learning
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
Original paper:
https://arxiv.org/abs/1902.09630
Chinese explanation:
https://zhuanlan.zhihu.com/p/57992040
Regularized Evolution for Image Classifier Architecture Search
One sentence
A genetic algorithm with an aging strategy serves as the sampler for architecture search.
Paper
https://arxiv.org/pdf/1802.01548.pdf
Code
Q1. Why is it called "Regularized"?
Because the genetic algorithm uses an aging strategy: the oldest individual is evicted, not the worst one.
A key question is why the "oldest" model is removed each time rather than the worst-performing one. The authors argue that a sufficiently good model will very likely have produced offspring before it is evicted. If the worst sample were removed each time, the surviving individuals in the queue would tend to descend from a common ancestor, and the discovered architectures would easily lose diversity; in genetics this is called inbreeding.
(The passage above is translated from 无影 on Zhihu: https://zhuanlan.zhihu.com/p/81126873)
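The aging-evolution loop can be sketched as follows. This is an assumed simplification: an "architecture" is just a bit-string and fitness counts its ones, standing in for trained-subnet accuracy:

```python
import random
from collections import deque

# Regularized (aging) evolution sketch: tournament selection picks a
# parent, a mutated child joins the queue, and the OLDEST individual is
# evicted rather than the worst one.
random.seed(0)

def fitness(arch):
    # toy stand-in for validation accuracy: more 1s is better
    return sum(arch)

POP, CYCLES, SAMPLE = 20, 100, 5
population = deque([[random.randint(0, 1) for _ in range(8)]
                    for _ in range(POP)])
history = list(population)
for _ in range(CYCLES):
    tournament = random.sample(list(population), SAMPLE)
    parent = max(tournament, key=fitness)
    child = parent[:]
    i = random.randrange(len(child))
    child[i] ^= 1                   # mutate one position
    population.append(child)
    population.popleft()            # age out the oldest, not the worst
    history.append(child)

best = max(history, key=fitness)
print(fitness(best))
```

Because eviction ignores fitness, a good individual survives only by reproducing before it ages out, which keeps the population diverse.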
Learning Spatial Fusion for Single-Shot Object Detection
NAS-Bench-101: Towards Reproducible Neural Architecture Search
Efficient Neural Architecture Search via Proximal Iterations
Google Vizier: A Service for Black-Box Optimization
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
https://arxiv.org/pdf/1704.04861.pdf
Proposes the depthwise separable convolution, which splits a standard convolution into two parts, a depthwise convolution and a pointwise convolution, cutting the parameter count of the convolution roughly by a factor of 1/K², where K is the kernel size.
In practice, however, the gains are often disappointing: GPU inference performance is poor, and the benefit shows up mainly in mobile inference.
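The roughly 1/K² parameter reduction follows directly from the counts (bias terms omitted; channel numbers here are illustrative):

```python
# Standard KxK conv: every output channel filters every input channel.
def standard_conv_params(c_in, c_out, k):
    return c_in * c_out * k * k

# Depthwise separable: one KxK filter per input channel, then a 1x1
# pointwise conv to mix channels.
def separable_conv_params(c_in, c_out, k):
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

std = standard_conv_params(256, 256, 3)   # 589824
sep = separable_conv_params(256, 256, 3)  # 67840
print(std, sep, std / sep)                # ratio ~ 8.7, close to K^2 = 9
```

The ratio approaches K² as the output channel count grows, since the pointwise term dominates the depthwise term.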
Evaluating The Search Phase of Neural Architecture Search
https://openreview.net/forum?id=H1loF2NFwr
ICLR 2020 accepted
REVISITING BATCH NORMALIZATION FOR PRACTICAL DOMAIN ADAPTATION
Bag of Freebies for Training Object Detection Neural Networks
ADVERSARIAL AUTOAUGMENT
A Simple Framework for Contrastive Learning of Visual Representations
ACCELERATING FRAMEWORK FOR SIMULTANEOUS OPTIMIZATION OF MODEL ARCHITECTURES AND TRAINING HYPERPARAMETERS
BETANAS: BalancEd TrAining and selective drop for Neural Architecture Search
Understanding and Simplifying One-Shot Architecture Search
The seminal paper on One-Shot methods.
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Overview
Casts NAS as a path-pruning problem.
Can be seen as the forerunner of Single-Path One-Shot: it binarizes the network's paths to cut the high GPU-memory cost of DARTS.
Paper: https://arxiv.org/abs/1812.00332
Code: https://github.com/mit-han-lab/ProxylessNAS/tree/master/search
Author's explanation (Chinese): https://www.zhihu.com/question/296404213/answer/547163236
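The path binarization idea can be sketched as follows (an assumed simplification, with illustrative op names): architecture parameters define a softmax over candidate ops, but each forward pass activates only one sampled path, so memory stays at the single-path level instead of DARTS's all-paths level.

```python
import math
import random

random.seed(0)
alpha = {"conv3x3": 0.5, "conv5x5": 0.2, "identity": -0.3}  # arch params

def binarize(alpha):
    # sample ONE active op from softmax(alpha); all other gates are 0
    z = sum(math.exp(a) for a in alpha.values())
    probs = {op: math.exp(a) / z for op, a in alpha.items()}
    r, acc = random.random(), 0.0
    for op, p in probs.items():
        acc += p
        if r <= acc:
            return op
    return op  # fallback for floating-point rounding

print(binarize(alpha))  # one op name, e.g. "conv3x3"
```

During search, the architecture parameters are updated so that better ops accumulate probability mass, and the final network keeps only the highest-probability path per layer.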
NAS evaluation is frustratingly hard
RON: Reverse Connection with Objectness Prior Networks for Object Detection
https://arxiv.org/pdf/1707.01691.pdf
An improved variant of Faster R-CNN.
Understanding Why Neural Networks Generalize Well Through GSNR of Parameters
NAS-BENCH-201: EXTENDING THE SCOPE OF REPRODUCIBLE NEURAL ARCHITECTURE SEARCH