This is a collection of papers that I have read deeply. I take handwritten notes because it forces me to think through and articulate the authors' points clearly, and drawings make it quick to capture an idea.
The notes may contain the authors' points, my own points, my questions, and some ideas and criticisms.
- https://arxiv.org/pdf/1512.00567.pdf - Rethinking the Inception Architecture for Computer Vision
- https://arxiv.org/abs/1906.02629 - When Does Label Smoothing Help?
- https://arxiv.org/pdf/1705.03122.pdf - Convolutional Sequence to Sequence Learning
- https://arxiv.org/abs/1608.05859 - Using the Output Embedding to Improve Language Models
- http://u.cs.biu.ac.il/~yogo/nnlp.pdf - A Primer on Neural Network Models for NLP.
- https://arxiv.org/abs/1703.03906 - Massive Exploration of Neural Machine Translation Architectures.
- https://arxiv.org/abs/1607.06450 - Layer Normalization
- https://arxiv.org/abs/1411.4038 - Fully Convolutional Networks for Semantic Segmentation
- https://arxiv.org/abs/1612.03144 - Feature Pyramid Networks for Object Detection
- https://arxiv.org/abs/1904.04514 - High-Resolution Representations for Labeling Pixels and Regions
- https://arxiv.org/abs/1406.4729 - Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
- https://arxiv.org/abs/2001.04451 - Reformer: The Efficient Transformer
  Trains on longer contexts using LSH-based attention; saves FFN memory using reversible residual networks.
- https://arxiv.org/abs/2002.12327 - A Primer in BERTology
  A great paper summarizing major developments on top of BERT.
- https://arxiv.org/abs/1906.02715 - Visualizing and Measuring the Geometry of BERT
  What latent representations in BERT capture.
- https://arxiv.org/abs/1911.05507 - Compressive Transformers for Long-Range Sequence Modelling
- https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf - Playing Atari with Deep Reinforcement Learning
  The DQN paper.
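
The reversible-residual trick noted under the Reformer entry above can be sketched in a few lines; this is my own illustration (not the paper's code), with `F` and `G` as stand-ins for the attention and feed-forward sub-layers:

```python
import numpy as np

def F(x):
    # Stand-in for the attention sub-layer (any deterministic function works).
    return np.tanh(x)

def G(x):
    # Stand-in for the feed-forward sub-layer.
    return np.tanh(2 * x)

def forward(x1, x2):
    # Reversible residual block: split the activations into two streams.
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def reconstruct(y1, y2):
    # Invert the block exactly, so input activations need not be stored
    # for the backward pass -- this is the memory saving.
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = np.random.randn(4), np.random.randn(4)
y1, y2 = forward(x1, x2)
r1, r2 = reconstruct(y1, y2)
assert np.allclose(r1, x1) and np.allclose(r2, x2)
```

The inversion works because each stream is updated with a function of the *other* stream only, so every residual addition can be subtracted back out.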