Code Monkey home page Code Monkey logo

constrained_attention_filter's Introduction

Constrained Attention Filter (CAF)

(ECCV 2020) Tensorflow implementation of A Generic Visualization Approach for Convolutional Neural Networks

Paper | 1 Min Video | 10 Mins Video

TL;DR

L2-CAF has three core components:

1- TF filter This is the function that inserts L2-CAF inside a network (E.g, inside a DenseNet). L2-CAF is by default disabled; it is passive during classification. To active/de-activate L2-CAF (turn on and off the filter), I use the bool atten_var_gate. False deactivate L2-CAF, while True activates the filter.

2- Optimization loop In this loop, we computes the class-oblivious and class-specific loss and leverage gradient descent to minimize it. When the loss stabilize (loss - prev_loss< 10e-5), break out of the loop.

3- Finalize filter before saving After convergence, the output filter is normalized (L2-Norm|Softmaxed|Gauss-ed) before generating the heatmap.

Requirements

  • Python 3+ [Tested on 3.7]
  • Tensorflow 1.X [Tested on 1.14]

ImageNet Pretrained Models

I used the following

Usage example

Update base_config._load_user_setup with your configurations. Mainly, set the location of pre-trained model (e.g, densenet). The released code optimizes the constrained attention filter on samples images from the "input_imgs" directory. However, if you plan to run the code on a whole dataset (e.g, ImageNet), you shoud set the local_datasets_dir in _load_user_setup

The unit L2-Norm constrained attention filter has two operating modes.

  • visualize_attention.py is the script for the vanilla "slow" (4 seconds) mode. I recommend running this first before experimenting with the fast L2-CAF version. The code of this mode is much easier to understand. The script's main function sets all the hyper-parameters needed. I will ellaborate more on each hyper-parameter soon.

  • visualize_attention_fast.py is the script for the fast (0.3 seconds) mode. The script only supports denseNet. I will add support to Inception and ResNet soon. This script only works for visualizing attention is the last conv layer. I only use it for quantitative evaluation experiments, for instance, when I evaluate L2-CAF using ImageNet validation split.

TODO LIST

* Add Fast L2-CAF on DenseNet

  • Add InceptionNet and ResNet support
  • Document to use the code
  • Document the intermediate layer visualization
  • Document extra technical tricks not mentioned in the paper

Contributing

It would be great if someone re-implement this in pytorch. Let me know and I will add a link to your Pytorch implementation here

MISC Notes

  • We did not write localization evaluation code. We used the matlab code released by CAM in Tables 1 and 3. We used the python code released by ADL in Table 2. Feel free to evaluate L2-CAF localization with other evaluation codes.

  • The softmax and Gaussian filters are released upon a reviewer request. The current Gaussian filter implementation is hard-coded to support only 7x7 attention filter. It is straight forward to extend it for any odd filter-size (e.g., 13x13). However, for even filter-size I think more changes are required. The last conv layer in standard architectures is 7x7. So the current configuration should cover most typical case-scenario.

  • I used modules of this code (especially the nets package) in multiple projects, so there is a lot of code that is not related to L2-CAF. I will iteratively clean the code. The TL;DR section, at the top of the readme file, highlights the core functions related to L2-CAF.

Release History

  • 1.0.0
    • First commit Vanilla L2-CAF on DenseNet, InceptionV1, and ResNet50V2 on 12, 15,18 July 2020
    • Add Fast L2-CAF on DenseNet 21 July 2020
    • Add Fast L2-CAF on Inception 22 July 2020
    • Add Fast L2-CAF on ResNet 23 July 2020

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.