HiDANet

This is the official implementation of HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness, accepted to IEEE Transactions on Image Processing (TIP), 2023.

Abstract

RGB-D saliency detection aims to fuse multi-modal cues to accurately localize salient regions. Existing works often adopt attention modules for feature modeling, with few methods explicitly leveraging fine-grained details to merge with semantic cues. Thus, despite the auxiliary depth information, it is still challenging for existing models to distinguish objects with similar appearances but at distinct camera distances. In this paper, from a new perspective, we propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection. Our motivation comes from the observation that the multigranularity properties of geometric priors correlate well with the neural network hierarchies. To realize multi-modal and multi-level fusion, we first use a granularity-based attention scheme to strengthen the discriminatory power of RGB and depth features separately. Then we introduce a unified cross dual-attention module for multi-modal and multi-level fusion in a coarse-to-fine manner. The encoded multi-modal features are gradually aggregated into a shared decoder. Further, we exploit a multi-scale loss to take full advantage of the hierarchical information. Extensive experiments on challenging benchmark datasets demonstrate that our HiDAnet performs favorably over the state-of-the-art methods by large margins.

Train and Test

Run the training, inference, and evaluation steps in order:

python train.py
python test_produce_maps.py
python test_evaluation_maps.py

Before running these commands, make sure the dataset path in the config file points to your local copy of the data.
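If you prefer to launch the three steps with one call, the following is a minimal sketch of a wrapper, not part of the repository. The script names come from the commands above; `run_pipeline`, `build_commands`, and the `dataset_dir` argument are hypothetical helpers introduced here for illustration. The sketch does not read or modify the repository's config file, so `dataset_dir` must be the same path you set there.

```python
import subprocess
import sys
from pathlib import Path

# Script names are taken from the README; everything else here is illustrative.
STEPS = ("train.py", "test_produce_maps.py", "test_evaluation_maps.py")


def build_commands(python_exe=sys.executable, steps=STEPS):
    """Return one argv list per step, in the order given in the README."""
    return [[python_exe, step] for step in steps]


def run_pipeline(dataset_dir, runner=subprocess.run):
    """Check the dataset directory, then run each step, stopping on the first failure.

    dataset_dir is a hypothetical argument: it should match the path set in
    the repository's config file, which this sketch neither reads nor edits.
    """
    if not Path(dataset_dir).is_dir():
        raise FileNotFoundError(f"dataset directory not found: {dataset_dir}")
    for cmd in build_commands():
        # check=True raises CalledProcessError when a step exits with a non-zero code.
        runner(cmd, check=True)
```

Calling `run_pipeline("/path/to/dataset")` from the repository root would fail early if the directory is missing, instead of failing partway through training. The `runner` parameter exists only so the sequence can be exercised without actually starting a training run.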

Saliency Maps

Our saliency maps can be found here
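To sanity-check downloaded maps against ground-truth masks, one option is mean absolute error (MAE), a metric commonly reported for salient object detection. The sketch below is a pure-Python illustration under stated assumptions: both maps are 2-D lists of floats already scaled to [0, 1], and `mean_absolute_error` is a hypothetical helper, not a function from this repository. Whether `test_evaluation_maps.py` computes MAE in this way is not stated here.

```python
def mean_absolute_error(pred, gt):
    """Mean absolute difference between two equally sized 2-D maps in [0, 1]."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    count = sum(len(row) for row in pred)
    if count == 0:
        raise ValueError("maps must not be empty")
    # Sum the per-pixel absolute differences, then average over all pixels.
    total = sum(
        abs(p - g)
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    )
    return total / count


# Toy 2x2 example: a lower value means the prediction is closer to the mask.
pred = [[0.9, 0.1], [0.8, 0.0]]
gt = [[1.0, 0.0], [1.0, 0.0]]
score = mean_absolute_error(pred, gt)
```

In practice the maps would be loaded from image files and handled as arrays; the nested-list form is used only to keep the example free of third-party dependencies.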

Qualitative Comparison

[Figure: qualitative comparison results]

Citation

If you find this repo useful, please consider citing:

@ARTICLE{wu2023hida,
  author={Wu, Zongwei and Allibert, Guillaume and Meriaudeau, Fabrice and Ma, Chao and Demonceaux, Cédric},
  journal={IEEE Transactions on Image Processing}, 
  title={HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness}, 
  year={2023},
  volume={32},
  number={},
  pages={2160-2173},
  doi={10.1109/TIP.2023.3263111}}

Related works

  • ICCV 23 - Source-free Depth for Object Pop-out [Code]
  • ACMMM 23 - Object Segmentation by Mining Cross-Modal Semantics [Code]
  • 3DV 22 - Robust RGB-D Fusion for Saliency Detection [Code]
  • 3DV 21 - Modality-Guided Subnetwork for Salient Object Detection [Code]

Acknowledgments

This repository is heavily based on SPNet. Thanks to their great work!
