VDT-2048-Dataset

Image acquisition system

[Figure: Fig4, image acquisition system]

VDT-2048 Dataset Analysis

This dataset contains 2048 image groups. Each group consists of three images of the same scene, one per modality: a visible (V) image, a depth (D) image, and a thermal (T) image. All images have the same resolution of 640×480. The dataset covers 34 household items captured in the seven most common household scenes. The proportion of each scene and item category is shown in the following figure.

[Figure: summary]
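As a rough illustration of the "image group" idea, the sketch below pairs the three modalities by file name. The folder names `V`, `D`, and `T`, the assumption that the three images of a group share a file stem, and the helper names `ImageGroup` and `collect_groups` are all hypothetical; check the layout of the downloaded archive and adjust accordingly.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ImageGroup:
    """One triple-modal group: paths to the visible, depth and thermal images."""
    name: str
    visible: Path
    depth: Path
    thermal: Path


def collect_groups(root, visible_dir="V", depth_dir="D", thermal_dir="T"):
    """Pair files that share a stem across the three modality folders.

    The folder names are assumptions, not the documented dataset layout.
    Files that lack a counterpart in any modality are skipped.
    """
    root = Path(root)

    def by_stem(subdir):
        # Map "0001" -> ".../V/0001.png" for every regular file in the folder.
        return {p.stem: p for p in (root / subdir).iterdir() if p.is_file()}

    visible = by_stem(visible_dir)
    depth = by_stem(depth_dir)
    thermal = by_stem(thermal_dir)

    # Only stems present in all three folders form a complete group.
    common = sorted(visible.keys() & depth.keys() & thermal.keys())
    return [ImageGroup(n, visible[n], depth[n], thermal[n]) for n in common]
```

If the layout assumption holds, `len(collect_groups(path_to_dataset))` on a complete copy should equal 2048; a smaller number would indicate groups with a missing modality.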

V challenging scenes

The V (visible) modality has seven main challenging scenes:
V-SA (similar appearance): the salient object has a similar color or shape to the background.
V-BSO (big salient object): the ratio of the number of salient pixels to the total number of pixels in the image is greater than 0.08.
V-SSO (small salient object): the ratio of the number of salient pixels to the total number of pixels in the image is less than 0.007.
V-MSO (multiple salient objects): the image contains more than one salient object.
V-LI (low illumination): the image is collected under low illumination, and objects are not easy to identify visually.
V-SI (side illumination): the salient object is lit from the side, so its brightness is uneven.
V-NI (no illumination): the image is collected under no illumination, and objects are visually difficult to identify.

[Figure: Fig8]
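The two size-based attributes above reduce to a single ratio, so they can be checked directly from a binary ground-truth mask. The following is a minimal sketch, not the authors' annotation tool: it assumes the mask is a 2D list of 0/1 values, and the names `salient_ratio`, `size_attribute`, and `rect_mask` are hypothetical. Only the thresholds 0.08 and 0.007 come from the definitions above.

```python
BSO_MIN_RATIO = 0.08   # big salient object: ratio greater than 0.08
SSO_MAX_RATIO = 0.007  # small salient object: ratio less than 0.007


def salient_ratio(mask):
    """Return (number of salient pixels) / (number of pixels) for a 2D 0/1 mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    salient = sum(1 for row in mask for value in row if value)
    return salient / total


def size_attribute(mask):
    """Return "BSO", "SSO", or None when the object is neither big nor small."""
    ratio = salient_ratio(mask)
    if ratio > BSO_MIN_RATIO:
        return "BSO"
    if ratio < SSO_MAX_RATIO:
        return "SSO"
    return None


def rect_mask(width, height, x0, y0, w, h):
    """Hypothetical helper: a width x height mask with one salient rectangle."""
    return [
        [1 if x0 <= x < x0 + w and y0 <= y < y0 + h else 0 for x in range(width)]
        for y in range(height)
    ]
```

For a 640×480 image (307,200 pixels), a 40×50 salient rectangle gives a ratio of about 0.0065 and is labeled as a small salient object, while a 200×200 rectangle gives about 0.13 and is labeled as a big one.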

D challenging scenes

The D (depth) modality has four main challenging scenes:
D-BM (background messy): the background is messy when no wallpaper is used.
D-II (information incomplete): part of the D information is missing, so the salient object is incompletely represented.
D-SSO (small salient objects): the ratio of the number of salient pixels to the total number of pixels in the image is less than 0.007.
D-BI (background interference): wallpaper used as a background interferes with the D information.

[Figure: Fig9]

T challenging scenes

The T (thermal) modality has three main challenging scenes:
T-Cr (crossover): the salient object has a similar temperature to its surroundings or to other objects.
T-RD (radiation dispersion): part of a salient object is more salient than the object as a whole.
T-HR (heat reflection): the heat radiation of the salient object is reflected.

[Figure: Fig10]

Proposed HWSI method

The overall architecture of the proposed HWSI method and its two main modules are shown in the following figure.

[Figure: Fig7]

Visual comparison results

Comparison of the saliency maps produced by the proposed model and by recent methods in different challenging scenes.

[Figure: visualization 1]

Visual comparison results when two modalities are disturbed.

[Figure: visualization 2]

Download the dataset

The dataset and code are available at: https://pan.baidu.com/s/1JyFBtjlJGf4GE2zeciN1wQ?pwd=bipy

Paper

https://ieeexplore.ieee.org/document/9931143/

2023-A Novel Visible-Depth-Thermal Image Dataset of Salient Object Detection for Robotic Visual Perception.pdf

Citation

K. Song, J. Wang, Y. Bao, L. Huang and Y. Yan, "A Novel Visible-Depth-Thermal Image Dataset of Salient Object Detection for Robotic Visual Perception," in IEEE/ASME Transactions on Mechatronics, vol. 28, no. 3, pp. 1558-1569, June 2023, doi: 10.1109/TMECH.2022.3215909.

Related Work of Visible-Depth-Thermal Salient Object Detection

[1] Lightweight Multi-level Feature Difference Fusion Network for RGB-D-T Salient Object Detection [J]. Journal of King Saud University - Computer and Information Sciences, 2023. https://github.com/VDT-2048/MFDF

[2] MFFNet: Multi-modal Feature Fusion Network for VDT Salient Object Detection [J]. IEEE Transactions on Multimedia, 2023. https://ieeexplore.ieee.org/abstract/document/10171982

[3] Quality-Aware Selective Fusion Network for V-D-T Salient Object Detection [J]. IEEE Transactions on Image Processing, vol. 33, pp. 3212-3226, 2024. https://ieeexplore.ieee.org/abstract/document/10516304 Code: https://github.com/Lx-Bao/QSFNet

Related Work of RGB-T Salient Object Detection

[1] Multiple Graph Affinity Interactive Network and A Variable Illumination Dataset for RGBT Image Salient Object Detection [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(7), 3104-3118. https://github.com/huanglm-me/VI-RGBT1500

Related Survey

RGB-T Image Analysis Technology and Application: A Survey [J]. Engineering Applications of Artificial Intelligence, 2023, 120, 105919. https://www.sciencedirect.com/science/article/abs/pii/S0952197623001033


