
refer_sunspot

This repository is a fork of the refer API. It is compatible with Python 3 and adds the new SUNSPOT dataset.

Abstract

SUNSPOT is a new referring expression dataset focused on spatial referring expressions. Its purpose is to aid human-robot collaboration in scenes that challenge object detection and localization through clutter, occlusion, unknown object classes, and multiple instances of the same object class. It is also a challenging dataset for natural language understanding of spatial prepositional phrases in English. The dataset provides 7,987 referring expressions for 1,948 images, with an average of 2.60 spatial prepositions per expression.
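A per-expression statistic like the 2.60 average above can be approximated with a simple token match. The preposition list in this sketch is an illustrative assumption, not the exact inventory used for the paper's count:

```python
# Illustrative sketch: count spatial prepositions in a referring expression.
# The preposition inventory below is an assumption for demonstration; it is
# not the exact list used to compute the 2.60 average reported above.
SPATIAL_PREPOSITIONS = {
    "above", "below", "behind", "under", "over", "on", "in", "near",
    "beside", "between", "left", "right", "front", "next",
}

def count_spatial_prepositions(expression: str) -> int:
    tokens = expression.lower().strip(".").split()
    return sum(1 for tok in tokens if tok in SPATIAL_PREPOSITIONS)

print(count_spatial_prepositions("The calendar is over the sink."))  # prints 1
```

Applied to the examples below, "The flowers are on top of the white table in a clear vase." yields 2 ("on" and "in"), which is close to the dataset-wide average.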

For more details about the SUNSPOT dataset, please read our paper SUN-SPOT: Localizing objects with spatial referring expressions.

Example

Example image with bounding boxes

  • The calendar is hanging below the cupboards above the sink.
  • The calendar is over the sink.
  • The flowers in the corner of the room, to the right of the silver plaque.
  • The flowers are on the corner of the counters, to the left of the range.
  • The flowers are on top of the white table in a clear vase.
  • The red, white and yellow flowers sitting in the middle of the table.

Images include a depth channel. Annotations include bounding boxes, instance segmentation masks, and scene and object labels.

Depth image | Segmentation image | 3D bounding boxes

Citation

We provide the refer.bib file for easy citation.

If you used SUNSPOT, please cite both the SUNSPOT paper and the SUNRGBD papers:

C. Mauceri, M. Palmer, and C. Heckman. SUN-SPOT: Localizing objects with spatial referring expressions. In AAAI, 2018.
N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. Indoor segmentation and support inference from RGBD images. In ECCV, 2012.
A. Janoch, S. Karayev, Y. Jia, J. T. Barron, M. Fritz, K. Saenko, and T. Darrell. A category-level 3-D object dataset: Putting the Kinect to work. In ICCV Workshop on Consumer Depth Cameras for Computer Vision, 2011.
J. Xiao, A. Owens, and A. Torralba. SUN3D: A database of big spaces reconstructed using SfM and object labels. In ICCV, 2013.

If you used RefCOCOg, please cite:

J. Mao, J. Huang, A. Toshev, O. Camburu, A. Yuille, and K. Murphy. Generation and Com-prehension of Unambiguous Object Descriptions. In CVPR, 2016.

If you used one of the three datasets collected by UNC (RefClef, RefCOCO, or RefCOCO+), please cite the EMNLP 2014 paper; if you want to compare with recent results, please also check the ECCV 2016 paper.

Kazemzadeh, Sahar, et al. "ReferItGame: Referring to Objects in Photographs of Natural Scenes." EMNLP 2014.
Yu, Licheng, et al. "Modeling Context in Referring Expressions." ECCV 2016.

Setup

  1. The repository contains a submodule for evaluating results, so clone recursively:
    git clone --recurse-submodules https://github.com/crmauceri/refer_python3.git
  2. Download and install pycocotools: conda install -c conda-forge pycocotools
  3. Prepare Images: Download the SUNRGBD images and add them into the data/images/SUNRGBD directory.
  4. Download additional annotations: If you want to use the refcoco, refcoco+, refcocog, or refclef datasets follow the download instructions from the original repository and add the files to the data folder. The sunspot dataset is included in the repository.

Directory structure

After following the setup instructions, you should have the following directory structure:

refer_python3/
├── data/
    ├── images/
        ├── SUNRGBD/ # Download the SUNRGBD images - http://rgbd.cs.princeton.edu
        ├── mscoco/  # Download the mscoco images - http://cocodataset.org/#download
    ├── sunspot/
        ├── instances.json
        ├── refs(boulder).p
    ├── refclef/      # https://github.com/lichengunc/refer/tree/master/data (optional)
    ├── refcoco/      # https://github.com/lichengunc/refer/tree/master/data (optional)
    ├── refcocog/     # https://github.com/lichengunc/refer/tree/master/data  (optional)
    ├── refcoco+/     # https://github.com/lichengunc/refer/tree/master/data (optional)
├── nlg-eval/         # machine translation metrics for evaluating generated expressions
├── setup.py          # supports `pip install -e .` installation
├── refer.py          # the class that loads the dataset
├── evaluate.py       # compares dataset groundtruth to another set of referring expressions
├── pyReferDemo.ipynb # jupyter notebook to view examples and dataset statistics
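Before loading any dataset, a small script along these lines can sanity-check the layout. The path strings are taken from the tree above; the helper itself is a sketch, not part of the repository:

```python
from pathlib import Path

# Minimal sanity check for the directory layout described above.
# Only data/sunspot/ ships with the repository; the image folders and the
# ref* folders must be downloaded separately, so they are treated as optional.
REQUIRED = [
    "data/sunspot/instances.json",
    "data/sunspot/refs(boulder).p",
]
OPTIONAL = [
    "data/images/SUNRGBD",
    "data/images/mscoco",
    "data/refclef",
    "data/refcoco",
    "data/refcocog",
    "data/refcoco+",
]

def check_layout(repo_root):
    """Return (missing required paths, absent optional paths)."""
    root = Path(repo_root)
    missing = [p for p in REQUIRED if not (root / p).exists()]
    absent_optional = [p for p in OPTIONAL if not (root / p).exists()]
    return missing, absent_optional
```

Run it against your checkout of refer_python3/; an empty first list means the bundled SUNSPOT files are in place.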

How to use

Loading the dataset

refer.py can load all four datasets from the original refer API repository as well as the new SUNSPOT dataset.

# locate your own data_root, and choose the dataset_splitBy you want to use
refer = REFER(data_root, dataset='sunspot',  splitBy='boulder') # The new dataset!


# Other datasets
refer = REFER(data_root, dataset='refclef',  splitBy='unc')
refer = REFER(data_root, dataset='refclef',  splitBy='berkeley') # 2 train and 1 test images are missing
refer = REFER(data_root, dataset='refcoco',  splitBy='unc')
refer = REFER(data_root, dataset='refcoco',  splitBy='google')
refer = REFER(data_root, dataset='refcoco+', splitBy='unc')
refer = REFER(data_root, dataset='refcocog', splitBy='google')   # test split not released yet
refer = REFER(data_root, dataset='refcocog', splitBy='umd')      # Recommended, including train/val/test
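For quick inspection, the refs(boulder).p file can also be read directly with pickle. The field names in this sketch ('split', 'sentences', 'sent') follow the conventions of the original refer API and are assumptions here; refer.py remains the supported interface:

```python
import pickle

# Sketch: read a refs(*).p file without the REFER class.
# The ref dict keys ('split', 'sentences', 'sent') are assumed from the
# original refer API's data format, not guaranteed by this repository.
def load_expressions(refs_path, split="train"):
    """Return all referring-expression strings for one split."""
    with open(refs_path, "rb") as f:
        refs = pickle.load(f)
    return [sent["sent"]
            for ref in refs if ref["split"] == split
            for sent in ref["sentences"]]
```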

Data Demo

The jupyter notebook pyReferDemo.ipynb is an easy way to view some of the dataset examples and statistics.

Evaluating generated expressions

Finally, if you have a set of generated expressions that you want to compare to a dataset's ground truth, use evaluate.py:

  1. Save your generated expressions in a CSV file with columns "refId" and "generated_sentence" and one row per expression.
  2. Run python evaluate.py csv --csvpath <path_to_your_csv> --dataset sunspot --splitBy boulder
  3. This will print BLEU, ROUGE_L, and CIDEr scores for your generated sentences. For more options, see evaluate.py.
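Step 1 can be sketched with the standard csv module. The column names come from the instructions above; the helper name and the example ref IDs are illustrative:

```python
import csv

# Sketch: write generated expressions in the CSV format expected by
# evaluate.py -- one header row with columns "refId" and
# "generated_sentence", then one row per expression.
def write_generated_csv(path, results):
    """results: iterable of (ref_id, sentence) pairs."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["refId", "generated_sentence"])
        writer.writerows(results)

# Example with made-up ref IDs; use the ref_id values from the dataset.
write_generated_csv("generated.csv",
                    [(1, "the calendar over the sink"),
                     (2, "the flowers in a clear vase")])
```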

