ActiveVision-VLN-PaperLists

Leaderboard of VLN

A list of papers in ActiveVision and VLN

Dataset, Environment & Metrics

  • Matterport3D: Learning from RGB-D Data in Indoor Environments (3DV 2017) [pdf]
  • Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments (CVPR 2018) [pdf]
  • On Evaluation of Embodied Navigation Agents [pdf]

Methods for VLN

Seq2seq - the most basic baseline

  • Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments (CVPR 2018) [pdf]

Methods with reinforcement learning

  • Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout (NAACL 2019) [pdf]
  • Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation (CVPR 2019) [pdf]
  • Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation (ECCV 2018) [pdf]

Methods without reinforcement learning

  • The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation (CVPR 2019) [pdf]
  • Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation (CVPR 2019) [pdf]
  • Self-Monitoring Navigation Agent via Auxiliary Progress Estimation (ICLR 2019) [pdf]
  • Speaker-Follower Models for Vision-and-Language Navigation (NeurIPS 2018) [pdf]

Methods using imitation learning to explore unseen environments

  • Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout (NAACL 2019) [pdf]
  • Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation (CVPR 2019) [pdf]

Related Methods

  • Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering (CVPR 2018) [pdf]
  • Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents (CoRL 2018) [pdf]
  • Neural Modular Control for Embodied Question Answering (CoRL 2018) [pdf]

Further Reading (active vision methods, different problem settings, etc.)

  • Adaptive Object Detection Using Adjacency and Zoom Prediction (CVPR 2016) [pdf]
  • Ecological Active Vision: Four Bioinspired Principles to Integrate Bottom-Up and Adaptive Top-Down Attention Tested With a Simple Camera-Arm Robot (TAMD 2015) [pdf]
  • Toward predictive machine learning for active vision (ICLR 2018) [pdf]
  • Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning (ICRA 2017) [pdf]
  • Learning to Follow Directions in Street View [pdf]
  • Habitat: A Platform for Embodied AI Research [pdf]
