xiaotian0328 / activevision-vln-paperlists

This project forked from s0sasaki/activevision-vln-paperlists

A list of papers in ActiveVision and VLN

ActiveVision-VLN-PaperLists

Leaderboard of VLN

Dataset, Environment & Metrics

Matterport3D: Learning from RGB-D Data in Indoor Environments (3DV 2017) [pdf]
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments (CVPR 2018) [pdf]
On Evaluation of Embodied Navigation Agents [pdf]

Methods for VLN

Seq2seq - the most basic baseline

Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments (CVPR 2018) [pdf]

Methods with reinforcement learning

Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout (NAACL 2019) [pdf]
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation (CVPR 2019) [pdf]
Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation (ECCV 2018) [pdf]

Methods without reinforcement learning

The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation (CVPR 2019) [pdf]
Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation (CVPR 2019) [pdf]
Self-Monitoring Navigation Agent via Auxiliary Progress Estimation (ICLR 2019) [pdf]
Speaker-Follower Models for Vision-and-Language Navigation (NeurIPS 2018) [pdf]

Methods using imitation learning to explore the unseen environment

Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout (NAACL 2019) [pdf]
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation (CVPR 2019) [pdf]

Related Methods

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering (CVPR 2018) [pdf]
Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents (CoRL 2018) [pdf]
Neural Modular Control for Embodied Question Answering (CoRL 2018) [pdf]

Further Reading (active vision methods, different problem settings, etc.)

Adaptive Object Detection Using Adjacency and Zoom Prediction (CVPR 2016) [pdf]
Ecological Active Vision: Four Bioinspired Principles to Integrate Bottom-Up and Adaptive Top-Down Attention Tested With a Simple Camera-Arm Robot (TAMD 2015) [pdf]
Toward predictive machine learning for active vision (ICLR 2018) [pdf]
Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning (ICRA 2017) [pdf]
Learning to Follow Directions in Street View [pdf]
Habitat: A Platform for Embodied AI Research [pdf]
