

Awesome EmbodiedAI (still under construction)

We maintain a curated list of awesome Embodied AI works. It currently covers simulators, tasks, and datasets in the Embodied AI field.

  • Simulators render images and simulate the behavior of agents as if they were situated in a real-world environment.
  • Datasets provide training data (e.g. navigation instructions) and ground truths (e.g. navigation trajectories).

(Some simulators come with a dataset of the same name, so the same name may appear in more than one section.)

Please feel free to open a pull request or an issue to add papers.

Awesome companies

Awesome Labs

Simulator

Platforms that simulate real-world environments.

  • Habitat-Simulator
    • Venue/Year: ICCV 2019 | [paper] [code] [homepage]
    • Visual Content: Matterport3D, House3D, AI2-THOR, etc. (partially realistic)
    • Action Space: continuous
  • AI2-THOR
    • Venue/Year: arXiv 2019 | [paper] [code] [homepage]
    • Visual Content: AI2-THOR
    • Action Space: continuous
    • Interactive: Yes
  • CHALET
    • Venue/Year: arXiv 2019 | [paper] [code]
    • Visual Content: CHALET
    • Action Space: continuous
    • Interactive: Yes
  • Matterport3D
    • Venue/Year: 3DV 2017 | [paper] [code] [homepage]
    • Visual Content: Matterport3D (realistic)
    • Action Space: graph based
  • MINOS
    • Venue/Year: CVPR 2017 | [paper] [code] [homepage]
    • Visual Content: SUNCG+Matterport3D (partially realistic)
    • Action Space: continuous
  • Gibson
    • Venue/Year: CVPR 2018 | [paper] [code] [homepage]
    • Visual Content: Gibson+2D3DS+Matterport3D (realistic)
    • Action Space: continuous
    • Interactive: Yes
  • House3D
    • Venue/Year: arXiv 2018 | [paper] [code]
    • Visual Content: SUNCG
    • Action Space: continuous
  • SUNCG
    • Venue/Year: CVPR 2017 | [paper]
    • Visual Content: SUNCG
  • HoME
    • Venue/Year: NIPS 2017 | [paper] [code]
    • Visual Content: SUNCG
    • Language Content: descriptions of objects
    • Action Space: continuous
  • VirtualHome
    • Venue/Year: CVPR 2018 | [paper] [code] [homepage]
    • Visual Content: VirtualHome
    • Action Space: continuous
    • Interactive: Yes
  • SceneNet RGB-D
    • Venue/Year: ICCV 2017 | [paper] [code] [homepage]
    • Visual Content: SceneNet RGB-D
    • Action Space: continuous
    • Interactive: Yes
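
The "Action Space" field above separates graph-based simulators, where the agent hops between predefined viewpoints, from continuous ones, where the agent can move to an arbitrary pose. The following is a minimal sketch of that difference, assuming a toy navigation graph and a 2D pose. All class and method names are hypothetical and do not correspond to the API of any simulator listed here.

```python
import math
from dataclasses import dataclass


class GraphAgent:
    """Hypothetical agent that may only move along edges of a viewpoint graph."""

    def __init__(self, edges, start):
        # edges: dict mapping a viewpoint id to the set of adjacent viewpoint ids
        self.edges = edges
        self.viewpoint = start

    def available_actions(self):
        # The action set is discrete and depends on the current viewpoint.
        return sorted(self.edges[self.viewpoint])

    def step(self, target):
        if target not in self.edges[self.viewpoint]:
            raise ValueError(f"{target!r} is not adjacent to {self.viewpoint!r}")
        self.viewpoint = target


@dataclass
class ContinuousAgent:
    """Hypothetical agent with a free 2D pose: position (x, y) and heading in radians."""

    x: float = 0.0
    y: float = 0.0
    heading: float = 0.0

    def step(self, forward, turn):
        # Any real-valued turn angle and forward distance is a valid action.
        self.heading = (self.heading + turn) % (2 * math.pi)
        self.x += forward * math.cos(self.heading)
        self.y += forward * math.sin(self.heading)


# Toy graph with three viewpoints in a line: a - b - c (assumed for illustration).
toy_graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
graph_agent = GraphAgent(toy_graph, start="a")
graph_agent.step("b")

free_agent = ContinuousAgent()
free_agent.step(forward=0.25, turn=math.pi / 2)
```

In the graph-based case the set of legal actions changes with the current viewpoint, while in the continuous case every step is parameterized by real numbers.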

Tasks

Embodied task definitions.

REVERIE - requires an intelligent agent to correctly localize a remote target object (one that cannot be observed from the starting location) specified by a concise high-level natural language instruction.

VLN - requires an embodied agent to follow natural language instructions to navigate from a starting pose to a goal location.

VNLA - a grounded vision-language task where an agent with visual perception is guided via language to find objects in photorealistic indoor environments.

EQA - an agent is spawned at a random location in a 3D environment and asked a question. The agent must first intelligently navigate to explore the environment, gather necessary visual information through first-person (egocentric) vision, and then answer the question.

IQA - requires an agent to navigate around the scene, acquire visual understanding of scene elements, interact with objects (e.g. open refrigerators) and plan for a series of actions conditioned on the question.

TOUCHDOWN - requires an agent to first follow navigation instructions in a real-life visual urban environment, and then identify a location described in natural language to find a hidden object at the goal position.
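
To make the VLN definition above more concrete, here is a minimal sketch of how an episode and its outcome might be represented. The field names, the example instruction, and the 3.0 success radius are illustrative assumptions and are not taken from any dataset or benchmark listed here.

```python
import math
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical VLN episode: an instruction plus start and goal positions."""

    instruction: str
    start: tuple  # (x, y, z) position where the agent begins
    goal: tuple   # (x, y, z) position the instruction leads to


def is_success(final_position, episode, success_radius=3.0):
    # Count the episode as solved if the agent stops close enough to the goal.
    # The radius is an assumed threshold; benchmarks define their own.
    return math.dist(final_position, episode.goal) <= success_radius


episode = Episode(
    instruction="Walk past the sofa and stop at the door.",
    start=(0.0, 0.0, 0.0),
    goal=(4.0, 0.0, 3.0),
)
stopped_near_goal = is_success((4.0, 0.0, 1.0), episode)
never_moved = is_success(episode.start, episode)
```

The other tasks add steps on top of this navigation core: EQA and IQA would also compare a predicted answer with a ground-truth answer, and REVERIE would check the localized object.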

Dataset

Embodied datasets built upon simulators.

  • REVERIE (CVPR 2020, based on Matterport3D) | [paper] [code]
    • language content: navigation instructions
    • applicable tasks: REVERIE, VLN, referring expression
  • R2R (CVPR 2018, based on Matterport3D) | [paper] [homepage]
    • language content: navigation instructions
    • applicable tasks: VLN
  • VNLA (CVPR 2019, based on Matterport3D) | [paper] [code]
    • language content: navigation instructions and assistance
    • applicable tasks: VNLA, VLN, referring expression
  • HANNA (EMNLP 2019, based on Matterport3D) | [paper] [code]
    • language content: navigation instructions and assistance
    • applicable tasks: VNLA, VLN, referring expression
  • CVDN (CoRL 2019, based on Matterport3D) | [paper] [code] [homepage]
    • language content: dialogues
    • applicable tasks: VNLA, VLN
  • EQA (CVPR 2018, based on House3D) | [paper] [code] [homepage]
    • language content: question-answer pairs
    • applicable tasks: EQA, VLN
  • IQUADv1 (CVPR 2018, based on AI2-THOR) | [paper] [code]
    • language content: question-answer pairs
    • applicable tasks: IQA, EQA, VLN
  • TOUCHDOWN (CVPR 2019, based on Google Street View) | [paper] [code]
    • language content: navigation instructions
    • applicable tasks: TOUCHDOWN, VLN, referring expression
  • Talk The Way (2018) | [paper] [code]
    • visual content: manually captured neighborhoods of New York City
    • language content: navigation dialogues
    • applicable tasks: VNLA, VLN
  • LANI & CHAI (2019, based on CHALET) | [paper] [code]
    • language content: navigation instructions
    • applicable tasks: VLN
  • Activity & ActivityPrograms (CVPR 2018) | [paper] [code] [homepage]
    • language content: task descriptions
    • applicable tasks: VLN
  • Habitat (ICCV 2019) | [paper] [code] [homepage]
    • language content: navigation instructions, task descriptions, etc.
    • applicable tasks: IQA, EQA, VLN, language grounding, etc.

awesome-embodiedai's People

Contributors

chengaopro
