Temporal-Language-Grounding-in-Videos

Introduction

Task:

Temporal Moment Localization via Language: given a natural-language query, find the corresponding moment in a given video (the major focus of this repo).
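
To make the task's input and output concrete, here is a minimal sketch. The `Moment` type and the `temporal_iou` function are illustrative names, not taken from this repo or from any paper listed here. Temporal intersection-over-union is the overlap measure commonly used to compare a predicted moment against the annotated one; the timestamps and the query in the example are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Moment:
    """A video segment given by start and end times in seconds (illustrative type)."""
    start: float
    end: float


def temporal_iou(pred: Moment, gt: Moment) -> float:
    """Intersection over union of two segments on the time axis."""
    inter = max(0.0, min(pred.end, gt.end) - max(pred.start, gt.start))
    union = (pred.end - pred.start) + (gt.end - gt.start) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Made-up example: for a query such as "person opens the door",
    # the annotated moment is 5-15 s and a model predicts 10-20 s.
    print(temporal_iou(Moment(10.0, 20.0), Moment(5.0, 15.0)))
```

A model for this task takes the video and the query and returns one such segment; the overlap score then says how close the prediction is to the annotation.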

Format

Markdown format:

- [Paper Name](link) - Author 1 et al, `Conference Year`. [[code]](link)
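
As a rough illustration of the entry format above, the following sketch builds one list line from its fields. The function name `format_entry` and its parameters are hypothetical and not part of this repo, and the URLs in the example are placeholders rather than the real paper or code links.

```python
from typing import Optional


def format_entry(title: str, paper_url: str, first_author: str,
                 venue: str, year: int, code_url: Optional[str] = None) -> str:
    """Build one list entry in the Markdown format used by this list.

    Hypothetical helper for illustration; not part of the repo.
    """
    entry = f"- [{title}]({paper_url}) - {first_author} et al, `{venue} {year}`."
    if code_url:
        # The code link is optional; many entries have none.
        entry += f" [[code]]({code_url})"
    return entry


if __name__ == "__main__":
    # Placeholder URLs, not the real paper or code links.
    print(format_entry(
        "Span-based Localizing Network for Natural Language Video Localization",
        "https://example.com/paper", "Hao Zhang", "ACL", 2020,
        code_url="https://example.com/code",
    ))
```

Entries without released code simply omit the trailing `[[code]](link)` part.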

Change Log

2020/07/27: started the repo.
Papers before 2020 are mainly collected by muketong.

to be updated ...

Keywords used in searching

grounding, retrieval, localization

Papers

Survey

None.

2019

Supervised:

MAC: Mining Activity Concepts for Language-based Temporal Localization - Runzhou Ge et al, WACV 2019. [code]
Multilevel Language and Vision Integration for Text-to-Clip Retrieval - H. Xu et al, AAAI 2019. [code]
Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos - Dongliang He et al, AAAI 2019.
To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression - Y. Yuan et al, AAAI 2019. [code]
Semantic Proposal for Activity Localization in Videos via Sentence Query - S. Chen et al, AAAI 2019.
Localizing Natural Language in Videos - J. Chen et al, AAAI 2019.
ExCL: Extractive Clip Localization Using Natural Language Descriptions - S. Ghosh et al, NAACL 2019.
Cross-Modal Video Moment Retrieval with Spatial and Language-Temporal Attention - B. Jiang et al, ICMR 2019. [code]
Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model - W. Wang et al, CVPR 2019.
MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment - Da Zhang et al, CVPR 2019.
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos - Zhu Zhang et al, SIGIR 2019. [code]
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos - Yitian Yuan et al, NeurIPS 2019. [code]
DEBUG: A Dense Bottom-Up Grounding Approach for Natural Language Video Localization - Chujie Lu et al, EMNLP 2019.
Temporal Localization of Moments in Video Collections with Natural Language - V. Escorcia et al, arxiv 2019. (still on arxiv as of 2020/06/09)

Weakly Supervised:

Weakly Supervised Video Moment Retrieval From Text Queries - N. C. Mithun et al, CVPR 2019.
Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video - Zhenfang Chen et al, ACL 2019. [code]
WSLLN: Weakly Supervised Natural Language Localization Networks - M. Gao et al, EMNLP 2019.

2020

Supervised:

Moment Retrieval via Cross-Modal Interaction Networks With Query Reconstruction - Zhijie Lin et al, TIP 2020.
Rethinking the Bottom-Up Framework for Query-based Video Localization - Long Chen et al, AAAI 2020.
Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction - Jingwen Wang et al, AAAI 2020. [code]
Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language - Songyang Zhang et al, AAAI 2020. [code]
Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video - Jie Wu et al, AAAI 2020. [code]
Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention - C. R. Opazo et al, WACV 2020. [code]
Local-Global Video-Text Interactions for Temporal Grounding - Jonghwan Mun et al, CVPR 2020. [code]
Dense Regression Network for Video Grounding - Runhao Zeng et al, CVPR 2020. [code]
Tripping through Time: Efficient Localization of Activities in Videos - Meera Hahn et al, BMVC 2020.
Span-based Localizing Network for Natural Language Video Localization - Hao Zhang et al, ACL 2020. [code]
Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language - Shaoxiang Chen et al, ECCV 2020. [code]
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos - Shaoxiang Chen et al, ECCV 2020.
Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization - Daizong Liu et al, MM 2020. [code]
Fine-grained Iterative Attention Network for Temporal Language Localization in Videos - Xiaoye Qu et al, MM 2020.
Language Guided Networks for Cross-modal Moment Retrieval - Kun Liu et al, arxiv.

Weakly Supervised:

Weakly-Supervised Video Moment Retrieval via Semantic Completion Network - Zhijie Lin et al, AAAI 2020.
Regularized Two-Branch Proposal Networks for Weakly-Supervised Moment Retrieval in Videos - Zhu Zhang et al, MM 2020.
VLANet: Video-Language Alignment Network for Weakly-Supervised Video Moment Retrieval - Minuk Ma et al, ECCV 2020.

Conferences to be updated:

MM 2020 (some papers are added, wait for proceedings)
EMNLP 2020 (wait for camera-ready)
ICCV 2020
