Yuchong Sun 孙宇冲's Projects
A VideoQA dataset based on the videos from ActivityNet
Reading list for research topics in embodied vision
Reading list for research topics in multimodal machine learning
A curated list of multimodal-related research.
Bridging Vision and Language Model
Bling's Object detection tool
The code repository for "Cross-Modal and Hierarchical Modeling of Video and Text" in PyTorch
Video embeddings for retrieval with natural language queries
Example models using DeepSpeed
End-to-End Object Detection with Transformers
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
MERLOT: Multimodal Neural Script Knowledge Models
Train transformer language models with reinforcement learning.
[EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction
PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision"
Config files for my GitHub profile.