Teng Wang's Projects
Temporal action detection with SSN
A curated list of awesome computer vision resources
Reading list for research topics in multimodal machine learning
Recent Advances in Vision and Language Pre-training (VLP)
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Awesome papers & datasets specifically focused on long-term videos.
A curated list of prompt-based papers in computer vision and vision-language learning.
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything
Python code for CIDEr - Consensus-based Image Description Evaluation
Second-place solution to the dense video captioning task in the ActivityNet Challenge (CVPR 2020 workshop)
Dense video captioning in PyTorch
Evaluation code for Dense-Captioning Events in Videos
Code for the paper "Event-centric hierarchical representation for dense video captioning" (TCSVT 2020)
PyTorch implementation of "Efficient Neural Architecture Search via Parameters Sharing"
Event Sequence Generation Network
EVA Series: Visual Representation Fantasies from BAAI
A faster PyTorch implementation of Faster R-CNN
Image captioning codebase in PyTorch (finetunable CNN in the "with_finetune" branch; diverse beam search in the "dbs" branch; self-critical training is in my self-critical.pytorch repository).
MERLOT: Multimodal Neural Script Knowledge Models
Models built with TensorFlow
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Must-read papers on prompt-based tuning for pre-trained language models.
Image-to-image translation in PyTorch (e.g. horse2zebra, edges2cats, and more)
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.
Feature Extractor module for videos using the PySlowFast framework