A repository of papers in the field of Vision and Language.
Each week, a link to join the Meetup will be posted here, next to the paper, a few minutes before the discussion starts.
During the discussion:
- Raise your (virtual) hand if you want to speak.
- Feel free to make comments or post your questions in the chat.
- 25. Jan 2022 (6pm CET) (Introductory reading) Shagun Uppal, Sarthak Bhagat, Devamanyu Hazarika, Navonil Majumder, Soujanya Poria, Roger Zimmermann, Amir Zadeh. Multimodal research in vision and language: A review of current and emerging trends. Information Fusion, Volume 77, 2022, Pages 149-171, ISSN 1566-2535.
For those new to machine learning, here is some recommended reading material:
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
- Goldberg, Y. (2016). A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research, 57, 345-420.
- Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., & Yu, P. S. (2019). A comprehensive survey on graph neural networks. arXiv preprint arXiv:1901.00596.
Transformer-related resources: