evaluating-multimodal-cl's Introduction

multi-modal-continual-learning

Used repositories

Evaluation of continual learning models:

  1. Retrieval evaluation

    1. Assesses how well the model can retrieve relevant items
    2. The purpose is:
      • to measure the model’s ability to understand and relate multimodal data (like image-to-text or text-to-image retrieval)
      • and to evaluate how well the model has learned to map images and text into a shared embedding space
    3. Process:
      • image-to-text retrieval: given an image, retrieve the most relevant text descriptions
      • text-to-image retrieval: given a text description, retrieve the most relevant images
      • the performance is usually measured using metrics like Recall@K
    4. Objective:
      • the model’s ability to retrieve relevant data across modalities (image-to-text and text-to-image)
  2. Transfer evaluation

    1. Measures how well the model can adapt its learned representations to new tasks or datasets
    2. The purpose is to
      • assess the generalization capability of the model’s representations to new and diverse tasks
    3. Process:
      • fine-tune the pre-trained model on a new dataset or task like classification
      • evaluate the performance on the new task using task-specific metrics like accuracy
    4. Objective:
      • model’s ability to transfer its learned representations to new, often task-specific contexts
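The retrieval process described above can be sketched as a Recall@K computation over a query-by-gallery similarity matrix. This is a minimal illustration under stated assumptions, not the repository's implementation: the `recall_at_k` and `transpose` helpers, the toy similarity scores, and the one-caption-per-image pairing are all made up for the example. Real benchmarks often pair several captions with each image, which the `relevant` sets can also express.

```python
from typing import List, Sequence, Set


def recall_at_k(similarity: Sequence[Sequence[float]],
                relevant: Sequence[Set[int]],
                k: int) -> float:
    """Fraction of queries with at least one relevant item among the top-k results.

    similarity[q][g] is the score between query q and gallery item g
    (for example, cosine similarity between an image and a caption embedding).
    relevant[q] is the set of gallery indices that count as correct for query q.
    """
    hits = 0
    for scores, gold in zip(similarity, relevant):
        # Rank gallery items by descending score and keep the k best.
        ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
        if gold.intersection(ranked[:k]):
            hits += 1
    return hits / len(similarity)


def transpose(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Swap queries and gallery so the same matrix serves both retrieval directions."""
    return [list(column) for column in zip(*matrix)]


# Hypothetical toy data: 3 images (rows) x 3 captions (columns);
# caption i is assumed to describe image i.
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.2, 0.7],
]
pairs = [{0}, {1}, {2}]

i2t_r1 = recall_at_k(sim, pairs, k=1)             # image-to-text Recall@1
i2t_r2 = recall_at_k(sim, pairs, k=2)             # image-to-text Recall@2
t2i_r1 = recall_at_k(transpose(sim), pairs, k=1)  # text-to-image Recall@1
```

On this toy matrix, image 1 ranks the wrong caption first, so image-to-text Recall@1 is 2/3 while Recall@2 reaches 1.0.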

Both methods play an important role in assessing the performance of CL models:

  • Retrieval evaluation helps in understanding how well the model retains and uses its multimodal embedding space over time
  • Transfer evaluation helps in assessing the adaptability and robustness of the model’s learned features when applied to new and varied tasks, providing insight into the model’s ability to generalize and avoid catastrophic forgetting

Evaluation done so far

  • The authors evaluate MTIL using three metrics: Transfer, Average, and Last.
    • The Transfer metric assesses the model’s zero-shot transfer capability on unseen data.
    • Last evaluates the model’s memorization ability on historical knowledge.
    • Average is a composite metric measuring the mean performance across “Transfer” and “Last”.
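One way to compute these three metrics is from an accuracy matrix, where entry `acc[i][j]` is the accuracy on task `j` after training on task `i`. The sketch below is an assumption about how the metrics could be formalized, not a formula taken from this repository: Transfer averages the entries for tasks the model has not yet trained on, Last averages the final row, and Average covers every training step, so it mixes zero-shot and post-training accuracy. The function name `mtil_metrics` and the numbers are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def mtil_metrics(acc: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Compute Transfer, Average, and Last from a square accuracy matrix.

    acc[i][j] = accuracy on task j measured after training on task i
    (tasks are learned in the order 0, 1, ..., n-1).
    """
    n = len(acc)
    # Transfer: for each task j > 0, accuracy before the model trained on it (steps i < j).
    transfer = mean(mean(acc[i][j] for i in range(j)) for j in range(1, n))
    # Last: accuracy on every task after the final training step.
    last = mean(acc[n - 1][j] for j in range(n))
    # Average: accuracy on each task averaged over all training steps.
    average = mean(mean(acc[i][j] for i in range(n)) for j in range(n))
    return {"transfer": transfer, "average": average, "last": last}


# Hypothetical accuracies (in percent) for a sequence of three tasks.
acc = [
    [90.0, 60.0, 50.0],
    [88.0, 85.0, 48.0],
    [86.0, 83.0, 80.0],
]
metrics = mtil_metrics(acc)
```

Entries above the diagonal contribute to Transfer because the model has not seen those tasks yet at that step; the bottom row contributes to Last.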

New components

  • This repository extends the existing code with zero-shot retrieval capability.

  • In the paper Learning Transferable Visual Models From Natural Language Supervision, the authors evaluate the zero-shot transfer performance of CLIP for both text and image retrieval on the Flickr30k and MSCOCO datasets.

clip_retrieval_results.png

  • Retrieval evaluation - TODO
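As a rough illustration of what a zero-shot retrieval step involves, the sketch below ranks gallery items by cosine similarity in a shared embedding space. It does not call CLIP or any code from this repository: the embeddings are hypothetical hand-written vectors standing in for the output of an image encoder and a text encoder, and the helper names are invented for the example.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_similarity_matrix(queries: Sequence[Sequence[float]],
                             gallery: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return sim[q][g], the cosine similarity between query q and gallery item g."""
    q_unit = [l2_normalize(v) for v in queries]
    g_unit = [l2_normalize(v) for v in gallery]
    return [[sum(a * b for a, b in zip(q, g)) for g in g_unit] for q in q_unit]


def retrieve(queries: Sequence[Sequence[float]],
             gallery: Sequence[Sequence[float]],
             k: int) -> List[List[int]]:
    """For each query, return the indices of the k most similar gallery items."""
    sim = cosine_similarity_matrix(queries, gallery)
    return [
        sorted(range(len(row)), key=lambda g: row[g], reverse=True)[:k]
        for row in sim
    ]


# Hypothetical 2-D embeddings; a real encoder would produce much larger vectors.
image_embeddings = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
text_embeddings = [[0.9, 0.1], [0.1, 0.8], [0.7, 0.6]]

text_to_image = retrieve(text_embeddings, image_embeddings, k=1)
image_to_text = retrieve(image_embeddings, text_embeddings, k=1)
```

No fine-tuning happens anywhere in this flow, which is what makes the retrieval zero-shot: the ranking depends only on the embeddings the pre-trained encoders already produce.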
