SCLIP: SBERT + CLIP

Introduction

This project mixes SBERT and CLIP to get better results out of CLIP in a multi-language setup.

0. Requirements

Install the dependencies from requirements.txt.
Download the MSCOCO dataset into the folder SCLIP/coco. The annotations come from MSCOCO 2017 Annotations and the images from MSCOCO 2017 Validation Images. For Google Conceptual Captions, download only the annotations (Image Labels).

Edit the path in preprocessing/config.yaml to point to your dataset folder (e.g. /data/SCLIP).

1. Preprocessing

1.1 To generate train, validation and test files for each dataset, run coco_breaker.py and gcc_breaker.py.

1.2 Testing needs a list of image-caption pairs. To generate it, run pairs.py.

1.3 Because testing covers several languages, translate the pairs list by running translate.py.
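As a rough illustration of what a train/validation/test split involves, here is a minimal sketch. It is not the logic of coco_breaker.py or gcc_breaker.py. The function name `split_dataset`, the 80/10/10 fractions and the fixed seed are assumptions made for this example only.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items reproducibly and cut them into train/val/test lists.

    The fractions and seed are illustrative assumptions, not values
    taken from this repository.
    """
    shuffled = list(items)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


if __name__ == "__main__":
    # Hypothetical caption identifiers standing in for real annotations.
    captions = [f"caption_{i}" for i in range(10)]
    train, val, test = split_dataset(captions)
    print(len(train), len(val), len(test))
```

Fixing the seed means repeated runs produce the same split, which keeps later training and evaluation runs comparable.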

2. Train

Train SCLIP with meta_runner.py to run training with different numbers of epochs and train sizes. To see plots, meta_runner.ipynb is also available. For a single run with fixed epochs and train size, use scrip_training.py instead.

3. Evaluate

To compare the performance of the different models, experiment.py (a notebook version is also available) reports the Mean Reciprocal Rank of the trained model over SBERT when guessing which caption belongs to each image, and does the same with CLIP. For a comparison with Multilingual CLIP, released this year, experiment-three.ipynb can be executed.
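For readers unfamiliar with the metric, the sketch below shows how Mean Reciprocal Rank can be computed from a matrix of image-caption similarity scores. It is an illustration under assumptions, not the code in experiment.py. The function name `mean_reciprocal_rank`, the list-of-lists score layout and the tie handling are all hypothetical choices.

```python
def mean_reciprocal_rank(similarity, correct_indices):
    """Compute MRR for caption retrieval.

    similarity[i][j] is the score between image i and caption j
    (higher means more similar). correct_indices[i] is the index of
    the caption that truly belongs to image i.
    """
    reciprocal_ranks = []
    for scores, correct in zip(similarity, correct_indices):
        # Rank of the correct caption: 1 plus the number of captions
        # that score strictly higher (ties favor the correct caption,
        # an assumption of this sketch).
        rank = 1 + sum(1 for s in scores if s > scores[correct])
        reciprocal_ranks.append(1.0 / rank)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)


if __name__ == "__main__":
    # Made-up scores for three images and three captions.
    similarity = [
        [0.9, 0.1, 0.2],  # correct caption 0 is ranked 1st
        [0.3, 0.2, 0.8],  # correct caption 1 is ranked 3rd
        [0.1, 0.5, 0.4],  # correct caption 2 is ranked 2nd
    ]
    print(mean_reciprocal_rank(similarity, [0, 1, 2]))
```

An MRR of 1.0 means the correct caption is always ranked first. Values closer to 0 mean it tends to appear far down the ranking.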
