
Pixel Linguist

Official repository of the paper "Pixel Sentence Representation Learning"

Overview

Installation

conda create -n pixel python=3.9 -y && conda activate pixel
git clone https://github.com/gowitheflow-1998/Pixel-Linguist.git

Package installation

conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
conda install -c conda-forge pycairo pygobject manimpango scikit-learn
cd Pixel-Linguist
pip install -r requirements.txt
pip install -e .

Fallback fonts downloading

(You do not need to download these if you cloned our repo directly.)

python scripts/data/download_fallback_fonts.py data/fallback_fonts

Inference and Evaluation

STS benchmark:

python tools/evaluation_sts.py

Adjust the script to select the specific language you want to evaluate.

BEIR:

python tools/evaluation_retrieval.py

Besides Natural Questions, which we evaluated in the paper, you can conveniently evaluate on the other datasets available in BEIR: simply modify the dataset name in the script, as sketched below.

Reproduce Pixel Linguist Training

Step 0: Visual alignment:

bash run_bash/0-run_unsup.sh

For this step, run the script separately on each of our unsupervised datasets to create 4 checkpoints, then ensemble them using tools/ensemble.py.

Step 1: Topical alignment:

bash run_bash/1-run_wikispan.sh

Step 2: Reasoning alignment:

bash run_bash/2-run_allnli_finetune.sh

Step 3: Multilingual transfer:

bash run_bash/3-run_allnli-pm.sh

For maximum performance, alternate between Step 2 and Step 3 two to three times (see the paper for the exact iterative training procedure, in which a "leapfrogging" pattern emerges). End the training with English AllNLI rather than parallel data.

Note

We find that training with an extra MLP (using the PIXELForSequenceClassification class) but running inference without it (using PIXELForRepresentation, which drops the MLP) slightly boosts semantic performance; this setup produces the numbers reported in the latest version of the paper.


Issues

Reproduction question

The STS-B Spearman score I get is about 0.7371; I reproduced it with both the mteb library and your evaluation_sts.py script.

Building models for Pixel-Linguist/Pixel-Linguist-v0
model type: pixel
Some weights of the model checkpoint at Pixel-Linguist/Pixel-Linguist-v0 were not used when initializing PIXELForRepresentation: ['pooler.linear.bias', 'pooler.ln.weight', 'pooler.linear.weight', 'pooler.ln.bias']
- This IS expected if you are initializing PIXELForRepresentation from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing PIXELForRepresentation from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
100% 87/87 [00:12<00:00,  7.08it/s]
spearman all languages: [0.7371328443627424]
anisotropy all languages: [0.207]

It seems about 4 points lower than the number in your paper.

Renderer/font loading (from the run log):

INFO:pixel.data.rendering.rendering_utils:loading text renderer configuration file ./Pixel-Linguist-v0/text_renderer_config.json
INFO:pixel.data.rendering.pangocairo_renderer:Loading font from /content/Pixel-Linguist/data/fallback_fonts/GoNotoCurrent.ttf

Per-task MTEB scores on the test split, condensed from the full run log (Pearson / Spearman, rounded to four decimals):

| Task         | cos_sim         | Manhattan       | Euclidean       |
| ------------ | --------------- | --------------- | --------------- |
| SICK-R       | 0.7781 / 0.6930 | 0.7458 / 0.6917 | 0.7465 / 0.6930 |
| STS12        | 0.8149 / 0.7351 | 0.7818 / 0.7352 | 0.7811 / 0.7351 |
| STS13        | 0.6426 / 0.6554 | 0.6578 / 0.6567 | 0.6562 / 0.6554 |
| STS14        | 0.6902 / 0.6729 | 0.6868 / 0.6731 | 0.6864 / 0.6729 |
| STS15        | 0.7741 / 0.7921 | 0.7796 / 0.7913 | 0.7805 / 0.7921 |
| STS16        | 0.6910 / 0.6965 | 0.6959 / 0.6973 | 0.6952 / 0.6965 |
| STSBenchmark | 0.7369 / 0.7371 | 0.7447 / 0.7349 | 0.7465 / 0.7371 |
