
Pixel Linguist

Official repository of the paper "Pixel Sentence Representation Learning"

Overview

Installation

conda create -n pixel python=3.9 -y && conda activate pixel
git clone https://github.com/gowitheflow-1998/Pixel-Linguist.git

Package installation

conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
conda install -c conda-forge pycairo pygobject manimpango scikit-learn
cd Pixel-Linguist
pip install -r requirements.txt
pip install -e .

Fallback fonts downloading

(You do not need to download these if you cloned our repo directly.)

python scripts/data/download_fallback_fonts.py data/fallback_fonts

Inference and Evaluation

STS benchmark:

python tools/evaluation_sts.py

Adjust the script to select the specific language you want to evaluate.

BEIR:

python tools/evaluation_retrieval.py

Besides Natural Questions, which we evaluated in the paper, you can conveniently evaluate on the other datasets available in BEIR: simply modify the dataset name in the script, as sketched below.

Reproduce Pixel Linguist Training

Step 0: Visual alignment:

bash run_bash/0-run_unsup.sh

For this step, run the script separately on each of our unsupervised datasets to create 4 checkpoints, then ensemble them using tools/ensemble.py.

Step 1: Topical alignment:

bash run_bash/1-run_wikispan.sh

Step 2: Reasoning alignment:

bash run_bash/2-run_allnli_finetune.sh

Step 3: Multilingual transfer:

bash run_bash/3-run_allnli-pm.sh

For maximum performance, alternate between Step 2 and Step 3 two to three times (see the paper for the exact iterative training procedure, in which a "leapfrogging" pattern emerges). End the training with English AllNLI rather than parallel data.

Note

We find that training with an extra MLP (using the PIXELForSequenceClassification class) but running inference without it (using PIXELForRepresentation, which drops the MLP) slightly boosts semantic performance; this setup produces the numbers reported in the latest version of the paper.


Issues

Reproduction question

The STS-B Spearman score I get is about 0.7371; I reproduced it with both the mteb library and your evaluation_sts.py script.

Building models for Pixel-Linguist/Pixel-Linguist-v0
model type: pixel
Some weights of the model checkpoint at Pixel-Linguist/Pixel-Linguist-v0 were not used when initializing PIXELForRepresentation: ['pooler.linear.bias', 'pooler.ln.weight', 'pooler.linear.weight', 'pooler.ln.bias']
- This IS expected if you are initializing PIXELForRepresentation from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing PIXELForRepresentation from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
100% 87/87 [00:12<00:00,  7.08it/s]
spearman all languages: [0.7371328443627424]
anisotropy all languages: [0.207]

It seems about 4 points lower than the number in your paper.

Renderer/font loading (from the run log):

INFO:pixel.data.rendering.rendering_utils:loading text renderer configuration file ./Pixel-Linguist-v0/text_renderer_config.json
INFO:pixel.data.rendering.pangocairo_renderer:Loading font from /content/Pixel-Linguist/data/fallback_fonts/GoNotoCurrent.ttf

Per-task MTEB scores on the test split, condensed from the full run log (Pearson / Spearman, rounded to four decimals):

| Task         | cos_sim         | Manhattan       | Euclidean       |
| ------------ | --------------- | --------------- | --------------- |
| SICK-R       | 0.7781 / 0.6930 | 0.7458 / 0.6917 | 0.7465 / 0.6930 |
| STS12        | 0.8149 / 0.7351 | 0.7818 / 0.7352 | 0.7811 / 0.7351 |
| STS13        | 0.6426 / 0.6554 | 0.6578 / 0.6567 | 0.6562 / 0.6554 |
| STS14        | 0.6902 / 0.6729 | 0.6868 / 0.6731 | 0.6864 / 0.6729 |
| STS15        | 0.7741 / 0.7921 | 0.7796 / 0.7913 | 0.7805 / 0.7921 |
| STS16        | 0.6910 / 0.6965 | 0.6959 / 0.6973 | 0.6952 / 0.6965 |
| STSBenchmark | 0.7369 / 0.7371 | 0.7447 / 0.7349 | 0.7465 / 0.7371 |
