
vizseq's People

Contributors

bigfootjon, dependabot[bot], kahne, multipath

vizseq's Issues

[Bug] Getting Import error (Tokenizer13a) when following instructions

🐛 Bug

Hi there,

I installed vizseq as described in the README and then tried to import Tokenizer13a. The import fails with:
ImportError: cannot import name 'Tokenizer13a'

git clone https://github.com/facebookresearch/vizseq.git
cd vizseq
pip install --editable ./

> Successfully installed sacrebleu-2.0.0 vizseq-0.1.15


To reproduce

git clone https://github.com/facebookresearch/vizseq.git
cd vizseq
pip install --editable ./
python 
>>> from sacrebleu.tokenizers import Tokenizer13a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'Tokenizer13a'

System information

  • VizSeq Version 0.1.15
  • Python version 3.6
  • Amazon Linux AMI (fedora)
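
The install above pulled in sacrebleu 2.0.0, and the failing import suggests that the tokenizer class is not exposed from `sacrebleu.tokenizers` in that release. A minimal sketch of a fallback import follows, assuming the class may live at one of several module paths. `import_first` is a hypothetical helper, and the candidate paths in the usage comment are assumptions to check against the installed sacrebleu:

```python
import importlib


def import_first(candidates):
    """Return the first (module, attribute) pair that imports cleanly."""
    errors = []
    for module_name, attr in candidates:
        try:
            return getattr(importlib.import_module(module_name), attr)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}.{attr}: {exc}")
    raise ImportError("no candidate could be imported:\n" + "\n".join(errors))


# Hypothetical usage; both paths are assumptions about sacrebleu's layout:
# Tokenizer13a = import_first([
#     ("sacrebleu.tokenizers", "Tokenizer13a"),
#     ("sacrebleu.tokenizers.tokenizer_13a", "Tokenizer13a"),
# ])
```

Pinning sacrebleu to a version the installed vizseq was written against is the other obvious route; the fallback only helps when both layouts have to be supported.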

Additional context

Add any other context about the problem here.

Question about `tag` and `group` in official example

In the official scorer example from https://facebookresearch.github.io/vizseq/docs/getting_started/scorer_example/, the output in the second block confuses me.

Corpus-level BLEU: 67.945
Sentence-level BLEU: [75.984, 61.479]
Group BLEU: {'Test Group 2': 75.984, 'Test Group 1': 75.984}

I can see two generated sentences with corresponding reference sentences in the first block.

ref = [['This is a sample #1 reference.', 'This is a sample #2 reference.']]
hypo = ['This is a sample #1 prediction.', 'This is a sample #2 model prediction.']
tags = [['Test Group 1', 'Test Group 2']]
scores = scorer.score(hypo, ref, tags=tags)
print(f'Corpus-level BLEU: {scores.corpus_score}')
print(f'Sentence-level BLEU: {scores.sent_scores}')
print(f'Group BLEU: {scores.group_scores}')

The first sample belongs to Test Group 1 and the second sample belongs to Test Group 2. Unless I misunderstand how tags work, the sentence-level BLEU scores imply that the Group BLEU should be {'Test Group 2': 61.479, 'Test Group 1': 75.984}.

But the execution result is Group BLEU: {'Test Group 2': 75.984, 'Test Group 1': 75.984}
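
The expectation in the question can be written out as a per-tag average of the sentence scores. This is only a sketch of the averaging the question assumes, not necessarily how vizseq computes group BLEU; `group_means` is a hypothetical helper and `sent_tags[i]` is assumed to hold the tags of sentence `i`:

```python
from collections import defaultdict


def group_means(sent_scores, sent_tags):
    """Average sentence scores per tag; sent_tags[i] lists the tags of sentence i."""
    buckets = defaultdict(list)
    for score, tags in zip(sent_scores, sent_tags):
        for tag in tags:
            buckets[tag].append(score)
    return {tag: sum(vals) / len(vals) for tag, vals in buckets.items()}


# Sentence scores and tags taken from the example in the question.
expected = group_means([75.984, 61.479], [["Test Group 1"], ["Test Group 2"]])
print(expected)  # {'Test Group 1': 75.984, 'Test Group 2': 61.479}
```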

View Scores NoneType not subscriptable

When I call view_scores with ROUGE or any other metric that is not BLEU, for example:

vizseq.view_scores(ref, hypo, ["metric that's not bleu"], tags=tag)

I am getting:

~/.local/lib/python3.6/site-packages/vizseq/scorers/__init__.py in _score_multiprocess_averaged(self, hypothesis, references, tags, sent_score_func)
    170             for t in tag_set:
    171                 indices = [i for i, cur in enumerate(tags) if t in cur]
--> 172                 group_scores[t] = np.mean([sent_scores[i] for i in indices])
    173 
    174         return VizSeqScore.make(

~/.local/lib/python3.6/site-packages/vizseq/scorers/__init__.py in <listcomp>(.0)
    170             for t in tag_set:
    171                 indices = [i for i, cur in enumerate(tags) if t in cur]
--> 172                 group_scores[t] = np.mean([sent_scores[i] for i in indices])
    173 
    174         return VizSeqScore.make(

TypeError: 'NoneType' object is not subscriptable

Any idea why this is happening? Every text in my data is at least 3 tokens long.

This works fine when running view_examples.
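
The failing frame indexes `sent_scores[i]`, so the message is consistent with `sent_scores` being `None` at the point where group scores are averaged. A minimal sketch that reproduces the message and shows one possible guard follows; `safe_group_mean` is a hypothetical helper, not part of vizseq:

```python
def safe_group_mean(sent_scores, indices):
    """Mean of the selected sentence scores, or None when they are unavailable."""
    if sent_scores is None or not indices:
        return None
    return sum(sent_scores[i] for i in indices) / len(indices)


# Reproduce the reported message with a plain None in place of the scores.
sent_scores = None
try:
    [sent_scores[i] for i in [0]]
except TypeError as exc:
    message = str(exc)

print(message)  # 'NoneType' object is not subscriptable
```

A guard like this only hides the symptom; why the sentence scores are missing for non-BLEU metrics would still need to be traced in the scorer.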

[Feature Request] Update BertScorer with oop implementation

🚀 Feature Request

The new version (0.3.1) of BERTScore (https://github.com/Tiiiger/bert_score) supports an object-oriented (OOP) interface. vizseq currently uses the functional interface, which could be replaced with the OOP one.

Motivation

Currently, calling BERTScore in a validation loop reloads the model on every call. The OOP interface avoids this.

Pitch

I see two solutions: (i) create a separate scorer named bert_score_oop, or (ii) add an argument to the current bert_score implementation that selects whether to use the OOP interface.

Are you willing to open a pull request?
Yes, I can send a pull request
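
The motivation boils down to loading the model once and reusing it. A minimal sketch of that pattern follows, using a stand-in factory instead of a real BERTScore model; `LazyScorer` and `fake_factory` are hypothetical names introduced for illustration:

```python
class LazyScorer:
    """Build an expensive scorer once and reuse it across calls."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable that loads the model
        self._scorer = None

    def score(self, hypotheses, references):
        if self._scorer is None:
            self._scorer = self._factory()  # loaded on first use only
        return self._scorer(hypotheses, references)


loads = []  # records every (simulated) model load


def fake_factory():
    """Stand-in for a slow model load; returns an exact-match scorer."""
    loads.append(1)
    return lambda hyps, refs: [float(h == r) for h, r in zip(hyps, refs)]


scorer = LazyScorer(fake_factory)
for _ in range(3):  # e.g. three validation rounds
    result = scorer.score(["a", "b"], ["a", "c"])

print(len(loads), result)  # 1 [1.0, 0.0]
```

In a real integration the factory would construct the object-oriented BERTScore scorer, so that repeated validation calls share one loaded model.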

[Bug] BLEUScorer uses wrong default tokenizer.

🐛 Bug

vizseq.scorers.bleu.BLEUScorer does not use Tokenizer13a by default. When I look at the code, it looks like it should be used by default. sacrebleu library uses the Tokenizer13a by default as well.

To reproduce

Minimal Code/Config snippet to reproduce

import vizseq

scorer = vizseq.scorers.bleu.BLEUScorer()
print(scorer.score(["This is really nice."], [["That's really nice."]]))
# corpus_score = 31.947

scorer = vizseq.scorers.bleu.BLEUScorer(extra_args={'tokenizer': '13a'})
print(scorer.score(["This is really nice."], [["That's really nice."]]))
# corpus_score = 39.764

Stack trace/error message

There is no stack trace. The cause is that the variable tokenizer is set to the string 'none'. When get_default_args is called, the default value 13a of the parameter tokenize is never used, because the string 'none' is passed explicitly.

Expected Behavior

vizseq.scorers.bleu.BLEUScorer should use Tokenizer13a by default.
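
The mechanism described above can be sketched in isolation. The function names below mirror the report, but their bodies are assumptions written for illustration and are not the vizseq or sacrebleu source:

```python
def get_default_args(tokenize="13a"):
    """Hypothetical stand-in for a helper whose default tokenizer is '13a'."""
    return {"tokenize": tokenize}


def build_args_buggy(extra_args=None):
    extra_args = extra_args or {}
    # Falling back to the string 'none' always overrides the '13a' default.
    tokenizer = extra_args.get("tokenizer", "none")
    return get_default_args(tokenize=tokenizer)


def build_args_fixed(extra_args=None):
    extra_args = extra_args or {}
    # Only forward the tokenizer when the caller actually supplied one.
    kwargs = {}
    if "tokenizer" in extra_args:
        kwargs["tokenize"] = extra_args["tokenizer"]
    return get_default_args(**kwargs)


print(build_args_buggy())  # {'tokenize': 'none'}
print(build_args_fixed())  # {'tokenize': '13a'}
```

With the fixed variant, an explicit `{'tokenizer': 'none'}` still disables tokenization, while omitting the key falls through to the default.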

System information

  • vizseq==0.1.15
  • python==3.7.3
  • macOS

🐛 AttributeError: 'VizSeqLogger' object has no attribute 'set_console_mode'

🐛 Bug

When I try to run the web app with the example data, I get this error:

AttributeError: 'VizSeqLogger' object has no attribute 'set_console_mode'

To reproduce

Follow the README instructions: download the example data and run:

python -m vizseq.server --port 9001 --data-root ./examples/data

Stack trace/error message

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/me/workspace/vizseq/vizseq/server.py", line 14, in <module>
    logger.set_console_mode(enable=True)
AttributeError: 'VizSeqLogger' object has no attribute 'set_console_mode'

Expected Behavior

The code runs normally.

System information

  • VizSeq Version : 0.1.2
  • Python version : 3.6.8
  • Operating system : Ubuntu 16.04

pip3 install vizseq failed on AArch64, Fedora 33

[jw@cn05 ~]$ pip3 install vizseq
Defaulting to user installation because normal site-packages is not writeable
Collecting vizseq
Using cached vizseq-0.1.15-py3-none-any.whl (81 kB)
Collecting nltk>=3.5
Using cached nltk-3.5-py3-none-any.whl
Collecting sacrebleu>=1.4.13
Using cached sacrebleu-1.5.0-py3-none-any.whl (65 kB)
Collecting langid
Using cached langid-1.1.6.tar.gz (1.9 MB)
Requirement already satisfied: tqdm in ./.local/lib/python3.9/site-packages (from vizseq) (4.31.1)
Collecting google-cloud-translate
Using cached google_cloud_translate-3.0.2-py2.py3-none-any.whl (93 kB)
Collecting torch
Using cached torch-0.1.2.post2.tar.gz (128 kB)
Requirement already satisfied: numpy in ./.local/lib/python3.9/site-packages (from vizseq) (1.19.5)
Requirement already satisfied: jinja2 in ./.local/lib/python3.9/site-packages (from vizseq) (2.10.3)
Collecting soundfile
Using cached SoundFile-0.10.3.post1-py2.py3-none-any.whl (21 kB)
Requirement already satisfied: py-rouge in ./.local/lib/python3.9/site-packages (from vizseq) (1.1)
Requirement already satisfied: matplotlib in ./.local/lib/python3.9/site-packages (from vizseq) (3.3.2)
Requirement already satisfied: tornado in ./.local/lib/python3.9/site-packages (from vizseq) (6.1)
Requirement already satisfied: IPython in ./.local/lib/python3.9/site-packages (from vizseq) (7.18.1)
Collecting bert-score
Using cached bert_score-0.3.7-py3-none-any.whl (53 kB)
Requirement already satisfied: pandas in ./.local/lib/python3.9/site-packages (from vizseq) (1.1.4)
Collecting laserembeddings
Using cached laserembeddings-1.1.1-py3-none-any.whl (13 kB)
Requirement already satisfied: click in ./.local/lib/python3.9/site-packages (from nltk>=3.5->vizseq) (7.1.2)
Requirement already satisfied: regex in ./.local/lib/python3.9/site-packages (from nltk>=3.5->vizseq) (2020.11.13)
Requirement already satisfied: joblib in ./.local/lib/python3.9/site-packages (from nltk>=3.5->vizseq) (0.17.0)
Collecting portalocker
Using cached portalocker-2.2.1-py2.py3-none-any.whl (15 kB)
Collecting transformers>=3.0.0
Using cached transformers-4.3.3-py3-none-any.whl (1.9 MB)
Collecting bert-score
Using cached bert_score-0.3.6-py3-none-any.whl (53 kB)
Using cached bert_score-0.3.5-py3-none-any.whl (52 kB)
Using cached bert_score-0.3.4-py3-none-any.whl (52 kB)
Using cached bert_score-0.3.3-py3-none-any.whl (52 kB)
Using cached bert_score-0.3.2-py3-none-any.whl (52 kB)
Using cached bert_score-0.3.1-py3-none-any.whl (51 kB)
Using cached bert_score-0.3.0-py3-none-any.whl (48 kB)
Using cached bert_score-0.2.3-py3-none-any.whl (15 kB)
Using cached bert_score-0.2.2-py3-none-any.whl (14 kB)
Using cached bert_score-0.1.2-py3-none-any.whl (9.4 kB)
Using cached bert_score-0.1.1-py3-none-any.whl (9.4 kB)
Using cached bert_score-0.1.0-py3-none-any.whl (7.3 kB)
INFO: pip is looking at multiple versions of to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of sacrebleu to determine which version is compatible with other requirements. This could take a while.
Collecting sacrebleu>=1.4.13
Using cached sacrebleu-1.4.14-py3-none-any.whl (64 kB)
Using cached sacrebleu-1.4.13-py3-none-any.whl (43 kB)
INFO: pip is looking at multiple versions of nltk to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of vizseq to determine which version is compatible with other requirements. This could take a while.
Collecting vizseq
Using cached vizseq-0.1.14-py3-none-any.whl (81 kB)
Using cached vizseq-0.1.13-py3-none-any.whl (81 kB)
Using cached vizseq-0.1.12-py3-none-any.whl (81 kB)
Using cached vizseq-0.1.11-py3-none-any.whl (81 kB)
Using cached vizseq-0.1.10-py3-none-any.whl (80 kB)
Using cached vizseq-0.1.9-py3-none-any.whl (78 kB)
Requirement already satisfied: nltk in ./.local/lib/python3.9/site-packages (from vizseq) (3.4.5)
Collecting sacrebleu==1.4.7
Using cached sacrebleu-1.4.7-py3-none-any.whl (59 kB)
Requirement already satisfied: typing in ./.local/lib/python3.9/site-packages (from sacrebleu==1.4.7->vizseq) (3.7.4.3)
Collecting mecab-python3
Using cached mecab-python3-1.0.3.tar.gz (77 kB)
INFO: pip is looking at multiple versions of to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of sacrebleu to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install vizseq and vizseq==0.1.9 because these package versions have conflicting dependencies.

The conflict is caused by:
vizseq 0.1.9 depends on torch
bert-score 0.3.7 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.3.6 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.3.5 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.3.4 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.3.3 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.3.2 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.3.1 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.3.0 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.2.3 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.2.2 depends on torch>=1.0.0
vizseq 0.1.9 depends on torch
bert-score 0.1.2 depends on torch>=0.4.1
vizseq 0.1.9 depends on torch
bert-score 0.1.1 depends on torch>=0.4.1
vizseq 0.1.9 depends on torch
bert-score 0.1.0 depends on torch>=0.4.1

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
[jw@cn05 ~]$

[Question] Calculated BLEU score

Hi :)

The tool returns a BLEU score for the machine translation and runs great in general, but I am not sure whether the score is computed at the sentence level or at the corpus level. I haven't been able to find anything conclusive in the sacreBLEU implementation, so I am hoping you can help me with this :)

Best regards,
Tobias

Plans for windows 10 support

Hi

This repo is looking really cool and I would like to use the features it offers for my thesis :)

What are the plans/timeline for implementing win 10 support?

Best regards,
Tobias

[Bug] Vizseq CSS breaks Jupyter layout

🐛 Bug

Executing vizseq.view_stats breaks the layout of the Jupyter notebook. The menu at the top obscures most of the screen, and a blank area of ~60px appears at the top of the page.

To reproduce

** Minimal Code/Config snippet to reproduce **

  1. Start Jupyter: jupyter notebook
  2. Open any of the example notebooks, e.g. speech_translation.
  3. Execute the cells one by one.

When the first cell containing vizseq.view_stats finishes, the layout changes and appears broken.

Expected Behavior

The display of tables and graphs by vizseq does not affect the layout of the Jupyter notebook.

System information

  • VizSeq Version: '0.1.11' (clone from master yesterday)
  • Python version: Python 3.8.1 (default, Jan 8 2020, 23:09:20) [GCC 9.2.0] on linux
  • Operating system: Manjaro Linux

Additional context

Cause: The bootstrap.min.css and an inline stylesheet loaded by vizseq break the layout. The inline stylesheet is:

body {
   padding-top: 60px; /* 60px to make the container go all the way to the bottom of the topbar */
}

The inline stylesheet is responsible for the blank bar at the top while Bootstrap breaks the menu's formatting.

To test this, disable both stylesheets in the stylesheet editor of the browser's developer tools.

[Bug] - cannot import name 'tokenize_13a' from 'sacrebleu'

🐛 Bug

I just followed the installation steps and got this error.

To reproduce

** Stack trace/error message **

(base) diegomoussallem@Diegos-MBP examples % python -m vizseq.server --port 9001 --data-root examples/data
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.7/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/opt/anaconda3/lib/python3.7/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/__init__.py", line 15, in <module>
    from vizseq.ipynb import *
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/ipynb/__init__.py", line 8, in <module>
    from .core import (view_examples, view_n_grams, view_stats, view_scores,
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/ipynb/core.py", line 15, in <module>
    from vizseq._data import (VizSeqDataSources, PathOrPathsOrDictOfStrList,
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/_data/__init__.py", line 14, in <module>
    from .config_manager import VizSeqTaskConfigManager, VizSeqGlobalConfigManager
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/_data/config_manager.py", line 13, in <module>
    from .tokenizers import VizSeqTokenization
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/_data/tokenizers.py", line 10, in <module>
    from sacrebleu import tokenize_13a, tokenize_v14_international, tokenize_zh
ImportError: cannot import name 'tokenize_13a' from 'sacrebleu' (/opt/anaconda3/lib/python3.7/site-packages/sacrebleu/__init__.py)

🐛 TypeError: score() got an unexpected keyword argument 'bert'

🐛 Bug

I tried to apply BertScore to my data, but received this error:

TypeError: score() got an unexpected keyword argument 'bert'

To reproduce

In the configuration, select BertScore as the metric.
Refresh the page.

Stack trace/error message

Traceback (most recent call last):
  File "/home/me/.venv/presum/lib/python3.6/site-packages/tornado/web.py", line 1590, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "/home/me/workspace/vizseq/vizseq/server.py", line 103, in get
    pd = wv.get_page_data()
  File "/home/me/workspace/vizseq/vizseq/_view/web_view.py", line 158, in get_page_data
    sorting_metric=self.sorting_metric, need_lang_tags=True
  File "/home/me/workspace/vizseq/vizseq/_view/data_view.py", line 132, in get
    for s in metrics
  File "/home/me/workspace/vizseq/vizseq/_view/data_view.py", line 132, in <dictcomp>
    for s in metrics
  File "/home/me/workspace/vizseq/vizseq/_view/data_view.py", line 130, in <dictcomp>
    for m, hh in cur_hypo.items()
  File "/home/me/workspace/vizseq/vizseq/scorers/bert_score.py", line 28, in score
    no_idf=True, verbose=self.verbose
TypeError: score() got an unexpected keyword argument 'bert'

Expected Behavior

Able to see BertScore.

System information

  • VizSeq Version : 0.1.2
  • Python version : 3.6.8
  • Operating system : Ubuntu 16.04
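
The traceback shows vizseq passing a `bert` keyword that the installed `score()` function does not accept. One defensive pattern, sketched here under the assumption that unsupported keywords may simply be dropped, is to filter keyword arguments against the callee's signature. `call_with_supported_kwargs` and `new_style_score` are hypothetical names, and `new_style_score` is only a stand-in, not the bert_score API:

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping keyword arguments its signature does not accept."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(*args, **kwargs)  # func takes **kwargs, pass everything
    accepted = {k: v for k, v in kwargs.items() if k in params}
    return func(*args, **accepted)


def new_style_score(cands, refs, model_type="some-model", verbose=False):
    """Hypothetical stand-in for a scoring function without a 'bert' parameter."""
    return model_type, len(cands), len(refs)


# The unsupported 'bert' keyword is dropped instead of raising TypeError.
out = call_with_supported_kwargs(
    new_style_score, ["a"], ["b"], bert="x", verbose=True
)
print(out)  # ('some-model', 1, 1)
```

Silently dropping arguments can hide a misconfiguration, so pinning a compatible bert_score version or updating the argument name in the scorer may be the cleaner fix.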

🐛 Uncaught exception GET

🐛 Bug

When I try to run the web app with the example data, I get this error:

Uncaught exception GET

To reproduce

Follow the README instructions: download the example data and run:

python -m vizseq.server --port 9001 --data-root ./examples/data

The server starts fine, but when accessing the webapp at localhost:9001, I can only see 500: Internal Server Error.

Stack trace/error message

INFO - 11/04/19 10:36:39 - 0:00:00 - Application Started
You can navigate to http://localhost:9001
ERROR - 11/04/19 10:36:42 - 0:00:03 - Uncaught exception GET / (192.168.0.30)
                                      HTTPServerRequest(protocol='http', host='192.168.0.231:9001', method='GET', uri='/', version='HTTP/1.1', remote_ip='192.168.0.30')
ERROR - 11/04/19 10:36:42 - 0:00:03 - 500 GET / (192.168.0.30) 1.18ms

Expected Behavior

The web app runs normally.

System information

  • VizSeq Version : 0.1.2
  • Python version : 3.6.8
  • Operating system : Ubuntu 16.04

Example speech task (IWSLT17 dev) not pairing correct audio source with reference [Bug]

🐛 Bug

Audio segments from speech data in example speech translation task (IWSLT17 dev) are not correctly associated with reference data.

Only the first TED talk audio segments are correctly aligned to the reference... playing the audio segments related to any other talks (from # 3 / 10 / 887 ( 153 / 887 ) onwards on page 16 of the task using the defaults) results in the segments of the first TED talk audio being played rather than the segments specified in the task directory speech_translation_iwslt17_dev/src_0.zip/source.txt

To reproduce

Get the example speech task data (IWSLT17 dev):

$ bash get_example_data.sh speech_translation_iwslt17_dev

Start the server and navigate to:
http://127.0.0.1:5000/view?t=speech_translation_iwslt17_dev&m=&q=&p_sz=10&p_no=16&s=0&s_metric=
Play the audio segments: the first two on this page are correctly associated with the reference text; from # 3 / 10 / 887 ( 153 / 887 ) onwards they are not.
