my89 / co-squac

A repository for converting between CoQA, SQuAD2, and QuAC and visualizing the data.

Python 6.55% TeX 93.45%

co-squac's Issues

Any future plans to include DROP data and HOTPOTQA data?

Great work unifying these important reading comprehension datasets. Are there any future plans to include the newly released DROP and HOTPOTQA data? Thanks.

Question: why is the string 'CANNOTANSWER' added to every context field?

Why is the string 'CANNOTANSWER' added to every "context" field in the converted datasets (SQuAD2/CoQA/QuAC)? In the original datasets, 'CANNOTANSWER' is not there.

For example:
"context": "The Normans ... 11th centuries gave their name to Normandy, a region in France. They were descended from ... and it continued to evolve over the succeeding centuries. CANNOTANSWER",

Is there any special meaning in adding this string to the context field?

Thanks.
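
To check this observation on a converted file, a minimal Python sketch follows. It assumes the converted JSON uses a SQuAD-style layout (`data` -> `paragraphs` -> `context`), which is an assumption and not confirmed by the repository; the helper names `iter_contexts`, `count_marked`, and `strip_marker` are hypothetical. It only counts and strips the trailing marker and says nothing about why the marker was added.

```python
from typing import Iterator, Tuple

MARKER = "CANNOTANSWER"  # the string reported in this issue


def iter_contexts(dataset: dict) -> Iterator[str]:
    """Yield every context string, assuming a SQuAD-style layout (an assumption)."""
    for article in dataset.get("data", []):
        for paragraph in article.get("paragraphs", []):
            yield paragraph["context"]


def count_marked(dataset: dict, marker: str = MARKER) -> Tuple[int, int]:
    """Return (number of contexts ending with the marker, total contexts)."""
    contexts = list(iter_contexts(dataset))
    marked = sum(1 for c in contexts if c.rstrip().endswith(marker))
    return marked, len(contexts)


def strip_marker(context: str, marker: str = MARKER) -> str:
    """Remove one trailing marker and the whitespace before it, if present."""
    stripped = context.rstrip()
    if stripped.endswith(marker):
        return stripped[: -len(marker)].rstrip()
    return context
```

For a real file, the dictionary would come from `json.load` on something like `co-squac/datasets/converted/quac_dev.json` (the path used in the predict command elsewhere on this page).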

Readme for output.tar files

Hi, I am a bit confused about what each file in output.tar means.
For example, what does quacftcoqa_coqa.coqa mean? More specifically, I am looking for the QuAC model's output on CoQA. Thanks in advance.

Are there plans to publish the data analyses?

Hey there,
I was wondering if there are plans to publish the data analyses, specifically the unanswerable question types.

Where is the models/squad2.tar used in the example command line?

Could you please let us know? Is it a pretrained AllenNLP model, and if so, where can it be downloaded? I just want to give it a try.

Thanks!

The predict command in the README fails.

Running the predict command fails with the following error:

$ python -m allennlp.run predict models/squad2.tar co-squac/datasets/converted/quac_dev.json --use-dataset-reader --output-file output/squad2.quac --batch-size 10 --cuda-device 0 --silent

INFO - root - Loading a model trained before embedding extension was implemented; pass an explicit vocab namespace if you want to extend the vocabulary.
ERROR - allennlp.data.vocabulary - Namespace: yesno_labels
ERROR - allennlp.data.vocabulary - Token: y
  File "/home/ec2-user/.local/lib/python3.6/site-packages/allennlp/data/fields/label_field.py", line 82, in index
    self._label_id = vocab.get_token_index(self.label, self._label_namespace)  # type: ignore
  File "/home/ec2-user/.local/lib/python3.6/site-packages/allennlp/data/vocabulary.py", line 637, in get_token_index
    return self._token_to_index[namespace][self._oov_token]
KeyError: '@@UNKNOWN@@'

The error is probably because the yesno label is always x in SQuAD, but it can be y in QuAC. The yesno_labels vocabulary stored in the model archive contains only x:

$ tar -axf squad2.tar vocabulary/yesno_labels.txt -O 
x

This error may only be appearing now, after an update of allennlp. Could you say which allennlp version you used for your experiments?
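
The same vocabulary check as the tar command above can be sketched in Python using only the standard library. The helper names `read_yesno_labels` and `missing_labels` are hypothetical, and the list of labels to check (`x` and `y`) is taken only from this report, not from the full QuAC label set.

```python
import tarfile
from typing import Iterable, List


def read_yesno_labels(archive_path: str,
                      member: str = "vocabulary/yesno_labels.txt") -> List[str]:
    """Read the non-empty lines of the yesno_labels vocabulary file in a model archive."""
    # "r:*" lets tarfile detect compression transparently.
    with tarfile.open(archive_path, "r:*") as archive:
        extracted = archive.extractfile(member)  # raises KeyError if the member is absent
        if extracted is None:
            raise FileNotFoundError(member)
        text = extracted.read().decode("utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def missing_labels(known: Iterable[str], needed: Iterable[str]) -> List[str]:
    """Return the labels in `needed` that do not appear in `known`, sorted."""
    return sorted(set(needed) - set(known))
```

Calling `missing_labels(read_yesno_labels("models/squad2.tar"), ["x", "y"])` would list any of the two labels that are absent from the archive's vocabulary; a non-empty result would be consistent with the KeyError above.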
