Dependency Annotation and Parsing for Spontaneous Speech
This directory contains the data, code, and models for Dependency Parsing Evaluation for Low-resource Spontaneous Speech.
- Convert an original CHILDES XML file to a .conllu file
python3 code/ori_xml2conll.py --input Path_To_Corpus --output Path_To_Output --section Section (e.g. English-NA or English-UK)
eve.py: tailored specifically for the Eve corpus from the Brown corpus
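For readers unfamiliar with the target format, the sketch below shows what one utterance looks like once written as CoNLL-U: comment lines followed by one tab-separated, ten-column row per token (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The function `to_conllu`, the token dictionaries, and the example utterance are hypothetical and only illustrate the format; they are not taken from code/ori_xml2conll.py or eve.py.

```python
# Minimal sketch of CoNLL-U serialization. The helper name and the input
# structure are assumptions for illustration, not this repository's code.

def to_conllu(sent_id, tokens):
    """Format one utterance as a CoNLL-U block.

    tokens: list of dicts with keys "form", "head", "deprel" and
    optionally "lemma" and "upos". Heads are 1-based; 0 marks the root.
    """
    text = " ".join(tok["form"] for tok in tokens)
    lines = [f"# sent_id = {sent_id}", f"# text = {text}"]
    for index, tok in enumerate(tokens, start=1):
        row = [
            str(index),             # ID
            tok["form"],            # FORM
            tok.get("lemma", "_"),  # LEMMA
            tok.get("upos", "_"),   # UPOS
            "_",                    # XPOS (left unspecified here)
            "_",                    # FEATS
            str(tok["head"]),       # HEAD
            tok["deprel"],          # DEPREL
            "_",                    # DEPS
            "_",                    # MISC
        ]
        lines.append("\t".join(row))
    # A blank line terminates each sentence block in CoNLL-U.
    return "\n".join(lines) + "\n\n"


# Made-up utterance, used only to show the output shape.
example = [
    {"form": "more", "lemma": "more", "upos": "ADJ", "head": 2, "deprel": "amod"},
    {"form": "cookie", "lemma": "cookie", "upos": "NOUN", "head": 0, "deprel": "root"},
]
print(to_conllu("demo-1", example), end="")
```

Unknown fields are written as an underscore, which is how CoNLL-U marks an unspecified value.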
- Semi-automatic conversion from CHILDES annotation to UD annotation
python3 code/converter.py --input Input_Path --output Output_Path
- in data/Eve/eve_annotated
- Manual annotation
- in data/Eve/eve_annotated
- in
- Significance testing of parsing results
python3 code/bootrap.py --gold Gold_Annotation_File --pred Predicted_File --n Number_Of_Iterations (e.g. 10000) --c Sample_Size (e.g. number of utterances in the file)
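As a rough illustration of how bootstrap resampling over utterances can be applied to a gold file and a predicted file, the sketch below estimates a confidence interval for an attachment score. It is not the implementation in code/bootrap.py, whose exact procedure is not described here; the function names, the input layout (one list of `(head, deprel)` pairs per utterance), and the choice of a percentile interval are all assumptions. The `n_iter` and `sample_size` parameters are meant to loosely mirror the `--n` and `--c` options.

```python
import random

# Hypothetical sketch of utterance-level bootstrap resampling; the names and
# data layout are assumptions, not the interface of code/bootrap.py.

def attachment_counts(gold_sents, pred_sents, labeled=True):
    """Return one (correct, total) pair per utterance.

    Each utterance is a list of (head, deprel) pairs, aligned token by token
    between gold and prediction.
    """
    counts = []
    for gold, pred in zip(gold_sents, pred_sents):
        correct = sum(
            1
            for (g_head, g_rel), (p_head, p_rel) in zip(gold, pred)
            if g_head == p_head and (not labeled or g_rel == p_rel)
        )
        counts.append((correct, len(gold)))
    return counts


def bootstrap_ci(counts, n_iter=10000, sample_size=None, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the attachment score."""
    rng = random.Random(seed)
    size = sample_size or len(counts)
    scores = []
    for _ in range(n_iter):
        # Resample utterances with replacement.
        sample = [counts[rng.randrange(len(counts))] for _ in range(size)]
        total = sum(t for _, t in sample)
        if total > 0:
            scores.append(sum(c for c, _ in sample) / total)
    scores.sort()
    lower = scores[int((alpha / 2) * len(scores))]
    upper = scores[min(len(scores) - 1, int((1 - alpha / 2) * len(scores)))]
    return lower, upper


# Made-up two-utterance example, for shape only.
gold = [[(2, "amod"), (0, "root")], [(0, "root")]]
pred = [[(2, "det"), (0, "root")], [(0, "root")]]
las_counts = attachment_counts(gold, pred, labeled=True)
print(bootstrap_ci(las_counts, n_iter=1000))
```

Resampling whole utterances rather than individual tokens keeps the tokens of one utterance together, which matters because attachment errors within an utterance are not independent.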
- Descriptive statistics of child information from CHILDES
python3 code/descriptive_statistics.py --input Input_Path --output .csv_file
- English: in results/en_descriptive.csv
- Chinese: in results/zh_descriptive.csv
- Models