Complexity/informativeness trade-off in the domain of indefinite pronouns

This folder contains (i) the code for and (ii) the appendix to the paper 'Complexity/informativeness trade-off in the domain of indefinite pronouns' by Milica Denić, Shane Steinert-Threlkeld and Jakub Szymanik, to appear in Proceedings of Semantics and Linguistic Theory 2020.

The appendix file is Negative_indefinites_40_lang_data.pdf.

In the remainder, we describe the code.

Requirements

Python 2.7.14
- Required Python packages are in requirements.txt.
R 3.5.2
- Required R packages are: datasets 3.5.2, dplyr 0.7.8, ggplot2 3.1.0, graphics 3.5.2, grDevices 3.5.2, methods 3.5.2, minpack.lm 1.2.1, plyr 1.8.4, stats 3.5.2, tidyr 0.8.2, utils 3.5.2, rlist 0.4.6, stringr 1.3.1

Scripts and language files

Python and R scripts are in the folder src. CSV files that the scripts read or generate are in the folder data.

Beekhuizen_priors.R
- Description: It extracts the prior probability distribution over flavors from the annotated corpus from Beekhuizen et al.'s (2017) study downloaded from here, and stores it in Beekhuizen_priors.csv file.
- Dependencies: beekhuizen_full_set.csv
Min-desc-length-algo.R
- Description: It generates the minimum-length feature-based descriptions of all logically possible indefinite pronouns (in terms of which combination of flavors they can express) and stores them in minimum-desc-indef.csv file.
Indefinites_functions.R
- Description: Definitions of a series of useful functions for Experiments 1 and 2.
- Dependencies: Beekhuizen_priors.csv
Exp1_languages.R
- Description: It imports the data file with Haspelmath's 40 natural languages, generates 10 000 artificial languages used in Experiment 1, computes the communicative cost and complexity of these languages and stores them in all_complexity_cost_exp1.csv. Finally, it performs synonymy matching and stores the matched languages in syn_matched_exp1.csv.
- Dependencies: Indefinites_functions.R, languages_real_40_updated.csv, minimum-desc-indef.csv
Exp2_languages.R
- Description: It generates 10 000 artificial languages used in Experiment 2 (5000 Haspel-ok and 5000 Not Haspel-ok languages), computes the communicative cost and complexity of these languages and stores them in all_complexity_cost_exp2.csv. Finally, it performs synonymy matching and stores the matched languages in syn_matched_exp2.csv.
- Dependencies: Indefinites_functions.R, minimum-desc-indef.csv
Indef_gen_alg.py
- Description: It runs an evolutionary algorithm selecting for Pareto optimal languages for 100 generations. It stores the complexity and communicative cost measures of the final generation in finalgencostcom.csv. Finally, it selects dominant languages in terms of complexity and communicative cost from the final generation and the languages used in Experiment 1 and stores them in pareto_dominant.csv.
- Dependencies: all_complexity_cost_exp1.csv, minimum-desc-indef.csv, allitems.csv (a file with all logically possible items), Beekhuizen_priors.csv
Exp1_and_2_pareto.R
- Description: It imports the data file with dominant languages in terms of complexity and communicative cost and, based on them, estimates the Pareto frontier for indefinite pronouns. It plots the languages of Experiment 1 and Experiment 2 with respect to the frontier. It computes the minimum Euclidean distances of the languages of Experiments 1 and 2 from the Pareto frontier, and stores them in natural_distances_pareto.csv, artificial_distances_pareto.csv, Haspok_distances_pareto.csv and Haspnotok_distances_pareto.csv. Finally, it establishes that (i) natural languages are closer to the frontier than artificial languages; and (ii) languages which satisfy Haspelmath's universals are closer to the frontier than languages which do not satisfy them.
- Dependencies: Indefinites_functions.R, syn_matched_exp1.csv, syn_matched_exp2.csv, pareto_dominant.csv
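
The two notions that Indef_gen_alg.py and Exp1_and_2_pareto.R rely on, selecting dominant languages and measuring the minimum Euclidean distance to the frontier, can be illustrated with a minimal sketch. It is not the code in src: the function names and the toy (complexity, communicative cost) pairs are hypothetical, lower values are assumed to be better on both measures, and the distance is taken to the discrete set of dominant points rather than to the frontier estimated by the R script.

```python
import math


def pareto_dominant(points):
    """Return the points that no other point dominates.

    Each point is a (complexity, cost) pair; lower is assumed better on both.
    A point is dominated if another point is at least as good on both
    measures and strictly better on at least one.
    """
    dominant = []
    for i, (c1, k1) in enumerate(points):
        dominated = any(
            c2 <= c1 and k2 <= k1 and (c2 < c1 or k2 < k1)
            for j, (c2, k2) in enumerate(points)
            if j != i
        )
        if not dominated:
            dominant.append((c1, k1))
    # Drop duplicates and sort by complexity for readability.
    return sorted(set(dominant))


def min_distance_to_frontier(point, frontier):
    """Minimum Euclidean distance from a point to a set of frontier points."""
    return min(
        math.hypot(point[0] - f[0], point[1] - f[1]) for f in frontier
    )


# Hypothetical toy data: (complexity, communicative cost) per language.
languages = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.0, 3.0)]

frontier = pareto_dominant(languages)
distance = min_distance_to_frontier((3.0, 4.0), frontier)
```

Here (3.0, 4.0) is dominated by (2.0, 3.0), so it is excluded from the dominant set, and its distance to the nearest dominant point is the square root of 2.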
