This study is a computational corpus analysis of the Rigveda. It checks whether the poetic meter preserves any traces of a pre-form **-ni- for the Sanskrit ninth class weak suffix -nī-.
It also validates metrical restorations in the Rigveda in relation to the short root vocalism of the ninth class forms.
The study consists of the following stages. Each stage is documented in its own Python notebook, which can also be used to reproduce that stage's results.
notebook: 1_raw_corpus.ipynb | helper: src/transform_json_corpus.py
results: data/rv_samhitapatha_vnh.txt, data/rv_padapatha_lubotsky.txt
Retrieve the raw text of the two versions of the Rigveda used in this study. These texts serve to quickly validate the results of subsequent stages.
notebook: 2_roots.ipynb | helper: src/lib/roots.py
results: data/roots.csv
Parse and compile a list of ninth and fifth class roots/stems based on the comprehensive listing given by Whitney (1887: 213–214).
notebook: 3_roots_with_attestations.ipynb | helper: src/lib/roots_attestations.py
results: data/roots_with_attestations.csv, data/roots_with_attested_words.json
Using VedaWeb’s grammar search API, search the Rigveda for the finite verb forms associated with each of the ninth and fifth class stems, recording the RV locations (book.hymn.stanza) where they are attested.
notebook: 4_verse_lines.ipynb | helper: src/lib/verse_lines.py
results: data/rv_lines.csv
Compile the exact pādas containing the verbal attestations, saving their text along with other metadata such as stanza meter and stratum, obtained via the VedaWeb API.
notebook: 5_verse_lines_with_meter.ipynb | helper: src/lib/meter.py, src/test_meter_analysis.py
results: data/rv_lines_with_meter.csv
For each pāda, programmatically generate its metrical scansion (i.e. whether each syllable is long or short), noting any meter failures; also record the expected scansion of the stem vowels based on their position in the meter. This stage produces the final dataset for the main analyses.
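To make the scansion step concrete, here is a minimal sketch of how syllable weights could be derived from a transliterated pāda. It is not the code in src/lib/meter.py. It assumes IAST input with accents written as acute or grave marks. It treats ḷ as a consonant, which misses the rare vocalic ḷ. It uses the usual rule that a syllable is long (L) if its vowel is long, if two or more consonants follow it, or if an anusvāra or visarga follows it, and short (S) otherwise. The names `strip_accents`, `tokenize`, and `scan` are hypothetical.

```python
import re
import unicodedata

LONG_VOWELS = {"ā", "ī", "ū", "ṝ", "e", "o", "ai", "au"}
SHORT_VOWELS = {"a", "i", "u", "ṛ"}
# Diphthongs and aspirated stops are single units; any other letter is one unit.
TOKEN = re.compile(r"ai|au|[kgcjṭḍtdpbḷ]h|[^\W\d_]")


def strip_accents(text):
    """Drop acute/grave accent marks on vowels, keeping letters such as ś intact."""
    out, base = [], ""
    for ch in unicodedata.normalize("NFD", text):
        if not unicodedata.combining(ch):
            base = ch
        elif ch in "\u0301\u0300" and base in "aeiourl":
            continue  # accent mark on a vowel (or vocalic r/l): not part of the letter
        out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))


def tokenize(pada):
    """Split a pāda into vowel and consonant units, word by word."""
    tokens = []
    for word in strip_accents(pada.lower()).split():
        # Tokenizing per word keeps a word-final 'a' and a word-initial 'i'
        # from being merged into the diphthong 'ai'.
        tokens.extend(TOKEN.findall(word))
    return tokens


def scan(pada):
    """Return one 'L' (long) or 'S' (short) per syllable of the pāda."""
    tokens = tokenize(pada)
    vowels = [i for i, t in enumerate(tokens)
              if t in LONG_VOWELS or t in SHORT_VOWELS]
    weights = []
    for n, i in enumerate(vowels):
        end = vowels[n + 1] if n + 1 < len(vowels) else len(tokens)
        following = tokens[i + 1:end]  # consonants up to the next vowel
        heavy = (
            tokens[i] in LONG_VOWELS
            or len(following) >= 2
            or any(c in ("ṃ", "ḥ") for c in following)
        )
        weights.append("L" if heavy else "S")
    return "".join(weights)


print(scan("agním īḷe puróhitaṃ"))  # RV 1.1.1a → LSLLSLSL
```

Consonants are counted across word boundaries because metrical weight depends on the continuous sound sequence of the pāda. The final syllable is reported by the same rule here, although it is metrically indifferent in practice.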
notebook: 6_analysis.ipynb
Analyze the overall and per-stratum counts of -nī- in each of the expected metrical positions (S, L, X), relative to the control suffixes -nā- and -no-, focussing on pādas composed in one of the popular meters.
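The counting described in this stage can be sketched as a simple tabulation. The example below is illustrative only: the field names `suffix`, `stratum`, and `expected` are assumptions and may not match the columns of data/rv_lines_with_meter.csv, and the rows are made-up placeholders rather than results from the dataset.

```python
from collections import Counter

# Placeholder rows for illustration only; NOT taken from the real dataset.
rows = [
    {"suffix": "nī", "stratum": "early", "expected": "S"},
    {"suffix": "nī", "stratum": "early", "expected": "S"},
    {"suffix": "nī", "stratum": "late", "expected": "L"},
    {"suffix": "nī", "stratum": "late", "expected": "X"},
    {"suffix": "nā", "stratum": "early", "expected": "L"},
    {"suffix": "no", "stratum": "late", "expected": "L"},
]


def tabulate(rows):
    """Count attestations per (suffix, stratum, expected position)."""
    return Counter((r["suffix"], r["stratum"], r["expected"]) for r in rows)


def position_shares(counts, suffix):
    """Share of each expected position (S, L, X) for one suffix, over all strata."""
    totals = Counter()
    for (sfx, _stratum, position), n in counts.items():
        if sfx == suffix:
            totals[position] += n
    total = sum(totals.values())
    if total == 0:
        return {}
    return {position: totals[position] / total for position in "SLX"}


counts = tabulate(rows)
print(counts[("nī", "early", "S")])
print(position_shares(counts, "nī"))
```

Comparing the shares for -nī- against those for the control suffixes -nā- and -no- shows whether -nī- occurs in expected-short positions more often than the controls do.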