armenian-intonation's Introduction

Speech corpus of Armenian question-answer dialogues

This is a corpus of elicited, controlled speech. The stimuli were a sequence of dialogues with intermittent fillers. This repository covers only the stimuli. The stimuli were designed to elicit intonation patterns for questions and answers in two Armenian dialects: Western Armenian (WA) and Eastern Armenian (EA). The recordings can be used for topics such as intonation and prosody, forced alignment, or ASR (Automatic Speech Recognition).

The dataset is open-access and contains 8,852 dialogues, consisting of 23,711 utterances (individual sound files), for a total of 2.7 GB and 8.5 hours. Each utterance has a sound file, a Praat TextGrid (with full linguistic annotation), and a text file with the orthographic form for easier ASR use. Pronunciation dictionaries are also provided for ASR or forced alignment. We generated a forced alignment for these recordings using cross-language alignment with Interlingual-MFA. See the Alignments folder.

If you use the data in any way, please cite us as:

Chakmakjian, Samuel and Hossep Dolatian. 2022. Speech corpus of Armenian question-answer dialogues. DOI

Stimuli design

Overview

A dialogue is made up of at least a question (Q) and an answer (A). Some dialogues include an interjection (I) and a negated verb (N). We call all these elements (Q, A, I, N) utterances.

The question and answer were SOV sentences. The dialogues were of three types, each with a different position of focus. Focus was either on the subject, object, or verb. Dialogues also varied in the choice of the object word. The object word could have either final stress, penultimate stress, or initial stress.

File utterance-metadata (in Excel and TSV versions) has metadata on the conditions for each recorded utterance.
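As a quick way to inspect the conditions, the TSV version of the metadata can be tallied by column. The sketch below assumes the file is tab-separated with a header row; the path and the column name in the usage note are hypothetical, since the actual column names are not listed here.

```python
import csv
from collections import Counter


def count_by_column(tsv_path, column):
    """Count how often each value occurs in one column of a TSV file.

    Assumes the first row of the file holds the column names.
    """
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return Counter(row[column] for row in reader)
```

For example, `count_by_column("utterance-metadata.tsv", "focus")` would tally utterances per focus type, provided the file has that name and a column called `focus` (both are assumptions).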

Dialogue types and focus type

The following is the template for the dialogues. The actual recordings vary in the TARGET word for the object. Note that our Western Armenian speakers were from Syria and usually did not aspirate.

| Type | Utterance | Line | Form |
|---|---|---|---|
| Subject focus dialogue | Question | IPA (WA) | *ov* TARGET əsɑv |
| | | IPA (EA) | *ov* TARGET ɑsɑt͡sʰ |
| | | Gloss | who TARGET said |
| | | Translation | *Who* said TARGET? |
| | | Orthography | Ո՞վ «TARGET» ըսաւ/ասաց։ |
| | Answer | IPA (WA) | *mɑɾjɑmə* TARGET əsɑv |
| | | IPA (EA) | *mɑɾjɑmə* TARGET ɑsɑt͡sʰ |
| | | Gloss | Mariam TARGET said |
| | | Translation | *Mariam* said TARGET. |
| | | Orthography | Մարիամը «TARGET» ըսաւ/ասաց։ |
| Object focus dialogue | Question | IPA (WA) | mɑɾjɑmə *int͡ʃ* əsɑv |
| | | IPA (EA) | mɑɾjɑmə *int͡ʃʰ* ɑsɑt͡sʰ |
| | | Gloss | Mariam what said |
| | | Translation | *What* did Mariam say? |
| | | Orthography | Մարիամը ի՞նչ ըսաւ/ասաց։ |
| | Answer | IPA (WA) | mɑɾjɑmə *TARGET* əsɑv |
| | | IPA (EA) | mɑɾjɑmə *TARGET* ɑsɑt͡sʰ |
| | | Gloss | Mariam TARGET said |
| | | Translation | Mariam said *TARGET*. |
| | | Orthography | Մարիամը «TARGET» ըսաւ/ասաց։ |
| Verb focus dialogue | Question | IPA (WA) | mɑɾjɑmə TARGET *ɡɑɾtɑt͡s* |
| | | IPA (EA) | mɑɾjɑmə TARGET *kɑɾtʰɑt͡sʰ* |
| | | Gloss | Mariam TARGET read |
| | | Translation | Did Mariam *read* TARGET? |
| | | Orthography | Մարիամը «TARGET» կարդա՞ց։ |
| | Interjection | IPA (WA) | vot͡ʃ |
| | | IPA (EA) | vot͡ʃʰ |
| | | Gloss | no |
| | | Translation | No |
| | | Orthography | Ոչ |
| | Answer | IPA (WA) | mɑɾjɑmə TARGET *əsɑv* |
| | | IPA (EA) | mɑɾjɑmə TARGET *ɑsɑt͡sʰ* |
| | | Gloss | Mariam TARGET said |
| | | Translation | Mariam *said* TARGET. |
| | | Orthography | Մարիամը «TARGET» ըսաւ/ասաց |
| | Negation | IPA (WA) | t͡ʃəɡɑɾtɑt͡s |
| | | IPA (EA) | t͡ʃʰəkɑɾtʰɑt͡sʰ |
| | | Gloss | not.read |
| | | Translation | She didn't read. |
| | | Orthography | չկարդաց։ |

In the typical case, each type of question and answer sentence had its own special intonational contour, summarized in the following table.

| Focus type | Question (q) | Answer (a) |
|---|---|---|
| Subject focus (tS) | Pitch-rise on subject; post-focal deaccenting; final rise (WA) or final fall (EA) | Pitch-rise on subject; post-focal deaccenting; final fall |
| Object focus (tO) | Pitch-rise on object; post-focal deaccenting; final rise (WA) or final fall (EA) | Pitch-rise on object; post-focal deaccenting; final fall |
| Verb focus (tV) | Pitch-rise on verb = final rise; optional pre-focal deaccenting | Optional pitch-rise on verb; final fall |

Stress type of target word

The TARGET word varies in its stress location. It has one of the following conditions.

| Stress type (code) | Subcategory | Example WA | Example EA | Orthography | Translation |
|---|---|---|---|---|---|
| Final (s3) | | dɑniki | tɑnikʰi | տանիքի | of the roof |
| Final (s3a) | adverb | | sutoɾen | սուտորեն | falsely |
| Penult (s2) | ends in /-ə/ | kid͡zeɾə | ɡit͡seɾə | գիծերը | the lines |
| Penult (s2s) | ends in /-əs/ | diʒəs | tiʒəs | պատիժս | my punishment |
| Penult (s2t) | ends in /-ət/ | didət | titət | մատիտդ | your pencil |
| Initial (s1o) | ordinal | uteɾoɾt | utʰeɾoɾtʰ | ութերորդ | eighth |
| Initial (s1a) | adverb | sudoɾen | | սուտօրէն | falsely |
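The condition codes from the tables (tS/tO/tV, q/a, and the stress codes) can be collected into lookup tables when filtering or labelling utterances. The following is a minimal sketch: the function name is hypothetical, and it takes the three codes as separate arguments because this document does not state how the codes are combined in file names or in the metadata. Codes for interjections and negations are not listed above, so they are left out.

```python
# Codes copied from the focus-type and stress-type tables.
FOCUS_TYPES = {"tS": "subject focus", "tO": "object focus", "tV": "verb focus"}
UTTERANCE_TYPES = {"q": "question", "a": "answer"}
STRESS_TYPES = {
    "s3": ("final", None),
    "s3a": ("final", "adverb"),
    "s2": ("penult", "ends in /-ə/"),
    "s2s": ("penult", "ends in /-əs/"),
    "s2t": ("penult", "ends in /-ət/"),
    "s1o": ("initial", "ordinal"),
    "s1a": ("initial", "adverb"),
}


def describe_condition(focus_code, utterance_code, stress_code):
    """Return a readable description of one combination of condition codes."""
    position, subcategory = STRESS_TYPES[stress_code]
    text = (
        f"{FOCUS_TYPES[focus_code]}, {UTTERANCE_TYPES[utterance_code]}, "
        f"{position} stress"
    )
    # Only some stress types have a subcategory in the table.
    if subcategory:
        text += f" ({subcategory})"
    return text
```

An unknown code raises `KeyError`, which makes typos in a condition label easy to spot.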

Materials

Recordings were made with 19 speakers: 10 for Eastern Armenian (5 female, 5 male) and 9 for Western Armenian (5 female, 4 male). In terms of origin, the Eastern Armenian speakers were from Yerevan, Armenia, while the Western Armenian speakers were from Aleppo, Syria. All 19 speakers were living in Yerevan during the time of the recording. Speaker metadata is in file speaker-metadata (in Excel and TSV versions).

The participants were recorded reading the dialogues on a PowerPoint presentation. In our annotation, we broke up each dialogue into its component utterances (Q, A, I, N) using a Praat script. Each utterance is found in the repository in the form of a sound file .wav, a Praat TextGrid .TextGrid, and a transcript file .txt. Data is in the data folder.
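Since every utterance should come as a .wav, .TextGrid, and .txt file, a short check can confirm that a local copy is complete. This is a sketch that assumes the three files of an utterance share the same base name and sit in the same folder; the function name is hypothetical.

```python
from pathlib import Path

EXPECTED_SUFFIXES = {".wav", ".TextGrid", ".txt"}


def find_incomplete_utterances(folder):
    """Map each utterance with a missing file to the list of missing suffixes."""
    root = Path(folder)
    found = {}
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in EXPECTED_SUFFIXES:
            # Key on the path without its suffix, relative to the root folder.
            key = path.relative_to(root).with_suffix("").as_posix()
            found.setdefault(key, set()).add(path.suffix)
    return {
        key: sorted(EXPECTED_SUFFIXES - suffixes)
        for key, suffixes in found.items()
        if suffixes != EXPECTED_SUFFIXES
    }
```

Calling `find_incomplete_utterances("data")` on a complete download should return an empty dictionary.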

We annotated the recordings with information on quality. Most recordings had little to no disfluency or background noise. These are found in the data-few-issues folder.

Some recordings, however, did have such problems. Files with a mild or moderate issue are marked with the symbol _? and placed in data-moderate-issues; files with a severe issue are marked with _0 and placed in data-severe-issues. We list the issue types below:

  • Mild or moderate issues:
    • focus-unclear: The intonation is ambiguous.
    • laughing: The participant is laughing.
    • noise-mild: There is mild background noise.
    • pause-mild: There is a small felicitous pause in the middle of the sentence.
    • pause-noise-mild: There is both mild background noise and a small pause.
    • unclear-segments: A segment was pronounced unclearly.
  • Severe issues:
    • focus-wrong-intonation: The participant used the wrong intonation.
    • noise-extreme: There is extreme background noise.
    • pause-extreme: There is a long infelicitous pause in the middle of the sentence.
    • pause-noise-extreme: There is both extreme noise and a long pause.
    • not-template: The utterance was misread in a way that doesn't fit into our templates, such as omitting the subject.
    • stutter-or-missing-sound: The participant stuttered in speech or omitted a sound.
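The issue labels above map onto the three data folders, which can be written as a small lookup. This is a sketch: the function name is hypothetical, and it assumes an utterance without any issue label belongs in data-few-issues.

```python
MODERATE_ISSUES = {
    "focus-unclear",
    "laughing",
    "noise-mild",
    "pause-mild",
    "pause-noise-mild",
    "unclear-segments",
}
SEVERE_ISSUES = {
    "focus-wrong-intonation",
    "noise-extreme",
    "pause-extreme",
    "pause-noise-extreme",
    "not-template",
    "stutter-or-missing-sound",
}


def folder_for_issue(issue=None):
    """Return the data folder that holds recordings with the given issue label."""
    if issue is None:
        # Assumption: unlabelled recordings are the ones with few or no issues.
        return "data-few-issues"
    if issue in MODERATE_ISSUES:
        return "data-moderate-issues"
    if issue in SEVERE_ISSUES:
        return "data-severe-issues"
    raise ValueError(f"unknown issue label: {issue}")
```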

We provide forced alignments for the data-few-issues recordings, generated with Interlingual-MFA. See the Alignments folder.

Recommendations

The recordings can be used for different purposes. We plan on using them for work on intonation phonetics and forced alignment. For phonetic studies, recordings with no or moderate issues can be suitable, but recordings with severe issues are not recommended. For forced alignment, however, the recordings with severe issues might still be useful as a way to prevent overfitting or to accommodate noisy data.
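These recommendations can be encoded as a simple folder selection. The sketch below uses hypothetical task names ("phonetics" and "alignment") and only restates the guidance above; it is not a rule enforced anywhere in the repository.

```python
def folders_for_task(task):
    """Return the data folders suggested for a given kind of study."""
    folders = ["data-few-issues", "data-moderate-issues"]
    if task == "phonetics":
        # Severe-issue recordings are not recommended for phonetic studies.
        return folders
    if task == "alignment":
        # Noisy recordings might still help a forced aligner generalize.
        return folders + ["data-severe-issues"]
    raise ValueError(f"unknown task: {task}")
```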

The .txt transcript files are meant to make forced alignment tasks easier. The pronunciation dictionaries for Western Armenian and Eastern Armenian are likewise provided for forced alignment.

License

The dataset is made available to the research community licensed under the GNU General Public License v3.0.

Contact

Feel free to contact us at [email protected] if you have any questions or concerns.
