
rhymedata's Introduction

This is the Chicago Rhyming Poetry Corpus, consisting of annotated poetry in English and French.

Much of the English data was collected and annotated by Morgan Sonderegger ([email protected]). The English data was expanded and edited, and the French data added, by Sravana Reddy ([email protected]).

This directory contains raw data files with annotations (english_raw and french_raw), as well as files containing only stanza end-words (english_gold and french_gold). The gold files are what our code uses for training and evaluation.

All poetry in the corpus is in the public domain and freely available electronically. 

*Description of raw txt files:*

Each file contains a collection of poems by a single poet. For example, here is an extract from wyatt.txt:

AUTHOR Thomas Wyatt  <= Name of poet.

TITLE Satire II  <= Title of poem that follows.

RHYME-POEM a b a
RHYME a b a 

MY mother's maids, when they did sew and spin,
They sang sometime a song of the field mouse
That, for because her livelihood was but thin,

RHYME-POEM b c b <= If relevant, scheme in the context of the whole poem (here, this stanza shares rhymes with the previous stanza). If there is no rhyme sharing, this field is omitted.
RHYME a b a <= Rhyme scheme of the stanza independent of others in the poem. 

Would needs go seek her townish sister's house.
She thought herself endured too much pain;
The stormy blasts her cave so sore did souse
...

TITLE A Love Song <= Title of next poem. Also denotes the ending of the above poem.
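The layout above can be read with a small line-based parser. The following is a minimal sketch, not the corpus authors' code. It assumes that the `<=` notes in the extract are explanatory and do not appear in the actual files, that every header line starts with one of the keywords AUTHOR, TITLE, RHYME-POEM or RHYME, and that a TITLE line precedes the first stanza. The function name `parse_raw` and the dictionary keys are hypothetical.

```python
def parse_raw(text):
    """Parse raw corpus text into a list of poems (assumed layout, see above)."""
    poems = []
    author = None
    poem = None
    stanza = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines only separate headers from verse
        keyword, _, rest = line.partition(" ")
        if keyword == "AUTHOR":
            author = rest.strip()
        elif keyword == "TITLE":
            # A new title also ends the previous poem.
            poem = {"author": author, "title": rest.strip(), "stanzas": []}
            poems.append(poem)
            stanza = None
        elif keyword == "RHYME-POEM" and poem is not None:
            # Optional field: scheme in the context of the whole poem.
            stanza = {"rhyme_poem": rest.split(), "rhyme": None, "lines": []}
            poem["stanzas"].append(stanza)
        elif keyword == "RHYME" and poem is not None:
            # Start a new stanza unless RHYME-POEM has just opened one.
            if stanza is None or stanza["rhyme"] is not None:
                stanza = {"rhyme_poem": None, "rhyme": None, "lines": []}
                poem["stanzas"].append(stanza)
            stanza["rhyme"] = rest.split()
        elif stanza is not None:
            stanza["lines"].append(line)  # a line of verse
    return poems
```

Each poem is returned as a dictionary with its author, title, and a list of stanzas; a stanza whose RHYME-POEM field was omitted has `rhyme_poem` set to `None`.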

Sometimes we use a shorthand for the rhyme scheme, such as:

RHYME a a *

This denotes the rhyme scheme aabbccdd...
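The shorthand can be expanded mechanically once the number of lines in the stanza is known. The sketch below is an illustration under stated assumptions: it treats everything before the `*` as a unit that repeats with fresh rhyme labels each time, which matches the `a a *` example but is a guess for longer units such as `a b a b *`. The function name `expand_scheme` is hypothetical, and labels are returned as integers (numbered by first appearance) so that long stanzas do not run out of letters.

```python
def expand_scheme(scheme, n_lines):
    """Expand a scheme such as 'a a *' into n_lines integer rhyme labels."""
    tokens = scheme.split()
    repeating = bool(tokens) and tokens[-1] == "*"
    unit = tokens[:-1] if repeating else tokens
    if not unit:
        raise ValueError("empty rhyme scheme")
    # Distinct labels of the unit, in order of first appearance.
    distinct = list(dict.fromkeys(unit))
    labels = []
    offset = 0
    while len(labels) < n_lines:
        labels.extend(offset + distinct.index(t) + 1 for t in unit)
        if not repeating:
            break
        offset += len(distinct)  # the next repetition gets fresh labels
    return labels[:n_lines]


print(expand_scheme("a a *", 8))  # [1, 1, 2, 2, 3, 3, 4, 4]
```

A scheme without a trailing `*` is simply converted to integer labels, so `expand_scheme("a b a", 3)` gives `[1, 2, 1]`.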

*Description of gold files:*

These are derived from the raw data by extracting only the end-word of each line in a stanza. The file corresponding to Wyatt looks like this:

POEM0 spin mouse thin  <= Indicates the id of the poem that the stanza belongs to, and a list of the end-words.
1 2 1  <= Rhyme scheme of stanza
1 2 1

POEM0 house pain souse
1 2 1   
2 3 2    <= Rhyme scheme in the context of the poem (indicating that the 1st word rhymes with the 2nd word of the previous stanza)
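A gold file in this layout can be loaded with a few lines of code. This is a minimal sketch rather than the loader used by the corpus authors. It assumes that the `<=` notes are explanatory and absent from the real files, that every stanza starts with a `POEM<id>` token followed by its end-words, and that the integer lines after it are the rhyme schemes in the order shown (stanza scheme first, then scheme in the context of the poem). The function name `parse_gold` and the dictionary keys are hypothetical.

```python
def parse_gold(text):
    """Parse gold-file text into a list of stanza records (assumed layout)."""
    stanzas = []
    current = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # blank line between stanzas
        if tokens[0].startswith("POEM"):
            current = {
                "poem_id": int(tokens[0][len("POEM"):]),
                "words": tokens[1:],
                "schemes": [],  # integer scheme lines, in file order
            }
            stanzas.append(current)
        elif current is not None:
            current["schemes"].append([int(t) for t in tokens])
    return stanzas


gold = """POEM0 spin mouse thin
1 2 1
1 2 1

POEM0 house pain souse
1 2 1
2 3 2
"""
for stanza in parse_gold(gold):
    print(stanza["poem_id"], stanza["words"], stanza["schemes"])
```

Keeping the scheme lines in a list, rather than in two named fields, avoids assuming that both lines are always present.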

This corpus is under development; we hope to expand it to cover more texts and languages. Please e-mail [email protected] if you would like to contribute, or if you find an error in the annotations.
