
text-fabric-data's Introduction

text-fabric-data

Project Status: Unsupported. The project has reached a stable, usable state but the author(s) have ceased all work on it. A new maintainer may be desired.


Text-Fabric data is a collection of richly annotated data sources. They are stored in a compact plain-text format, .tf.

Fully developed TF data sources can be found here: corpora

This repository holds a few data sources developed as a pilot project.

See the docs.

To work with them, you need Text-Fabric, a Python package that is easy to install.

Author

Dirk Roorda

The data is based on sources which are the work of various other people.


text-fabric-data's Issues

Probable Error Report in "oslots.tf"

Hello.

I have recently been developing a Bible web application for Korean readers, which is already in service (http://app.alphalef.com). The source code is in my GitHub repository. I greatly appreciate your making this valuable database available for public use.

The following code renders each chapter of the Hebrew text.

from flask import render_template

# `app` (the Flask application), `T`, `L`, `F` (Text-Fabric API handles)
# and `kb` (a helper module with eng_to_kor) are defined elsewhere in the project.
@app.route('/text/')
@app.route('/text/<book>')
@app.route('/text/<book>/<int:chapter>')
def text_page(book='Genesis', chapter=1):
    # Look up the chapter node and the verse nodes it contains.
    chpNode = T.nodeFromSection((book, chapter))
    verseNode = L.d(chpNode, otype='verse')
    # Section of the book's last slot; its second element is the last chapter number.
    whole_chpNode = T.nodeFromSection((book,))
    last_chp = T.sectionFromNode(whole_chpNode, lastSlot=True)
    verse = ""
    for v in verseNode:
        verse += '<span class=verseNode><a class=verse_num id=verse_num verse_node='+str(v)+'>'
        verse += str(T.sectionFromNode(v)[2])
        verse += ' </a>'
        # Words are emitted clause by clause, then phrase by phrase.
        clauseNode = L.d(v, otype='clause')
        for c in clauseNode:
            verse += '<span class=clauseNode id=clauseNode clause_node='+str(c)+'>'
            verse += "<span class='syntax clause1 hidden' id=syntax>C:"+ kb.eng_to_kor(F.typ.v(c), 'abbr') +"</span>"
            phraseNode = L.d(c, otype='phrase')
            for p in phraseNode:
                verse += '<span class=phraseNode id=phraseNode phrase_node='+str(p)+'>'
                verse += "<span class='syntax phrase1 hidden' id=syntax>P:"+ kb.eng_to_kor(F.typ.v(p), 'abbr') + "," + kb.eng_to_kor(F.function.v(p), 'abbr') + "</span>"
                wordsNode = L.d(p, otype='word')
                for w in wordsNode:
                    verse += '<span class=wordNode><a tabindex=0 class=word_elm data-poload=/api/word/'+str(w)+' data-toggle=popover data-trigger=focus>'
                    verse += F.g_word_utf8.v(w)
                    verse += '</a></span>'
                    # Append the trailer (space, maqaf, sof pasuq, ...) when the word has one.
                    if F.trailer_utf8.v(w):
                        verse += '<span class=trailerNode>'
                        verse += F.trailer_utf8.v(w)
                        verse += '</span>'
                verse += '</span>'
            verse += '</span>'
        verse += '</span>'
    return render_template('text.html', verse=verse, book=book, chapter=chapter, last_chp=last_chp[1])

This code works very well. However, I found a probable error in Esther 4:16. The verse caught my attention because it is crucial to my doctoral dissertation, which is in progress. The last part of Esth. 4:16 is displayed as follows:

> לֹֽא־כַדָּ֔ת וְאָבָֽדְתִּי׃ כַאֲשֶׁ֥ר אָבַ֖דְתִּי

So I opened "oslots.tf" and found the following lines.

Line 77919: 367523,367527
Line 77920: 367524-367526

Each slot number seems to correspond to a word as follows:

367523 ו
367527 אבדתי
367524 כ
367525 אשר
367526 אבדתי
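
The two line formats quoted above (a comma-separated pair and a hyphenated range) can be expanded into explicit slot lists. The following is a minimal Python sketch, assuming that a comma separates individual slots or ranges and that a hyphen marks an inclusive range. The helper names expand_slot_spec and is_contiguous are hypothetical and are not part of Text-Fabric.

```python
def expand_slot_spec(spec):
    """Expand a spec such as '367523,367527' or '367524-367526' into a list of slots."""
    slots = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Assumption: 'a-b' denotes the inclusive range a..b.
            start, end = part.split("-")
            slots.extend(range(int(start), int(end) + 1))
        else:
            slots.append(int(part))
    return slots


def is_contiguous(slots):
    """Return True when every slot directly follows the previous one."""
    return all(b - a == 1 for a, b in zip(slots, slots[1:]))


# The two lines quoted from oslots.tf in this report.
quoted_lines = {"Line 77919": "367523,367527", "Line 77920": "367524-367526"}

for label, spec in quoted_lines.items():
    slots = expand_slot_spec(spec)
    status = "contiguous" if is_contiguous(slots) else "has a gap"
    print(label, slots, status)
```

Under these assumptions, line 77919 expands to two slots with a gap between them, and line 77920 fills exactly that gap.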

Thus the word order appears to be incorrect. Although I do not fully understand the data structure of Text-Fabric, each line seems to refer to one clause. So I made the following correction:

Line 77919: 367523-367526 (WxQ0) וכאשר אבדתי
Line 77920: 367527 (xQt0) אבדתי
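
To relate these lines to the displayed word order, the sketch below simulates a renderer that emits words clause by clause, as the nested loops in the code above do. The slot lists are taken from the lines quoted in this report. The function name clause_by_clause_order is hypothetical, and the sketch only models the traversal order, not Text-Fabric itself.

```python
def clause_by_clause_order(clauses):
    """Concatenate slots clause by clause, as a nested rendering loop would."""
    return [slot for clause in clauses for slot in clause]


# Slot lists for the two clauses, before and after the proposed correction.
original = [[367523, 367527], [367524, 367525, 367526]]
corrected = [[367523, 367524, 367525, 367526], [367527]]

for name, clauses in (("original", original), ("corrected", corrected)):
    order = clause_by_clause_order(clauses)
    # Text order means ascending slot numbers.
    status = "in text order" if order == sorted(order) else "out of text order"
    print(name, order, status)
```

With the original lines, a clause-by-clause traversal visits slot 367527 before slots 367524 to 367526, which matches the order seen in the application. With the corrected lines, the traversal follows ascending slot order.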

After the correction, it works very well in my application.
Please check this issue.

Thanks.

SBL Abbreviations

Seeing the latest pull request, it occurred to me to ask: I have been using SBL abbreviations on my TF instance (as a predictable standard). Is it worth adding them to the general data?
