
fabricius-language-schema's Introduction

Fabricius Language Schema

A Fabricius document is a text file with a .fab extension that contains a single JSON object. It can be read, edited and transformed with Fabricius tools (or any scripting language).
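Because a .fab file is plain JSON, any language with a JSON parser can open it. The following is a minimal sketch in Python using only the standard library. The helper names `load_fab` and `save_fab` and the `title` key are hypothetical and are not part of any Fabricius tool or of the schema itself.

```python
import json
import tempfile
from pathlib import Path


def load_fab(path):
    """Read a .fab file and return its top-level JSON object as a dict."""
    with open(path, encoding="utf-8") as handle:
        document = json.load(handle)
    # The format is described as a JSON object, so reject arrays or scalars.
    if not isinstance(document, dict):
        raise ValueError("expected a JSON object at the top level")
    return document


def save_fab(document, path):
    """Write a dict back to disk as a .fab file, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(document, handle, ensure_ascii=False, indent=2)


# Round trip through a temporary file; the "title" key is illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    fab_path = Path(tmp) / "example.fab"
    save_fab({"title": "example"}, fab_path)
    loaded = load_fab(fab_path)
```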

What problem does it solve?

There is no existing format for encoding hieroglyphic text which links its visual, symbolic and semantic content. This creates huge gaps between different sources of hieroglyphic content and analysis, making them difficult to work with for computing purposes.

The Fabricius document format enables each layer of content to be represented, while maintaining the links between them, providing a multi-purpose data object which supports use cases such as:

  • mapping symbols to areas of an image of a text for harvesting labelled image data as an input to machine learning, or as the output from visual analysis
  • linking different images of the same text to the same symbolic mapping: for example, for high and low-resolution versions of the same image, or for the photographic and illustrated (facsimile) versions of the same text
  • adding translations to words and phrases represented within the symbolic sequence of the text, as a research or teaching resource, or as part of a learning exercise
  • looking up glyph sequences in dictionaries or other lexical services, to speed up research or learning
  • annotating an area of a text for the purposes of research documentation, collaboration or teaching
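To make the idea of linked layers concrete, here is one possible shape for such a document, written as a Python dict, with two small helpers that walk the links. Every key name (`images`, `glyphs`, `regions`, `words`), the bounding-box layout and the file names are assumptions made for illustration and are not taken from the actual schema. The sign codes follow the Gardiner sign list.

```python
# Hypothetical layered document: one set of glyphs linked to two images
# (visual layer), to sign codes (symbolic layer) and to a word (semantic layer).
document = {
    "images": [
        {"id": "photo", "file": "photo.jpg"},
        {"id": "facsimile", "file": "facsimile.png"},
    ],
    "glyphs": [
        {
            "id": "g1",
            "sign": "D21",
            # Assumed box layout: [x, y, width, height] in pixels, per image.
            "regions": {"photo": [10, 12, 40, 20], "facsimile": [5, 6, 20, 10]},
        },
        {
            "id": "g2",
            "sign": "N35",
            "regions": {"photo": [52, 12, 40, 8], "facsimile": [26, 6, 20, 4]},
        },
    ],
    "words": [
        {"glyphs": ["g1", "g2"], "transliteration": "rn", "translation": "name"},
    ],
}


def labelled_regions(doc, image_id):
    """Return (sign, box) pairs for one image, e.g. as labelled training data."""
    return [
        (glyph["sign"], glyph["regions"][image_id])
        for glyph in doc["glyphs"]
        if image_id in glyph["regions"]
    ]


def word_signs(doc, word):
    """Resolve a word's glyph ids to their sign codes, in reading order."""
    by_id = {glyph["id"]: glyph for glyph in doc["glyphs"]}
    return [by_id[glyph_id]["sign"] for glyph_id in word["glyphs"]]


photo_labels = labelled_regions(document, "photo")
first_word_signs = word_signs(document, document["words"][0])
```

Keeping the regions keyed by image id is one way to let a photograph and a facsimile share the same symbolic mapping; the real schema may organise this differently.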

Further, it provides a standard format for the inputs and outputs of any automated techniques developed for working with the images or the language. It also lets non-programmers view and work with the content using a standard set of tools, which should encourage people to keep adopting and developing those tools.

What are the challenges?

There are plenty, but the main ones are:

  • Keeping it simple and lightweight, and avoiding complexity or dependencies on other more formal schemas
  • Extensibility to handle more complex language features without introducing complexity or overheads to those working with simpler concepts
  • Standardisation: of sign lists, transliteration schemes, ISO codes, fonts, etc.
  • Support for multiple hieroglyphic or ancient languages

fabricius-language-schema's People

Contributors

justingrayston

fabricius-language-schema's Issues

Some thoughts

I'm assuming that the symbolic layer refers to the assignment of glyph labels to areas of an image. In hieroglyphic writings, gaps may be generated by erosion, etc. This creates uncertainty when assigning glyph labels based on absent or incomplete image data (see TLA dictionaries). Contextual data (e.g. adjacent symbols, and putative semantics) may be required to fill the gaps, and to generate a probability distribution over a set of candidate glyphs. Parsing a symbolic sequence into words might assist with this process because gaps can be localised and context limited.
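As a rough illustration of the probability distribution mentioned above, the sketch below multiplies a per-candidate visual score by a context weight and normalises the result. The function name, the multiplicative scoring rule and the example numbers are all assumptions; nothing here is drawn from the Fabricius tools or from TLA.

```python
def candidate_distribution(visual_scores, context_weights):
    """Combine visual and contextual evidence into a normalised distribution.

    visual_scores: sign code -> non-negative score from the damaged image area.
    context_weights: sign code -> multiplier from adjacent signs or semantics;
    candidates without an entry keep a neutral weight of 1.0.
    """
    combined = {
        sign: score * context_weights.get(sign, 1.0)
        for sign, score in visual_scores.items()
    }
    total = sum(combined.values())
    if total <= 0:
        raise ValueError("no candidate has positive combined evidence")
    return {sign: value / total for sign, value in combined.items()}


# Two candidates look equally likely from the image alone, but the
# surrounding context (hypothetical weight) favours N35.
distribution = candidate_distribution({"D21": 0.5, "N35": 0.5}, {"N35": 3.0})
```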
