
Comments (4)

mnater commented on September 18, 2024

I'm using this script (https://github.com/mnater/Hyphenopoly/blob/master/tools/createWasmForLang.sh).

It takes four files as input:

  • .chr.txt -- list of characters
  • .hyp.txt -- hyphenation exceptions
  • .lic.txt -- license
  • .pat.txt -- the hyphenation patterns

Take the following files for German as examples (de.zip).
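Before running the script, it can help to confirm that all four inputs are in place. The following is only a minimal sketch: missing_inputs is a hypothetical helper that is not part of Hyphenopoly, and the "hyph-" file name prefix is an assumption based on the hyph-fo.pat.txt name mentioned further down in this thread, so adjust it to match your files.

```python
from pathlib import Path

# The four input suffixes named in the list above.
REQUIRED_SUFFIXES = (".chr.txt", ".hyp.txt", ".lic.txt", ".pat.txt")


def missing_inputs(directory, lang, prefix="hyph-"):
    """Return the expected input file names that are absent from `directory`.

    `prefix` is an assumption, not something the script is known to require.
    """
    base = Path(directory)
    expected = [f"{prefix}{lang}{suffix}" for suffix in REQUIRED_SUFFIXES]
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    # Hypothetical layout: ./fo/hyph-fo.chr.txt, ./fo/hyph-fo.pat.txt, ...
    missing = missing_inputs("fo", "fo")
    print("missing:", ", ".join(missing) if missing else "none")
```

This only checks that the files exist. It does not validate their contents.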

The script converts this input to a binary representation and adds it as a data element to the .wasm code that's compiled with AssemblyScript. You may need to adapt the paths.

Be aware: There's an issue with AssemblyScript (or rather Binaryen) that prevents adding pattern data larger than 64 KB (WebAssembly/binaryen#5595). Until this is sorted out, you'll need to use AssemblyScript <=0.26.

If the license allows it, you may also just send the files and I'll do my best to convert and publish them.

from hyphenopoly.

clauseggers commented on September 18, 2024

Thank you, Mathias. I have prepared the files following your template. The only deviation is that the hyph-fo.pat.txt file contains a UTF-8 in the first line. See if you can make this compile; that would be awesome.
fo.zip
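
If that leading UTF-8 line gets in the way of the conversion, it could be split off before processing. This is only a sketch under assumptions: split_encoding_header is a hypothetical helper, the sample pattern text in the usage lines is made up, and the thread does not say whether the conversion script needs this step at all.

```python
def split_encoding_header(text, marker="UTF-8"):
    """Split off a leading encoding line, if present.

    Returns a tuple (encoding_or_None, remaining_text).
    """
    first, _sep, rest = text.partition("\n")
    # strip() also tolerates a trailing carriage return from CRLF files.
    if first.strip().upper() == marker:
        return marker, rest
    return None, text


if __name__ == "__main__":
    # Made-up sample content, not taken from the Faroese pattern file.
    sample = "UTF-8\n.a1\nb1c\n"
    encoding, patterns = split_encoding_header(sample)
    print(encoding, repr(patterns))
```

Text without such a first line is returned unchanged, so the helper can be applied to any pattern file.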


mnater commented on September 18, 2024

Thank you.

Where did you get this from? I haven't found it in the Hunspell git repo, but I'd like to check whether other languages are available...


clauseggers commented on September 18, 2024

I collected a number of Hunspell hyphenation files for less-supported languages, and Faroese was one of them. This is what I wrote in the description about where I got it:

Language: Faroese (Faroe Islands) (fo FO)
Origin:   Generated from a collection of hyphenated words provided by the newspaper Dimmalætting.
          http://fo.speling.org/filer/hyph_fo_FO-20040420a.zip (Site no longer online, see below instead)
          https://fedora.pkgs.org/37/fedora-aarch64/hyphen-fo-0.20040420-22.fc37.noarch.rpm.html
License:  GNU General Public License, version 2
Author:   Jacob Sparre Andersen <[email protected]>

Faroese dictionary for spell checking.

I’m pretty sure it was just a Google search that pointed me to them, and that I either downloaded the files from the Fedora repo, or Archive.org.
