I know nothing about parsing, Haskell, or Japanese. This repository is purely for fun and serves as a test bed for some Haskell experiments.
Build the most convenient tool for Japanese beginners!
Input:
- https://raw.githubusercontent.com/foreverbell/parakeet/master/test-suite/Butter-fly/Butter-fly.j
- https://raw.githubusercontent.com/foreverbell/parakeet/master/test-suite/Butter-fly/Butter-fly.r
Output:
Full output:
- tex: https://raw.githubusercontent.com/foreverbell/miscellaneous/master/resource/parakeet/Butter-fly.tex
- pdf: https://raw.githubusercontent.com/foreverbell/miscellaneous/master/resource/parakeet/Butter-fly.pdf
Romaji should follow Hepburn romanization.
For an experimental online demo powered by GHCJS, see here for more details.
To build, GHC 7.10.2 or later is required.
$ cabal install parakeet.cabal
For stack users,
$ stack init
$ stack install
or
$ stack install --stack-yaml=stack-ghcjs.yaml
if you want to compile to JavaScript.
$ cabal sandbox init
$ cabal install --only-dependencies
$ cabal build
- XeLaTeX package dependencies: xeCJK, ruby
- Font dependencies: MS Mincho, MS Gothic
$ parakeet -j Butter-fly.j -r Butter-fly.r -o Butter-fly.tex
$ xelatex Butter-fly.tex
or directly,
$ parakeet -j Butter-fly.j -r Butter-fly.r -o Butter-fly.pdf
Make sure that both input files are encoded in UTF-8.
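One way to check the encoding before running parakeet is to try decoding each file as UTF-8. The Python sketch below is only an illustration and is not part of parakeet; the helper name `is_utf8` and the script name in the usage comment are hypothetical.

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_utf8.py Butter-fly.j Butter-fly.r
    for name in sys.argv[1:]:
        print(name, "ok" if is_utf8(name) else "NOT valid UTF-8")
```

A file saved in another encoding such as Shift_JIS will typically fail this check and should be converted first.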
- The parsing algorithm is essentially LL(infinity), so it is, of course, exponential in the worst case. The program may become extremely slow when there is a mistake in a long line of romaji. Proper use of the separator $ avoids this trap.
- The long vowel ō is ambiguous in Hepburn romanization: it can be read as either ou or oo. To resolve this, we always pick the former. For example, 東京 (Tōkyō) is correctly translated to とうきょう, while 大阪 (Ōsaka) is wrongly translated to おうさか (the correct reading is おおさか).
- There are two zu's and two ji's in romanization, namely ず, づ and じ, ぢ in hiragana respectively. We always pick ず and じ when translating zu and ji into furigana. If you want づ or ぢ, use du (dzu) and di (dji) instead.
- Unfriendly parse error message.
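The long-vowel and zu/ji rules above can be illustrated with a small Python sketch. This is not parakeet's actual (Haskell) implementation; the names `expand_o_macron`, `pick_o_macron`, and `ZU_JI_KANA` are hypothetical, and only the cases mentioned above are covered.

```python
import unicodedata

# Candidate spellings for the long vowel ō, in the order described above.
O_MACRON_READINGS = ("ou", "oo")

# Romaji spellings mentioned above and the kana each one selects.
ZU_JI_KANA = {
    "zu": "ず", "ji": "じ",
    "du": "づ", "dzu": "づ",
    "di": "ぢ", "dji": "ぢ",
}


def expand_o_macron(romaji):
    """List every spelling obtained by reading each ō as 'ou' or 'oo'."""
    # Normalize so that 'o' + combining macron is treated like a single ō.
    text = unicodedata.normalize("NFC", romaji).lower()
    results = [""]
    for ch in text:
        options = O_MACRON_READINGS if ch == "ō" else (ch,)
        results = [prefix + opt for prefix in results for opt in options]
    return results


def pick_o_macron(romaji):
    """Resolve the ambiguity by always choosing the first candidate ('ou')."""
    return expand_o_macron(romaji)[0]
```

In this sketch, `pick_o_macron("Tōkyō")` gives `toukyou`, which matches とうきょう, while `pick_o_macron("Ōsaka")` gives `ousaka` even though `oosaka` (おおさか) is among the candidates. The number of candidates doubles with every ō in the input.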
Since I haven't found any potential users, there is no documentation available. Please create an issue if you have trouble using it.
- Ambiguous ō warning.
- Extended katakana support.
- Wiki for Japanese lexical rules.