navi-reykunyu's People

Contributors

howlingwolfy, valmontechno, vawm, willem3141

navi-reykunyu's Issues

Sorting is incorrect in the complete word list

The sort order used on the complete word list page is not appropriate for Na'vi (in particular, it mixes up a and ä, etc.).

There are several reasonable ways to sort Na'vi words. Most importantly, one could argue that ll / rr / diphthongs count as distinct letters that should be sorted separately. In my opinion, this is not very useful if someone wants to search for a word manually (how are they supposed to know if awaiei is a-wa-i-e-i or aw-a-i-e-i?), so I'll just sort these letter by letter, as in the English sort order.
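A minimal sketch of a letter-by-letter sort that still keeps a and ä (and i and ì) apart. The alphabet string, the decision to ignore the glottal stop, and the name navi_sort_key are assumptions made for illustration; none of them are taken from Reykunyu's code.

```python
# Assumed letter order: each accented vowel sorts right after its plain
# counterpart. Digraphs (ll, rr, diphthongs) get no special treatment.
NAVI_ORDER = "aäefghiìklmnoprstuvwxyz"
RANK = {letter: index for index, letter in enumerate(NAVI_ORDER)}


def navi_sort_key(word):
    """Return a list of letter ranks to use as a sort key."""
    # Characters outside the table (', -, spaces) are skipped, so the
    # glottal stop does not influence the order (an assumption).
    return [RANK[letter] for letter in word.lower() if letter in RANK]


words = ["ätxäle", "awaiei", "atan", "ìlä", "ikran", "'awkx"]
print(sorted(words, key=navi_sort_key))
# ['atan', 'awaiei', "'awkx", 'ätxäle', 'ikran', 'ìlä']
```

With this key, all words starting with a come before all words starting with ä, instead of the two being interleaved.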

Improve Annotated Dictionary conversion scripts

The current script is a bit hacky and sometimes leaves LaTeX garbage in its output. This should be improved.

Known issues:

  • \end{multicols} left at the end of the last word
  • word hyperlinks are handled incorrectly: the script takes the LaTeX label itself instead of the target the label points to
  • links are not output properly to Discord, because nested bold inside a hyperlink isn't valid Markdown
  • -- does not convert to en-dash
  • multiple lemmas for the same headword don't work properly
  • tìtxen is very broken (entire LaTeX source detected as lemma)
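Two of the issues above (the stray \end{multicols} and the unconverted --) could be handled by a small post-processing pass. This is only a sketch: clean_latex_remnants is a hypothetical helper, not part of the existing conversion script, and it assumes the converted entry is available as a plain string.

```python
import re


def clean_latex_remnants(text):
    """Remove two kinds of leftover LaTeX from a converted entry."""
    # Drop stray multicols delimiters such as \end{multicols}
    # or \begin{multicols}{2}.
    text = re.sub(r"\\(?:begin|end)\{multicols\}(?:\{\d+\})?", "", text)
    # Convert a double hyphen to an en dash, leaving "---" untouched.
    text = re.sub(r"(?<!-)--(?!-)", "\u2013", text)
    return text.strip()


print(clean_latex_remnants("pp. 3--5 \\end{multicols}"))
```

The hyperlink and lemma problems need changes to the parsing itself and are not addressed by this kind of cleanup.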

Allow spaces in pronunciation values

Right now multiword phrases don't have a pronunciation defined because syllables are separated by dashes, not spaces. Also, we cannot designate more than one syllable as stressed. This should be changed.

Also, this would remove the need to special-case si-verbs.

Add help page

Currently the Help button brings up credits. Divide this screen into tabs: Help, API, and Credits.

Parser improvements

  • #43
    • then we can also send these to the client side for displaying the conjugated forms table
  • Parse attributive forms of adjectives.
  • Parse nouns that are productively created from verbs (tulyu, tìtusaron).
  • Implement external lenition. This should be simple in principle. But there are many special cases:
    • doubly-lenited words (we shouldn't parse hitx as (ay-)kitx with external lenition, but on the other hand, we can parse saykitx as tsay-kitx with external lenition);
    • words with a non-lenitable initial consonant... naively parsing them would produce two results for every such word (one with and one without external lenition);
    • similarly when parsing kitx we don't want to output kxitx twice; instead we want to output it once with two possible derivations;
    • many external lenitions are extremely unlikely to actually occur and can detract from the way-more-likely other result (heyn could be from keyn with external lenition, but it is way more likely just from heyn); we should probably have a scoring system and sort the results afterwards (or alternatively, just put the external lenitions always at the end);
    • if we are in sentence search mode, we should check if there is a leniting preposition before the word, to see if external lenition is even applicable (and in word search mode, we should clearly mark external lenitions).
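A rough sketch of the candidate-generation and ranking part of external lenition, using the standard Na'vi lenition pairs (kx→k, px→p, tx→t, k→h, p→f, t→s, ts→s, and a dropped glottal stop). The function names and the penalty values are hypothetical. The sketch does not cover double lenition, merging of duplicate results, or the check for a preceding leniting adposition.

```python
# Surface initial -> initials that lenite to it.
UNLENITIONS = {
    "k": ["kx"], "p": ["px"], "t": ["tx"],
    "h": ["k"], "f": ["p"], "s": ["t", "ts"],
}
# Initials that can never be the result of lenition.
DIGRAPHS = ("kx", "px", "tx", "ts", "ng")
VOWELS = set("aäeiìou")


def unlenite(word):
    """Return every form that could lenite to the given word."""
    if word.startswith(DIGRAPHS):
        return []
    first = word[:1]
    if first in VOWELS:
        # A word-initial glottal stop may have been dropped.
        return ["'" + word]
    return [initial + word[1:] for initial in UNLENITIONS.get(first, [])]


def candidates(word):
    """Return (form, derivation, penalty) tuples, likeliest first."""
    results = [(word, "as written", 0)]
    results += [(form, "external lenition", 1) for form in unlenite(word)]
    # sorted() is stable, so equally penalised forms keep their order.
    return sorted(results, key=lambda result: result[2])


print(candidates("heyn"))
# [('heyn', 'as written', 0), ('keyn', 'external lenition', 1)]
```

Each candidate would still have to be looked up in the dictionary; the penalty only decides the order in which surviving results are shown, so the externally lenited readings always end up last.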

Support multiple sources per word

The current format for the source field is either:

  • a single string (legacy from the original import), or
  • an array with three elements: source name, source URL and date (yyyy-mm-dd)

To support multiple sources, we should migrate to a new format that is an array of three-element arrays as above. Actually I would also want to add an optional fourth element to the array, namely for remarks.

Things to do here:

  • Migrate words.json:
    • for all "single string" source fields "...", change them into [["..."]]
    • for all three-element array source fields ["...", "...", "..."], change them into [["...", "...", "..."]]
  • Add support to the web frontend
  • Add support to the Discord bot
  • Add support to the editor for reading and writing this format (including the remarks field)
  • Add support to the editor for actually adding more than one source
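The words.json migration step could look roughly like this. The sketch assumes that the file contains a list of word objects, each with an optional "source" key; that layout and the function names are assumptions, not confirmed by the issue.

```python
def migrate_source(source):
    """Wrap a legacy source value in the new list-of-sources format."""
    if isinstance(source, str):
        # Legacy single string "..." becomes [["..."]].
        return [[source]]
    if isinstance(source, list) and source and all(
        isinstance(element, str) for element in source
    ):
        # Flat array ["name", "url", "date"] becomes [["name", "url", "date"]].
        return [source]
    # Anything else is assumed to be migrated already.
    return source


def migrate_words(words):
    """Migrate the "source" field of every word object in place."""
    for word in words:
        if "source" in word:
            word["source"] = migrate_source(word["source"])
    return words
```

Because already-migrated values are returned unchanged, running the migration twice is harmless.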

Pronunciation of words containing "-nk-"

The correct pronunciation of the word zenke is zeng-ke. Currently Reykunyu says zen-ke. We should check every word that contains -nk- in order to fix the errors.
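A small sketch for listing the entries that need review, assuming each entry can be reduced to a (word, pronunciation) pair with dash-separated syllables. The function name and the data layout are hypothetical.

```python
def find_suspect_nk(entries):
    """Return words spelled with "nk" whose pronunciation splits it as n-k."""
    return [
        word
        for word, pronunciation in entries
        if "nk" in word and "n-k" in pronunciation
    ]


entries = [("zenke", "zen-ke"), ("kaltxì", "kal-txì")]
print(find_suspect_nk(entries))  # ['zenke']
```

The output is only a list of candidates for manual checking; the sketch does not rewrite any pronunciation.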

Implement vtrm. and vinm. word types

Reykunyu currently implements vtrm. verbs as two separate verbs vtr. and vm. This turns out to be confusing for users. Implement the merged types and then merge these in the database.

"ler"

For the word ler, Reykunyu says:

Forms
(a)-ler or ler-a

But <noun> ler is not valid, so Reykunyu should say "<noun> a-ler" instead of "<noun> (a)-ler".

Of course, the problem is that the word ler begins with le-. Fixing this is probably a bit complicated.

Automatically switch between N→E and E→N modes

It is annoying that users have to switch between search directions manually. The default should be some combination mode. Or alternatively, auto-switch the mode when the current mode gives no results. (Is this intuitive, however?)

Corpus editor

Reykunyu already has a corpus editor prototype, but it is unfinished.

Add approximate matches

If someone searches for kaltxi, Reykunyu should still show kaltxì (under some "approximate matches" banner).
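One simple way to find such approximate matches is to compare words with their diacritics removed. The sketch below uses Python's standard unicodedata module; the function names are hypothetical, and note that this folding also treats a and ä as equal.

```python
import unicodedata


def fold(word):
    """Lowercase a word and strip its combining diacritics."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def approximate_matches(query, dictionary_words):
    """Return words that differ from the query only in diacritics or case."""
    folded = fold(query)
    return [w for w in dictionary_words if w != query and fold(w) == folded]


print(approximate_matches("kaltxi", ["kaltxì", "kelku"]))  # ['kaltxì']
```

Exact matches are excluded on purpose, so the list can be shown under a separate "approximate matches" banner.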

Vocab teaching games

Reykunyu should have a game (or several games?) to teach word meanings, stress patterns, etc.

Maintain capitalization and punctuation from the query

Reykunyu forgets the query's capitalization and punctuation. It would be good to maintain this.

Unfortunately, this is one of those "it's buried deep in two-year-old code and changing this will probably break ten other things" features. But it would be useful to have for the new corpus feature.

Implement noun affix builder

Just like the verb affix builder, this should be a dialog box. It should allow the user to apply any of the noun affixes.

si-verbs don't show affix table

For example, "kaltxì siyevi" shows that it is kaltxì si + , but not the infix table where you can see that it is built from + /<ìy>

(ay)sko treated as leniting

Argh, because sko is a leniting adposition, Reykunyu wouldn't understand a sentence like sko tsìk sunu oeru, because it tries to unlenite tsìk which doesn't yield any results, obviously.

This may not be solvable... at least not easily.

Remove duplicated adjective results

When searching for apxa Reykunyu happily gives the same result thrice because it tries to be helpful by suggesting that it can be apxa, a-apxa and apxa-a. Unfortunately this is not helpful at all in practice, so we should filter these out.

Support multiple definitions per word

This already kind of works but many things still need to be fixed:

  • The Discord bot doesn't show multiple definitions yet.
  • English → Na'vi search doesn't handle more than one definition.
  • Translations are annoying, because many translations may not have the split-up definitions yet. In that case Reykunyu shouldn't show the same definition multiple times for these languages.

"meylltxep"

If you search for the word meylltxep, Reykunyu doesn't find any results. The reason is that Reykunyu thinks the letters of this word are m4LTep instead of meyLTep. Unfortunately, this is of course a problem that is not easy to fix...

I can see two ways to fix it:

  • try every possible spelling (but this approach doesn't seem very elegant to me)
  • don't use the compressed spelling, but use the ordinary spelling directly (but that is more complicated)
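The first option (trying every possible spelling) could be sketched as follows. Only the three mappings visible in the example above (ey→4, ll→L, tx→T) are included; the real compression table is presumably larger, and the function name is hypothetical.

```python
# Partial digraph table, inferred from m4LTep versus meyLTep.
DIGRAPHS = {"ey": "4", "ll": "L", "tx": "T"}


def encodings(word):
    """Return every compressed spelling, treating each digraph as optional."""
    if not word:
        return [""]
    results = []
    pair = word[:2]
    if pair in DIGRAPHS:
        # Read the next two letters as a single sound.
        results += [DIGRAPHS[pair] + rest for rest in encodings(word[2:])]
    # Read the next letter on its own.
    results += [word[0] + rest for rest in encodings(word[1:])]
    return results


spellings = encodings("meylltxep")
print(len(spellings))            # 8
print("meyLTep" in spellings)    # True
```

The number of spellings doubles with every digraph in the word, which is one reason this approach is unattractive compared to working on the ordinary spelling directly.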
