
nazeka's Introduction

Nazeka is a rikai replacement.

Nazeka was the first nontrivial thing I wrote in JavaScript. The code is terrible and a lot of it is bundled into a small number of files. I need help splitting it up into multiple files, or being told how to split it up into multiple files without requiring any build tools.

Nazeka is ready for general testing; it's still missing a few behaviors from rikaisama, and it's still clunky, but it's close to complete.

Settings

See the tutorial.

Building

Build process requires python and lxml, or an existing copy of the extension to rip JMdict#.json from.

  • Download JMdict.gz from http://www.edrdg.org/jmdict/edict_doc.html
  • Convert it to json with etc/process.py
  • Move the output to under dict/ so that [...]/dict/JMdict1.json and others exist in that location relative to [...]/manifest.json
  • Make sure every .json file is listed as available to the extension in manifest.json - if jmdict added a lot of words, there might be more than 11 files now
  • Package as an extension for your browser of choice, or load it as a temporary/indev extension using your browser's development tools
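The manifest check in the steps above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and it assumes the dictionary files are listed as plain path strings under a `web_accessible_resources` key, which may not match how the real manifest.json lists them.

```javascript
// Hypothetical helper: report dictionary files that the manifest does not list.
// Assumption: files appear as "dict/<name>" strings under
// "web_accessible_resources"; adjust the key if the real manifest differs.
function findUnlistedDictFiles(manifest, dictFiles) {
  const listed = new Set(manifest.web_accessible_resources || []);
  return dictFiles.filter((name) => !listed.has("dict/" + name));
}

// Example data (made up for illustration).
const exampleManifest = {
  web_accessible_resources: ["dict/JMdict1.json", "dict/JMdict2.json"],
};
const filesOnDisk = ["JMdict1.json", "JMdict2.json", "JMdict3.json"];

// Lists the files that still need to be added to the manifest.
console.log(findUnlistedDictFiles(exampleManifest, filesOnDisk));
```

Any file name this reports would need to be added to the manifest before packaging.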

Copyright and License

Copyright 2017~2019; Licensed under the Apache License, Version 2.0: https://www.apache.org/licenses/LICENSE-2.0

nazeka's People

Contributors

wareya, o0o0

nazeka's Issues

NHK Easy furigana included in mined sentence

When mining a word in a sentence with ruby text/furigana, the furigana is included inline in the sentence.
For example, when I mine a word in the first sentence of this article the sentence field contains the following in Anki:

アメリカは、イランが石油せきゆなどを輸出できないようにする「経済制裁」を日本にっぽんの時間の5日の午後2時過ぎに始めました

It seems to only affect words that have additional styling in the article like 日本 and 石油 but not 午後.
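One way to keep furigana out of a mined sentence is to drop the ruby annotation elements before extracting text. The sketch below works on an HTML string with regular expressions so that it is self-contained; it is not Nazeka's actual text pickup code, which presumably walks the DOM, and the function name is hypothetical.

```javascript
// Hypothetical helper: remove furigana (<rt>) and fallback parentheses (<rp>)
// from an HTML fragment, then strip the remaining tags.
// Assumption: the input is a small, well-formed fragment; a real extension
// would more likely skip these elements while walking DOM nodes.
function stripRubyAnnotations(html) {
  return html
    .replace(/<rt\b[^>]*>[\s\S]*?<\/rt>/gi, "")
    .replace(/<rp\b[^>]*>[\s\S]*?<\/rp>/gi, "")
    .replace(/<[^>]+>/g, "");
}

const fragment =
  "<a><ruby>石油<rt>せきゆ</rt></ruby></a>などを<ruby>輸出<rt>ゆしゅつ</rt></ruby>できない";
console.log(stripRubyAnnotations(fragment));
```

Because the annotations are removed before the outer tags are stripped, ruby text nested inside links or other styling wrappers is handled the same way as bare ruby text.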

Name lookup support

Are there any plans to add name lookup through resources such as ENAMDICT/JMnedict?

With rikaichamp, it's possible to switch to name lookup by pressing the shift key. It looks like so:
[Screenshot: rikaichamp name lookup]

I think it would be very useful and really the only feature lacking when compared to rikaichamp.

TODO

  • A way to mine frequency information
  • Find a way to apply dynamic range compression to audio playback
  • Show a message after successfully mining a card
  • Finish writing deconjugation rules
  • Write more importers for zero-epwing json dictionaries
  • Mine specific single readings by clicking on them (???)
  • Cave in and add a lookup-seq blacklist
  • Option to colorize jmdict definition sense numbers
  • Autoupdating to new versions of JMDict
  • Option to not discard JMDict's arbitrary auxiliary information (need to add it to the converted dictionary)
  • Text editing in the reader page somehow
  • "Error timeout" option
  • Option to autosave reader size and position if possible
  • Reader line height option
  • Some kind of websockets-based alternative to clipboard grabbing (though clipboard grabbing will remain supported) (deprioritized to "would be nice if it existed")
  • Filter out hiragana-katakana duplication in reading display
  • Prettier in-nazeka review of mined cards (not SRS/flashcards, just a list) (decided bad idea)
  • Make buttons on reader page go to top on reader-at-bottom mode
  • Find out whether the reader open but using other pages problem is a firefox bug (gave up and worked around it)
  • Make JSON dictionary spelling rejection feature work with kana-only dictionary entries
  • Option to forcibly sync with ankiweb every time a new card is mined
  • Pause and delete buttons for the reader
  • Option to remove newlines from JSON definitions and replace them with colorized "commas" - customizable comma, default to ⬥?
  • Highlight dictionary titles (#22)
  • Frequency information (closes #18 as well)
  • Look for a way to fix #10
  • Better/overrideable jmdict entry ordering heuristics
    • Regression tests for jmdict entry ordering heuristics
  • Make it so that ascii numbers don't trigger lookups if there's no japanese characters within the lookup context
  • Some way to control which audio clip gets played (play the next one after the one that just got played?)
  • Mine with number keys
  • Import multiple json dictionaries at once and manage them
  • Option to do things with selections
    • Use selection as context
    • Use selection to override search ending position
  • Allow loading your own deconjugation ruleset
  • Fix more edge cases with text pickup from elements
  • Text nudging (for mobile)
  • Hotkeys other than opening the mining window
  • Warn before clearing mined cards
  • Mining formatting (obsolete)
  • Mine audio information (probably needs interactive mining/realtime importing)
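As an illustration of one item in the list above, filtering out hiragana-katakana duplication in the reading display, readings can be compared after folding katakana to hiragana. This is a minimal sketch with hypothetical function names, not the approach Nazeka necessarily uses.

```javascript
// Fold katakana (U+30A1..U+30F6) to hiragana by shifting code points by 0x60.
function katakanaToHiragana(text) {
  return text.replace(/[\u30A1-\u30F6]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x60)
  );
}

// Keep the first reading of each kana-insensitive group, preserving order.
function dedupeKanaReadings(readings) {
  const seen = new Set();
  return readings.filter((reading) => {
    const key = katakanaToHiragana(reading);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

console.log(dedupeKanaReadings(["たばこ", "えんそう", "タバコ"]));
```

The first spelling encountered wins, so the dictionary's own ordering decides whether the hiragana or katakana form is shown.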

[Epwing] 彼処[あそこ] doesn't get matched correctly.

In jmdict, あそこ and かしこ are under the same entry, while in Kenkyuusha/Shinmeikai they are two different entries. This somehow results in かしこ being displayed instead of あそこ.

The fix would be either matching the epwing definitions to the first jmdict reading or including all epwing entries (sorted according to the jmdict order)
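The second proposed fix, including all epwing entries sorted according to the jmdict order, could look roughly like this. The entry shape (`reading`, `definition`) and the function name are assumptions made for illustration, and the reading list is abbreviated to the two readings mentioned above.

```javascript
// Hypothetical helper: order epwing entries by where their reading appears in
// the jmdict entry's reading list. Entries with unknown readings go last.
function sortByJmdictReadingOrder(jmdictReadings, epwingEntries) {
  const rank = (reading) => {
    const index = jmdictReadings.indexOf(reading);
    return index === -1 ? jmdictReadings.length : index;
  };
  // Copy before sorting so the caller's array is left untouched.
  return [...epwingEntries].sort((a, b) => rank(a.reading) - rank(b.reading));
}

const jmdictReadings = ["あそこ", "かしこ"];
const epwingEntries = [
  { reading: "かしこ", definition: "(placeholder definition)" },
  { reading: "あそこ", definition: "(placeholder definition)" },
];
console.log(sortByJmdictReadingOrder(jmdictReadings, epwingEntries)[0].reading);
```

Taking only the first element of the sorted result would give the first proposed fix, matching the epwing definition to the first jmdict reading.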

Make more distinctions between json dictionary and jmdict when mining

I like to look at the kenkyusha dictionary definitions and sample sentences when I'm just looking at words, but it becomes a problem when I mine because the card ends up as some giant list. You ended up suggesting we edit the cards after we make them, but I usually make over 30 cards a day, so it can get tedious. An option to use only the jmdict definition (or even only a specific json dictionary's definition) in the card would be very helpful.

missorts

Nazeka has logic for forcibly fixing bad sorting of dictionary entries with per-spelling/per-word overrides now. This issue is for collecting cases that require such a special exception but do not have it yet.

  • 極めて
  • どうか

[Feature] Individual Dictionary entries.

Currently all dictionary entries are bundled together.
It would be nice to add a rikaisama-like toggle for dictionaries where individual dictionaries are swapped by pressing keyboard shortcuts.

For instance the json priority list looks like this:
Kenkyuusha
Shinmeikai
Jmdict

Pressing "D" for instance swaps the pop up definition to Shinmeikai, pressing again swaps to Jmdict and pressing again swaps back to Kenkyuusha.
Pressing "Q" for example would go the other way around.

Entering "Mining UI" while "Dictionary Swap" is used would only generate the definition of the currently selected dictionary.

Use case:

  1. A definition is particularly good on Shinmeikai and I only want this one.
  2. Always displaying every dictionary is a bit of a clutter and inefficient when you start adding up more dictionaries (Daijirin, Kojien,... when supported by nazeka_epwing_converter).
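The swap behaviour described above amounts to cycling an index through the priority list in either direction. A minimal sketch, with a hypothetical function name and the key bindings left out:

```javascript
// Hypothetical helper: move forward (step = 1) or backward (step = -1)
// through the dictionary priority list, wrapping around at both ends.
function cycleDictionary(dictionaries, currentIndex, step) {
  const count = dictionaries.length;
  // The double modulo keeps the result non-negative for negative steps.
  return (((currentIndex + step) % count) + count) % count;
}

const priorityList = ["Kenkyuusha", "Shinmeikai", "Jmdict"];
let selected = 0; // Kenkyuusha
selected = cycleDictionary(priorityList, selected, 1); // "D" pressed
console.log(priorityList[selected]);
selected = cycleDictionary(priorityList, selected, -1); // "Q" pressed
console.log(priorityList[selected]);
```

The mining UI could then read the selected index and include only that dictionary's definition in the card.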

Chrome-specific issues

  • JSON dictionaries don't seem to work, try to figure out why
  • Extension barely tested in general

A select-on-only fork of NAZEKA instead of a hover-on one?

Is there a way to make it? I've recently thought of a chinese-to-russian vocab extension called БКРС used to show definitions just on selections, and now I find it strangely useful. Plus, it means it won't be an active crawler, so it's easier on RAM and all.

Unable to mine cards or play audio

Hello,

I was curious if anyone else had a similar problem happen for them as well. I'm currently using Nazeka's live mining feature with and whenever I try and mine a word an error pops up that says this: https://i.imgur.com/1hselnq.png

It was working up until yesterday, I don't believe anything updated or anything like that so any help would be greatly appreciated!

Shortcuts unreliable on keypress-stealing websites

Trying to use the nazeka shortcuts (mainly m(ining) and k(anji) mode) on websites which have their own keyboard shortcuts (e.g. twitter, slack, ...) fails more often than it works. I'm fine with both events firing off, but in most cases only the native website's shortcut works, while nazeka's works only about 30% of the time. It's pretty random.

"Popup requires key" setting sometimes gets ignored

I have the popup key set to shift, but sometimes nazeka ignores this and behaves according to the default settings (no popup key). This seems to happen randomly, and refreshing the page always fixes the problem.
When this bug does appear, it doesn't affect all of my open tabs, it only happens on an individual tab basis.

Crawl/mine within <ruby>s

As I've already pointed out, I have a 5K FREQ web-page (which I've lightened now): https://yadi.sk/d/RzLoothMJwfupw. I've also bordered out the words' readings. For things to be perfect, it'd be nice to hover over these words and have their audio played. I tend to think the way to do this is to give the extension the ability to crawl within HTML ruby tags.

I may be missing things. If it's already a feature, please point it out.

[Feature Request] Add a "times mined" column to mining

Default behaviour currently is to add a duplicate of the same word if mined again. It would be more useful to add another tab which indicates the number of times an entry has been mined and so a word encountered more can be learnt on a much higher priority.
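A "times mined" count can be derived by merging duplicates instead of appending them. The sketch below assumes each mined card has `spelling` and `reading` fields, which is a guess at the data shape rather than Nazeka's actual storage format.

```javascript
// Hypothetical helper: collapse duplicate cards and count how often each
// spelling/reading pair was mined. The first occurrence's other fields win.
function mergeDuplicateCards(minedCards) {
  const merged = new Map();
  for (const card of minedCards) {
    const key = card.spelling + "\n" + card.reading;
    const existing = merged.get(key);
    if (existing) {
      existing.timesMined += 1;
    } else {
      merged.set(key, { ...card, timesMined: 1 });
    }
  }
  return [...merged.values()];
}

const cards = [
  { spelling: "輸出", reading: "ゆしゅつ" },
  { spelling: "石油", reading: "せきゆ" },
  { spelling: "輸出", reading: "ゆしゅつ" },
];
console.log(mergeDuplicateCards(cards));
```

Sorting the merged list by `timesMined` in descending order would surface the words encountered most often.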

Display individual kanji information

Today I found this extension and I am very happy with it. THANK YOU VERY MUCH for developing this project.

I would like to make a feature request. This is a Rikai replacement, but it lacks a very important feature of Rikai. In all desktop versions of the Rikai mods, if you press the Shift key when the mouse pointer is over a kanji, it opens the kanji information panel (see the attached screenshot). That panel lists a lot of information about the kanji (onyomi, kunyomi, stroke count, radicals, meaning, etc.). For example, if you hover your mouse over the word "妖怪", it recognizes the word as a whole and at first it doesn't list the translation of the first kanji "妖". However, if you press the Shift key, the kanji information panel opens and gives information about it.

You should really implement this feature. All Japanese learners need it. This is what makes Rikai so popular, and unfortunately all other Rikai replacements lack this feature. Nazeka can make a difference by implementing it.

Please consider my idea.

Many thanks.

[Screenshot: rikai - individual kanji information]

[MINING][EPWING] Remove entry/reading from the definition

Having the entry/reading in the definition defeats the purpose of reviewing in Anki.

For instance:

―研究社 新和英大辞典 第5版―
やさしい【易しい】
〔容易な〕 easy; simple; light; 〔平易な〕 plain.
►ごくやさしい as easy as ABC
►ここで手を貸すのはやさしいが, それでは本人のためにならない. It would be easy enough to help him, but that would not be in his interests.
►(表現を)やさしくする simplify 《English》; paraphrase in plain language

to

―研究社 新和英大辞典 第5版―
〔容易な〕 easy; simple; light; 〔平易な〕 plain.
►ごくやさしい as easy as ABC
►ここで手を貸すのはやさしいが, それでは本人のためにならない. It would be easy enough to help him, but that would not be in his interests.
►(表現を)やさしくする simplify 《English》; paraphrase in plain language
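The transformation above can be sketched as dropping the first line that looks like a headword, i.e. a reading immediately followed by a spelling in 【】 brackets. The pattern is a guess based on the example shown here and may not hold for every epwing dictionary; the function name is hypothetical.

```javascript
// Hypothetical helper: remove the first "reading【spelling】" line from a
// definition, leaving the dictionary title and the definition body intact.
function stripHeadwordLine(definition) {
  const lines = definition.split("\n");
  const index = lines.findIndex((line) => /^[^【\s]+【[^】]+】/.test(line));
  if (index !== -1) lines.splice(index, 1);
  return lines.join("\n");
}

const mined = [
  "―研究社 新和英大辞典 第5版―",
  "やさしい【易しい】",
  "〔容易な〕 easy; simple; light; 〔平易な〕 plain.",
  "►ごくやさしい as easy as ABC",
].join("\n");
console.log(stripHeadwordLine(mined));
```

Only the first matching line is removed, so example sentences further down are never touched even if they happen to contain brackets.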

Can't install on Android

Hi,

I can't figure out how to install on Android. I tried with the latest version as well as with various v68 and previous versions, both with Fennec F-Droid as well as old versions of official Firefox from the archives. The blue button to add the addon is just greyed out. Any help would be appreciated.

Unable to download audio files

I understand that not all words have audio but it appears none of them do. When using the "audio_anki" field I'm getting the following error, every time:
AnkiConnect mining failed: 'MediaManager' object has no attribute 'syncDelete'

I am using the latest version of Ankiconnect and connected to the internet.

Clipboard Grabber/Reader

Would it be possible to bring the Clipboard Grabber/Reader to chrome? If so do you have any plans on doing so?

Disable text lookup while sticky search is enabled.

Most of the time when I enable sticky search, it is to scroll through a definition, but words get looked up while I'm trying to reach the popup window.

Note: Applying a pop up key obviously solves this.

Manual Tags and Readings

Two small improvements that I would appreciate. I apologize if these things are already possible and I just didn't find them in the options.

1) Option to automatically have nazeka add an Anki tag other than the default "nazeka" one
I use the tagging system in Anki extensively, as of now I have to manually add a specific tag to mined words, it would be quite helpful if I could just tell nazeka to add one in addition to the default instead.

2) Option to only mine the reading chosen in the mining UI
Nazeka adds all possible readings, but usually I want to drill only a single reading per card, so I have to manually remove the others. It would be nice if I could just choose the reading I want right away.
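For the first request, extra tags could be merged with the default one when the note is sent to Anki. The sketch below builds a request body for AnkiConnect's `addNote` action; how Nazeka actually assembles its request is not shown in this document, and the function name, deck, model, and field names are placeholders.

```javascript
// Hypothetical helper: build an AnkiConnect "addNote" request whose tags are
// the default "nazeka" tag plus any user-configured extras, without duplicates.
function buildAddNoteRequest(deckName, modelName, fields, extraTags) {
  const tags = [...new Set(["nazeka", ...extraTags])];
  return {
    action: "addNote",
    version: 6,
    params: { note: { deckName, modelName, fields, tags } },
  };
}

// Placeholder deck, model, and field names for illustration only.
const request = buildAddNoteRequest(
  "Mining",
  "Basic",
  { Front: "石油", Back: "oil; petroleum" },
  ["news", "nazeka"]
);
console.log(JSON.stringify(request, null, 2));
```

The second request, mining only the chosen reading, is not covered by this sketch.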

Nothing major but these would make my life easier. You're doing great work by the way, thank you very much.

Live Mining

Not exactly an issue with the extension itself, but there is nowhere else to post.

Could you please make a more detailed "Live Mining for dummies" tutorial?

I installed AnkiConnect and checked that it's connecting properly, but I can't figure out how to configure the Anki deck and the live mining options so that mining works.

Mining not working since latest AnkiConnect update

Specifically Nazeka doesn't seem to detect Anki. I haven't changed anything since the last time I used it but mining has suddenly stopped working today. I noticed that AnkiConnect also updated today so I thought it might be that, but it might be just me.

positioning bugs

  • popup goes off bottom instead of top when alignment is set to bottom left/right and popup is too tall
  • popup should become flush with the top or bottom of the screen when it's too tall to be flipped/pushed

Add settings import/export

Firefox keeps clearing my settings because of Firefox bugs involving temporary/debug extensions, so I have to do this for my own sanity.

Font size problem

Nazeka sometimes uses the font size of a character instead of the font size selected in its options. You can try it out with: Font size test.txt (just rename the .txt to .html; apparently GitHub doesn't support uploading HTML files).

The displayed text size is also getting affected when one zooms in/out a page, but I think it shouldn't.

Enhancement - Future EPWING support and Kenkyusha

Something I've always wished from any pop up dictionary is a good support for the kenkyusha waei dictionary.
The best support at the moment is still rikaisama and a lot of people stick to firefox 57/waterfox/palemoon... for the sole purpose of rikaisama and its regex deletion.

It's a good occasion to improve on rikaisama's EPWING feature (especially for kenkyusha due to its large number of examples and whatnot).

For the moment with rikaisama and its regex feature we can clean up the definition field BUT in this process we're forced to delete all the examples.
I've been trying to find a fix using regex, but I'm not competent enough to make a breakthrough. Here's what I've figured out so far.

Example of an entry :

まにあう【間に合う】 ローマ(maniau)
1 〔時間に遅れない〕 be in time 《for…》.
▲7 時の列車に間に合う catch [make] the 7 o'clock train
・締め切りに間に合う meet the deadline
・開演に間に合う arrive before curtain time
▲9 時の札幌行きに間に合うように空港に着いた. I arrived in time for the nine o'clock flight to Sapporo.
・「間に合うかな」「走っても間に合いそうにないね」 "Will we be in time?"―"It doesn't look like we'll be in time even if we run."
2 〔役に立つ〕 answer [serve, suit, meet] the purpose; be useful; be serviceable; be of use [service]; be good enough; 〔十分である〕 be enough; 〔用意ができる〕 be ready; 〔必要をみたす〕 meet the requirements; serve the [one's] turn [need].
▲「費用はどのぐらいかな」「5 万もあれば間に合うよ」 "And what is the expense?"―"Fifty-thousand yen should cover it."
・これだけあれば丸 1 年は間に合う. This will last us [see us through] one whole year. | This will be enough for a whole year.

Where all entries starting with "▲" or "・" are examples and all entries matching this regex are definitions :

Regular expression that matches everything that is not a definition :
\n[^″*〖〈《⇒=➡【〔(〜A-Za-z0-9].*

Regular expression that matches definitions+one line below :
\n[″*〖〈《⇒=➡【〔(〜A-Za-z0-9].*\n.*

The perfect result should look like this(keeping one example for each definition) :

まにあう【間に合う】 ローマ(maniau)
1 〔時間に遅れない〕 be in time 《for…》.
▲7 時の列車に間に合う catch [make] the 7 o'clock train
2 〔役に立つ〕 answer [serve, suit, meet] the purpose; be useful; be serviceable; be of use [service]; be good enough; 〔十分である〕 be enough; 〔用意ができる〕 be ready; 〔必要をみたす〕 meet the requirements; serve the [one's] turn [need].
▲「費用はどのぐらいかな」「5 万もあれば間に合うよ」 "And what is the expense?"―"Fifty-thousand yen should cover it."

tldr : adding support for kenkyusha while keeping only 1 example for each definition would be a godsend for anyone learning Japanese and will finally achieve a breakthrough in the dictionary pop up app nonsense.
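Rather than a single regular expression, the "one example per definition" result can be produced with a short line-by-line pass. The sketch below relies only on the convention stated above, that example lines start with "▲" or "・"; it is an illustration with a hypothetical function name, not an existing Nazeka feature.

```javascript
// Hypothetical helper: keep the headword and definition lines, plus only the
// first example line that follows each of them.
function keepFirstExamplePerSense(entryText) {
  const isExample = (line) => line.startsWith("▲") || line.startsWith("・");
  const kept = [];
  let exampleKept = false;
  for (const line of entryText.split("\n")) {
    if (!isExample(line)) {
      kept.push(line); // headword or definition line
      exampleKept = false; // a new sense may keep one example again
    } else if (!exampleKept) {
      kept.push(line);
      exampleKept = true;
    }
  }
  return kept.join("\n");
}

// Shortened version of the entry above.
const entry = [
  "まにあう【間に合う】 ローマ(maniau)",
  "1 〔時間に遅れない〕 be in time 《for…》.",
  "▲7 時の列車に間に合う catch [make] the 7 o'clock train",
  "・締め切りに間に合う meet the deadline",
  "2 〔役に立つ〕 be useful; be enough.",
  "▲「費用はどのぐらいかな」 \"And what is the expense?\"",
  "・これだけあれば丸 1 年は間に合う. This will last us one whole year.",
].join("\n");
console.log(keepFirstExamplePerSense(entry));
```

Making the number of examples kept per definition a setting would be a small change to the `exampleKept` flag.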

[Feature Request] Parse reading and spelling from json dictionary instead of jmdict

Jmdict readings/spelling are often cluttered and unnecessarily long while json dictionaries tend to be more concise and useful.
De-cluttering those fields is especially useful for Live Mining.

Example:
Jmdict:

煙草、莨、烟草
たばこ(gikun)、えんそう、けぶりぐさ、けむりぐさ、タバコ

Kenkyuusha:

煙草
たばこ

Case-insensitive hotkeys

As is, Nazeka uses case-sensitive hotkeys. So while m works as intended for mining, M doesn't (or vice versa, if you set the hotkey to M instead of m). I find this undesirable. Is it possible to make the hotkeys case-insensitive, optionally or otherwise?
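An optional case-insensitive comparison could be as small as the following sketch. The function name and the `caseInsensitive` setting are hypothetical; `pressedKey` stands for the `key` value of a keyboard event.

```javascript
// Hypothetical helper: compare a pressed key with the configured hotkey,
// optionally ignoring case so that "m" and "M" are treated the same.
function hotkeyMatches(pressedKey, configuredKey, caseInsensitive) {
  if (!caseInsensitive) return pressedKey === configuredKey;
  return pressedKey.toLowerCase() === configuredKey.toLowerCase();
}

console.log(hotkeyMatches("M", "m", true)); // true
console.log(hotkeyMatches("M", "m", false)); // false
```

Keeping the option off by default would preserve the current behaviour for users who bind different actions to m and M.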

Further "Reader" customizations

Hi! First of all, allow me to thank you for this marvelous extension of yours.

Now that's out of the way, I would like to request some customization options that are not already present in Nazeka:

  1. I'd like to be able to hide the "Pause" and "Delete Newest" buttons from the Reader.
  2. I'd like to be able to make the Reader's scrollbar thinner and/or hide it entirely.
  3. I'd like to be able to change the opacity of the Reader.
  4. I'd like to be able to hide the title bar of Reader window.
  5. I'd like to be able to keep the Reader always on top.

I am not sure which of those are within the abilities of an extension, so please consider releasing Nazeka as a standalone program; forgive me if I've asked anything unreasonable.

Best regards.

Mining on mobile?

I suppose it's normal mining doesn't work on mobile but is it anywhere on the road map?

Font size options

The default font size for the popup window is a bit too small for me, so I'd like to be able to change it.

Better visual cues for dictionary titles.

As it is, it's quite difficult to make out the different dictionary definitions at a glance, because dictionary titles have the same font color/size as the definitions.
Perhaps modifying the styling of the dictionary titles to make them stand out a bit more would be a great addition.

Also it would be better to have consistent styling between jmdict and epwing titles.
Currently jmdict title is stylized as such:

jmdict:<definition>

When epwing titles are stylized as such:

―<title>―
<definition>

Example:
[Screenshot: example]
