wareya / nazeka

Nazeka is a rikai replacement

Home Page: https://addons.mozilla.org/en-US/firefox/addon/nazeka/

JavaScript 91.36% HTML 5.64% Python 2.96% Shell 0.05%

nazeka's Introduction

Nazeka was the first nontrivial thing I wrote in JavaScript. The code is terrible and a lot of it is bundled into a small number of files. I need help splitting it up into multiple files, or being told how to split it up into multiple files without requiring any build tools.

Nazeka is ready for general testing; it's still missing a few behaviors from rikaisama, and it's still clunky, but it's close to complete.

Settings

See the tutorial.

Building

Build process requires python and lxml, or an existing copy of the extension to rip JMdict#.json from.

1. Download JMdict.gz from http://www.edrdg.org/jmdict/edict_doc.html
2. Convert it to JSON with etc/process.py
3. Move the output to under dict/ so that [...]/dict/JMdict1.json and others exist in that location relative to [...]/manifest.json
4. Make sure every .json file is listed as available to the extension in manifest.json - if jmdict added a lot of words, there might be more than 11 files now
5. Package as an extension for your browser of choice, or load it as a temporary/indev extension using your browser's development tools
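
The manifest check in the steps above is easy to get wrong by hand, so here is a minimal sketch of how it could be automated. It assumes (this is not confirmed by the README) that the dictionary files are listed as plain path strings under `web_accessible_resources` in manifest.json; `findUnlistedDictFiles` and the sample data are hypothetical.

```javascript
// Hypothetical helper: report dictionary files that the manifest does not expose.
// ASSUMPTION: files are listed under "web_accessible_resources" as plain path
// strings; check the real manifest.json before relying on this.
function findUnlistedDictFiles(manifest, dictFileNames) {
  const listed = new Set(manifest.web_accessible_resources || []);
  return dictFileNames
    .filter((name) => /^JMdict\d+\.json$/.test(name)) // ignore unrelated files
    .map((name) => `dict/${name}`)
    .filter((path) => !listed.has(path));
}

// Made-up example data: JMdict12.json exists on disk but is not listed.
const exampleManifest = {
  web_accessible_resources: ["dict/JMdict1.json", "dict/JMdict2.json"],
};
const missing = findUnlistedDictFiles(exampleManifest, [
  "JMdict1.json",
  "JMdict2.json",
  "JMdict12.json",
  "notes.txt",
]);
console.log(missing); // the dict/ paths that still need a manifest entry
```

In a real build this would read manifest.json and the contents of dict/ from disk instead of using inline data.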

Copyright and License

nazeka's Issues

NHK Easy furigana included in mined sentence

When mining a word in a sentence with ruby text/furigana, the furigana is included inline in the sentence.
For example, when I mine a word in the first sentence of this article the sentence field contains the following in Anki:

アメリカは、イランが石油せきゆなどを輸出できないようにする「経済制裁」を日本にっぽんの時間の５日の午後２時過ぎに始めました

It seems to only affect words that have additional styling in the article like 日本 and 石油 but not 午後.
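
One possible direction, sketched under assumptions: drop the contents of `<rt>` and `<rp>` before taking the sentence text. The function below works on an HTML string so it can run outside a browser; the extension itself presumably walks DOM nodes, so this is an illustration of the idea and not Nazeka's actual text pickup code. `stripFurigana` and the sample markup are made up.

```javascript
// Sketch only: removes ruby readings from an HTML fragment given as a string.
function stripFurigana(html) {
  return html
    .replace(/<rt\b[^>]*>[\s\S]*?<\/rt>/gi, "") // drop the readings
    .replace(/<rp\b[^>]*>[\s\S]*?<\/rp>/gi, "") // drop fallback parentheses
    .replace(/<[^>]+>/g, "");                   // drop remaining tags, keep text
}

// Hypothetical markup with extra styling inside the ruby element.
const sample =
  '<ruby><span class="x">日本</span><rt>にっぽん</rt></ruby>の時間';
console.log(stripFurigana(sample)); // → 日本の時間
```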

Some JDIC audio clips are inaccessible; the JDIC audio pack missed them

e.g. 保証

[Feature Request] Word frequency information

I found Rikaisama's option to display word frequency information a very useful feature. Knowing whether a word is in the top 10000 or 50000 can help you decide if a word is worth mining and learning.

Would it be possible to add this feature to Nazeka?

Name lookup support

Are there any plans to add name lookup through resources such as ENAMDICT/JMnedict?

With rikaichamp, it's possible to switch to name lookup by pressing the Shift key.

I think it would be very useful and really the only feature lacking when compared to rikaichamp.

TODO

A way to mine frequency information
Find a way to apply dynamic range compression to audio playback
Show a message after successfully mining a card
Finish writing deconjugation rules
Write more importers for zero-epwing json dictionaries
Mine specific single readings by clicking on them (???)
Cave in and add a lookup-seq blacklist
Option to colorize jmdict definition sense numbers
Autoupdating to new versions of JMDict
Option to not discard JMDict's arbitrary auxiliary information (need to add it to the converted dictionary)
Text editing in the reader page somehow
"Error timeout" option
Option to autosave reader size and position if possible
~~Reader line height option~~
~~Some kind of websockets-based alternative to clipboard grabbing (though clipboard grabbing will remain supported)~~ deprioritized to "would be nice if it existed"
~~Filter out hiragana-katakana duplication in reading display~~
~~Prettier in-nazeka review of mined cards (not SRS/flashcards, just a list)~~ decided bad idea
~~Make buttons on reader page go to top on reader-at-bottom mode~~
~~Find out whether the reader open but using other pages problem is a firefox bug~~ gave up and worked around it
~~Make JSON dictionary spelling rejection feature work with kana-only dictionary entries~~
~~Option to forcibly sync with ankiweb every time a new card is mined~~
~~Pause and delete buttons for the reader~~
~~Option to remove newlines from JSON definitions and replace them with colorized "commas" - customizable comma, default to ⬥?~~
~~Highlight dictionary titles (#22)~~
~~Frequency information (closes #18 as well)~~
~~Look for a way to fix #10~~
~~Better/overrideable jmdict entry ordering heuristics~~
- ~~Regression tests for jmdict entry ordering heuristics~~
~~Make it so that ascii numbers don't trigger lookups if there's no japanese characters within the lookup context~~
~~Some way to control which audio clip gets played (play the next one after the one that just got played?)~~
~~Mine with number keys~~
~~Import multiple json dictionaries at once and manage them~~
~~Option to do things with selections~~
- ~~Use selection as context~~
- ~~Use selection to override search ending position~~
~~Allow loading your own deconjugation ruleset~~
~~Fix more edge cases with text pickup from elements~~
~~Text nudging (for mobile)~~
~~Hotkeys other than opening the mining window~~
~~Warn before clearing mined cards~~
~~Mining formatting~~ (obsolete)
~~Mine audio information (probably needs interactive mining/realtime importing)~~

[Epwing] 彼処[あそこ] doesn't get matched correctly.

In jmdict, あそこ and かしこ are under the same entry, while in Kenkyuusha/Shinmeikai they are two different entries. This somehow results in かしこ being displayed instead of あそこ.

The fix would be either matching the epwing definitions to the first jmdict reading or including all epwing entries (sorted according to the jmdict order)

reading-only json dictionary entries are broken

option to highlight detected word in text line, not word position in pop-up window

Hello, I suggest adding an option to disable the display of the word position in the pop-up window and instead highlight the detected word in the text itself (like in rikai or yomichan).
Personally I like nazeka more, but I miss this feature from yomichan.
Would this be too annoying to implement?

Make more distinctions between json dictionary and jmdict when mining

I like to look at the Kenkyusha dictionary definitions and sample sentences when I'm just looking at words, but it becomes a problem when I mine, because the card ends up as one giant list. You suggested editing the cards after making them, but I usually make over 30 cards a day, so that gets tedious. An option to use only jmdict's definition (or even only a specific JSON dictionary's) in the card would be very helpful.

missorts

Nazeka has logic for forcibly fixing bad sorting of dictionary entries with per-spelling/per-word overrides now. This issue is for collecting cases that require such a special exception but do not have it yet.

~~極めて~~
どうか

[Feature] Individual Dictionary entries.

Currently all dictionary entries are bundled together.
It would be nice to add a rikaisama-like toggle for dictionaries where individual dictionaries are swapped by pressing keyboard shortcuts.

For instance the json priority list looks like this:
Kenkyuusha
Shinmeikai
Jmdict

Pressing "D" for instance swaps the pop up definition to Shinmeikai, pressing again swaps to Jmdict and pressing again swaps back to Kenkyuusha.
Pressing "Q" for example would go the other way around.

Entering "Mining UI" while "Dictionary Swap" is used would only generate the definition of the currently selected dictionary.
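
The swapping behaviour described above amounts to cycling an index through the priority list in either direction. A minimal sketch, with `nextDictionaryIndex` as a hypothetical name and the key bindings left out:

```javascript
// Priority list taken from the example in this request.
const priorityList = ["Kenkyuusha", "Shinmeikai", "Jmdict"];

// Returns the index selected after a swap key is pressed.
// direction: +1 for the "D"-style key, -1 for the "Q"-style key.
function nextDictionaryIndex(current, direction, count) {
  return (current + direction + count) % count; // wraps around at both ends
}

let selected = 0;                                                // Kenkyuusha
selected = nextDictionaryIndex(selected, +1, priorityList.length); // Shinmeikai
selected = nextDictionaryIndex(selected, +1, priorityList.length); // Jmdict
selected = nextDictionaryIndex(selected, +1, priorityList.length); // back to Kenkyuusha
console.log(priorityList[selected]); // → Kenkyuusha
```

The mining UI would then only need to read `priorityList[selected]` to decide which definition to include.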

Use case:

A definition is particularly good on Shinmeikai and I only want this one.
Always displaying every dictionary is a bit of a clutter and inefficient when you start adding up more dictionaries (Daijirin, Kojien,... when supported by nazeka_epwing_converter).

Chrome-specific issues

JSON dictionaries don't seem to work, try to figure out why
Extension barely tested in general

out-of-temporal-order lookup rejection is broken

(has to do with how async stuff works) need to use the timestamp again
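
A rough sketch of the general technique, not of Nazeka's code: tag each lookup when it starts and drop any result whose tag is no longer the newest. The note above mentions a timestamp; this sketch swaps in a monotonically increasing counter, since two lookups can share the same millisecond. `LookupGuard` is a hypothetical name.

```javascript
// Hypothetical guard: each lookup takes a ticket, only the newest ticket may render.
class LookupGuard {
  constructor() {
    this.latest = 0;
  }
  // Call when a lookup starts; returns its ticket.
  begin() {
    this.latest += 1;
    return this.latest;
  }
  // Call when a result arrives; false means a newer lookup has started since.
  isCurrent(ticket) {
    return ticket === this.latest;
  }
}

const guard = new LookupGuard();
const first = guard.begin();  // user hovers word A
const second = guard.begin(); // user moves on to word B before A resolves
console.log(guard.isCurrent(first));  // false: A's late result should be dropped
console.log(guard.isCurrent(second)); // true: B's result may be shown
```

In async code the ticket would be captured before the `await` and checked right after it, before touching the popup.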

A select-on-only fork of NAZEKA instead of a hover-on one?

Is there a way to make it? I've recently thought of a Chinese-to-Russian vocab extension called БКРС that used to show definitions just on selections, and now I find that strangely useful. Plus, it means it won't be an active crawler, so it's easier on RAM and all.

Unable to mine cards or play audio

Hello,

I was curious if anyone else has had a similar problem. I'm currently using Nazeka's live mining feature, and whenever I try to mine a word an error pops up that says this: https://i.imgur.com/1hselnq.png

It was working up until yesterday. I don't believe anything updated or anything like that, so any help would be greatly appreciated!

Shortcuts unreliable on keypress-stealing websites

Trying to use the nazeka shortcuts (mainly m(ining) and k(anji) mode) on websites which have their own keyboard shortcuts (e.g. twitter, slack, ...) fails more often than it works. I'm fine with both events firing, but in most cases only the native website's shortcut works, while nazeka's fires only about 30% of the time. It's pretty random.

"Popup requires key" setting sometimes gets ignored

I have the popup key set to shift, but sometimes nazeka ignores this and behaves according to the default settings (no popup key). This seems to happen randomly, and refreshing the page always fixes the problem.
When this bug does appear, it doesn't affect all of my open tabs, it only happens on an individual tab basis.

left/right stops working with sticky mode if you mouse over undetectable text

Crawl/mine within <ruby>s

As I've already pointed out, I have a 5K FREQ web page (which I've now made lighter): https://yadi.sk/d/RzLoothMJwfupw. I've also put borders around the words' readings. To make things perfect, it would be nice to hover over these words and have their audio played. I think the way to do this is to give the extension the ability to crawl within HTML ruby tags.

I may be missing things. If it's already a feature, please point it out.

[Feature Request] Add a "times mined" column to mining

The current default behaviour is to add a duplicate of the same word if it is mined again. It would be more useful to add a column that shows the number of times an entry has been mined, so that a word encountered more often can be learnt at a much higher priority.
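
To make the request concrete, here is a small sketch of counting repeat mines instead of storing duplicates. The `Map`-based store and the function names are hypothetical and say nothing about how Nazeka keeps its mined cards.

```javascript
// Hypothetical in-memory store: word -> number of times it has been mined.
function recordMined(counts, word) {
  const times = (counts.get(word) || 0) + 1;
  counts.set(word, times);
  return times;
}

// Words mined most often first, so they can be prioritised for study.
function byTimesMined(counts) {
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const counts = new Map();
recordMined(counts, "保証");
recordMined(counts, "煙草");
recordMined(counts, "保証"); // mined again: the count goes up, no duplicate row
console.log(byTimesMined(counts));
```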

Search within "Add Subtitles" extension's div which is within another div?

I feel like an idiot already, but could you add that feature? Could you also re-write a z-index for that? I mean, I've discovered a hidden holographical reality logos structure since last time I've spammed your issues sections, but this one seems actually urgent.

Doesn't work on android in reader view mode

firefox version: 59.0.2
nazeka version: 0.1.8

Display individual kanji information

Today I found this extension and I am very happy with it. THANK YOU VERY MUCH for developing this project.

I would like to make a feature request. This is a Rikai replacement, but it lacks a very important feature of Rikai. In all desktop versions of the Rikai mods, if you press the Shift key when the mouse pointer is over a kanji, it opens the kanji information panel (see the attached screenshot). That panel lists a lot of information about the kanji (onyomi, kunyomi, stroke count, radicals, meaning, etc.). For example, if you hover your mouse over the word "妖怪", it recognizes the word as a whole and at first doesn't list the translation of the first kanji "妖". However, if you press the Shift key, the kanji information panel opens and gives information about it.

You should really implement this feature. All Japanese learners need it. This is what makes Rikai so popular, and unfortunately all other Rikai replacements lack this feature. Nazeka can make a difference by implementing it.

Please consider my idea.

Many thanks.

"selection context" option can lock up page if selection is extremely large

[MINING][EPWING] Remove entry/reading from the definition

Having the entry/reading in the definition defeats the purpose of Anki reviewing.

For instance, the first block below should be mined as the second block, without the headword/reading line:

―研究社　新和英大辞典　第５版―
やさしい【易しい】
〔容易な〕 easy; simple; light; 〔平易な〕 plain.
►ごくやさしい　as easy as ABC
►ここで手を貸すのはやさしいが, それでは本人のためにならない.　It would be easy enough to help him, but that would not be in his interests.
►(表現を)やさしくする　simplify 《English》; paraphrase in plain language

―研究社　新和英大辞典　第５版―
〔容易な〕 easy; simple; light; 〔平易な〕 plain.
►ごくやさしい　as easy as ABC
►ここで手を貸すのはやさしいが, それでは本人のためにならない.　It would be easy enough to help him, but that would not be in his interests.
►(表現を)やさしくする　simplify 《English》; paraphrase in plain language

"position as though fixed width" no longer works

solutions:

make another wrapper div with no background
rewrite stuff

[Feature Request]Add an extension shortcut to enable/disable nazeka

Strict matching should short-circuit for strings three-or-more characters long

Can't install on Android

Hi,

I can't figure out how to install on Android. I tried with the latest version as well as with various v68 and previous versions, both with Fennec F-Droid as well as old versions of official Firefox from the archives. The blue button to add the addon is just greyed out. Any help would be appreciated.

positioning broken on pages with <body> using position:relative and margin:auto for centering

maybe use window.scrollX and document.body.getClientRects() to fix

Unable to download audio files

I understand that not all words have audio but it appears none of them do. When using the "audio_anki" field I'm getting the following error, every time:
AnkiConnect mining failed: 'MediaManager' object has no attribute 'syncDelete'

I am using the latest version of Ankiconnect and connected to the internet.

Clipboard Grabber/Reader

Would it be possible to bring the Clipboard Grabber/Reader to chrome? If so do you have any plans on doing so?

Disable text lookup while sticky search is enabled.

Most of the time, the reason I enable sticky search is to scroll through a definition, but words get looked up while I'm trying to reach the pop-up window.

Note: Applying a pop up key obviously solves this.

Manual Tags and Readings

Two small improvements that I would appreciate. I apologize if these things are already possible and I just didn't find them in the options.

1) Option to automatically have nazeka add an Anki tag other than the default "nazeka" one
I use the tagging system in Anki extensively. As of now I have to manually add a specific tag to mined words; it would be quite helpful if I could just tell nazeka to add one in addition to the default.

2) Option to only mine the reading chosen in the mining UI
Nazeka adds all possible readings, but usually I want to drill only a single reading per card, so I have to manually remove the others. It would be nice if I could just choose the reading I want right away.

Nothing major but these would make my life easier. You're doing great work by the way, thank you very much.

EPWING Support on mobile?

Tapping the "manage JSON Dictionaries" button doesn't do anything.

Live Mining

Not exactly an issue with the extension itself, but there is nowhere else to post.

Could you please make a more detailed "Live Mining for dummies" tutorial?

I have AnkiConnect and checked that it's connecting properly, but I can't figure out how to configure the Anki deck and the live mining options so that mining works.

Mining not working since latest AnkiConnect update

Specifically Nazeka doesn't seem to detect Anki. I haven't changed anything since the last time I used it but mining has suddenly stopped working today. I noticed that AnkiConnect also updated today so I thought it might be that, but it might be just me.

positioning bugs

popup goes off bottom instead of top when alignment is set to bottom left/right and popup is too tall
popup should become flush with the top or bottom of the screen when it's too tall to be flipped/pushed
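
One reading of the two points above, written as a pure function so it can be tested without a browser. All names and the coordinate convention (viewport coordinates, y growing downwards) are assumptions made for this sketch; it is not the extension's positioning code.

```javascript
// Returns the popup's top edge in viewport coordinates.
// anchorY: y of the hovered text; popupH: popup height; viewH: viewport height;
// alignBottom: true when the popup's bottom edge is attached to the anchor.
function popupTop(anchorY, popupH, viewH, alignBottom) {
  if (popupH >= viewH) {
    // Too tall to fit at all: sit flush with the edge on the alignment side,
    // so bottom-aligned popups overflow off the top, not the bottom.
    return alignBottom ? viewH - popupH : 0;
  }
  const top = alignBottom ? anchorY - popupH : anchorY;
  // Otherwise push the popup back inside the viewport.
  return Math.min(Math.max(top, 0), viewH - popupH);
}

console.log(popupTop(100, 300, 600, true));  // → 0 (pushed down to the top edge)
console.log(popupTop(500, 300, 600, false)); // → 300 (pushed up to fit)
console.log(popupTop(100, 800, 600, true));  // → -200 (flush with the bottom)
console.log(popupTop(100, 800, 600, false)); // → 0 (flush with the top)
```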

Add settings import/export

Firefox keeps clearing my settings because of Firefox bugs involving temporary/debug extensions, so I have to do this for my own sanity.

Font size problem

Nazeka sometimes uses the font size of a character instead of the selected font size in its options, you can try it out with: Font size test.txt (just rename the .txt as .html. Apparently Github doesn't support uploading HTML files.)

The displayed text size is also getting affected when one zooms in/out a page, but I think it shouldn't.

Enhancement - Future EPWING support and Kenkyusha

Something I've always wished from any pop up dictionary is a good support for the kenkyusha waei dictionary.
The best support at the moment is still rikaisama and a lot of people stick to firefox 57/waterfox/palemoon... for the sole purpose of rikaisama and its regex deletion.

It's a good occasion to improve on rikaisama's EPWING feature (especially for kenkyusha, due to its large number of examples and whatnot).

For the moment with rikaisama and its regex feature we can clean up the definition field BUT in this process we're forced to delete all the examples.
I've been trying to find a fix using regex, but I'm not competent enough to make a breakthrough. Here's what I've figured out so far.

Example of an entry :

まにあう【間に合う】 ﾛｰﾏ(maniau)
1 〔時間に遅れない〕 be in time 《for…》.
▲7 時の列車に間に合う　catch [make] the 7 o'clock train
・締め切りに間に合う　meet the deadline
・開演に間に合う　arrive before curtain time
▲9 時の札幌行きに間に合うように空港に着いた.　I arrived in time for the nine o'clock flight to Sapporo.
・「間に合うかな」「走っても間に合いそうにないね」　"Will we be in time?"―"It doesn't look like we'll be in time even if we run."
2 〔役に立つ〕 answer [serve, suit, meet] the purpose; be useful; be serviceable; be of use [service]; be good enough; 〔十分である〕 be enough; 〔用意ができる〕 be ready; 〔必要をみたす〕 meet the requirements; serve the [one's] turn [need].
▲「費用はどのぐらいかな」「5 万もあれば間に合うよ」　"And what is the expense?"―"Fifty-thousand yen should cover it."
・これだけあれば丸 1 年は間に合う.　This will last us [see us through] one whole year. ｜ This will be enough for a whole year.

Where all entries starting with "▲" or "・" are examples and all entries matching this regex are definitions :

Regular expression that matches everything that is not a definition :
\n[^″*〖〈《⇒＝➡【〔(〜A-Za-z0-9].*

Regular expression that matches definitions+one line below :
\n[″*〖〈《⇒＝➡【〔(〜A-Za-z0-9].*\n.*

The perfect result should look like this (keeping one example for each definition):

まにあう【間に合う】 ﾛｰﾏ(maniau)
1 〔時間に遅れない〕 be in time 《for…》.
▲7 時の列車に間に合う　catch [make] the 7 o'clock train
2 〔役に立つ〕 answer [serve, suit, meet] the purpose; be useful; be serviceable; be of use [service]; be good enough; 〔十分である〕 be enough; 〔用意ができる〕 be ready; 〔必要をみたす〕 meet the requirements; serve the [one's] turn [need].
▲「費用はどのぐらいかな」「5 万もあれば間に合うよ」　"And what is the expense?"―"Fifty-thousand yen should cover it."
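
A single regex struggles here because "keep the first example, drop the rest" needs state. A line-by-line pass is one way around that. The sketch below relies only on the rule stated earlier in this post (lines starting with "▲" or "・" are examples); the function name is hypothetical and the sample entry is abbreviated from the one above.

```javascript
// Lines starting with one of these markers are treated as examples.
const EXAMPLE_MARKERS = ["▲", "・"];

function keepFirstExamplePerDefinition(entry) {
  const kept = [];
  let exampleAlreadyKept = false;
  for (const line of entry.split("\n")) {
    const isExample = EXAMPLE_MARKERS.some((m) => line.startsWith(m));
    if (!isExample) {
      kept.push(line);            // headword or definition line: always keep
      exampleAlreadyKept = false; // the next example belongs to this line
    } else if (!exampleAlreadyKept) {
      kept.push(line);            // first example after a definition
      exampleAlreadyKept = true;
    }
    // any further examples for the same definition are dropped
  }
  return kept.join("\n");
}

// Abbreviated version of the entry shown above.
const sampleEntry = [
  "まにあう【間に合う】 ﾛｰﾏ(maniau)",
  "1 〔時間に遅れない〕 be in time 《for…》.",
  "▲7 時の列車に間に合う　catch [make] the 7 o'clock train",
  "・締め切りに間に合う　meet the deadline",
  "2 〔役に立つ〕 be useful; 〔十分である〕 be enough.",
  "▲「費用はどのぐらいかな」「5 万もあれば間に合うよ」",
  "・これだけあれば丸 1 年は間に合う.",
].join("\n");

console.log(keepFirstExamplePerDefinition(sampleEntry));
```

Note that an example placed directly under the headword line would also be kept, since the headword is treated like any other non-example line.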

tldr : adding support for kenkyusha while keeping only 1 example for each definition would be a godsend for anyone learning Japanese and will finally achieve a breakthrough in the dictionary pop up app nonsense.

[Feature Request] Parse reading and spelling from json dictionary instead of jmdict

Jmdict readings/spelling are often cluttered and unnecessarily long while json dictionaries tend to be more concise and useful.
De-cluttering those fields is especially useful for Live Mining.

Example:
Jmdict:

煙草、莨、烟草
たばこ(gikun)、えんそう、けぶりぐさ、けむりぐさ、タバコ

Kenkyuusha:

煙草
たばこ

Case-insensitive hotkeys

As is, Nazeka uses case-sensitive hotkeys. So while m works as intended for mining, M doesn't (or vice versa, if you set the hotkey to M instead of m). I find this undesirable. Is it possible to make the hotkeys case-insensitive, optionally or otherwise?
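
For illustration, a case-insensitive comparison could look like the sketch below. It assumes the hotkey is compared against `KeyboardEvent.key`, which may not be how Nazeka reads keys; `hotkeyMatches` and its `caseInsensitive` flag are hypothetical.

```javascript
// event is anything shaped like a KeyboardEvent (only .key is read).
function hotkeyMatches(event, configuredKey, caseInsensitive = true) {
  if (!caseInsensitive) {
    return event.key === configuredKey;
  }
  return event.key.toLowerCase() === configuredKey.toLowerCase();
}

console.log(hotkeyMatches({ key: "M" }, "m"));        // → true
console.log(hotkeyMatches({ key: "M" }, "m", false)); // → false
```

Keeping the flag would let the behaviour stay optional, as the request suggests.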

Further "Reader" customizations

Hi! First of all, allow me to thank you for this marvelous extension of yours.

Now that that's out of the way, I would like to request some customization options that are not already present in Nazeka:

I'd like to be able to hide the "Pause" and "Delete Newest" buttons from the Reader.
I'd like to be able to make the Reader's scrollbar thinner and/or hide it entirely.
I'd like to be able to change the opacity of the Reader.
I'd like to be able to hide the title bar of Reader window.
I'd like to be able to keep the Reader always on top.

I am not sure which of those are within the abilities of an extension so please ~~consider releasing Nazeka as a standalone program~~ forgive me if I've asked anything unreasonable.

Best regards.

Mining on mobile?

I suppose it's normal that mining doesn't work on mobile, but is it anywhere on the roadmap?

Large(?) document lag

I've converted a Japanese frequency dictionary to a 5 MB HTML file just to use with nazeka. Here's the link to it: https://yadi.sk/d/rdkdanephAZQKA (The font in this is ackaisyo, you can get it from https://github.com/Alex1166/marusha/blob/master/assets/fonts/ACKaisyo.ttf)

The thing is it becomes unworkably laggy when hovering.

Is there any way to fix this?

Font size options

The default font size for the popup window is a bit too small for me, so I'd like to be able to change it.

audio mining is broken

I broke the mobile arrows again

Better visual cues for dictionary titles.

As it is, it's quite difficult to tell different dictionary definitions apart at a glance, because dictionary titles have the same font color/size as the definitions.
Perhaps modifying the styling of the dictionary titles to make them stand out a bit more would be a great addition.

Also it would be better to have consistent styling between jmdict and epwing titles.
Currently the jmdict title is styled like this:

jmdict:<definition>

while epwing titles are styled like this:

―<title>―
<definition>

Example: