Code Monkey home page Code Monkey logo

vocabulary2anki.py's Introduction

README

What is vocabulary2anki?

vocabulary2anki是一款用python3写的小工具,用来批量查询英文单词的意义,然后以anki可以倒入的方式输出.

它的特点在于可配置性,理论上通过配置 vocabulary2anki.cfg 可以从任意的json/xml格式的字典接口中查询意义.

为了防止单个字典接口存在单词量不足的情况,允许你一次配置多个字典接口. (默认配置了有道和百词斩连个接口)

若同一个单词在多个字典接口都能查到意义,则以配置文件中靠后配置的那个为准.

Usage

./vocabulary2anki.py -h
usage: vocabulary2anki.py [-h] [-o dest_file] [--fmt format]
                          [source_file [source_file ...]]

查询单词意义,并以anki可以导入的方式输出

positional arguments:
  source_file   存放单词的源文件,每行一个词,如果不填,则会从标准输入读取单词

optional arguments:
  -h, --help    show this help message and exit
  -o dest_file  存放结果的目标文件,如果省略,则会写入到标准输出中
  --fmt format  指定输出的格式,其中{word},{mean},{accent},{sentence_en},{sentence_cn},{
                img},{mp3}会被替换,默认值为"{word}|{mean}|{accent}|{sentence_en}|{sent
                ence_cn}|{img}|{mp3}"

example

假设我有一个名为 vocabulary.txt 的文件,里面包含了一些不认识的单词:

vim
emacs

我要查询这两个单词的意义,英文例句与中文例句,并用 | 作为分隔符写入到 anki.txt 中,那么可以执行:

./vocabulary2anki.py --fmt "{word}|{mean}|{accent}|{sentence_cn}" -o anki.txt vocabulary.txt

则会发现程序新生成了一个 anki.txt 文件,其内容为:

vim|n.精力,活力|He's such an energetic guy, full of <b>vim</b>, vigor and spunk!|
emacs|n. 编辑器,编译与测试;功能强大的编辑环境||

其中 vim 在有道和百词斩中都能查到意思,由于百词斩的接口比较靠后,因此使用的是百词斩中的意思,而英文例句和中文例句在有道接口中是没有的,因此使用了百词斩中的内容.

emacs 只在有道中能查询到释义,在百词斩中查不到,因此这是使用的是来自有道接口中的释义,而中英文例句则为空.

How to Config

下面是一个例子:

[youdao]
type = xml
url = http://dict.youdao.com/fsearch?client=deskdict&keyfrom=chrome.extension&pos=-1&doctype=xml&xmlVersion=3.2&dogVersion=1.0&vendor=unknown&appVer=3.1.17.4208&le=eng&q={}
means = .//custom-translation/translation/content
accent = .//phonetic-symbol
# mp3 = http://dict.youdao.com/dictvoice?audio={}&type=2
[baicizhan]
type = json
url = http://mall.baicizhan.com/ws/search?w={}
means = mean_cn
mean_sep = ;\s+
accent = accent
sentence_en = st
sentence_cn = sttr
mp3 = http://baicizhan.qiniucdn.com/word_audios/{}.mp3
img = img

配置项只能是 type, url, means, mean_sep, accent, sentence_en, sentence_cn, mp3, img 这几个.

其他的配置项会让程序出错!

其中:

type表示字典接口的格式,目前只支持json和xml两种格式的接口
url表示字典接口的URL,其中的 {} 会被替换成待查询的单词
means告诉程序,字典接口中那部分内容为释义,若接口为xml则它的值应该为一个xpath;若接口为json,则它的值是一个类似 a.b.c 这样的路径
accent类似means,但是表示读音
sentence_en类似means,但是表示英文例句
sentence_cn类似means,但是表示中文例句
mp3类似means,但是表示读音mp3的下载地址,且可以是一个类似url的字符串
img类似mp3,但是表示图片的下载地址

vocabulary2anki.py's People

Contributors

lujun9972 avatar ccsheller avatar dependabot-preview[bot] avatar

Stargazers

 avatar lemoo avatar kaka avatar 枫佑翼 avatar  avatar YantaoZhao avatar  avatar  avatar Md Mujtaba Raza avatar  avatar Kelvien Lee avatar  avatar Gamoseimei avatar  avatar Hugh Gao avatar Ninja Huang avatar

Watchers

 avatar  avatar  avatar

vocabulary2anki.py's Issues

Dependabot couldn't find a requirements.txt for this project

Dependabot couldn't find a requirements.txt for this project.

Dependabot requires a requirements.txt to evaluate your project's current Python dependencies. It had expected to find one at the path: /requirements.txt.

If this isn't a Python project, or if it is a library, you may wish to disable updates for it from within Dependabot.

You can mention @dependabot in the comments below to contact the Dependabot team.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.