cjklib's Introduction

===========================
Installing and Using Cjklib
===========================

.. contents::

Introduction
============
Cjklib provides language routines related to Han characters (characters based
on Chinese characters, named Hanzi, Kanji, Hanja and chu Han respectively) used
in writing Chinese, Japanese, infrequently Korean and formerly Vietnamese.
Functionality is included for character pronunciations, radicals, glyph
components, stroke decomposition and variant information.

Dependencies
============
- Python_ 2.4 or above (currently no support for Python 3)
- SQLite_ 3+
- SQLAlchemy_ 0.4.8+
- pysqlite2_ (already ships with Python 2.5 and above)

Alternatively, to use MySQL as the backend:

- MySQL_ 5+
- MySQL-Python_

.. _Python: http://www.python.org/download/
.. _SQLite: http://www.sqlite.org/download.html
.. _MySQL: http://www.mysql.com/downloads/mysql/
.. _SQLAlchemy: http://www.sqlalchemy.org/download.html
.. _pysqlite2: http://code.google.com/p/pysqlite/downloads/list
.. _MySQL-Python: http://sourceforge.net/projects/mysql-python/

Installing
==========

Windows
-------
Install cjklib using the provided ``.exe`` installer. Make sure the
dependencies listed above are satisfied.

Three scripts ``cjknife.exe``, ``buildcjkdb.exe``, and ``installcjkdict.exe``
will be added to the Python ``Scripts`` sub-directory. Make sure this directory
is included in your ``PATH`` environment variable to access these programs from
the command line.

CJK dictionaries are not included by default. To install one, run the
following (an Internet connection is required) from the root directory of the
source package::

    $ installcjkdict CEDICT

This will download CEDICT, create a SQLite database file and install it under
the directory given by the ``APPDATA`` environment variable, e.g.
``C:\windows\profiles\MY_USER\Application Data\cjklib``. To install another
supported dictionary (EDICT, CEDICT, HanDeDict, CFDICT or CEDICTGR), replace
``CEDICT`` with its name.

DEB or RPM based systems
------------------------
Packages are available from the
`project's download page <http://code.google.com/p/cjklib/downloads/list>`_.
An Ubuntu package is available from a
`personal package archive <https://launchpad.net/~cburgmer/+archive/ppa>`_.
Install from the provided .deb or .rpm package. See below for installing
dictionaries.

Linux
-----
If you are installing from the source package you need to deploy the library on
your system::

    $ sudo python setup.py install

Also make sure the dependencies listed above are satisfied. CJK dictionaries
are not included by default. To install one, run the following (an Internet
connection is required)::

    $ sudo installcjkdict CEDICT

This will download CEDICT, create a SQLite database file and install it to
``/usr/local/share/cjklib``. To install another supported dictionary (EDICT,
CEDICT, HanDeDict, CFDICT or CEDICTGR), replace ``CEDICT`` with its name.


Documentation & Usage
=====================
Documentation_ is available online. Also see the `project page`_ and its wiki.
There is a small command line tool ``cjknife`` that offers some of the library's
functions. See ``cjknife --help`` for an overview.

.. _Documentation: http://cjklib.org/
.. _project page: http://code.google.com/p/cjklib/

Examples
--------

- Get stroke order of characters::

    >>> from cjklib import characterlookup
    >>> cjk = characterlookup.CharacterLookup('C')
    >>> cjk.getStrokeOrder(u'说')
    [u'㇔', u'㇊', u'㇔', u'㇒', u'㇑', u'㇕', u'㇐', u'㇓', u'㇟']

- Access a dictionary (here using Jim Breen's EDICT)::

    >>> from cjklib.dictionary import EDICT
    >>> d = EDICT()
    >>> d.getForTranslation('Tokyo')
    [EntryTuple(Headword=u'東京', Reading=u'とうきょう',
    Translation=u'/(n) Tokyo (current capital of Japan)/(P)/')]


Database
========
Packaged versions of the library will ship with a pre-built SQLite database
file. You can however easily rebuild the database yourself.

First download the newest Unihan file::

    $ wget ftp://ftp.unicode.org/Public/UNIDATA/Unihan.zip

Then start the build process::

    $ sudo buildcjkdb -r build cjklibData

SQLite
------
SQLite by default has no Unicode support for string operations. Optionally, the
ICU library can be compiled in to handle alphabetic non-ASCII characters.
Cjklib can register its own Unicode functions if ICU support is missing. Queries
with ``LIKE`` will then use the function ``lower()``. This compatibility mode
hurts performance, and because it is not needed for dictionaries like EDICT or
CEDICT, it is disabled by default. See ``cjklib.conf`` for how to enable it.
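
As a rough illustration of what such a compatibility mode involves, the sketch
below registers a Unicode-aware ``lower()`` on a connection and uses it together
with ``LIKE``. It relies only on Python's standard ``sqlite3`` module and is
written for Python 3; it is not cjklib's own code, and the table ``entry`` and
its column ``reading`` are made up for the example.

```python
import sqlite3


def unicode_lower(value):
    """Lower-case a string using Python's Unicode rules; pass NULL through."""
    return value.lower() if value is not None else None


conn = sqlite3.connect(":memory:")
# Overrides SQLite's built-in lower() for this connection only.
conn.create_function("lower", 1, unicode_lower)

# Hypothetical table, used only for this illustration.
conn.execute("CREATE TABLE entry (reading TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?)",
    [(u"L\u00dc4",), (u"l\u00fc4",), (u"lu4",)],
)

# Both sides are lower-cased, so the match no longer depends on the case
# of the non-ASCII letter.
matches = conn.execute(
    "SELECT reading FROM entry WHERE lower(reading) LIKE lower(?)",
    (u"L\u00dc%",),
).fetchall()
print(len(matches))
```

Calling a Python function for every row is presumably where the performance
cost mentioned above comes from.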

MySQL
-----
With MySQL 5 the following ``CREATE`` command creates a database with ``utf8``
as character set and the binary collation ``utf8_bin``
(MySQL 5.5.3 and later support full Unicode with character set
``utf8mb4`` and collation ``utf8mb4_bin``)::

    CREATE DATABASE cjklib DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;

You might need to set access rights, too (substitute ``user_name`` and
``host_name``)::

    GRANT ALL ON cjklib.* TO 'user_name'@'host_name';

Now update the settings in ``cjklib.conf``.

MySQL < 5.5 doesn't support full UTF-8 and uses a variant limited to 3 bytes
per character, so characters outside the Basic Multilingual Plane (BMP) can't
be encoded. Building the Unihan database might therefore produce warnings, and
characters above U+FFFF can't be built at all. In that case, disable building
the full character range by setting ``wideBuild`` to ``False`` in
``cjklib.conf`` before building, or pass ``--wideBuild=False`` to
``buildcjkdb``.
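
To see which characters are affected, the following sketch (plain Python 3, no
cjklib involved; the helper name ``isOutsideBmp`` is made up here) checks
whether a character lies above U+FFFF and how many bytes its UTF-8 encoding
needs.

```python
def isOutsideBmp(char):
    """Return True if the single character lies above U+FFFF."""
    return ord(char) > 0xFFFF


bmpChar = u"\u8bf4"        # U+8BF4, inside the BMP
wideChar = u"\U00020000"   # U+20000, CJK Unified Ideographs Extension B

for char in (bmpChar, wideChar):
    # Characters above U+FFFF take 4 bytes in UTF-8, one more than the
    # 3-byte limit described above.
    print("U+%04X" % ord(char), isOutsideBmp(char), len(char.encode("utf-8")))
```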


Contact
=======
For help or discussions on cjklib, join the `cjklib-devel group
<http://groups.google.com/group/cjklib-devel>`_.

Please report bugs to the `project's bug tracker
<http://code.google.com/p/cjklib/issues/list>`_.

cjklib's Issues

Frequency data for Characters and Readings

Frequency data is important in various applications of linguistic data,
e.g. sorting or searching. For CJK there exist several sources of
frequency data built from large corpora. Because the choice of corpus
strongly influences the calculated frequencies, cjklib should not focus on a
single corpus but provide a general scheme that lets the user select an
appropriate source.

Possible sources:
  - Unihan for reading frequencies
  - GPL Pinyin frequencies, http://technology.chtsai.org/syllable/
  - Jun Da's lists (http://lingua.mtsu.edu/chinese-
  - Frequencies for Chinese, http://technology.chtsai.org/charfreq/ (unclear
    license)

cjklib is LGPL and should stay this way. Mixing in non-commercial licenses
is not possible and even GPL sources should be considered carefully. The
data doesn't necessarily need to be shipped, though: a TableBuilder can be
created that lets the user add the data later, if requested.

CharacterDomains that are already implemented could be considered a
similar feature. They depend on defining sources and are offered through a
consistent abstraction. Frequency data could thus be implemented in a
similar fashion.
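
One way to picture such a scheme is sketched below. Every name in it
(``CorpusFrequencySource``, ``registerSource``, ``getCharacterFrequency``) is
hypothetical and not part of cjklib; the sample text is a toy input, not a real
corpus. Written for Python 3.

```python
from collections import Counter


class CorpusFrequencySource(object):
    """Hypothetical frequency source backed by a plain-text corpus."""

    def __init__(self, name, text):
        self.name = name
        # Count only CJK Unified Ideographs (U+4E00..U+9FFF) for simplicity.
        self._counts = Counter(
            c for c in text if u"\u4e00" <= c <= u"\u9fff")
        self._total = sum(self._counts.values())

    def getCharacterFrequency(self, char):
        """Relative frequency of char in this corpus (0.0 if unseen)."""
        if not self._total:
            return 0.0
        return self._counts[char] / float(self._total)


# Registry keyed by source name, so the caller picks the corpus explicitly.
sources = {}


def registerSource(source):
    sources[source.name] = source


# Toy input: five characters, two of which are U+4F60.
registerSource(CorpusFrequencySource(
    "sample", u"\u4f60\u597d\u4f60\u597d\u5417"))
freq = sources["sample"].getCharacterFrequency(u"\u4f60")
print(freq)
```

Keeping each corpus behind the same small interface means no single source has
to be shipped or preferred by the library.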

Original issue reported on code.google.com by [email protected] on 13 Aug 2009 at 9:36

Crash with CFDICT

What steps will reproduce the problem?
This code
from cjklib.dictionary import *

d = CFDICT()
result = d.getFor(u'哪儿')

What is the expected output? What do you see instead?
I get this error instead of the translation:
File "C:\Python27\lib\site-packages\cjklib-0.3.2-py2.7.egg\cjklib\dictionary\format.py", line 151, in format
    for idx, entity in enumerate(reading.split(' ')):
AttributeError: 'NoneType' object has no attribute 'split'

What version of the product are you using? On what operating system?
0.3.2
Windows XP


Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 25 Sep 2013 at 11:15
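
The traceback in the CFDICT report shows ``reading`` being ``None`` when
``split`` is called. A defensive guard could look like the sketch below. The
helper ``splitReading`` is hypothetical and not the code in ``format.py``, and
treating a missing reading as "no entities" is an assumption about the desired
behaviour.

```python
def splitReading(reading):
    """Split a reading string into entities; a missing reading yields []."""
    if reading is None:
        # Assumption: an entry without a reading has no entities to format.
        return []
    return reading.split(' ')


print(splitReading(None))
print(splitReading(u"na3 r5"))
```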

Command line tools not working after install

I am trying to install version 0.3.2 on OSX Mountain Lion. 

I run setup.py as directed, and while it now shows up as a module in the Python
interpreter, I can't seem to use any of the command line tools (buildcjkdb,
installcjkdict, cjknife). This would not be a big issue, as the module seems to
work, but it means I cannot install any dictionary.

For good measure here is the output from the installer. While there are a few 
warnings, it seems to run successfully:

Cmeret:cjklib-0.3.2 2 colinmeret$ sudo python setup.py install
running install
running bdist_egg
running egg_info
writing requirements to cjklib.egg-info/requires.txt
writing cjklib.egg-info/PKG-INFO
writing top-level names to cjklib.egg-info/top_level.txt
writing dependency_links to cjklib.egg-info/dependency_links.txt
writing entry points to cjklib.egg-info/entry_points.txt
reading manifest file 'cjklib.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'cjklibdoc/_build'
warning: no files found matching 'cjklibdoc/_templates/*'
warning: no files found matching 'cjklibdoc/_static/*'
writing manifest file 'cjklib.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.8-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib
creating build/lib/cjklib
copying cjklib/__init__.py -> build/lib/cjklib
copying cjklib/characterlookup.py -> build/lib/cjklib
copying cjklib/cjknife.py -> build/lib/cjklib
copying cjklib/dbconnector.py -> build/lib/cjklib
copying cjklib/exception.py -> build/lib/cjklib
copying cjklib/util.py -> build/lib/cjklib
creating build/lib/cjklib/reading
copying cjklib/reading/__init__.py -> build/lib/cjklib/reading
copying cjklib/reading/converter.py -> build/lib/cjklib/reading
copying cjklib/reading/operator.py -> build/lib/cjklib/reading
creating build/lib/cjklib/dictionary
copying cjklib/dictionary/__init__.py -> build/lib/cjklib/dictionary
copying cjklib/dictionary/entry.py -> build/lib/cjklib/dictionary
copying cjklib/dictionary/format.py -> build/lib/cjklib/dictionary
copying cjklib/dictionary/install.py -> build/lib/cjklib/dictionary
copying cjklib/dictionary/search.py -> build/lib/cjklib/dictionary
creating build/lib/cjklib/build
copying cjklib/build/__init__.py -> build/lib/cjklib/build
copying cjklib/build/builder.py -> build/lib/cjklib/build
copying cjklib/build/cli.py -> build/lib/cjklib/build
creating build/lib/cjklib/test
copying cjklib/test/__init__.py -> build/lib/cjklib/test
copying cjklib/test/build.py -> build/lib/cjklib/test
copying cjklib/test/characterlookup.py -> build/lib/cjklib/test
copying cjklib/test/dictionary.py -> build/lib/cjklib/test
copying cjklib/test/readingconverter.py -> build/lib/cjklib/test
copying cjklib/test/readingoperator.py -> build/lib/cjklib/test
creating build/lib/cjklib/data
copying cjklib/data/cantoneseipainitialfinal.csv -> build/lib/cjklib/data
copying cjklib/data/cantoneseyaleinitialnucleuscoda.csv -> build/lib/cjklib/data
copying cjklib/data/cantoneseyalesyllables.csv -> build/lib/cjklib/data
copying cjklib/data/characterdecomposition.csv -> build/lib/cjklib/data
copying cjklib/data/charactershanghaineseipa.csv -> build/lib/cjklib/data
copying cjklib/data/grabbreviation.csv -> build/lib/cjklib/data
copying cjklib/data/grrhotacisedfinals.csv -> build/lib/cjklib/data
copying cjklib/data/grsyllables.csv -> build/lib/cjklib/data
copying cjklib/data/jyutpinginitialfinal.csv -> build/lib/cjklib/data
copying cjklib/data/jyutpingipamapping.csv -> build/lib/cjklib/data
copying cjklib/data/jyutpingsyllables.csv -> build/lib/cjklib/data
copying cjklib/data/jyutpingyalemapping.csv -> build/lib/cjklib/data
copying cjklib/data/kangxiradical.csv -> build/lib/cjklib/data
copying cjklib/data/kangxiradicalisolatedcharacter.csv -> build/lib/cjklib/data
copying cjklib/data/localecharacterglyph.csv -> build/lib/cjklib/data
copying cjklib/data/mandarinipainitialfinal.csv -> build/lib/cjklib/data
copying cjklib/data/pinyinbraillefinalmapping.csv -> build/lib/cjklib/data
copying cjklib/data/pinyinbrailleinitialmapping.csv -> build/lib/cjklib/data
copying cjklib/data/pinyingrmapping.csv -> build/lib/cjklib/data
copying cjklib/data/pinyininitialfinal.csv -> build/lib/cjklib/data
copying cjklib/data/pinyinipamapping.csv -> build/lib/cjklib/data
copying cjklib/data/pinyinsyllables.csv -> build/lib/cjklib/data
copying cjklib/data/radicalequivalentcharacter.csv -> build/lib/cjklib/data
copying cjklib/data/shanghaineseipasyllables.csv -> build/lib/cjklib/data
copying cjklib/data/strokeorder.csv -> build/lib/cjklib/data
copying cjklib/data/strokes.csv -> build/lib/cjklib/data
copying cjklib/data/wadegilesinitialfinal.csv -> build/lib/cjklib/data
copying cjklib/data/wadegilespinyinmapping.csv -> build/lib/cjklib/data
copying cjklib/data/wadegilessyllables.csv -> build/lib/cjklib/data
copying cjklib/data/cantoneseipainitialfinal.sql -> build/lib/cjklib/data
copying cjklib/data/cantoneseyaleinitialnucleuscoda.sql -> build/lib/cjklib/data
copying cjklib/data/cantoneseyalesyllables.sql -> build/lib/cjklib/data
copying cjklib/data/characterdecomposition.sql -> build/lib/cjklib/data
copying cjklib/data/charactershanghaineseipa.sql -> build/lib/cjklib/data
copying cjklib/data/grabbreviation.sql -> build/lib/cjklib/data
copying cjklib/data/grrhotacisedfinals.sql -> build/lib/cjklib/data
copying cjklib/data/grsyllables.sql -> build/lib/cjklib/data
copying cjklib/data/jyutpinginitialfinal.sql -> build/lib/cjklib/data
copying cjklib/data/jyutpingipamapping.sql -> build/lib/cjklib/data
copying cjklib/data/jyutpingsyllables.sql -> build/lib/cjklib/data
copying cjklib/data/jyutpingyalemapping.sql -> build/lib/cjklib/data
copying cjklib/data/kangxiradical.sql -> build/lib/cjklib/data
copying cjklib/data/kangxiradicalisolatedcharacter.sql -> build/lib/cjklib/data
copying cjklib/data/localecharacterglyph.sql -> build/lib/cjklib/data
copying cjklib/data/mandarinipainitialfinal.sql -> build/lib/cjklib/data
copying cjklib/data/pinyinbraillefinalmapping.sql -> build/lib/cjklib/data
copying cjklib/data/pinyinbrailleinitialmapping.sql -> build/lib/cjklib/data
copying cjklib/data/pinyingrmapping.sql -> build/lib/cjklib/data
copying cjklib/data/pinyininitialfinal.sql -> build/lib/cjklib/data
copying cjklib/data/pinyinipamapping.sql -> build/lib/cjklib/data
copying cjklib/data/pinyinsyllables.sql -> build/lib/cjklib/data
copying cjklib/data/radicalequivalentcharacter.sql -> build/lib/cjklib/data
copying cjklib/data/shanghaineseipasyllables.sql -> build/lib/cjklib/data
copying cjklib/data/strokeorder.sql -> build/lib/cjklib/data
copying cjklib/data/strokes.sql -> build/lib/cjklib/data
copying cjklib/data/wadegilesinitialfinal.sql -> build/lib/cjklib/data
copying cjklib/data/wadegilespinyinmapping.sql -> build/lib/cjklib/data
copying cjklib/data/wadegilessyllables.sql -> build/lib/cjklib/data
copying cjklib/cjklib.db -> build/lib/cjklib
copying cjklib/cjklib.conf -> build/lib/cjklib
creating build/bdist.macosx-10.8-x86_64
creating build/bdist.macosx-10.8-x86_64/egg
creating build/bdist.macosx-10.8-x86_64/egg/cjklib
copying build/lib/cjklib/__init__.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib
creating build/bdist.macosx-10.8-x86_64/egg/cjklib/build
copying build/lib/cjklib/build/__init__.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/build
copying build/lib/cjklib/build/builder.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/build
copying build/lib/cjklib/build/cli.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/build
copying build/lib/cjklib/characterlookup.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib
copying build/lib/cjklib/cjklib.conf -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib
copying build/lib/cjklib/cjklib.db -> build/bdist.macosx-10.8-x86_64/egg/cjklib
copying build/lib/cjklib/cjknife.py -> build/bdist.macosx-10.8-x86_64/egg/cjklib
creating build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/cantoneseipainitialfinal.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/cantoneseipainitialfinal.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/cantoneseyaleinitialnucleuscoda.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/cantoneseyaleinitialnucleuscoda.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/cantoneseyalesyllables.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/cantoneseyalesyllables.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/characterdecomposition.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/characterdecomposition.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/charactershanghaineseipa.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/charactershanghaineseipa.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/grabbreviation.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/grabbreviation.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/grrhotacisedfinals.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/grrhotacisedfinals.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/grsyllables.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/grsyllables.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/jyutpinginitialfinal.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/jyutpinginitialfinal.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/jyutpingipamapping.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/jyutpingipamapping.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/jyutpingsyllables.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/jyutpingsyllables.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/jyutpingyalemapping.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/jyutpingyalemapping.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/kangxiradical.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/kangxiradical.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/kangxiradicalisolatedcharacter.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/kangxiradicalisolatedcharacter.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/localecharacterglyph.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/localecharacterglyph.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/mandarinipainitialfinal.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/mandarinipainitialfinal.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyinbraillefinalmapping.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyinbraillefinalmapping.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyinbrailleinitialmapping.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyinbrailleinitialmapping.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyingrmapping.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyingrmapping.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyininitialfinal.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyininitialfinal.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyinipamapping.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyinipamapping.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyinsyllables.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/pinyinsyllables.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/radicalequivalentcharacter.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/radicalequivalentcharacter.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/shanghaineseipasyllables.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/shanghaineseipasyllables.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/strokeorder.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/strokeorder.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/strokes.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/strokes.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/wadegilesinitialfinal.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/wadegilesinitialfinal.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/wadegilespinyinmapping.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/wadegilespinyinmapping.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/wadegilessyllables.csv -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/data/wadegilessyllables.sql -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/data
copying build/lib/cjklib/dbconnector.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib
creating build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary
copying build/lib/cjklib/dictionary/__init__.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary
copying build/lib/cjklib/dictionary/entry.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary
copying build/lib/cjklib/dictionary/format.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary
copying build/lib/cjklib/dictionary/install.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary
copying build/lib/cjklib/dictionary/search.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary
copying build/lib/cjklib/exception.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib
creating build/bdist.macosx-10.8-x86_64/egg/cjklib/reading
copying build/lib/cjklib/reading/__init__.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/reading
copying build/lib/cjklib/reading/converter.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/reading
copying build/lib/cjklib/reading/operator.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/reading
creating build/bdist.macosx-10.8-x86_64/egg/cjklib/test
copying build/lib/cjklib/test/__init__.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/test
copying build/lib/cjklib/test/build.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/test
copying build/lib/cjklib/test/characterlookup.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/test
copying build/lib/cjklib/test/dictionary.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/test
copying build/lib/cjklib/test/readingconverter.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/test
copying build/lib/cjklib/test/readingoperator.py -> 
build/bdist.macosx-10.8-x86_64/egg/cjklib/test
copying build/lib/cjklib/util.py -> build/bdist.macosx-10.8-x86_64/egg/cjklib
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/__init__.py to 
__init__.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/build/__init__.py to 
__init__.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/build/builder.py to 
builder.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/build/cli.py to cli.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/characterlookup.py to 
characterlookup.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/cjknife.py to 
cjknife.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/dbconnector.py to 
dbconnector.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary/__init__.py 
to __init__.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary/entry.py to 
entry.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary/format.py 
to format.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary/install.py 
to install.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/dictionary/search.py 
to search.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/exception.py to 
exception.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/reading/__init__.py to 
__init__.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/reading/converter.py 
to converter.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/reading/operator.py to 
operator.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/test/__init__.py to 
__init__.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/test/build.py to 
build.pyc
byte-compiling 
build/bdist.macosx-10.8-x86_64/egg/cjklib/test/characterlookup.py to 
characterlookup.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/test/dictionary.py to 
dictionary.pyc
byte-compiling 
build/bdist.macosx-10.8-x86_64/egg/cjklib/test/readingconverter.py to 
readingconverter.pyc
byte-compiling 
build/bdist.macosx-10.8-x86_64/egg/cjklib/test/readingoperator.py to 
readingoperator.pyc
byte-compiling build/bdist.macosx-10.8-x86_64/egg/cjklib/util.py to util.pyc
creating build/bdist.macosx-10.8-x86_64/egg/EGG-INFO
copying cjklib.egg-info/PKG-INFO -> build/bdist.macosx-10.8-x86_64/egg/EGG-INFO
copying cjklib.egg-info/SOURCES.txt -> 
build/bdist.macosx-10.8-x86_64/egg/EGG-INFO
copying cjklib.egg-info/dependency_links.txt -> 
build/bdist.macosx-10.8-x86_64/egg/EGG-INFO
copying cjklib.egg-info/entry_points.txt -> 
build/bdist.macosx-10.8-x86_64/egg/EGG-INFO
copying cjklib.egg-info/requires.txt -> 
build/bdist.macosx-10.8-x86_64/egg/EGG-INFO
copying cjklib.egg-info/top_level.txt -> 
build/bdist.macosx-10.8-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
cjklib.dbconnector: module references __file__
cjklib.util: module references __file__
cjklib.build.__init__: module references __file__
creating dist
creating 'dist/cjklib-0.3.2-py2.7.egg' and adding 
'build/bdist.macosx-10.8-x86_64/egg' to it
removing 'build/bdist.macosx-10.8-x86_64/egg' (and everything under it)
Processing cjklib-0.3.2-py2.7.egg
removing '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cjklib-0.3.2-py2.7.egg' (and everything under it)
creating /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cjklib-0.3.2-py2.7.egg
Extracting cjklib-0.3.2-py2.7.egg to /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
cjklib 0.3.2 is already the active version in easy-install.pth
Installing installcjkdict script to 
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin
Installing cjknife script to 
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin
Installing buildcjkdb script to 
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin

Installed /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cjklib-0.3.2-py2.7.egg
Processing dependencies for cjklib==0.3.2
Searching for SQLAlchemy==0.6.9
Best match: SQLAlchemy 0.6.9
Processing SQLAlchemy-0.6.9-py2.7.egg
SQLAlchemy 0.6.9 is already the active version in easy-install.pth

Using /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/SQLAlchemy-0.6.9-py2.7.egg
Finished processing dependencies for cjklib==0.3.2

Original issue reported on code.google.com by [email protected] on 7 Mar 2014 at 2:58
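
The log above shows the three scripts being installed into
``/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin``. One
possible explanation, which the report does not confirm, is that this directory
is not on the shell's ``PATH``. The sketch below checks for that; the helper
``isOnPath`` is made up for this example.

```python
import os


def isOnPath(directory, path=None):
    """Return True if directory is listed in the PATH-style string."""
    if path is None:
        path = os.environ.get("PATH", "")
    wanted = os.path.normpath(directory)
    return any(os.path.normpath(entry) == wanted
               for entry in path.split(os.pathsep) if entry)


# Directory taken from the install log above.
scriptDir = "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin"
print(isOnPath(scriptDir))
```

If this prints ``False``, adding the directory to ``PATH`` or calling the
scripts by their full path would be the next thing to try.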

chinese-english searching

Not a programming issue, but I didn't know where I should put the suggestion.
Is there a way to search for a Chinese character without knowing the
pronunciation, e.g. when reading a text not on the ds and having to look up a
character by stroke or semantic component? Thank you

Original issue reported on code.google.com by [email protected] on 22 Sep 2011 at 3:57

Extend decomposition schema

Some characters are decomposed in such a way that the sequential order of
components cannot be used to derive the character's stroke order. This
involves layouts ⿴ and ⿻. This is currently solved by supplying the
complete stroke order of the said character.

Character 乘, for example, can be decomposed into ⿻禾北, where strokes 丿,
㇐, ㇑ from the first character 禾 are followed by strokes ㇐, ㇑, ㇀ from the
left-hand side of 北, then by the right-hand side, namely 匕, and finally by
㇒, ㇏ from the rest of 禾; see
http://kanjivg.tagaini.net/Kanji/木.

This decomposition into parts and strokes carries more information than the
current scheme can supply. In general, these special decompositions add
complexity on both the data side and the implementation side.

Ponder useful methods and a reasonable representation on the data side,
and integrate that into cjklib.

Original issue reported on code.google.com by [email protected] on 26 Jun 2009 at 9:16
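
As one possible representation for the interleaved case described in this
issue, the sketch below mixes single strokes and whole components in one ordered
list. It is hypothetical: the tuple format and the ``flatten`` helper are not
part of cjklib, and the stroke sequence is copied from the issue text rather
than verified independently.

```python
# Each step is (kind, value): a single stroke, or a component whose own
# stroke order would be looked up separately.
STROKE, COMPONENT = "stroke", "component"

# Hypothetical interleaved description of the character discussed above,
# following the order given in the issue text.
decomposition = [
    (STROKE, u"丿"), (STROKE, u"㇐"), (STROKE, u"㇑"),   # first part of 禾
    (STROKE, u"㇐"), (STROKE, u"㇑"), (STROKE, u"㇀"),   # left side of 北
    (COMPONENT, u"匕"),                                   # right side of 北
    (STROKE, u"㇒"), (STROKE, u"㇏"),                     # rest of 禾
]


def flatten(steps, lookup):
    """Expand components via lookup(component) into one stroke sequence."""
    strokes = []
    for kind, value in steps:
        if kind == STROKE:
            strokes.append(value)
        else:
            strokes.extend(lookup(value))
    return strokes


# Placeholder lookup that leaves components unexpanded; a real one would
# return the component's stroke order.
sequence = flatten(decomposition, lambda component: [component])
print(len(sequence))
```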

.exe installer for MS Windows not available

What steps will reproduce the problem?
1. Go to "http://code.google.com/p/cjklib/downloads/list"
2. Find the .exe installer/package that contains it for MS Windows
3.

What is the expected output? What do you see instead?
Expected to find .exe installer/package that contains it for MS Windows. But
failed to do so.

What version of the product are you using? On what operating system?


Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 17 Jun 2012 at 2:08

sqlalchemy 0.6 too old for installcjkdict --database=URL

What steps will reproduce the problem?
1. sudo pip install cjklib
2. sudo installcjkdict CEDICT --database=http://example.com/http://www.mdbg.net/chindict/export/cedict/cedict_1_0_ts_utf-8_mdbg.txt.gz

What is the expected output? What do you see instead?

CEDICT should get downloaded from a local webserver. The CEDICT database rarely 
(if ever) actually finishes downloading from Chinese servers while grabbing the 
database from www.mdbg.net, so instead we need to mirror the URL here in China.

Unfortunately, CJKLIB is using a version of sqlalchemy that is too old:

Traceback (most recent call last):
  File "/usr/local/bin/installcjkdict", line 9, in <module>
    load_entry_point('cjklib==0.3.2', 'console_scripts', 'installcjkdict')()
  File "/usr/local/lib/python2.7/dist-packages/cjklib-0.3.2-py2.7.egg/cjklib/dictionary/install.py", line 731, in main
    if not CommandLineInstaller().run():
  File "/usr/local/lib/python2.7/dist-packages/cjklib-0.3.2-py2.7.egg/cjklib/dictionary/install.py", line 524, in run
    installer.install(dictionary, **options)
  File "/usr/local/lib/python2.7/dist-packages/cjklib-0.3.2-py2.7.egg/cjklib/dictionary/install.py", line 436, in install
    db = dbconnector.DatabaseConnector(configuration)
  File "/usr/local/lib/python2.7/dist-packages/cjklib-0.3.2-py2.7.egg/cjklib/dbconnector.py", line 197, in __init__
    self.engine = engine_from_config(configuration, prefix='sqlalchemy.')
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.9-py2.7.egg/sqlalchemy/engine/__init__.py", line 292, in engine_from_config
    return create_engine(url, **opts)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.9-py2.7.egg/sqlalchemy/engine/__init__.py", line 274, in create_engine
    return strategy.create(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.9-py2.7.egg/sqlalchemy/engine/strategies.py", line 52, in create
    dialect_cls = u.get_dialect()
  File "/usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.6.9-py2.7.egg/sqlalchemy/engine/url.py", line 105, in get_dialect
    module = __import__('sqlalchemy.dialects.%s' % (dialect, )).dialects
ImportError: No module named http

What version of the product are you using? On what operating system?

0.3.2 on Ubuntu 13.04


Please provide any additional information below.

I cannot find the "http" library anywhere in the apt-get repositories, nor on 
the internet. But, clearly sqlalchemy 0.6.9 is too old, and when attempting to 
use a newer version, cjklib complains that it specifically needs that version.
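
The traceback hints at a different reading of the failure: the --database value reaches SQLAlchemy's engine_from_config, and get_dialect then tries to import sqlalchemy.dialects.http, so the part before "://" appears to be treated as a database dialect name. The sketch below only illustrates that parsing step with plain string handling; dialect_name is a hypothetical helper, not part of cjklib or SQLAlchemy.

```python
def dialect_name(database_url):
    """Return the dialect part of a database URL (hypothetical helper)."""
    # Everything before "://" is the scheme, e.g. "sqlite" or "mysql+mysqldb".
    scheme = database_url.split("://", 1)[0]
    # A scheme may carry a driver suffix ("dialect+driver"); keep the dialect.
    return scheme.split("+", 1)[0]


# A download mirror URL yields "http", which is not a database dialect.
mirror = dialect_name("http://example.com/cedict.txt.gz")
# A database URL yields the name of a real dialect.
local_db = dialect_name("sqlite:////path/to/cedict.db")
```

Under that reading, a download mirror would not be a valid value for --database, whereas a URL such as sqlite:////path/to/cedict.db would be.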

Original issue reported on code.google.com by [email protected] on 28 Apr 2014 at 1:48

Command line support for reading dialects

Cjknife offers most functions from cjklib on the command line. It is
impossible, though, to specify settings manually, so conversion targets in
particular cannot be adjusted.

Similar to the builder options in buildcjkdb, support could be implemented
for the options listed by ReadingOperator.getDefaultOptions() and
ReadingConverter.getDefaultOptions(), taking into account that for reading
conversion the source and target reading options might be disjoint.

Outcome could be (just an idea)
$ cjknife -s Pinyin -t Pinyin?toneMarkType=Numbers -m nv3
nü3

Options not specified should maybe still be subject to guessing, see
example above.
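
A minimal sketch of how the "Reading?option=value" form proposed above could be split into a reading name and an option dictionary. The function name parse_reading_spec and the "&" separator for several options are assumptions made for illustration; nothing like this exists in cjknife yet.

```python
def parse_reading_spec(spec):
    """Split 'Reading?key=value&key2=value2' into (reading, options).

    Hypothetical helper: option values are kept as plain strings.
    """
    reading, _, query = spec.partition("?")
    options = {}
    for pair in query.split("&"):
        if not pair:
            # No options given, or an empty segment such as a trailing "&".
            continue
        key, _, value = pair.partition("=")
        options[key] = value
    return reading, options


reading, options = parse_reading_spec("Pinyin?toneMarkType=Numbers")
```

Values stay strings here; mapping them to the types expected by getDefaultOptions() (for example "yes" to True) would be a separate step.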

Maybe the option format similar to buildcjkdb will be useful (note the
mixture of operator and converter options):
$ cjknife --Pinyin-toneMarkType=Numbers --Pinyin-keepPinyinApostrophes=yes -m nv3'hai2
nü3'hai2


Original issue reported on code.google.com by [email protected] on 27 Jul 2009 at 1:04

Windows: Command line usage broken? (Characters don't show)

Hello,

this might very well be the fault of Windows, but I guess it's a problem 
anyway:
I tried using cjknife from both cmd.exe and the new Windows Powershell. 
Both show "?" instead of  CJKV characters.

E.g. "cjknife -w EDICT -x "knowledge"" leads to the following (first line 
only): ???? /(n) knowledge/

The other way around (cjknife -i 周) also doesn't work - it passes the 
character ? to cjknife.

Is there any way to work around this?
Do I have to use a different shell?
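
One plausible explanation, which this report does not confirm, is that the console's code page cannot represent CJK characters, so each one is replaced by "?" when the text is encoded for output. The sketch below shows that effect in isolation; cp437 is only an example code page and to_console_bytes is a hypothetical helper, not cjknife code.

```python
def to_console_bytes(text, console_encoding):
    """Encode text for a console, substituting '?' for unencodable characters."""
    return text.encode(console_encoding, "replace")


# A legacy code page has no slot for the character, so it degrades to "?".
legacy = to_console_bytes(u"\u5468", "cp437")
# A Unicode encoding keeps the character intact.
unicode_capable = to_console_bytes(u"\u5468", "utf-8")
```

If this is the cause, a console configured for a Unicode-capable encoding and font would be the direction to investigate.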

Original issue reported on code.google.com by [email protected] on 21 May 2010 at 2:22

Alter .deb package for Ubuntu Natty?

The python version requirements in the Debian package (from the ppa) conflict 
with the python version that ships with Ubuntu Natty by default, but I don't 
believe this is a necessary conflict:

python-cjklib depends on python (<< 2.7); however:
Version of python on system is 2.7.1-0ubuntu5.

I understand it's too early to support Python 3, but surely sub-versions of 2.7 
can be accommodated? Is it just a matter of adjusting the package requirements?

Original issue reported on code.google.com by [email protected] on 14 Apr 2011 at 5:35

Update system

We need an update system to rebuild tables on the local system when a table
layout has changed between versions.
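
One conceivable way to detect a changed layout, offered only as a sketch: SQLite's user_version pragma can hold an integer that is compared with the layout version the installed code expects. EXPECTED_LAYOUT_VERSION, needs_rebuild and mark_current are hypothetical names, not part of cjklib, and the MySQL backend is not covered.

```python
import sqlite3

# Hypothetical version number of the table layout the current code expects.
EXPECTED_LAYOUT_VERSION = 2


def needs_rebuild(connection):
    """Return True if the database was built for a different table layout."""
    stored = connection.execute("PRAGMA user_version").fetchone()[0]
    return stored != EXPECTED_LAYOUT_VERSION


def mark_current(connection):
    """Record that the tables now match the expected layout."""
    # PRAGMA statements do not accept bound parameters; the value is an int.
    connection.execute("PRAGMA user_version = %d" % EXPECTED_LAYOUT_VERSION)
    connection.commit()


connection = sqlite3.connect(":memory:")
outdated_before = needs_rebuild(connection)  # a fresh database stores version 0
mark_current(connection)
outdated_after = needs_rebuild(connection)
```

A per-table version would be needed to rebuild only the affected tables; a single database-wide number is the simplest starting point.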

Original issue reported on code.google.com by [email protected] on 13 Nov 2008 at 4:27

Name is a bit boring

CJKnife is a cooler name. Like a Swiss Army knife. You can use a Perian-like 
logo.

Infinitely marketable. Future software packages that use this will put the icon 
in their thing.

Or, if you keep CJKlib, you gotta make a cool logo where there are like Chinese 
scrolls and stuff and in the front it says CJK. I mean something in this style:
http://psdtuts.com/designing-tutorials/create-a-custom-mac-osx-style-ring-binder-address-book-icon/

Then users of the library will have that icon on their "technology" page.



Original issue reported on code.google.com by [email protected] on 10 Nov 2008 at 4:08

SqlAlchemy changed exceptions to exc

When I tried to install CEDICT using the installcjkdict command, I ran into an 
error saying that the module exceptions (sqlalchemy) could not be found. I 
made sure that SQLAlchemy was installed and went to check whether the module was 
there. It turns out SQLAlchemy has renamed exceptions.py to exc.py; 
changing the two places where exceptions was imported (__init__.py and build.py 
in the build directory of cjklib) to exc solved the problem.

I ran this on a MacBook Pro running Lion. I wonder whether this is a problem on 
my side or code that has to be changed, because according to SQLAlchemy 
(http://www.sqlalchemy.org/CHANGES) the module can be imported under both names. 
It did not work that way for me.

The installation worked in the end. I'm just bringing this issue to the front.
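
A minimal sketch of a version-tolerant import, assuming the only difference between SQLAlchemy releases is the module name (exc versus exceptions). import_first is a hypothetical helper and not cjklib's actual fix.

```python
import importlib


def import_first(module_names):
    """Import and return the first module in module_names that is available."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This name does not exist in the installed version; try the next.
            continue
    raise ImportError("none of %r could be imported" % (module_names,))


# Demonstration with standard-library names: the first is missing on purpose.
module = import_first(["no_such_module_cjklib_example", "json"])
```

In cjklib this could be used as sa_exc = import_first(["sqlalchemy.exc", "sqlalchemy.exceptions"]), after which the code refers to sa_exc regardless of the installed release.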

Original issue reported on code.google.com by [email protected] on 16 Sep 2011 at 11:54

LICENSE change to BSD / MIT / Apache

Can we change the license at the software-level to BSD, MIT or Apache?

My reasons are the ones stated here: 
https://github.com/ScottDuckworth/python-anyvcs/issues/32#issuecomment-28528142.

Because cjklib is written in Python and its data libraries are useful in 
pieces, a simpler license would be more helpful at this point.

Cross-posted from https://github.com/cburgmer/cjklib/issues/6

Original issue reported on code.google.com by [email protected] on 4 Dec 2013 at 2:49

installcjkdict CEDICT: access denied - Please help

What steps will reproduce the problem?
1. On Windows 7 (64-bit), I installed Python 2.7. I tried to install cjklib 
with the deprecated Windows installer cjklib-0.3.win32.exe.
I also tried to install 0.3.2 by running setup.py. In both cases, I run into 
the following problem.

2. When I open a command prompt with cmd and type installcjkdict CEDICT as per 
the instructions at http://cjklib.org/0.3/installing.html#windows, it stops with 
the message "access denied".

And the folder I defined in my APPDATA environment variable is empty.

3.

What is the expected output? What do you see instead?

I would expect to see the SQL database in the APPDATA folder and to be able to 
start using cjklib, which looks great and will be very useful for me (I speak 
Japanese and am now learning Mandarin and Cantonese).
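
The report does not say which step is denied access. As a first diagnostic, and only as a sketch, one could check whether the current user can create a file in the target directory at all. can_write is a hypothetical helper, not part of cjklib.

```python
import tempfile


def can_write(directory):
    """Return True if a temporary file can be created inside directory."""
    try:
        handle = tempfile.NamedTemporaryFile(dir=directory)
    except (OSError, IOError):
        # Missing directory or insufficient permissions.
        return False
    handle.close()  # closing removes the temporary file again
    return True


writable = can_write(tempfile.gettempdir())
```

On Windows the directory to test would be the one named by the APPDATA environment variable, for example can_write(os.environ["APPDATA"]).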

What version of the product are you using? On what operating system?


Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 29 Nov 2012 at 12:47

Missing decomposition and stroke order data

Current support for character decomposition and stroke order is pretty
basic. There are other sources that could be integrated, but they don't
satisfy the LGPL license (e.g. CHISE, see
http://code.google.com/p/cjklib/source/browse/trunk/scripts/convertchise.p
Stroke order data exists in only a few projects, which most probably means we
need to create our own source manually. As stroke order can be generated
partially from decomposition data, basically only minimal components need
stroke order data.
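
The idea of deriving stroke order from decomposition data can be sketched as follows. Both tables hold made-up illustration data, the stroke abbreviations are placeholders, and plain concatenation of component strokes is an assumption that does not hold for every layout (which is why the generation is only partial).

```python
# Illustration data only: stroke order for minimal components.
MINIMAL_STROKE_ORDER = {
    u"\u65e5": ["S", "HZ", "H", "H"],    # 日
    u"\u6708": ["P", "HZG", "H", "H"],   # 月
}

# Illustration data only: a character and its components in writing order.
DECOMPOSITION = {
    u"\u660e": [u"\u65e5", u"\u6708"],   # 明 = 日 + 月
}


def stroke_order(char):
    """Return the stroke list for char, or None if it cannot be derived."""
    if char in MINIMAL_STROKE_ORDER:
        return list(MINIMAL_STROKE_ORDER[char])
    parts = DECOMPOSITION.get(char)
    if parts is None:
        return None
    strokes = []
    for part in parts:
        part_strokes = stroke_order(part)
        if part_strokes is None:
            # One unknown component makes the whole character underivable.
            return None
        strokes.extend(part_strokes)
    return strokes


derived = stroke_order(u"\u660e")
```

Only the entries in MINIMAL_STROKE_ORDER would have to be written by hand; everything reachable through DECOMPOSITION follows from them.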

Original issue reported on code.google.com by [email protected] on 7 Jul 2009 at 8:40

BMP-only character domain

Character domains are employed to limit results to a certain domain of
characters. Currently these are domains like 'GB2312' or 'JISX0208' which
mirror official character sets. Domain 'Unicode' is the maximum domain
covering all Han characters.

A 'BMP' (Basic Multilingual Plane) character domain would limit the
Unicode domain to code points up to 0xFFFF. This is important for systems
that currently don't handle characters beyond the 16-bit boundary.
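
The membership test for such a domain is a single comparison. The sketch below assumes Python 3 (or a wide Python 2 build), where each character is one code point; is_bmp and limit_to_bmp are hypothetical names, not cjklib API.

```python
def is_bmp(char):
    """Return True if the single character lies in the Basic Multilingual Plane."""
    return ord(char) <= 0xFFFF


def limit_to_bmp(chars):
    """Keep only the characters that belong to the BMP."""
    return [char for char in chars if is_bmp(char)]


# U+5468 is in the BMP; U+20000 is the first code point of CJK Extension B.
filtered = limit_to_bmp([u"\u5468", u"\U00020000"])
```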

Original issue reported on code.google.com by [email protected] on 11 Apr 2010 at 2:59

Recomposition of (pseudo-)character

What version of the product are you using? On what operating system?
cjklib-0.3.2-py2.7

Please provide any additional information below.
1. I would like to enquire about the possibility of recomposing a character from 
its parts, using the tree structure from "getDecompositionTreeList". I'm 
trying to generate pseudo-characters.

Many thanks,
Kevin
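
As a starting point, a decomposition tree can at least be flattened back into an Ideographic Description Sequence. The tree layout assumed here, a list that starts with an IDS operator followed by (character, glyph index) tuples or nested lists, is inferred from a warning message quoted in another report in this tracker, not from the API documentation. tree_to_ids is a hypothetical helper.

```python
def tree_to_ids(node):
    """Flatten a decomposition tree into an IDS string (assumed tree layout)."""
    if isinstance(node, tuple):
        # Leaf: (character, glyph index); only the character is kept.
        return node[0]
    if isinstance(node, list):
        # Inner node: IDS operator first, then its operands.
        return node[0] + u"".join(tree_to_ids(child) for child in node[1:])
    return node


tree = [u"\u2ff0", (u"\u65e5", 0), [u"\u2ff1", (u"\u5341", 0), (u"\u6708", 0)]]
ids = tree_to_ids(tree)
```

The result is a textual description, not a rendered glyph; drawing a pseudo-character from it would need separate font or graphics support.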

Original issue reported on code.google.com by [email protected] on 31 Jul 2013 at 11:10

RuntimeError: maximum recursion depth exceeded in cmp in Linuxmint 14 ( Ubuntu 12.10 )


I'm using cjknife as a parser to convert Chinese files into Pinyin.

On Ubuntu 11.04 it already gave me the same error message, but I worked around 
it by adding these two lines to the file 
/usr/local/lib/python2.7/dist-packages/cjklib-0.3.2-py2.7.egg/cjklib/reading/operator.py:

import sys

sys.setrecursionlimit(150000000)

That worked, but now I'm on Linux Mint 14 (based on Ubuntu 12.10), and when I 
add these two lines to the same file, it still gives me the same error message:

  File "/usr/local/lib/python2.7/dist-packages/cjklib-0.3.2-py2.7.egg/cjklib/reading/operator.py", line 1557, in removeApostrophes
    readingEntities[1:]))
  File "/usr/local/lib/python2.7/dist-packages/cjklib-0.3.2-py2.7.egg/cjklib/reading/operator.py", line 1557, in removeApostrophes
    readingEntities[1:]))
  File "/usr/local/lib/python2.7/dist-packages/cjklib-0.3.2-py2.7.egg/cjklib/reading/operator.py", line 1557, in removeApostrophes
    readingEntities[1:]))
  File "/usr/local/lib/python2.7/dist-packages/cjklib-0.3.2-py2.7.egg/cjklib/reading/operator.py", line 1546, in removeApostrophes
    and readingEntities[1] == self.pinyinApostrophe \
RuntimeError: maximum recursion depth exceeded in cmp

What can I do?
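
The traceback suggests that removeApostrophes calls itself once per reading entity (readingEntities[1:]), so raising the recursion limit only postpones the failure on long input. The sketch below shows the general remedy, replacing the recursion with a loop. It is not the library's code: the rule used (keep an apostrophe only before an entity starting with a, e or o) is a simplified assumption, and tone-marked vowels are not handled.

```python
def remove_apostrophes(entities, apostrophe=u"'"):
    """Drop unneeded apostrophes from a list of reading entities, iteratively."""
    result = []
    for index, entity in enumerate(entities):
        if entity == apostrophe:
            following = entities[index + 1] if index + 1 < len(entities) else u""
            # Simplified rule: the apostrophe only matters before a, e or o.
            if following[:1].lower() not in (u"a", u"e", u"o"):
                continue
        result.append(entity)
    return result


# 100,000 entities are processed without deep call stacks.
long_input = [u"ni3", u"'"] * 50000
cleaned = remove_apostrophes(long_input)
```

Until the library itself is changed, feeding cjknife shorter pieces of text (for example line by line) may keep the recursion shallow enough.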

Original issue reported on code.google.com by [email protected] on 10 Apr 2013 at 10:20

dbconnector will crash in a multithreaded environment with sqlite (patch attached)

What steps will reproduce the problem?
1. Use cjklib in any multithreaded environment (like a threaded webserver, for 
example Django)
2. Access a view that uses cjklib a few times (make sure it's a function call 
that hits the database)

What is the expected output? What do you see instead?

Instead of working, you get a backtrace. This is the relevant part:

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/reading/operator.py", line 244, in decompose
   decompositionParts = self.getDecompositionTree(readingString)

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/reading/operator.py", line 299, in getDecompositionTree
   segmentations = self.segment(part)

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/reading/operator.py", line 358, in segment
   segmentationTree = self._recursiveSegmentation(readingString)

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/reading/operator.py", line 388, in _recursiveSegmentation
   readingString[0:substringIndex].lower()):

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/reading/operator.py", line 1650, in _hasEntitySubstring
   readingString)

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/reading/operator.py", line 467, in _hasEntitySubstring
   return readingString in self._substringTable

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/util.py", line 631, in fget_wrapper
   value = fget(self)

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/reading/operator.py", line 450, in _substringTable
   entities = self.getReadingEntities() | self.getFormattingEntities()

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/util.py", line 658, in oneshot
   result = self.fget(*args, **kwargs)

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/reading/operator.py", line 1847, in getReadingEntities
   syllables = self.getPlainReadingEntities()

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/util.py", line 658, in oneshot
   result = self.fget(*args, **kwargs)

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/reading/operator.py", line 1799, in getPlainReadingEntities
   select([self.db.tables['PinyinSyllables'].c.Pinyin])))

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/dbconnector.py", line 511, in selectScalars
   result = self.execute(request)

 File "/home/shengci/dev/lib/python2.6/site-packages/cjklib-0.3-py2.6.egg/cjklib/dbconnector.py", line 467, in execute
   return self.connection.execute(*options, **keywords)

 File "/home/shengci/dev/lib/python2.6/site-packages/SQLAlchemy-0.7.1-py2.6-linux-i686.egg/sqlalchemy/engine/base.py", line 1358, in execute
   params)

 File "/home/shengci/dev/lib/python2.6/site-packages/SQLAlchemy-0.7.1-py2.6-linux-i686.egg/sqlalchemy/engine/base.py", line 1491, in _execute_clauseelement
   compiled_sql, distilled_params

 File "/home/shengci/dev/lib/python2.6/site-packages/SQLAlchemy-0.7.1-py2.6-linux-i686.egg/sqlalchemy/engine/base.py", line 1558, in _execute_context
   None, None)

 File "/home/shengci/dev/lib/python2.6/site-packages/SQLAlchemy-0.7.1-py2.6-linux-i686.egg/sqlalchemy/engine/base.py", line 1721, in _handle_dbapi_exception
   self._autorollback()

 File "/home/shengci/dev/lib/python2.6/site-packages/SQLAlchemy-0.7.1-py2.6-linux-i686.egg/sqlalchemy/engine/base.py", line 1295, in _autorollback
   self._rollback_impl()

 File "/home/shengci/dev/lib/python2.6/site-packages/SQLAlchemy-0.7.1-py2.6-linux-i686.egg/sqlalchemy/engine/base.py", line 1215, in _rollback_impl
   self._handle_dbapi_exception(e, None, None, None, None)

 File "/home/shengci/dev/lib/python2.6/site-packages/SQLAlchemy-0.7.1-py2.6-linux-i686.egg/sqlalchemy/engine/base.py", line 1212, in _rollback_impl
   self.engine.dialect.do_rollback(self.connection)

 File "/home/shengci/dev/lib/python2.6/site-packages/SQLAlchemy-0.7.1-py2.6-linux-i686.egg/sqlalchemy/engine/default.py", line 294, in do_rollback
   connection.rollback()

ProgrammingError: (ProgrammingError) SQLite objects created in a thread can 
only be used in that same thread.The object was created in thread id 
-1218057328 and this is thread id -1234834544 None None

What version of the product are you using? On what operating system?

Version 0.3 on linux.

Please provide any additional information below.

The attached patch fixes the dbconnector to use threadlocal variables for the 
database connection.
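
The patch itself is not reproduced here. As a rough illustration of the approach it describes, the sketch below keeps one SQLite connection per thread with threading.local. ThreadLocalConnections is a hypothetical class that talks to sqlite3 directly; cjklib's dbconnector goes through SQLAlchemy instead.

```python
import sqlite3
import threading


class ThreadLocalConnections(object):
    """Hand out one SQLite connection per thread (illustrative sketch)."""

    def __init__(self, path):
        self._path = path
        self._local = threading.local()

    def get(self):
        connection = getattr(self._local, "connection", None)
        if connection is None:
            # First use in this thread: open a connection owned by this thread.
            connection = sqlite3.connect(self._path)
            self._local.connection = connection
        return connection


connections = ThreadLocalConnections(":memory:")
main_connection = connections.get()

seen_in_worker = []
worker = threading.Thread(target=lambda: seen_in_worker.append(connections.get()))
worker.start()
worker.join()
```

Each thread receives its own connection object, so no SQLite object crosses a thread boundary.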

Original issue reported on code.google.com by [email protected] on 6 Jun 2011 at 12:36

Missing support for Japanese readings

No source for Japanese readings is included so that methods
CharacterLookup.getCharactersForReading() and
CharacterLookup.getReadingForCharacter() don't provide any data for
Japanese.

Other readings use data provided by Unihan, but for Japanese it is unclear
what format and source the Unihan data has. Better support could be provided by
Kanjidic (http://www.csse.monash.edu.au/~jwb/kanjidic2/, ja_on, ja_kun),
but it includes special forms that probably need proper integration.

Original issue reported on code.google.com by [email protected] on 2 Jul 2009 at 1:24

CEDICT reading problem

What steps will reproduce the problem?
1. import cjklib.dictionary
2. d = cjklib.dictionary.CEDICT(databaseUrl='sqlite:////path/to/your/cedict.db')
3. d.getAll()

The method above should return all entries in the CEDICT database. However, an 
AttributeError exception is raised while formatting this record:
卡拉OK|卡拉OK|ka3 la1 O K|/karaoke (loanword)/

The problem is that the reading is not standard Pinyin. The method 
SingleColumnAdapter.format therefore returns None, and 
NonReadingEntityWhitespace.format raises the exception when it tries to call 
the split method on None.
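
The failure mode can be reproduced in isolation. The two functions below are hypothetical stand-ins, not the cjklib classes: the first assumes it always receives a string, the second falls back to the original reading when the earlier step produced None.

```python
def split_reading(formatted):
    """Split a formatted reading on whitespace; assumes a string is passed."""
    return formatted.split()


def safe_split_reading(formatted, original):
    """Use the original reading when the previous formatter returned None."""
    if formatted is None:
        formatted = original
    return formatted.split()


try:
    split_reading(None)
    error_name = None
except AttributeError as error:
    # None has no split method, which matches the reported exception type.
    error_name = type(error).__name__

entities = safe_split_reading(None, u"ka3 la1 O K")
```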


Problem exists in SVN trunk version (Rev: 446). I am using Ubuntu Linux 11.04.1 
LTS

I suggest either fixing such records in the installcjkdict script or fixing the 
formatter of the dictionary module so that it can handle such records. My hotfix:

(line 126):
    def format(self, string):
        toReading = self.toReading or self.fromReading
        try:
            return self._readingFactory.convert(string, self.fromReading,
                toReading, sourceOptions=self.sourceOptions,
                targetOptions=self.targetOptions)
        except (exception.DecompositionError, exception.CompositionError,
            exception.ConversionError):
            # Hotfix: fall back to the unconverted string; the original
            # code returned None here.
            return string

Original issue reported on code.google.com by [email protected] on 3 Oct 2012 at 9:37

Invalid decomposition entries for 卪, 叉 and 丼.

What steps will reproduce the problem?
1. Setup:
from cjklib import characterlookup
cjk = characterlookup.CharacterLookup('C')

2. Break stuff:
for char in [u'卪', u'叉', u'丼']:
    try:
        cjk.getStrokeOrder(char)
    except Exception:
        pass


What is the expected output? What do you see instead?
I get the following output (times three):
.../site-packages/cjklib/characterlookup.py:1204: UserWarning: Invalid 
decomposition entry [u'\u2ff4', (u'\u5369', 0), (u'\u4e36', 0)]
  "Invalid decomposition entry %r" % subTree)

I'm not sure what I'm expecting. There doesn't seem to be decomposition data 
for these characters, but that is not the problem. The error I get is caused by 
inconsistent use of IDS data.
According to characterlookup.py (line 1195): 
# ⿴ should only occur for 囗
This is the case for neither 卪, 叉 nor 丼, resulting in the errors.
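
The consistency rule quoted from characterlookup.py can be expressed as a small check over a decomposition entry. The entry layout (operator first, then (character, glyph index) tuples) is taken from the warning above; is_valid_surround_entry is a hypothetical helper, not the library's implementation.

```python
SURROUND = u"\u2ff4"   # IDS operator for full surround
ENCLOSURE = u"\u56d7"  # the enclosure component named in the source comment


def is_valid_surround_entry(sub_tree):
    """Return False if the surround operator is used with another outer part."""
    operator, first = sub_tree[0], sub_tree[1]
    if operator != SURROUND:
        # The rule only concerns the full-surround operator.
        return True
    return first[0] == ENCLOSURE


# The entry from the warning message fails the check.
reported = [u"\u2ff4", (u"\u5369", 0), (u"\u4e36", 0)]
reported_ok = is_valid_surround_entry(reported)
```

Running such a check over the decomposition data at build time would surface inconsistent entries before they reach getStrokeOrder.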

What version of the product are you using? On what operating system?
0.3.2 from pip.
python 2.7
OSX 10.6.8 Snow Leopard.

Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 29 May 2014 at 3:40

Make cjklib more pythonic

Many parts of cjklib resemble the Java way of doing things (at least 
that's how I see it). Moving to a more Pythonic way of doing things would make 
it easier and more natural to use.
This ticket needs a bit of design work first. Ideas welcome.

Original issue reported on code.google.com by [email protected] on 7 Nov 2010 at 5:49
