iuliia-py's Issues

split() requires a non-empty pattern match

Hello.
I am running:

#!/usr/bin/python3

#import tensorflow as tf
#print(tf.__version__)
import iuliia

source = "Юлия Щеглова"
iuliia.translate(source, schema=iuliia.MOSMETRO)

Result:

Traceback (most recent call last):
  File "./test.py", line 8, in <module>
    iuliia.translate(source, schema=iuliia.MOSMETRO)
  File "/usr/local/lib/python3.6/site-packages/iuliia/engine.py", line 17, in translate
    translated = (_translate_word(word, schema) for word in _split_sentence(source))
  File "/usr/local/lib/python3.6/site-packages/iuliia/engine.py", line 22, in _split_sentence
    return (word for word in SPLITTER.split(source) if word)
ValueError: split() requires a non-empty pattern match.

Python 3.6.8
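
On Python 3.5 and 3.6, `re.split()` rejects a pattern that can only match the empty string and raises this exact ValueError; splitting on empty matches is supported from Python 3.7 on. The traceback does not show how `SPLITTER` is defined, so the following is only a sketch under the assumption that it is a zero-width pattern such as `\b`. The names `TOKEN` and `split_sentence` are hypothetical and not taken from the library. Tokenizing with `findall` avoids empty matches and behaves the same on 3.6 and later:

```python
import re

# Each match is a non-empty run of word characters or of non-word characters,
# so no zero-width matching is involved.
TOKEN = re.compile(r"\w+|\W+")


def split_sentence(source):
    """Split text into alternating word and separator chunks."""
    return TOKEN.findall(source)


print(split_sentence("Юлия Щеглова"))
```

Joining the chunks back together reproduces the input, so punctuation and whitespace are preserved for the later translation step.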

major changes

I sketched out an alternative implementation; it is quite small. Take a look, maybe some of it will be useful.
Requires Python >= 3.8.

import json, re
from itertools import chain
from functools import partial


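def factory(path):
    def translate_word(m):
        original = m.group(0)
        word = w = original.lower()
        # An ending rule covers the last two letters of words longer than two letters.
        if ending := len(word) > 2 and ending_mapping(word[-2:]):
            w = w[:-2]
        # Slide a window (a, b, c) over the word; '' stands for a word boundary.
        it, buf = chain(w, ('',)), []
        a, b = '', next(it)
        for c in it:
            # Priority: previous-letter rule, next-letter rule, plain letter mapping.
            buf.append(prev_mapping(a + b) or next_mapping(b + c) or mapping(b, b))
            a, b = b, c
        if ending:
            buf.append(ending)
        w = ''.join(buf)
        # Restore the case of the source word.
        if word == original:
            return w
        return w.capitalize() if len(original) == 1 or original[-1].islower() else w.upper()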

    with open(path, encoding='utf-8') as fi:
        data = json.load(fi)
    mapping = data['mapping'].get
    prev_mapping = (data['prev_mapping'] or {}).get
    next_mapping = (data['next_mapping'] or {}).get
    ending_mapping = (data['ending_mapping'] or {}).get
    return partial(re.compile(r'\w+').sub, translate_word)


f = factory('mosmetro.json')
print(f('Юлия, съешь ещё этих мягких французских булок из Йошкар-Олы, да выпей алтайского чаю'))
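
One subtlety in the `or` chain above: a rule whose value is an empty string is falsy, so it falls through to the next lookup. Whether any real schema contains such a rule is not stated here. Below is a minimal sketch of the same windowed lookup with explicit `None` checks, which also runs without a schema file. `TOY_SCHEMA` is a hypothetical toy schema for illustration, not the real MOSMETRO data, and `first_defined` is a helper introduced only for this example.

```python
import re

# Hypothetical toy schema for illustration only; not the real MOSMETRO data.
TOY_SCHEMA = {
    "mapping": {"ю": "yu", "л": "l", "и": "i", "я": "ya",
                "ъ": "", "е": "e", "с": "s"},
    "prev_mapping": {"ъе": "ye"},
    "next_mapping": {},
    "ending_mapping": {"ия": "ia"},
}


def first_defined(*values):
    """Return the first value that is not None (an empty string counts)."""
    return next((v for v in values if v is not None), None)


def translate_word(word, schema=TOY_SCHEMA):
    lower = word.lower()
    stem, ending = lower, None
    # Ending rules apply to the last two letters of words longer than two.
    if len(lower) > 2:
        ending = schema["ending_mapping"].get(lower[-2:])
        if ending is not None:
            stem = lower[:-2]
    out = []
    for i, letter in enumerate(stem):
        prev = stem[i - 1] if i > 0 else ""              # "" marks word start
        nxt = stem[i + 1] if i + 1 < len(stem) else ""   # "" marks word end
        out.append(first_defined(
            schema["prev_mapping"].get(prev + letter),
            schema["next_mapping"].get(letter + nxt),
            schema["mapping"].get(letter, letter),
        ))
    if ending is not None:
        out.append(ending)
    result = "".join(out)
    # Restore the case of the source word.
    if word == lower:
        return result
    if len(word) == 1 or word[-1].islower():
        return result.capitalize()
    return result.upper()


def translate(text, schema=TOY_SCHEMA):
    return re.sub(r"\w+", lambda m: translate_word(m.group(0), schema), text)


print(translate("Юлия съел"))
```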

Error on import

Importing the package raises the exception AttributeError: ala_lc.

>>> import iuliia
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/gittmp_gleb/tmp.yZ1EPjt5eG_105510_/iuliia/__init__.py", line 13, in <module>
    ALA_LC = Schemas.ala_lc.value  # type: ignore
  File "/usr/lib/python3.8/enum.py", line 341, in __getattr__
    raise AttributeError(name) from None
AttributeError: ala_lc
$ python --version
Python 3.8.3
$ uname -a
Linux GlebPC 5.7.2-arch1-1 #1 SMP PREEMPT Wed, 10 Jun 2020 20:36:24 +0000 x86_64 GNU/Linux
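
The traceback shows that `Schemas` is an enum and that it has no member named `ala_lc` at import time. One plausible cause, which is an assumption and not confirmed by this report, is that the enum is built dynamically from schema definitions and the `ala_lc` definition was missing in this checkout. The sketch below reproduces that failure mode with the standard `enum` functional API; `build_schemas` and `missing_members` are hypothetical helpers, not part of iuliia.

```python
import enum


def build_schemas(names):
    """Build an enum dynamically, one member per available schema name."""
    return enum.Enum("Schemas", {name: name for name in names})


def missing_members(enum_cls, expected):
    """List expected member names that the enum does not define."""
    return [name for name in expected if name not in enum_cls.__members__]


# Hypothetical situation: the ala_lc definition was not found when the enum was built.
Schemas = build_schemas(["mosmetro", "wikipedia"])

print(missing_members(Schemas, ["ala_lc", "mosmetro", "wikipedia"]))

try:
    Schemas.ala_lc
except AttributeError as exc:
    print("AttributeError:", exc)
```

Checking `__members__` before accessing attributes turns the failure into a list of missing names, which makes it easier to tell an incomplete checkout from a renamed schema.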

wrong transliteration

I tried the WIKIPEDIA and MOSMETRO schemas and got this result:
маленький >> malenkiĭ
бесплатный >> besplatnyĭ

According to the documentation, the ending should be "y", not "ĭ".
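
A quick way to confirm which characters in an output are unexpected is to list everything outside ASCII together with its Unicode name. This is only a diagnostic sketch using the standard `unicodedata` module; `non_ascii_chars` is a hypothetical helper, and it assumes the schema in question is expected to produce plain ASCII.

```python
import unicodedata


def non_ascii_chars(text):
    """Return (character, Unicode name) pairs for each distinct non-ASCII character."""
    found = sorted({ch for ch in text if ord(ch) > 127})
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in found]


# "\u012d" is the precomposed i-with-breve character seen in the output above.
print(non_ascii_chars("malenki\u012d"))
print(non_ascii_chars("malenkiy"))
```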
