
kawazu's People

Contributors

cutano, ookii-tsuki


kawazu's Issues

One or more errors occurred. (Index was outside the bounds of the array.)

Hi,
First, thanks for the great work on Kawazu, it is very helpful!
But sometimes it throws an exception.
Steps to reproduce:

var converter = new KawazuConverter();
var inputH = converter.Convert("だれでうどうんづしますか", To.Hiragana, Mode.Normal).Result;

Stacktrace:

   at Kawazu.Division..ctor(MeCabIpaDicNode node, TextType type, RomajiSystem system)
   at Kawazu.KawazuConverter.<>c__DisplayClass6_0.<Convert>b__1(MeCabIpaDicNode node)
   at System.Linq.Enumerable.SelectArrayIterator`2.ToList()
   at Kawazu.KawazuConverter.<>c__DisplayClass6_0.<Convert>b__0()
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Kawazu.KawazuConverter.<Convert>d__6.MoveNext()
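
The "One or more errors occurred." wording in the title is how an AggregateException reports itself: blocking on `.Result` wraps the task's exception, whereas `await` rethrows the original one. Below is a minimal sketch of that behaviour that does not touch Kawazu at all; `FailingConvert` and `Fail` are hypothetical stand-ins for the failing conversion, not part of the library.

```csharp
using System;
using System.Threading.Tasks;

public static class ResultVersusAwait
{
    // Hypothetical stand-in for the code that fails inside the conversion.
    private static string Fail(string input)
    {
        throw new IndexOutOfRangeException("Index was outside the bounds of the array.");
    }

    // Runs the failing work on a thread-pool thread and returns the task.
    public static Task<string> FailingConvert(string input) => Task.Run(() => Fail(input));

    // Blocking on .Result surfaces the failure wrapped in an AggregateException.
    public static string ExceptionTypeFromResult()
    {
        try
        {
            _ = FailingConvert("x").Result;
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    // Awaiting the task rethrows the original exception instead.
    public static async Task<string> ExceptionTypeFromAwait()
    {
        try
        {
            await FailingConvert("x");
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }
}
```

With `.Result`, the original exception is still reachable through `InnerException`; awaiting the call only makes the real cause easier to see, it does not fix it.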

Resources (dictionary) not copied automatically

When I tested your latest NuGet package following the approach in readme.md, it still reported a "char.bin not found" error.

Then I tested the older version (Kawazu 1.0.0) with .NET Core 3.1, and the resources were not copied there either.

.net5 support request

The current version of Kawazu works only with .NET Core 1-3. I tested .NET 5 and it works fine, but only if I copy matrix.bin and the other dictionary files to the build output folder.

So it would be easy to support .NET 5 by modifying the csproj file.


BTW, it would be helpful if you could support the older .NET Framework 4+. There are a lot of apps running on .NET Framework, but I'm not sure whether Kawazu can run on .NET Framework 4 without code changes; I have not tested it yet.
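
As a stopgap on the consuming side, the manual copy step described above can be expressed in the application's own project file. This is only a sketch under assumptions: the folder name `IpaDic` is hypothetical, and it presumes the dictionary files (char.bin, matrix.bin and the rest) were placed in that folder next to the .csproj and that the library looks for them under the same relative path in the output directory.

```xml
<!-- Hypothetical fragment for the consuming application's SDK-style .csproj. -->
<ItemGroup>
  <!-- Assumed folder name; adjust to wherever the dictionary files actually live. -->
  <None Update="IpaDic\**\*">
    <!-- Copy the files to the build output whenever they are newer than the existing copy. -->
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </None>
</ItemGroup>
```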

Question about Romaji

I don't know much about Japanese romaji, but why do Google and Kawazu give different romaji for こんにちは? Can I get Google's romaji from Kawazu?

Google

Returns: Kon'nichiwa

Kawazu

I tried different parameters with Converter.GetDivisions; all of them return: konnichiha

ArgumentOutOfRangeException for input 鷺ノ森中ノ丁

I tried to use Kawazu-Cli to test the input ノ森中ノ丁, but an exception occurred.
The exception occurred at line 144 of Division.cs.

Can you check this problem? I tried to understand the code but didn't succeed.

Thank you!

Errors in transliterations

I could be doing things incorrectly, but I am basically trying to do two things given almost any input in a Japanese text field.

  1. Convert to Kana
  2. Convert to Romaji

The Romaji conversion throws a heap of errors most of the time, like the one below from a unit test.

        [Theory]
        [InlineData("袖ケ浦港運", "Sodegaura-kō un")]
        public async Task RomajiTransliterationTest(string input, string expectedOutput)
        {
            KawazuConverter converter = new();
            var output = await converter.Convert(input, To.Romaji, Mode.Okurigana, RomajiSystem.Hepburn);
            Assert.Equal(expectedOutput, output);
        }
System.ArgumentOutOfRangeException: Length cannot be less than zero. (Parameter 'length')
   at System.Text.StringBuilder.ToString(Int32 startIndex, Int32 length)
   at Kawazu.Division..ctor(MeCabIpaDicNode node, TextType type, RomajiSystem system)
   at Kawazu.KawazuConverter.<>c__DisplayClass6_0.<Convert>b__1(MeCabIpaDicNode node)
   at System.Linq.Enumerable.SelectArrayIterator`2.ToList()
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Kawazu.KawazuConverter.<>c__DisplayClass6_0.<Convert>b__0()
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.Tasks.Task.<>c.<.cctor>b__277_0(Object obj)
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)

Hiragana conversion does not take context into account (e.g. numbers)

Hi,
Not sure if this is a Kawazu or LibNMeCab issue, but the kanji conversion ignores irregular readings.
For example, converting 三百 (300) outputs さんひゃく (sanhyaku), but the correct reading is さんびゃく (sanbyaku).
Same for 600 and 900.

Currently working on a workaround on my fork: lasyan3@156bf7e
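
Independently of the fork mentioned above (whose approach is not shown here), one way to sidestep the problem is a pre-pass that swaps the irregular numerals for their kana reading before the text reaches the converter. The sketch below is an assumption-laden illustration, not Kawazu code: `IrregularNumberReadings` and `Apply` are hypothetical names, and the table only covers the standard sound changes before 百 and 千.

```csharp
using System.Collections.Generic;

public static class IrregularNumberReadings
{
    // Standard readings where 百 (hyaku) or 千 (sen) changes sound after the preceding digit.
    private static readonly KeyValuePair<string, string>[] Readings =
    {
        new KeyValuePair<string, string>("三百", "さんびゃく"),
        new KeyValuePair<string, string>("六百", "ろっぴゃく"),
        new KeyValuePair<string, string>("八百", "はっぴゃく"),
        new KeyValuePair<string, string>("三千", "さんぜん"),
        new KeyValuePair<string, string>("八千", "はっせん"),
    };

    // Replaces the listed kanji numerals with their hiragana reading; all other text is left untouched.
    public static string Apply(string input)
    {
        foreach (var pair in Readings)
        {
            input = input.Replace(pair.Key, pair.Value);
        }
        return input;
    }
}
```

This is a plain substring replacement, so it will also rewrite words that merely contain these characters (八百屋, read やおや, is one example); a real fix would need to look at the tokenizer output rather than the raw string.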

Parts of speech

Hi Cutano. This is more of a request than an issue. Would you add a PartsOfSpeech property to Division.cs? I believe you can pull it from MeCabIpaDicNode.

Can I get a list of pronunciations for every char?

Hi, thanks for your nice work, but I'd like to get the pronunciations one by one.

For example, with "今晩":

var result = await converter.Convert("今晩は", To.Romaji, Mode.Normal, RomajiSystem.Hepburn, "(", ")");

Kawazu currently returns "komban".

I want to get a list {"kom", "ban"} for search-matching purposes.

Is there any way to get this list?

Romaji to Hiragana

Hi Cutano. Sorry to bother you again. I was just wondering: is it possible to convert Romaji to Hiragana?

Dispose method to release unmanaged memory

I think you should add a dispose method in the KawazuConverter class to dispose the MeCab tagger and release unmanaged memory because the garbage collector can't clean unmanaged memory.
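
The request can be pictured with a minimal IDisposable sketch. Everything here is hypothetical: `NativeTagger` stands in for whatever disposable resource the converter owns, and `DisposableConverter` is an illustrative wrapper, not the real KawazuConverter.

```csharp
using System;

// Hypothetical stand-in for a resource that must be released explicitly.
public sealed class NativeTagger : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose() => IsDisposed = true;
}

// Illustrative converter that owns the tagger and releases it on Dispose.
public sealed class DisposableConverter : IDisposable
{
    private readonly NativeTagger _tagger;
    private bool _disposed;

    public DisposableConverter(NativeTagger tagger)
    {
        _tagger = tagger ?? throw new ArgumentNullException(nameof(tagger));
    }

    public string Convert(string input)
    {
        // Refuse to work once the underlying resource has been released.
        if (_disposed) throw new ObjectDisposedException(nameof(DisposableConverter));
        return input; // Placeholder: a real converter would use the tagger here.
    }

    public void Dispose()
    {
        if (_disposed) return; // Dispose must be safe to call more than once.
        _tagger.Dispose();
        _disposed = true;
    }
}
```

Callers would then wrap the converter in a `using` statement so the resource is released deterministically instead of waiting for the garbage collector.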

Furigana is sometimes inaccurate

Hi Cutano. I was playing around with the KawazuConverter and found that when converting to To.Hiragana in Mode.Furigana it sometimes returns inaccurate results. For example, with the word:

あの方
the result was: あのほう
and it should have been: あのかた

License

Hi. This is more of a question than an issue. If I include this library in a commercial project, what do I need to do to satisfy this condition in your license?

"The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software."
