haran / bmdmsoundex
Beider-Morse + Daitch-Mokotoff Phonetic Matching (soundex) Algorithm
License: GNU General Public License v3.0
Add Lithuanian language to the library.
Also provide necessary files for BMPM procedural version.
The following code runs a very long loop, which could be a major problem for real-time systems:
$phonetic = Phonetic::app()->run();
$phonetic->BMSoundex->getPhoneticKeys("الفندقومية");
It took me 383 seconds to finish.
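To make a slowdown like this reproducible, the call can be wrapped in a small timing helper. The sketch below is an illustration only: `timeCall` is a hypothetical helper, not part of the library, and the closure is a cheap stand-in for the `getPhoneticKeys()` call from the report so that the snippet runs without the library installed.

```php
<?php
// Hypothetical helper: run a callable and return its result together with
// the elapsed wall-clock time in seconds.
function timeCall(callable $fn): array
{
    $start = hrtime(true);                       // monotonic clock, nanoseconds
    $result = $fn();
    $seconds = (hrtime(true) - $start) / 1e9;    // convert to seconds
    return [$result, $seconds];
}

// Stand-in workload. In a real reproduction this closure would call
// $phonetic->BMSoundex->getPhoneticKeys(...) as in the report.
list($keys, $seconds) = timeCall(function () {
    return strtoupper('abc');
});

printf("Finished in %.6f seconds\n", $seconds);
```

Reporting the elapsed time alongside the input string makes it easier to compare runs across inputs and library versions.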
In Monolog version 2, NullHandler does not accept setFormatter(); $handler must be an instance of Monolog\Handler\FormattableHandlerInterface.
I wrapped the $handler->setFormatter($formatter) call in these conditions:
if (Logger::API === 1) {
    $handler->setFormatter($formatter);
} elseif (Logger::API === 2 && $handler instanceof Monolog\Handler\FormattableHandlerInterface) {
    $handler->setFormatter($formatter);
}
The name MICHALINSKY in Sephardic mode gives different results in BMPM and BMDM.
BMPM:
applying language rules from (rulesany) to michalinsky using languages 50
char codes = [#6d]m [#69]i [#63]c [#68]h [#61]a [#6c]l [#69]i [#6e]n [#73]s [#6b]k [#79]y
applying rule #98 pattern=m lcontext= rcontext= subst=m result=m
applying rule #94 pattern=i lcontext= rcontext= subst=i result=mi
applying rule #33 pattern=ch lcontext= rcontext= subst=(S|tS[32]|dZ[32]) result=(miS[50]|mitS[32]|midZ[32])
BMDM:
Applying language rules from 'rulesany' to 'michalinsky' using language code '50'
Char codes = [#6d]m [#69]i [#63]c [#68]h [#61]a [#6c]l [#69]i [#6e]n [#73]s [#6b]k [#79]y
Applying rule #98 [pattern='m', lcontext='', rcontext='', subst='m', result='m']
Applying rule #94 [pattern='i', lcontext='', rcontext='', subst='i', result='mi']
Applying rule #33 [pattern='ch', lcontext='', rcontext='', subst='(S|tS[64]|dZ[64])', result='miS[50]']
Everything is identical up to rule #33 from rulesany. At that step the BMPM code generates three values (miS, mitS, midZ), whereas BMDM generates only one (miS). The substitution part of the rule is (S|tS[32]|dZ[32]) in BMPM but (S|tS[64]|dZ[64]) in BMDM. The attribute differs, 32 in BMPM versus 64 in BMDM, and that is the cause of the difference between the two results.
sep/rulesany.php:33
array("ch","","","(S|tS[$spanish]|dZ[$spanish])")
So the attribute should correspond to Spanish, because in languagenames.php we have ("any", "french", "hebrew", "italian", "portuguese", "spanish");
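The mismatch can be illustrated with a little bit arithmetic. The sketch below rests on two assumptions that are consistent with the traces but are not confirmed from the library source: that each language's code is the bit at its position in the list (so the sixth entry, spanish, is 32), and that an alternative such as `tS[32]` is kept only when its attribute shares a bit with the active language mask (50 in the traces). The helper names `languageCodes` and `languagesInMask` are hypothetical.

```php
<?php
// Language list as quoted from languagenames.php in the report above.
$languages = ["any", "french", "hebrew", "italian", "portuguese", "spanish"];

// Assumption: each language's code is the bit at its list position (1, 2, 4, ...).
function languageCodes(array $languages): array
{
    $codes = [];
    foreach ($languages as $index => $name) {
        $codes[$name] = 1 << $index;
    }
    return $codes;
}

// Return the names whose bit is set in a combined language mask.
function languagesInMask(int $mask, array $codes): array
{
    $names = [];
    foreach ($codes as $name => $code) {
        if (($mask & $code) !== 0) {
            $names[] = $name;
        }
    }
    return $names;
}

$codes  = languageCodes($languages);
$active = languagesInMask(50, $codes);     // 50 is the mask shown in both traces

// Assumed filter: an alternative survives only if its attribute overlaps the mask.
$keepsBmpmAlternative = (50 & 32) !== 0;   // tS[32]: overlaps, so it is kept
$keepsBmdmAlternative = (50 & 64) !== 0;   // tS[64]: no overlap, so it is dropped
```

Under these assumptions, 64 does not correspond to any entry in the six-language Sephardic list, which would explain why BMDM discards the `tS` and `dZ` alternatives and keeps only `miS`.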
To avoid parsing an excessive number of rulesets during the language-recognition stage, it would be useful to be able to pass a predefined language name or list before processing.
Original procedural BMPM approach:
$languageCode = $spanish; // where $spanish might be 32
$result = Phonetic_UTF8(
    $name,
    $rules[LanguageIndexFromCode($languageCode, $languages)],
    $approxCommon,
    $approx[LanguageIndexFromCode($languageCode, $languages)],
    $languageCode
);
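One way the requested option could accept names instead of numeric codes is to translate them into a language code first. The following is a minimal sketch, not the library's API: `languageMaskFromNames` is a hypothetical helper, the language list is an example, and the power-of-two encoding is an assumption consistent with `$spanish` being 32 above.

```php
<?php
// Hypothetical helper: turn one or more language names into a combined code,
// assuming each language's code is the bit at its position in $languages.
function languageMaskFromNames(array $names, array $languages): int
{
    $mask = 0;
    foreach ($names as $name) {
        $index = array_search($name, $languages, true);
        if ($index === false) {
            throw new InvalidArgumentException("Unknown language: $name");
        }
        $mask |= 1 << $index;
    }
    return $mask;
}

// Example list; the real list depends on the ruleset in use.
$languages = ["any", "french", "hebrew", "italian", "portuguese", "spanish"];

$single  = languageMaskFromNames(["spanish"], $languages);
$several = languageMaskFromNames(["french", "portuguese", "spanish"], $languages);
```

A single-language result could be used as `$languageCode` in the procedural call above. How a combined code for several languages should select rules is left open here, since that depends on the library's internals.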
Hi,
running the examples produces the following warning:
PHP Warning: count(): Parameter must be an array or an object that implements Countable in /workspace/vendor/dautkom/bmdm/library/BeiderMorse.php on line 321
This could be an error in the code:
for( $j = 0; $j < count($clist); $j++ ) {
should be:
for( $j = 0; $j < $clist; $j++ ) {
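The warning shows that `$clist` is not always an array at that line; whether it holds an integer count, as the proposed change assumes, is not visible from the report. A defensive variant that tolerates both cases might look like the sketch below, where `loopLimit` is a hypothetical helper and not part of the library.

```php
<?php
// Hypothetical helper: derive a loop bound from a value that may be an
// array, a Countable object, or a plain number.
function loopLimit($clist): int
{
    if (is_array($clist) || $clist instanceof Countable) {
        return count($clist);    // safe: count() only sees countable values
    }
    return (int) $clist;         // treat anything else as a numeric bound
}

// Usage in the spirit of the loop from the report.
$clist = [10, 20, 30];
$visited = 0;
for ($j = 0; $j < loopLimit($clist); $j++) {
    $visited++;
}
```

Computing the bound once before the loop would avoid re-evaluating it on every iteration; it is kept inline here to mirror the original line.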