Comments (5)
The `-t` option is related, but I think it is currently broken; see #8.
from lttoolbox.
89c2a06 can probably be simplified (the `-t` code looked simpler, but doesn't seem to support word blanks), but it seems to DTRT. It runs in 0.7s on input where regular analysis takes 3.2s and wake-up-mark-pgen takes 0.2s, which seems acceptable. I still have to check @khannatanmai's extensive pgen test suite.
from lttoolbox.
@unhammer I've made the relevant modifications to the tests in abc337d. It currently fails the first wblank test and I haven't made enough sense of the wblank logic to track down the issue.
from lttoolbox.
Say for all words in your dictionary, you want to apply the rule `…inh t…` → `…is…`. It's just noisy to have to add a `<a/>` (or an explicit `~` in hfst/lexc) to the RL form-side of every place in your dictionary where that happens, and it's especially noisy if the parts of the form `inh` are generated by different pardefs.
It occurs to me that this could also be fixed by composing
"postgen"
0:%~ <=> _ i n h .#. ;
with the generator (though making postgen able to handle this directly is probably still a good idea).
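To make the effect of that rule concrete, here is a toy Python sketch of what I'd expect it to do to a single generated form: insert the wake-up mark `~` right before a word-final `inh`. This is only an illustration, not lttoolbox or hfst code; the function name `add_wakeup_mark` and the sample strings are made up, and a real setup would compose the rule with the generator transducer rather than post-process strings.

```python
import re

# Toy stand-in for the "postgen" rule above: the empty string is realised
# as "~" exactly when it is followed by "inh" at the end of the word.
# Assumption: one surface form per call, no blanks or tags in the string.
WORD_FINAL_INH = re.compile(r"(?=inh\Z)")


def add_wakeup_mark(form: str) -> str:
    """Return `form` with "~" inserted before a word-final "inh", if any."""
    return WORD_FINAL_INH.sub("~", form, count=1)


if __name__ == "__main__":
    # Hypothetical forms, only meant to exercise the three cases.
    for form in ["abinh", "inhab", "inh"]:
        print(form, "->", add_wakeup_mark(form))
```

Only forms ending in `inh` get the mark (`abinh` becomes `ab~inh`), so nothing has to be written by hand in the individual entries or pardefs.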
from lttoolbox.
fix reverted in 957bc09 due to #123
from lttoolbox.