Comments (6)
Even when not parsing with nulls (`quite` removed so it parses):
And yet he should be always ready to have a perfectly terrible scene, whenever we want one, and to become miserable, absolutely miserable, at a moment’s notice, and to overwhelm us with just reproaches in less than twenty minutes, and to be positively violent at the end of half an hour, and to leave us for ever at a quarter to eight, when we have to go and dress for dinner when, after that, one has seen him for really the last time, and he has refused to take back the little things he has given one, and promised never to communicate with one again, or to write one any foolish letters, he should be perfectly broken-hearted, and telegraph to one all day long, and send one little notes every half-hour by a private hansom, and dine alone at the club, so that every one should know how unhappy he was.
Trace: classic_parse: Sentence disjunct count 108279 exceeded limit 105123
And this sentence has only 174 tokens. Longer sentences may have many more disjuncts.
from link-grammar.
This has to be fixed somehow, since parsing of some long sentences is silently skipped altogether.
No error is reported in that case (not even `+++++ error` in batch mode), so the skipped sentences are considered correct.
If this limit is important for the Atomese dict usage, I propose implementing one of the following:
- Add `parse_options_*_disjunct_limit()`.
- Set it using `parse_options_set_test("disjuncts_limit:1234456")`.
In any case, I propose a default of 0 or -1 (meaning unset), and that exceeding the setting produce a parse error (instead of just a debug message, as now).
Or alternatively, enable it only for the Atomese dict!
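The proposed semantics (an unset default, and a hard error rather than a debug trace when the limit is exceeded) could be sketched roughly as follows. This is only an illustration: `check_disjunct_limit()`, `parse_status` and `DISJUNCT_LIMIT_UNSET` are hypothetical names, not part of the link-grammar API, and treating every non-positive value as "unset" is an assumption that covers both the 0 and -1 variants.

```c
#include <stdio.h>

/* Hypothetical sentinel: any value <= 0 is treated as "no limit set". */
#define DISJUNCT_LIMIT_UNSET (-1L)

/* Hypothetical status codes; link-grammar reports errors differently. */
typedef enum {
    PARSE_OK = 0,
    PARSE_ERR_DISJUNCT_LIMIT
} parse_status;

/*
 * Hypothetical helper: return an error status when the sentence's
 * disjunct count exceeds a configured limit, so that the caller can
 * report a parse error instead of silently skipping the sentence.
 */
static parse_status check_disjunct_limit(long disjunct_count, long limit)
{
    if (limit <= 0)
        return PARSE_OK;            /* limit unset: never fail */

    if (disjunct_count <= limit)
        return PARSE_OK;

    fprintf(stderr,
            "Error: sentence disjunct count %ld exceeded limit %ld\n",
            disjunct_count, limit);
    return PARSE_ERR_DISJUNCT_LIMIT;
}
```

With the numbers from the trace above, `check_disjunct_limit(108279, 105123)` would return `PARSE_ERR_DISJUNCT_LIMIT`, while the same count with `DISJUNCT_LIMIT_UNSET` would return `PARSE_OK`.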
This limit can be disabled.
There were several bugs & mis-designs that led to the introduction of this limit, all of which have been fixed. Some comments:
- In this figure: #1402 (comment) it appears that pride-n-prejudice never needs more than about 300K disjuncts, which is one of the ways I selected this limit.
- I never measured Russian, or the long-sentence cases.
- The intent was that an error would be visible, as there would be zero parses (`sent->num_linkages_found = 0;`). Clearly, I didn't test enough to see if this was true.
For now, I no longer need this check. I don't know what might happen in the future.
We can disable this by setting the default to -1. Completely chopping out that code is OK, too.
> 300K disjuncts, which is one of the ways I selected this limit.
But it was set to ~100K.
In any case, note that for parsing with null_count>1, there are many more disjuncts than in null_count==0, because the pruning is much less effective for higher null_counts (and hence it is done per null_count).
> This limit can be disabled.
> We can disable this by setting the default to -1. Completely chopping out that code is OK, too.
If the code is corrected to produce an error (without then trying to parse with more nulls), we can set it to -1 permanently. However, I propose not spending any effort on fixing it and just removing it.
Closed because resolved in #1447
> But it was set to ~100K.
It was meant to be 1M.