
Comments (6)

ampli commented on May 29, 2024

Even when not parsing with nulls (quotes removed so that it parses):
And yet he should be always ready to have a perfectly terrible scene, whenever we want one, and to become miserable, absolutely miserable, at a moment’s notice, and to overwhelm us with just reproaches in less than twenty minutes, and to be positively violent at the end of half an hour, and to leave us for ever at a quarter to eight, when we have to go and dress for dinner when, after that, one has seen him for really the last time, and he has refused to take back the little things he has given one, and promised never to communicate with one again, or to write one any foolish letters, he should be perfectly broken-hearted, and telegraph to one all day long, and send one little notes every half-hour by a private hansom, and dine alone at the club, so that every one should know how unhappy he was.

Trace: classic_parse: Sentence disjunct count 108279 exceeded limit 105123

And this sentence has only 174 tokens. Longer sentences may have many more disjuncts.

from link-grammar.

ampli commented on May 29, 2024

This has to be fixed somehow, since it silently skips the parsing of some long sentences entirely.
No error is reported in that case (not even `+++++ error` in batch runs), so the skipped sentences are counted as correct.

If this limit is important for the Atomese dict usage, I propose to implement one of the following:

  1. parse_options_*_disjunct_limit()
  2. Set it using parse_options_set_test("disjuncts_limit:1234456").

In any case, I propose a default of -1 (meaning unset) and, if the limit is exceeded, producing a parse error (instead of just a debug message, as now).

Or alternatively, enable it only for the Atomese dict!
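
The semantics proposed above could be sketched roughly as follows. This is only an illustration: `check_disjunct_limit`, `DISJUNCT_LIMIT_UNSET` and the status enum are hypothetical names, not part of the link-grammar API, and the real check lives inside the parser rather than in a standalone helper.

```c
#include <stdio.h>

/* Hypothetical names for illustration; none of these exist in link-grammar. */
#define DISJUNCT_LIMIT_UNSET (-1L)

typedef enum {
    DISJUNCT_LIMIT_OK = 0,
    DISJUNCT_LIMIT_EXCEEDED = 1
} disjunct_limit_status;

/* A negative limit means "unset", so the check never fails.
 * Otherwise a count strictly greater than the limit is reported
 * as an error on stderr instead of only as a debug trace. */
static disjunct_limit_status
check_disjunct_limit(long disjunct_count, long limit)
{
    if (limit < 0)
        return DISJUNCT_LIMIT_OK;

    if (disjunct_count > limit) {
        fprintf(stderr,
                "Error: sentence disjunct count %ld exceeded limit %ld\n",
                disjunct_count, limit);
        return DISJUNCT_LIMIT_EXCEEDED;
    }
    return DISJUNCT_LIMIT_OK;
}
```

With the numbers from the trace above, `check_disjunct_limit(108279, 105123)` would report an error, while the same count with `DISJUNCT_LIMIT_UNSET` would pass.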


linas commented on May 29, 2024

This limit can be disabled.

There were several bugs and mis-designs that led to the introduction of this limit, all of which have been fixed. Some comments:

  • In this figure: #1402 (comment) it appears that pride-n-prejudice never needs more than about 300K disjuncts, which is one of the ways I selected this limit.
  • I never measured Russian, or the long-sentence cases.
  • The intent was that an error would be visible, as there would be zero parses (`sent->num_linkages_found = 0;`). Clearly, I didn't test enough to see whether this was true.

For now, I no longer need this check. I don't know what might happen in the future.

We can disable this by setting the default to -1. Completely chopping out that code is OK, too.


ampli commented on May 29, 2024

> 300K disjuncts, which one of the ways I selected this limit.

But it was set to ~100K.

In any case, note that for parsing with null_count>1, there are many more disjuncts than in null_count==0, because the pruning is much less effective for higher null_counts (and hence it is done per null_count).

> This limit can be disabled.

> We can disable this by setting the default to -1. Completely chopping out that code is OK, too.

If the code is corrected to produce an error (without then trying to parse with more nulls), we can set it to -1 permanently. However, I propose not spending any effort on fixing it and simply removing it.
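
As a rough sketch of the "produce an error, without then trying more nulls" behavior described above, consider the toy model below. It is not the link-grammar parse loop: `parse_with_limit` and its array arguments are hypothetical, standing in for the per-null_count pruning and parsing steps.

```c
/* Toy model, hypothetical names; not link-grammar code.
 *
 * disjuncts[nc]: disjunct count left after pruning at null_count nc
 * linkages[nc]:  number of linkages a parse at null_count nc would find
 * limit:         disjunct limit; a negative value means "unset"
 *
 * Returns the number of linkages found (0 if none at any level),
 * or -1 if the limit was exceeded. On exceeding the limit the loop
 * stops at once and does NOT fall through to a higher null_count. */
static int parse_with_limit(const long *disjuncts, const int *linkages,
                            int n_levels, long limit)
{
    for (int nc = 0; nc < n_levels; nc++) {
        if (limit >= 0 && disjuncts[nc] > limit)
            return -1;              /* hard error, visible to the caller */

        if (linkages[nc] > 0)
            return linkages[nc];    /* parsed at this null_count */
    }
    return 0;                       /* no parse at any null_count */
}
```

The point of the distinct `-1` return is that a caller (for example a batch run) can tell a skipped sentence apart from one that genuinely has zero linkages.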


linas commented on May 29, 2024

Closed because resolved in #1447


linas commented on May 29, 2024

> But it was set to ~100K.

It was meant to be 1M

