
Comments (22)

erezsh commented on May 16, 2024

Hi Uriva,

If you present your use-case, perhaps I can provide some insight into the best way to solve it.

Right now, ambiguity is resolved by choosing the shortest matching rule. When the length of the rules is equal, then yes, their position in the grammar marks their priority, however this is not an official feature (at least not right now), so don't rely on it.
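A rough sketch of the tie-breaking described above, assuming "length" means the number of symbols in an alternative. `Alternative` and `pick` are hypothetical names used only for illustration; they are not part of Lark, and the position rule is shown only because it describes the current, unofficial behavior:

```python
from typing import NamedTuple, Sequence, Tuple


class Alternative(NamedTuple):
    """Hypothetical record for one matching rule alternative."""
    name: str
    expansion: Tuple[str, ...]  # symbols on the right-hand side
    position: int               # order of appearance in the grammar


def pick(alternatives: Sequence[Alternative]) -> Alternative:
    # Shortest expansion wins; on equal length, the earlier grammar position wins.
    return min(alternatives, key=lambda alt: (len(alt.expansion), alt.position))


candidates = [
    Alternative("long", ("a", "b", "c"), 0),
    Alternative("second", ("a", "b"), 2),
    Alternative("first", ("x", "y"), 1),
]
print(pick(candidates).name)
```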

My hunch is that adding a priority modifier should be enough for most purposes. Something in the spirit of:

rulename.3 : some thing in side
anotherrule.5: etc etc

Do you need something more intricate than that?

from lark.

uriva commented on May 16, 2024

If it is by the position in the grammar this is good enough.

I mean - just adding it to the library contract (and ignoring the rule length).

If you feel the length of the rule is important, then introducing a priority as you suggested sounds reasonable.

Thanks!!

erezsh commented on May 16, 2024

I feel that implicit priority is a bad idea. Grammars are confusing enough already, and ambiguity even more so.

I will add numbered priority soon. Let me know if you have any preferences regarding it.

uriva commented on May 16, 2024

On second thought I agree. Implicitness here will be unclear.

erezsh commented on May 16, 2024

Hi Uriva,
I added this feature and pushed it to master. If you check out the latest HEAD of the repo, you should be able to use it.
For an example of how to use it, see:

tests/test_parser.py   :  test_earley_prioritization()

Its code should demonstrate proper usage and effects. Let me know whether it works for you, or if you have any questions.

uriva commented on May 16, 2024

I'm hitting the assert at common.py:45.

When I print the rule that doesn't have 3 components, I get:

('import', ['_IMPORT', 'import_args', '_NL'], 'autoalias_import__IMPORT_import_args__NL', <lark.load_grammar.RuleOptions instance at 0x8d82950>)

Seems to be an import.

My imports:

%import common.NUMBER
%import common.WS
%ignore WS

erezsh commented on May 16, 2024

Are you sure you have the updated version? In the latest "master/HEAD", that assert is on line 44.

Please post the full exception. It will help me understand why you're getting it.

uriva commented on May 16, 2024

Probably my mistake because now it seems to be working.

However I have a new error:

AssertionError: Priority is the same between both rules: <rule1 : token1 token2> == <rule1 : token1 token2>

Should this error occur with the same rule on both sides?

erezsh commented on May 16, 2024

No, this is a silly bug on my part. Try the latest master and see if it solves the problem for you.

uriva commented on May 16, 2024

Cool:)

uriva commented on May 16, 2024

lexer_conf = LexerConf(tokens, ['WS', 'COMMENT'])
TypeError: __init__() takes exactly 4 arguments (3 given)

uriva commented on May 16, 2024

Some of my rules have priority, but not all - I assume this is ok?

erezsh commented on May 16, 2024

That's weird. Try deleting the *.pyc files?

Yeah, a rule gets a default priority if you don't specify one.

uriva commented on May 16, 2024

Sorry I had a merge issue.
But still getting the same original error.
Could it be that the two rules are identical but still pass the equality check?

erezsh commented on May 16, 2024

Maybe. If you can give me some use case that produces this error, it would be much easier for me to correct it.

uriva commented on May 16, 2024

Ok, I'm trying to produce a minimal example.

uriva commented on May 16, 2024

The string: a b c a b c

And the grammar:

rule1.1: "a" rule4 | "a" rule3

rule2.2: rule3 "a"

rule3: "b" "c"

rule4: rule3 | "b"

start: (rule1 | rule2)+

%import common.WS
%ignore WS
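To see where the ambiguity in this example comes from, here is a minimal brute-force sketch that enumerates every derivation of the input. It is not how Lark's Earley parser works. Assumptions: `start: (rule1 | rule2)+` is rewritten as right recursion through a helper rule named `item` (my own name), whitespace handling is modelled with `str.split()`, and `derive`, `match_seq` and `all_parses` are hypothetical helpers.

```python
# Grammar from the report, as rule -> list of alternatives.
GRAMMAR = {
    "start": [["item", "start"], ["item"]],   # (rule1 | rule2)+
    "item": [["rule1"], ["rule2"]],           # helper rule, not in the original
    "rule1": [["a", "rule4"], ["a", "rule3"]],
    "rule2": [["rule3", "a"]],
    "rule3": [["b", "c"]],
    "rule4": [["rule3"], ["b"]],
}


def derive(symbol, tokens, pos):
    """Yield (tree, next_pos) for every way `symbol` matches at `pos`."""
    if symbol not in GRAMMAR:  # terminal: must equal the next token
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    for expansion in GRAMMAR[symbol]:
        for children, end in match_seq(expansion, tokens, pos):
            yield (symbol, *children), end


def match_seq(symbols, tokens, pos):
    """Yield (children, next_pos) for every way the sequence matches."""
    if not symbols:
        yield (), pos
        return
    for tree, mid in derive(symbols[0], tokens, pos):
        for rest, end in match_seq(symbols[1:], tokens, mid):
            yield (tree, *rest), end


def all_parses(text):
    """Return every complete derivation of `text` from `start`."""
    tokens = text.split()
    return [tree for tree, end in derive("start", tokens, 0) if end == len(tokens)]


parses = all_parses("a b c a b c")
print(len(parses))
```

For `a b c a b c` this yields four trees, all built from two `rule1` matches. `rule2` never participates, so the ambiguity is between the two alternatives of `rule1` itself, which both carry the same priority.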

erezsh commented on May 16, 2024

Okay. Pushed a fix to master. Try it now.

uriva commented on May 16, 2024

Seems good 👍

uriva commented on May 16, 2024

Could you elaborate a bit on how this works?
For example, if several rules were used in a parse and they have different priorities, how is the parse score computed? Is it simply the min/max priority of the rules used?

erezsh commented on May 16, 2024

Basically, whenever there is an ambiguity, the resolver chooses by these conditions, in this order:

  1. Priority (if and only if specified on both rules)
  2. If both are part of the same rule (like rule1 in your example), choose the shortest one (if such exists)
  3. Otherwise, choose the tree with the fewest children

Just from writing it down I can see this isn't good enough (which I already knew), and I'm not sure yet how to fix it. If you have any ideas, let me know. I will also consider partial fixes that will solve your current problem.
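The three conditions above can be sketched as a pairwise comparison. This is only an illustration of the stated order, not Lark's actual resolver code: `Candidate` and `choose` are hypothetical names, and three details are assumptions that the description does not settle (a higher number means higher priority, equal priorities fall through to the next condition, and a full tie keeps the first candidate).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical stand-in for one of two ambiguous derivations."""
    rule: str                        # name of the rule that produced it
    expansion_len: int               # symbols in the matched alternative
    priority: Optional[int] = None   # None: no priority was specified
    children: List[object] = field(default_factory=list)


def choose(a: Candidate, b: Candidate) -> Candidate:
    # 1. Priority, only when it is specified on both rules.
    #    Assumption: the larger number wins; equal values fall through.
    if a.priority is not None and b.priority is not None:
        if a.priority != b.priority:
            return a if a.priority > b.priority else b
    # 2. Same rule: prefer the shorter alternative, if one is shorter.
    if a.rule == b.rule and a.expansion_len != b.expansion_len:
        return a if a.expansion_len < b.expansion_len else b
    # 3. Otherwise: prefer the tree with the fewest children.
    #    Assumption: on a full tie the first candidate is kept.
    return a if len(a.children) <= len(b.children) else b
```

Note that a priority given on only one side is ignored, so condition 3 can override it.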

erezsh commented on May 16, 2024

I'm closing this issue. Let me know if there's anything that isn't resolved.
