Code Monkey home page Code Monkey logo

Comments (12)

badicsalex avatar badicsalex commented on June 12, 2024

I would rather implement left-recursion support (with a directive at the recursion point). It's not that hard anyway, since memoization is already implemented.

(EDIT: I realized that a left-recursive grammar and a precedence climber would produce different parse trees. Also most things that would need left-recursion is already solved by the repetition/closure operator, as seen in the operator example)

from peginator.

badicsalex avatar badicsalex commented on June 12, 2024

Also I'm not really sure how the output of a precedence climber would look like in the AST.

Something like this?

enum Expression {
    Number(String),
    InfixExpression(InfixExpression),
    // could be also prefix, postfix, bracket, etc.
}

struct InfixExpression {
    left_side: Box<Expression>,
    operator: Operator,
    left_side: Box<Expression>,
}

enum Operator {
    Add,
    Sub,
    Mul,
    Div,
}

I'm also not really sure how this could be represented in the grammar in a readable way, so that it remains really intuitive what kind of AST will be generated.

from peginator.

oovm avatar oovm commented on June 12, 2024

I have a question, how should Sequence be handled?

Sequence = { parts:DelimitedExpression };

Is Sequence an atomic operation or an infix operation?

from peginator.

oovm avatar oovm commented on June 12, 2024

I think at least self can be introduced for judging direct left recursion.

@precedence
arithmetic =
    = x:self op:"+" y:self
    | x:self op:"-" y:self
    | x:self op:"*" y:self
    | x:self op:"/" y:self
    | x:self op:"^" y:self
    | x:number
    | "(" x:self ")"
    ;

This syntax does not changed, but cannot represent priority groups (like rust-peg).

However, the meaning of priority groups needs to be discussed, and it may be better to form a strict partial order of priorities.


EDIT: I thought about it on the way home, priority groups really don't make sense.

Because you can always define AddLike = '+' | '-', to achieve the same effect

from peginator.

oovm avatar oovm commented on June 12, 2024

Another question is what is the name of the variant in this case? There is no way to tag.

ANTLR uses # to mark branches, the usage might be

@precedence
arithmetic =
    @Add| x:self op:"+" y:self
    @Sub| x:self op:"-" y:self
    @Mul| x:self op:"*" y:self
    @Div| x:self op:"/" y:self
    @Pow| x:self op:"^" y:self
    @Num| x:number
    @Gro| "(" x:self ")"
    ;

Here I introduced an operator like @VAR_NAME|, It looks weird, but the name of the enum can be determined by analogy with the @:Rule and @NAME:Rule operator.

I think we can think of @ as a macro that acts on |

from peginator.

oovm avatar oovm commented on June 12, 2024

Another problem is the ternary operator and the sequence operator.

@precedence
arithmetic =
    | x:self op1:"?" y:self op2:":" z:self
    | x:self y:self 
    ;

The writing is consistent but logically unreasonable, and special treatment is required for this writing.

from peginator.

badicsalex avatar badicsalex commented on June 12, 2024

Is Sequence an atomic operation or an infix operation?

I'd consider Sequence a special case of infix where there is no actual operator token.

but cannot represent priority groups (like rust-peg).

Since it's such a special syntax, it might as well copy most of the rust-peg syntax with some modification so it looks more like the rest of the syntax. Something like this maybe (with a ton of restrictions, e.g. only specific forms and string literals are allowed):

@precedence
Expression = (
    Add = (@) "+" @;
    Sub = (@) "-" @;
    --
    Mul =  (@) "*" @;
    Div = (@) "/" @;
    --
    Power = @ "^" (@);
    --
    Number = @:NUMBER;
    Group = "(" @ ")";
);

ternary operator

Apparently that's a solvable problem.

from peginator.

badicsalex avatar badicsalex commented on June 12, 2024

Then again, at that point it might be better to create a separate library (or use an existing one) that does precedence climbing simply and reliably, and then just pulling that in with @extern

from peginator.

badicsalex avatar badicsalex commented on June 12, 2024

Again, with left recursion you can have the following grammar:

Expression =
    left:*Expression (op:Add | op:Sub) right:*Expression |
    left:*Expression (op:Mul | op:Div) right:*Expression |
    left:*Expression op:Power right:*Expression |
    "(" atom:*Expression ")" |
    atom:Number ;

Add = "+";
Sub = "-";
Mul= "*";
Div = "/";
Power = "^";

Resulting types:

struct Expression {
    left: Box<Expression>,
    right: Box<Expression>,
    atom: Box<Expression_atom>,
    op: Expression_op,
}

enum Expression_atom {
    Expression(Box<Expression>),
    Number(Number),
}

enum Expression_op {
    Add(Add), Sub(Sub), ...
}

I think it would be just as easily readable as the precedence climbing proposals above. And it's way more general too, you can fit the ?: and array indexing in nicely.

P.S.: CPython's own parser is also written this way:
https://medium.com/@gvanrossum_83706/peg-at-the-core-developer-sprint-8b23677b91e6#4a74
https://github.com/we-like-parsers/pegen/blob/main/data/python.gram#L125

from peginator.

badicsalex avatar badicsalex commented on June 12, 2024

I've been playing with left recursion this afternoon, and it's very difficult to get right. See a left-recursive calculator here:
https://github.com/badicsalex/peginator/blob/eade5ae60f777e101df173ac95b5509ec235e55f/peginator_test/src/calculator_example/grammar.ebnf

The main problem is that you cannot use Expression as the left side of multiplications, it must be a Term, so you need to have a Term type that only allows other multiplications and things above it in precedence (e.g. power, or groups).

To make it a tiny bit prettier, I tried using >Term instead of @:Term in Expression, but that completely broke the left recursion cache handling (going through Expression and Term have different cache entries). This is why I put a warning in the readme.

But, on the bright side, even though types look horrible when used as literals, the code itself can be very compact, non-redundant and all around pretty.

from peginator.

oovm avatar oovm commented on June 12, 2024

Operation precedence, and operation associativity, are very special grammar rules.

Widely used in programming languages, it is worth using a special syntax.

The operation priority of the programming language has more than ten levels, if such an expression tree is used.

It is not very convenient to nested match the required level, which is inconsistent with the ADT model of rust.

from peginator.

badicsalex avatar badicsalex commented on June 12, 2024

Yeah, you convinced me.
I'll definitely implement this when I have the time. Right now a bunch of other projects have more priority.

PRs are welcome too, of course :)

from peginator.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.