
Comments (10)

zevv avatar zevv commented on May 22, 2024

I tried implementing this by reusing captures, but that doesn't fit well because of the nature of captures in NPeg, as they can be nested and are consumed by code blocks. I guess some explicit mechanism would be needed to mark a pattern like a capture, so it can be referenced at a later time.

Do you happen to know of another PEG implementation that solves this? I'd be interested in the notation used.

from npeg.

Varriount avatar Varriount commented on May 22, 2024

zevv: It appears this Ruby PEG library has backreferences - https://github.com/sander6/chomsky

The notation used is a function-call-like syntax - captures are done using cap(), while references to those captures are done using ref().

A heredoc is represented as rule :heredoc { (r(/[A-Z]+/) >= cap(:delim)) & _.* & ref(:delim) }


zevv avatar zevv commented on May 22, 2024

Ok, I made an implementation of back refs, but I'm not really happy with the
result because of the added complexity given the limited functionality it
offers, IMHO.

If you want to check it out: https://github.com/zevv/npeg/tree/backref

Usage looks like this:

  let p = peg "doc":
    S <- *Space
    doc <- +word * "<<" * Ref("sep", sep) * S * >heredoc * Backref("sep") * S * +word
    word <- +Alpha * S
    sep <- +Alpha
    heredoc <- +(1 - Backref("sep"))

This will match the following subject:

This is a <<EOT here document
  with multiple lines EOT end

and result in the capture here document\n with multiple lines

Note that the usage is clumsy: the here-doc leader is <<, after which a separator
is matched (+Alpha in this case) and stored under the name "sep". Then the heredoc
rule is matched, which matches a sequence of characters that explicitly do not
match the stored reference. Then the backref itself is matched, which completes the
here document.

This works, but is not ideal, as the heredoc rule has to be explicit in not matching
the heredoc separator string.
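
The three steps described above (store the separator, consume characters that do not start the separator, then match the separator itself) can be sketched outside of NPeg. The following is a minimal hand-written Python illustration of that mechanism, not the NPeg implementation; the function name match_heredoc and its leader parameter are hypothetical.

```python
def match_heredoc(subject, leader="<<"):
    """Return (separator, body) for the first here document, or None."""
    start = subject.find(leader)
    if start < 0:
        return None
    pos = start + len(leader)

    # Step 1: match the separator (+Alpha) and remember it,
    # which is the role Ref("sep", sep) plays in the grammar.
    end = pos
    while end < len(subject) and subject[end].isalpha():
        end += 1
    if end == pos:
        return None  # no separator after the leader
    sep = subject[pos:end]
    pos = end

    # Skip optional whitespace (the S rule).
    while pos < len(subject) and subject[pos].isspace():
        pos += 1

    # Step 2: heredoc <- +(1 - Backref("sep")).
    # Consume one character at a time for as long as the stored
    # separator does NOT match at the current position.
    body_start = pos
    while pos < len(subject) and not subject.startswith(sep, pos):
        pos += 1
    if pos == body_start or pos >= len(subject):
        return None  # empty body, or the separator never reappears

    # Step 3: the back reference itself matches at pos.
    return sep, subject[body_start:pos]
```

The loop in step 2 is where the explicit "do not match the separator" check lives; without it, a greedy body would run to the end of the input and the closing back reference could never match.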

I do not understand how the Ruby peg library handles this, because I do not see
anything similar in the example:

rule :heredoc { (r(/[A-Z]+/) >= cap(:delim)) & _.* & ref(:delim) }

Lua's LPEG also supports back references, and it seems that here they are also explicit
in not matching the terminator:

equals = lpeg.P"="^0
open = "[" * lpeg.Cg(equals, "init") * "[" * lpeg.P"\n"^-1
close = "]" * lpeg.C(equals) * "]"
closeeq = lpeg.Cmt(close * lpeg.Cb("init"), function (s, i, a, b) return a == b end)
string = open * lpeg.C((lpeg.P(1) - closeeq)^0) * close / 1

so maybe my current implementation is good enough.


zevv avatar zevv commented on May 22, 2024

Also, I do not really like the Ref() and Backref() syntax. Any ideas are welcome.


Varriount avatar Varriount commented on May 22, 2024

I mean, it's up to you to judge whether the complexity is ultimately worth it - I will admit that dynamic tokens are not something that comes up in many languages.

If you are looking for optimization ideas, then theoretically you can use an array instead of a table for the backreferences - the names only matter during compilation, so they can be rewritten to index references.

I was actually originally planning, as a workaround for the lack of functionality, to have my parser emit the beginning token (the start of the heredoc), then handle that parsing manually. It would have been somewhat clumsy, but would have worked.


Varriount avatar Varriount commented on May 22, 2024

I mean, since the names are only needed for readability, they could be transparently converted to index references at compile time. Then, at runtime, a sequence or array could be used to store/retrieve captures.
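
To make the idea concrete, here is a small Python sketch of such a compile-time pass. It is only an illustration under assumed data structures: the Instr class, the "ref"/"backref" op names, and resolve_slots are hypothetical and do not reflect NPeg's internals.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str         # "ref" stores a match, "backref" reads it back
    name: str       # name as written in the grammar
    slot: int = -1  # integer index, filled in by resolve_slots


def resolve_slots(program):
    """Compile-time pass: assign each distinct name a dense integer slot.

    Returns the number of slots the runtime has to allocate.
    """
    slots = {}
    for instr in program:
        if instr.op == "ref":
            # First use of a name allocates the next free index.
            instr.slot = slots.setdefault(instr.name, len(slots))
        elif instr.op == "backref":
            if instr.name not in slots:
                raise KeyError(f"backref to unknown name: {instr.name}")
            instr.slot = slots[instr.name]
    return len(slots)


# Hypothetical instruction stream for a grammar with two named references.
program = [Instr("ref", "sep"), Instr("ref", "quote"), Instr("backref", "sep")]
n_slots = resolve_slots(program)

# At runtime a plain list replaces the name-keyed table.
store = [None] * n_slots
store[program[0].slot] = "EOT"   # what a "ref" instruction would do
matched = store[program[2].slot]  # what a "backref" instruction would read
```

After the pass, the names are no longer needed at match time: storing and retrieving a reference is a list index instead of a string-keyed lookup.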


zevv avatar zevv commented on May 22, 2024

Closed by ee7122c
