Comments (10)
I tried implementing this by reusing captures, but that doesn't fit well because of the nature of captures in NPeg: they can be nested and are consumed by code blocks. I guess some explicit mechanism would be needed to mark a pattern like a capture, so it can be referenced at a later time.
Do you happen to know of another PEG implementation that solves this? I'd be interested in the notation used.
from npeg.
zevv: It appears this Ruby PEG library has backreferences - https://github.com/sander6/chomsky
The notation used is a function-call-like syntax: captures are done using cap(), while references to those captures are done using ref().
A heredoc is represented as rule :heredoc { (r(/[A-Z]+/) >= cap(:delim)) & _.* & ref(:delim) }
from npeg.
Ok, I made an implementation of back refs, but I'm not really happy with the
result because of the added complexity given the limited functionality it
offers, IMHO.
If you want to check it out: https://github.com/zevv/npeg/tree/backref
Usage looks like this:
let p = peg "doc":
  S <- *Space
  doc <- +word * "<<" * Ref("sep", sep) * S * >heredoc * Backref("sep") * S * +word
  word <- +Alpha * S
  sep <- +Alpha
  heredoc <- +(1 - Backref("sep"))
This will match the following subject:
This is a <<EOT here document
with multiple lines EOT end
and result in the capture here document\n with multiple lines
Note that the usage is clumsy: the here-doc leader is <<, after which a separator
is matched (+Alpha in this case) and stored under the name sep. Then the heredoc
rule is matched, which matches a sequence of characters which explicitly do not
match the stored ref. Then the backref itself is matched, which completes the
here document.
This works, but is not ideal, as the heredoc rule has to be explicit in not matching
the heredoc separator string.
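To make the three steps above concrete, here is a rough hand-written sketch in Python of what the grammar asks the matcher to do. It is not NPeg code and says nothing about how NPeg implements back references; the function name match_heredoc and its leader parameter are made up for illustration, and whitespace handling only approximates the S rule.

```python
def match_heredoc(subject, leader="<<"):
    """Return (separator, body) for the first here document, or None."""
    start = subject.find(leader)
    if start < 0:
        return None
    pos = start + len(leader)

    # Counterpart of Ref("sep", sep): match +Alpha and remember the text.
    sep_end = pos
    while (sep_end < len(subject)
           and subject[sep_end].isascii()
           and subject[sep_end].isalpha()):
        sep_end += 1
    if sep_end == pos:
        return None  # +Alpha needs at least one letter
    sep = subject[pos:sep_end]
    pos = sep_end

    # Counterpart of S: skip whitespace after the separator.
    while pos < len(subject) and subject[pos].isspace():
        pos += 1

    # Counterpart of +(1 - Backref("sep")): consume one character at a
    # time for as long as the stored separator does not match here.
    body_start = pos
    while pos < len(subject) and not subject.startswith(sep, pos):
        pos += 1
    if pos == body_start or pos == len(subject):
        return None  # empty body, or the closing separator never appeared

    # Counterpart of the final Backref("sep"): the separator starts at pos.
    return sep, subject[body_start:pos]
```

The inner loop is the part the comment calls clumsy: the body can only be delimited by checking the stored separator at every position before consuming a character.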
I do not understand how the Ruby peg library handles this, because I do not see
anything similar in the example:
rule :heredoc { (r(/[A-Z]+/) >= cap(:delim)) & _.* & ref(:delim) }
Lua's LPEG also supports back references, and it seems that here they are also explicit
in not matching the terminator:
equals = lpeg.P"="^0
open = "[" * lpeg.Cg(equals, "init") * "[" * lpeg.P"\n"^-1
close = "]" * lpeg.C(equals) * "]"
closeeq = lpeg.Cmt(close * lpeg.Cb("init"), function (s, i, a, b) return a == b end)
string = open * lpeg.C((lpeg.P(1) - closeeq)^0) * close / 1
so maybe my current implementation is good enough.
from npeg.
Also, I do not really like the Ref() and Backref() syntax. Any ideas are welcome.
from npeg.
I mean, it's up to you to judge whether the complexity is ultimately worth it - I will admit that dynamic tokens are not something that comes up in many languages.
If you are looking for optimization ideas, then theoretically you can use an array instead of a table for the backreferences - the names only matter during compilation, so they can be rewritten to index references.
I was actually originally planning, as a workaround for the lack of functionality, to have my parser emit the beginning token (the start of the heredoc), then handle that parsing manually. It would have been somewhat clumsy, but would have worked.
from npeg.
I mean, since the names are only needed for readability, they could be transparently converted to index references at compile time. Then, at runtime, a sequence or array could be used to store/retrieve captures.
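A minimal sketch of that idea in Python, not taken from NPeg: the (opcode, name) pairs, the "store"/"load" opcodes and both function names are invented for illustration. The name table exists only while compiling; the runtime sees slot numbers and a plain list.

```python
def compile_refs(ops):
    """Rewrite named reference operations to slot indices.

    `ops` is a list of (opcode, name) pairs, a made-up stand-in for a
    compiled pattern. Returns the rewritten ops and the number of slots.
    """
    slots = {}  # name -> index, needed only while compiling
    compiled = []
    for opcode, name in ops:
        # First use of a name allocates the next free slot.
        index = slots.setdefault(name, len(slots))
        compiled.append((opcode, index))
    return compiled, len(slots)


def run(compiled, slot_count, matched_texts):
    """Toy runtime: 'store' saves a text in its slot, 'load' reads it back."""
    slots = [None] * slot_count  # plain list, no name lookups at runtime
    texts = iter(matched_texts)
    loaded = []
    for opcode, index in compiled:
        if opcode == "store":
            slots[index] = next(texts)
        elif opcode == "load":
            loaded.append(slots[index])
    return loaded


ops = [("store", "sep"), ("load", "sep")]
compiled, slot_count = compile_refs(ops)
print(compiled)  # slot numbers instead of names
print(run(compiled, slot_count, ["EOT"]))
```

Whether this saves anything measurable depends on how many references a grammar uses; with one or two names the difference from a table lookup is probably small.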
from npeg.
Closed by ee7122c
from npeg.