Comments (6)
Looks great! I labeled it as 'information', so that users could benefit from it.
from cpp-peglib.
@Beedeebee, thanks for the feedback. I think there is no easy solution for that unless you give up using %whitespace
...
A workaround I can come up with for this situation is as below:
I know this is not a super beautiful solution, but it works with %whitespace. Hope it helps!
from cpp-peglib.
Thanks, the workaround you're suggesting works perfectly for me! :)
from cpp-peglib.
@olivren, thanks for the report. I tried to handle this situation without making any code change in peglib.h, and found the following solution.
TEST_CASE("WHITESPACE test3", "[general]") {
    peg::parser parser(R"(
        StrQuot      <- < '"' < (StrEscape / StrChars)* > '"' >  # Nested token operators
        StrEscape    <- '\\' any
        StrChars     <- (!'"' !'\\' any)+
        any          <- .
        %whitespace  <- [ \t]*
    )");

    parser["StrQuot"] = [](const SemanticValues& sv) {
        REQUIRE(sv.token() == R"( aaa \" bbb )"); // Get text in the inner token operator
    };

    auto ret = parser.parse(R"( " aaa \" bbb " )");
    REQUIRE(ret == true);
}
The key to the solution is to use token operators effectively. peglib does not apply %whitespace skipping inside text surrounded by a token operator, so spaces there are kept as part of the match.
Also, I discovered that when we use nested token operators in a rule and call the sv.token() method in the corresponding action handler, we capture only the text in the inner token operator, like aaa \" bbb (including its surrounding spaces).
Please let me know if the above solution can work for you. Thanks!
from cpp-peglib.
Thanks for your answer. I played a bit with the placement of the token operators in my grammar, and I found a combination that works for my case. I am not sure I understand exactly the logic of when whitespaces are ignored though.
Here is the full code for reference. It may be useful for other users. It implements python-style string literals (no string prefix, no raw string, no \o \x \N \u escape codes, and no \ newline management)
StrDblQuot <- < '"' < (StrEscape / StrDblQuotChars)* > '"' >
StrEscape <- '\\' any
StrDblQuotChars <- (!'"' !'\\' any)+
any <- !'\n' !'\r' .
%whitespace <- [ \t]*
rule["StrDblQuot"] = [](const SemanticValues& sv) -> string {
    ostringstream ss;
    for (string& e : sv.transform<string>())
        ss << e;
    return ss.str();
};
rule["StrEscape"] = [](const SemanticValues& sv) -> string {
    string tok = sv.token();
    assert(tok.size() == 2);
    switch (tok.back()) {
        case '\\': return "\\";
        case '\'': return "'";
        case '"': return "\"";
        case 'a': return "\a";
        case 'b': return "\b";
        case 'f': return "\f";
        case 'n': return "\n";
        case 'r': return "\r";
        case 't': return "\t";
        case 'v': return "\v";
        default: return tok;
    }
};
rule["StrDblQuotChars"] = [](const SemanticValues& sv) -> string {
    return sv.token();
};
The input (" a \" \\ b ") yields the string ( a " \ b )
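For anyone who wants to check the escape mapping without building a parser, here is a minimal standalone sketch of the same decoding. It does not use cpp-peglib at all; decode_escape and decode_body are hypothetical helper names I made up for illustration, and the function works on the text between the quotes rather than on semantic values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of cpp-peglib): maps the character that
// follows a backslash to the text it denotes, mirroring the switch in the
// StrEscape handler. Unknown escapes are kept verbatim (backslash + char).
std::string decode_escape(char c) {
    switch (c) {
        case '\\': return "\\";
        case '\'': return "'";
        case '"':  return "\"";
        case 'a':  return "\a";
        case 'b':  return "\b";
        case 'f':  return "\f";
        case 'n':  return "\n";
        case 'r':  return "\r";
        case 't':  return "\t";
        case 'v':  return "\v";
        default:   return std::string("\\") + c;
    }
}

// Hypothetical helper: decodes the body of a string literal, i.e. the text
// between the double quotes. Whitespace is copied through unchanged, and a
// lone trailing backslash is copied as-is.
std::string decode_body(const std::string& body) {
    std::string out;
    for (std::size_t i = 0; i < body.size(); ++i) {
        if (body[i] == '\\' && i + 1 < body.size()) {
            out += decode_escape(body[++i]);  // consume the escaped character
        } else {
            out += body[i];
        }
    }
    return out;
}
```

Calling decode_body(R"( a \" \\ b )") gives the same result as the parser-based version for that input, so it can serve as a quick cross-check of the handler's switch.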
from cpp-peglib.
Hello, I have a similar problem with whitespaces. Not sure if I should contribute to this issue or create a new one. I apologize in advance if I chose the wrong option.
I'm trying to write a grammar where the whitespace matters only in one point: between an identifier and a '(', to distinguish between a function call (no space admitted) and a sequence of an identifier and a grouped expression. Something like:
EXPR <- GROUP+
GROUP <- '(' EXPR ')' / PRIMARY
PRIMARY <- IDENT '(' EXPR (',' EXPR)* ')' # function call - no space allowed after IDENT
/ IDENT # variable name
/ LITERAL # literal
I'd like to parse x(x) as a function call and x (x) as an IDENT followed by a GROUP.
I believe that cpp-peglib's syntax works exactly like this as well.
Is it possible to use Tokens or any other solution to say that no whitespace is allowed between IDENT and '(' for function calls, without having to give up %whitespace altogether?
from cpp-peglib.