
Comments (6)

mqnc commented on August 14, 2024

I've done some more investigation. Here is my complete C++ file:

#include "peglib.h"
#include <iostream>
#include <cstdlib>

using namespace peg;
using namespace std;

int main(int argc, const char** argv)
{

	parser parser(R"(
		term <- ( ws1 atom1 ws2 op )* ws3 atom2 ws4
		#term <- (atom1 op)* atom2
		op <- '+'
		ws1 <- ' '*
		ws2 <- ' '*
		ws3 <- ' '*
		ws4 <- ' '*
		atom1 <- [0-9]*
		atom2 <- [0-9]*
	)");

	for(auto& rule:parser.get_rule_names()){
		parser[rule.c_str()] = [rule](const SemanticValues& sv, any&) {
			cout << "rule " << rule << " called\n";
			for(size_t i = 0; i<sv.size(); i++){
				cout << "  sv" << i << ": " << sv[i].get<string>() << "\n";
			}
			return rule + " at " + to_string(sv.c_str()-sv.ss);
		};
	}

	auto expr = "1";
	if (parser.parse(expr)) {
		return 0;
	}

	cout << "syntax error..." << endl;

	return -1;
}

I've numbered the rules so we can see which one is called, and I also print the position at which each one matches. Here is the output:

rule ws1 called
rule atom1 called
rule ws2 called
rule ws3 called
rule atom2 called
rule ws4 called
rule term called
  sv0: atom1 at 0
  sv1: ws2 at 1
  sv2: ws3 at 0
  sv3: atom2 at 0
  sv4: ws4 at 1

It should enter the parentheses, match ws1, atom1 and ws2, fail to match op, and then reject the whole parenthesized group. Somehow the results are kept anyway (except for ws1, for some reason).

If I do the same thing without the ws, everything's fine:

term <- (atom1 op)* atom2

->

rule atom1 called
rule term called
  sv0: atom1 at 0

but one whitespace already messes things up:

term <- (ws1 atom1 op)* atom2

->

rule ws1 called
rule atom1 called
rule atom2 called
rule term called
  sv0: atom1 at 0
  sv1: atom2 at 0

I will try to dive into the code, let's see if I can spot the bug.

from cpp-peglib.

mqnc commented on August 14, 2024

Ok I think I know what it is.

parser parser(R"(
	term <- ( a b c x )? a b c
	a <- 'a'
	b <- 'b'
	c <- 'c'
	x <- 'x'
)");
auto expr = "abc";

->

rule term called
  sv0: b at 1
  sv1: c at 2
  sv2: a at 0
  sv3: b at 1
  sv4: c at 2

It seems that when a parenthesized group is rejected, only one item is removed from the list of semantic values. I will see if I can find that in the code.
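To illustrate what a correct rollback looks like, here is a minimal toy sketch. It is not cpp-peglib's actual code: `Values` and `match_sequence` are hypothetical names, and the "grammar" is just a string of single-character literals. The point is that a failed attempt must restore the value list to the size it had before the attempt, not merely pop one entry.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model (an assumption for illustration, not cpp-peglib internals):
// each matched literal pushes one "semantic value" onto a shared list.
using Values = std::vector<std::string>;

// Tries to match every character of `literals` in order, starting at `pos`.
// On failure, both the input position and the value list are rolled back
// to the state they had before the attempt.
bool match_sequence(const std::string& input, std::size_t& pos,
                    const std::string& literals, Values& values) {
    assert(pos <= input.size());
    const std::size_t saved_pos = pos;
    const std::size_t saved_count = values.size();  // mark before trying

    for (char expected : literals) {
        if (pos >= input.size() || input[pos] != expected) {
            pos = saved_pos;
            values.resize(saved_count);  // drop ALL values pushed in this attempt
            return false;
        }
        values.push_back(std::string(1, expected) + " at " + std::to_string(pos));
        ++pos;
    }
    return true;
}
```

With the input "abc", trying the literals "abcx" first and then "abc" leaves exactly three values in the list, which is what `( a b c x )? a b c` should produce.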


yhirose commented on August 14, 2024

@mqnc, thanks for the detailed bug report. I'll also look into it.


yhirose commented on August 14, 2024

@mqnc, I guess I fixed it. Could you try your test with the latest code?


mqnc commented on August 14, 2024

You're ok! :D
Normally you have things implemented about 10 minutes after they were reported, so I was already worried about you! :D
Just kidding :)

Thanks for implementing it! I will check whether it works with my code now. I see you have already added the test case to your tests. It's strange that this didn't cause problems before; it should also have occurred in your calc.cc example...

But I have a question. Pegdebug does a number of indexed string substitutions with the semantic values. For instance:

"Hello world!"
sv0: substitute characters 0..4 with "(div)Hello(/div)"
sv1: substitute characters 6..10 with "(div)world(/div)"

For this to work, I have to work strictly backwards, since the first substitution would change the character indices of the later substitutions. I am assuming that the semantic values are always ordered by their occurrence in the text. Is that guaranteed, or should I always sort them manually?
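For reference, the backwards substitution described above could be sketched roughly as follows. This is not Pegdebug's actual code: `Substitution` and `apply_substitutions` are hypothetical names, ranges are assumed not to overlap, and the sketch sorts defensively instead of relying on the input order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: replace text[begin, begin + length) with `replacement`.
struct Substitution {
    std::size_t begin;
    std::size_t length;
    std::string replacement;
};

// Applies all substitutions to a copy of `text` and returns the result.
// Assumes the ranges do not overlap.
std::string apply_substitutions(std::string text, std::vector<Substitution> subs) {
    // Process the rightmost range first, so that each replacement leaves
    // the offsets of all remaining (earlier) ranges unchanged.
    std::sort(subs.begin(), subs.end(),
              [](const Substitution& a, const Substitution& b) {
                  return a.begin > b.begin;
              });
    for (const auto& s : subs) {
        assert(s.begin + s.length <= text.size());
        text.replace(s.begin, s.length, s.replacement);
    }
    return text;
}
```

For the "Hello world!" example, characters 0..4 correspond to `{0, 5, ...}` and characters 6..10 to `{6, 5, ...}`. If the values are already known to arrive in text order, the sort can be replaced by a reverse iteration.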

Cheers!


yhirose commented on August 14, 2024

Yes. Semantic values should be sorted by their occurrences. Hope your project will go well!

