A recursive-descent parser generator for D
LLtool generates the body of a parser from a context-free grammar written in EBNF. The generated code fragment can be mixed into D source.
The generated code makes some assumptions about the environment:
- It assumes there are functions `expect()`, `consume()` and `advance()`.
- It assumes there exists an enumeration `TokenKind`.
- It assumes there is a member/variable `tok`. The type of `tok` is not important. Only a member/property `kind` (of type `TokenKind`) is required.
- Currently the generator assumes that `std.algorithm.among` is imported.
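As a rough illustration of such an environment, the sketch below provides every name the list asks for. Only `expect()`, `consume()`, `advance()`, `TokenKind`, `tok`, `kind` and `among` come from the list; the signatures, the "return `true` on error" convention, the `Parser` struct, its `input`, `pos` and `errors` members, and the hand-written `parseSum()` are assumptions made for this example, not LLtool's documented behavior.

```d
import std.algorithm : among; // the generated code expects `among` to be in scope

// Hypothetical token kinds; a real grammar would declare its own.
enum TokenKind { eoi, identifier, plus }

// Only the `kind` member is required; `text` is an optional payload.
struct Token
{
    TokenKind kind;
    string text;
}

struct Parser
{
    Token[] input;   // hypothetical pre-scanned token stream
    size_t pos;      // index of the next unread token
    Token tok;       // current lookahead token
    string[] errors; // collected error messages

    this(Token[] input)
    {
        this.input = input;
        advance(); // load the first token into `tok`
    }

    // Move to the next token; yields an `eoi` token once the input is exhausted.
    void advance()
    {
        tok = pos < input.length ? input[pos++] : Token(TokenKind.eoi);
    }

    // Assumed convention: returns true on a mismatch, false on success.
    bool expect(TokenKind kind)
    {
        if (tok.kind == kind)
            return false;
        errors ~= "unexpected token";
        return true;
    }

    // Assumed convention: check the current token, then move past it.
    bool consume(TokenKind kind)
    {
        if (expect(kind))
            return true;
        advance();
        return false;
    }

    // Hand-written stand-in for a generated body of:
    //   sum : identifier ( "+" identifier )* ;
    bool parseSum()
    {
        if (consume(TokenKind.identifier))
            return true;
        while (tok.kind.among(TokenKind.plus))
        {
            advance();
            if (consume(TokenKind.identifier))
                return true;
        }
        return false;
    }
}
```

With these definitions, `Parser([Token(TokenKind.identifier, "a")]).parseSum()` returns `false`, which under the assumed convention means the input was accepted.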
The input syntax for LLtool is similar to yacc/bison. It has the following specification:
%token identifier, code, argument, string
%start lltool
%%
lltool : ( header )? ( rule )+ ;
header : ( "%start" identifier | "%token" tokenlist | "%eoi" identifier )* "%%" ;
tokenlist : tokendecl ("," tokendecl )* ;
tokendecl : (identifier | string) ( "=" identifier )? ;
rule : nonterminal "=" rhs "." ;
nonterminal : identifier ( argument )? ;
rhs : sequence ( "|" sequence )* ;
sequence : ( group | identifier ( argument)? | string | code | "%if" code )* ;
group : "(" rhs ( ")" | ")?" | ")*" | ")+" ) ;
This specification uses the following tokens:
- `identifier`: a sequence of letters and digits. The first character must be a letter. Only ASCII characters are supported.
- `string`: an arbitrary sequence of characters, enclosed by `"` and `"` or `'` and `'`.
- `code`: an arbitrary sequence of characters, enclosed by `{.` and `.}`.
- `argument`: an arbitrary sequence of characters, enclosed by `<` and `>` or `<.` and `.>`.
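The token rules above can be sketched as a small scanner. This is not LLtool's own lexer: `LexKind`, `Lexeme`, `Delim`, `delimited` and `scanToken` are hypothetical names, and the sketch assumes that a token ends at the first closing delimiter and that the two-character openers are tried before the one-character ones.

```d
import std.algorithm.searching : startsWith;
import std.ascii : isAlpha, isAlphaNum;
import std.string : indexOf;

enum LexKind { identifier, str, code, argument, unknown }

struct Lexeme
{
    LexKind kind;
    string text; // the lexeme including its delimiters
}

struct Delim
{
    LexKind kind;
    string open, close;
}

// Delimiter pairs in priority order: "<." must be tried before "<".
immutable Delim[] delims = [
    Delim(LexKind.str, `"`, `"`),
    Delim(LexKind.str, `'`, `'`),
    Delim(LexKind.code, "{.", ".}"),
    Delim(LexKind.argument, "<.", ".>"),
    Delim(LexKind.argument, "<", ">"),
];

// Returns the lexeme `open ... close` at the start of `src`, or null if absent.
string delimited(string src, string open, string close)
{
    if (!src.startsWith(open))
        return null;
    auto idx = src[open.length .. $].indexOf(close);
    return idx < 0 ? null : src[0 .. open.length + idx + close.length];
}

// Scans one token at the start of `src`; leading whitespace is not skipped.
Lexeme scanToken(string src)
{
    // identifier: an ASCII letter followed by ASCII letters and digits
    if (src.length && isAlpha(src[0]))
    {
        size_t end = 1;
        while (end < src.length && isAlphaNum(src[end]))
            ++end;
        return Lexeme(LexKind.identifier, src[0 .. end]);
    }
    // string, code and argument: the first matching delimiter pair wins
    foreach (d; delims)
    {
        auto s = delimited(src, d.open, d.close);
        if (s.length)
            return Lexeme(d.kind, s);
    }
    return Lexeme(LexKind.unknown, null);
}
```

For instance, `scanToken("<. a > b .> rest")` yields an `argument` lexeme with the text `<. a > b .>`, because the `<.` opener is matched against `.>` rather than the first `>`.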
Single-line comments start with `//` and run until the end of the line.
Multi-line comments use `/*` and `*/` as delimiters. Multi-line comments may not be nested.
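A minimal sketch of how a scanner could skip such comments is given below. The function name `skipComment` is hypothetical, and treating an unterminated multi-line comment as running to the end of the input is an assumption, not documented LLtool behavior.

```d
import std.algorithm.searching : startsWith;
import std.string : indexOf;

// Returns `src` with one leading comment removed, or `src` unchanged
// if it does not start with a comment.
string skipComment(string src)
{
    if (src.startsWith("//"))
    {
        // A single-line comment runs until the end of the line.
        auto eol = src.indexOf('\n');
        return eol < 0 ? "" : src[eol + 1 .. $];
    }
    if (src.startsWith("/*"))
    {
        // No nesting: the first "*/" closes the comment.
        auto end = src[2 .. $].indexOf("*/");
        // Assumption: an unterminated comment swallows the rest of the input.
        return end < 0 ? "" : src[2 + end + 2 .. $];
    }
    return src;
}
```

Because comments do not nest, `skipComment("/* a /* b */ c */ d")` stops at the first `*/` and returns `" c */ d"`.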
Planned features:
- Support parser generation at compile time
- Implement error recovery
Feedback of any kind is very much appreciated!