beelsebob / coreparse
A shift/reduce parsing framework for Mac OS X and iOS
License: BSD 3-Clause "New" or "Revised" License
There have been some fixes since 1.1. If you feel comfortable, would you please push a new version tag to GitHub?
The grammar parser currently will not parse this rule:
A ::= (<B> | <C>) because of the use of '|' inside parentheses.
At the moment a tokeniser must produce its entire output before the parser can start. The two should be able to run as producer and consumer on separate threads.
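As a rough illustration of the producer/consumer arrangement described here, the sketch below uses Python's standard `queue` and `threading` modules. It is not CoreParse code: `tokenise`, `parse`, `run` and the `END` sentinel are hypothetical names, and the "parser" only collects tokens.

```python
import queue
import threading

END = object()  # sentinel marking the end of the token stream

def tokenise(text, out):
    """Producer: push whitespace-separated tokens as soon as they are found."""
    for word in text.split():
        out.put(word)
    out.put(END)

def parse(tokens):
    """Consumer: stand-in 'parser' that just collects tokens as they arrive."""
    seen = []
    while True:
        tok = tokens.get()
        if tok is END:
            return seen
        seen.append(tok)

def run(text):
    # Bounded queue, so the producer cannot run far ahead of the consumer.
    q = queue.Queue(maxsize=4)
    producer = threading.Thread(target=tokenise, args=(text, q))
    producer.start()
    result = parse(q)
    producer.join()
    return result
```

The bounded queue is the design choice that matters: it keeps memory use flat however long the input is, because the tokeniser blocks whenever the parser falls behind.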
I was creating a parser for the following grammar:
Expression ::= (<Orientation>)?
(<Superview><Connection>)?
<View> (<Connection><View>)*
(<Connection><Superview>)?;
Connection ::= '-' | '-' 'Number' '-';
View ::= '[' 'Identifier' ']';
Superview ::= '|';
Orientation ::= 'V:' | 'H:';
I encounter an error when I try to create any of the parsers (SLR, LR(1) or LALR(1)): "Could not insert reduce in action table for state 14, token -".
Test Case:
NSError* error;
NSString* grammarPath = [[NSBundle mainBundle] pathForResource:@"VFL" ofType:@"grammar"];
NSString* grammarString = [NSString stringWithContentsOfFile:grammarPath
                                                    encoding:NSUTF8StringEncoding
                                                       error:&error];
CPGrammar* grammar = [[CPGrammar alloc] initWithStart:@"Expression"
                                       backusNaurForm:grammarString
                                                error:&error];
NSLog(@"grammar: %@", grammarString);
CPTokeniser* tokenizer = [[CPTokeniser alloc] init];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"V:"]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"H:"]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"|"]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"-"]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@","]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"("]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@")"]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"["]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"]"]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@">="]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"<="]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"=="]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@">"]];
[tokenizer addTokenRecogniser:[CPKeywordRecogniser recogniserForKeyword:@"<"]];
[tokenizer addTokenRecogniser:[CPIdentifierRecogniser identifierRecogniser]];
[tokenizer addTokenRecogniser:[CPNumberRecogniser numberRecogniser]];
[tokenizer addTokenRecogniser:[CPWhiteSpaceRecogniser whiteSpaceRecogniser]];
CPTokenStream* stream = [tokenizer tokenise:@"H:|-50-[label]-50-|"];
NSLog(@"stream: %@", stream);
CPParser* parser = [[CPLALR1Parser alloc] initWithGrammar:grammar];
CPSyntaxTree* ast = [parser parse:stream];
NSLog(@"ast: %@", ast);
If I remove (<Connection><View>)* from the grammar, it works. Can I rewrite the grammar so that it works with CoreParse?
As title.
Thanks in advance!
This would allow it to begin tokenising midway through downloading content, rather than requiring all of it to be in memory before beginning.
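A minimal sketch of tokenising input that arrives in pieces, written in Python rather than against CoreParse's API. `tokenise_chunks` and the `TOKEN` pattern are illustrative assumptions. The key point is that a token touching the end of the buffer is held back, since it may continue in the next chunk.

```python
import re

# Illustrative token pattern: numbers, identifiers, or any single symbol.
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|\S)")

def tokenise_chunks(chunks):
    """Yield tokens from an iterable of text chunks without joining them first."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        pos = 0
        while True:
            m = TOKEN.match(buffer, pos)
            # Stop if nothing matches, or if the match touches the end of the
            # buffer: the token might continue in the next chunk.
            if m is None or m.end() == len(buffer):
                break
            yield m.group(1)
            pos = m.end()
        buffer = buffer[pos:]
    # Input is finished, so whatever remains can be tokenised completely.
    yield from TOKEN.findall(buffer)
```

Splitting the same text at different points should give the same token sequence, which is a useful property to test for in any streaming tokeniser.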
When the tokeniser fails to recognise anything, a delegate method should be called allowing the user to skip over characters and/or insert an error token in the token stream.
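The requested behaviour could look roughly like the Python sketch below, where a callback plays the role of the delegate method. Nothing here is CoreParse API: `tokenise`, `on_error`, `skip_one_and_flag` and the `("ERROR", ...)` token shape are assumptions made for illustration.

```python
import re

# Illustrative token pattern: numbers, identifiers and a few symbols.
TOKEN = re.compile(r"\d+|[A-Za-z_]\w*|[-\[\]|:]")

def tokenise(text, on_error):
    """Tokenise text; on unrecognised input, ask on_error what to do.

    on_error(text, pos) returns (chars_to_skip, error_token_or_None).
    """
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN.match(text, pos)
        if m:
            tokens.append(m.group())
            pos = m.end()
            continue
        skip, error_token = on_error(text, pos)
        if error_token is not None:
            tokens.append(error_token)
        pos += max(1, skip)  # always advance, so a bad callback cannot loop forever
    return tokens

def skip_one_and_flag(text, pos):
    """Example policy: skip one character and record it as an error token."""
    return 1, ("ERROR", text[pos])
```

Returning both a skip count and an optional token covers the two options named in the request: silently skipping characters, or inserting an error token into the stream.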
Summary:
At the moment, CPParseResult implementations rely heavily on the position of elements within productions. This means that if the grammar is edited, significant modifications to the relevant initWithSyntaxTree: method are often required. It would be extremely useful to provide a way of accessing elements without relying on their position in the rule.
Proposal:
Allow keys to be used in productions, like so:
A ::= b@<B> c@<C> ('{' d@<B>* '}')?
CPSyntaxTree would gain an additional method, -childForKey:. In this instance it would give the following results:
[tree childForKey:@"b"] – equal to the current [[tree children] objectAtIndex:0].
[tree childForKey:@"c"] – equal to the current [[tree children] objectAtIndex:1].
[tree childForKey:@"d"] – equal to the current [[[[[tree children] objectAtIndex:2] objectAtIndex:0] objectAtIndex:0] objectAtIndex:1] if the ? production exists, nil otherwise.
[tree childForKey:@"e"] – nil.
The use of keys inside ()*s and ()+s would be disallowed, as would duplicate keys on the same section of an alternative. All of the following would be invalid:
A ::= b@<B> b@<B>
A ::= ( b@<B> <C> )*
A ::= ( b@<B> <C> )+
This however would be allowed:
A ::= b@<B> | b@<C>
Notes: It has been suggested that : is a better character than @ for keys:
A ::= b:<B> c:<C> ('{' d:<B>* '}')?
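To make the proposed lookup semantics concrete, here is a toy model in Python. It is only a sketch of the idea, not the proposed Objective-C implementation: the `SyntaxTree` class, its `(key, value)` child pairs and `child_for_key` are hypothetical stand-ins for CPSyntaxTree and -childForKey:.

```python
class SyntaxTree:
    """Toy syntax tree whose children may carry an optional key."""

    def __init__(self, rule, children):
        self.rule = rule
        self.children = children  # list of (key_or_None, value) pairs

    def child_for_key(self, key):
        """Return the value tagged with key, searching untagged nested groups."""
        for child_key, value in self.children:
            if child_key == key:
                return value
            # Untagged nested groups (e.g. the contents of a '?') are searched too.
            if child_key is None and isinstance(value, SyntaxTree):
                found = value.child_for_key(key)
                if found is not None:
                    return found
        return None


# A ::= b@<B> c@<C> ('{' d@<B>* '}')?   with the optional part present.
optional = SyntaxTree("optional", [(None, "{"), ("d", ["B2", "B3"]), (None, "}")])
tree = SyntaxTree("A", [("b", "B1"), ("c", "C1"), (None, optional)])
```

With this model, `tree.child_for_key("d")` reaches into the optional group without the caller knowing how deeply it is nested, and an unknown key such as `"e"` gives `None`, matching the nil result described above.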
CoreParse needs a method of handling shift/reduce and reduce/reduce conflicts to allow for dealing with ambiguous grammars.
Grammars should be specifiable using single quote marks around terminals, to avoid having to escape double quotes in strings all the time.
CoreParse needs documentation for all classes.
I'm new to Xcode and iOS and the procedure to add CoreParse into my iOS project is not straightforward. When I do the basic stuff I am still getting linker errors, I believe because of CoreParse's use of categories.
Step-by-step instructions on how best to do this would be helpful.
CoreParse does not currently support creating LALR(1) parsers.
Currently the project contains only a Mac OS X project, an iOS static library would be nice too.
CoreParse currently just bombs out, printing the remaining token stream, when it fails to parse. This needs to be improved, both by making it return errors rather than logging them, and by making it able to recover from errors and continue parsing.
What's a recommended way to create a tokeniser that will handle Python-style blocks, where the blocks are denoted by indentation?
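One common approach, and the one Python's own tokeniser takes, is to convert changes in indentation into explicit INDENT and DEDENT tokens, which a grammar can then treat like opening and closing braces. The Python sketch below shows the idea independently of CoreParse. `indent_tokens` and the `("LINE", ...)` token shape are illustrative assumptions, and it assumes indentation uses spaces only.

```python
def indent_tokens(source):
    """Turn leading spaces into INDENT/DEDENT tokens, with one LINE token per line."""
    tokens = []
    stack = [0]  # indentation widths of the currently open blocks
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines do not affect block structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append("INDENT")
        while width < stack[-1]:
            stack.pop()
            tokens.append("DEDENT")
        if width != stack[-1]:
            raise ValueError(f"inconsistent dedent: {line!r}")
        tokens.append(("LINE", line.strip()))
    # Close any blocks still open at the end of the input.
    tokens.extend("DEDENT" for _ in stack[1:])
    return tokens
```

In a real tokeniser each LINE would itself be split into tokens. The stack is what lets a single line close several nested blocks at once.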
There have been a lot of interesting changes in CP recently, including some API changes and a large number of optimizations.
This is just a reminder from your friendly CocoaPod spec maintainer to push up a new tag when you feel like you have a stable/finished release. When you do, close this issue and I'll have an updated Podspec committed in a short while. (The existing spec resides here.)
Thanks!
I'm working on a grammar that should accept optional whitespace. After I added the optional whitespace rule, the parser reports the error "Could not insert reduce in action table for state 4, token #".
I cannot tell whether this is a problem in my grammar or something CoreParse does not support. I would appreciate any pointers on this error!
Before I added optional whitespace:
CSSSelectors ::= <CSSSelectorSequence> (<CSSCombinator> <CSSSelectorSequence>)*;
CSSCombinator ::= <Greater> 'Whitespace'* | <Plus> 'Whitespace'* | <Tilde> 'Whitespace'*| 'Whitespace'+;
Plus ::= '+';
Greater ::= '>';
Tilde ::= '~';
After I added optional whitespace, the parser returns the error:
CSSSelectors ::= <CSSSelectorSequence> (<CSSCombinator> <CSSSelectorSequence>)*;
CSSCombinator ::= <Greater> 'Whitespace'* | <Plus> 'Whitespace'* | <Tilde> 'Whitespace'*| 'Whitespace'+;
Plus ::= 'Whitespace'* '+';
Greater ::= 'Whitespace'* '>';
Tilde ::= 'Whitespace'* '~';
Full grammar: https://github.com/siuying/CSSSelectorConverter/blob/core-parse/CSSSelectorConverter/CSSSelectorGrammar.txt
Parsers should have a method of recovering from errors and continuing parsing.
We are developing an app that uses CoreParse to parse specific strings, and I noticed that the library leaks memory. The most critical leak is in CPShiftReduceParser, when instantiating a CPRHSItemResult.
To fix this leak I modified this line:
result = [(id)[c alloc] initWithSyntaxTree:tree];
by adding an autorelease call:
result = [[(id)[c alloc] initWithSyntaxTree:tree] autorelease];
This fixed several leaks for me.
I've fleshed out the example described on the main CoreParse web page:
https://github.com/gavineadie/ParseTest
I noticed that some CPGrammar instances are broken and unusable after being archived and then unarchived. This only happens on iOS.
An example test case: siuying@1344c73
Somehow the order of the rules changed after archive/unarchive, and the resulting grammar is not correct.