parsley's Introduction

Parsley PRE-RELEASE

Parsley is a recursive descent parser combinator library that makes it simple to write complex, type-safe parsers in Swift. Note that Parsley does not promise source compatibility for the time being. The library is in active development, and many things might change in the near future. Specifically,

  • I want to somehow modify it to support tracking indices while parsing. This might introduce a requirement that what's being parsed is indexable.
  • I want to improve error handling, especially how it reports the expected value versus the actual value.
  • I want to slim the library down to a lightweight core, with a separate library adding more functionality on top of it.

That said, feel free to play around with it if you'd like; it doesn't bite!

Lexing

It's super easy to define lexable types in Parsley! The simplest types to lex are those whose values can be enumerated and matched against the input. For example, if you'd like to lex a finite set of operators, your operator type can conform to Matchable and it will be automatically Parsable.

Conformance to Matchable simply requires a static member all listing all the possible values of that type and an instance member matcher that will only accept input that matches, throwing a ParseError otherwise.

enum OperatorSymbol: Character, Matchable {
    case Plus  = "+"
    case Minus = "-"

    static var all: [OperatorSymbol] = [.Plus, .Minus]
    
    var matcher: Parser<Character, ()> {
        return character(rawValue).discard()
    }
}

let testPlus  = OperatorSymbol.parse("+") // Optional(OperatorSymbol.Plus)
let testMinus = OperatorSymbol.parse("-") // Optional(OperatorSymbol.Minus)

Sometimes we'd like to lex more complex types, literal values for example. In these cases, we'd like to manually conform our type to Parsable by defining a static member parser that will transform valid input into an instance of our type.

enum LiteralValue: Parsable {
    case StringLiteral(String)
    case IntegerLiteral(Int)
    
    static var parser = coalesce(
        between(character("\""), parseFew: any()).stringify().map(LiteralValue.StringLiteral),
        prepend(within("+-").otherwise("+"), many1(digit)).stringify().map{ Int($0)! }.map(LiteralValue.IntegerLiteral)
    )
}

let testString  = LiteralValue.parse("\"hey\"") // Optional(LiteralValue.StringLiteral("hey"))
let testInteger = LiteralValue.parse("-123")    // Optional(LiteralValue.IntegerLiteral(-123))

Once we've defined our basic token types, we might want to define some union type that can hold each of the cases. This type itself ought to be Parsable since each of its cases refers to a Parsable token.

enum Token: Parsable {
    case Value(LiteralValue)
    case Operator(OperatorSymbol)
   
    static var parser = coalesce(
        LiteralValue.parser.map(Token.Value),
        OperatorSymbol.parser.map(Token.Operator)
    )
}

let testTokens = try! tokens(Token.self).parse("532 - 11 + \"hello\" + -3")
// [
//  Token.Value(LiteralValue.IntegerLiteral(532)), 
//  Token.Operator(OperatorSymbol.Minus),
//  Token.Value(LiteralValue.IntegerLiteral(11)),
//  Token.Operator(OperatorSymbol.Plus),
//  Token.Value(LiteralValue.StringLiteral("hello")),
//  Token.Operator(OperatorSymbol.Plus),
//  Token.Value(LiteralValue.IntegerLiteral(-3))
// ]

Note that the order in which we coalesce the two parsers does have an impact on the result. If we had swapped the order, we would be unable to recognize negative integer literals, since we'd always parse the minus sign as an operator first.
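The effect of ordering can be sketched without Parsley at all. The snippet below is a minimal, self-contained illustration of ordered choice; MiniParser, Tok, firstOf and lexAll are hypothetical names invented for this example and are not part of Parsley's API.

```swift
// Hypothetical mini parser (not Parsley's type): consumes a prefix of the
// input and returns the parsed value plus the remaining input, or nil.
struct MiniParser<T> {
    let run: (Substring) -> (T, Substring)?
}

enum Tok: Equatable {
    case integer(Int)
    case op(Character)
}

// Matches a single "+" or "-" as an operator.
let opParser = MiniParser<Tok> { input in
    guard let c = input.first, c == "+" || c == "-" else { return nil }
    return (.op(c), input.dropFirst())
}

// Matches an optional sign followed by one or more ASCII digits.
let intParser = MiniParser<Tok> { input in
    var rest = input
    var text = ""
    if let c = rest.first, c == "+" || c == "-" {
        text.append(c)
        rest = rest.dropFirst()
    }
    let digits = rest.prefix(while: { $0 >= "0" && $0 <= "9" })
    guard !digits.isEmpty else { return nil }
    text.append(contentsOf: digits)
    guard let value = Int(text) else { return nil }
    return (.integer(value), rest.dropFirst(digits.count))
}

// Ordered choice: the first alternative that succeeds wins.
func firstOf<T>(_ parsers: MiniParser<T>...) -> MiniParser<T> {
    return MiniParser { input in
        for p in parsers {
            if let result = p.run(input) { return result }
        }
        return nil
    }
}

// Repeatedly applies `token`, skipping spaces between tokens.
func lexAll(_ input: String, with token: MiniParser<Tok>) -> [Tok] {
    var rest = Substring(input)
    var out: [Tok] = []
    while true {
        rest = rest.drop(while: { $0 == " " })
        guard let step = token.run(rest) else { break }
        out.append(step.0)
        rest = step.1
    }
    return out
}

let valuesFirst = firstOf(intParser, opParser)
let operatorsFirst = firstOf(opParser, intParser)

let good = lexAll("5 - -3", with: valuesFirst)
// [.integer(5), .op("-"), .integer(-3)]
let bad = lexAll("5 - -3", with: operatorsFirst)
// [.integer(5), .op("-"), .op("-"), .integer(3)]
```

With operators tried first, the sign of -3 is consumed as a separate operator token, which is the failure mode described above.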

Parsing

Once we've finished lexing the input, it's time to parse! Now, the dividing line between these two stages isn't quite as clear as one might expect (as is evident from the fact that we talked about a Parsable protocol in the lexing section of this document)! The lexing stage ought not to care about the recursive, tree-like structure of the input. Instead, the lexing stage ought to emit a linear sequence of tokens that simplifies the parsing logic. For example, the lexer ought to deal with discarding (or handling) whitespace so the parser doesn't have to complicate its logic worrying about these cases.
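As a rough illustration of that division of labour, here is a hand-written sketch in plain Swift rather than Parsley. FlatToken, lex and parseSum are hypothetical names chosen for this example. The lexer drops whitespace and emits a flat array, so the second stage only ever looks at tokens.

```swift
// Hypothetical token type for this sketch only.
enum FlatToken: Equatable {
    case number(Int)
    case plus
    case minus
}

// Lexing stage: turns characters into a flat token list and discards
// whitespace. Returns nil on an unexpected character.
func lex(_ source: String) -> [FlatToken]? {
    var tokens: [FlatToken] = []
    var index = source.startIndex
    while index < source.endIndex {
        let c = source[index]
        if c.isWhitespace {
            index = source.index(after: index)   // whitespace never reaches the parser
        } else if c == "+" {
            tokens.append(.plus)
            index = source.index(after: index)
        } else if c == "-" {
            tokens.append(.minus)
            index = source.index(after: index)
        } else if c >= "0" && c <= "9" {
            let digits = source[index...].prefix(while: { $0 >= "0" && $0 <= "9" })
            guard let value = Int(digits) else { return nil }
            tokens.append(.number(value))
            index = digits.endIndex
        } else {
            return nil
        }
    }
    return tokens
}

// Parsing stage: works on tokens only, with no whitespace handling.
// Accepts `number ((+|-) number)*` and folds it left to right.
func parseSum(_ tokens: [FlatToken]) -> Int? {
    var iterator = tokens.makeIterator()
    guard case .number(let first)? = iterator.next() else { return nil }
    var total = first
    while let op = iterator.next() {
        guard case .number(let rhs)? = iterator.next() else { return nil }
        switch op {
        case .plus: total += rhs
        case .minus: total -= rhs
        case .number: return nil   // two numbers in a row
        }
    }
    return total
}

let flat = lex("12 +  30 - 2") ?? []
let sum = parseSum(flat)   // Optional(40)
```

Unlike the lexer shown earlier, this sketch always treats "-" as an operator; that choice just keeps the example short.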

To be continued.

Impatient? Check out the documentation here or dive into the source code!

parsley's People

Contributors

jadengeller

parsley's Issues

Coalesce callback

It might be nice to get a callback when one alternative of a coalesce (or another parser) fails, so we can react to the failure in some way (for debugging purposes, for example).
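One possible shape for such a hook, sketched in plain Swift under the assumption that each alternative carries a name for reporting. Alt, coalesceReporting and onFailure are hypothetical and do not exist in Parsley.

```swift
// Hypothetical named parser used only in this sketch.
struct Alt<T> {
    let name: String
    let run: (Substring) -> (T, Substring)?
}

// Ordered choice that reports every alternative that fails before one
// succeeds (or before the whole choice fails).
func coalesceReporting<T>(
    _ alternatives: [Alt<T>],
    onFailure: @escaping (String, Substring) -> Void
) -> Alt<T> {
    let names = alternatives.map { $0.name }.joined(separator: " | ")
    return Alt(name: names) { input in
        for alt in alternatives {
            if let result = alt.run(input) { return result }
            onFailure(alt.name, input)   // debugging hook
        }
        return nil
    }
}

let letter = Alt<Character>(name: "letter") { input in
    guard let c = input.first, c.isLetter else { return nil }
    return (c, input.dropFirst())
}

let digitChar = Alt<Character>(name: "digit") { input in
    guard let c = input.first, c.isNumber else { return nil }
    return (c, input.dropFirst())
}

// Collects the names of the alternatives that failed on `input`.
func failedAlternatives(for input: String) -> [String] {
    var failures: [String] = []
    let either = coalesceReporting([letter, digitChar]) { name, _ in
        failures.append(name)
    }
    _ = either.run(Substring(input))
    return failures
}

let report = failedAlternatives(for: "7up")   // ["letter"]
```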

Then method

It'd be really nice to have an extension Parser that adds a "then" method that throws away () results. The issue is that, to throw away the self result, we'd have to have an extension Parser where Result == (), which is not possible in Swift. One solution would be to make all void-returning methods return struct Ignored { } instead, where Ignored conforms to some protocol IgnoredResult { }. This would also have the added benefit of allowing extension Array: IgnoredResult where Element: IgnoredResult, if Swift supported that too.
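The marker-protocol workaround proposed here might look roughly like the following. This is a sketch only: Step, skip and digit are hypothetical stand-ins for Parsley's types, and the Array conformance at the end relies on conditional conformance, which Swift accepts from version 4.1 onward.

```swift
// Marker protocol and placeholder result, as proposed in the issue.
protocol IgnoredResult {}
struct Ignored: IgnoredResult {}

// Hypothetical stand-in for Parser in this sketch.
struct Step<Output> {
    let run: (Substring) -> (Output, Substring)?
}

// `then` is only available when the left-hand result is ignorable.
extension Step where Output: IgnoredResult {
    // Runs self, drops its result, then runs `next` on the remaining input.
    func then<Next>(_ next: Step<Next>) -> Step<Next> {
        return Step<Next> { input in
            guard let r = self.run(input) else { return nil }
            return next.run(r.1)
        }
    }
}

// Arrays of ignored results are ignorable too (needs Swift 4.1 or later).
extension Array: IgnoredResult where Element: IgnoredResult {}

// Matches `text` exactly and produces an ignorable result.
func skip(_ text: String) -> Step<Ignored> {
    return Step { input in
        guard input.starts(with: text) else { return nil }
        return (Ignored(), input.dropFirst(text.count))
    }
}

// Matches one digit and produces its numeric value.
let digit = Step<Int> { input in
    guard let c = input.first, let value = c.wholeNumberValue else { return nil }
    return (value, input.dropFirst())
}

let parsed = skip("#").then(digit).run(Substring("#7"))
// parsed?.0 is 7; `digit.then(...)` would not compile, since Int is not ignorable
```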

Optimizations

We could optimize this, but it would require a whole different parser structure. It might be worth it, though.

sequence(a, b) ?? a

No need to backtrack and reparse the exact same thing, and no reason to try it again if it fails. Transform to this:

sequence(a, optional(b))
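To make the saving concrete, the sketch below counts how often a runs under each formulation. It is plain Swift with hypothetical names (P, char, optionally, naive, fused) and is not Parsley code. Because sequence(a, b) and a produce different result types, both versions are normalised here to return (A, B?).

```swift
// Hypothetical mini parser for this sketch.
struct P<T> {
    let run: (Substring) -> (T, Substring)?
}

// Matches one expected character; `onRun` fires on every attempt.
func char(_ expected: Character, onRun: @escaping () -> Void = {}) -> P<Character> {
    return P { input in
        onRun()
        guard let c = input.first, c == expected else { return nil }
        return (c, input.dropFirst())
    }
}

func sequence<A, B>(_ first: P<A>, _ second: P<B>) -> P<(A, B)> {
    return P { input in
        guard let r1 = first.run(input), let r2 = second.run(r1.1) else { return nil }
        return ((r1.0, r2.0), r2.1)
    }
}

// Always succeeds; yields nil without consuming input if `parser` fails.
func optionally<A>(_ parser: P<A>) -> P<A?> {
    return P { input in
        if let r = parser.run(input) { return (.some(r.0), r.1) }
        return (nil, input)
    }
}

// sequence(a, b) ?? a: on fallback, `a` is parsed a second time.
func naive<A, B>(_ a: P<A>, _ b: P<B>) -> P<(A, B?)> {
    let both = sequence(a, b)
    return P { input in
        if let r = both.run(input) {
            let pair = r.0
            return ((pair.0, .some(pair.1)), r.1)
        }
        if let r = a.run(input) { return ((r.0, nil), r.1) }
        return nil
    }
}

// sequence(a, optional(b)): `a` is parsed exactly once.
func fused<A, B>(_ a: P<A>, _ b: P<B>) -> P<(A, B?)> {
    return sequence(a, optionally(b))
}

// Counts how many times `a` is attempted while parsing `input`.
func countRunsOfA(
    input: String,
    build: (P<Character>, P<Character>) -> P<(Character, Character?)>
) -> Int {
    var runs = 0
    let a = char("a", onRun: { runs += 1 })
    let b = char("b")
    _ = build(a, b).run(Substring(input))
    return runs
}

let naiveRuns = countRunsOfA(input: "ac", build: naive)   // 2
let fusedRuns = countRunsOfA(input: "ac", build: fused)   // 1
```

The same doubling happens when a itself fails (for example on input "x"): the naive form tries a twice before giving up.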
