
Comments (7)

zevv commented on May 22, 2024

reverendpaco commented on May 22, 2024

Zevv, thanks for the response.

I will work on putting together a small example for you, but in the meantime I have a 95%-done grammar here:
https://gist.github.com/reverendpaco/fc64863d15539349e856a5e1883b73df

I was using Tatsu (for Python) before I decided I needed/wanted a language that could produce static binaries and was linkable to C (hence Nim). I also like that Npeg offers Pratt precedence operators as I hope to parse SQL expressions according to their defined precedence.

In general, the committed choices I used in this grammar came from two sources:

  1. a desire for cleanliness and modularization in the look of the grammar
  2. actual ambiguity.

For instance, in my Datalog-to-SQL transpiling language, I hope to model Prolog/Datalog predicates as SQL tables, so everything is based on the functor form

employee(*). // employee is the functor

But as I hope to allow internal subqueries, a predicate/query can look like any of the following:

employee([first_name,last_name]). // select first_name,last_name from employee
employee(+[length(last_name)]). // add length as new column to all other columns
employee(-[salary]) // remove the salary column from the list of all columns
employee(*). // select everything
employee(**). // select everything but bind all column names to LVars
a@prod.employee(*). // put an alias 'a' on the table employee
                    // and use the employee table in the 'prod' schema
employee(FirstName,LastName, #... etc). // traditional Datalog positional

So a rule for this in NPeg would look like:

 named_predicate <-  >?alias * ?ns_wrapper * >functor * "(" * functorArgs * ")"
 functorArgs <- projection_like | star | double_star | eraser_like | datalog_like 
 eraser_like <- "-[" * columns * "]" 
 datalog_like <- columns 
 columns <- # recursively defined rule of individual columns
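To make the ordered choice in `functorArgs` concrete, here is a minimal sketch in Python (Tatsu's host language): the rule names come from the grammar above, but the regexes and the `classify` helper are my own approximations, not npeg code. Each alternative is tried in the order written and the first one that matches the whole argument string wins, which is exactly where the backtracking-versus-committed-choice question arises.

```python
import re

# Hypothetical approximations of the functorArgs alternatives above;
# the names come from the grammar, the regexes are mine.
ALTERNATIVES = [
    ("projection_like", r"\[[\w,]*\]"),   # [first_name,last_name]
    ("star",            r"\*"),           # *
    ("double_star",     r"\*\*"),         # **
    ("eraser_like",     r"-\[[\w,]*\]"),  # -[salary]
    ("datalog_like",    r"[\w,]+"),       # FirstName,LastName
]

def classify(args: str):
    """PEG-style ordered choice: try each alternative in order and
    return the first that matches the whole input.  When 'star' fails
    on '**', the matcher falls through to 'double_star' -- a committed
    choice would forbid exactly that retry."""
    for name, pattern in ALTERNATIVES:
        if re.fullmatch(pattern, args):
            return name
    return None
```

Note that the order matters only up to a point: `star` is listed before `double_star`, but because each alternative must account for the whole argument list (the grammar requires a closing `)` afterwards), `**` still falls through to `double_star`.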

If you want to see a whole bunch of examples of what I hope my language does from transpiling from a Datalog to a SQL, see here:
https://docs.google.com/document/d/1qf19JmPbSnX4h-fPgPU6QmL6AvEKVXxUBGtaCbyJenM/edit?usp=sharing

I am only 3 days into learning Nim (what better way than to get in over my head with a parser project?). I am really eager to be able to use it for its performance and interoperability (especially with SQLite, which is all in C).

I suppose, in the spirit of being some random person requesting help from an open-source author, I could offer another thought on how I could accomplish this, though it would be a significant change to how NPeg works today and it sidesteps the entire idea of global state: if npeg returned not a seq[string] as its capture but a richer AST marked up with the rule that generated each node, then I could walk the AST and not have to worry about real-time serialization to a concrete model until after parsing.

I am using Tatsu as an example here, but I've seen the Guile peg parser and some others do it. If I were to have a rule:

some_rule <- >first * second * >third * >fourth * >my_name:fifth
fourth <- "MATCHME!"
fifth <- "HERE"
# second is used for matching but not captured

then the AST returned would be a dictionary/array combo like:

{some_rule: [ {first: ...} , {third: ...}, {fourth: "MATCHME!"}, {my_name: "HERE"} ] }

much like JSON, but with only strings, arrays, and dictionaries.
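That shape happens to be valid Python literal syntax, so a quick hedged sketch of what post-parse access could look like (the `ast` literal mirrors the example above, with `...` standing for the elided captures; the `capture` helper is hypothetical, not part of any library):

```python
# The proposed AST from above as a Python literal; '...' entries are
# the elided captures from the example.  This is a hypothetical shape,
# not what npeg returns today.
ast = {"some_rule": [{"first": ...}, {"third": ...},
                     {"fourth": "MATCHME!"}, {"my_name": "HERE"}]}

def capture(node: dict, rule: str, name: str):
    """Return the first capture labelled `name` under `rule`."""
    return next((d[name] for d in node[rule] if name in d), None)

capture(ast, "some_rule", "my_name")   # "HERE"
```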

I really am a fan of the way Tatsu does this and allows modification/tweaking of the AST as it is generated, including changing singletons to arrays and 'join'-like rules which automatically roll up separator-delimited lists.

Sorry for the longish comment, and I appreciate all the work you and other open-source library authors put in.


zevv commented on May 22, 2024

reverendpaco commented on May 22, 2024

I also find that it was too limited in the types and forms of trees it could build.

I have been struggling with this as I've learned Nim. The variant object facility provides a semi-union type, but it is not quite ideal for the needs of a parser. I would think it's best to ignore richer types and keep just three kinds: the strings that were matched, the arrays representing the concatenative order in which rules were composed, and the tables mapping rule names to subtrees. Stealing from the JSON source code, I think something like this could model it. Your match() proc could return a seq[NPegNode], and someone could write (or you could provide) a DFS walker to inductively generate their concrete types from this parse tree:

import std/tables

type
  NPegNodeType = enum
    dic, leaf, sequence

  NPegNode {.acyclic.} = object
    case kind: NPegNodeType
    of dic: table: OrderedTable[string, NPegNode]
    of leaf: data: string
    of sequence: values: seq[NPegNode]
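Sketched in Python for brevity, the DFS walker mentioned above might look like this, with dicts standing in for the dic variant, lists for sequence, and strings for leaf (all names here are illustrative, not an npeg API):

```python
# A sketch of the DFS walker: dicts stand in for the 'dic' variant,
# lists for 'sequence', and strings for 'leaf'.  It yields
# (path, leaf) pairs from which concrete types can be built.
def walk(node, path=()):
    if isinstance(node, str):          # leaf: a matched string
        yield path, node
    elif isinstance(node, list):       # sequence: concatenative order
        for i, child in enumerate(node):
            yield from walk(child, path + (i,))
    elif isinstance(node, dict):       # dic: rule name -> subtree
        for name, child in node.items():
            yield from walk(child, path + (name,))

tree = {"some_rule": [{"fourth": "MATCHME!"}, {"my_name": "HERE"}]}
list(walk(tree))
# [(("some_rule", 0, "fourth"), "MATCHME!"),
#  (("some_rule", 1, "my_name"), "HERE")]
```

Building concrete types is then an exercise in pattern-matching on the path prefixes, entirely decoupled from parse time.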

I am also looking at waxeye since it compiles to C and I could use it from Nim.


zevv commented on May 22, 2024

zevv commented on May 22, 2024

Is there anything I can do to help you out with this, still?


zevv commented on May 22, 2024

Closing this, feel free to reopen if it's still relevant.

